Fig. 1

Three implemented convolutional neural networks (CNNs): (1) CNN-1 simultaneously trained on the images of the sagittal and coronal views, (2) CNN-2 trained only on the images of the sagittal view, and (3) CNN-3 trained only on the images of the coronal view.

Fig. 2

The saliency maps of a representative patient's MRI, coronal, and sagittal views. Coloured regions in the saliency maps indicate the regions with the most influence on the CNN's output, where red denotes higher relative influence than blue.

Fig. 3

Receiver operating characteristic (ROC) curve for the CNNs with the orthopaedic surgeon and orthopaedic resident results superimposed.

Abstract

Introduction

MRI is the modality of choice for cartilage imaging; however, its diagnostic performance is variable.

Objectives

We aimed to evaluate whether deep learning can be utilized to accurately identify cartilage defects when applied to the interpretation of knee MRI images and compare deep learning's performance to that of an orthopaedic trainee and an orthopaedic surgeon.

Methods

We analyzed data from patients who underwent knee MRI evaluation and subsequent arthroscopic knee surgery (207 with and 90 without a cartilage defect). Patients' arthroscopic findings were compared to preoperative MRI images to verify the presence or absence of isolated tibiofemoral cartilage defects. For each patient, the MRI slice most representative of the patient's condition (defect or no-defect) was selected from the coronal view and from the sagittal view. We developed three convolutional neural networks (CNNs) to analyze the images: CNN-1 trained on the images of the sagittal and coronal views; CNN-2 trained on the images of the sagittal view; CNN-3 trained on the images of the coronal view. We implemented image-specific saliency maps to visualize the CNNs' decision-making process. The same test dataset images were then provided to an experienced orthopaedic surgeon and an orthopaedic trainee.

Results

Saliency maps demonstrated that the CNNs learned to focus on the clinically relevant areas of the MRI. CNN-1 achieved higher performance (sensitivity 100%, specificity 66.67%, positive predictive value [PPV] 86.96%, negative predictive value [NPV] 100%) than the orthopaedic surgeon (sensitivity 95.00%, specificity 55.56%, PPV 82.61%, NPV 83.33%).

Conclusions

CNNs can be used to enhance the diagnostic performance of MRI in identifying isolated tibiofemoral cartilage defects.

Introduction

Articular cartilage injuries are common and have the potential to progress to osteoarthritis if left untreated. In the clinical work-up of a suspected symptomatic chondral defect, magnetic resonance imaging (MRI) is the modality of choice to better assess such pathology.[1,2] This non-invasive approach allows articular cartilage to be better observed since its high soft-tissue contrast displays a different signal intensity of the articular cartilage compared to the nearby menisci and bone.[3–5] Although the utilization of MRI is considered the standard of care, the accuracy of MRI in detecting these lesions varies depending on several factors including the MRI technique, protocol, and magnet strength as well as the size of the lesion.[6] Furthermore, articular cartilage can be a challenging tissue to image due to its very thin and layered microarchitecture overlying a complex 3D osseous base.[7] There is a wide reported range of diagnostic performance of 2D FSE MR for assessing knee cartilage, with overall sensitivity ranging from 26% to 96%, specificity of 50% to 100%, and accuracy of 49% to 94%.[8–11] Previous studies have demonstrated that deeper lesions, such as Grade III and IV, are identified more often while smaller/shallower or earlier stage lesions are not as accurately detected.[12] Detecting early cartilage degeneration is crucial to the treatment and prevention of further symptomatic pain and reducing risk or delaying progression of OA.[13] Following MRI, the next diagnostic step often entails diagnostic knee arthroscopy before determination of a definitive intervention.[1] While arthroscopy is the gold standard in diagnosing chondral lesions, the expense and invasive nature of such procedure begs for more accurate pre-operative screening systems that may one day obviate the role of diagnostic arthroscopy prior to definitive surgery.

In recent years, machine learning has gained rapid popularity in medical applications, revolutionizing the way high-volume medical data is processed and interpreted.[14] In general, machine learning refers to a series of mathematical algorithms that enable the machine to "learn" the relationship between input and output data without being explicitly told how to do so.[14] Deep learning is a subset of machine learning that is mainly concerned with image analysis and extracting knowledge from complex imaging data sets such as medical images.[14–16] Radiologists and orthopaedic surgeons have applied deep learning to provide automatic interpretations of medical images to improve their diagnostic accuracy and speed.[17–19] In addition, deep learning may be used to transfer tertiary centers' expertise to smaller community institutions and more remote areas where experts may not be readily accessible.

Consequently, this study's primary aim was to evaluate whether deep learning applied to the interpretation of MRI images and founded upon a "ground truth" of arthroscopically-verified diagnoses can be utilized to identify cartilage defects accurately. The secondary aim was to compare deep learning's performance in identifying cartilage defects to that of an experienced orthopaedic surgeon and a less experienced orthopaedic resident.

Methods

Data collection

After institutional review board approval, informed consent was obtained from all patients when they were entered into our study database. In this retrospective study, we analyzed data from patients who underwent knee MRI evaluation and subsequently had arthroscopic knee surgery between September 2011 and December 2019. Patients' preoperative MRI images were compared to arthroscopic findings by an orthopaedic surgeon to verify whether an isolated cartilage defect was present in the tibiofemoral joint. Proton density-weighted fast spin-echo MRI scans (1.5-T; Siemens) were used. Subsequently, for each patient, two independent examiners (an orthopaedic resident and a research assistant) used the arthroscopic findings to select, from the full set of MRI images, the one coronal image and the one sagittal image that were most representative of the patient's condition (defect or no-defect). If the two examiners disagreed, the senior sports medicine fellowship-trained orthopaedic surgeon identified the appropriate image.

Data analysis

Images were randomly assigned to "training", "validation", and final "test" subsets with an 80:10:10 split ratio. SPSS (version 21.0; IBM Corp) was used to analyze our patient population's baseline demographics. Continuous variables are reported as mean ± standard deviation, whereas categorical variables are reported as numbers and percentages. The normal distribution of the data was confirmed using the Shapiro-Wilk test. Continuous data were compared with the independent sample t-test. Categorical data were compared with the Chi-square test. We used the training subset to train three convolutional neural networks (CNNs). The first network (CNN-1) was simultaneously trained on the images of the sagittal and coronal views. The second network (CNN-2) was only trained on the images of the sagittal view, and the third network (CNN-3) was only trained on the images of the coronal view (Fig. 1).
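The 80:10:10 random assignment can be illustrated with a minimal sketch. It assumes the split is made at the patient level (so that both views of one patient land in the same subset) and that fractional subset sizes are rounded down; neither detail is stated in the text, and the function name `split_patients` and the seed are hypothetical.

```python
import random

def split_patients(patient_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle patient IDs and cut them into train/validation/test lists.

    Assumption: validation and test sizes are floored and the training
    subset receives the remainder.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_val = int(len(ids) * ratios[1])
    n_test = int(len(ids) * ratios[2])
    n_train = len(ids) - n_val - n_test
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test

# 297 patients, as in this study
train, val, test = split_patients(range(297))
print(len(train), len(val), len(test))
```

Under these assumptions the test subset holds 29 patients, which matches the size of the test dataset reported in this study.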

We optimized hyper-parameters iteratively on the validation dataset using a grid search strategy. We utilized the transfer learning method by modifying a CNN that was initially developed for non-medical image classification and used it for this application.[19] We used the Xception CNN architecture[20] pre-trained on the ImageNet[21] database for each view to train CNN-2 and CNN-3. Then we used a support vector machine (SVM) to combine the two views to train CNN-1. We implemented image-specific saliency maps to visualize the CNNs' decision-making process. An image-specific saliency map ranks all the pixels of an image based on their relative influence on the CNN's output (classification score). We plotted these relative influences as a heat map to depict the saliency map for each image. Colored regions in the saliency maps indicate the regions with the most influence on the CNN's output, where red denotes higher relative influence than blue.
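The idea of an image-specific saliency map can be illustrated with a small sketch. The text does not state how the maps were computed; gradient-based saliency is usually obtained by backpropagation through the network, whereas this sketch approximates the same quantity (the sensitivity of the score to each pixel) by central finite differences so that it runs without a deep learning library. The scoring function `toy_score` is a hypothetical stand-in for a trained CNN.

```python
def saliency_map(score_fn, image, eps=1e-3):
    """Approximate |d score / d pixel| for every pixel by central differences."""
    rows, cols = len(image), len(image[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            original = image[r][c]
            image[r][c] = original + eps
            up = score_fn(image)
            image[r][c] = original - eps
            down = score_fn(image)
            image[r][c] = original  # restore the pixel before moving on
            saliency[r][c] = abs(up - down) / (2 * eps)
    return saliency

def toy_score(image):
    # Hypothetical classifier score: only the centre pixel and its
    # right-hand neighbour influence the output.
    return 3.0 * image[1][1] + 1.0 * image[1][2]

image = [[0.2, 0.4, 0.1],
         [0.5, 0.9, 0.3],
         [0.0, 0.7, 0.6]]
sal = saliency_map(toy_score, image)

# Rank pixels from most to least influential, as a saliency map does.
ranked = sorted(
    ((sal[r][c], (r, c)) for r in range(3) for c in range(3)),
    reverse=True,
)
print(ranked[:2])
```

Plotting `sal` as a heat map would colour the centre pixel as the most influential region, followed by its right-hand neighbour, with all other pixels at zero.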

We used the test dataset (29 patients), which had been kept separate from all previous training and validation processes, to evaluate the CNNs' ultimate performance. These CNNs were implemented using TensorFlow (Keras backend) on a workstation comprising an Intel(R) Xeon(R) Gold 6128 processor, 64GB of DDR4 RAM, and an NVIDIA Quadro P5000 graphics card.

To compare the CNNs' performance against human interpretation, the same test dataset images were also provided to two independent clinicians including an experienced orthopaedic surgeon and a less experienced orthopaedic resident. It is worth mentioning that the orthopaedic surgeon had extensive expertise in the diagnosis and treatment of cartilage defects. The orthopaedic surgeon and the orthopaedic resident had access to both the sagittal and coronal view images, while they were blinded to the arthroscopic findings. In all test analyses, the orthopaedic surgeon and the orthopaedic resident, as well as all the CNNs were asked to identify whether a cartilage defect was present or absent in a binary fashion.

We used the ROC curve to visualize the trade-off between the false positive rate and the true positive rate across probability thresholds for a binary diagnostic model. The ROC curve plots the false positive rate (false positive rate = false positives / [false positives + true negatives]) on the x-axis against the true positive rate (true positive rate = true positives / [true positives + false negatives]) on the y-axis for thresholds between 0.0 and 1.0. The true positive rate is also known as sensitivity, and the false positive rate is the complement of specificity (false positive rate = 1 - specificity). The PPV and NPV are the proportions of positive and negative results that are true positives and true negatives, respectively, where PPV = true positives / (true positives + false positives) and NPV = true negatives / (true negatives + false negatives).
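The construction of an ROC curve from these definitions can be sketched as follows. The labels and scores are made-up illustrative values, not study data, and the function names `roc_points` and `auc` are hypothetical; the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    A case is called positive when its score is >= the threshold.
    Thresholds are swept from the highest score to the lowest.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Trapezoidal area under (FPR, TPR) points ordered by increasing FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Illustrative data only: 1 = defect, 0 = no-defect
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(round(area, 4))
```

A human reader who gives a single binary answer per case contributes one (false positive rate, true positive rate) point rather than a curve, which is why clinician results can be overlaid on the CNNs' ROC curves as individual markers.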

Results

Two hundred and ninety-seven knees were evaluated in this study, 207 with a cartilage defect and 90 without a cartilage defect in the tibiofemoral joint. Patients with cartilage defects were significantly older (37.6 ± 11.2 years) than patients with no defect (33.8 ± 11.8 years) (P < .05). In addition, more females had cartilage defects (120 females had a defect [58% of all defects] vs. 36 females with no cartilage defect [40% of all no-defects], P < .05). Baseline demographic information is displayed in Table 1. The mean time between MRI and surgery was 3.1 ± 3.4 months.

Table 1. Patient demographics.

| | Total | Defect | No-defect | (95% CI) | P value |
|---|---|---|---|---|---|
| Number of patients (%) | 297 | 207 (69.7) | 90 (30.3) | | |
| Age, y, mean ± SD | 36.4 ± 11.5 | 37.6 ± 11.2 | 33.8 ± 11.8 | (0.9 – 6.6) | .01 |
| BMI (kg/m²), mean ± SD | 28.1 ± 5.4 | 28.2 ± 5.5 | 28.1 ± 5.2 | (−1.3 – 1.4) | .93 |
| Female, n (%) | 156 (52.5) | 120 (58.0) | 36 (40.0) | | < .01 |
| Right knee, n (%) | 142 (47.8) | 98 (47.3) | 44 (48.9) | | .89 |

Abbreviations: CI, confidence interval; n, number; SD, standard deviation; y, year.
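The Chi-square comparison of sex between groups can be reproduced from the counts in Table 1 as a worked check. This is a minimal sketch assuming a Pearson Chi-square test on a 2x2 table without continuity correction (the text does not say which variant was reported); the function name `chi_square_2x2` is hypothetical. For one degree of freedom the p-value is erfc(sqrt(chi2 / 2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction. With one degree of freedom,
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Table 1 counts: defect group 120 female / 87 male (207 total),
# no-defect group 36 female / 54 male (90 total).
chi2, p_value = chi_square_2x2(120, 87, 36, 54)
print(round(chi2, 2), round(p_value, 4))
```

Under these assumptions the p-value falls below .01, in line with the value listed for the Female row of Table 1.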

Saliency maps demonstrated that the CNNs focused on the clinically relevant articular cartilage of the tibiofemoral joint on MRI images during the decision-making process (Fig. 2).

Figure 3 shows the receiver operating characteristic (ROC) curves for all CNNs classifying the MRI images into "defect" and "no defect" categories. The diagnostic performances of the orthopaedic surgeon and the orthopaedic resident are also overlaid in Figure 3. While all CNNs achieved the same area under the curve value, at the reported threshold, CNN-1 outperformed CNN-2 and CNN-3.

Table 2 shows the binary diagnoses (cartilage defect vs. no-defect) of all the CNNs and the orthopaedic surgeon and the orthopaedic resident for the 29 patients in the test dataset. Defect characteristics (location, ICRS grade, and size) are also indicated for each patient.

Table 2. Diagnosis results of the orthopaedic surgeon, the orthopaedic resident, and all the CNNs for patients in the test dataset with defect characteristics.

| Patient # | Ground truth | Location | ICRS grade (0 to 4) | Size | Orthopaedic surgeon | Orthopaedic resident | CNN-1 | CNN-2 | CNN-3 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | defect | MFC | 2 | 2 cm² | defect | defect | defect | defect | no-defect |
| 2 | defect | MFC and MTP | 1 | 1 cm² and 1 cm² | defect | no-defect | defect | no-defect | defect |
| 3 | defect | MFC and MTP | 1 | 1 cm² and 1 cm² | defect | no-defect | defect | no-defect | defect |
| 4 | defect | MTP | 3 | 5 mm² | defect | no-defect | defect | defect | defect |
| 5 | defect | MFC and LFC | 4 | 6 cm² and 3 cm² | defect | no-defect | defect | defect | defect |
| 6 | defect | MFC | 4 | 2 cm² | defect | defect | defect | defect | defect |
| 7 | defect | MFC | 4 | 10 cm² | defect | defect | defect | defect | defect |
| 8 | defect | MFC | 4 | 2.5 cm² | defect | no-defect | defect | defect | defect |
| 9 | defect | LFC and LTP | 3 | 9 cm² and 9 cm² | defect | no-defect | defect | defect | defect |
| 10 | defect | LFC | 1 | 1 cm² | defect | defect | defect | defect | no-defect |
| 11 | defect | MFC and MTP | 1 | 5 mm² and 1.5 cm² | defect | defect | defect | no-defect | defect |
| 12 | defect | MFC and MTP | 2 | 4 cm² and 5 mm² | defect | defect | defect | defect | defect |
| 13 | defect | MFC | 3 | 3 cm² | no-defect | no-defect | defect | defect | defect |
| 14 | defect | MFC | 3 | 6 cm² | defect | no-defect | defect | defect | defect |
| 15 | defect | MFC and LFC | 4 | 4 cm² and 3 cm² | defect | no-defect | defect | defect | defect |
| 16 | defect | MFC | 4 | 5 cm² | defect | no-defect | defect | defect | defect |
| 17 | defect | MFC and MTP | 4 | 1.5 cm² and 1.4 cm² | defect | no-defect | defect | defect | defect |
| 18 | defect | MFC | 1 | 5 mm² | defect | no-defect | defect | defect | defect |
| 19 | defect | LFC and LTP | 4 | 3.2 cm² and 2.5 cm² | defect | defect | defect | defect | defect |
| 20 | defect | LFC | 4 | 3 cm² | defect | no-defect | defect | defect | defect |
| 21 | no-defect | | 0 | | defect | defect | no-defect | no-defect | no-defect |
| 22 | no-defect | | 0 | | no-defect | defect | no-defect | defect | no-defect |
| 23 | no-defect | | 0 | | defect | defect | defect | defect | defect |
| 24 | no-defect | | 0 | | no-defect | defect | no-defect | no-defect | no-defect |
| 25 | no-defect | | 0 | | no-defect | no-defect | defect | no-defect | defect |
| 26 | no-defect | | 0 | | defect | defect | no-defect | defect | no-defect |
| 27 | no-defect | | 0 | | no-defect | no-defect | defect | no-defect | defect |
| 28 | no-defect | | 0 | | defect | no-defect | no-defect | no-defect | no-defect |
| 29 | no-defect | | 0 | | no-defect | defect | no-defect | no-defect | no-defect |

Abbreviations: CI, confidence interval; LFC, lateral femoral condyle; LTP, lateral tibial plateau; MFC, medial femoral condyle; MTP, medial tibial plateau; n, number; SD, standard deviation; y, year.

CNN-1 slightly outperformed the orthopaedic surgeon, making two more accurate diagnoses overall: one more accurate diagnosis of defect and one more accurate diagnosis of no-defect. All CNNs significantly outperformed the orthopaedic resident. Interestingly, both the orthopaedic surgeon (sensitivity 95.00%, specificity 55.56%, positive predictive value [PPV] 82.61%, negative predictive value [NPV] 83.33%) and CNN-1 (sensitivity 100%, specificity 66.67%, PPV 86.96%, NPV 100%) were less accurate in identifying the absence of a defect than in identifying its presence (Table 3). In particular, the orthopaedic resident correctly identified 7/20 defects (35%) and 3/9 no-defects (33.3%). The orthopaedic surgeon correctly classified 19/20 defects (95%) and 5/9 no-defects (55.6%), while CNN-1 was the most accurate with 20/20 defects (100%) and 6/9 no-defects (66.7%). In general, no-defects seemed to be more challenging to identify.

Table 3. Diagnostic performance of the orthopaedic surgeon, the orthopaedic resident, and all the CNNs for patients in the test dataset.

| | Orthopaedic surgeon | Orthopaedic resident | CNN-1 | CNN-2 | CNN-3 |
|---|---|---|---|---|---|
| Accuracy | 82.76% | 34.48% | 89.66% | 79.31% | 82.76% |
| Sensitivity | 95.00% | 35.00% | 100.00% | 85.00% | 90.00% |
| Specificity | 55.56% | 33.33% | 66.67% | 66.67% | 66.67% |
| PPV | 82.61% | 53.85% | 86.96% | 85.00% | 85.71% |
| NPV | 83.33% | 18.75% | 100.00% | 66.67% | 75.00% |

Abbreviations: NPV, negative predictive value; PPV, positive predictive value.
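The values in Table 3 follow from the per-patient diagnoses in Table 2 and the definitions given in the Methods. The sketch below tallies each reader's true positives (TP), false negatives (FN), true negatives (TN), and false positives (FP) from Table 2 and recomputes the metrics; the function name `diagnostic_metrics` and the dictionary layout are illustrative choices, not part of the study's code.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity, PPV and NPV in percent."""
    return {
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }

# (TP, FN, TN, FP) counted from Table 2: 20 defects and 9 no-defects per reader
counts = {
    "Orthopaedic surgeon": (19, 1, 5, 4),
    "Orthopaedic resident": (7, 13, 3, 6),
    "CNN-1": (20, 0, 6, 3),
    "CNN-2": (17, 3, 6, 3),
    "CNN-3": (18, 2, 6, 3),
}

results = {reader: diagnostic_metrics(*c) for reader, c in counts.items()}
for reader, metrics in results.items():
    print(reader, {name: round(value, 2) for name, value in metrics.items()})
```

For CNN-1 this gives an accuracy of 89.66%, sensitivity of 100%, specificity of 66.67%, PPV of 86.96%, and NPV of 100%, matching the CNN-1 column of Table 3.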

Discussion

In this study, we developed three CNNs to provide a binary automated diagnosis for the presence or absence of isolated tibiofemoral cartilage defects utilizing MRI images and demonstrated that CNNs can be used to enhance the diagnostic performance of MRI in identifying isolated tibiofemoral cartilage defects. We also visualized the decision-making process of these CNNs using saliency maps to highlight the regions of the MRI images that most influenced the CNNs' output. These saliency maps demonstrated that the CNNs appropriately focused on clinically relevant articular cartilage margins to make a diagnosis (Fig. 2). It is important to note that these CNNs were not given any direct instructions on where in the MRI image to look for the defect. We showed that the diagnostic accuracy of these CNNs was comparable to that of an experienced orthopaedic surgeon (Table 2). CNN-1 achieved the best overall performance with an accuracy of 89.66%, sensitivity of 100%, specificity of 66.67%, PPV of 86.96%, and NPV of 100%, ultimately arriving at 2 (out of 29) additional accurate diagnoses over the orthopaedic surgeon (Fig. 3). These findings demonstrate that such CNNs can be used in clinical settings to improve the speed and diagnostic accuracy of MRI interpretation to identify cartilage defects compared to manual review by an expert. Furthermore, these experiments show that when additional data sources are provided during training, CNNs can integrate them to arrive at more accurate outputs: CNN-1, which was simultaneously trained on the images of the sagittal and coronal views, outperformed CNN-2 and CNN-3, which were trained on the images of the sagittal or coronal view alone, respectively.

The reported diagnostic performance of 2D FSE MR for assessing knee cartilage varies widely, with overall sensitivity ranging from 26% to 96%, specificity of 50% to 100%, and accuracy of 49% to 94%.[8–11] A meta-analysis of 8 studies conducted by Zhang et al. concluded that manually reviewed knee MRI demonstrates an overall sensitivity of 75% (95% CI, 62% to 84%), specificity of 94% (95% CI, 89% to 97%), and diagnostic odds ratio of 12.5 (95% CI, 6.5 to 24.2) in detecting knee chondral lesions higher than grade I.[6] Similarly, in a cross-sectional study of 36 patients comparing manual MRI grading and arthroscopic grading of cartilage disease, von Engelhardt et al. calculated grade-specific diagnostic values with sensitivity ranging from 20% to 70%, specificity ranging from 74% to 95%, and overall accuracy ranging from 70% to 92%.[22] Based on the low sensitivities observed in both of these studies, the authors conclude that a negative result on MRI should not preclude diagnostic arthroscopy.[6,22] Interestingly, our results suggest a significant improvement in the sensitivity and specificity of automatic CNN-enhanced MRI interpretation over traditional manual review in detecting cartilage defects.

These early findings carry promising implications for the healthcare system. As the accuracy of these CNNs proves comparable to, if not better than, that of orthopaedic surgeons, the diagnostic utility of such deep learning networks may be fully realized. Firstly, achieving high sensitivities approaching 100% demonstrates that CNN-enhanced MRI interpretation may serve as an effective screening tool, essentially ruling out cartilage defects and obviating the need for diagnostic arthroscopy. This not only saves patients and the healthcare system redundant time and financial costs, but also reduces the risks and complications of invasive surgical intervention. Additionally, the automated nature of CNN interpretation improves high-volume data interpretation and diagnostic speed while also enhancing access to specialized care. Deep learning tools like these can be integrated into software and imaging systems, bringing tertiary centers' diagnostic expertise to community hospitals and remote rural clinics.

This study's significant strength is its foundation upon a "ground truth" of diagnoses verified by arthroscopy. In the field of machine and deep learning, the "ground truth" is the objective reality that the model gets trained on to predict an outcome or automate a data analysis task. The best objective measure, or "gold standard," for cartilage pathology is direct visualization and interrogation by arthroscopy. As such, our CNNs are trained to directly predict the "gold standard" arthroscopic diagnosis, free of any intervening subjective bias inherent to human radiologic interpretation. Yet we caution that the results of this early study bear limited generalizability at this time. All the CNNs underwent rigorous deep learning protocols incorporating a sizable dataset of 297 subjects, but were "trained" to determine only a binary diagnosis of isolated cartilage "defect" or "no-defect." Conclusions of diagnostic performance surrounding parameters of disease severity, the grade of cartilage damage, or concomitant pathology cannot be made due to the limited sample size. Similarly, comparisons against human performance are limited to the scope of a single orthopaedic surgeon and a single orthopaedic resident. However, these limitations mark exciting frontiers for further investigation. Furthermore, our study demonstrated that additional input data can be effectively integrated by CNNs to provide more accurate output interpretations. As such, our future work seeks to incorporate full data volumes from 3-view MRI series, a diversity of expert imaging interpretations, more granular diagnostic grades, and cases with patellofemoral or concomitant intra-articular pathology to further hone the diagnostic performance and clinical utility of CNN-enhanced MRI interpretation.

Conclusion

Convolutional neural networks (CNNs) can be used to enhance the diagnostic performance of MRI in identifying isolated tibiofemoral cartilage defects.

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Merkely, G., Ackermann, J., and Lattermann, C. Articular cartilage defects: incidence, diagnosis, and natural history. Oper Tech Sports Med. 2018; 26: 156–161
  2. Rodrigues, M.B. and Camanho, G.L. MRI evaluation of knee cartilage. Rev Bras Ortop. 2010; 45: 340–346
  3. Disler, D.G., Peters, T.L., and Muscoreil, S.J. Fat-suppressed spoiled GRASS imaging of knee hyaline cartilage: technique optimization and comparison with conventional MR imaging. AJR Am J Roentgenol. 1994; 163: 887–892
  4. Potter, H.G., Linklater, J.M., Allen, A.A., Hannafin, J.A., and Haas, S.B. Magnetic resonance imaging of articular cartilage in the knee. An evaluation with use of fast-spin-echo imaging. J Bone Joint Surg Am. 1998; 80: 1276–1284
  5. Recht, M.P., Kramer, J., and Marcelis, S. Abnormalities of articular cartilage in the knee: analysis of available MR techniques. Radiology. 1993; 187: 473–478
  6. Zhang, M., Min, Z., Rana, N., and Liu, H. Accuracy of magnetic resonance imaging in grading knee chondral defects. Arthroscopy. 2013; 29: 349–356
  7. Merkely, G., Hinckel, B., Shah, N., Small, K., and Lattermann, C. Magnetic resonance imaging of the patellofemoral articular cartilage. In: Patellofemoral Pain, Instability, and Arthritis. Springer, Berlin, Heidelberg; 2020: 47–61
  8. Figueroa, D., Calvo, R., Vaisman, A., Carrasco, M.A., Moraga, C., and Delgado, I. Knee chondral lesions: incidence and correlation between arthroscopic and magnetic resonance findings. Arthroscopy. 2007; 23: 312–315
  9. Quatman, C.E., Hettrich, C.M., Schmitt, L.C., and Spindler, K.P. The clinical utility and diagnostic performance of magnetic resonance imaging for identification of early and advanced knee osteoarthritis: a systematic review. Am J Sports Med. 2011; 39: 1557–1568
  10. Smith, T.O., Drew, B.T., Toms, A.P., Donell, S.T., and Hing, C.B. Accuracy of magnetic resonance imaging, magnetic resonance arthrography and computed tomography for the detection of chondral lesions of the knee. Knee Surg Sports Traumatol Arthrosc. 2012; 20: 2367–2379
  11. Sonin, A.H., Pensy, R.A., Mulligan, M.E., and Hatem, S. Grading articular cartilage of the knee using fast spin-echo proton density-weighted MR imaging without fat suppression. AJR Am J Roentgenol. 2002; 179: 1159–1166
  12. McGibbon, C.A. and Trahan, C.A. Measurement accuracy of focal cartilage defects from MRI and correlation of MRI graded lesions with histology: a preliminary study. Osteoarthritis Cartilage. 2003; 11: 483–493
  13. Kijowski, R., Blankenbaker, D.G., Munoz Del Rio, A., Baer, G.S., and Graf, B.K. Evaluation of the articular cartilage of the knee joint: value of adding a T2 mapping sequence to a routine MR imaging protocol. Radiology. 2013; 267: 503–513
  14. Borjali, A., Chen, A., Muratoglu, O., Morid, M., and Varadarajan, K. Deep learning in orthopedics: how do we build trust in the machine? Healthcare Transformation. 2020
  15. Borjali, A., Langhorn, J., Monson, K., and Raeymaekers, B. Using a patterned microtexture to reduce polyethylene wear in metal-on-polyethylene prosthetic bearing couples. Wear. 2017; 392: 77–83
  16. Borjali, A., Chen, A., Bedair, H. Comparing performance of deep convolutional neural network with orthopaedic surgeons on identification of total hip prosthesis design from plain radiographs. medRxiv. 2020
  17. Xue, Y., Zhang, R., Deng, Y., Chen, K., and Jiang, T. A preliminary examination of the diagnostic value of deep learning in hip osteoarthritis. PLoS ONE. 2017; 12: e0178992
  18. Tiulpin, A., Thevenot, J., Rahtu, E., Lehenkari, P., and Saarakkala, S. Automatic knee osteoarthritis diagnosis from plain radiographs: a deep learning-based approach. Sci Rep. 2018; 8: 1727
  19. Borjali, A., Chen, A.F., Muratoglu, O.K., Morid, M.A., and Varadarajan, K.M. Detecting total hip replacement prosthesis design on plain radiographs using deep convolutional neural network. J Orthop Res. 2020; 38: 1465–1471
  20. Chollet, F. Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2017
  21. Deng, J., Dong, W., Socher, R. et al. ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009
  22. von Engelhardt, L.V., Lahner, M., and Klussmann, A. Arthroscopy vs. MRI for a detailed assessment of cartilage disease in osteoarthritis: diagnostic value of MRI in clinical practice. BMC Musculoskelet Disord. 2010; 11: 75
1Gergo Merkely M.D. and Alireza Borjali Ph.D. are both first authors.

 
