VIENNA - An artificial intelligence (AI) algorithm can accurately assess bone age on hand x-rays and performs interchangeably with experienced musculoskeletal radiologists, according to research presented on 14 July at ECR 2022.
Researchers from Washington University in the U.S. and Austrian AI software developer ImageBiopsy Lab tested a commercial AI software application on nearly 350 hand x-rays performed for bone age assessment. They found it yielded nearly 90% plate accuracy and was interchangeable with three expert readers.
"Clinical bone age assessment can benefit from a reliable AI automation which can be fully integrated into the reading workflow to support the clinicians," said presenter Dr. Matthew DiFranco of ImageBiopsy Lab.
Although bone age analysis of left-hand x-rays based on the Greulich and Pyle (G&P) guidelines remains a common reference standard for skeletal maturity assessment, the method is time-consuming for radiologists and prone to intra- and interreader variability, according to DiFranco.
ImageBiopsy Lab has developed Panda, a CE-marked AI software application that automates bone age estimates based on the G&P atlas. To assess the performance of the software, the researchers gathered 5,541 hand radiographs acquired for bone age assessment between 2011 and 2020 from multiple sites affiliated with Washington University. They then used stratified random sampling to select 345 bone age x-rays for the study, including boys ages 2-17 and girls ages 2-16. The sample included an equal number of patients per year of age.
The x-rays for these patients were read in a blinded manner by three pediatric radiologists with 6, 19, and 27 years of experience, respectively. For the purposes of the study, the ground truth was the mean of the three readers' estimates or, when any two initial reads differed by more than six months, the consensus of the three readers. The exams were subsequently processed by version 1.06 of the Panda software.
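The ground-truth rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as reported, not the study's actual code: the function names, the month-based units, and the idea of passing in a separately agreed consensus value are all assumptions.

```python
from statistics import mean
from typing import Optional, Sequence

# Threshold reported in the study: reads more than six months apart
# trigger a consensus read instead of a simple average.
DISAGREEMENT_THRESHOLD_MONTHS = 6.0


def needs_consensus(reads_months: Sequence[float]) -> bool:
    """True if any two reads differ by more than the threshold.

    The largest pairwise difference is max - min, so one comparison suffices.
    """
    return max(reads_months) - min(reads_months) > DISAGREEMENT_THRESHOLD_MONTHS


def ground_truth(reads_months: Sequence[float],
                 consensus_months: Optional[float] = None) -> float:
    """Return the reference bone age in months (hypothetical helper).

    Uses the readers' consensus when the initial reads disagree by more
    than six months, and the mean of the reads otherwise.
    """
    if needs_consensus(reads_months):
        if consensus_months is None:
            raise ValueError("reads disagree by > 6 months; consensus required")
        return consensus_months
    return mean(reads_months)
```

For example, hypothetical reads of 120, 122, and 124 months would average to 122 months, while reads of 120, 122, and 127 months would require a consensus value.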
Bone age assessment performance in comparison with ground truth

| Metric | Reader 1 | Reader 2 | Reader 3 | AI software |
|---|---|---|---|---|
| Mean absolute deviation | 5.67 months | 3.95 months | 4.94 months | 5.79 months |
| Root mean squared error | 9.26 months | 6.81 months | 7.78 months | 7.46 months |
In addition, the AI software produced plate accuracy (estimates within one G&P plate of the ground truth) of 89.7%. Its interchangeability, defined as the mean change in interreader differences when interchanging AI with a random qualified reader, was -5.8 months.
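For readers less familiar with the two error metrics reported for the readers and the AI software, the following Python sketch shows how mean absolute deviation and root mean squared error against a ground truth are conventionally computed. It assumes that "mean absolute deviation" here means the mean absolute difference between an estimate and the ground truth; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def mean_absolute_deviation(estimates: Sequence[float],
                            truth: Sequence[float]) -> float:
    """Average of |estimate - ground truth| across exams, in months."""
    if len(estimates) != len(truth):
        raise ValueError("estimates and truth must have the same length")
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


def root_mean_squared_error(estimates: Sequence[float],
                            truth: Sequence[float]) -> float:
    """Square root of the mean squared difference, in months.

    Squaring penalizes large misses more heavily than the absolute deviation.
    """
    if len(estimates) != len(truth):
        raise ValueError("estimates and truth must have the same length")
    squared = sum((e - t) ** 2 for e, t in zip(estimates, truth))
    return math.sqrt(squared / len(truth))


# Made-up bone ages in months, for illustration only.
estimates = [10.0, 20.0, 30.0]
truth = [12.0, 20.0, 26.0]
mad = mean_absolute_deviation(estimates, truth)
rmse = root_mean_squared_error(estimates, truth)
```

Because the squared term weights outliers more strongly, a reader or algorithm can have a higher mean absolute deviation than another yet a lower root mean squared error if it makes fewer large misses.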
"We saw good agreement between the AI and Greulich and Pyle ground truth on a U.S. cohort from the clinical routine," DiFranco said. "The plate accuracy was nearly 90%, suggesting that the AI software can aid experts in pointing to the right place in the atlas. Also, the AI software demonstrated interchangeability with expert readers."