One related publication that I thought would have fit well into the comment about disparate pain treatment is this one[1] from Ziad Obermeyer's group. They found that when they trained a model using patient pain and knee X-rays, much of the disparity in symptoms could be accounted for by findings from the X-rays themselves. It's a nice example of how using the patients' symptoms and objective data may actually outperform current medical standards, which fits with her participatory comments in the final paragraph.
> It's a nice example of where using the patients' symptoms and objective data may actually outperform current medical standards
I think you’re jumping the gun here; this paper was a hot topic when it was published. Combining patient symptoms with objective data is already the medical standard.
Note that:
1. KLG (Kellgren-Lawrence grade) is not a measure of pain but of radiographic OA severity.
2. KLG 3-4 is not a prerequisite for surgery.
From the article:
> While radiographic severity is not part of the formal guideline in allocations for arthroplasty (which only requires evidence of radiographic damage), empirically, patients with higher KLGs are more likely to receive surgery.
TKA (total knee arthroplasty) patients skew toward higher grades for many reasons, one being that studies have shown KLG 2 patients who undergo TKA are more likely to experience dissatisfaction (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8344222/).
There are a lot of “ifs” here: the paper did not examine whether KLG 1-2 but ALG-P 3-4 patients benefit from TKA over conservative management or other surgical interventions. It’s also unclear whether this selects patients for TKA any better than KLG 1-2 plus pain scores and other clinical variables.
All this shows is that KLG is a poor correlate for pain, which is already known and is not what the score is designed or used for.
I'm not a radiologist, so I could well be overinterpreting. If so, though, I am not sure that I am alone: this study, published in Nature Medicine, was hailed by radiologists as one of the "notable successes in using explainability methods to aid in the discovery of knowledge" [1].
Your sober assessment seems valuable, and would make for an interesting letter to the editor.
I'm not sure what a letter to the editor would accomplish. The Nature paper only interpreted radiographs, and the authors' only claim was basically that the model is a better predictor of pain than KLG.
Your comment misinterpreted this as “using the patients' symptoms and objective data” (when they only used objective data) and added “may actually outperform current medical standards”, which was not the claim: current medical standards already consider patient symptoms in addition to objective data, as stated in the article's reference to the TKA guideline.
When I report a joint X-ray, I’m not assessing the patient’s pain level; they can be asked that.
> Your comment misinterpreted this as “using the patients' symptoms and objective data” (when they only used objective data)
This represents an important misunderstanding of the methods of the paper. The model was trained using images (objective data) and the pain score (patients' symptoms). From the methods: "A convolutional neural network was trained to predict KOOS pain score for each knee using each X-ray image."
Also with respect to the author's claims, from the paper's abstract:
> Because algorithmic severity measures better capture underserved patients’ pain, and severity measures influence treatment decisions, algorithmic predictions could potentially redress disparities in access to treatments like arthroplasty.
You think I'm misinterpreting, but I still think the paper is more important than you're giving it credit for.
Inference on the validation set is X-ray -> pain score; it does not incorporate patient symptoms to make the prediction. In real life, a surgeon incorporates the X-ray plus the patient's symptoms/pain score.
You've skipped a step: the model still needs to be trained, and training requires the patient symptoms as the target for the weight updates. I think you simply misread my original comment.
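The distinction at issue can be made concrete with a toy sketch. This is not the paper's CNN: a plain linear model on synthetic, made-up "image features" stands in for it, purely to show the data flow. The pain scores (symptoms) are consumed only as the training target; at inference time, the image alone goes in and a predicted pain score comes out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: feature vectors standing in for X-ray images,
# and scores standing in for the patients' reported KOOS pain scores.
X_train = rng.normal(size=(100, 5))            # "X-ray" features
w_true = np.array([2.0, -1.0, 0.5, 0.0, 1.5])  # hidden image->pain relation
y_train = X_train @ w_true                     # "pain scores" (symptoms)

# Training: the symptoms ARE used here, as the regression target.
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Inference: only the image goes in; no symptom data is consumed.
x_new = rng.normal(size=5)
predicted_pain = x_new @ w
```

So both readings are compatible: the deployed predictor maps X-ray to pain score, but the mapping itself was fit against patient-reported symptoms.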
Perhaps I got lost, but I am discussing your original statement that “using the patients' symptoms and objective data may actually outperform current medical standards”, which relates to the model's predictions/inference, not its training.
In this context we are talking about a pain predictor from an X-ray, which is neat but not the point of KL grading.
The comparator, the “current medical standards” you reference, would be a model outperforming surgeon assessment in conjunction with radiographic findings, not the predictive value of KL grade alone.
[1] https://www.nature.com/articles/s41591-020-01192-7