An examination of the reliability of a classification algorithm for subgrouping patients with low back pain
Published in: | Spine (Philadelphia, Pa. 1976), 2006, Vol.31 (1), p.77-82 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Test-retest design to examine interrater reliability.
Examine the interrater reliability of individual examination items and a classification decision-making algorithm using physical therapists with varying levels of experience.
Classifying patients based on clusters of examination findings has shown promise for improving outcomes. Examining the reliability of examination items and the classification decision-making algorithm may improve the reproducibility of classification methods.
Patients participating in a randomized trial who had low back pain of less than 90 days in duration were examined on separate days by different examiners. Interrater reliability of the individual examination items important for classification was examined in clinically stable patients using kappa coefficients and intraclass correlation coefficients. Clinicians with varying amounts of experience then used the findings from the first examination to classify each patient with the decision-making algorithm. The reliability of the classification algorithm was examined with kappa coefficients.
A total of 123 patients participated (mean age 37.7 [±10.7] years, 44% female); 60 (49%) remained stable between examinations. Reliability of range of motion, centralization/peripheralization judgments with flexion and extension, and the instability test was moderate to excellent. Reliability of centralization/peripheralization judgments with repeated or sustained extension and of aberrant movement judgments was fair to poor. Overall agreement on classification decisions was 76% (kappa = 0.60, 95% confidence interval 0.56 to 0.64), with no significant differences based on level of experience.
Reliability of the classification algorithm was good. Further research is needed to identify sources of disagreements and improve reproducibility. |
---|---|
ISSN: | 0362-2436 1528-1159 |
DOI: | 10.1097/01.brs.0000193898.14803.8a |
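
The summary reports agreement as both a raw percentage (76%) and a kappa coefficient (0.60). As a rough illustration of how the two relate, the sketch below computes a two-rater Cohen's kappa, which corrects observed agreement for the agreement expected by chance. It assumes the simple two-rater form; the record does not say which kappa variant the study used, and the category labels and ratings are hypothetical placeholders, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Two-rater Cohen's kappa for one category per patient (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of patients placed in the same category by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


def implied_chance_agreement(p_o, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for the chance agreement p_e."""
    return (p_o - kappa) / (1 - kappa)


# Hypothetical classifications of ten patients by two examiners (not study data).
rater_a = ["A", "A", "B", "B", "C", "C", "A", "B", "C", "A"]
rater_b = ["A", "A", "B", "C", "C", "C", "A", "B", "B", "A"]
example_kappa = cohen_kappa(rater_a, rater_b)

# Chance agreement implied by the reported 76% agreement and kappa of 0.60,
# under the two-rater assumption stated above.
reported_p_e = implied_chance_agreement(0.76, 0.60)
```

Under that assumption, the reported figures imply a chance agreement of roughly 0.40, which is why 76% raw agreement corresponds to a kappa of only 0.60.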