Input representations and classification strategies for automated human gait analysis
Published in: | Gait & posture 2020-02, Vol.76, p.198-203 |
---|---|
Format: | Article |
Language: | English |
Summary: | • Aggregating data from several trials of one person is advantageous.
• Majority voting on classifier's predictions and mean waveforms performed best.
• Utilizing derived signal representations (e.g. signal differences) is advantageous.
• Combining original input signals and derived representations is advantageous.
Quantitative gait analysis produces a vast amount of data, which can be difficult to analyze. Automated gait classification based on machine learning techniques bears the potential to support clinicians in comprehending these complex data. Even though these techniques are already frequently used in the scientific community, there is no clear consensus on how the data need to be preprocessed and arranged to ensure optimal classification accuracy.
Is there a data aggregation and preprocessing workflow that maximizes classification accuracy?
Based on our previous work on automated classification of ground reaction force (GRF) data, a sequential setup was followed: firstly, several aggregation methods – early fusion and late fusion – were compared, and secondly, based on the best aggregation method identified, the expressiveness of different combinations of signal representations was investigated. The employed dataset included data from 910 subjects, with four gait disorder classes and one healthy control group. The machine learning pipeline comprised principal component analysis (PCA), z-standardization and a support vector machine (SVM).
The late fusion aggregation, i.e., utilizing majority voting on the classifier's predictions, performed best. In addition, the use of derived signal representations (relative changes and signal differences) seems to be advantageous as well.
Our results indicate that great caution is needed when data preprocessing and aggregation methods are selected, as these can have an impact on classification accuracies. These results shall serve future studies as a guideline for the choice of data aggregation and preprocessing techniques to be employed. |
---|---|
ISSN: | 0966-6362 1879-2219 |
DOI: | 10.1016/j.gaitpost.2019.10.021 |
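
The aggregation strategies and the derived representation named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made for illustration: the function names, the class labels, treating a mean waveform as an instance of early fusion, defining signal differences as differences between consecutive samples, and the tie-breaking rule in the vote. The classifier itself (PCA, z-standardization, SVM) is out of scope here; the per-trial predictions are simply given as input.

```python
from collections import Counter
from statistics import fmean


def mean_waveform(trials):
    """Early-fusion sketch (assumed): average equally long trials sample by sample."""
    if not trials:
        raise ValueError("at least one trial is required")
    length = len(trials[0])
    if any(len(trial) != length for trial in trials):
        raise ValueError("all trials must have the same number of samples")
    # zip(*trials) groups the i-th sample of every trial together.
    return [fmean(samples) for samples in zip(*trials)]


def majority_vote(predictions):
    """Late-fusion sketch: return the most frequent per-trial prediction.

    Assumption: ties go to the label that was seen first.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


def signal_differences(waveform):
    """Derived-representation sketch (assumed definition): consecutive differences."""
    return [b - a for a, b in zip(waveform, waveform[1:])]


if __name__ == "__main__":
    # Hypothetical data: three short trials of one person.
    trials = [[0.0, 0.5, 1.0], [0.2, 0.7, 1.2], [0.1, 0.6, 1.1]]
    print(mean_waveform(trials))
    print(signal_differences(trials[0]))
    # Hypothetical class labels predicted for each of the three trials.
    print(majority_vote(["disorder_a", "control", "disorder_a"]))
```

In this sketch, early fusion merges a person's trials before a classifier sees them, whereas late fusion classifies each trial separately and merges the resulting labels afterwards.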