Quantitative analysis of manual annotation of clinical text samples

Bibliographic Details
Published in:International journal of medical informatics (Shannon, Ireland), 2019-03, Vol.123, p.37-48
Main Authors: Miñarro-Giménez, Jose A., Cornet, Ronald, Jaulent, M.C., Dewenter, Heike, Thun, Sylvia, Gøeg, Kirstine Rosenbeck, Karlsson, Daniel, Schulz, Stefan
Format: Article
Language:English
Description
Summary:
•Significant correlation between concept coverage and the number of available concepts.
•Significant correlation between term coverage and the number of terms available.
•No significant differences in concept and term coverage between terminology settings.
•Low inter-annotator agreement coefficient independently of the terminology setting.
•Lack of correlation between inter-annotator agreement and concept or term coverage.

Semantic interoperability of eHealth services within and across countries has been the main topic in several research projects. It is a key consideration for the European Commission to overcome the complexity of making different health information systems work together. This paper describes a study within the EU-funded project ASSESS CT, which focuses on assessing the potential of SNOMED CT as core reference terminology for semantic interoperability at European level. This paper presents a quantitative analysis of the results obtained in ASSESS CT to determine the fitness of SNOMED CT for semantic interoperability. The quantitative analysis consists of concept coverage, term coverage and inter-annotator agreement analysis of the annotation experiments related to six European languages (English, Swedish, French, Dutch, German and Finnish) and three scenarios: (i) ADOPT, where only SNOMED CT was used by the annotators; (ii) ALTERNATIVE, where a fixed set of terminologies from UMLS, excluding SNOMED CT, was used; and (iii) ABSTAIN, where any terminologies available in the current national infrastructure of the annotators’ country were used. For each language and each scenario, we configured the different terminology settings of the annotation experiments. There was a positive correlation between the number of concepts in each terminology setting and their concept and term coverage values. Inter-annotator agreement is low, irrespective of the terminology setting.
No significant differences were found between the analyses for the three scenarios, but availability of SNOMED CT for the assessed language is associated with increased concept coverage. Terminology setting size and concept and term coverage correlate positively up to a limit where more concepts do not significantly impact the coverage values. The results did not confirm the hypothesis of an inverse correlation between concept coverage and IAA due to a lower amount of choices available. The overall low IAA results pose a challenge for interoperability and indicate the need for further r
ISSN:1386-5056, 1872-8243
DOI:10.1016/j.ijmedinf.2018.12.011