Papers
arxiv:2309.02244

Augmenting Chest X-ray Datasets with Non-Expert Annotations

Published on Sep 5, 2023
Authors:
,
,
,
,

Abstract

Non-expert annotations of chest drain tubes in X-rays were collected to enhance existing datasets, showing good agreement with expert labels and highlighting the importance of ground truth annotation quality.

AI-generated summary

The advancement of machine learning algorithms in medical image analysis requires the expansion of training datasets. A popular and cost-effective approach is automated annotation extraction from free-text medical reports, primarily due to the high costs associated with expert clinicians annotating medical images, such as chest X-rays. However, it has been shown that the resulting datasets are susceptible to biases and shortcuts. Another strategy to increase the size of a dataset is crowdsourcing, a widely adopted practice in general computer vision with some success in medical image analysis. In a similar vein to crowdsourcing, we enhance two publicly available chest X-ray datasets by incorporating non-expert annotations. However, instead of using diagnostic labels, we annotate shortcuts in the form of tubes. We collect 3.5k chest drain annotations for NIH-CXR14, and 1k annotations for four different tube types in PadChest, and create the Non-Expert Annotations of Tubes in X-rays (NEATX) dataset. We train a chest drain detector with the non-expert annotations that generalizes well to expert labels. Moreover, we compare our annotations to those provided by experts and show "moderate" to "almost perfect" agreement. Finally, we present a pathology agreement study to raise awareness about the quality of ground truth annotations. We make our dataset available on Zenodo at https://zenodo.org/records/14944064 and our code available at https://github.com/purrlab/chestxr-label-reliability.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2309.02244 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2309.02244 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2309.02244 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.