
Dealing with confounders and outliers in classification medical studies: The Autism Spectrum Disorders case study

Academic Article
Publication Date:
2020
Short description:
Dealing with confounders and outliers in classification medical studies: The Autism Spectrum Disorders case study / Ferrari, E., Bosco, P., Calderoni, S., Oliva, P., Palumbo, L., Spera, G., Fantacci, M.E., Retico, A.. - In: ARTIFICIAL INTELLIGENCE IN MEDICINE. - ISSN 0933-3657. - 108:(2020), p. 101926. [10.1016/j.artmed.2020.101926]
abstract:
Machine learning (ML) approaches have been widely applied to medical data to find reliable classifiers that improve diagnosis and to detect candidate biomarkers of a disease. However, as a powerful, multivariate, data-driven approach, ML can be misled by biases and outliers in the training set and find sample-dependent classification patterns. This phenomenon often occurs in biomedical applications, where the scarcity of the data, combined with their heterogeneous nature and complex acquisition process, makes outliers and biases very common. In this work we present a new workflow for biomedical research based on ML approaches that maximizes the generalizability of the classification. The workflow relies on two data selection tools: an autoencoder, to identify the outliers, and the Confounding Index, to understand which characteristics of the sample can mislead classification. As a case study we adopt the controversial research on extracting brain structural biomarkers of Autism Spectrum Disorders (ASD) from magnetic resonance images. A classifier trained on a dataset of 86 subjects, selected using this framework, obtained an area under the receiver operating characteristic curve of 0.79. The feature pattern identified by this classifier is still able to capture the mean differences between the ASD and Typically Developing Control classes on 1460 new subjects in the same age range as the training set, thus providing new insights into the brain characteristics of ASD. We show that the proposed workflow makes it possible to find generalizable patterns even if the dataset is limited, whereas skipping the two steps above and using a larger but poorly designed training set would have produced a sample-dependent classifier.
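The outlier-selection idea in the abstract (flagging samples that an autoencoder reconstructs poorly) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear encoder/decoder obtained from an SVD as a stand-in for a trained autoencoder, uses synthetic data, and the function names, the median-plus-MAD threshold, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def reconstruction_errors(X, n_components):
    """Per-sample squared reconstruction error of a rank-k linear encoder/decoder.

    Assumption: a linear, tied-weight "autoencoder" built from the top-k
    principal directions, standing in for a trained neural autoencoder.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions; the top k serve as encoder weights.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]
    codes = Xc @ W.T   # encode into the k-dimensional bottleneck
    recon = codes @ W  # decode back to feature space
    return np.sum((Xc - recon) ** 2, axis=1)


def flag_outliers(errors, n_mads=5.0):
    """Flag samples whose error exceeds median + n_mads * MAD (hypothetical rule)."""
    med = np.median(errors)
    mad = np.median(np.abs(errors - med))
    return errors > med + n_mads * mad


# Synthetic data (not from the study): 100 samples near a 2-D subspace
# of a 10-D feature space, with three samples pushed off the subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(100, 10))
X[:3] += rng.normal(scale=3.0, size=(3, 10))

errors = reconstruction_errors(X, n_components=2)
mask = flag_outliers(errors)
print("flagged sample indices:", np.flatnonzero(mask))
```

Samples flagged this way would be candidates for exclusion before training the classifier; the threshold rule determines how aggressive that selection is and would need to be chosen for the data at hand.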
Iris type:
1.1 Journal article
Keywords:
Autism Spectrum Disorders; Autoencoder; Confounders; Confounding Index; Machine learning; MRI; Outliers; Reproducibility
List of contributors:
Ferrari, E.; Bosco, P.; Calderoni, S.; Oliva, P.; Palumbo, L.; Spera, G.; Fantacci, M. E.; Retico, A.
Handle:
https://iris.uniss.it/handle/11388/240747
Published in:
ARTIFICIAL INTELLIGENCE IN MEDICINE
Journal