The Future of Education

Edition 15

Accepted Abstracts

Unsupervised Clustering Approaches for Autism Screening: Achieving 95.31% Accuracy with a Gaussian Mixture Model

Nora Fink, Co-CEO ever-growing GmbH, Independent Researcher Dyslexia99 (Germany)

Abstract

Autism spectrum disorder (ASD) remains difficult to diagnose promptly and reliably, despite global efforts in public health, clinical screening, and scientific research (1). Traditional diagnostic methods that rely primarily on supervised learning presuppose the availability of labeled data, which are time-consuming and resource-intensive to obtain (2). Unsupervised learning, in contrast, can extract structure from unlabeled datasets and thereby expedite or support the diagnostic process (3). This paper applies four unsupervised clustering algorithms (K-Means, Gaussian Mixture Model (GMM), Agglomerative Clustering, and DBSCAN) to a publicly available dataset of 704 adults screened for ASD. After extensive hyperparameter tuning via cross-validation, the Gaussian Mixture Model achieved the highest clustering-to-label accuracy (95.31%) when its clusters were mapped to the original ASD/NO classification (4). Additional performance metrics included the Adjusted Rand Index (ARI), which measures agreement between cluster assignments and the original labels, and silhouette scores, which indicate the internal coherence of the clusters. The dataset was preprocessed by data cleaning, label encoding of categorical features, and standard scaling, and the four clustering methods were then assessed and compared under cross-validation (5). These results suggest that unsupervised methods can assist ASD screening, especially where labeled data are sparse, uncertain, or prohibitively expensive to obtain. With continued methodological refinement, unsupervised approaches could augment early detection initiatives and help direct resources to individuals at high risk.
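The clustering-to-label accuracy mentioned in the abstract can be illustrated with a small sketch. The abstract does not state which mapping rule was used, so the example below assumes a simple majority vote: each cluster is assigned the most frequent original label among its members, and accuracy is the share of individuals whose mapped label matches the original one. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def map_clusters_to_labels(cluster_ids, true_labels):
    """Assign each cluster the most common original label among its members.

    Assumption: majority-vote mapping; the study may have used another rule.
    """
    counts_per_cluster = {}
    for cluster_id, label in zip(cluster_ids, true_labels):
        counts_per_cluster.setdefault(cluster_id, Counter())[label] += 1
    return {
        cluster_id: counts.most_common(1)[0][0]
        for cluster_id, counts in counts_per_cluster.items()
    }


def mapped_accuracy(cluster_ids, true_labels):
    """Fraction of samples whose mapped cluster label equals the original label."""
    mapping = map_clusters_to_labels(cluster_ids, true_labels)
    hits = sum(mapping[c] == y for c, y in zip(cluster_ids, true_labels))
    return hits / len(true_labels)


# Toy example (not the study's data): eight individuals, two clusters.
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 0]
true_labels = ["NO", "NO", "ASD", "ASD", "ASD", "ASD", "NO", "NO"]

print(map_clusters_to_labels(cluster_ids, true_labels))  # {0: 'NO', 1: 'ASD'}
print(mapped_accuracy(cluster_ids, true_labels))         # 0.75
```

In the toy data, one individual in each cluster carries the minority label, so six of eight assignments agree with the original classification.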

Keywords: Autism spectrum disorder, unsupervised learning, Gaussian Mixture Model, K-Means, Agglomerative Clustering, DBSCAN, cross-validation, hyperparameter tuning, cluster-to-label mapping, public health.
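The comparison described in the abstract (standard scaling, four clustering algorithms, ARI and silhouette scores) might be sketched as follows with scikit-learn. This is an illustration under stated assumptions, not the study's pipeline: the data are synthetic two-dimensional blobs standing in for the screening dataset, label encoding is omitted because the synthetic features are already numeric, and all hyperparameter values are illustrative rather than the tuned values from the study. The cross-validated tuning step is not reproduced here.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the screening data (assumption: two well-separated groups).
X, y = make_blobs(
    n_samples=300, centers=[[-5, -5], [5, 5]], cluster_std=1.0, random_state=0
)
X = StandardScaler().fit_transform(X)  # standard scaling, as in the abstract

# Illustrative hyperparameters, not the values tuned in the study.
models = {
    "K-Means": KMeans(n_clusters=2, n_init=10, random_state=0),
    "GMM": GaussianMixture(n_components=2, random_state=0),
    "Agglomerative": AgglomerativeClustering(n_clusters=2),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    n_groups = len(set(labels))  # for DBSCAN, noise (-1) counts as a group here
    results[name] = {
        # ARI: agreement between cluster assignments and the original labels.
        "ari": adjusted_rand_score(y, labels),
        # Silhouette is only defined for 2 to n_samples - 1 distinct labels.
        "silhouette": (
            silhouette_score(X, labels) if 1 < n_groups < len(X) else float("nan")
        ),
    }
    print(
        f"{name:14s} ARI={results[name]['ari']:.3f} "
        f"silhouette={results[name]['silhouette']:.3f}"
    )
```

ARI requires the original labels, whereas the silhouette score uses only the features and the cluster assignments, so the latter remains available when no labels exist.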

 

REFERENCES

(1) World Health Organization. Autism spectrum disorders. WHO Fact Sheets. 2020.

(2) Bishop DV. Developmental cognitive neuropsychology and DSM-5 reviews. Taylor & Francis, 2013.

(3) Xu R, Wunsch D. Clustering algorithms in biomedical research: A review. IEEE Reviews in Biomedical Engineering. 2010.

(4) Mann HB. Early machine learning applications in autism diagnosis. Journal of Medical Systems. 2012.

(5) Murphy KP. Machine Learning: A Probabilistic Perspective. MIT Press, 2012.

(6) Lauraitis L, Marković S. Clinical approaches to higher functioning ASD: Data analysis. Neuropsychiatry. 2018.

(7) Mandic S, Krpan D. Revisiting symptom classification in ASD: Hierarchical perspectives. Psychological Assessment. 2019.

(8) Ester M, Kriegel HP, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise (DBSCAN). KDD Proceedings. 1996.

(9) Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. Springer, 2009.

(10) Wang L, Pan J. Cluster-to-label mapping in healthcare analytics. ACM Transactions on Knowledge Discovery from Data. 2020.

(11) McLachlan G, Peel D. Finite Mixture Models. Wiley, 2000.

(12) Newell A, Rosenbloom PS. Mechanisms of skill acquisition and the law of practice. Cognitive Skills and Their Acquisition. 1981.

(13) Duda RO, Hart PE, Stork DG. Pattern Classification. 2nd ed. Wiley, 2001.

(14) MacQueen J. Some methods for classification and analysis of multivariate observations. Berkeley Symposium on Mathematical Statistics and Probability. 1967.

(15) Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011.

(16) Van Rossum G. The Python Language Reference. Python Software Foundation. 2020.

(17) Wu X, Kumar V. The Top Ten Algorithms in Data Mining. Chapman & Hall, 2009.

(18) Johnson PR, Palmer RB. A review on unsupervised approaches for ASD detection. Medical AI Journal. 2021.

(19) Thabtah F. Machine learning in autistic spectrum disorder behavioral research: A review. Informatics for Health and Social Care. 2017.

(20) Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters. 2006.

 
