
Advanced Machine Learning for Early Detection of Heart Diseases through Audio Analysis
Phone-grade heart sound recordings fuel explainable ensembles for proactive cardiac screening.
This project builds a prevention ensemble and a clinical support ensemble for early cardiac screening on the Dangerous Heartbeat Dataset (CHSC2011). Heart sounds are resampled to 4 kHz, segmented into 1-second windows, described with MFCC, chroma, spectral and temporal descriptors, and reduced from 338 to 41 features via Spearman-based filters. The prevention ensemble keeps false normals under control (ROC-AUC 0.96, TPR 43.4% at 1% FPR), while the five-class support ensemble reaches a macro F1 of 81.6 and adds per-class risk analysis and SHAP explanations.
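The "TPR at 1% FPR" figure can be read as the highest true-positive rate the classifier reaches while its false-positive rate stays at or below 1%. The sketch below shows one way to compute such a metric from scores. It is not the project's evaluation code: the function name `tpr_at_fpr` is hypothetical, and it assumes that "normal" is the positive class, so that a false positive corresponds to a false normal.

```python
def tpr_at_fpr(y_true, scores, max_fpr=0.01):
    """Return the highest TPR reachable while FPR <= max_fpr.

    y_true: labels, 1 = positive (assumed here: "normal"), 0 = negative.
    scores: classifier scores, higher means more likely positive.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Sweep the decision threshold from the highest score downwards.
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(pairs):
        # Samples with identical scores cross the threshold together.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows from here, so stop the sweep
        best_tpr = tp / n_pos
        i = j
    return best_tpr


# Toy data, for illustration only (not results from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.1, 0.05]
strict = tpr_at_fpr(labels, scores, max_fpr=0.0)
relaxed = tpr_at_fpr(labels, scores, max_fpr=0.25)
```

Reporting the metric at several strict FPR limits, rather than only ROC-AUC, shows how much sensitivity remains once false normals are kept very rare.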
The study processes the Dangerous Heartbeat Dataset (CHSC2011) by resampling heterogeneous recordings to 4 kHz, slicing them into 1-second windows and extracting 338 temporal, spectral and cepstral descriptors (MFCC, chroma STFT, RMS, ZCR, CQT, spectral centroid/bandwidth/roll-off). A two-step Spearman filter removes weakly correlated and redundant attributes, reducing the set to 41 features. Two complementary pipelines are trained: a prevention ensemble (Random Forest + MLP Ultra + MLP Rollercoaster), evaluated on normal-vs-rest performance at strict FPR thresholds, and a five-class diagnostic ensemble (Random Forest + MLP Ultra) that balances macro F1, MCC and per-class risk. SHAP values and permutation importance explain the decisions and highlight the waveform regions that drive predictions.
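A two-step Spearman filter of the kind described above might look like the following sketch. It rests on several assumptions that the text does not confirm: "weakly correlated" is taken to mean weak rank correlation with the label, the label is numeric (for example a binary normal-vs-rest target), the redundancy step greedily keeps the more relevant feature of each highly correlated pair, and the thresholds `min_relevance` and `max_redundancy` are placeholder values. All function names are hypothetical. The code uses only the standard library; in practice `scipy.stats.spearmanr` would be the usual choice.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no rank information
    return cov / (var_x * var_y) ** 0.5


def spearman_filter(features, target, min_relevance=0.1, max_redundancy=0.9):
    """features: dict of name -> list of values; target: numeric labels.

    Thresholds are assumed placeholders, not values from the study.
    """
    # Step 1: drop features whose |rho| with the target is too low.
    relevance = {n: abs(spearman(col, target)) for n, col in features.items()}
    candidates = [n for n in relevance if relevance[n] >= min_relevance]

    # Step 2: visit features from most to least relevant and keep one only
    # if it is not too strongly correlated with a feature already kept.
    candidates.sort(key=lambda n: -relevance[n])
    kept = []
    for name in candidates:
        if all(abs(spearman(features[name], features[k])) < max_redundancy
               for k in kept):
            kept.append(name)
    return kept


# Toy data, for illustration only.
toy = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 13],   # same ordering as "a", so redundant
    "c": [3, 1, 4, 1, 5, 2],     # no rank relationship with the target
}
selected = spearman_filter(toy, target=[0, 0, 0, 1, 1, 1])
```

On the toy data, "c" falls out in the relevance step and "b" in the redundancy step, leaving only "a".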
This project is licensed under the MIT License. Feel free to use, modify, and distribute the code as per the terms of the license.