Breaking News in Technology & Business – Tech Geekwire

    Explainable AI for Voice Pathology Detection Using Mel Spectrograms and Deep Learning

    By techgeekwire, June 1, 2025

    Voice disorders result from various pathological processes caused by anatomical, functional, or paralytic factors, affecting voice production across physiological, auditory, aerodynamic, acoustic, and perceptual aspects. The main categories of voice pathology include hyperkinetic dysphonia, hypokinetic dysphonia, and other conditions such as reflux laryngitis, vocal fold nodules, and vocal fold paralysis.

    Methodology

    The study used the VOice ICar fEDerico II (VOICED) dataset, which contains 208 voice recordings collected during a clinical study: 150 pathological voices and 58 healthy ones. The voices were recorded with a Samsung Galaxy S4 mobile device, sampled at 8000 Hz with 32-bit resolution, and saved in .txt format.

    Each 5-second recording of the vowel /a/ was divided into 250 ms segments with a 125 ms overlap, generating 36 images per sound. Each segment was then converted into a Mel spectrogram, a time-frequency representation of the audio signal on a perceptually motivated frequency scale. The study classified these images with pre-trained CNNs (OpenL3, YAMNet, VGGish), using transfer learning and 5-fold cross-validation.
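
The segmentation step can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names and the all-zero placeholder signal are assumptions, and the Mel conversion uses the common HTK-style formula, which the article does not confirm as the one the authors used.

```python
import math


def segment_signal(samples, sample_rate=8000, win_ms=250, hop_ms=125):
    """Split a 1-D sequence into fixed-length windows that overlap by win - hop."""
    win = int(sample_rate * win_ms / 1000)  # 2000 samples at 8 kHz
    hop = int(sample_rate * hop_ms / 1000)  # 1000 samples at 8 kHz
    segments = []
    start = 0
    # Keep only full windows; a trailing partial window is dropped.
    while start + win <= len(samples):
        segments.append(samples[start:start + win])
        start += hop
    return segments


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the Mel scale (HTK-style formula, assumed)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


# Placeholder signal: exactly 5 s of silence at 8 kHz (hypothetical input).
signal = [0.0] * (5 * 8000)
segments = segment_signal(signal)
print(len(segments), len(segments[0]))
print(round(hz_to_mel(4000.0), 1))  # Mel value at the Nyquist frequency
```

Under these assumptions, a recording of exactly 5 s yields 39 full windows, while 36 windows correspond to roughly 4.6 to 4.7 s of audio. The article does not say how the recordings were trimmed or how the final window was handled, so the difference from the reported 36 images is left unexplained here.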

    Results

    The OpenL3 network achieved the highest accuracy at 99.44%, outperforming VGGish (95.34%) and YAMNet (94.36%). Classification performance was evaluated using accuracy, precision, recall (sensitivity), and the area under the ROC curve (AUC). The confusion matrices and ROC curves showed high classification performance across all eight classes: Glottic Insufficiency, Hyperkinetic Dysphonia, Hypokinetic Dysphonia, Prolapse, Reflux Laryngitis, Vocal Fold Nodules, Vocal Fold Paralysis, and Healthy.
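
For readers unfamiliar with how these metrics follow from a confusion matrix, here is a small sketch. The 3-class matrix is made-up toy data, not the study's results, and `per_class_metrics` is a hypothetical helper name.

```python
def per_class_metrics(confusion):
    """Compute overall accuracy plus per-class precision and recall.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    accuracy = correct / total if total else 0.0

    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp     # predicted k, true other
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        metrics.append({"precision": precision, "recall": recall})
    return accuracy, metrics


# Toy 3-class confusion matrix (illustrative values only).
toy = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
accuracy, metrics = per_class_metrics(toy)
print(round(accuracy, 4))
for k, m in enumerate(metrics):
    print(k, round(m["precision"], 2), round(m["recall"], 2))
```

With an imbalanced dataset such as 150 pathological versus 58 healthy recordings, per-class precision and recall are usually more informative than overall accuracy alone.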

    Explainability Analysis

    The study used Occlusion Sensitivity, an explainable AI (XAI) technique, to understand the behavior of the deep neural network. The XAI maps revealed that the model relied primarily on specific frequency bands to classify voice pathologies. For instance, Hyperkinetic Dysphonia was characterized by dominant frequency bands around 700 Hz and 100 Hz, while Hypokinetic Dysphonia showed bands over 200 Hz and 900 Hz. The XAI maps averaged over all correctly classified images highlighted the regions the model relied on most, and these regions differed between classes.
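
The idea behind occlusion sensitivity is to mask one patch of the input at a time and record how much the class score drops. The sketch below shows the general technique on a tiny 4x4 "spectrogram"; it is not the study's implementation. `toy_score` is a hypothetical stand-in for a trained classifier, and the patch size, stride, and fill value are arbitrary choices.

```python
def occlusion_sensitivity(image, score_fn, patch=2, stride=2, fill=0.0):
    """Return a heat map of score drops caused by masking each patch.

    image is a list of rows (frequency bins) by columns (time frames);
    score_fn maps such an image to a scalar class score.
    """
    rows, cols = len(image), len(image[0])
    baseline = score_fn(image)
    heat = [[0.0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]

    for r0 in range(0, rows - patch + 1, stride):
        for c0 in range(0, cols - patch + 1, stride):
            occluded = [row[:] for row in image]  # copy, then mask one patch
            for r in range(r0, r0 + patch):
                for c in range(c0, c0 + patch):
                    occluded[r][c] = fill
            drop = baseline - score_fn(occluded)
            for r in range(r0, r0 + patch):
                for c in range(c0, c0 + patch):
                    heat[r][c] += drop
                    counts[r][c] += 1

    # Average the drops where patches overlapped; untouched cells stay at 0.
    return [
        [heat[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(cols)]
        for r in range(rows)
    ]


def toy_score(spec):
    """Hypothetical classifier score that depends only on the energy in row 1."""
    return sum(spec[1])


spec = [[1.0] * 4 for _ in range(4)]
heat = occlusion_sensitivity(spec, toy_score)
for row in heat:
    print(row)
```

In this toy case the map is high for rows 0 and 1 and zero for rows 2 and 3: the patch covering row 1 also covers row 0, so the resolution of the map is limited by the patch size. Averaging such maps over many correctly classified images, as the study describes, shows which frequency regions matter for each class.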

    Conclusion

    This research demonstrates the effectiveness of using Mel spectrograms and deep learning for voice pathology detection, achieving high accuracy with the OpenL3 network. The application of XAI techniques provided insights into the model’s decision-making process, showing that specific frequency bands are crucial for distinguishing between different pathologies. The proposed system has the potential to be used as a support tool for specialists, particularly in telemedicine applications. Future work could expand to include other vowels and pathologies, enhancing the system’s versatility and diagnostic capabilities.
