
A Multimodal Deep Learning Framework for Dyslexia Detection using EEG, Eye-Tracking, and Reading Audio

DOI : 10.17577/IJERTV14IS120307

Rukesh Kumar S

Department of AIDS, Panimalar Engineering College, Chennai, India

ShakthiMurugan R

Department of AIDS, Panimalar Engineering College, Chennai, India

Deepak Kumar K

Department of AIDS, Panimalar Engineering College, Chennai, India

Abstract – Dyslexia affects millions of children and adults worldwide, making reading a daily struggle that impacts education, career prospects, and self-confidence. Traditional diagnosis methods rely heavily on subjective assessments and can take months to complete, often leaving individuals without proper support during critical learning periods. This research presents an innovative approach that combines three different types of biological signals – brain waves (EEG), eye movements, and voice patterns – to create a comprehensive picture of how people with dyslexia process written information differently. Our multimodal deep learning system analyses these signals simultaneously while participants read text aloud, capturing the complex interplay between neural processing, visual attention, and speech production that characterizes dyslexic reading patterns. The framework uses separate specialized neural networks for each signal type, then intelligently combines their insights through an advanced fusion mechanism that weighs each modality’s contribution based on its reliability and relevance. Testing our system on 487 participants, including 243 individuals with confirmed dyslexia and 244 typical readers, we achieved remarkable results: 94.7% accuracy in detecting dyslexia, with 93.8% sensitivity and 95.6% specificity. These results significantly outperform traditional single-modality approaches and existing screening tools. The system processes all three data streams in real-time, providing results within minutes rather than the weeks typically required for comprehensive dyslexia assessments. This breakthrough could revolutionize early detection, enabling timely interventions that can dramatically improve learning outcomes and quality of life for individuals with dyslexia worldwide.

Keywords – dyslexia, multimodal, EEG, eye-tracking, speech, prosody, deep-learning, fusion, real-time, screening, detection, accuracy, sensitivity, specificity, reading.

  1. INTRODUCTION

    Reading is something most people take for granted – the ability to look at squiggly lines on a page and instantly transform them into meaningful words, sentences, and ideas. For individuals with dyslexia, however, this seemingly automatic process becomes an exhausting mental marathon that requires enormous effort and concentration. Dyslexia affects approximately 10-15% of the global population, making it one of the most common learning differences, yet it remains poorly understood and frequently misdiagnosed or overlooked entirely.

    The impact of dyslexia extends far beyond classroom struggles. Children with undiagnosed dyslexia often develop feelings of inadequacy and frustration as they watch their peers effortlessly master reading skills that seem impossibly difficult for them. They may be labelled as lazy, unmotivated, or less intelligent, when in reality their brains simply process written language differently. These early experiences can have lasting effects on self-esteem, academic achievement, and career choices. Adults with dyslexia continue to face challenges in professional settings where reading speed and accuracy are valued, despite often possessing exceptional creative, problem-solving, and spatial reasoning abilities.

    Traditional dyslexia diagnosis has remained remarkably unchanged for decades, relying primarily on standardized reading tests, cognitive assessments, and subjective observations by educational psychologists or learning specialists. The process typically begins when a child struggles noticeably with reading compared to peers, often after months or years of academic difficulties. Parents or teachers initiate referrals for evaluation, which can involve lengthy waiting lists and multiple appointments spanning several months.

    The diagnostic process itself presents numerous challenges. Standardized tests may not capture the full spectrum of reading difficulties, particularly in individuals who have developed effective coping strategies that mask their underlying challenges. Cultural and linguistic factors can also influence test performance, potentially leading to misdiagnosis or missed diagnoses in diverse populations. Furthermore, the subjective nature of many assessment tools means that different evaluators might reach different conclusions about the same individual.
    Early identification of dyslexia is crucial because intervention effectiveness decreases significantly with age. Research consistently shows that children who receive appropriate support before third grade have much better long-term outcomes than those diagnosed later. However, current diagnostic timelines often mean children don’t receive help until they’ve already experienced years of reading failure, by which time negative self-perceptions and academic gaps have become deeply entrenched.

    Electroencephalography, commonly known as EEG, offers a window into these neural differences through its ability to measure electrical activity in the brain with millisecond precision. When people read, their brains generate distinctive patterns of electrical activity that reflect various cognitive processes: recognizing letter shapes, connecting letters to sounds, accessing word meanings, and integrating information across sentences. Research has shown that individuals with dyslexia exhibit different EEG patterns during reading tasks, particularly in the timing and intensity of responses to written words.

    Eye-tracking technology provides another valuable perspective on reading differences. The way our eyes move across text reveals sophisticated information about reading processes that occur below the level of conscious awareness. Typical readers develop highly efficient eye movement patterns, making quick, precise jumps called saccades between fixation points where visual processing occurs. Their eyes spend appropriate amounts of time on each word, rarely need to backtrack, and smoothly coordinate with comprehension processes.

    However, effectively combining information from these diverse data sources presents significant technical challenges. EEG signals are continuous electrical measurements with complex temporal dynamics. Eye-tracking data consists of precise spatial coordinates and timing information. Audio signals contain rich spectral and temporal patterns. Each modality has different sampling rates, noise characteristics, and information content, making integration non-trivial. Deep learning, a subset of artificial intelligence inspired by how the human brain processes information, offers powerful tools for addressing these multimodal integration challenges.
    Unlike traditional statistical approaches that require researchers to manually specify which features are important, deep learning systems can automatically discover relevant patterns in complex, high-dimensional data. They can learn to identify subtle relationships between different types of signals and combine information optimally for diagnostic purposes.

    The potential impact of successful multimodal dyslexia detection extends far beyond improved diagnosis. Early, accurate identification could enable personalized intervention selection based on individual neural and behavioural profiles. Teachers could receive objective information about students’ reading processes to guide instructional decisions. Parents could access reliable screening tools that provide peace of mind or prompt appropriate professional consultation. Researchers could use these tools to better understand dyslexia subtypes and develop more targeted treatments. Furthermore, objective biological markers could help reduce stigma associated with dyslexia by clearly demonstrating that reading difficulties stem from neurobiological differences rather than lack of effort or intelligence. This scientific validation could promote greater acceptance and support for individuals with dyslexia in educational and professional settings.

    Our approach goes beyond simply combining existing techniques; instead, we have carefully designed each component to maximize information extraction from biological signals while ensuring practical feasibility for real-world deployment. The following sections detail our methodology and demonstrate how multimodal deep learning can transform dyslexia detection from a lengthy, subjective process into a rapid, objective, and highly accurate assessment tool.

  2. LITERATURE SURVEY

    Recent advances have shown that neurophysiological signals can enhance the objectivity and reliability of dyslexia diagnosis. Ortiz et al. [1] introduced an anomaly detection model using EEG temporal and spectral descriptors, analyzing connectivity models and frequency bands related to auditory processing. Their system effectively differentiated dyslexic individuals from typical readers with sensitivities between 0.6 and 0.9, demonstrating EEG's potential as a biologically grounded diagnostic tool complementing behavioral assessments.

    Expanding on this, Vaitheeshwari et al. [2] proposed a multimodal fusion framework combining EEG, eye movements, and voice patterns within a virtual reality (VR) setup. Automatic feature extraction from these modalities enabled accurate dyslexia identification even with small datasets, offering improved diagnostic efficiency. This approach emphasized how dyslexia manifests across cognitive and perceptual domains, supporting more holistic and personalized interventions.

    Duffy et al. [3] provided foundational evidence that brain electrical activity, including EEG and evoked potentials, can distinguish dyslexic individuals with 80-90% accuracy, revealing hemispheric abnormalities detectable under resting and task conditions. Their study validated the feasibility of rapid, objective, and biologically based screening methods. Similarly, Boroujeni et al. [4] explored non-linear EEG metrics (Lyapunov exponent, fractal dimension, and entropy) to characterize complex brain dynamics. These measures effectively differentiated typical and atypical neural activity, reinforcing the potential of combining EEG with other biological signals for early, objective assessment of disorders like dyslexia and ADHD.

    Gupte et al. [5] extended this concept using EEG, eye movement, and speech signal analysis integrated through machine learning models for early ADHD and dyslexia detection. Their approach addressed the limitations of slow and subjective traditional tests by enabling faster, data-driven identification of learning disabilities. Gancio et al. [6] demonstrated that permutation entropy (PE) analysis of EEG could robustly distinguish brain states despite noise and motion artifacts, achieving classification accuracies up to 75%. They noted that subject-specific training enhances precision, suggesting that personalized EEG models may yield more reliable diagnostic outcomes.

    Machine learning approaches further improved efficiency and accuracy. Parmar et al. [7] utilized support vector machines (SVM) with non-linear Gaussian (RBF) kernels for EEG-based dyslexia detection, achieving a 62.4% classification accuracy for no-speech stimuli. They found that focusing on relevant EEG channels reduced electrode requirements, highlighting the practicality of optimized multimodal acquisition combining EEG, eye-tracking, and speech features. Dushanova and Tsokov [8] examined EEG-based functional connectivity in dyslexic children using small-world network analysis before and after visual training. They observed that post-training EEG networks became more efficient and similar to typical readers, demonstrating that targeted interventions can induce measurable neuroplasticity improvements.

    Thoppil et al. [9] emphasized speech signal analysis in diagnosing dysarthria (relevant for understanding phonological deficits in dyslexia) through features such as pitch contour and formant variation. Their automated analysis reliably identified impairment types, suggesting that speech-based metrics can complement EEG and eye-tracking in comprehensive dyslexia assessment. Zhang et al. [10] integrated EEG and eye-tracking to capture how dyslexic readers process written text, proving that multimodal biological signals can offer faster, more accurate, and objective detection compared to traditional behavioral methods. Collectively, these studies highlight a consistent trend toward combining neural, ocular, and vocal biomarkers for robust, multidimensional understanding of dyslexia.

  3. METHODOLOGY
      1. Overall System Architecture

        Our multimodal dyslexia detection framework represents a sophisticated fusion of cutting-edge technology and deep understanding of how reading works in the human brain. Think of it as creating a comprehensive health check-up for reading ability, where instead of measuring blood pressure and heart rate, we’re measuring brain waves, eye movements, and speech patterns simultaneously. The system works like a three-person medical team where each specialist focuses on their area of expertise, then they come together to reach a unified diagnosis.

        The architecture consists of four main components working in concert. First, we have three specialized data collection modules – one for capturing brain electrical activity through EEG, another for tracking precise eye movements, and a third for recording and analysing speech during reading. Each module is like having a different type of expert observer watching the same reading session but focusing on completely different aspects of what’s happening.

        The second component involves three separate deep learning networks, each designed specifically for one type of data. These aren’t generic AI systems; they’re custom-built specialists that understand the unique characteristics of brain signals, eye movements, and speech patterns. Just as you wouldn’t ask a cardiologist to perform brain surgery, we don’t ask our EEG network to analyse audio signals or our eye-tracking network to interpret brain waves.

        The third component is our innovative fusion mechanism, which acts like the head physician who listens to reports from all three specialists and weighs their opinions to reach the best possible diagnosis. This isn’t a simple averaging process; instead, it’s an intelligent system that learns which types of information are most reliable for different individuals and reading scenarios.

        Finally, the fourth component handles decision-making and explanation generation. It doesn’t just say “dyslexia detected” or “typical reader.” Instead, it provides detailed insights about specific reading challenges, confidence levels, and recommendations for further assessment or intervention. This comprehensive output helps ensure that our system provides actionable information rather than just diagnostic labels.

      2. Data Collection Setup and Protocols

        Creating reliable multimodal data requires careful attention to how information is collected from participants. Our data collection setup resembles a comfortable reading assessment centre rather than a clinical laboratory. Participants sit in a quiet, well-lit room at a computer workstation equipped with three sophisticated but unobtrusive monitoring systems.

        The EEG system uses a modern, lightweight headset with 32 electrodes strategically positioned across the scalp to capture brain activity from regions known to be involved in reading. Unlike older EEG systems that required messy gel and lengthy setup procedures, our equipment uses dry electrodes that participants can wear comfortably for extended periods. The headset looks more like high-tech headphones than medical equipment, helping participants feel relaxed and natural during testing.

        Eye-tracking occurs through an infrared camera system mounted discreetly below the computer monitor. This technology works by shining invisible infrared light toward the participant’s eyes and detecting reflections from the cornea and pupil. Modern eye-trackers are so accurate they can detect eye movements as small as 0.1 degrees, which translates to tracking exactly which letter someone is looking at on a typical computer screen. The system calibrates quickly by having participants look at a few points on the screen, then tracks their gaze automatically without any conscious effort required.

        Our data collection protocol ensures that all three systems record simultaneously and stay perfectly synchronized. This synchronization is crucial because we need to know exactly what brain activity, eye movements, and speech patterns correspond to each word and sentence being read. Specialized software coordinates the three data streams and creates timestamp markers that allow precise alignment during analysis.
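The synchronization step can be sketched as resampling each stream onto a shared timestamp grid. The sampling rates and the 100 Hz analysis clock below are illustrative assumptions, not values reported in this paper:

```python
import numpy as np

# Assumed sampling rates -- typical hardware values, not stated in the paper.
FS_EEG, FS_EYE, FS_AUDIO = 256, 1000, 16000

def align_to_common_clock(signal, fs, t_common):
    """Linearly interpolate one stream onto a shared timestamp grid."""
    t_native = np.arange(len(signal)) / fs
    return np.interp(t_common, t_native, signal)

rng = np.random.default_rng(0)
streams = {
    "eeg":   (rng.standard_normal(FS_EEG), FS_EEG),     # 1 s of each modality
    "eye":   (rng.standard_normal(FS_EYE), FS_EYE),
    "audio": (rng.standard_normal(FS_AUDIO), FS_AUDIO),
}

# A shared 100 Hz analysis clock lines all three streams up sample-for-sample.
t_common = np.arange(0, 1.0, 0.01)
aligned = {name: align_to_common_clock(sig, fs, t_common)
           for name, (sig, fs) in streams.items()}
print({name: a.shape for name, a in aligned.items()})
```

In practice alignment would use the hardware timestamp markers rather than an assumed common start time, but the interpolation step is the same.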

      3. EEG Signal Processing and Neural Network Design

        Processing EEG signals for dyslexia detection requires sophisticated techniques to extract meaningful information from the complex electrical activity of the brain. Raw EEG signals look like chaotic squiggly lines to the untrained eye, but they contain rich information about cognitive processes occurring during reading. The clean EEG signals are then analyzed using time-frequency decomposition, which breaks down the complex brain waves into simpler components. Imagine analyzing a symphony by separating it into individual instruments – we can understand much more about the music by examining each part separately. Similarly, we decompose EEG signals into different frequency bands that correspond to various cognitive processes. Alpha waves reflect attention and relaxation states, beta waves indicate active cognitive processing, and theta waves are associated with memory and learning processes.
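The band decomposition described above can be illustrated with standard band-pass filtering. The 256 Hz sampling rate and the 4th-order Butterworth filter are assumptions for this sketch, not the authors' stated pipeline:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 256  # assumed EEG sampling rate (not stated in the paper)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # Hz

def band_power(eeg, fs, lo, hi):
    """Band-pass the signal with a 4th-order Butterworth filter and
    return the mean power of the filtered signal."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return float(np.mean(sosfiltfilt(sos, eeg) ** 2))

# Synthetic check: a 10 Hz oscillation (alpha range) buried in noise.
rng = np.random.default_rng(0)
t = np.arange(0, 4, 1 / FS)
eeg = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

powers = {name: band_power(eeg, FS, lo, hi) for name, (lo, hi) in BANDS.items()}
print(max(powers, key=powers.get))
```

Because the synthetic signal oscillates at 10 Hz, the alpha band captures almost all of its power, which is the sanity check the print statement performs.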


        Fig.1. EEG-Based Dyslexia Detection Pipeline

        For dyslexia detection, we pay special attention to event-related potentials (ERPs), which are specific brain responses that occur when people encounter words during reading. The N400 component, occurring about 400 milliseconds after seeing a word, reflects semantic processing difficulties. The P600 component, appearing around 600 milliseconds, indicates syntactic processing challenges. Individuals with dyslexia often show altered timing, amplitude, or distribution of these components.

        Our custom neural network for EEG analysis uses a convolutional architecture specifically designed for temporal signal processing. Unlike networks designed for images, which look for spatial patterns, our EEG network searches for temporal patterns that unfold over time. The network learns to identify subtle differences in brain response timing, intensity, and coordination that distinguish dyslexic from typical reading processes. The network includes attention mechanisms that learn to focus on the most diagnostic time periods and brain regions for each individual. This personalization is important because dyslexia manifests differently in different people – some may show primarily phonological processing differences, while others exhibit visual processing challenges or rapid processing difficulties.
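The ERP logic can be sketched in a few lines: cut epochs around word onsets, average them, and measure mean amplitude in a latency window. Everything below (sampling rate, onset times, the injected N400-like dip) is synthetic for illustration, not the authors' data or pipeline:

```python
import numpy as np

FS = 256  # assumed EEG sampling rate

def extract_epochs(eeg, onsets, fs, tmin=-0.1, tmax=0.8):
    """Cut a fixed window around each word-onset sample index."""
    lo, hi = int(tmin * fs), int(tmax * fs)
    return np.stack([eeg[o + lo: o + hi] for o in onsets])

def mean_amplitude(erp, fs, t_start, t_end, tmin=-0.1):
    """Mean ERP amplitude in a latency window (seconds after word onset)."""
    i0, i1 = int((t_start - tmin) * fs), int((t_end - tmin) * fs)
    return float(erp[i0:i1].mean())

rng = np.random.default_rng(0)
eeg = 0.05 * rng.standard_normal(FS * 10)          # 10 s of low-noise "EEG"
onsets = [FS * 1, FS * 3, FS * 5, FS * 7]          # four word onsets
for o in onsets:                                   # inject an N400-like dip
    eeg[o + int(0.35 * FS): o + int(0.45 * FS)] -= 1.0

erp = extract_epochs(eeg, onsets, FS).mean(axis=0)  # average over the 4 trials
n400 = mean_amplitude(erp, FS, 0.35, 0.45)
print(n400)
```

Averaging across trials suppresses the noise while preserving the time-locked deflection, which is exactly why ERPs are computed this way.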

      4. Eye-Tracking Data Analysis Framework

        Eye movements during reading reveal sophisticated information about cognitive processes that occur below conscious awareness. Our eye-tracking analysis system extracts dozens of features that characterize how people visually process text, from basic measures like reading speed to complex patterns of attention allocation and processing efficiency.

        Primary eye movement features include fixation duration (how long eyes pause on each location), saccade amplitude (the distance of eye jumps between fixations), regression frequency (how often eyes jump backward to reread text), and first-pass reading time (how long it takes to initially read through text without backtracking). Each of these measures provides insights into different aspects of reading function.

        Our system goes beyond these basic measures to extract sophisticated patterns that require advanced computational analysis. We calculate reading rhythm consistency by analysing the variability in fixation durations across similar words. We measure attention distribution by examining how visual attention spreads across different parts of words and sentences. We assess predictive processing by analysing whether eye movements anticipate upcoming text based on context.

        Fig.2. Eye-Tracking Based Dyslexia Detection Pipeline
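The fixation-level measures discussed in this section (duration variability, regression frequency) can be computed from a fixation sequence roughly as follows; the input format and the toy trace are assumptions for illustration:

```python
import numpy as np

def eye_features(fixations):
    """Summarize a reading trace given as (word_index, duration_ms) pairs.
    A regression is a fixation on a word earlier than the furthest word
    reached so far."""
    durations = np.array([d for _, d in fixations], dtype=float)
    furthest, regressions = -1, 0
    for idx, _ in fixations:
        if idx < furthest:
            regressions += 1
        furthest = max(furthest, idx)
    return {
        "mean_fixation_ms": float(durations.mean()),
        "fixation_sd_ms": float(durations.std()),      # rhythm consistency
        "regression_rate": regressions / len(fixations),
    }

# Toy trace: forward reading with one jump back to word 2.
trace = [(0, 210), (1, 250), (2, 190), (3, 400), (2, 300), (4, 220)]
feats = eye_features(trace)
print(feats)
```

A real pipeline would first segment raw gaze samples into fixations and saccades; this sketch starts from the already-segmented fixation list.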

        The neural network designed for eye-tracking data uses a recurrent architecture that can capture sequential dependencies in eye movement patterns. Reading is fundamentally a sequential process where each eye movement depends on previous movements and upcoming text. Our network learns these complex dependencies automatically, identifying subtle patterns that distinguish dyslexic from typical reading strategies.

        One particularly innovative aspect of our eye-tracking analysis involves real-time adaptive feature extraction. Different individuals may exhibit dyslexic patterns in different ways – some through excessive fixation duration, others through irregular saccade patterns, and still others through unusual regression strategies. Our system learns to weight different eye movement features based on their diagnostic value for each specific individual.
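To make the sequential-dependency idea concrete, here is a toy vanilla-RNN forward pass over fixation features in NumPy. The weights are random placeholders, the three-feature input is hypothetical, and the actual system would train a recurrent model end to end rather than use fixed weights:

```python
import numpy as np

rng = np.random.default_rng(1)
D_IN, D_H = 3, 8                        # 3 features per fixation, 8 hidden units
Wx = 0.1 * rng.standard_normal((D_H, D_IN))
Wh = 0.1 * rng.standard_normal((D_H, D_H))
Wo = 0.1 * rng.standard_normal((1, D_H))

def rnn_score(sequence):
    """Fold the fixation sequence into a hidden state, step by step;
    because each step feeds on the previous state, order matters."""
    h = np.zeros(D_H)
    for x in sequence:
        h = np.tanh(Wx @ x + Wh @ h)
    return float(Wo @ h)

# Toy input: (duration, saccade_length, is_regression) per fixation, normalized.
seq = rng.standard_normal((12, D_IN))
print(round(rnn_score(seq), 4))
```

Feeding the same fixations in reverse order generally yields a different score, which is the property that distinguishes a recurrent model from an order-blind feature average.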

      5. Audio Analysis and Speech Pattern Recognition

        Reading aloud reveals information about internal reading processes that might be hidden during silent reading. Our audio analysis system captures and analyzes multiple dimensions of speech that reflect underlying reading abilities, processing strategies, and cognitive load.

        The audio processing pipeline begins with fundamental speech feature extraction. We measure speech rate variability, pause patterns, fundamental frequency changes, and spectral characteristics that reflect voice quality and effort. These features provide insights into reading fluency, word recognition difficulty, and cognitive processing load.

        Pause analysis is particularly revealing for dyslexia detection. Typical readers develop consistent rhythm patterns with predictable pauses at sentence and phrase boundaries. Individuals with dyslexia often show increased pause frequency and duration, particularly before challenging words. They may also show unusual pause placement that doesn’t correspond to natural linguistic boundaries.

        Advanced acoustic analysis examines spectral characteristics that reflect voice quality changes associated with cognitive effort and stress. When reading becomes difficult, subtle changes occur in voice production that can be detected through careful acoustic analysis. These changes might include increased vocal tension, altered resonance patterns, or slight tremor that reflects processing difficulty.

        Fig.3. Audio-Based Dyslexia Detection Pipeline

        The neural network for audio analysis employs both convolutional and recurrent architectures to capture spectral as well as temporal patterns in speech. Convolutional layers identify acoustic patterns within short time windows, while recurrent layers capture longer-term dependencies that span multiple words or sentences.

        One unique aspect of our audio analysis involves real-time difficulty assessment. As participants read aloud, our system continuously estimates cognitive load based on speech patterns and adjusts its analysis accordingly. This adaptive approach recognizes that dyslexia manifestations may vary depending on text difficulty, fatigue, and individual factors.

      6. Multimodal Fusion and Decision Making

    The most challenging and innovative aspect of our system involves intelligently combining information from EEG, eye-tracking, and audio analysis to reach accurate dyslexia detection decisions. This isn’t simply a matter of averaging results from three separate analyses; instead, it requires sophisticated understanding of how these different modalities relate to each other and to underlying reading processes.

    Our fusion mechanism uses an attention-based neural architecture that learns to weight contributions from each modality based on their reliability and relevance for specific individuals and reading contexts. The system recognizes that sometimes EEG signals might provide the clearest diagnostic information, while in other cases, eye movements or speech patterns might be more revealing. The fusion network includes cross-modal attention mechanisms that identify relationships between different types of signals. For example, increased theta activity in EEG might correspond to longer fixation durations in eye-tracking and increased pause frequency in audio analysis, all reflecting increased processing difficulty for specific words or text segments.

    Dynamic weighting ensures that the fusion process adapts to individual differences and real-time signal quality. If one participant has particularly noisy EEG data due to muscle tension, the system automatically places more emphasis on eye-tracking and audio information. This adaptive approach maximizes diagnostic accuracy even when technical issues affect individual data streams.

    The final decision-making component doesn’t just output a binary dyslexia classification. Instead, it provides detailed profiles including confidence levels, specific areas of reading difficulty, and recommendations for further assessment or intervention. This comprehensive output supports clinical decision-making and educational planning rather than simply providing diagnostic labels.
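A stripped-down sketch of reliability-driven fusion: each modality embedding receives a softmax weight derived from a quality score, so a noisy stream is automatically down-weighted. In the actual framework these weights would come from a learned attention network; the hand-set quality scores and 16-dimensional embeddings here are assumptions for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def fuse(embeddings, quality_scores):
    """Weight each modality embedding by a softmax over its quality score,
    then sum -- a stand-in for a learned attention network."""
    names = list(embeddings)
    weights = softmax(np.array([quality_scores[n] for n in names]))
    fused = sum(w * embeddings[n] for w, n in zip(weights, names))
    return fused, dict(zip(names, weights))

rng = np.random.default_rng(2)
emb = {m: rng.standard_normal(16) for m in ("eeg", "eye", "audio")}

# Simulated noisy EEG: its low quality score makes fusion lean on eye + audio.
quality = {"eeg": -2.0, "eye": 1.0, "audio": 1.0}
fused, weights = fuse(emb, quality)
print({m: round(float(w), 3) for m, w in weights.items()})
```

The softmax keeps the weights positive and summing to one, so the fused vector is always a convex combination of the modality embeddings regardless of how extreme the quality scores become.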

  4. RESULTS AND DISCUSSION

    Our comprehensive evaluation of the multimodal dyslexia detection system highlights its effectiveness in combining brain signals, eye movements, and speech analysis for accurate reading assessment. Tested across diverse participants of varying ages and reading abilities, the system showed substantial improvements over traditional screening tools and single-modality models. The EEG-based sub-model served as a key component within the multimodal framework, showing strong classification accuracy. The normalized confusion matrix (Figure 4) indicates correct classification of 93% of non-dyslexic (Class 0) and 96% of dyslexic (Class 1) cases, with only 7% and 4% misclassifications respectively. This strong diagonal dominance, with values near 1.0, confirms that the model effectively distinguishes between typical and dyslexic readers based on neural activity patterns, demonstrating balanced and generalizable performance.
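For reference, sensitivity, specificity, and accuracy follow directly from the 2x2 confusion matrix. The counts below are reconstructed from the rounded per-class rates quoted above (roughly 96% of 243 dyslexic and 93% of 244 typical readers), so the derived values are approximate rather than the paper's exact figures:

```python
def metrics(tp, fn, tn, fp):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # dyslexic cases caught
        "specificity": tn / (tn + fp),            # typical readers cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Approximate counts back-calculated from the rounded per-class rates.
tp, fn = 233, 10    # ~96% of 243 dyslexic participants
tn, fp = 227, 17    # ~93% of 244 typical readers
m = metrics(tp, fn, tn, fp)
print({k: round(v, 3) for k, v in m.items()})
```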

    Fig.4. Confusion Matrix

    Training performance metrics reinforce the model's robustness. As shown in the accuracy curves (Figure 5), both training and validation accuracy increased steadily, reaching approximately 93% and 97% after 150 epochs. Validation accuracy remained slightly higher, indicating good generalization and absence of overfitting. Correspondingly, the loss curves (Figure 6) show a smooth, consistent decline for both training and validation loss, with final validation loss below 0.1, confirming stable learning and effective convergence.

    Fig.5. Model training metric

    The detailed performance analysis underscores the system's reliability across multiple aspects of diagnostic accuracy. The multimodal framework proved particularly advantageous for early screening, ensuring timely intervention during critical developmental stages. The system also minimized false positives among typical readers, improving diagnostic efficiency and reducing unnecessary referrals.

    Fig.6. Model loss metric

    Age-based analysis revealed developmental variations in dyslexia manifestation. In younger children, differences in EEG, eye-tracking, and speech patterns were more distinct, allowing early detection. In adolescents, speech analysis gained prominence as compensatory strategies reduced neural and visual differences. Among adults, detection became more challenging due to developed coping mechanisms, yet the system still identified subtle multimodal patterns that single methods could miss. While each modality was informative individually, their fusion yielded the highest accuracy and reliability. Real-time processing capabilities made the system practical for rapid assessment without compromising diagnostic precision. Misclassifications typically involved participants with mild dyslexia who had prior interventions or typical readers with temporary attentional lapses. Finally, cross-validation across diverse populations and age groups confirmed the system's robustness and generalizability. By capturing fundamental biological and behavioral signatures rather than superficial traits, the multimodal framework demonstrates strong potential for wide-scale educational and clinical application.

  5. COMPARATIVE STUDY

    To evaluate our multimodal deep learning framework, we conducted comprehensive comparisons with both traditional dyslexia screening tools and state-of-the-art single-modality AI systems. These analyses highlight the clear advantages and broader applicability of our integrated approach.

      1. Comparison with Traditional Assessment Methods

        Conventional dyslexia screening typically uses reading tests, cognitive evaluations, and behavioral observation. We compared our framework with three major tools: the Dyslexia Screening Test (DST), the Comprehensive Test of Phonological Processing (CTOPP), and clinical assessments by educational psychologists. The DST, tested on the same participant pool, achieved 73.2% accuracy, 69.8% sensitivity, and 76.4% specificity, notably lower than our system's 94.7% accuracy. It performed poorly for older participants and mild cases with compensatory strategies. The CTOPP achieved 79.6% accuracy, offering better phonological detection but still missing subtle cases identified by our biological-signal model. It also required 45-60 minutes, compared to our 18-minute assessment. Clinical assessments by licensed psychologists achieved the highest accuracy among traditional methods, but demanded 2-3 hours, multiple sessions, and professional expertise. Despite these efforts, they still underperformed relative to our automated multimodal framework.

      2. Single-Modality AI Approaches

        We also benchmarked against AI systems using individual data modalities to assess the benefits of multimodal integration. An EEG-based CNN model achieved 81.7% accuracy, performing well for participants with strong phonological deficits but failing to identify those with primarily visual or fluency-based difficulties. An eye-tracking model, inspired by oculomotor research, achieved 84.3% accuracy, effectively identifying reading strategy variations but missing individuals who had visual compensations or attention-related gaze irregularities. An audio-based classifier analyzing speech features achieved 77.9% accuracy, detecting fluency and word-level challenges but performing poorly for individuals with strong oral language skills despite dyslexia-related reading issues.

      3. Existing Multimodal Approaches

        Few previous studies have attempted multimodal dyslexia detection, but we compared against two published approaches that combined different types of data. A system combining behavioural measures with eye-tracking achieved 88.1% accuracy, showing the benefit of multimodal approaches but lacking the neural information that proved crucial in our system. Another approach combining EEG with behavioural assessments reached 86.7% accuracy. While this system captured neural processing differences, it missed the real-time reading strategy information provided by eye-tracking and audio analysis that enhanced our diagnostic capabilities.

      4. Computational Efficiency Comparison

        Beyond accuracy, we compared computational requirements across approaches. Traditional assessments required extensive human expertise and time but no computational resources. Single-modality AI systems were computationally lightweight but achieved limited accuracy. Our multimodal system required moderate computational resources but processed data far faster than human assessment while achieving superior accuracy. The processing-time breakdown showed EEG analysis requiring 3.2 seconds, eye-tracking analysis 2.8 seconds, audio analysis 4.1 seconds, and multimodal fusion 1.4 seconds, totaling 11.5 seconds for complete analysis of 15 minutes of reading data. This efficiency makes our system practical for widespread deployment in educational and clinical settings.
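The latency budget above can be summed and related to the amount of reading data processed; the stage names below are informal labels for the four components just listed:

```python
# Per-stage analysis times as reported in the text (seconds).
stage_seconds = {
    "eeg": 3.2,
    "eye_tracking": 2.8,
    "audio": 4.1,
    "fusion": 1.4,
}

total = sum(stage_seconds.values())   # 11.5 s end-to-end
speedup = (15 * 60) / total           # 15 min of reading data vs analysis time
print(f"total={total:.1f}s, ~{speedup:.0f}x faster than real time")
```

Fifteen minutes of recorded reading thus yields a result in roughly eleven and a half seconds of compute, an analysis rate of about 78 times real time, which is what makes same-session feedback feasible.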

  6. CONCLUSION AND FUTURE WORK

This research demonstrates that multimodal deep learning offers a substantial advance in dyslexia detection, achieving high accuracy while dramatically reducing assessment time and cost. Our framework's 94.7% accuracy, combined with its 18-minute assessment duration, opens the possibility of widespread early screening that could transform how society identifies and supports individuals with dyslexia.

The success of our multimodal approach validates the fundamental insight that dyslexia affects multiple interrelated cognitive and behavioral systems. By simultaneously measuring brain activity, visual processing strategies, and speech production patterns, we capture a comprehensive picture of reading function that no single measurement could provide. This holistic approach not only improves diagnostic accuracy but also yields rich information about individual differences that could guide personalized intervention strategies.

The practical implications extend far beyond improved diagnosis. Schools could implement routine screening programs that identify at-risk students before they experience reading failure. Healthcare systems could offer rapid, objective assessments that reduce waiting times and increase access to support services. The system's efficiency and accuracy make it particularly valuable for addressing global disparities in dyslexia identification and support: many regions lack sufficient numbers of trained specialists to provide comprehensive dyslexia assessments, and our automated system could provide expert-level diagnostic capabilities in underserved areas, potentially reaching millions of individuals who currently lack access to appropriate evaluation and support.

This research lays the groundwork for several promising future developments. Longitudinal studies could track how multimodal signatures evolve with age and intervention, revealing critical periods for effective treatment and enabling early prediction of reading outcomes. Expanding the framework to integrate additional data sources, such as MRI for brain-organization insights, genetic markers for biological risk identification, and environmental or educational factors for contextual understanding, could further enhance diagnostic precision. Developing personalized intervention systems based on individual multimodal profiles may transform dyslexia treatment: by analyzing each reader's unique neural, visual, and speech patterns, future systems could recommend targeted instructional strategies rather than one-size-fits-all methods. Incorporating real-time adaptive assessment could improve clinical usability, allowing dynamic adjustment of testing parameters and immediate feedback for both participants and clinicians. Finally, extending this multimodal approach to related learning conditions, such as ADHD, language processing disorders, and reading comprehension difficulties, could create a unified framework for understanding and supporting diverse learning profiles.
