Researchers have made a significant breakthrough in the field of brain-computer interfaces (BCIs). A new study published in Communications Biology details how artificial intelligence (AI) has been successfully trained to generate natural language directly from brain recordings. This development brings us closer to the long-sought goal of seamless brain-to-text communication.

Imagine a world where thoughts can be translated into words without the need for speech or typing. Scientists are making this vision more tangible. The study explores the use of brain recordings for language generation, furthering our understanding of how the brain processes language. The research has exciting implications for AI-based communication, model training, and even therapies for those with speech impairments.
Decoding Language and Thoughts
The AI system, named BrainLLM, was tested across three different datasets. Its text generation was most accurate when it was trained on the larger brain-recording datasets, indicating that more comprehensive brain data improves AI-driven language prediction, a critical finding for future development.
Decoding thoughts directly from brain activity has been a considerable challenge. Previous efforts relied on classification models that matched brain activity to predefined words or phrases; however, these methods were limited in their flexibility and couldn’t fully capture the complexities of human expression. The advent of large language models (LLMs), like those powering ChatGPT, revolutionized text generation by predicting word sequences. The challenge lay in integrating these powerful models directly with brain recordings.
About the Study
To address this challenge, the researchers developed BrainLLM, a system that merges brain recordings with an LLM to generate natural language. The study used functional magnetic resonance imaging (fMRI), a non-invasive method for observing brain activity, with data collected from participants while they listened to or read language stimuli. The model was trained on three public datasets containing fMRI recordings of participants exposed to various linguistic stimuli.
Researchers designed a “brain adapter,” a neural network that converts brain activity into a format that the LLM can understand. The adapter extracts features from brain signals and combines them with traditional text-based inputs, allowing the LLM to generate words closely aligned with the encoded linguistic information in brain activity.
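To make the adapter idea concrete, here is a minimal sketch in PyTorch of how brain features might be projected into an LLM's embedding space. The class name, layer sizes, and the number of "brain tokens" are illustrative assumptions for exposition; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class BrainAdapter(nn.Module):
    """Hypothetical sketch of a brain adapter: projects a vector of fMRI
    voxel features into a short sequence of "brain tokens" that live in
    the LLM's token-embedding space. All dimensions are illustrative."""

    def __init__(self, n_voxels: int = 1000, embed_dim: int = 768,
                 n_prefix_tokens: int = 4):
        super().__init__()
        self.n_prefix_tokens = n_prefix_tokens
        self.embed_dim = embed_dim
        # A small MLP maps the voxel vector to n_prefix_tokens embeddings.
        self.proj = nn.Sequential(
            nn.Linear(n_voxels, 1024),
            nn.GELU(),
            nn.Linear(1024, n_prefix_tokens * embed_dim),
        )

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        # voxels: (batch, n_voxels) -> (batch, n_prefix_tokens, embed_dim)
        batch = voxels.shape[0]
        return self.proj(voxels).view(batch, self.n_prefix_tokens, self.embed_dim)
```

In prompt-tuning-style setups, an adapter like this can be trained while the base LLM stays largely frozen, which keeps the number of learned parameters small.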
The process began with the collection of participants’ brain activity data as they processed written or spoken language. These recordings were then transformed into a mathematical representation of brain activity. The specialized neural network mapped these representations onto a space compatible with the LLM’s text embeddings. The model then processed these combined inputs and generated word sequences based on both the brain activity data and prior text prompts. By training BrainLLM on thousands of brain scans and corresponding linguistic inputs, the researchers fine-tuned the system to generate words aligned with brain activity.
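The following is a simplified end-to-end sketch of that pipeline, reusing the hypothetical BrainAdapter above with an off-the-shelf GPT-2 from the Hugging Face transformers library as a stand-in for the study's LLM. Random numbers substitute for a real fMRI recording, and the greedy decoding loop is purely illustrative.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# Reuse the hypothetical BrainAdapter sketched above.
adapter = BrainAdapter(n_voxels=1000, embed_dim=model.config.n_embd)
voxels = torch.randn(1, 1000)                    # stand-in for one fMRI scan
brain_tokens = adapter(voxels)                   # (1, 4, 768) "brain prefix"

prompt_ids = tokenizer("She opened the door and", return_tensors="pt").input_ids
prompt_embeds = model.transformer.wte(prompt_ids)         # embed the text prompt
inputs = torch.cat([brain_tokens, prompt_embeds], dim=1)  # brain prefix + text

generated = prompt_ids
with torch.no_grad():
    for _ in range(10):                          # greedily decode 10 new tokens
        logits = model(inputs_embeds=inputs).logits
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_id], dim=1)
        inputs = torch.cat([inputs, model.transformer.wte(next_id)], dim=1)

print(tokenizer.decode(generated[0]))
```

Training would then compare the generated word distribution against the words the participant actually perceived and backpropagate the error through the adapter.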
Unlike prior methods that required selecting words from a predefined set, BrainLLM can generate continuous text without these restrictions. The researchers then evaluated the model’s performance against existing methods, testing it on a variety of language tasks including predicting the next word in a sequence, reconstructing entire passages, and comparing generated text with human-perceived language continuations.
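As a toy illustration of the next-word task, one simple metric is the fraction of predicted next words that match the continuation the participant actually perceived. The function below is ours for exposition only; the study's actual evaluation, as described above, spanned several tasks including human comparisons.

```python
def next_word_accuracy(predicted: list[str], perceived: list[str]) -> float:
    """Toy metric: fraction of predicted next words matching the words
    participants actually perceived. Not the paper's exact protocol."""
    assert len(predicted) == len(perceived)
    hits = sum(p.strip().lower() == q.strip().lower()
               for p, q in zip(predicted, perceived))
    return hits / len(perceived)

# Hypothetical comparison: a brain-informed model vs. a text-only baseline.
perceived = ["door", "window", "storm", "laughed"]
print(next_word_accuracy(["door", "window", "storm", "smiled"], perceived))  # 0.75
print(next_word_accuracy(["door", "room", "rain", "smiled"], perceived))     # 0.25
```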
Major Findings
BrainLLM represents a significant advancement by creating open-ended sentences instead of classifying brain activity into pre-set words, moving closer to practical brain-to-text communication. Its ability to generate language that closely matched brain activity was significantly better than that of traditional classification-based methods, and it produced more coherent and contextually appropriate text when processing brain recordings.
The model showed the highest accuracy when trained with larger datasets, suggesting that increased brain data could lead to further improvements. One of the key breakthroughs was BrainLLM’s capacity to generate continuous text, not simply choose from predetermined words or options. This represents a major step forward for real-world application, where unrestricted communication is vital.
Moreover, human evaluators preferred text generated by BrainLLM over baseline models, indicating that the system effectively captured meaningful linguistic patterns. BrainLLM was especially effective at reconstructing surprising language: words or phrases that an LLM alone would likely struggle to predict. This demonstrates that brain signals can enhance language modeling beyond what text-based prediction alone achieves.
The system performed best when analyzing brain activity from areas known to be involved in language processing, such as Broca’s area and the auditory cortex. The highest accuracy came from signals recorded in Broca’s area, underscoring its central role in natural language reconstruction. Refining how brain signals are mapped could further improve accuracy and reliability.
Although the model performed well, the study had limitations. Accuracy varied across individuals, and open-ended language reconstruction from brain recordings remained far from perfect. The researchers also noted that fMRI is impractical for real-time use because of its high cost and complexity.
Conclusions
BrainLLM’s best performance occurred when reconstructing text that was unexpected or difficult for standard AI models to predict. This suggests that brain signals add valuable context beyond what AI alone can infer. Overall, the study represents an important advance for brain-to-text technology, demonstrating that integrating brain recordings with large language models can enhance natural language generation. While real-world applications are still some years away, this research lays the foundation for brain-computer interfaces that could assist people with speech disabilities to connect with the world seamlessly.
The researchers are exploring alternative brain-imaging techniques, such as electroencephalography (EEG), which may enable real-time decoding of language from brain activity. Additionally, they have suggested integrating BrainLLM with motor-based BCIs, which have already proven successful for movement-related communication. Together, advances in brain signal decoding and machine learning promise to move us closer to a world where thoughts can be directly translated into words.
Journal reference: Ye, Z., Ai, Q., Liu, Y., et al. (2025). Generative language reconstruction from brain recordings. Communications Biology, 8, 346. DOI: 10.1038/s42003-025-07731-7. https://www.nature.com/articles/s42003-025-07731-7