Following his research into speech signal processing for DARPA’s RATS program, Dr. Nima Mesgarani of Columbia University’s Zuckerman Institute and fellow researchers have announced a brain-computer interface (BCI) that turns brainwave patterns into speech with the help of a speech synthesizer.
Dr. Mesgarani, along with neurosurgeon Dr. Ashesh Dinesh Mehta, used a BCI technique known as electrocorticography (ECoG) to measure neural activity from five neurosurgical patients undergoing treatment for epilepsy as they listened to continuous speech sounds.
The goal was to analyze the patients’ brainwaves as they listened, looking for patterns that could be used to recreate speech by training a “vocoder,” which is “a computer algorithm that can synthesize speech after being trained on recordings of people talking,” according to the Zuckerman Institute.
“We asked epilepsy patients already undergoing brain surgery to listen to sentences spoken by different people, while we measured patterns of brain activity,” said Dr. Mesgarani. “These neural patterns trained the vocoder.”
For example, the epilepsy patients were asked to listen to people counting from 0 to 9. As each number was spoken, the patients thought of it in their heads, and their brain activity was read by the BCI and run through the vocoder.
From there, “the sound produced by the vocoder in response to those signals was analyzed and cleaned up by neural networks, a type of artificial intelligence that mimics the structure of neurons in the biological brain. The end result was a robotic-sounding voice reciting a sequence of numbers.”
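To make that pipeline concrete, it can be framed as a supervised learning problem: neural features recorded while speech is heard are the inputs, and an acoustic representation of that speech is the training target. Below is a minimal Python sketch of that framing on synthetic data, with ridge regression standing in for the deep networks and vocoder the researchers describe; every array shape, name, and number here is illustrative rather than taken from the study.

```python
# A minimal sketch of the decoding idea, not the authors' actual model:
# learn a regularized linear map from synthetic "ECoG" features to a
# speech spectrogram, then score the reconstruction. All shapes and
# names below are invented for illustration.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_electrodes, n_freq_bins = 2000, 64, 32

# Hypothetical linear relationship between neural activity and the
# spectrogram of the heard speech, plus noise.
true_map = rng.normal(size=(n_electrodes, n_freq_bins))
ecog = rng.normal(size=(n_samples, n_electrodes))        # neural features
spectrogram = ecog @ true_map + 0.5 * rng.normal(size=(n_samples, n_freq_bins))

X_train, X_test, y_train, y_test = train_test_split(
    ecog, spectrogram, test_size=0.2, random_state=0)

# Train the decoder: one regularized regression per spectrogram bin.
decoder = Ridge(alpha=1.0).fit(X_train, y_train)
reconstruction = decoder.predict(X_test)

# Correlation between true and reconstructed spectrograms, averaged
# over frequency bins, is a common proxy for reconstruction quality.
corrs = [np.corrcoef(y_test[:, k], reconstruction[:, k])[0, 1]
         for k in range(n_freq_bins)]
print(f"mean spectrogram correlation: {np.mean(corrs):.2f}")
```

In the published system the regression target was the parameter stream of a speech vocoder rather than a raw spectrogram, and deep neural networks replaced the linear map, but the train-then-reconstruct structure is the same.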
The result isn’t an exact form of telepathy, but at 75% accuracy the technology is getting steadily closer to it.
“The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy,” said Mesgarani.
Ultimately, they hope their system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer’s thoughts directly into words.
“In this scenario, if the wearer thinks ‘I need a glass of water,’ our system could take the brain signals generated by that thought, and turn them into synthesized, verbal speech,” said Dr. Mesgarani.
“This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them,” he added.
DARPA Speech Technology Research
According to Dr. Mesgarani, the vocoder uses “the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions.”
That same technology behind commercial digital assistants was, in turn, first developed at the Defense Advanced Research Projects Agency (DARPA) under the PAL (Personal Assistant that Learns) project.
DARPA worked with military users to refine PAL prototypes for operational use, and with the defense acquisition community to transition PAL technologies into military systems.
Will today’s announcement of using BCIs and vocoders for turning brainwaves into speech be used first for the military or for healthcare? What are other potential applications?
Prior to today’s announcement, Dr. Mesgarani was a co-author of the research paper “Developing a Speech Activity Detection System for the DARPA RATS Program.”
The Robust Automatic Transcription of Speech (RATS) program was launched to create algorithms and software for performing the following tasks on potentially speech-containing signals received over communication channels that are extremely noisy and/or highly distorted:
- Speech Activity Detection: Determine whether a signal includes speech or is just background noise or music (see the sketch after this list).
- Language Identification: Once a speech signal has been detected, identify the language being spoken.
- Speaker Identification: Once a speech signal has been detected, identify whether the speaker is an individual on a list of known speakers.
- Key Word Spotting: Once a speech signal has been detected and the language has been identified, spot specific words or phrases from a list of terms of interest.
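Of these four tasks, speech activity detection is the most self-contained. The sketch below shows the classical energy-based approach to it in Python; it illustrates the task definition only, since the actual RATS systems relied on far more robust machine-learned detectors, and every function name and threshold here is invented for the example.

```python
# A minimal, illustrative speech-activity detector, not DARPA's RATS
# system: frame the signal, compute short-time energy, and flag frames
# above an adaptive threshold. Real RATS channels are far noisier than
# this assumes.
import numpy as np

def detect_speech(signal: np.ndarray, rate: int,
                  frame_ms: float = 25.0, threshold_db: float = 15.0) -> np.ndarray:
    """Return a boolean array marking frames whose energy exceeds the
    estimated noise floor by `threshold_db` decibels."""
    frame_len = int(rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    noise_floor = np.percentile(energy_db, 10)  # treat quietest frames as noise
    return energy_db > noise_floor + threshold_db

# Toy usage: half a second of faint noise followed by a louder,
# speech-like tone burst.
rate = 16000
noise = 0.01 * np.random.randn(rate // 2)
burst = 0.5 * np.sin(2 * np.pi * 220 * np.arange(rate // 2) / rate)
print(detect_speech(np.concatenate([noise, burst]), rate).astype(int))
```

The adaptive threshold, which treats the quietest frames as the noise floor, is what lets the same detector work across channels with different gain; the extreme noise and distortion of RATS channels is precisely what made the real task hard enough to need learned detectors.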
In a 2014 presentation entitled “Reverse engineering the neural mechanisms involved in speech processing,” Dr. Mesgarani referenced the RATS program and discussed “decoding speech signals and attentional focus directly from the brain activity,” a goal realized today with the creation of a brain-computer interface that turns brainwave patterns into speech.
As the latest research from Columbia University’s Zuckerman Institute shows, “Reconstructing speech from the neural responses recorded from the human auditory cortex […] opens up the possibility of using this technique as a speech brain-computer interface to restore speech in severely paralyzed patients.”
DARPA and Columbia University
In 2017, DARPA awarded a four-year, $15.8 million grant to Columbia Engineering Professor Ken Shepard, a pioneer in the development of electronics that interface with biological systems. Shepard is leading a team inventing an implanted brain-interface device that could transform the lives of people with neurodegenerative diseases or people who are hearing- or visually impaired.
Shepard’s project is in DARPA’s Neural Engineering System Design (NESD) program, part of the larger Federal-government-wide BRAIN Initiative. NESD is aimed at developing an implantable neural interface that can provide unprecedented signal resolution and data-transfer bandwidth between the brain and the digital world.
The implanted chips are ultra-conformable to the brain surface, very light, and flexible enough to move with the tissue. The chip does not penetrate the brain tissue and uses wireless powering and data telemetry.
“By using the state-of-the-art in silicon nanoelectronics and applying it in unusual ways, we are hoping to have big impact on brain-computer interfaces,” said Shepard. “We have assembled a world-class team to translate our efforts to human use at the conclusion of this program.”
At Columbia, the project includes Rafael Yuste (professor of biological sciences and neuroscience, Arts & Sciences), Liam Paninski (professor of statistics and neuroscience, Arts & Sciences, and Zuckerman Institute Principal Investigator), and Luca Carloni (professor of computer science, Engineering).