Stimulus Speech Decoding from Human Cortex with Generative Adversarial Network Transfer Learning

Ran Wang, Xupeng Chen, Amirhossein Khalilian-Gourtani, Zhaoxi Chen, Leyao Yu, Adeen Flinker, Yao Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Decoding auditory stimuli from neural activity can enable neuroprosthetics and direct communication with the brain. Recent studies have shown successful speech decoding from intracranial recordings using deep learning models. However, the scarcity of training data leads to low-quality speech reconstruction, which prevents a complete brain-computer interface (BCI) application. In this work, we propose a transfer learning approach with a pre-trained GAN to disentangle the representation and generation layers for decoding. We first pre-train a generator to produce spectrograms from a representation space using a large corpus of natural speech data. Then, using a small amount of paired data containing the stimulus speech and the corresponding ECoG signals, we transfer the generator to a larger network in which a preceding encoder maps the neural signal to the representation space. To further improve the network's generalization ability, we introduce a Gaussian prior distribution regularizer on the latent representation during the transfer phase. With at most 150 training samples for each tested subject, we achieve state-of-the-art decoding performance. By visualizing the attention mask embedded in the encoder, we observe brain dynamics that are consistent with findings from previous studies of the superior temporal gyrus (STG), pre-central gyrus (motor) and inferior frontal gyrus (IFG). Our findings demonstrate high reconstruction accuracy using deep learning networks, together with the potential to elucidate interactions across different brain regions during a cognitive task.
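
The transfer-phase objective described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the linear encoder, the single-layer stand-in for the pre-trained generator, the mean-squared-error reconstruction term, the weight `lam`, and the names `gaussian_prior_penalty` and `transfer_loss` are all hypothetical. The Gaussian prior is written here as the negative log-density of a standard normal (up to a constant), which is one common way to realize such a regularizer.

```python
import numpy as np

def gaussian_prior_penalty(z):
    """Negative log-density of N(0, I), up to a constant, averaged over the batch."""
    return 0.5 * np.mean(np.sum(z ** 2, axis=1))

def transfer_loss(ecog, spec, W_enc, W_gen, lam=0.1):
    """Reconstruction error plus a Gaussian prior penalty on the latent code.

    ecog:  (batch, n_channels) neural features (hypothetical shape)
    spec:  (batch, n_bins) target spectrogram frames (hypothetical shape)
    W_enc: (n_channels, n_latent) encoder weights, learned during transfer
    W_gen: (n_latent, n_bins) stand-in for the pre-trained generator
    """
    z = ecog @ W_enc                # encoder: neural signal -> representation space
    recon = np.tanh(z @ W_gen)      # generator: representation -> spectrogram
    mse = np.mean((recon - spec) ** 2)
    return mse + lam * gaussian_prior_penalty(z)

# Toy usage with random data standing in for paired ECoG and speech.
rng = np.random.default_rng(0)
ecog = rng.normal(size=(8, 16))
spec = rng.uniform(-1.0, 1.0, size=(8, 32))
W_enc = rng.normal(scale=0.1, size=(16, 4))
W_gen = rng.normal(scale=0.1, size=(4, 32))
loss = transfer_loss(ecog, spec, W_enc, W_gen)
```

In this sketch the penalty pulls the encoder's outputs toward the region of the representation space that a generator trained on Gaussian latents would have seen, which is one plausible reading of why such a regularizer could help generalization with few paired samples.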

Original language: English (US)
Title of host publication: ISBI 2020 - 2020 IEEE International Symposium on Biomedical Imaging
Publisher: IEEE Computer Society
Pages: 390-394
Number of pages: 5
ISBN (Electronic): 9781538693308
State: Published - Apr 2020
Event: 17th IEEE International Symposium on Biomedical Imaging, ISBI 2020 - Iowa City, United States
Duration: Apr 3 2020 to Apr 7 2020

Publication series

Name: Proceedings - International Symposium on Biomedical Imaging
Volume: 2020-April
ISSN (Print): 1945-7928
ISSN (Electronic): 1945-8452

Conference

Conference: 17th IEEE International Symposium on Biomedical Imaging, ISBI 2020
Country/Territory: United States
City: Iowa City
Period: 4/3/20 to 4/7/20

Keywords

  • electrocorticographic (ECoG)
  • generative adversarial networks (GAN)
  • speech decoding
  • superior temporal gyrus (STG)
  • transfer learning

ASJC Scopus subject areas

  • Biomedical Engineering
  • Radiology, Nuclear Medicine and Imaging
