Joint processing of linguistic properties in brains and language models

Joint processing of linguistic properties in brains and language models, Subba Reddy Oota, Manish Gupta and Mariya Toneva, NeurIPS-2023

21st year Narratives listening dataset

21st year dataset statistics:

18 subjects
fMRI brain recordings
8267 words
2226 TRs (TR = repetition time)
TR = 1.5 seconds
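
As a quick sanity check on the statistics above, the total scan duration and the average number of words per TR follow directly from the listed figures. The snippet below is only illustrative arithmetic; the variable names are ours and the per-TR word count is an average, not a statement about how words are actually aligned to TRs.

```python
# Figures copied from the dataset statistics above.
N_TRS = 2226        # number of fMRI volumes (TRs)
TR_SECONDS = 1.5    # repetition time in seconds
N_WORDS = 8267      # words in the stimulus

duration_seconds = N_TRS * TR_SECONDS
duration_minutes = duration_seconds / 60
words_per_tr = N_WORDS / N_TRS

print(f"Scan duration: {duration_seconds:.0f} s ({duration_minutes:.2f} min)")
print(f"Average words per TR: {words_per_tr:.2f}")
```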

How to download the 21st year dataset

DataLad can be installed using pip:

python -m pip install datalad

It is highly recommended to configure Git before using DataLad. Set both 'user.name' and 'user.email' configuration variables.

- git config --global user.name "username"
- git config --global user.email emailid

git-annex installation is required for downloading the dataset

sudo apt-get install git-annex
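
Before starting the download, it may help to confirm that the required command-line tools are on your PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of this repository.

```python
import shutil

def missing_tools(tools=("git", "git-annex", "datalad")):
    """Return the names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("git, git-annex and datalad are all available.")
```

If anything is reported missing, repeat the corresponding installation step above before continuing.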

Download the dataset using DataLad:

datalad install https://datasets.datalad.org/labs/hasson/narratives/derivatives/afni-nosmooth

Download each subject's data (in fsaverage6 space) using the bash script, then run the Python script:

cd afni-nosmooth
bash download_data.sh
python brain_data_21styear_fsaverage6.py

Extract stimulus representations using the BERT model with context length 20

Narratives 21st year dataset

python extract_features_words.py --input_file ./Narratives/21styear_align.csv --model bert-base --sequence_length 20 --output_file bert_conext20_21styear
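
To illustrate what a context length of 20 can mean, the sketch below builds, for every word, a window of at most `sequence_length` words that ends at that word. This is an assumption about the windowing scheme, not a description of what `extract_features_words.py` does internally; the function name `context_windows` is hypothetical, and a small window of 4 is used so the output is easy to read.

```python
def context_windows(words, sequence_length=20):
    """Yield (target, context) pairs.

    Each context is a list of at most `sequence_length` words that
    ends at (and includes) the target word.
    """
    for i, word in enumerate(words):
        start = max(0, i - sequence_length + 1)
        yield word, words[start:i + 1]

words = "the quick brown fox jumps over the lazy dog".split()
windows = list(context_windows(words, sequence_length=4))

print(windows[0])  # ('the', ['the'])
print(windows[5])  # ('over', ['brown', 'fox', 'jumps', 'over'])
```

In a setup like this, each context would be passed to the language model and the representation of the final word kept as that word's feature vector.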

To build a voxelwise encoding model for different stimulus representations

Five arguments are passed as input:
- subject number
- number of layers
- stimulus vector
- context length
- output directory
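
The five positional arguments could be parsed as in the sketch below. The argument names, and the reading of "#layers" as the number of model layers, are assumptions made for illustration; the actual script may read its arguments differently. The example values mirror the invocation shown in this README.

```python
import argparse

def build_parser():
    """Build a parser for five positional arguments (names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the encoding-model command line"
    )
    parser.add_argument("subject", type=int, help="subject number")
    parser.add_argument("layers", type=int, help="number of model layers")
    parser.add_argument("stimulus_vectors", help="path to the stimulus .npy file")
    parser.add_argument("context_length", type=int, help="context length in words")
    parser.add_argument("output_dir", help="directory for the predictions")
    return parser

# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["1", "12", "bert_conext20_21styear.npy", "20", "output_predictions"]
)
print(args)
```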

cd brain_predictions
python brain_predictions_21styear_text.py 1 12 bert_conext20_21styear.npy 20 output_predictions
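
For readers new to voxelwise encoding, the sketch below shows the general idea on synthetic data: a linear map is fit from stimulus features to every voxel's response, and held-out predictions are scored per voxel with Pearson correlation. Ridge regression is assumed here because it is a common choice for encoding models; this README does not state which regression `brain_predictions_21styear_text.py` uses, and the sketch ignores details such as hemodynamic delays and cross-validation. It requires NumPy, and all names and sizes are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

# Synthetic stand-ins for stimulus features (TRs x features)
# and brain responses (TRs x voxels).
rng = np.random.default_rng(0)
n_trs, n_features, n_voxels = 200, 16, 5
X = rng.standard_normal((n_trs, n_features))
W_true = rng.standard_normal((n_features, n_voxels))
Y = X @ W_true + 0.1 * rng.standard_normal((n_trs, n_voxels))

# Fit on the first 150 TRs, evaluate on the remaining 50.
train, test = slice(0, 150), slice(150, 200)
W = fit_ridge(X[train], Y[train], alpha=1.0)
scores = voxelwise_pearson(Y[test], X[test] @ W)

print(W.shape)       # (16, 5)
print(scores.shape)  # (5,)
```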

Poster

Slides

Video

To cite our work:

@inproceedings{oota2022joint,
  title={Joint processing of linguistic properties in brains and language models},
  author={Oota, Subba Reddy and Gupta, Manish and Toneva, Mariya},
  booktitle={Proceedings of the Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}
