Cognitive Plausibility of Deep Learning Language Models (GD3.0_2021-18_CogML)


go!digital programme of the ÖAW (Austrian Academy of Sciences)


Human language processing is a highly complex cognitive operation, which we have yet to understand in its entirety. Psycho- and neurolinguistic research of the past decades has focused on identifying the nature, as well as the order of the underlying computational steps involved in this operation, e.g., lexical activation or syntactic processing. These investigations have mainly been based on testable hypotheses provided by theoretical linguistic models of language.

At the same time, there has been remarkable progress in Natural Language Processing (NLP) research using deep learning language models. An innovative approach to model training (pre-training followed by fine-tuning) has yielded neural language models (LMs) that acquire general linguistic knowledge and can subsequently be fine-tuned easily for various NLP tasks, from structural part-of-speech tagging to semantically oriented natural language understanding.

What remains a largely open question, however, is what kind of linguistic knowledge LMs acquire during pre-training and which aspects of it they use to solve the tasks. Interestingly, while able to perform sophisticated machine translation tasks, these LMs fail at even very basic language tasks, for instance the learning of abstract sameness relations, which in linguistics are assumed to be primitives underlying successful (human-like) language processing. In other words, it seems that despite their impressive development, LMs’ underlying processing is not necessarily comparable to the human cognitive operations involved in language processing, let alone psycholinguistically plausible. In fact, deep neural language models are only seldom informed by current findings from experimental linguistics.

The present interdisciplinary project aims to bridge this gap with an innovative, integrative research approach that combines the strengths of modern psycholinguistics and deep learning NLP research. Specifically, we aim to identify cognitive mechanisms involved in human language processing and to subsequently integrate these into state-of-the-art deep natural language processing models in order to make these NLP models more human-like. This integration will be achieved through adapted model training, through adaptations at the architectural level, or both. We expect, firstly, that our approach will yield improved NLP models that show increased performance in common NLP benchmark tests, which we will test in a direct comparison with the original unmodified models. Secondly, we expect that the resulting, more cognitively plausible language models will generate new testable hypotheses for linguistics. Based on these, psycholinguistic experiments with human participants will be conducted to investigate our models’ predictive value and prediction accuracy. Accordingly, the integrative approach of this project has the potential to open new pathways in both domains, natural language processing and linguistics.


Due to the interdisciplinary nature of the project, several computational linguistic and psycholinguistic methods will be used – to some extent in adapted form in the respective other domain. Deep learning models of language will be chosen and then actively modified based on state-of-the-art machine learning methods. In addition to possible adaptations of the model architecture, options of model training will be investigated that require specially designed (language) datasets, so that NLP models will be able to acquire the cognitive core mechanisms that have been identified. In interaction with this, it may also be necessary to experiment with parameters of the model architecture during pre-training, in order to more closely simulate the language acquisition process.

To assess the success of our implementations, the training process will be continuously accompanied by state-of-the-art analysis methods at inference time (analogous to online language processing in humans). One method that will be particularly valuable in this regard is the analysis of self-attention heads: their attention weights show which relations an NLP model recognizes in the language input, which allows conclusions about the nature of the respective cognitive mechanism. To evaluate NLP performance in comparison to the original models, the benchmarking procedures commonly used in deep learning LM research will be applied.
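As a rough illustration of what an attention-head analysis inspects, the following minimal sketch computes scaled dot-product attention weights for a toy sentence and lists, for each token, the token it attends to most strongly. The token list and the 2-d query/key vectors are invented for illustration; in an actual LM they would come from the learned projections of a specific head, and this is not the project's analysis pipeline.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention: one row of weights per query token."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    return weights


def strongest_links(tokens, weights):
    """For each token, return (token, most-attended token, weight)."""
    links = []
    for i, row in enumerate(weights):
        j = max(range(len(row)), key=row.__getitem__)
        links.append((tokens[i], tokens[j], row[j]))
    return links


# Toy input: hypothetical tokens and hand-picked vectors (assumptions).
tokens = ["the", "dog", "barks"]
queries = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights = attention_weights(queries, keys)
links = strongest_links(tokens, weights)
for source, target, weight in links:
    print(f"{source!r} attends most to {target!r} (weight {weight:.2f})")
```

In this toy setup the verb attends most strongly to the noun, which is the kind of relation (here, a subject-verb link) that such an analysis would look for in a real model.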

Depending on which cognitive mechanisms are identified and implemented, the last project phase will comprise experiments with human participants involving behavioral and/or neurophysiological methods. These could be behavioral categorization tasks or reaction time and eye-gaze measures, or they could involve online electrophysiological measurements of event-related potentials or neural entrainment. The resulting data will be analyzed using state-of-the-art statistical methods, e.g., linear mixed models or cluster-based tests.
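To make the idea of a cluster-based test concrete, here is a minimal sketch of a one-dimensional cluster-mass permutation test on paired condition differences (one row per participant, one column per time point). It uses exhaustive sign flipping, which is only feasible for small samples. The toy data, the threshold of 2.0, and all function names are assumptions for illustration; an actual analysis would typically rely on an established toolbox.

```python
import itertools
import math
import statistics


def t_values(diffs):
    """One-sample t-value per time point; diffs[subject][time]."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [row[t] for row in diffs]
        sd = statistics.stdev(col)
        # Zero variance leaves t undefined; treat it as 0 in this sketch.
        out.append(statistics.mean(col) / (sd / math.sqrt(n)) if sd > 0 else 0.0)
    return out


def max_cluster_mass(tvals, threshold):
    """Largest summed |t| over runs of adjacent, same-sign supra-threshold points."""
    best = mass = 0.0
    prev_sign = 0
    for t in tvals:
        sign = (t > threshold) - (t < -threshold)
        if sign != 0 and sign == prev_sign:
            mass += abs(t)      # extend the current cluster
        elif sign != 0:
            mass = abs(t)       # start a new cluster
        else:
            mass = 0.0          # below threshold: no cluster
        prev_sign = sign
        best = max(best, mass)
    return best


def cluster_permutation_test(diffs, threshold=2.0):
    """Return (observed max cluster mass, p-value) using all sign flips."""
    observed = max_cluster_mass(t_values(diffs), threshold)
    null = []
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        flipped = [[s * x for x in row] for s, row in zip(signs, diffs)]
        null.append(max_cluster_mass(t_values(flipped), threshold))
    p_value = sum(m >= observed for m in null) / len(null)
    return observed, p_value


# Toy differences (condition A minus B): 6 participants x 5 time points,
# with an invented effect at time points 1 to 3.
diffs = [
    [0.1, 1.0, 1.2, 0.9, -0.1],
    [-0.2, 0.8, 1.1, 1.0, 0.2],
    [0.0, 1.1, 0.9, 1.2, -0.3],
    [0.3, 0.9, 1.3, 0.8, 0.1],
    [-0.1, 1.2, 1.0, 1.1, 0.0],
    [0.2, 0.7, 1.4, 0.9, -0.2],
]
observed, p_value = cluster_permutation_test(diffs)
print(f"max cluster mass = {observed:.2f}, p = {p_value:.4f}")
```

Because the test statistic is the maximum cluster mass across the whole time window, the comparison against the permutation distribution controls for multiple comparisons over time points without testing each point separately.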

Data Management

The NLP experiments with language models will reuse publicly available datasets that are suitable for benchmarking models on specific tasks. Some of the most popular are CNN/Daily Mail for text summarization, 20 Newsgroups for topic recognition, and IMDB Movie Reviews for sentiment analysis. No new data are expected to be generated from this part of the experiments. The data generated from experiments with human participants depend on the method(s) decided upon in WP2. They will comprise personal information of participants (e.g., age, level of education, languages spoken), behavioral data (e.g., reaction times), and possibly neurophysiological data. All of these data will be handled confidentially and will only be accessible to staff members who are immediately involved in the project.

For the human data collected, the Brain Imaging Data Structure (BIDS) format will be used to ease data sharing (Gorgolewski et al., 2016).
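As a small illustration of what the BIDS layout implies for file organization, the sketch below builds BIDS-style relative paths from subject, session, and task labels. The helper name and the task labels ("categorization", "listening") are hypothetical, and the sketch covers only the basic naming pattern, not the full specification.

```python
from pathlib import PurePosixPath


def bids_path(subject, task, datatype, suffix, extension, session=None):
    """Build a BIDS-style relative path from its entities.

    Labels are expected to be alphanumeric, as BIDS requires.
    """
    entities = [f"sub-{subject}"]
    folder = PurePosixPath(f"sub-{subject}")
    if session is not None:
        entities.append(f"ses-{session}")
        folder = folder / f"ses-{session}"
    entities.append(f"task-{task}")
    filename = "_".join(entities) + f"_{suffix}{extension}"
    return folder / datatype / filename


# Hypothetical behavioral recording for participant 01:
print(bids_path("01", "categorization", "beh", "beh", ".tsv"))
# -> sub-01/beh/sub-01_task-categorization_beh.tsv

# Hypothetical EEG recording with an explicit session:
print(bids_path("01", "listening", "eeg", "eeg", ".edf", session="01"))
# -> sub-01/ses-01/eeg/sub-01_ses-01_task-listening_eeg.edf
```

Keeping subject, session, and task in both the folder structure and the file name is what lets other researchers and standard tools locate recordings without project-specific documentation.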

All primary research data will be handled according to the FAIR data principles (Wilkinson et al., 2016) and made accessible along with the respective publications (if possible, via open access with the respective journals, otherwise via platforms such as arXiv).

Regarding the psycholinguistic data collected with human subjects, data quality will be ensured through rigorous documentation of the data collection process (as is standard practice in the Babelfish Psycholinguistics Lab) and by saving all necessary information in the BIDS structure of the collected data. If the publication outlet and the ethics committee allow, the data will be shared in anonymized form along with the relevant publication for full transparency.

Funding organization