Tuesday, June 15, 2021

New paper (Vasishth and Gelman): How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis

A new paper, just accepted in the journal Linguistics:


Title: How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis 

Abstract: The use of statistical inference in linguistics and related areas like psychology typically involves a binary decision: either reject or accept some null hypothesis using statistical significance testing. When statistical power is low, this frequentist data-analytic approach breaks down: null results are uninformative, and effect size estimates associated with significant results are overestimated. Using an example from psycholinguistics, several alternative approaches are demonstrated for reporting inconsistencies between the data and a theoretical prediction. The key here is to focus on committing to a falsifiable prediction, on quantifying uncertainty statistically, and learning to accept the fact that—in almost all practical data analysis situations—we can only draw uncertain conclusions from data, regardless of whether we manage to obtain statistical significance or not. A focus on uncertainty quantification is likely to lead to fewer excessively bold claims that, on closer investigation, may turn out to be not supported by the data.
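To make the point about overestimated effects under low power concrete, here is a minimal simulation sketch. It is not taken from the paper: the true effect (15 ms), the standard deviation (100 ms), the sample size (20), and the function name `simulate_type_m` are all assumptions chosen purely for illustration, and the standard deviation is treated as known so that a simple z-test cutoff can be used.

```python
import random
import statistics


def simulate_type_m(true_effect=15.0, sd=100.0, n=20, n_sims=5000, seed=1):
    """Repeat a one-sample experiment many times (all settings are
    illustrative assumptions) and return (power, exaggeration ratio)."""
    rng = random.Random(seed)
    se = sd / n ** 0.5      # standard error, treating sd as known
    crit = 1.96 * se        # two-sided 5% cutoff for a z-test
    significant = []
    for _ in range(n_sims):
        sample_mean = statistics.fmean(
            rng.gauss(true_effect, sd) for _ in range(n)
        )
        # Keep only the estimates that reach "statistical significance".
        if abs(sample_mean) > crit:
            significant.append(abs(sample_mean))
    power = len(significant) / n_sims
    # How much larger are the significant estimates than the true effect?
    exaggeration = statistics.fmean(significant) / true_effect
    return power, exaggeration


power, exaggeration = simulate_type_m()
print(f"estimated power: {power:.2f}")
print(f"mean significant estimate / true effect: {exaggeration:.1f}")
```

Under these assumed settings only a small fraction of the simulated experiments come out significant, and every one of those significant estimates has to exceed the cutoff, which is itself several times larger than the assumed true effect. That is the sense in which significant results from low-powered studies exaggerate the effect.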

Friday, May 14, 2021

New Psych Review paper by Max Rabe et al: A Bayesian approach to dynamical modeling of eye-movement control in reading of normal, mirrored, and scrambled texts

 An important new paper by Max Rabe, a PhD student in the psychology department at Potsdam:

Open access pdf download:

Reproducible code and data: 

Title: A Bayesian approach to dynamical modeling of eye-movement control in reading of normal, mirrored, and scrambled texts

Abstract: In eye-movement control during reading, advanced process-oriented models have been developed to reproduce behavioral data. So far, model complexity and large numbers of model parameters prevented rigorous statistical inference and modeling of interindividual differences. Here we propose a Bayesian approach to both problems for one representative computational model of sentence reading (SWIFT; Engbert et al., Psychological Review, 112, 2005, pp. 777–813). We used experimental data from 36 subjects who read the text in a normal and one of four manipulated text layouts (e.g., mirrored and scrambled letters). The SWIFT model was fitted to subjects and experimental conditions individually to investigate between-subject variability. Based on posterior distributions of model parameters, fixation probabilities and durations are reliably recovered from simulated data and reproduced for withheld empirical data, at both the experimental condition and subject levels. A subsequent statistical analysis of model parameters across reading conditions generates model-driven explanations for observable effects between conditions. 

Sunday, May 09, 2021

Two important new papers from my lab on lossy compression, encoding, and retrieval interference

My student Himanshu Yadav is on a roll; he has written two very interesting papers investigating alternative models of similarity-based interference. 

The first one will appear in the Cognitive Science proceedings:

Title: Feature encoding modulates cue-based retrieval: Modeling interference effects in both grammatical and ungrammatical sentences
Abstract: Studies on similarity-based interference in subject-verb number agreement dependencies have found a consistent facilitatory effect in ungrammatical sentences but no conclusive effect in grammatical sentences. Existing models propose that interference is caused either by a faulty representation of the input (encoding-based models) or by difficulty in retrieving the subject based on cues at the verb (retrieval-based models). Neither class of model captures the observed patterns in human reading time data. We propose a new model that integrates a feature encoding mechanism into an existing cue-based retrieval model. Our model outperforms the cue-based retrieval model in explaining interference effect data from both grammatical and ungrammatical sentences. These modeling results yield a new insight into sentence processing: encoding modulates retrieval. Nouns stored in memory undergo feature distortion, which in turn affects how retrieval unfolds during dependency completion.

The second paper will appear in the International Conference on Cognitive Modeling (ICCM) proceedings:

Title: Is similarity-based interference caused by lossy compression or cue-based retrieval? A computational evaluation
Abstract: The similarity-based interference paradigm has been widely used to investigate the factors subserving subject-verb agreement processing. A consistent finding is facilitatory interference effects in ungrammatical sentences but inconclusive results in grammatical sentences. Existing models propose that interference is caused either by misrepresentation of the input (representation distortion-based models) or by mis-retrieval of the interfering noun phrase based on cues at the verb (retrieval-based models). These models fail to fully capture the observed interference patterns in the experimental data. We implement two new models under the assumption that a comprehender utilizes a lossy memory representation of the intended message when processing subject-verb agreement dependencies. Our models outperform the existing cue-based retrieval model in capturing the observed patterns in the data for both grammatical and ungrammatical sentences. Lossy compression models under different constraints can be useful in understanding the role of representation distortion in sentence comprehension.

Wednesday, April 21, 2021

Tuesday, April 20, 2021

New paper in Cognitive Science (open access): A Computational Evaluation of Two Models of Retrieval Processes in Sentence Processing in Aphasia

An exciting new paper by my PhD student Paula Lissón:

Download from here:

Code and data:

Title: A Computational Evaluation of Two Models of Retrieval Processes in Sentence Processing in Aphasia

Authors: Paula Lissón, Dorothea Pregla, Bruno Nicenboim, Dario Paape, Mick L. van het Nederend, Frank Burchert, Nicole Stadie, David Caplan, Shravan Vasishth


Abstract: Can sentence comprehension impairments in aphasia be explained by difficulties arising from dependency completion processes in parsing? Two distinct models of dependency completion difficulty are investigated, the Lewis and Vasishth (2005) activation‐based model and the direct‐access model (DA; McElree, 2000). These models' predictive performance is compared using data from individuals with aphasia (IWAs) and control participants. The data are from a self‐paced listening task involving subject and object relative clauses. The relative predictive performance of the models is evaluated using k‐fold cross‐validation. For both IWAs and controls, the activation‐based model furnishes a somewhat better quantitative fit to the data than the DA model. Model comparisons using Bayes factors show that, assuming an activation‐based model, intermittent deficiencies may be the best explanation for the cause of impairments in IWAs, although slowed syntax and lexical delayed access may also play a role. This is the first computational evaluation of different models of dependency completion using data from impaired and unimpaired individuals. This evaluation develops a systematic approach that can be used to quantitatively compare the predictions of competing models of language processing.

Sunday, April 18, 2021

New paper (to appear in Open Mind):

A postdoc in our lab, Dario Paape, has had a paper accepted in the MIT Press open access journal Open Mind, which is one of the few serious open access journals available as an outlet for psycholinguists (another is Glossa Psycholinguistics). Unlike many of the so-called open access journals out there, Open Mind is a credible journal, not least because of its editorial board (the editor in chief is none other than Ted Gibson). The review process was at least as thoughtful and thorough as what I have experienced at journals like the Journal of Memory and Language (and definitely a notch above Cognition). I am hopeful that we as a community can break free from these for-profit publishers and move towards open access journals like Open Mind and Glossa Psycholinguistics.

Download preprint from here:

Title: Does local coherence lead to targeted regressions and illusions of grammaticality?

Authors: Dario Paape, Shravan Vasishth, and Ralf Engbert

Abstract: Local coherence effects arise when the human sentence processor is temporarily misled by a locally grammatical but globally ungrammatical analysis ("The coach smiled at THE PLAYER TOSSED A FRISBEE by the opposing team"). It has been suggested that such effects occur either because sentence processing occurs in a bottom-up, self-organized manner rather than being under constant grammatical supervision (Tabor, Galantucci, & Richardson, 2004), or because local coherence can disrupt processing due to readers maintaining uncertainty about previous input (Levy, 2008). We report the results of an eye-tracking study in which subjects read German grammatical and ungrammatical sentences that either contained a locally coherent substring or not and gave binary grammaticality judgments. In our data, local coherence affected on-line processing immediately at the point of the manipulation. There was, however, no indication that local coherence led to illusions of grammaticality (a prediction of self-organization), and only weak, inconclusive support for local coherence leading to targeted regressions to critical context words (a prediction of the uncertain-input approach). We discuss implications for self-organized and noisy-channel models of local coherence.

New paper: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation

My PhD student Himanshu Yadav has recently submitted this amazing paper for review to a journal. This is the first in a series of papers that we are working on relating to the important topic of individual-level variability in sentence processing, a topic of central concern in our Collaborative Research Center on variability at Potsdam.

Download the preprint from here:

Title: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation

Authors: Himanshu Yadav, Dario Paape, Garrett Smith, Brian Dillon, and Shravan Vasishth

Abstract: Cue-based retrieval theories of sentence processing assume that syntactic dependencies are resolved through a content-addressable search process. An important recent claim is that in certain dependency types, the retrieval cues are weighted such that one cue dominates. This cue-weighting proposal aims to explain the observed average behavior, but here we show that there is systematic individual-level variation in cue weighting. Using the Lewis and Vasishth cue-based retrieval model, we estimated individual-level parameters for processing speed and cue weighting using 13 published datasets; hierarchical Approximate Bayesian Computation (ABC) was used to estimate the parameters. The modeling reveals a nuanced picture of cue weighting: we find support for the idea that some participants weight cues differentially, but not all participants do. Only fast readers tend to have the higher weighting for structural cues, suggesting that reading proficiency might be associated with cue weighting. A broader achievement of the work is to demonstrate how individual differences can be investigated in computational models of sentence processing without compromising the complexity of the model.
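For readers who have not come across Approximate Bayesian Computation, the basic idea can be sketched with plain rejection sampling on a toy problem. To be clear, this is not the hierarchical ABC procedure used in the paper, and it has nothing to do with the Lewis and Vasishth model: the normal "simulator", the uniform prior, the tolerance, the use of the sample mean as summary statistic, and the function name `abc_rejection` are all assumptions made up for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_draw, simulate, tolerance, n_draws, seed=1):
    """Toy rejection ABC: keep the parameter draws whose simulated data
    have a mean within `tolerance` of the observed mean."""
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)                     # 1. draw from the prior
        fake = simulate(theta, len(observed), rng)  # 2. simulate a dataset
        # 3. accept theta if the simulated summary is close to the observed one
        if abs(statistics.fmean(fake) - obs_summary) < tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical "observed" data: 50 reading times (ms) from a known generator.
data_rng = random.Random(42)
observed = [data_rng.gauss(400.0, 50.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    prior_draw=lambda rng: rng.uniform(200.0, 600.0),
    simulate=lambda theta, n, rng: [rng.gauss(theta, 50.0) for _ in range(n)],
    tolerance=5.0,
    n_draws=10000,
)
print(f"accepted draws: {len(posterior)}")
print(f"approximate posterior mean: {statistics.fmean(posterior):.1f}")
```

The attraction of this family of methods is that no likelihood function is ever written down; one only needs to be able to simulate data from the model, which is why it is an option for complex process models whose likelihood is hard to work with.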

Wednesday, March 31, 2021

New paper: The benefits of preregistration for hypothesis-driven bilingualism research

Download from: here

The benefits of preregistration for hypothesis-driven bilingualism research

Daniela Mertzen, Sol Lago and Shravan Vasishth

Preregistration is an open science practice that requires the specification of research hypotheses and analysis plans before the data are inspected. Here, we discuss the benefits of preregistration for hypothesis-driven, confirmatory bilingualism research. Using examples from psycholinguistics and bilingualism, we illustrate how non-peer reviewed preregistrations can serve to implement a clean distinction between hypothesis testing and data exploration. This distinction helps researchers avoid casting post-hoc hypotheses and analyses as confirmatory ones. We argue that, in keeping with current best practices in the experimental sciences, preregistration, along with sharing data and code, should be an integral part of hypothesis-driven bilingualism research.