
Tuesday, March 21, 2023

Himanshu Yadav, PhD

Today Himanshu defended his dissertation. His dissertation consists of three published papers.

1. Himanshu Yadav, Garrett Smith, Sebastian Reich, and Shravan Vasishth. Number feature distortion modulates cue-based retrieval in reading. Journal of Memory and Language, 129, 2023.

2. Himanshu Yadav, Dario Paape, Garrett Smith, Brian W. Dillon, and Shravan Vasishth. Individual differences in cue weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation. Open Mind, 2022

3. Himanshu Yadav, Garrett Smith, Daniela Mertzen, Ralf Engbert, and Shravan Vasishth. Proceedings of the Annual Meeting of the Cognitive Science Society, 44, 2022.

Congratulations to Himanshu for his truly outstanding work!



Saturday, March 11, 2023

New paper: SEAM: An Integrated Activation-Coupled Model of Sentence Processing and Eye Movements in Reading.

Michael Betancourt, a giant in the field of Bayesian statistical modeling, once indirectly pointed out to me (in a podcast interview) that one should not try to model latent cognitive processes in reading by computing summary statistics like the mean difference between conditions and then fitting the model on those summary statistics. But that is exactly what we do in psycholinguistics. Most models (including those from my lab) evaluate model performance on summary statistics from the data (usually, a mean difference), abstracting away quite dramatically from the complex processes that resulted in those reading times and regressive eye movements. 

What Michael wanted instead was a detailed process model of how the observed fixations and eye movement patterns arise. Obviously, such a model would be extremely complicated: one would have to specify the full details of oculomotor processes and their impact on eye movements, as well as a model of language comprehension, and specify how these components interact to produce eye movements at the single-trial level. Such a model quickly becomes computationally intractable if one tries to estimate its parameters from data, and that is a major barrier to building it.

Interestingly, both eye movement control models and models of sentence comprehension exist, but they live in parallel universes. Psychologists have almost always focused on eye movement control, ignoring the impact of sentence comprehension processes (I once heard a talk in which a psychologist publicly called out psycholinguists, labeling them "crazy" for studying language processing in reading :)). Similarly, most psycholinguists simply ignore the lower-level processes unfolding in reading and assume that language processing events are responsible for differences in fixation durations or in leftward eye movements (regressions). The most that psycholinguists like me are willing to do is add word frequency etc. as a co-predictor of reading time or other dependent measures when investigating reading. But in most cases even that would be going too far :).

What is missing is a model that brings these two lines of work together into one integrated reading model, in which the two systems co-determine where we move our eyes and for how long.

Max Rabe, who is wrapping up his PhD in psychology at Potsdam, Germany, demonstrates how this could be done: he takes a fully specified model of eye movement control in reading (SWIFT) and integrates into it linguistic dependency completion processes, following the principles of the cognitive architecture ACT-R. One key achievement is that the activation of a word being read is co-determined by both oculomotor processes, as specified in SWIFT, and cue-based retrieval processes, as specified in the activation-based model of retrieval. Another is showing how regressive eye movements are triggered when sentence processing difficulty (here, similarity-based interference) arises during reading.

What made the model fitting possible was Bayesian parameter estimation: Max Rabe shows in an earlier (2021) Psychological Review paper (preprint here) how parameter estimation can be carried out in complex models where the likelihood function may not be easy to work out.

Download the paper from arXiv.




Tuesday, March 07, 2023

Job opening: Postdoc position, starting 1 Oct 2023 (Vasishth lab, University of Potsdam, Germany)

I am looking for a postdoc working in sentence processing (psycholinguistics); the position is at the TV-L 13 salary level. This is a teaching + research position in my lab (vasishth.github.io) at the University of Potsdam, Germany. The planned start date is 1 October 2023, and the initial appointment (following a six-month probationary period) is for three years; this can be extended following a positive evaluation.

Principal tasks:

- Teaching two 90-minute classes to undergraduate and graduate students every semester. We teach courses on frequentist and Bayesian statistics, the foundations of mathematics for non-STEM students entering an MSc program in the linguistics department, psycholinguistics (reviews of current research), and introductions to psycholinguistics and to experimental methodology.

- Carrying out and publishing research on sentence processing (computational modeling and/or experimental work, e.g., eye-tracking, ERP, self-paced reading). For examples of our research, see: https://vasishth.github.io/publications.html.

- Participation in lab discussions and research collaborations.

Qualifications that you should have:

- A PhD in linguistics, psychology, or some related discipline. In exceptional circumstances, I will consider a prospective PhD student (with a full postdoc salary) who is willing to teach as well as do a PhD with me.

- Published scientific work.

- A background in sentence comprehension research (modeling or experimental or both).

- A solid quantitative background (basic fluency in mathematics and statistical computing at the level needed for statistical modeling and data analysis in psycholinguistics).

An ability to teach in German is desirable but not necessary. A high level of English fluency is expected, especially in writing.

The University and the linguistics department:

The University of Potsdam's linguistics department is located in Golm, a suburb of the city of Potsdam that can be reached from Berlin in about 40 minutes via a direct train connection. The department has a broad focus covering almost all areas of linguistics (syntax, morphology, semantics, phonetics/phonology, language acquisition, sentence comprehension and production, computational linguistics). The research is highly interdisciplinary, involving collaborations with psychology and mathematics, among other areas. We are a well-funded lab, with projects in a collaborative research center (SFB 1287) on variability, as well as individual grants.

The research focus of my lab:

Our lab currently consists of six postdocs (see here) and three guest professors who work closely with lab members. We work mostly on models of sentence comprehension, developing implemented computational models as well as doing experimental work to evaluate these models.

Historically, our postdocs have been very successful in getting professorships: Lena Jäger, Sol Lago, Titus von der Malsburg, Daniel Schad, João Veríssimo, Samar Husain, Bruno Nicenboim. One of our graduates, Felix Engelmann, has his own start-up in Berlin.

For representative recent work from our lab, see:

Himanshu Yadav, Garrett Smith, Sebastian Reich, and Shravan Vasishth. Number feature distortion modulates cue-based retrieval in reading. Journal of Memory and Language, 129, 2023.

Shravan Vasishth and Felix Engelmann. Sentence Comprehension as a Cognitive Process: A Computational Approach. Cambridge University Press, Cambridge, UK, 2022.

Dario Paape and Shravan Vasishth. Estimating the true cost of garden-pathing: A computational model of latent cognitive processes. Cognitive Science, 46:e13186, 2022.

Daniel J. Schad, Bruno Nicenboim, Paul-Christian Bürkner, Michael Betancourt, and Shravan Vasishth. Workflow Techniques for the Robust Use of Bayes Factors. Psychological Methods, 2022.

Bruno Nicenboim, Shravan Vasishth, and Frank Rösler. Are words pre-activated probabilistically during sentence comprehension? Evidence from new data and a Bayesian random-effects meta-analysis using publicly available data. Neuropsychologia, 142, 2020.

How to apply:

To apply, please send me an email (vasishth@uni-potsdam.de) with the subject line "Postdoc position 2023", attaching a CV, a one-page statement of interest (research and teaching), copies of any publications (including the dissertation), and the names of two or three referees whom I can contact. The position remains open until filled, but I hope to make a decision by the end of July 2023 at the latest.


Wednesday, March 23, 2022

Summer School on Statistical Methods for Linguistics and Psychology, Sept. 12-16, 2022 (applications close April 1)

The Sixth Summer School on Statistical Methods for Linguistics and Psychology will be held in Potsdam, Germany, September 12-16, 2022.  Like the previous editions of the summer school, this edition will have two frequentist and two Bayesian streams. Currently, this summer school is being planned as an in-person event.

The application form closes April 1, 2022. We will announce the decisions on or around April 15, 2022.

Course fee: There is no fee, because the summer school is funded by the Collaborative Research Center (Sonderforschungsbereich 1287). However, we will charge 40 Euros to cover the costs of coffee and snacks during the breaks and social hours. Participants will have to pay for their own accommodation.

For details, see: https://vasishth.github.io/smlp2022/

Curriculum:

1. Introduction to Bayesian data analysis (maximum 30 participants). Taught by Shravan Vasishth, assisted by Anna Laurinavichyute and Paula Lissón

This course is an introduction to Bayesian modeling, oriented towards linguists and psychologists. Topics to be covered: introduction to Bayesian data analysis, linear modeling, hierarchical models. We will cover these topics within the context of an applied Bayesian workflow that includes exploratory data analysis, model fitting, and model checking using simulation. Participants are expected to be familiar with R and must have some experience in data analysis, particularly with the R library lme4 (a minimal sketch of the lme4-to-brms transition appears after this course description).
Course materials: The previous year's course web page, with all materials (videos etc.), is available here.
Textbook: here. We will work through the first six chapters.
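
To give a concrete flavor of that starting point, here is a minimal sketch with simulated stand-in data; all variable names, the effect size, and the priors are illustrative assumptions, not course materials:

## Simulated stand-in data: log reading times for two conditions,
## repeated measures within subjects. All names are illustrative.
library(lme4)
library(brms)

set.seed(1)
d <- data.frame(subj = factor(rep(1:30, each = 20)),
                cond = rep(c(-0.5, 0.5), 300))
d$log_rt <- 6 + 0.05 * d$cond +
  rnorm(30, sd = 0.2)[d$subj] +  # by-subject intercept variation
  rnorm(nrow(d), sd = 0.3)       # residual noise

## Frequentist hierarchical model (the assumed starting point):
m_lmer <- lmer(log_rt ~ 1 + cond + (1 | subj), data = d)

## The analogous Bayesian model in brms, with weakly informative priors:
m_brm <- brm(log_rt ~ 1 + cond + (1 | subj), data = d,
             prior = c(prior(normal(6, 1), class = Intercept),
                       prior(normal(0, 0.5), class = b)),
             chains = 4, cores = 4)
summary(m_brm)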

2. Advanced Bayesian data analysis (maximum 30 participants). Taught by Bruno Nicenboim, assisted by Himanshu Yadav

This course assumes that participants already have some experience in Bayesian modeling using brms and want to transition to Stan in order to learn more advanced methods and start building simple computational cognitive models. Participants should have worked through, or be familiar with, the material in the first five chapters of our book draft: Introduction to Bayesian Data Analysis for Cognitive Science. In this course, we will cover Parts III to V of the book draft: model comparison using Bayes factors and k-fold cross-validation, introductory and relatively advanced models in Stan, and simple computational cognitive models. (A minimal sketch of the brms-to-Stan transition appears after this course description.)
Course materials: Textbook here. We will start from Part III of the book (Advanced models with Stan). Participants are expected to be familiar with the first five chapters.
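
One way to picture the brms-to-Stan transition: inspect the Stan program that brms generates for a model, then fit a hand-written Stan program directly. A hedged sketch, reusing the simulated data d from the sketch above; the model and all priors are illustrative assumptions:

## Inspect the Stan code that brms generates for a model:
library(brms)
make_stancode(log_rt ~ 1 + cond, data = d)

## A hand-written (simplified) equivalent, fitted directly with rstan:
library(rstan)
stan_model_code <- "
data {
  int<lower=1> N;
  vector[N] log_rt;
  vector[N] cond;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  alpha ~ normal(6, 1);
  beta ~ normal(0, 0.5);
  sigma ~ normal(0, 1);  // half-normal, given the lower bound
  log_rt ~ normal(alpha + beta * cond, sigma);
}
"
fit <- stan(model_code = stan_model_code,
            data = list(N = nrow(d), log_rt = d$log_rt, cond = d$cond))
print(fit)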

3. Foundational methods in frequentist statistics (maximum 30 participants). Taught by Audrey Buerki, Daniel Schad, and João Veríssimo.

Participants will be expected to have used linear mixed models before, to the level of the textbook by Winter (2019, Statistics for Linguists), and to want to acquire a deeper knowledge of frequentist foundations and a deeper understanding of the linear mixed modeling framework. Participants are also expected to have fit multiple regressions. We will cover model selection and contrast coding, with a heavy emphasis on simulations to compute power and to understand what the model implies (a minimal power-simulation sketch follows this course description). We will work on (at least some of) the participants' own datasets. This course is not appropriate for researchers new to R or to frequentist statistics.
Course materials: Textbook draft here.
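
The power simulations mentioned above follow a simple recipe: repeatedly simulate data under an assumed effect, refit the model, and count significant results. A minimal sketch; the design, effect size, and noise levels are illustrative assumptions, not course code:

## Minimal power simulation for one candidate design (all numbers assumed):
library(lme4)

sim_power <- function(n_subj, effect = 0.05, n_sim = 200) {
  pvals <- replicate(n_sim, {
    d <- data.frame(subj = factor(rep(1:n_subj, each = 20)),
                    cond = sample(rep(c(-0.5, 0.5), n_subj * 10)))
    d$log_rt <- 6 + effect * d$cond +
      rnorm(n_subj, sd = 0.2)[d$subj] + rnorm(nrow(d), sd = 0.3)
    m <- lmer(log_rt ~ 1 + cond + (1 | subj), data = d)
    t_val <- coef(summary(m))["cond", "t value"]
    2 * (1 - pnorm(abs(t_val)))  # normal approximation to the p-value
  })
  mean(pvals < 0.05)  # proportion of significant results = estimated power
}

sim_power(30)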

4. Advanced methods in frequentist statistics with Julia (maximum 30 participants). Taught by Reinhold Kliegl, Phillip Alday, Julius Krumbiegel, and Doug Bates.
Applicants must have experience with linear mixed models and be interested in learning how to carry out such analyses with the Julia-based MixedModels.jl package (i.e., the analogue of the R-based lme4 package). MixedModels.jl has some significant advantages: (a) a new and more efficient computational implementation, (b) speed, needed for, e.g., complex designs and power simulations, (c) more flexibility for the selection of parsimonious mixed models, and (d) more flexibility in taking into account autocorrelations or other dependencies, typical of EEG- and fMRI-based time series (under development).
We do not expect profound knowledge of Julia from participants; the necessary subset will be taught on the first day of the course. We do expect a readiness to install Julia and the confidence that, with some basic instruction, participants will be able to adapt prepared Julia scripts for their own data or to translate some of their own lme4 commands into the equivalent MixedModels.jl commands (a sketch of this correspondence follows the course description). The course will be taught in a hybrid IDE. There is already the option to execute R chunks from within Julia, meaning one needs Julia primarily for executing MixedModels.jl commands as a replacement for lme4. There is also an option to call MixedModels.jl from within R and to process the resulting object like an lme4 object. Thus, much of the pre- and postprocessing (e.g., data simulation for complex experimental designs; visualization of partial-effect interactions or shrinkage effects) can be carried out in R.
Course materials: GitHub repo here.
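
To illustrate the lme4-to-MixedModels.jl correspondence that the course builds on, here is a hedged sketch; the data are simulated stand-ins and the formula is an assumption, not course material:

## R/lme4 version of a typical crossed random-effects analysis
## (simulated stand-in data; all names are illustrative):
library(lme4)

set.seed(1)
d <- expand.grid(subj = factor(1:20), item = factor(1:16))
d$cond <- sample(rep(c(-0.5, 0.5), length.out = nrow(d)))
d$log_rt <- 6 + 0.05 * d$cond +
  rnorm(20, sd = 0.2)[d$subj] + rnorm(16, sd = 0.1)[d$item] +
  rnorm(nrow(d), sd = 0.3)

m <- lmer(log_rt ~ 1 + cond + (1 + cond | subj) + (1 | item), data = d)

## The equivalent MixedModels.jl call in Julia looks like this:
##   using MixedModels
##   fit(MixedModel,
##       @formula(log_rt ~ 1 + cond + (1 + cond | subj) + (1 | item)), d)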




New paper in Computational Brain and Behavior: Sample size determination in Bayesian Linear Mixed Models

We've just had a paper accepted in Computational Brain and Behavior, an open access journal of the Society for Mathematical Psychology.

Even though I am not a psychologist, I feel an increasing affinity to this field compared to psycholinguistics proper. I will be submitting more of my papers to this journal and to other open access journals (Glossa Psycholinguistics and Open Mind in particular) in the future.

Some things I liked about this journal:

- A fast, well-informed, intelligent, and useful set of reviews. The reviewers actually understand what they are talking about! It's refreshing to find people out there who speak my language (and I don't mean English or Hindi). Also, the reviewers signed their reviews; this doesn't usually happen.

- Free availability of the paper after publication; I didn't have to do anything to make this happen. By contrast, I don't even have copies of my own articles published in APA journals. The same goes for Elsevier journals like the Journal of Memory and Language. Either I shell out $$$ to make the paper open access, or I learn to live with the arXiv version of my paper. 

- The proofing was *excellent*. By contrast, the Journal of Memory and Language adds approximately 500 mistakes to my papers every time they publish one (which we then have to correct, if we catch them at all). E.g., in this paper we had to issue a correction about a German example; this error was added by the proofer! Another surprising example of JML actually destroying our paper's formatting is this one; here, the arXiv version has better formatting than the published paper, which cost several thousand Euros!

- LaTeX is encouraged. By contrast, APA journals demand that papers be submitted in W**d. 

Here is the paper itself: we present an approach, adapted from the work of two statisticians (Wang and Gelfand), for determining the approximate sample size needed to draw meaningful inferences using Bayes factors in hierarchical models (aka linear mixed models); a minimal sketch of the general idea appears below. The example comes from a psycholinguistic study, but the method is general. Code and data are of course available online.
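
The core simulation loop looks roughly like this. This is a hedged sketch of the general idea, not the paper's code; the effect size, priors, and decision threshold are all illustrative assumptions:

## Simulation-based sample size determination using Bayes factors (sketch):
library(brms)

simulate_data <- function(n) {
  d <- data.frame(cond = rep(c(-0.5, 0.5), n / 2))
  d$log_rt <- 6 + 0.02 * d$cond + rnorm(n, sd = 0.3)
  d
}

bf_once <- function(n) {
  d <- simulate_data(n)
  m1 <- brm(log_rt ~ 1 + cond, data = d,
            prior = prior(normal(0, 0.1), class = b),
            save_pars = save_pars(all = TRUE), refresh = 0)
  m0 <- brm(log_rt ~ 1, data = d,
            save_pars = save_pars(all = TRUE), refresh = 0)
  bayes_factor(m1, m0)$bf  # computed via bridge sampling
}

## For each candidate n, estimate how often the Bayes factor clears a
## threshold (e.g., 10); increase n until that proportion is acceptable:
bfs <- replicate(20, bf_once(n = 100))
mean(bfs > 10)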

The pdf: https://link.springer.com/article/10.1007/s42113-021-00125-y


 


Thursday, February 03, 2022

EMLAR 2022 tutorial on Bayesian methods

At EMLAR 2022 I will teach two sessions introducing Bayesian methods. Here is the abstract for the two sessions:

EMLAR 2022: An introduction to Bayesian data analysis

Taught by Shravan Vasishth (vasishth.github.io)


Session 1. Tuesday 19 April 2022, 1-3PM (Zoom link will be provided)

Modern probabilistic programming languages like Stan (mc-stan.org) have made Bayesian methods increasingly accessible to researchers in linguistics and psychology. However, finding an entry point into these methods is often difficult for researchers. In this tutorial, I will provide an informal introduction to the fundamental ideas behind Bayesian statistics, using examples that illustrate applications to psycholinguistics. I will also discuss some of the advantages of the Bayesian approach over the standardly used frequentist paradigms: uncertainty quantification, robust estimates through regularization, the ability to incorporate expert and/or prior knowledge into the data analysis, and the ability to flexibly define the generative process and thereby to directly address the actual research question (as opposed to a straw-man null hypothesis). Suggestions for further reading will be provided. In this tutorial, I presuppose that the audience is familiar with linear mixed models (as used in R with the package lme4).
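
To make the first two of those advantages concrete, here is a minimal hedged sketch; the dataset, effect size, and priors are all illustrative assumptions, not tutorial materials:

## Incorporating prior knowledge and quantifying uncertainty with brms.
## Suppose (hypothetically) that prior studies suggest an effect of
## roughly 20 ms, with an uncertainty of about 10 ms:
library(brms)

set.seed(1)
d <- data.frame(cond = rep(c(-0.5, 0.5), 40))
d$rt <- 400 + 20 * d$cond + rnorm(80, sd = 100)  # simulated toy data

fit <- brm(rt ~ 1 + cond, data = d,
           prior = c(prior(normal(400, 100), class = Intercept),
                     prior(normal(20, 10), class = b)))  # regularizing priors

## The full posterior for the effect quantifies uncertainty directly,
## instead of a point estimate plus a p-value:
posterior_summary(fit)["b_cond", ]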


Session 2. Thursday 21 April 2022, 9:30-11:30 (Zoom link will be provided)

This session presupposes that the participant has attended Session 1. I will show some case studies, using brms and Stan code, that demonstrate the major applications of Bayesian methods in psycholinguistics. I will reference/use some of the material described in this online textbook (in progress): https://vasishth.github.io/bayescogsci/book/

Tuesday, December 14, 2021

New paper: Syntactic and semantic interference in sentence comprehension: Support from English and German eye-tracking data

This paper is part of a larger project on the predictions of cue-based retrieval theories that has been running for 4-5 years. The paper revisits Van Dyke's (2007) design using eye-tracking (the data are from comparable designs in English and German). The reading time patterns are consistent with syntactic interference at the moment of retrieval in both languages. Semantic interference shows interesting differences between English and German: in English, semantic interference seems to happen simultaneously with syntactic interference, but in German, semantic interference is delayed (it appears in the post-critical region). The morphosyntactic properties of German could be driving the lag in semantic interference. We also discuss the data in the context of the quantitative predictions from the Lewis & Vasishth cue-based retrieval model.

One striking fact about psycholinguistics in general, and interference effects in particular, is that most of the data tend to come from English. Very few people work on non-English languages. I bet there are a lot of surprises in store for us once we step out of the narrow confines of English. I bet that most theories of sentence processing are overfitted to English and will not scale. And if you submit a paper to a journal using data from a non-English language, there will always be a reviewer or editor who asks you to explain why you chose some language X rather than English. Nobody ever questions you if you study English. A bizarre world.






Title: Syntactic and semantic interference in sentence comprehension: Support from English and German eye-tracking data

Abstract

A long-standing debate in the sentence processing literature concerns the time course of syntactic and semantic information in online sentence comprehension. The default assumption in cue-based models of parsing is that syntactic and semantic retrieval cues simultaneously guide dependency resolution. When retrieval cues match multiple items in memory, this leads to similarity-based interference. Both semantic and syntactic interference have been shown to occur in English. However, the relative timing of syntactic vs. semantic interference remains unclear. In this first-ever cross-linguistic investigation of the time course of syntactic vs. semantic interference, the data from two eye-tracking reading experiments (English and German) suggest that the two types of interference can in principle arise simultaneously during retrieval. However, the data also indicate that semantic cues may be evaluated with a small timing lag in German compared to English. This suggests that there may be cross-linguistic variation in how syntactic and semantic cues are used to resolve linguistic dependencies in real-time.

Download the pdf from here: https://psyarxiv.com/ua9yv


Tuesday, December 07, 2021

New paper accepted in MIT Press Journal Open Mind: Individual differences in cue weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation

My PhD student Himanshu Yadav has just had an important paper on modeling individual differences provisionally accepted in the open access journal Open Mind. One reason this paper is important is that it demonstrates why it is crucial to understand systematic individual-level behavior in the data, and what the observed data imply for computational models of sentence processing. As Blastland and Spiegelhalter put it, "The average is an abstraction. The reality is variation." Our focus should be on understanding and explaining the variation, not just the average behavior. More exciting papers on this topic are coming soon from Himanshu!


The reviews from Open Mind were of very high quality, certainly as high as or higher than those I have received from many top closed-access journals over the last 20 years. The journal has a top-notch editorial board, led by none other than Ted Gibson. This is our second paper in Open Mind; the first was this one. I plan to publish more of our papers in this journal (along with the other open access journal, Glossa Psycholinguistics, also led by a stellar set of editors, Fernanda Ferreira and Brian Dillon). I hope that these open access journals can become the norm for our field. I wonder what it will take for that to happen.


Himanshu Yadav, Dario Paape, Garrett Smith, Brian W. Dillon, and Shravan Vasishth. Individual differences in cue weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation. Open Mind, 2021. Provisionally accepted.


The pdf is here.

Monday, December 06, 2021

New paper: Similarity-based interference in sentence comprehension in aphasia: A computational evaluation of two models of cue-based retrieval.

My PhD student Paula Lissón has just submitted this important new paper to a journal for review. The paper is important for several reasons, but the most important one is that it is the first to quantitatively compare two competing computational models of retrieval in German sentence processing, using data from unimpaired controls and individuals with aphasia. The work is the culmination of four years of hard work, involving the collection of a relatively large data set (this amazing feat was achieved by Dorothea Pregla and is documented in a series of papers she has written; for example, see this one in Brain and Language) and the development of computational models in Stan to systematically evaluate competing theoretical claims. This line of work should raise the bar in psycholinguistics when it comes to testing the predictions of different theories. It is pretty common in psycholinguistics to wildly wave one's hands, say things like "sentence processing in individuals with aphasia is just noisy", be satisfied with that statement, and then publish it as a big insight into sentence processing difficulty. An important achievement of Paula's work, which builds on Bruno Nicenboim's research on Bayesian cognitive modeling, is to demonstrate how to nail down such a claim and test it quantitatively. It seems obvious that one should do that, but surprisingly, this kind of quantitative evaluation of models is still relatively rare in the field.


Title: Similarity-based interference in sentence comprehension in aphasia: A computational evaluation of two models of cue-based retrieval.


Abstract: Sentence comprehension requires the listener to link incoming words with short-term memory representations in order to build linguistic dependencies. The cue-based retrieval theory of sentence processing predicts that the retrieval of these memory representations is affected by similarity-based interference. We present the first large-scale computational evaluation of interference effects in two models of sentence processing – the activation-based model, and a modification of the direct-access model – in individuals with aphasia (IWA) and control participants in German. The parameters of the models are linked to prominent theories of processing deficits in aphasia, and the models are tested against two linguistic constructions in German: Pronoun resolution and relative clauses. The data come from a visual-world eye-tracking experiment combined with a sentence-picture matching task. The results show that both control participants and IWA are susceptible to retrieval interference, and that a combination of theoretical explanations (intermittent deficiencies, slow syntax, and resource reduction) can explain IWA’s deficits in sentence processing. Model comparisons reveal that both models have a similar predictive performance in pronoun resolution, but the activation-based model outperforms the direct-access model in relative clauses.


Download: here. Paula also has another paper, modeling English data from unimpaired controls and individuals with aphasia, in Cognitive Science.

Friday, November 12, 2021

Book: Sentence comprehension as a cognitive process: A computational approach (Vasishth and Engelmann)

 

My book with Felix Engelmann has just been published. It puts together in one place 20 years of research on retrieval models, carried out by my students, colleagues, and myself.



Friday, May 14, 2021

New Psych Review paper by Max Rabe et al: A Bayesian approach to dynamical modeling of eye-movement control in reading of normal, mirrored, and scrambled texts

An important new paper by Max Rabe, a PhD student in the psychology department at Potsdam:

Open access pdf download: https://psyarxiv.com/nw2pb/

Reproducible code and data: https://osf.io/t9sbf/ 

Title: A Bayesian approach to dynamical modeling of eye-movement control in reading of normal, mirrored, and scrambled texts

Abstract: In eye-movement control during reading, advanced process-oriented models have been developed to reproduce behavioral data. So far, model complexity and large numbers of model parameters prevented rigorous statistical inference and modeling of interindividual differences. Here we propose a Bayesian approach to both problems for one representative computational model of sentence reading (SWIFT; Engbert et al., Psychological Review, 112, 2005, pp. 777–813). We used experimental data from 36 subjects who read the text in a normal and one of four manipulated text layouts (e.g., mirrored and scrambled letters). The SWIFT model was fitted to subjects and experimental conditions individually to investigate between-subject variability. Based on posterior distributions of model parameters, fixation probabilities and durations are reliably recovered from simulated data and reproduced for withheld empirical data, at both the experimental condition and subject levels. A subsequent statistical analysis of model parameters across reading conditions generates model-driven explanations for observable effects between conditions. 

Sunday, May 09, 2021

Two important new papers from my lab on lossy compression, encoding, and retrieval interference

My student Himanshu Yadav is on a roll; he has written two very interesting papers investigating alternative models of similarity-based interference. 

The first one will appear in the Cognitive Science proceedings:

 Title: Feature encoding modulates cue-based retrieval: Modeling interference effects in both grammatical and ungrammatical sentences
Abstract: Studies on similarity-based interference in subject-verb number agreement dependencies have found a consistent facilitatory effect in ungrammatical sentences but no conclusive effect in grammatical sentences. Existing models propose that interference is caused either by a faulty representation of the input (encoding-based models) or by difficulty in retrieving the subject based on cues at the verb (retrieval-based models). Neither class of model captures the observed patterns in human reading time data. We propose a new model that integrates a feature encoding mechanism into an existing cue-based retrieval model. Our model outperforms the cue-based retrieval model in explaining interference effect data from both grammatical and ungrammatical sentences. These modeling results yield a new insight into sentence processing: encoding modulates retrieval. Nouns stored in memory undergo feature distortion, which in turn affects how retrieval unfolds during dependency completion.


The second paper will appear in the International Conference on Cognitive Modeling (ICCM) proceedings:

Title: Is similarity-based interference caused by lossy compression or cue-based retrieval? A computational evaluation
Abstract: The similarity-based interference paradigm has been widely used to investigate the factors subserving subject-verb agreement processing. A consistent finding is facilitatory interference effects in ungrammatical sentences but inconclusive results in grammatical sentences. Existing models propose that interference is caused either by misrepresentation of the input (representation distortion-based models) or by mis-retrieval of the interfering noun phrase based on cues at the verb (retrieval-based models). These models fail to fully capture the observed interference patterns in the experimental data. We implement two new models under the assumption that a comprehender utilizes a lossy memory representation of the intended message when processing subject-verb agreement dependencies. Our models outperform the existing cue-based retrieval model in capturing the observed patterns in the data for both grammatical and ungrammatical sentences. Lossy compression models under different constraints can be useful in understanding the role of representation distortion in sentence comprehension.





Tuesday, April 20, 2021

New paper in Cognitive Science (open access): A Computational Evaluation of Two Models of Retrieval Processes in Sentence Processing in Aphasia

An exciting new paper by my PhD student Paula Lissón:

Download from here: https://onlinelibrary.wiley.com/doi/10.1111/cogs.12956

Code and data: https://osf.io/kdjqz/

Title: A Computational Evaluation of Two Models of Retrieval Processes in Sentence Processing in Aphasia

Authors: Paula Lissón, Dorothea Pregla, Bruno Nicenboim, Dario Paape, Mick L. van het Nederend, Frank Burchert, Nicole Stadie, David Caplan, and Shravan Vasishth

Abstract

Can sentence comprehension impairments in aphasia be explained by difficulties arising from dependency completion processes in parsing? Two distinct models of dependency completion difficulty are investigated, the Lewis and Vasishth (2005) activation‐based model and the direct‐access model (DA; McElree, 2000). These models' predictive performance is compared using data from individuals with aphasia (IWAs) and control participants. The data are from a self‐paced listening task involving subject and object relative clauses. The relative predictive performance of the models is evaluated using k‐fold cross‐validation. For both IWAs and controls, the activation‐based model furnishes a somewhat better quantitative fit to the data than the DA model. Model comparisons using Bayes factors show that, assuming an activation‐based model, intermittent deficiencies may be the best explanation for the cause of impairments in IWAs, although slowed syntax and lexical delayed access may also play a role. This is the first computational evaluation of different models of dependency completion using data from impaired and unimpaired individuals. This evaluation develops a systematic approach that can be used to quantitatively compare the predictions of competing models of language processing.
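
For readers unfamiliar with k-fold cross-validation as a model comparison tool, here is a hedged sketch of the generic workflow using brms. The paper compares custom Stan implementations of the two cognitive models, so the formulas and data below are simplified stand-ins, not the paper's models:

## Generic k-fold cross-validation model comparison in brms.
library(brms)

## Stand-in data (illustrative only):
set.seed(1)
d <- data.frame(subj = factor(rep(1:20, each = 10)),
                cond = rep(c(-0.5, 0.5), 100))
d$log_rt <- 6 + 0.05 * d$cond + rnorm(200, sd = 0.3)

m1 <- brm(log_rt ~ 1 + cond + (1 | subj), data = d)
m2 <- brm(log_rt ~ 1 + (1 | subj), data = d)

## Refit each model K times, holding out one fold at a time:
kf1 <- kfold(m1, K = 10)
kf2 <- kfold(m2, K = 10)

## Higher expected log predictive density (elpd) indicates better
## out-of-sample predictive performance:
loo_compare(kf1, kf2)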

Sunday, April 18, 2021

New paper (to appear in Open Mind):

A postdoc in our lab, Dario Paape, has had a paper accepted in the MIT Press open access journal Open Mind, which is one of the few serious open access journals available as an outlet for psycholinguists (another is Glossa Psycholinguistics). Unlike many of the so-called open access journals out there, Open Mind is a credible journal, not least because of its editorial board (the editor-in-chief is none other than Ted Gibson). The review process was as thoughtful and thorough as, or more so than, what I have experienced at journals like the Journal of Memory and Language (and definitely a notch above Cognition). I am hopeful that we as a community can break free from these for-profit publishers and move towards open access journals like Open Mind and Glossa Psycholinguistics.

Download preprint from here: https://psyarxiv.com/2ztgw/

Title: Does local coherence lead to targeted regressions and illusions of grammaticality?

Authors: Dario Paape, Shravan Vasishth, and Ralf Engbert

Abstract: Local coherence effects arise when the human sentence processor is temporarily misled by a locally grammatical but globally ungrammatical analysis ("The coach smiled at THE PLAYER TOSSED A FRISBEE by the opposing team"). It has been suggested that such effects occur either because sentence processing occurs in a bottom-up, self-organized manner rather than being under constant grammatical supervision (Tabor, Galantucci, & Richardson, 2004), or because local coherence can disrupt processing due to readers maintaining uncertainty about previous input (Levy, 2008). We report the results of an eye-tracking study in which subjects read German grammatical and ungrammatical sentences that either contained a locally coherent substring or not and gave binary grammaticality judgments. In our data, local coherence affected on-line processing immediately at the point of the manipulation. There was, however, no indication that local coherence led to illusions of grammaticality (a prediction of self-organization), and only weak, inconclusive support for local coherence leading to targeted regressions to critical context words (a prediction of the uncertain-input approach). We discuss implications for self-organized and noisy-channel models of local coherence.

New paper: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation


My PhD student Himanshu Yadav has recently submitted this amazing paper to a journal for review. This is the first in a series of papers that we are working on relating to the important topic of individual-level variability in sentence processing, a topic of central concern in our Collaborative Research Center on variability at Potsdam.

Download the preprint from here: https://psyarxiv.com/4jdu5/

Title: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation

Authors: Himanshu Yadav, Dario Paape, Garrett Smith, Brian Dillon, and Shravan Vasishth

Abstract: Cue-based retrieval theories of sentence processing assume that syntactic dependencies are resolved through a content-addressable search process. An important recent claim is that in certain dependency types, the retrieval cues are weighted such that one cue dominates. This cue-weighting proposal aims to explain the observed average behavior, but here we show that there is systematic individual-level variation in cue weighting. Using the Lewis and Vasishth cue-based retrieval model, we estimated individual-level parameters for processing speed and cue weighting using 13 published datasets; hierarchical Approximate Bayesian Computation (ABC) was used to estimate the parameters. The modeling reveals a nuanced picture of cue weighting: we find support for the idea that some participants weight cues differentially, but not all participants do. Only fast readers tend to have the higher weighting for structural cues, suggesting that reading proficiency might be associated with cue weighting. A broader achievement of the work is to demonstrate how individual differences can be investigated in computational models of sentence processing without compromising the complexity of the model.
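
For readers who have not come across ABC: the key idea is to replace likelihood evaluation with simulation, keeping only those parameter draws whose simulated data resemble the observed data. Below is a minimal rejection-ABC sketch; the paper itself uses a hierarchical ABC scheme over a full retrieval model, and the toy simulator, parameter, and numbers here are purely illustrative assumptions:

## Minimal rejection-ABC sketch (all names and numbers are illustrative):
set.seed(1)
observed_mean_rt <- 450  # toy observed summary statistic (ms)

## Simulator: generate a summary statistic given a latent parameter.
simulate_mean_rt <- function(speed) {
  mean(rnorm(100, mean = 600 - speed, sd = 50))
}

n_draws <- 20000
prior_speed <- runif(n_draws, min = 0, max = 300)  # prior on the parameter
sims <- sapply(prior_speed, simulate_mean_rt)

## Rejection step: keep draws whose simulated summary is close to the data.
epsilon <- 5
posterior_speed <- prior_speed[abs(sims - observed_mean_rt) < epsilon]
quantile(posterior_speed, c(0.025, 0.5, 0.975))  # approximate posterior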

Friday, March 26, 2021

Freshly minted professor from our lab: Prof. Dr. Titus von der Malsburg


One of my first PhD students, Titus von der Malsburg, has just been sworn in as a Professor of Psycholinguistics and Cognitive Modeling (tenure-track assistant professor) at the Institute of Linguistics, University of Stuttgart, in Germany. Stuttgart is one of the most exciting places in Germany for computationally oriented scientists.

Titus is the eighth professor coming out of my lab.  He does very exciting work in psycholinguistics; check out his work here.

Tuesday, March 09, 2021

Talk at Stanford (April 20, 2021): Dependency completion in sentence processing: Some recent computational and empirical investigations

Title: Dependency completion in sentence processing: Some recent computational and empirical investigations 
When: April 20, 2021, 9 PM German time
Where: Zoom

Shravan Vasishth (vasishth.github.io)

Abstract:
Dependency completion processes in sentence processing have been intensively studied in psycholinguistics (e.g., Gibson, 2000). I will discuss some recent work (e.g., Yadav et al., 2021) on computational models of dependency completion as they relate to a class of effects, so-called interference effects (Jäger et al., 2017). Using antecedent-reflexive and subject-verb number dependencies as a case study (Jäger et al., 2020), I will discuss the evidence base for some of the competing theoretical claims relating to these phenomena. A common thread running through the talk will be that the well-known replication and statistical crisis in psychology and other areas (Nosek et al., 2015; Gelman and Carlin, 2014) is also unfolding in psycholinguistics and needs to be taken seriously (e.g., Vasishth et al., 2018).

References 

Andrew Gelman and John Carlin (2014). Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors. Perspectives on Psychological Science, 9(6), 641-651.

Edward Gibson (2000). The dependency locality theory: A distance-based theory of linguistic complexity. Image, Language, Brain, 95-126.

Lena A. Jäger, Felix Engelmann, and Shravan Vasishth, (2017). Similarity-based interference in sentence comprehension: Literature review and Bayesian meta-analysis. Journal of Memory and Language, 94:316-339. 

Lena A. Jäger, Daniela Mertzen, Julie A. Van Dyke, and Shravan Vasishth, (2020). Interference patterns in subject-verb agreement and reflexives revisited: A large-sample study. Journal of Memory and Language, 111. 

Brian A. Nosek and Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

Shravan Vasishth, Daniela Mertzen, Lena A. Jäger, and Andrew Gelman, (2018). The statistical significance filter leads to overoptimistic expectations of replicability. Journal of Memory and Language, 103:151-175. 

Shravan Vasishth and Felix Engelmann, (2021). Sentence comprehension as a cognitive process: A computational approach. Cambridge University Press. In Press.

Himanshu Yadav, Garrett Smith, and Shravan Vasishth, (2021). Feature encoding modulates cue-based retrieval: Modeling interference effects in both grammatical and ungrammatical sentences. Submitted.