This blog is a repository of cool things relating to statistical computing, simulation and stochastic modeling.
Wednesday, April 21, 2021
Video recording of my talk at Stanford (April 20, 2021)
Tuesday, April 20, 2021
New paper in Cognitive Science (open access): A Computational Evaluation of Two Models of Retrieval Processes in Sentence Processing in Aphasia
An exciting new paper by my PhD student Paula Lissón
Download from here: https://onlinelibrary.wiley.com/doi/10.1111/cogs.12956
Code and data: https://osf.io/kdjqz/
Title: A Computational Evaluation of Two Models of Retrieval Processes in Sentence Processing in Aphasia
Authors: Paula Lissón, Dorothea Pregla, Bruno Nicenboim, Dario Paape, Mick L. van het Nederend, Frank Burchert, Nicole Stadie, David Caplan, Shravan Vasishth
Abstract:
Can sentence comprehension impairments in aphasia be explained by difficulties arising from dependency completion processes in parsing? Two distinct models of dependency completion difficulty are investigated: the Lewis and Vasishth (2005) activation-based model and the direct-access model (DA; McElree, 2000). These models' predictive performance is compared using data from individuals with aphasia (IWAs) and control participants. The data are from a self-paced listening task involving subject and object relative clauses. The relative predictive performance of the models is evaluated using k-fold cross-validation. For both IWAs and controls, the activation-based model furnishes a somewhat better quantitative fit to the data than the DA model. Model comparisons using Bayes factors show that, assuming an activation-based model, intermittent deficiencies may be the best explanation for the cause of impairments in IWAs, although slowed syntax and delayed lexical access may also play a role. This is the first computational evaluation of different models of dependency completion using data from impaired and unimpaired individuals. This evaluation develops a systematic approach that can be used to quantitatively compare the predictions of competing models of language processing.
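For readers who want to see what k-fold cross-validation looks like in practice, here is a minimal sketch in R. This is not the paper's code (that is available at the OSF link above); the data and the two competing models below are invented purely for illustration. Each model is fit on k-1 folds and scored by the log-likelihood it assigns to the held-out fold; the model with the higher summed score has the better predictive performance.

    # Minimal k-fold cross-validation sketch (simulated data, toy models).
    set.seed(1)
    dat <- data.frame(x = rnorm(100))
    dat$y <- 2 * dat$x + rnorm(100)
    k <- 10
    fold <- sample(rep(1:k, length.out = nrow(dat)))
    cv_score <- function(form) {
      sum(sapply(1:k, function(i) {
        fit <- lm(form, data = dat[fold != i, ])
        mu <- predict(fit, newdata = dat[fold == i, ])
        sum(dnorm(dat$y[fold == i], mean = mu, sd = sigma(fit), log = TRUE))
      }))
    }
    cv_score(y ~ x)  # model 1: predictor included
    cv_score(y ~ 1)  # model 2: intercept only; higher score = better fit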
Sunday, April 18, 2021
New paper (to appear in Open Mind):
A postdoc in our lab, Dario Paape, has had a paper accepted in the MIT Press open access journal Open Mind, which is one of the few serious open access journals available as an outlet for psycholinguists (another is Glossa Psycholinguistics). Unlike many of the so-called open access journals out there, Open Mind is a credible journal, not least because of its editorial board (the editor in chief is none other than Ted Gibson). The review process was as thoughtful and thorough as, or more so than, what I have experienced at journals like the Journal of Memory and Language (and definitely a notch above Cognition). I am hopeful that we as a community can break free from these for-profit publishers and move towards open access journals like Open Mind and Glossa Psycholinguistics.
Download preprint from here: https://psyarxiv.com/2ztgw/
Title: Does local coherence lead to targeted regressions and illusions of grammaticality?
New paper: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation
My PhD student Himanshu Yadav has recently submitted this amazing paper for review to a journal. This is the first in a series of papers that we are working on relating to the important topic of individual-level variability in sentence processing, a topic of central concern in our Collaborative Research Center on variability at Potsdam.
Download the preprint from here: https://psyarxiv.com/4jdu5/
Title: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation
Authors: Himanshu Yadav, Dario Paape, Garrett Smith, Brian Dillon, and Shravan Vasishth
Abstract: Cue-based retrieval theories of sentence processing assume that syntactic dependencies are resolved through a content-addressable search process. An important recent claim is that in certain dependency types, the retrieval cues are weighted such that one cue dominates. This cue-weighting proposal aims to explain the observed average behavior, but here we show that there is systematic individual-level variation in cue weighting. Using the Lewis and Vasishth cue-based retrieval model, we estimated individual-level parameters for processing speed and cue weighting using 13 published datasets; hierarchical Approximate Bayesian Computation (ABC) was used to estimate the parameters. The modeling reveals a nuanced picture of cue weighting: we find support for the idea that some participants weight cues differentially, but not all participants do. Only fast readers tend to have the higher weighting for structural cues, suggesting that reading proficiency might be associated with cue weighting. A broader achievement of the work is to demonstrate how individual differences can be investigated in computational models of sentence processing without compromising the complexity of the model.
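Since hierarchical ABC may be unfamiliar to many readers, here is a deliberately tiny rejection-ABC sketch in R that conveys the core idea (the paper uses a far more sophisticated hierarchical algorithm over a full retrieval model; all names and numbers below are invented): draw parameters from the prior, simulate data, and keep only those draws whose simulated summary statistic lands close to the observed one.

    # Rejection ABC, minimal sketch: recover a single participant's
    # processing-rate parameter from their mean reading time.
    set.seed(1)
    obs_mean <- 450  # observed mean reading time in ms (hypothetical)
    simulate_mean_rt <- function(rate)
      mean(rlnorm(50, meanlog = log(rate), sdlog = 0.3))
    prior_draws <- runif(20000, 300, 700)      # uniform prior over the rate
    sim_means <- sapply(prior_draws, simulate_mean_rt)
    posterior <- prior_draws[abs(sim_means - obs_mean) < 5]  # 5 ms tolerance
    quantile(posterior, c(0.025, 0.5, 0.975))  # approximate posterior summary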
Wednesday, March 31, 2021
New paper: The benefits of preregistration for hypothesis-driven bilingualism research
Download from: here
The benefits of preregistration for hypothesis-driven bilingualism research
Daniela Mertzen, Sol Lago and Shravan Vasishth
Preregistration is an open science practice that requires the specification of research hypotheses and analysis plans before the data are inspected. Here, we discuss the benefits of preregistration for hypothesis-driven, confirmatory bilingualism research. Using examples from psycholinguistics and bilingualism, we illustrate how non-peer reviewed preregistrations can serve to implement a clean distinction between hypothesis testing and data exploration. This distinction helps researchers avoid casting post-hoc hypotheses and analyses as confirmatory ones. We argue that, in keeping with current best practices in the experimental sciences, preregistration, along with sharing data and code, should be an integral part of hypothesis-driven bilingualism research.
Friday, March 26, 2021
Freshly minted professor from our lab: Prof. Dr. Titus von der Malsburg
One of my first PhD students, Titus von der Malsburg, has just been sworn in as a Professor of Psycholinguistics and Cognitive Modeling (tenure track assistant professor) at the Institute of Linguistics, University of Stuttgart in Germany. Stuttgart is one of the most exciting places to be in Germany for computationally oriented scientists.
Titus is the eighth professor to come out of my lab. He does very exciting work in psycholinguistics; check it out here.
Wednesday, March 17, 2021
New paper: Workflow Techniques for the Robust Use of Bayes Factors
Workflow Techniques for the Robust Use of Bayes Factors
Inferences about hypotheses are ubiquitous in the cognitive sciences. Bayes factors provide one general way to compare different hypotheses by their compatibility with the observed data. Those quantifications can then also be used to choose between hypotheses. While Bayes factors provide an immediate approach to hypothesis testing, they are highly sensitive to details of the data and model assumptions. Moreover, it's not clear how straightforwardly this approach can be implemented in practice, and in particular how sensitive it is to the details of the computational implementation. Here, we investigate these questions for Bayes factor analyses in the cognitive sciences. We explain the statistics underlying Bayes factors as a tool for Bayesian inference and discuss that utility functions are needed for principled decisions on hypotheses. Next, we study how Bayes factors misbehave under different conditions. This includes a study of errors in the estimation of Bayes factors. Importantly, it is unknown whether Bayes factor estimates based on bridge sampling are unbiased for complex analyses. We are the first to use simulation-based calibration as a tool to test the accuracy of Bayes factor estimates. We further study how stable Bayes factors are across different MCMC draws and how they depend on variation in the data, and we look at the variability of decisions based on Bayes factors and at how to optimize decisions using a utility function. We outline a Bayes factor workflow that researchers can use to study whether Bayes factors are robust for their individual analysis, and we illustrate this workflow using an example from the cognitive sciences. We hope that this study will provide a workflow to test the strengths and limitations of Bayes factors as a way to quantify evidence in support of scientific hypotheses. Reproducible code is available from this https URL.
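To get a hands-on feel for one of the sensitivities the paper investigates, here is a small R illustration of my own (not from the paper): the Bayes factor for even a simple paired comparison can change substantially depending on the scale of the prior placed on the effect size. This uses the BayesFactor package's ttestBF function with simulated data.

    # Bayes factor sensitivity to the prior scale (illustrative only).
    library(BayesFactor)
    set.seed(1)
    d <- rnorm(30, mean = 0.2, sd = 1)  # simulated paired differences
    ttestBF(x = d, rscale = 0.1)  # narrow prior on the effect size
    ttestBF(x = d, rscale = 1)    # wide prior: the BF can shift a lot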
Also see this interesting twitter thread on this paper by Michael Betancourt:
I believe this paper was initiated towards the end of drafting the Bayesian workflow in cognitive science paper with Daniel and @ShravanVasishth when I mentioned that many of the workflow ideas could be generalized to Bayes factor implementations with a little bit of work.
— \mathfrak{Michael "Shapes Dude" Betancourt} (@betanalpha) March 17, 2021
Monday, March 15, 2021
New paper: Is reanalysis selective when regressions are consciously controlled?
A new paper by Dr. Dario Paape; download from here: https://psyarxiv.com/gnehs
Abstract
The selective reanalysis hypothesis of Frazier and Rayner (1982) states that readers direct their eyes towards critical words in the sentence when faced with garden-path structures (e.g., Since Jay always jogs a mile seems like a short distance to him). Given the mixed evidence for this proposal in the literature, we investigated the possibility that selective reanalysis is tied to conscious awareness of the garden-path effect. To this end, we adapted the well-known self-paced reading paradigm to allow for regressive as well as progressive key presses. Assuming that regressions in such a paradigm are consciously controlled, we found no evidence for selective reanalysis, but rather for occasional extensive, heterogeneous rereading of garden-path sentences. We discuss the implications of our findings for the selective reanalysis hypothesis, the role of awareness in sentence processing, as well as the usefulness of the bidirectional self-paced reading method for sentence processing research.
Tuesday, March 09, 2021
Talk at Stanford (April 20, 2021): Dependency completion in sentence processing: Some recent computational and empirical investigations
Wednesday, March 03, 2021
Talk at Hong Kong Virtual Psycholinguistics Forum (VPF, 心理语言学线上论坛)
When: 10 March 2021, 10:00 AM Berlin time
Where: Zoom
https://cuhk.zoom.us/j/779556638
https://cuhk.zoom.cn/j/779556638 (mainland China)
Title: Case and Agreement Attraction in Armenian: Experimental and Computational Investigations
Abstract: https://osf.io/3wn79/
Monday, February 22, 2021
Video recording of talk at Tuebingen: Individual differences in sentence processing
Thursday, February 11, 2021
Talk in Tuebingen: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation
Where: Universität Tübingen, Seminar für Sprachwissenschaft
How: Zoom
[This is part of the PhD work of Himanshu Yadav, and the project is led by him. Co-authors: Dario Paape, Garrett Smith, and Brian Dillon.]
Abstract
Cue-based retrieval theories of sentence processing assume that syntactic dependencies are resolved through a content-addressable search process. An important recent claim is that in certain dependency types, the retrieval cues are weighted such that one cue dominates. This cue-weighting proposal aims to explain the observed average behavior. We show that there is systematic individual-level variation in cue weighting. Using the Lewis and Vasishth cue-based retrieval model, we estimated individual-level parameters for processing speed and cue weighting using data from 13 published reading studies; hierarchical Approximate Bayesian Computation (ABC) with Gibbs sampling was used to estimate the parameters. The modeling reveals a nuanced picture of cue weighting: we find support for the idea that some participants weight cues differentially, but not all do; and only fast readers tend to have the predicted cue weighting, suggesting that reading proficiency might be associated with cue weighting. A broader achievement of the work is to demonstrate how individual differences can be investigated in computational models of sentence processing using hierarchical ABC.
Tuesday, February 02, 2021
Bayesian statistics: A tutorial taught at Experimental Methods for Language Acquisition research (EMLAR XVII 2021)
Saturday, January 16, 2021
Applications are open for the fifth summer school in statistical methods for linguistics and psychology (SMLP)
Instructors: Doug Bates, Reinhold Kliegl, Phillip Alday, Bruno Nicenboim, Daniel Schad, Anna Laurinavichyute, Paula Lissón, Audrey Bürki, Shravan Vasishth.
There will be four streams running in parallel: introductory and advanced courses on frequentist and Bayesian statistics. Details, including how to apply, are here.
Saturday, January 02, 2021
Should statistical data analysis in psychology be like defecating?
There was an interesting thread on twitter about linear mixed models (LMMs) that someone made me aware of recently. (I stopped following twitter because of its general inanity, but this thread is worth commenting on.) Below, I try to recreate the gist of the complaints from memory; the list is an amalgamation of comments from different people. I think the thread started here:
Inspired by @IrisVanRooij, I want to express some concerns that may be controversial and even outrageous to some but I feel we at least should have a discussion. I'm wondering if statistics in psycholinguistics could use a rethink. It feels like the tail now wags the dog.
— Fernanda Ferreira (@fernandaedi) December 20, 2020
To summarize the complaints:
- LMMs take too long to fit (cf. repeated measures ANOVA). This slows down student output.
- Too much time is spent on thinking about what the right analysis is.
- The interpretation of LMMs can change dramatically depending on which model you fit.
- Reviewers will always object to whatever analysis one does and demand a different one, even though the choice of analysis often makes no difference to the interpretation.
- The lme4 package exhibits all kinds of weird and unstable behavior. Should we trust its output?
- The focus has shifted away from substantive theoretical issues within psych* to statistical methods, but psych* people cannot be statisticians and can never know enough. This led to the colorful comment that doing statistics should be like taking a crap---it shouldn't become the center of your entire existence.
Indeed, a mathematical psychologist I know, someone who knows what they're doing, once told me that if you cannot answer your question with a paired t-test, you are asking the wrong question. In fact, if I go back to the datasets I published between 2002 and 2020, almost all of them could reasonably be analyzed using a series of paired t-tests.
There is a presupposition that lies behind the above complaints: that the purpose of data analysis is to find out whether an effect is significant or not. Once one understands that this is not the primary purpose of a statistical analysis, things start to make more sense. The problem is that this point is very hard to internalize: the idea of null hypothesis significance testing is so deeply entrenched in our minds that walking away from it feels impossible.
Here are some thoughts about the above objections.
1. If you want the simplicity of paired t-tests and repeated measures ANOVA, absolutely go for it. But release your data and code, and be open to others analyzing your data differently. I think it's perfectly fine to spend your entire life doing just paired t-tests and publishing the resulting t and p-values. Of course, you are still fitting linear mixed models, but heavily simplified ones (see the first sketch after this list). Sometimes it won't matter whether you fit a complicated model or a simple one, but sometimes it will. It has happened to me that a paired t-test was exactly the wrong thing to do, and I spent a lot of time trying to model the data differently. Should one care about these edge cases? I think this is a subjective decision that each one of us has to make individually. Here is another example of a simple two-condition study where a complicated model that took forever to fit gave new insight into the underlying process generating the data. The problem here comes down to the goal of a statistical analysis. If we accept the premise that statistical significance is the goal, then we should just go ahead and fit that paired t-test. If, instead, the goal is to model the generative process, then you will have to invest more time. What position you take really depends on what you want to achieve.
2. There is no one right analysis, and reviewers will always object to whatever analysis you present. The reason that reviewers propose alternative analyses has nothing to do with the inherent flexibility of statistical methods. It has to do with academics being contrarians. I notice this in my own behavior: if my student does X, I want them to do Y!=X. If they do Y, I want them to do X!=Y. I suspect that academics are a self-selected lot, and one thing they are good at is objecting to whatever someone else says or does. So, the fact that reviewers keep asking for different analyses is just the price one has to pay for dealing with academics, it's not an inherent problem with statistics per se. Notice that reviewers also object to the logic of a paper, and to the writing. We are so used to dealing with those things that we don't realize it's the same type of reaction we are seeing to the statistical analyses.
3. If you want speed and still want to fit linear mixed models, use the right tools. There are plenty of ways to fit linear mixed models fast: rstanarm, LMMs in Julia, etc. (see the second sketch after this list). E.g., Doug Bates, Phillip Alday, and Reinhold Kliegl taught a one-week course on fitting LMMs super fast in Julia: see here.
4. The interpretation of linear mixed models depends on model specification (one concrete instance, contrast coding, is sketched after this list). This surprises many people, but the surprise is due to the fact that people have a very incomplete understanding of what they are doing. If you cannot be bothered to study linear mixed modeling theory (understandable, life is short), stick to paired t-tests.
5. lme4's unstable and weird behavior is problematic, but this is not enough reason to abandon linear mixed models. One has to admit that lme4's weird messages and inconsistencies are really frustrating. Perhaps this is the price one has to pay for free software (although, having used non-free software like Word, SPSS, and Excel, I'm not so sure there is any advantage). But the fact is that LMMs give you the power to incorporate variance components in a sensible way, and lme4 does the job if you know what you are doing (some basic sanity checks are sketched after this list). Like any other instrument one thinks about using as a professional, if you can't be bothered to learn to use it, then just use some simpler method you do know how to use. E.g., I can't use fMRI; I don't have access to the equipment. I'm forced to work with simpler methods, and I have to live with that. If you want more control over your hierarchical models than lme4 provides, learn Stan. E.g., see our chapter on hierarchical models here.
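Here are the sketches promised above. First, regarding point 1: in what sense is a paired t-test a simplified linear mixed model? With exactly one observation per subject per condition, a varying-intercepts LMM returns the same t-value for the condition effect as the paired t-test. The following R code uses simulated data; all names are invented.

    # A paired t-test is an LMM with by-subject intercepts, fit to data
    # with one observation per subject and condition.
    library(lme4)
    set.seed(1)
    n <- 30
    dat <- expand.grid(subj = factor(1:n), cond = c(-0.5, 0.5))
    subj_effect <- rep(rnorm(n, 0, 50), times = 2)  # by-subject intercepts
    dat$rt <- 400 + 20 * dat$cond + subj_effect + rnorm(2 * n, 0, 30)
    # Paired t-test (rows are subject-aligned across conditions):
    with(dat, t.test(rt[cond == 0.5], rt[cond == -0.5], paired = TRUE))
    # The LMM gives the same t-value for cond in this balanced case:
    summary(lmer(rt ~ cond + (1 | subj), data = dat))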
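Regarding point 3: a Bayesian varying intercepts-and-slopes model can be fit with parallelized chains in rstanarm, using lme4's formula syntax. The data frame and variable names below (dat, rt, cond, subj, item) are placeholders for your own data.

    # Bayesian LMM in rstanarm; the four chains run in parallel.
    # dat: your data frame with columns rt, cond (coded +/-0.5), subj, item.
    library(rstanarm)
    fit <- stan_lmer(rt ~ cond + (cond | subj) + (cond | item),
                     data = dat, chains = 4, cores = 4, iter = 2000)
    summary(fit, pars = "cond")  # posterior summary of the condition effect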
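Regarding point 4: one concrete way in which model specification changes interpretation is contrast coding. The same coefficient answers a different question depending on how a factor is coded, as this sketch with simulated data shows.

    # Same data, same model formula, different meaning of the estimates.
    set.seed(1)
    dat <- data.frame(cond = factor(rep(c("a", "b"), each = 20)),
                      rt = c(rnorm(20, 400, 30), rnorm(20, 450, 30)))
    # Treatment coding (R's default): intercept = mean of condition a.
    coef(lm(rt ~ cond, data = dat))
    # Sum/effect coding: intercept = grand mean; in factorial designs
    # this choice also changes what the "main effects" estimate.
    contrasts(dat$cond) <- c(-0.5, 0.5)
    coef(lm(rt ~ cond, data = dat))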
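Regarding point 5: before trusting (or abandoning) an lme4 fit, some basic sanity checks help. Again, the data and variable names are placeholders.

    # Fit a maximal model, then check whether the fit is degenerate.
    library(lme4)
    m <- lmer(rt ~ cond + (1 + cond | subj) + (1 + cond | item),
              data = dat, control = lmerControl(optimizer = "bobyqa"))
    isSingular(m)  # TRUE signals a degenerate variance-covariance estimate
    VarCorr(m)     # inspect the estimated variance components directly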
Personally, I think that it is possible to learn enough statistics to be able to use linear mixed models competently; one doesn't need to become a statistician. The curriculum I think one needs in psych and related areas is encapsulated in our summer school on statistical methods, which we run annually at Potsdam. It's a time commitment, but it's worth it. I have seen many people go from zero knowledge to fitting sophisticated hierarchical models, so I know that people can learn all this without it taking over their entire life.
Probably the biggest problem behind all these complaints is the misunderstanding surrounding null hypothesis significance testing. Unfortunately, p-values will rarely tell you anything useful, significant or not, unless you are willing to put in serious time and effort (the very thing people want to avoid doing). So it's really not going to matter much whether you compute them using paired t-tests or linear mixed models.
Thursday, December 17, 2020
New paper: The effect of decay and lexical uncertainty on processing long-distance dependencies in reading
The effect of decay and lexical uncertainty on processing long-distance dependencies in reading
Kate Stone, Titus von der Malsburg, Shravan Vasishth
Download here: https://peerj.com/articles/10438/
Abstract:
To make sense of a sentence, a reader must keep track of dependent relationships between words, such as between a verb and its particle (e.g. turn the music down). In languages such as German, verb-particle dependencies often span long distances, with the particle only appearing at the end of the clause. This means that it may be necessary to process a large amount of intervening sentence material before the full verb of the sentence is known. To facilitate processing, previous studies have shown that readers can preactivate the lexical information of neighbouring upcoming words, but less is known about whether such preactivation can be sustained over longer distances. We asked the question, do readers preactivate lexical information about long-distance verb particles? In one self-paced reading and one eye tracking experiment, we delayed the appearance of an obligatory verb particle that varied only in the predictability of its lexical identity. We additionally manipulated the length of the delay in order to test two contrasting accounts of dependency processing: that increased distance between dependent elements may sharpen expectation of the distant word and facilitate its processing (an antilocality effect), or that it may slow processing via temporal activation decay (a locality effect). We isolated decay by delaying the particle with a neutral noun modifier containing no information about the identity of the upcoming particle, and no known sources of interference or working memory load. Under the assumption that readers would preactivate the lexical representations of plausible verb particles, we hypothesised that a smaller number of plausible particles would lead to stronger preactivation of each particle, and thus higher predictability of the target. This in turn should have made predictable target particles more resistant to the effects of decay than less predictable target particles. The eye tracking experiment provided evidence that higher predictability did facilitate reading times, but found evidence against any effect of decay or its interaction with predictability. The self-paced reading study provided evidence against any effect of predictability or temporal decay, or their interaction. In sum, we provide evidence from eye movements that readers preactivate long-distance lexical content and that adding neutral sentence information does not induce detectable decay of this activation. The findings are consistent with accounts suggesting that delaying dependency resolution may only affect processing if the intervening information either confirms expectations or adds to working memory load, and that temporal activation decay alone may not be a major predictor of processing time.
Saturday, December 12, 2020
New paper: A Principled Approach to Feature Selection in Models of Sentence Processing
A Principled Approach to Feature Selection in Models of Sentence Processing
Garrett Smith and Shravan Vasishth
Paper downloadable from: https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.12918
Abstract
Among theories of human language comprehension, cue-based memory retrieval has proven to be a useful framework for understanding when and how processing difficulty arises in the resolution of long-distance dependencies. Most previous work in this area has assumed that very general retrieval cues like [+subject] or [+singular] do the work of identifying (and sometimes misidentifying) a retrieval target in order to establish a dependency between words. However, recent work suggests that general, handpicked retrieval cues like these may not be enough to explain illusions of plausibility (Cunnings & Sturt, 2018), which can arise in sentences like The letter next to the porcelain plate shattered. Capturing such retrieval interference effects requires lexically specific features and retrieval cues, but handpicking the features is hard to do in a principled way and greatly increases modeler degrees of freedom. To remedy this, we use well-established word embedding methods for creating distributed lexical feature representations that encode information relevant for retrieval using distributed retrieval cue vectors. We show that the similarity between the feature and cue vectors (a measure of plausibility) predicts total reading times in Cunnings and Sturt’s eye-tracking data. The features can easily be plugged into existing parsing models (including cue-based retrieval and self-organized parsing), putting very different models on more equal footing and facilitating future quantitative comparisons.
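The key quantity in the model, the similarity between a retrieval-cue vector and a word's lexical feature vector, can be sketched in a few lines of R. This is a toy illustration with random vectors, not the paper's code or embeddings.

    # Cosine similarity between a retrieval cue vector and the feature
    # vectors of two candidate nouns (toy 300-dimensional vectors).
    set.seed(1)
    cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))
    cue    <- rnorm(300)  # cue vector derived from the verb (hypothetical)
    plate  <- rnorm(300)  # feature vector for "plate"
    letter <- rnorm(300)  # feature vector for "letter"
    cosine(cue, plate)    # higher similarity = more plausible retrieval
    cosine(cue, letter)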
Tuesday, November 24, 2020
How to become a professor in Germany---a unique tutorial
How to become a Professor in Germany (Online-Seminar)
For sign-up details, see here: https://www.dhvseminare.de/index.php?module=010700&event=187&catalog_id=3&category_id=15&language_id=
Live online seminar
This seminar addresses young scientists who are considering a career as a professor at a German university or are already in the middle of an application process for a professorship in Germany. It will give the participants an overview of the career paths to a professorship, covering the legal requirements, the appointment procedure and the legal status of a professor.
The seminar also addresses how to approach the search for relevant job advertisements, how to prepare the written application documents, and how to make a good impression during the further steps in the selection process. In the second part of the seminar, the participants will receive an overview of the next steps for the successful candidates. This includes the appointment negotiations with German universities, the legal framework, and the strategic preparation for those negotiations.
Speakers:
Dr. Vanessa Adam, attorney, legal counsel for higher education and employment law at the Deutscher Hochschulverband
Katharina Lemke, attorney (in-house counsel), legal counsel for higher education and employment law at the Deutscher Hochschulverband
Schedule:
09:00-10:00 Career Paths to a Professorship (Ms. Lemke)
10:00-10:15 Break
10:15-11:45 Application for a Professorship (Dr. Adam)
11:45-12:15 Break
12:15-13:15 Negotiations with the University (Legal Framework) (Ms. Lemke)
13:15-13:30 Break
13:30-14:30 Negotiations with the University (Strategy) (Dr. Adam)
Included in the price:
Seminar documents in electronic form (via download).
Thursday, November 12, 2020
New paper: A computational evaluation of two models of retrieval processes in sentence processing in aphasia
Wednesday, November 11, 2020
New paper: Modeling misretrieval and feature substitution in agreement attraction: A computational evaluation
This is an important new paper from our lab, led by Dario Paape, and with Serine Avetisyan, Sol Lago, and myself as co-authors.
One thing that this paper accomplishes is that it showcases the incredible expressive power of Stan, a probabilistic programming language developed by Andrew Gelman and colleagues at Columbia for Bayesian modeling. Stan allows us to implement relatively complex process models of sentence processing and test their performance against data. Paape et al. show how we can quantitatively evaluate the predictions of different competing models. There are plenty of papers out there that test different theories of encoding interference. What's revolutionary about this approach is that one is forced to make a commitment about one's theories; no more vague hand gestures. The limitations of what one can learn from the data and the models are always going to be an issue---one never has enough data, even when people think they do. But in our paper we are completely upfront about the limitations; and all code and data are available at https://osf.io/ykjg7/ for readers to look at, investigate, and build on.
Download the paper from here: https://psyarxiv.com/957e3/
Modeling misretrieval and feature substitution in agreement attraction: A computational evaluation
Abstract
We present a self-paced reading study investigating attraction effects on number agreement in Eastern Armenian. Both word-by-word reading times and open-ended responses to sentence-final comprehension questions were collected, allowing us to relate reading times and sentence interpretations on a trial-by-trial basis. Results indicate that readers sometimes misinterpret the number feature of the subject in agreement attraction configurations, which is in line with agreement attraction being due to memory encoding errors. Our data also show that readers sometimes misassign the thematic roles of the critical verb. While such a tendency is principally in line with agreement attraction being due to incorrect memory retrievals, the specific pattern observed in our data is not predicted by existing models. We implement four computational models of agreement attraction in a Bayesian framework, finding that our data are better accounted for by an encoding-based model of agreement attraction, rather than a retrieval-based model. A novel contribution of our computational modeling is the finding that the best predictive fit to our data comes from a model that allows number features from the verb to overwrite number features on noun phrases during encoding.