Saturday, March 11, 2023

New paper: SEAM: An Integrated Activation-Coupled Model of Sentence Processing and Eye Movements in Reading.

Michael Betancourt, a giant in the field of Bayesian statistical modeling, once indirectly pointed out to me (in a podcast interview) that one should not try to model latent cognitive processes in reading by computing summary statistics, like the mean difference between conditions, and then fitting the model to those summary statistics. But that is exactly what we do in psycholinguistics. Most models (including those from my lab) evaluate model performance on summary statistics of the data (usually a mean difference), abstracting away quite dramatically from the complex processes that produced those reading times and regressive eye movements.

What Michael wanted instead was a detailed process model of how the observed fixations and eye movement patterns arise. Obviously, such a model would be extremely complicated: one would have to specify the full details of oculomotor processes and their impact on eye movements, as well as a model of language comprehension, and then spell out how these components interact to produce eye movements at the single-trial level. Such a model quickly becomes computationally intractable if one tries to estimate its parameters from data. That is a major barrier to building this kind of model.

Interestingly, both eye movement control models and models of sentence comprehension exist. But they live in parallel universes. Psychologists have almost always focused on eye movement control, ignoring the impact of sentence comprehension processes (I once heard a talk by a psychologist who publicly called out psycholinguists, labeling them as "crazy" for studying language processing in reading :). Similarly, most psycholinguists ignore the lower-level processes unfolding in reading, and simply assume that language processing events are responsible for differences in fixation durations or in leftward eye movements (regressions). The most that psycholinguists like me are willing to do is add word frequency etc. as a co-predictor of reading time or other dependent measures when investigating reading. But in most cases even that would go too far :).

What is missing is a model that brings these two lines of work together into one integrated reading model that co-determines where we move our eyes and for how long.

Max Rabe, who is wrapping up his PhD in psychology at Potsdam in Germany, demonstrates how this could be done: he takes a fully specified model of eye movement control in reading (SWIFT) and integrates into it linguistic dependency completion processes, following the principles of the cognitive architecture ACT-R. A key achievement is that the activation of a word being read is co-determined by both oculomotor processes, as specified in SWIFT, and cue-based retrieval processes, as specified in the activation-based model of retrieval. Another key result is the demonstration of how regressive eye movements are triggered when sentence processing difficulty (here, similarity-based interference) arises during reading.

What made the model fitting possible was Bayesian parameter estimation: in an earlier (2021) Psychological Review paper (preprint here), Max Rabe shows how parameter estimation can be carried out in complex models where the likelihood function may not be easy to work out.

Download the paper from arXiv.
