This blog is a repository of cool things relating to statistical computing, simulation and stochastic modeling.
Wednesday, September 16, 2020
Zoom link for my talk: Twenty years of retrieval models
Title: Twenty years of retrieval models
Abstract:
After Newell wrote his 1973 article, "You can't play twenty questions with nature and win", several important cognitive architectures emerged for modeling human cognitive processes across a wide range of phenomena. One of these, ACT-R, has played an important role in the study of memory processes in sentence processing. In this talk, I will discuss some important lessons I have learnt over the last 20 years while trying to evaluate ACT-R based computational models of sentence comprehension. In this connection, I will present some new results from a recent set of sentence processing studies on Eastern Armenian.
Reference: Shravan Vasishth and Felix Engelmann. Sentence comprehension as a cognitive process: A computational approach. 2021. Cambridge University Press. https://vasishth.github.io/RetrievalModels/
Zoom registration link:
You are invited to a Zoom webinar.
When: Sep 25, 2020 09:30 PM Amsterdam, Berlin, Rome, Stockholm, Vienna
Topic: UMass talk Vasishth
Register in advance for this webinar: https://zoom.us/webinar/register/WN_89F7BObjSwmxnK6DRC9fuQ
After registering, you will receive a confirmation email containing information about joining the webinar.
Tuesday, September 15, 2020
Twenty years of retrieval models: A talk at UMass Linguistics (25 Sept 2020)
Twenty years of retrieval models
Shravan Vasishth (vasishth.github.io)
After Newell wrote his 1973 article, "You can't play twenty questions with nature and win", several important cognitive architectures emerged for modeling human cognitive processes across a wide range of phenomena. One of these, ACT-R, has played an important role in the study of memory processes in sentence processing. In this talk, I will discuss some important lessons I have learnt over the last 20 years while trying to evaluate ACT-R based computational models of sentence comprehension. In this connection, I will present some new results from a recent set of sentence processing studies on Eastern Armenian.
Reference: Shravan Vasishth and Felix Engelmann. Sentence comprehension as a cognitive process: A computational approach. 2021. Cambridge University Press. https://vasishth.github.io/RetrievalModels/
Monday, September 07, 2020
Registration open for two statistics-related webinars: SMLP Wed 9 Sept, and Fri 11 Sept 2020
As part of the summer school in Statistical Methods for Linguistics and Psychology, we have organized two webinars that anyone can attend. However, registration is required. Details are below.
Keynote speakers
- Wed 9 Sept, 5-6PM: Christina Bergmann (Title: The "new" science: transparent, cumulative, and collaborative)
Register for webinar: here
Abstract: Transparency, cumulative thinking, and a collaborative mindset are key ingredients for a more robust foundation for experimental studies and theorizing. Empirical sciences have long faced criticism for some of the statistical tools they use and the overall approach to experimentation; a debate that has in the last decade gained momentum in the context of the "replicability crisis." Culprits were quickly identified: False incentives led to "questionable research practices" such as HARKing and p-hacking and single, "exciting" results are over-emphasized. Many solutions are gaining importance, from open data, code, and materials - rewarded with badges - over preregistration to a shift away from focusing on p values. There are a host of options to choose from; but how can we pick the right existing and emerging tools and techniques to improve transparency, aggregate evidence, and work together? I will discuss answers fitting my own work spanning empirical (including large-scale), computational, and meta-scientific studies, with a focus on strategies to see each study for what it is: A single brushstroke of a larger picture.
- Fri 11 Sept, 5-6PM: Jeff Rouder (Title: Robust cognitive modeling)
Register for webinar: here
Abstract: In the past decade, there has been increased emphasis on the replicability and robustness of effects in psychological science. And more recently, the emphasis has been extended to cognitive process modeling of behavioral data under the rubric of "robust models." Making analyses open and replicable is fairly straightforward; more difficult is understanding what robust models are and how to specify and analyze them. Of particular concern is whether subjectivity is part of robust modeling, and if so, what can be done to guard against undue influence of subjective elements. Indeed, it seems the concept of "researchers' degrees of freedom" is writ large in modeling. I take the challenge of subjectivity in robust modeling head on. I discuss what modeling does in science, how to specify models that capture theoretical positions, how to add value in analysis, and how to understand the role of subjective specification in drawing substantive inferences. I will extend the notion of robustness to mixed designs and hierarchical models as these are common in real-world experimental settings.
Jeff Rouder's keynote address at AMLaP 2020: Qualitative vs. Quantitative Individual Differences: Implications for Cognitive Control
For various reasons, Jeff Rouder could not present his keynote address live.
Qualitative vs. Quantitative Individual Differences: Implications for Cognitive Control
Jeff Rouder (University of Missouri) rouderj@missouri.edu
Consider a task with a well-established effect such as the Stroop effect. In such tasks, there is often a canonical direction of the effect: responses to congruent items are faster than responses to incongruent ones. And with this direction, there are three qualitatively different regions of performance: (a) a canonical effect, (b) no effect, or (c) an opposite or negative effect (for Stroop, responses to incongruent stimuli are faster than responses to congruent ones). Individual differences can be qualitative in that different people may truly occupy different regions; that is, some may have canonical effects while others may have the opposite effect. Or, alternatively, they may only be quantitative in that all people are truly in one region (all people have a true canonical effect). Which of these descriptions holds has two critical implications. The first is theoretical: Those tasks that admit qualitative differences may be more complex and subject to multiple processing pathways or strategies. Those tasks that do not admit qualitative differences may be explained more universally. The second is practical: it may be very difficult to document individual differences in a task or correlate individual differences across tasks if these tasks do not admit qualitative individual differences. In this talk, I develop trial-level hierarchical models of quantitative and qualitative individual differences and apply these models to cognitive control tasks. Not only is there no evidence for qualitative individual differences, the quantitative individual differences are so small that there is little hope of localizing correlations in true performance among these tasks.
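To make the distinction between quantitative and qualitative individual differences concrete, here is a minimal sketch in Python. It is not the trial-level hierarchical model from the talk. It only simulates hypothetical true effects for two made-up populations and counts how many people fall into each of the three regions. The distributions and their parameters (a lognormal with a median of roughly 33 ms, and a normal centered on zero with a 30 ms standard deviation) are assumptions chosen purely for illustration.

```python
import random


def classify(true_effect_ms):
    """Assign a true effect (incongruent minus congruent RT, in ms) to a region."""
    if true_effect_ms > 0:
        return "canonical"
    if true_effect_ms < 0:
        return "opposite"
    return "none"


def region_counts(true_effects):
    """Count how many people fall into each of the three regions."""
    counts = {"canonical": 0, "none": 0, "opposite": 0}
    for effect in true_effects:
        counts[classify(effect)] += 1
    return counts


def quantitative_population(n_people, rng):
    # Hypothetical: lognormal effects are strictly positive, so everyone
    # has a canonical effect and people differ only in its size.
    return [rng.lognormvariate(3.5, 0.4) for _ in range(n_people)]


def qualitative_population(n_people, rng):
    # Hypothetical: normal effects centered on zero, so some people have
    # a canonical effect and others have the opposite effect.
    return [rng.gauss(0.0, 30.0) for _ in range(n_people)]


rng = random.Random(2020)  # fixed seed so the run is reproducible
n_people = 500
quantitative = region_counts(quantitative_population(n_people, rng))
qualitative = region_counts(qualitative_population(n_people, rng))
print("quantitative only:", quantitative)
print("qualitative:", qualitative)
```

In the first population every simulated person lands in the canonical region, whereas in the second the population splits across regions. With real data the true effects are not observed directly, which is why a hierarchical model fitted to trial-level data is needed to decide between the two descriptions.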
Sunday, September 06, 2020
Some thoughts on teaching statistics courses online
Someone asked me to write down how I teach online. Because of corona, I have moved all my courses at the university online, and as a consequence I had to clean up my act and get things in order.
The first thing I did was record all my lectures in advance. This was a hugely time-consuming enterprise. I bought a licence for screencast-o-matic, which is something like 15 Euros a year, and a Blue Yeti microphone (144 Euros, including express shipping). I already have a Logitech HD 1080p camera. I also bought a Windows (Dell) tablet computer through the university, so I could write freehand with an electronic pen. Somehow, writing freehand during a lecture solidifies understanding in the student's mind in a way that a mere slide presentation does not. I don't know why this is the case but I firmly believe one should show derivations in real time.
The way I do my recordings is that I start screencast-o-matic (the new Mac OS X makes this incredibly hard: you have to repeatedly open the settings and give the software permission to record--thanks, Apple). Then, I record the lecture in one shot, no editing at all. If I make a mistake during the lecture, I just live with it (and sometimes the mistakes are horrendous). Sometimes my cat Molly video-bombs my lectures; I just let it all happen. All this makes my video recordings less than professional looking, but I think it's good enough. Nobody has complained about this so far. I use Google Chrome's Remote Desktop feature to link my Macbook Pro with the Windows machine, and switch between RStudio on the Mac and the Windows tablet for writing. On Windows, I use the infinite writing space provided by OneNote. For writing on pdfs, I use the PDF reader by Xodo.
Here are my videos from my frequentist course:
https://vasishth.github.io/IntroductionStatistics/
The way students are expected to work is to watch the videos, and then do exercises that I give out. My lecture notes provide a written record of the material, plus the exercises:
https://vasishth.github.io/Freq_CogSci/
The solutions are given out after the submission deadline. In my courses, I stipulate that you can only take the class if you commit to doing at least 80% of the homework. I force people to quit the class if they don't do the HW; many people try to audit the classes without doing the HW. In my experience, they don't get anything out of the class, so I don't allow audits without doing the HW. This is a very effective strategy, because it forces the students to engage. One rule I have is that if you submit the HW and make an honest attempt to solve the problems you will get 100% on the HW no matter what. This decouples learning from grades and reduces student stress considerably, and allows them to actually learn the material. Some students complain that the HW is hard; but it's supposed to make them think, and there is no shame in not being able to do it. Some students are unable to adjust to the fact that not everything will be easy to do.
Two other components of the class are (a) weekly meetings over zoom, where students can ask me anything, and (b) an online discussion forum where people can post questions. Students used these options really intelligently, and although I had to spend a lot of time answering questions on the forum, I think on balance it was worth the effort. I think the students got a lot out of my courses, judging from the teaching evaluations (here and here).
The main takeaway for me was that the online component of these stats courses that I teach is crucial for student learning, and in future editions of my courses, there will always be an online component. One day we will have face to face classes, and I think those are very valuable for establishing human contact. But the online component really adds value, especially the pre-recorded lectures and the discussion forum.
Some thoughts on the completely online Architectures and Mechanisms of Language Processing conference
Some time ago, I wrote a blog post on the carbon cost of conferences:
https://vasishth-statistics.blogspot.com/2019/10/estimating-carbon-cost-of.html
The background for this post was that at the time I was in the process of organizing the AMLaP 2020 conference, and was beginning to wonder whether these international conferences are even sustainable given the climate crisis unfolding. In discussions with others, one question someone raised was: what is the actual carbon cost of conferences? This made me curious to find out what the rough carbon cost would be, hence the above-linked post. At the time, it didn't even occur to me that a viable alternative could be a completely online conference.
But then corona happened, and Brian Dillon moved CUNY completely online. I didn't attend that conference because I was going through a medical crisis at the time. But around that time I realized that I would have to move AMLaP online as well. By then my medical situation was going from bad to worse, so I handed over control to Titus von der Malsburg. Titus masterfully navigated all the obstacles to get AMLaP up and running, helped by a large team consisting of my lab members and several other department members. I was pretty amazed to see how superbly organized and well-coordinated this team was.
Having attended this and a satellite conference, SAFAL, online, I have to admit that an online conference just doesn't have the same look and feel as a real conference. It's just something different to sit down with colleagues from all over the world and chat with them over a beer. An online conversation over zoom just doesn't cut it. However, if we want to take the carbon cost issue seriously, I feel that online conferences are here to stay. At the very least, it should be possible in the future to allow for hybrid conferences; people should be able to participate (and I mean, ask questions after talks and meet people) from a remote place. I got several emails and other types of messages from people telling me they could only participate because AMLaP was online; some were pregnant and unable to travel, some (like me) had too serious a medical condition to allow them to travel, and some just don't have the money to go to a conference. Interestingly, psycholinguists from India were well-represented at AMLaP, I think for the first time (I didn't have any direct hand in making this happen; the Indians are an emerging group of highly competent and sophisticated psycholinguists). So I think the online format makes the conference more inclusive as well.
One further thing many people noticed is that younger people were asking more questions after talks than in physical conferences. In physical psycholinguistic conferences, sometimes senior people dominate in the discussions. This isn't even possible to do in an online conference because the moderators have total control over which question is asked and by whom. But it seemed like it was mostly younger people who felt comfortable asking questions online; I saw very few questions from senior people. This is good news, because the younger people should be out there engaging with the field.
This year, we used gather.town to socialize. Take a look at it. Initially I was skeptical this would allow for much socializing, but it worked surprisingly well. I noticed that some of the young people were hesitating to approach older ones, so I boldly went up to them and talked with them. It worked well; I met several young MSc and early PhD students. I also met up with colleagues I hadn't seen for over a decade, I think (Tessa Warren, for example). It was nothing like face-to-face meetings but it was still fun and better than nothing. Pro tip: you can make your avatar on gather.town dance by pressing the z button. Cool. Brian Dillon, Dustin Chacon, and I had a brief dance party (no music though). You get little hearts getting bigger and bigger over your avatar's head if you dance. Neat.
So overall, despite the huge disadvantage that one can't meet people in person, there is enough gain from running conferences online that all future conferences should have at least a live streaming component. The talks should be on twitch or some other platform, and they should be recorded and stored online for everyone to view. This will create a more inclusive environment and can only be good for the field. As a side effect, it is also a positive thing we can do towards reducing the effects of the climate crisis. Every little bit counts.
You can watch the conference recording on twitch. A more permanent recording will appear on the amlap2020.org home page eventually.