Cognitive Science of Language lecture series: Dr. Marco Marelli (Jan. 25, 2021)

Who: Marco Marelli (University of Milano-Bicocca, Milano, Italy)

What: Compositional effects in the processing of compound words: A computational perspective grounded in linguistic and visual experience

When: Monday January 25, 2021; 2:30-4:20 pm EST

Where: Zoom


McMaster’s Department of Linguistics and Languages invites you to the next talk in the Cognitive Science of Language lecture series. The lecture will be delivered online by Dr. Marco Marelli. Dr. Marelli is an associate professor of General Psychology at the University of Milano-Bicocca, Milano, Italy. His work focuses on the psychology of language, and in particular on the impact of semantics on word processing and the interface between language and the conceptual system. His more recent research projects combine methods from experimental psychology and computational modelling and are dedicated to compositionality (at the level of both phrases and morphologically complex words) and the interplay between linguistic, emotional and perceptual experience in conceptual processes. He is an associate editor of Behavior Research Methods and a consulting editor of Morphology. 

The talk is free but participants must register. Registration link can be found here:

Please make sure to register in advance. For logistical reasons, registrations for this event will only be reviewed until 2 pm on the event date.


Since the seminal LSA proposal (Landauer & Dumais, 1997), distributional semantics has provided efficient data-driven models of the human semantic system, representing word meaning through vectors recording lexical co-occurrences in large text corpora. However, these approaches generate static descriptions of the semantic system, falling short of capturing the highly dynamic interactions occurring at the meaning level during language processing.

In the present work, I discuss the CAOSS model (Compounding as Abstract Operations in Semantic Space), a first step toward capturing such dynamics that starts from distributional semantics to model the meaning of compound words (Marelli et al., 2017).

In CAOSS, word meanings are represented as vectors encoding lexical co-occurrences in a reference corpus (e.g., the meaning of “snow” will be based on how often “snow” appears with other words), according to the tenets of distributional semantics. A compositional procedure is induced as a weighted sum: given two vectors (constituent words) u and v, their composed representation (the compound) can be computed as c=M*u+H*v, where M and H are weight matrices estimated from corpus examples. The matrices are trained using least squares regression, with the vectors of the constituents as independent words (“car” and “wash”, “rail” and “way”) as inputs and the vectors of example compounds (“carwash”, “railway”) as outputs, so that the squared error between M*u+H*v and c is minimized. In other words, the matrices are defined in order to recreate the compound examples as accurately as possible. Once the two weight matrices are estimated, they can be applied to any word pair in order to obtain meaning representations for untrained word combinations (e.g., “snow building”).
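The estimation step described above can be illustrated with a small sketch. This is not the authors' implementation: the function names `fit_composition` and `compose` are hypothetical, and random toy vectors stand in for corpus-derived word vectors. It only shows how two weight matrices can be recovered by least squares from (constituent, constituent, compound) examples and then applied to an untrained pair.

```python
import numpy as np

def fit_composition(U, V, C):
    """Estimate weight matrices M and H by ordinary least squares.

    U, V: (n_examples, dim) arrays of first- and second-constituent vectors.
    C:    (n_examples, dim) array of observed compound vectors.
    Returns M, H such that M @ u + H @ v approximates c for each training row.
    """
    dim = U.shape[1]
    X = np.hstack([U, V])                      # each row is [u, v]
    W, *_ = np.linalg.lstsq(X, C, rcond=None)  # solves X @ W ~= C
    return W[:dim].T, W[dim:].T                # W stacks M.T on top of H.T

def compose(M, H, u, v):
    """Composed representation c = M*u + H*v for one constituent pair."""
    return M @ u + H @ v

# Toy data (assumption): random vectors stand in for corpus-based ones.
rng = np.random.default_rng(0)
dim, n_examples = 5, 200
M_true = rng.normal(size=(dim, dim))
H_true = rng.normal(size=(dim, dim))
U = rng.normal(size=(n_examples, dim))
V = rng.normal(size=(n_examples, dim))
C = U @ M_true.T + V @ H_true.T  # noiseless example "compound" vectors

M, H = fit_composition(U, V, C)

# Apply the trained matrices to a word pair that was not in the training set.
u_new, v_new = rng.normal(size=dim), rng.normal(size=dim)
c_new = compose(M, H, u_new, v_new)
print(c_new.shape)
```

With real data, U, V and C would hold the vectors of attested compounds and their constituents, and the fit would be approximate rather than exact.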

In a series of behavioral experiments, model predictions were tested against psycholinguistic data. CAOSS is shown to mirror evidence related to the processing of novel compounds (Marelli et al., 2017; Günther & Marelli, 2020), and in particular the impact of relational information (Gagné, 2001; Gagné & Shoben, 2007) as well as the “morpheme interference effect” (Crepaldi et al., 2010). Moreover, CAOSS also provides a central contribution to the understanding of semantic transparency in familiar compounds: CAOSS estimates are shown to best characterize the transparency impact in word processing (Günther & Marelli, 2019). Finally, I discuss how CAOSS is not to be considered a “disembodied model”, since one can easily ground it in perception by feeding it images together with text data (Günther et al., 2020). 

The model simulations indicate that compositionality-related phenomena are reflected in language statistics. Human speakers are able to learn these aspects from language experience and automatically apply them to the processing of any word combination. The present model is flexible enough to emulate this procedure, predicting sensible relational similarities for novel compounds and correctly capturing the contribution to semantic transparency provided by compositional operations. The model is also shown to generalize to other kinds of data, being able to capture the contribution of perceptual experience in the internal dynamics of compound-word processing. Such evidence directly links linguistic composition to conceptual combination, speaking for the possible role of general-level learning procedures at the foundations of both phenomena.

CCPTalks: Individual differences in the production and perception of prosodic boundaries in American English

CCPTalks is a new series presented by the Centre for Comparative Psycholinguistics at the University of Alberta. Join us for their first presentation, featuring Dr. Jiseung Kim (Alberta), on January 22, 2021!

Title: Individual differences in the production and perception of prosodic boundaries in American English

Date: Friday, January 22, 2021

Time: 9:00am MST (GMT-7)

Location: Zoom (contact for link)

I present the findings of my dissertation, which investigated the hypothesis that individuals vary in their production and perception of prosodic boundaries, and that the properties they use to signal prosodic contrasts are closely related to the properties used to perceive those contrasts. A group of native speakers of American English participated in an acoustic study and subsequently an eye-tracking study that examined production and perception of three acoustic properties related to the Intonational Phrase (IP) boundary: pause, pitch reset, and phrase-final lengthening. The results showed individual differences to a substantial degree, and offered limited evidence of a production-perception relation: a trend was observed in which individuals with longer pause durations.

July 28 Open Office Hour: Incorporating Neuropsychological Tests into Experimental Research

Next week, join us as we welcome Dr. Simritpal Malhi (Glenrose Rehabilitation Hospital) for a discussion about neuropsychological tests and their relationship with experimental research. Topics will include when it is appropriate to include tests and what kinds can be used, the ethics of including neuropsychological tests, and future research involving the management of big data (i.e., neuropsychological norms).

For more information, view the complete event listing on our Events page:

Presentation slides will be made available after the event.

Open Office Hours – May 12, 2020

Thank you to everyone who has participated in our Open Office Hour series!

Our next Words in the World Open Office Hour will take place on Tuesday, May 12, at 12pm (Eastern Time, GMT -4). Jordan Gallant (Brock University) will share his expertise using Gitlab and PsychoPy3 in collaborative remote research. Here is his summary:

“The COVID-19 pandemic has forced our research activities out of the lab and into online virtual environments. Not only has this changed the research methods available to us, but it has also fundamentally changed the way that we work together. However, this office hour is here to say that this change need not be for the worse. Collaborative remote work can offer distinct advantages when paired with the right technology to support it. In this Open Office Hour I will discuss the merits of using online project development platforms such as Gitlab for collaborative research projects. Specifically, I will look at how PsychoPy3 and Gitlab can support the collaborative construction and administration of online experiments. In the process, I hope to instill a sense that, rather than being a quick fix for temporary problems, this is a paradigm worth carrying into the post-COVID future.”

Accompanying video tutorials: YouTube

Open Office Hours are delivered using Zoom. Passwords are sent out via email in advance of Open Office Hours. If you would like to join the Open Office Hour mailing list, please sign up here: Open Office Hours Sign-Up Form.

Follow-up: No Lab, No Problem

On March 31, 2020, Jordan Gallant offered an Open Office Hour on how to use PsychoPy3 to conduct experiments online. If you were unable to attend that meeting, we now have a full recording available! He has also made additional video supplements on how to use auditory stimuli and how to code a self-paced reading task.

Jordan Gallant’s introductory video can be found here:

A supplementary video on the use of PsychoPy3 with audio files can be found here:

A supplementary video on how to code a self-paced reading task can be found here:

A methodological paper with an introduction to the use of PsychoPy3 in psycholinguistics (Gallant & Libben, 2019) can be found here:

COVID-19 & Open Office Hours (online)

Due to the uncertainty surrounding the Coronavirus pandemic, many across the Words in the World network have suspended face-to-face operations including experimentation in traditional laboratory environments. We are therefore making a concerted effort to migrate as many of our research projects to an online format as possible. By moving toward this goal, we are working not only to protect the health and safety of our colleagues and research participants, but also to move forward with the majority of our research endeavours.

With these purposes in mind, we’d like to take this opportunity to introduce a new Words in the World feature: Open Office Hours. The purpose of the Open Office Hour is to provide an accessible online version of the traditional university office hour, in which our research partners hold a brief informal discussion on a topic within their expertise and take questions regarding that topic. Our first Open Office Hours are listed below and focus on online experimentation in psycholinguistic research.


A follow-up office hour with Dr. Kuperman is scheduled for Friday, March 27, from 1 – 2pm Eastern (GMT -4). See the announcement here:

“How to collect psycholinguistic data from home: Introduction to crowdsourcing tools”

Host: Victor Kuperman

Date: Tuesday, March 24, 2020

Time: 1 p.m. – 2 p.m. EDT (GMT -4)

The ability to collect experimental data outside of the lab is of great importance for reaching populations outside of university convenience subject pools. This importance is even greater when lab testing is undesirable. This first session of “open office hours” will introduce rich possibilities for data collection using crowdsourcing tools like Amazon’s Mechanical Turk. We will cover several basic types of experiments (surveys, collection of ratings, linguistic judgments, and written responses), and discuss practicalities of online testing. Several small experiments will be created and results collected and discussed.

No prior knowledge is expected. The session is designed as a 20-30 minute informal presentation, followed by a Q&A. Ideas for experiments are very welcome.

Connect via Zoom:

See the event listing for alternative ways to connect.

“Running chronometric experiments online using PsychoPy3”

Host: Jordan Gallant

Date: Tuesday, March 31, 2020

Time: 1 p.m. – 2 p.m. EDT (GMT -4)

Connect via Zoom:

Online experiments offer a range of possibilities and benefits that have yet to be fully explored. This office hour will introduce PsychoPy3, a new experiment development package that uses JavaScript to create experiments that can be run in web browsers. In the first half of the office hour, I will demonstrate how a simple lexical decision experiment can be 1) created, 2) hosted online, and 3) run using participants recruited via Mechanical Turk. The second half will be a Q&A where the limitations and possibilities of PsychoPy3 and online chronometric experimentation in general can be discussed.

Upcoming Talk: There is a big gap in our understanding of reading fluency and the study of serial naming can help address it

On Monday July 22, 2019, Dr. Athanassios Protopapas (University of Oslo) will be giving a talk on word reading fluency at McMaster University in Hamilton, Ontario. This invited talk is hosted by The Reading Lab and the Centre for Advanced Research in Experimental and Applied Linguistics at McMaster. See the abstract below for more information.

All are welcome to attend!


Date: July 22, 2019

Time: 12 – 2pm

Location: LRW 4018 (through ARiEAL entrance at LRW 4020), McMaster University


Word list reading fluency is theoretically expected to depend mainly on single word reading speed. Yet the correlation between the two diminishes with increasing fluency, while fluency remains strongly correlated with serial digit naming. This suggests that multi-element sequence processing is an important component of fluency. When multiple stimuli to be named are presented simultaneously, the total naming time is shorter than when they are presented individually (termed “serial advantage”). Presumably, this occurs because more than one stimulus can be processed at the same time, for example by one stimulus being mapped to its phonological representation while the previous one is articulated and the next one is visually perceived. This temporal overlap, termed “cascaded” processing, amounts to the parallel processing of multiple sequential stimuli along a serial pipeline.

I will present data from serial and discrete naming and reading tasks in different orthographies supporting the hypotheses that (a) these tasks pattern along distinct dimensions of performance concerning sequential vs. single-entity processing; (b) stimuli are amenable to cascaded processing to the extent they are individually processed as unmediated single chunks; and (c) the serial advantage is limited by the slowest processing component. The first hypothesis suggests that a distinct skill domain, beyond single word processing, underlies efficient processing of word sequences (i.e., fluency). The second hypothesis distinguishes between alphanumeric and nonalphanumeric naming and sets the context for the study of word reading fluency development. The third hypothesis suggests that as long as articulation is faster than the preceding cognitive steps, the serial advantage is largely determined by the duration of the spoken words, but articulation goes on to become the rate-limiting factor as word recognition speeds up during reading development.
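The arithmetic behind hypothesis (c) can be sketched with an idealized pipeline. This is an illustration only, not the speaker's model: the stage durations in milliseconds are invented, every item is assumed to take the same time at each stage, and the function names are hypothetical. In the discrete case each item finishes all stages before the next begins; in the cascaded case stages overlap across items, so once the first item is through, one item completes per slowest-stage interval.

```python
def discrete_total(stage_ms, n_items):
    """Total time when each item passes through every stage on its own."""
    return n_items * sum(stage_ms)

def cascaded_total(stage_ms, n_items):
    """Total time for an idealized pipeline with overlapping stages.

    The first item takes sum(stage_ms); every further item adds only the
    duration of the slowest stage, which is the rate-limiting component.
    """
    return sum(stage_ms) + (n_items - 1) * max(stage_ms)

def serial_advantage(stage_ms, n_items):
    """Time saved by cascaded relative to discrete processing."""
    return discrete_total(stage_ms, n_items) - cascaded_total(stage_ms, n_items)

# Hypothetical durations: visual perception, phonological mapping, articulation.
slow_recognition = [100, 250, 200]  # mapping is the slowest stage
fast_recognition = [100, 120, 200]  # articulation is now the slowest stage

for label, stages in [("slow", slow_recognition), ("fast", fast_recognition)]:
    print(label,
          discrete_total(stages, 5),
          cascaded_total(stages, 5),
          serial_advantage(stages, 5))
# slow 2750 1550 1200
# fast 2100 1220 880
```

In the first case the time saved per additional item equals the durations of the faster stages, including articulation; in the second, articulation itself sets the pace, matching the shift described for developing readers.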

Serial word reading aligns increasingly with the serial naming factor at higher grades, suggesting that word reading fluency is gradually dominated by skill in simultaneously processing multiple successive items (“cascading”), beyond automatization of individual words. This explains why discrete word reading is decreasingly correlated with word reading fluency as reading skill increases and why serial digit naming (i.e., RAN) is such a strong concurrent and longitudinal predictor of word reading fluency.