Discussion
Based on studies primarily with children, it has been hypothesized that
autistic individuals show attenuated MSI, particularly for speech
stimuli. Our results provide compelling evidence that the presumption
that this difference extends into adulthood is unfounded. After controlling
for age, we found no significant difference between autistic and
non-autistic individuals in susceptibility to the McGurk/MacDonald
illusion. Because this ran against the grain of the findings of the
largest meta-analysis on the topic (Zhang et al., 2019), we confirmed
that group was not a significant factor in our results using both
age-matched and Bayesian follow-up analyses (see Supplementary Materials).
While Zhang et al. (2019) concluded that the difference between groups
actually increases in magnitude with age, their meta-analysis included only
one study with adults (Saalasti et al., 2012), which its authors did not
take as evidence for a difference in the strength of the
McGurk/MacDonald effect. Moreover, some findings suggest that
differences between autistic and non-autistic individuals in MSI may be
resolved during adolescence (Foxe et al., 2015; Taylor, Isaac, & Milne,
2010). Our findings with adults are consistent with the trajectory of
improvement these results imply.
Instead of a difference according to group, we found evidence that the
degree of MSI increases with age (with the average rate of the illusion
nearly tripling from the youngest to oldest participants) for both
autistic and non-autistic individuals. While an increase in the rate of
the McGurk/MacDonald effect between younger and older adults has been
detected in non-autistic participants (Setti et al., 2013), this is the
first study to examine this age effect in both autistic and non-autistic
samples. The near-perfect overlap of the correlations between age and MSI
across groups serves as compelling evidence that while autistic children may
not experience the development of visual influence on speech perception
as early as their non-autistic peers, autistic adults do show comparable
visual influence into their older years. These findings of similar age
effects across adulthood resonate with recent longitudinal research
suggesting similar cognitive aging profiles between autistic and
non-autistic individuals (Torenvliet et al., 2023).
The reason for such a strong effect of age on the rate of the illusion
could be reduced reliability of the auditory signal resulting from the
progressive hearing loss common in aging, which often goes uncorrected
(Walling & Dickson, 2012). The comparative reliability of auditory and
visual inputs has been shown to affect the rate at which the
McGurk/MacDonald effect occurs, and their respective influence shifts
during development (Hirst et al., 2018). Additionally, MSI may also
serve a compensatory role in speech perception as hearing declines. Both
notions are supported by research showing an increase in MSI and visual
dominance later in life (Diaconescu et al., 2013), as well as enhanced
susceptibility to the McGurk/MacDonald effect associated with
age-related hearing loss (Rosemann & Thiel, 2018; Stropahl & Debener,
2017). Cortical reorganization leading to increased functional
connectivity between auditory and visual regions may facilitate these
effects in those with age-related hearing loss (Puschmann & Thiel,
2017). It is encouraging that MSI appears to serve this compensatory
role as effectively in autistic adults as in non-autistic ones.
Another potential factor in differences between our findings and others
is the threat of an attentional confound. Autistic children have been
shown to demonstrate an atypical preference for non-social stimuli,
viewing faces less frequently than their non-autistic peers (Gale et
al., 2019; Vacas et al., 2021). Additionally, in two McGurk/MacDonald
studies using eye-tracking, it was found that autistic children attended
less to the pertinent areas of the face than non-autistic ones (Feng et
al., 2021; J. R. Irwin et al., 2011), partially explaining differences
in susceptibility to the illusion. Accordingly, studies that do not
control for visual attention may overstate differences in MSI. A merit
of our design is that while we do not directly measure eye movements,
our simultaneity judgment task requires participants to attend to the
mouth during trials. The performance of participants on this task,
resembling a typical Gaussian distribution peaking near simultaneity,
suggests that they were indeed attending to the faces. While the
addition of eye-tracking would help to confirm this, in online
experiments such as ours, where eye-tracking is not possible for privacy
reasons, the addition of a simultaneity judgment task provides an
excellent means of reducing the risk of attentional differences being
conflated with differences in MSI.
Beyond our findings with regard to the McGurk/MacDonald illusion, our
results have illuminated much about the nuances of temporal processing
and how they compare between autistic and non-autistic individuals. In
many ways, our results remained consistent with standard findings in
fundamental temporal processing research. Synchrony distributions
followed a typical Gaussian shape, peaking with a slight visual lead, as
is consistently found with audiovisual stimuli (Dixon & Spitz, 1980;
Slutsky & Recanzone, 2001; Zampini et al., 2005). Incongruent stimuli
were perceived as synchronous significantly less frequently than
congruent ones, as was shown in other studies measuring simultaneity
judgments for McGurk/MacDonald stimuli (Jertberg et al., 2023; Van
Wassenhove et al., 2007). Rapid temporal recalibration was detected,
with the PSS shifting according to the previous modality order (Van der
Burg et al., 2013, 2015, 2018). However, our results also captured novel
differences between groups.
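To make the quantities discussed above concrete, the following is a minimal sketch of how a PSS and a width measure can be read off a Gaussian fitted to simultaneity judgments. It is not the fitting procedure used in this study. The SOA values, the response proportions (generated from a known curve), the convention that positive SOAs denote visual lead, and the use of the fitted standard deviation and full width at half maximum as indices of the WPS are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

def gaussian(soa, amplitude, pss, sigma):
    """Predicted proportion of 'synchronous' responses at a given SOA (ms)."""
    return amplitude * math.exp(-((soa - pss) ** 2) / (2 * sigma ** 2))

def fit_synchrony_curve(soas, proportions):
    """Coarse grid-search least-squares fit; returns (amplitude, pss, sigma)."""
    best = None
    for amp_step in range(50, 101, 5):        # amplitude from 0.50 to 1.00
        amplitude = amp_step / 100
        for pss in range(-200, 201, 5):       # candidate PSS values in ms
            for sigma in range(50, 401, 5):   # candidate widths in ms
                error = sum(
                    (gaussian(s, amplitude, pss, sigma) - p) ** 2
                    for s, p in zip(soas, proportions)
                )
                if best is None or error < best[0]:
                    best = (error, amplitude, pss, sigma)
    return best[1:]

# Synthetic illustration only (NOT the study's data): proportions generated
# from a curve peaking at a 40 ms visual lead (positive SOA = visual lead).
soas = [-300, -200, -100, 0, 100, 200, 300]
proportions = [gaussian(s, 0.9, 40, 150) for s in soas]

amplitude, pss, sigma = fit_synchrony_curve(soas, proportions)
# One possible width index (an assumption): full width at half maximum.
fwhm = 2 * math.sqrt(2 * math.log(2)) * sigma
```

Under these assumptions, the fitted mean plays the role of the PSS and the fitted width serves as an index of temporal acuity. A real analysis would normally use a proper optimizer and per-participant fits rather than a coarse grid.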
Firstly, with regard to synchrony distributions, we found differences in
the magnitude of the effect of congruency according to group. Both
groups were less likely to perceive incongruent stimuli as synchronized,
but this effect was particularly pronounced in the non-autistic group.
This was true even at 0 ms, where the rate at which participants
recognized the physical simultaneity of the stimuli fell from 91.8% for
congruent stimuli to 46.2% for incongruent stimuli in the non-autistic
group, and from 89.7% to 50.8% in the autistic group. This suggests a
profound interference of phonetic incongruence on
basic temporal processing, one that van Wassenhove et al. (2007)
attributed to a weaker correlation between the facial kinematics (what
is seen) and acoustic dynamic envelope (what is heard). But why does the
magnitude of this difference vary between autistic and non-autistic
individuals, when the disparity between these factors remains the same?
One interpretation might be that the autistic participants simply have a
lower temporal resolution than the non-autistic ones, and therefore less
room for interference in temporal processing. However, we did not
replicate findings that the WPS, the common measure of temporal acuity,
differs between groups, so this interpretation is not supported by our
results. Alternatively, these differences could be due to impoverished
lip-reading ability, which has been found to account for some or all of
the disparity in susceptibility to the McGurk/MacDonald effect in
autistic children (Iarocci et al., 2010; Smith & Bennetto, 2007).
Impoverished lip-reading ability may be viewed as a weaker association
between a viseme and its associated phoneme. This may translate into a
diminished incongruence effect, as the autistic participants would be
less sensitive to the difference driving it. That being said, were this
the case, one might also expect an attenuated visual influence of the
visemes, and hence a lower rate of the McGurk/MacDonald effect, in the
autistic participants. As such, further research into the lip-reading
abilities of autistic adults and their potential influence on temporal
processing of audiovisual speech stimuli is necessary.
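As a worked illustration of the size of the congruency effect at 0 ms, the short sketch below computes the drop in perceived synchrony for each group from the percentages reported above. It reads the first figure in each pair as the congruent rate and the second as the incongruent rate, which is an assumption about the intended order. The variable and function names are hypothetical.

```python
# Percent of 0 ms trials judged synchronous, as reported in the text.
# Assumption: the first value in each reported pair is the congruent rate.
reported = {
    "non-autistic": {"congruent": 91.8, "incongruent": 46.2},
    "autistic": {"congruent": 89.7, "incongruent": 50.8},
}

def congruency_drop(group):
    """Drop in perceived synchrony (percentage points) for one group."""
    values = reported[group]
    return round(values["congruent"] - values["incongruent"], 1)

drops = {group: congruency_drop(group) for group in reported}
# Difference between the two groups' drops, in percentage points.
group_gap = round(drops["non-autistic"] - drops["autistic"], 1)
```

On this reading, the drop is 45.6 percentage points in the non-autistic group and 38.9 in the autistic group, a gap of 6.7 percentage points. This is simple arithmetic on the reported values and says nothing about statistical significance.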
Delving deeper into the temporal dynamics at play, we did not detect the
differences between groups in the WPS or rapid temporal recalibration
formerly reported. With regard to the WPS, the largest meta-analysis to
date examining potential differences between autistic and non-autistic
participants found a consistent enlargement of its width among those
with autism (Zhou et al., 2018), suggesting blunted temporal acuity.
However, there was again a limited number of studies germane to the
topic (with only four studies investigating the audiovisual WPS), most
had small samples (ranging from 32 to 64 participants), and all of them
focused on children. More recent research involving adults paints a
different picture. Two studies (Weiland et al., 2022; Zhou et al., 2022)
with larger samples of adults found no difference between autistic and
non-autistic participants in the width of the WPS, suggesting that
autistic individuals may also catch up in the honing of temporal
processing by the time they reach adulthood. A very similar pattern
emerges with rapid temporal recalibration, where smaller studies with
younger participants found differences between autistic and non-autistic
individuals (J. Noel et al., 2017; Turi et al., 2016), but the largest
adult study did not (Weiland et al., 2022). However, the research here
is more limited, and Weiland et al. (2022) also recruited from the NAR,
so their sample may partially overlap with ours. Accordingly, further
examination of potential differences between autistic and non-autistic
individuals in the WPS and rapid temporal recalibration (and the
possibility of their resolution) is warranted.
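Rapid temporal recalibration, as discussed above, refers to a shift in the PSS depending on the modality order of the preceding trial. The sketch below shows one simple way such a shift could be quantified. It is not the analysis used in this study. The trial sequence is made up, positive SOAs are assumed to denote visual lead, and the PSS is approximated crudely as the mean SOA of trials judged synchronous, which is only reasonable when SOAs are balanced within each subset. A real analysis would more likely fit a curve to each subset. All names are hypothetical.

```python
def crude_pss(trials):
    """Crude PSS estimate: mean SOA (ms) of trials judged synchronous."""
    sync_soas = [soa for soa, judged_sync in trials if judged_sync]
    if not sync_soas:
        raise ValueError("no trials were judged synchronous")
    return sum(sync_soas) / len(sync_soas)

def recalibration_effect(trials):
    """PSS after visual-lead trials minus PSS after auditory-lead trials."""
    after_visual_lead, after_audio_lead = [], []
    for previous, current in zip(trials, trials[1:]):
        previous_soa = previous[0]
        if previous_soa > 0:        # previous trial: vision led (assumed sign)
            after_visual_lead.append(current)
        elif previous_soa < 0:      # previous trial: audition led
            after_audio_lead.append(current)
        # Trials preceded by a physically simultaneous trial are skipped.
    return crude_pss(after_visual_lead) - crude_pss(after_audio_lead)

# Made-up trial sequence for illustration: (SOA in ms, judged synchronous?).
toy_trials = [
    (100, True), (100, True), (-100, False), (0, True),
    (-100, True), (-100, True), (100, False), (0, True),
]
effect_ms = recalibration_effect(toy_trials)
```

A positive value under this convention would indicate that the PSS shifts toward the modality order of the preceding trial.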
We did, however, detect a difference between groups in the overall mean
PSS value. Autistic participants showed a greater mean PSS, irrespective
of stimulus type, suggesting a heightened sensory preference for visual
lead. This finding may also explain the difference in the magnitude of
the congruence effect between groups, at least in part, given that it
was largest with a slight visual lead. The two most obvious potential
explanations for the PSS difference would be either faster processing of
auditory information or slower processing of visual information in
autism. Unfortunately, there is little research into this topic, at
least with regard to speech stimuli. Furthermore, as always, the studies
that do exist tend to focus on children. We could not find studies
investigating differences in auditory processing speed between autistic
and non-autistic individuals. However, those that have looked into
visual processing speed show faster visual processing in autism, if
anything (Foss-Feig et al., 2013; Samson et al., 2011). Granted, this
was in more basic sensory processing, such as recognition of motion
(Foss-Feig et al., 2013), but it certainly does not provide evidence for
a unisensory processing speed explanation. More research should be done
into processing speed for the different sensory modalities in autism,
with both simple and complex stimuli.
An alternative explanation falls more in line with our discussion of
differences in representation of visual speech stimuli. If autistic
individuals have differently developed representations of verbal lip
movements (as suggested by their weaker lip-reading abilities) and
weaker associations between them and the sounds of language, as
suggested by van Wassenhove et al. (2007), it stands to reason that it
might take them more time to interpret lip movements and integrate them
with their corresponding vocal sounds. This might translate into a
greater sensory preference for visual lead when processing speech
stimuli. However, given the dearth of evidence provided by the
literature on the alternative sensory processing speed hypotheses, this
interpretation is highly speculative, and further research should
explore the factors contributing to differences in PSS between autistic
and non-autistic individuals. An excellent starting point would be to
see whether this difference is unique to speech stimuli (supporting the
notion that it is driven by differences in representation of verbal
mouth movements) or whether it applies more broadly to simple
audiovisual stimuli (suggesting a basic sensory processing speed
explanation).
While the large size of our sample and sound experimental design are
strengths of our study, it is, of course, not without its limitations.
Firstly, this experiment was part of a larger online experimental
battery, which placed constraints on the number of trials participants
could complete. A larger number of trials and range of SOAs would have
allowed more sophisticated analyses of temporal processing and higher
resolution representation of participants’ WPS and recalibration
effects. This also would have allowed us to investigate potential
effects of congruence on recalibration and, conversely, of recalibration
on the likelihood for participants to perceive the illusion. A related
shortcoming of this study is that the time limitation meant we were
unable to include unisensory trial types. These allow a researcher to
quantify participants’ ability to identify visemes and phonemes on their
own, which is important as autistic children have shown differences in
their lip-reading abilities when compared to non-autistic ones (Foxe et
al., 2015; Iarocci et al., 2010; J. R. Irwin et al., 2011; Smith &
Bennetto, 2007; Taylor et al., 2010). Although we did not find a group
difference in audiovisual speech processing, we are unable to speak to the
influence of unisensory factors in our findings. Future research should
assess the degree to which autistic and non-autistic adults may differ
in their perception of visemes and phonemes in isolation as well as in
combination to better isolate any potential differences in MSI.
Finally, it must be noted that well-educated adults with comparatively
high IQs are overrepresented in the NAR sample (Scheeren et al., 2022).
It could be argued that our sample is therefore less likely to capture
the segments of the autistic population that may suffer from the most
severe deficits in areas like MSI. In particular, those with
intellectual disabilities are underrepresented. That being said, the
parity in IQ (as estimated by the ICAR) between groups does mean that
our results can speak directly to differences resulting from the sensory
factors related to autism that are not confounded by cognitive ones
related to intellectual impairment. If differences between groups in MSI
were found to be driven by the individuals who are underrepresented
here, it would be unclear whether they were due to autism or
intellectual disability.
In conclusion, our study has confirmed several findings with regard to
basic temporal and multisensory processing, as well as challenged the
degree to which reported differences between autistic and non-autistic
children in these areas extend to adulthood. Our findings that MSI,
temporal processing acuity, and rapid temporal recalibration all seem to
be intact among autistic adults are highly encouraging given the
essential role MSI has in speech perception and compensation for the
unisensory deterioration that is inevitable with aging. Additionally,
our novel findings with regard to differences in the degree of
interference in temporal processing posed by incongruent stimuli and in
the mean PSS values between groups are intriguing, and demand further
research to disentangle alternative explanations. Understanding these
phenomena is of paramount importance given the relevance of temporal and
multisensory processing to higher order social factors and the proven
efficacy of multisensory training. Pinpointing the age at which related
interventions may be of use is crucial to their proper timing, which our
findings suggest is prior to adulthood. Finally, our results underline
the importance of expanding sample sizes and age ranges in autism
research. Restricting our focus to children does a disservice to the
vast majority of individuals with autism and leads to a limited
understanding of the broader trajectory of this developmental condition,
which can only be broadened by giving autistic adults the attention they
deserve.