Hearing Research 359 (2018) 13–22
Research Paper
Depth matters - Towards finding an objective neurophysiological
measure of behavioral amplitude modulation detection based on
neural threshold determination
Saskia M. Waechter a, b, Alejandro Lopez Valdes a, b, Cristina Simoes-Franklin a, c, e,
Laura Viani c, d, e, Richard B. Reilly a, b, d, e, *
a Trinity Centre for Bioengineering, Trinity College, The University of Dublin, Dublin 2, Ireland
b School of Engineering, Trinity College, The University of Dublin, Dublin 2, Ireland
c National Cochlear Implant Program, Beaumont Hospital, Dublin 9, Ireland
d Royal College of Surgeons in Ireland, Dublin 2, Ireland
e School of Medicine, Trinity College, The University of Dublin, Dublin 2, Ireland
Article info

Article history:
Received 14 January 2017
Received in revised form 7 December 2017
Accepted 11 December 2017
Available online 14 December 2017

Abstract
With increasing numbers undergoing intervention for hearing impairment at a young age, the clinical
need for objective assessment tools of auditory discrimination abilities is growing. Amplitude modulation (AM) sensitivity is known to be an important factor for speech recognition, particularly
among cochlear implant (CI) users. It would therefore be useful to develop objective measures of AM
detection for future clinical assessment of CI users; this study aimed to verify the feasibility of a
neurophysiological approach studying a cohort of normal-hearing participants. The mismatch waveform
(MMW) was evaluated as a potential objective measure of AM detection for a low modulation rate (8 Hz).
This study also explored the relationship between behavioral AM detection and speech-in-noise
recognition. The following measures were obtained for 15 young adults with no known hearing
impairment: (1) psychoacoustic sinusoidal AM detection ability for a modulation rate of 8 Hz; (2) neural
AM detection thresholds estimated from morphology weighted cortical auditory evoked potentials elicited by various AM depths; and (3) AzBio sentence scores for speech-in-noise recognition. No significant
correlations were found between speech recognition and behavioral AM detection measures. Individual
neural thresholds were obtained from MMW data and showed significant positive correlations with
behavioral AM detection thresholds. Neural thresholds estimated from morphology weighted MMWs
provide a novel, objective approach for assessing low-rate AM detection. The findings of this study
encourage the continued investigation of the MMW as a neural correlate of low-rate AM detection in
larger normal-hearing cohorts and subsequently in clinical cohorts such as cochlear implant users.
© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND
license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords:
Temporal auditory processing
Objective measures
Mismatch waveform
Amplitude modulation detection
Speech recognition
Abbreviations: ACC, acoustic change complex; AFC, alternative forced choice; AM, amplitude modulation; AMD, amplitude modulation depth; ASSR, auditory steady-state response; AUC, area-under-the-curve; BT, behavioral threshold; CI, cochlear implant; EEG, electroencephalography; EFR, envelope following response; ERP, event-related potential; IV, intersection value; LTASS, long-term average speech spectrum; MMW, mismatch waveform; NT, neural threshold; RMS, root-mean-square; SNR, signal-to-noise ratio; SIN, speech-in-noise; SD, standard deviation; TFS, temporal fine structure
* Corresponding author. Trinity Centre for Bioengineering, 152-160 Pearse Street, Trinity College, The University of Dublin, Dublin 2, Ireland.
E-mail address: REILLYRI@tcd.ie (R.B. Reilly).

1. Introduction
Speech processing in humans is a complex process based on the
integration of spectral and temporal information, where temporal
information can be divided into the slow amplitude fluctuations in
the envelope and the faster temporal changes conveyed in the
temporal fine structure (TFS). There is evidence that TFS cues can
enhance speech recognition in adverse listening conditions
(Hopkins and Moore, 2009). However, the temporal envelope is
considered as one of the most important features for speech
intelligibility (Drullman, 1995; Shannon et al., 1995). Specifically,
envelope fluctuations with rates below 16 Hz are crucial for
https://doi.org/10.1016/j.heares.2017.12.005
phoneme recognition (Drullman, 1995; Xu et al., 2005). The significance of slow envelope fluctuations in speech recognition has
prompted many recent investigations of brain activity in response
to sounds with low-rate amplitude modulation (AM). A literature
review by Edwards and Chang (2013) highlighted the tuning of the
human auditory system to low-rate AM in the fluctuation range
(~1–10 Hz) and its implications for speech processing.
Given the importance of envelope fluctuations for sound
perception and particularly for speech processing, this study
explored human auditory brain responses to modulated sounds
with a low AM rate (8 Hz) and examined their applicability as an
objective metric. Such an objective metric may provide a valuable
tool for clinical hearing assessment without relying on subjective
feedback (Hall and Swanepoel, 2010), addressing the clinical demand for objective metrics of auditory processing due to increasing
numbers undergoing intervention for hearing impairment at a
young age (Rajan et al., 2017). Experiments were conducted for a
normal-hearing cohort to verify the feasibility of a neurophysiological approach based on neural change detection. Paradigms were
designed in a way that allows future replication of the same test
battery in a CI user cohort.
As detailed by Picton (2013), neural responses elicited by temporal auditory features can be employed to assess various aspects of
temporal auditory processing. Previous studies have assessed the
relationship between behavioral AM detection abilities and corresponding neural measures (Purcell et al., 2004; Manju et al., 2014;
Han and Dimitrijevic, 2015; Luke et al., 2015; Dimitrijevic et al.,
2016). The main objective of this study was to build on this
research by estimating individual neural thresholds (NTs) from late
cortical auditory evoked potentials (CAEPs) for low-rate AM
detection. These NTs were derived from CAEP data elicited by
various amplitude modulation depths (AMDs), and compared to
behavioral AM detection thresholds for a modulation rate of 8 Hz.
We hypothesized that NTs would be significantly correlated with
behavioral AM detection thresholds of the same AM rate.
Periodic neural responses such as the auditory steady state
response (ASSR) (Manju et al., 2014; Luke et al., 2015) and the envelope following response (EFR) (Purcell et al., 2004), as well as
transient CAEPs in the form of the acoustic change complex (ACC)
(Han and Dimitrijevic, 2015) have been investigated as potential
candidates to determine objective, neural measures of AM detection. The ASSR measures the neural response to a fixed AM rate,
whereas the EFR may be evoked by sweeping stimuli, e.g. stimuli
with continuously changing AM rates or AMDs. Significant correlations between behavioral AM detection thresholds and electrophysiological measures have been found for the EFR (Purcell et al.,
2004) and the ASSR (Manju et al., 2014), for normal-hearing cohorts
at modulation rates above 20 Hz and with constant AMDs in the
neurophysiological paradigms. Dimitrijevic et al. (2016) investigated EFRs elicited by stimuli with continuously changing AMDs at
a fixed AM rate of 41 Hz. Neural AMD thresholds obtained from the
EFRs showed significant correlations with behavioral AM detection
thresholds. Luke et al. (2015) investigated the electrically evoked
ASSR for CI users for modulation rates of 4 Hz and 40 Hz and found
a significant correlation between ASSRs at 40 Hz and behavioral AM
detection thresholds at 20 Hz. No studies have been reported
investigating the relationship between AM detection abilities and
EFRs/ASSRs elicited by stimuli with differing AMDs for modulation
rates in the fluctuation range, e.g. below 20 Hz. Han and
Dimitrijevic (2015) investigated the influence of the AMD on
ACCs for differing modulation rates including a low-rate AM of 4 Hz,
showing a fast decline in ACC amplitude with decreasing AMD.
The present study explored a transient neurophysiological
response referred to as the mismatch waveform (MMW) (Lopez
Valdes et al., 2014). The MMW was obtained using an auditory
oddball paradigm by subtracting the standard CAEP from the
deviant CAEP. This difference waveform showed two distinct
components: a negative component corresponding to the widely
studied mismatch negativity, which is the result of perceptual change detection in stimulus sequences (Näätänen et al., 2007; Fishman, 2014), followed by a positive component which is associated with cognitive processes and may be a result of involuntary
attention directed towards the deviant stimulus, similar to the P3a
response (He et al., 2009). Previous studies have shown that both
components are positively correlated with the magnitude of stimulus change (Katayama and Polich, 1998; He et al., 2009). Thus, both
components were investigated as part of the MMW, similar to work
by Lopez Valdes et al. (2014), who have demonstrated positive
correlations between MMW-based neurophysiological thresholds
and psychoacoustic thresholds for spectral-ripple discrimination.
This study aimed to expand on those findings, transitioning from
spectral processing to temporal processing with the overall goal of
designing a combined test battery to assess spectro-temporal
auditory processing abilities.
While not the main focus of this study, the relationship between
the ability to detect low-rate AM with AMDs near threshold and
speech recognition scores was also addressed. Although the literature suggests a lack of correlations between psychoacoustic
measures (e.g. pitch discrimination, intensity discrimination and
modulation detection) and speech recognition measures within
normal-hearing cohorts (Strouse et al., 1998; Watson and Kidd,
2002; Goldsworthy et al., 2013), the calculation of correlations
was implemented in order to have a fully translatable experimental
battery for replication in CI users. Across groups of younger and
older normal-hearing and CI participants, Jin et al. (2014) found
significant correlations between AM detection thresholds (at 2 and
4 Hz AM rates) and speech-in-noise recognition with modulated
noise maskers, but no within-group correlations were reported.
Experiments with CI user cohorts have shown significant correlations between speech measures (i.e. vowel, consonant, phoneme,
syllable, and sentence recognition) and low-rate (modulation rate
fm < 20 Hz) (Gnansia et al., 2014; De Ruiter et al., 2015) as well as
high-rate (fm ≥ 20 Hz) AM detection abilities (Cazals et al., 1994; Fu,
2002; Luo et al., 2008; Won et al., 2011; De Ruiter et al., 2015). These
reported correlations in CI user cohorts provide support for the
importance of AM sensitivity in electrical hearing and encourage
the investigation of an objective measure of AM sensitivity.
2. Materials & methods
2.1. Participants
15 young adults (9 female, 6 male; 19–28 years, mean:
23.2 ± 2.5 years) with no known hearing impairment participated
in this study. Three participants were not native English speakers
and their data were excluded from analysis relating to the speech
test, but their results from electrophysiological and AM detection
paradigms were included. Informed written consent was obtained
from all participants prior to participation and all experimental
procedures were approved by the Ethics (Medical Research) Committee at Beaumont Hospital, Beaumont, Dublin and the Research
Ethics Committee at Trinity College Dublin.
Participants were seated in a quiet room and auditory stimuli
were presented monaurally to the left ear via headphones (Sennheiser HD 205) for all experimental paradigms. The presentation
level of 70 dB SPL was verified with a KEMAR mannequin (45 BC)
with pinna simulator (KB 0091), pre-amplifier (26CS) and pre-polarized pressure microphone (40A0) (all from G.R.A.S. Sound & Vibration). All stimuli were energy matched by adjusting the root-mean-square (RMS) amplitude.
2.2. AM stimuli
Stimuli were created in MATLAB (MATLAB Release 2013b, The
MathWorks, Inc., Natick, Massachusetts, United States) with a
sampling rate of 44100 Hz. The modulation rate was 8 Hz, which
provided four full AM cycles for a sound duration of 500 ms. The
noise carrier was created by filtering a 500-ms white Gaussian
noise stimulus with a long-term average speech spectrum (LTASS)
filter (Byrne et al., 1994). The AM signal, s(t), was created by
multiplying the noise carrier, c(t), with a sinusoidal signal according
to Equation (1):
s(t) = [1 + m · sin(2π · fm · t + Φ)] · c(t)    (1)

where t denotes time, fm is the modulation rate (8 Hz), Φ is the starting phase (−π/2) and m is the AMD with values between zero
and one. The psychoacoustics literature commonly reports the
AMD in decibels expressed as 20log(m), but it may also be reported
in percentage expressed as 100*m, in line with the literature
investigating neurophysiological measures of AM processing
(Purcell et al., 2004; Han and Dimitrijevic, 2015; Dimitrijevic et al.,
2016). For the neurophysiological paradigm, AMDs were chosen on
a linear scale with a constant step size of 25% (100%, 75%, 50% and
25%), thus, the AMD in this study is reported in percentage,
expressed as 100*m, unless noted otherwise. The chosen starting
phase results in minimum amplitude of the noise signal at stimulus
onset. To avoid loudness cues resulting from changes in AMD, the
unmodulated and modulated stimuli were energy matched by
adjusting the RMS amplitude to a constant value. Additionally, level
roving was applied in the psychoacoustic paradigm with a range of
±3 dB to reduce the usefulness of any potentially remaining loudness cues.
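For illustration, the stimulus construction above (Equation (1), energy matching to a fixed RMS, and ±3 dB level roving) can be sketched as follows. The study generated stimuli in MATLAB; this Python/NumPy version is only a reconstruction, and the LTASS filtering of the carrier is omitted (plain white Gaussian noise stands in for the LTASS-filtered carrier):

```python
import numpy as np

FS = 44100          # sampling rate [Hz]
DUR = 0.5           # duration [s]: four full AM cycles at 8 Hz
FM = 8.0            # modulation rate [Hz]
PHI = -np.pi / 2    # starting phase: minimum amplitude at stimulus onset

def am_stimulus(m, rms_target=0.1, rng=None):
    """AM noise per Equation (1): s(t) = [1 + m*sin(2*pi*fm*t + Phi)] * c(t).

    m is the modulation depth (0..1). The LTASS-filtered carrier of the
    study is replaced by plain white Gaussian noise here (assumption)."""
    if rng is None:
        rng = np.random.default_rng(0)
    t = np.arange(int(FS * DUR)) / FS
    c = rng.standard_normal(t.size)                       # noise carrier c(t)
    s = (1.0 + m * np.sin(2 * np.pi * FM * t + PHI)) * c  # Equation (1)
    # Energy matching: fixed RMS so AMD changes carry no overall loudness cue.
    return s * (rms_target / np.sqrt(np.mean(s ** 2)))

def rove_level(s, rng, range_db=3.0):
    """Level roving of +/-3 dB, applied in the psychoacoustic paradigm only."""
    return s * 10 ** (rng.uniform(-range_db, range_db) / 20)
```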
2.3. Psychoacoustics

Behavioral AM detection was evaluated with two paradigms: one paradigm estimated the behavioral threshold (BT) for AM detection with an adaptive procedure, and the second paradigm yielded the percentage of correct discrimination for a set of specific AMDs, providing an estimate of the overall psychometric function.

BTs were determined with a three-alternative-forced-choice (3AFC), two-down/one-up paradigm yielding an estimate of the 70.7% correct point on the psychometric function (Levitt, 1971). The inter-stimulus interval for each trial of three consecutive stimuli was set to 100 ms. Participants were provided with visual feedback, with the selected button lighting up green or red for a correct or incorrect response, respectively. The starting AMD was 0 dB, expressed as 20log(m). The step size was 4 dB for the first four reversals and 2 dB thereafter. A run was completed after 12 reversals and the BT was taken as the arithmetic mean of the 20log(m) values at the last eight reversals. Data were acquired for four runs and the final BTs were calculated as the mean across runs. BTs are reported both in percentage (100*m) and in dB (20log(m)), to facilitate easy comparison with neural thresholds (NTs) and BTs reported in the literature, respectively.

To obtain an estimate of the psychometric function, unmodulated and modulated stimuli with a duration of 500 ms were presented in a single-interval yes/no task with blocked modulation depths (i.e. 10%, 12.5%, 25%, 50%, 75%, and 100%). The participant had a 2 s time window following stimulus presentation to decide whether the stimulus was modulated or unmodulated by clicking the corresponding button. No feedback was provided. Modulated and unmodulated stimuli had equal probability of presentation. Stimulus presentations were divided into three runs for each AMD with a total of 120 stimulus presentations for each AMD. To avoid fatigue, data for this paradigm were acquired in a separate session to the other tests. The percentage of correct responses was calculated for each AMD as the sum of hits and correct rejections divided by the total number of trials, where hits refers to the number of trials in which modulated stimuli were correctly identified as modulated and correct rejections refers to the number of trials in which unmodulated stimuli were correctly identified as unmodulated. Additionally, the sensitivity index, d′ (d-prime), is reported. In the case of extreme values of zero and one for the hit rate or false alarm rate, a correction was applied by adjusting zero to 0.5/N and one to 1 − 0.5/N, where N is the number of possible hits or false alarms, respectively (Macmillan and Kaplan, 1985).

2.4. Speech test

The AzBio speech test (Spahr et al., 2012) was employed. Recorded sentences were presented with male and female speakers with an American English accent and masked with a ten-talker babble noise. Three SNRs (10, 5 and 0 dB) were used for one sentence list each. Speech recognition scores for a normal-hearing cohort were expected to show ceiling effects for SNRs of 10 dB and 5 dB, but were included in the test battery to facilitate study replication in a CI user cohort, in which speech-in-noise recognition is known to be poorer (Oxenham and Kreft, 2014). Every sentence list included 20 sentences with a mean of seven words per sentence across the three used lists. The number of correctly identified words was counted and the speech recognition score was calculated as the percentage of correctly identified words. All presented words were considered for each sentence's recognition score. The speech signal was presented at a constant level and the noise signal was adjusted according to the SNR.

2.5. Electrophysiology
2.5.1. Data acquisition
Single-channel EEG data were acquired through a custom-built,
single-channel, high sampling rate EEG setup, previously designed
and validated to acquire EEG data from CI recipients and which
includes an electrical artefact reduction algorithm (Mc Laughlin
et al., 2013). In this setup, the recording electrode is positioned at
the vertex and referenced to the right mastoid; the right collar bone
is used as the system ground. As the long term goal is to use the
protocol with CI users, data were sampled at a high rate of 125 kHz
for artefact reduction purposes (Mc Laughlin et al., 2013). Such a
high sampling rate is unnecessary for EEG signals from normal-hearing participants and uneconomical for further post-processing. Hence, data were down-sampled offline by a factor of
100. The amplifier's high-pass filter was set to 0.03 Hz, the low-pass filter was set to 100 Hz, and data were amplified with a gain of
2000. Electrode impedances were measured before, during and
after the electrophysiological recordings. Impedances were kept
below 5 kΩ for all electrode combinations.
Cortical responses were elicited using an unattended, auditory
oddball paradigm in which modulated (deviant) and unmodulated
(standard) noise sounds (details in Section 2.2) were presented for
deviants with AMDs of 100%, 75%, 50%, and 25%. Each stimulus had
a duration of 500 ms and an inter-stimulus interval of 1 s. Each
condition was presented in separate blocks of 160 stimulus repetitions each with a total of four blocks per condition. Each block
contained 20 initial presentations of the standard, followed by 140
mixed presentations of the standard (90% occurrence probability)
and deviant (10% occurrence probability), resulting in 56 deviant
and 584 standard presentations in total for each AMD. The order of
AMD blocks was pseudo-randomized for each participant.
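The per-block and per-condition counts above can be checked with a short sketch; the exact randomization constraints (if any) on the mixed portion are not stated, so a plain shuffle is assumed, and all names here are illustrative:

```python
import random

def make_block(n_init=20, n_mixed=140, p_dev=0.1, rng=None):
    """One oddball block: 20 initial standards ('S'), then 140 mixed trials
    containing a fixed 10% proportion of deviants ('D'), shuffled.
    The shuffle is unconstrained here (assumption; the text does not state
    the randomization rules for the mixed portion)."""
    if rng is None:
        rng = random.Random(0)
    n_dev = round(n_mixed * p_dev)                  # 14 deviants per block
    mixed = ['D'] * n_dev + ['S'] * (n_mixed - n_dev)
    rng.shuffle(mixed)
    return ['S'] * n_init + mixed

# Four blocks per AMD condition:
blocks = [make_block(rng=random.Random(b)) for b in range(4)]
n_deviants = sum(b.count('D') for b in blocks)      # 56 per AMD
n_standards = sum(b.count('S') for b in blocks)     # 584 per AMD
```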
Participants were seated in a quiet room, watching a silent,
captioned movie of their choice and were instructed to keep body
movements to a minimum. As a measure of signal quality, an
additional brief block of pure tone stimuli was added at the start
and end of the EEG recording to elicit the robust N1-P2 complex
(500 Hz, 500 ms duration, 1 s inter-stimulus interval). All participants exhibited visible N1-P2 complexes, so no participant's data
were excluded from further analysis.
2.5.2. Data processing
Offline post-processing of the down-sampled data included
zero-phase band-pass filtering between 1 Hz and 15 Hz with a 4th
order Butterworth filter, gain removal, epoching (−300 ms pre-stimulus to 700 ms post-stimulus), linear de-trending, baseline
correction, and separation into standard and deviant epochs.
Standard and deviant epochs were averaged to form the respective
CAEPs and the MMW was obtained by subtracting the standard
from the deviant CAEP.
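A minimal sketch of this processing chain, assuming the down-sampled rate of 1.25 kHz (125 kHz / 100) and using SciPy for the zero-phase Butterworth filtering; function names and the epoch bookkeeping are illustrative, not the study's actual code:

```python
import numpy as np
from scipy.signal import butter, filtfilt, detrend

FS = 1250  # effective sampling rate after down-sampling 125 kHz by 100

def mismatch_waveform(raw, onsets, is_deviant):
    """Sketch of the offline pipeline: zero-phase 1-15 Hz Butterworth
    band-pass, epoching (-300 ms to 700 ms), linear de-trending, baseline
    correction on the pre-stimulus interval, averaging into standard and
    deviant CAEPs, and their subtraction (MMW = deviant - standard)."""
    b, a = butter(4, [1 / (FS / 2), 15 / (FS / 2)], btype='band')
    x = filtfilt(b, a, raw)                          # zero-phase filtering
    pre, post = int(0.3 * FS), int(0.7 * FS)
    std_ep, dev_ep = [], []
    for onset, dev in zip(onsets, is_deviant):
        ep = detrend(x[onset - pre:onset + post], type='linear')
        ep = ep - ep[:pre].mean()                    # baseline correction
        (dev_ep if dev else std_ep).append(ep)
    return np.mean(dev_ep, axis=0) - np.mean(std_ep, axis=0)
```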
Morphology weighting: The MMW was evaluated in terms of
area-under-the-curve (AUC) in the region of 110 ms to 310 ms post-stimulus onset (Fig. 3A). Random non-task related fluctuations in
the EEG data may result in spurious AUC measurements. For this
reason, a morphology weighting approach was developed (Fig. 1).
Morphology weighting was achieved by assessing the Pearson
correlation between MMWs at differing AMDs and a participant-specific template. These correlation coefficients were associated
with weights according to a weighting function (binary or exponential weighting, Fig. 3C). Multiplication of the AUC values with
the assigned weights for each AMD and each participant provided
the morphology weighted AUC curves. The application of weights
to AUC values based on similarity of the MMW with the
participant-specific template reduced the influence of random
fluctuations on AUC values.
Morphology weights calculation: Each participant showed a clear
MMW for the 100% AMD, i.e. a negative component followed by a
positive component in the time region of interest. Thus, the individual MMW at 100% AMD served as a participant-specific template
for the morphology weighting approach and is referred to as the
‘template’ in the following. Subsequently, correlation coefficients
between the individual MMWs for lower AMDs (MMW75, MMW50
and MMW25) and this template were calculated.
It has been reported that increasing task difficulty may result in
increased MMW latencies (Tiitinen et al., 1994; Kimura and Takeda,
2013). MMW latency shifts may lead to lower correlation coefficients despite overall similar morphology. To compensate for
such latency shifts, MMW75, MMW50 and MMW25 were aligned
with the template prior to the correlation calculation (Fig. 2). To
determine the latency shift required for the alignment, the peaks of
the template were determined in the corresponding time range of
interest and a time window of −5 ms to +40 ms around these peak
latencies served as the search window for peak detection of the
remaining MMWs. The search window was chosen to be asymmetrical, as latencies were not expected to decrease for lower
AMDs. MMWs were aligned separately, by positive peak and by
negative peak, providing two sets of correlation coefficients.
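A sketch of this peak-based alignment, using the asymmetric search window described above; the sampling rate and all names are assumptions for illustration:

```python
import numpy as np

def align_to_template(template, mmw, peak_sign, fs=1250, lo=-0.005, hi=0.040):
    """Shift `mmw` so its positive (or negative) peak lines up with the
    template's corresponding peak; candidate peaks are searched -5 ms to
    +40 ms around the template peak (asymmetric, since latencies are not
    expected to decrease at lower AMDs). Sketch only."""
    sgn = 1 if peak_sign == 'positive' else -1
    t_peak = int(np.argmax(sgn * template))          # template peak index
    w0 = max(0, t_peak + int(lo * fs))               # search window start
    w1 = min(mmw.size, t_peak + int(hi * fs) + 1)    # search window end
    m_peak = w0 + int(np.argmax(sgn * mmw[w0:w1]))   # MMW peak in window
    # Circular shift; edge wrap-around is negligible for small shifts.
    return np.roll(mmw, t_peak - m_peak)
```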
Fig. 1. Overview of data processing steps for morphology weighting of mismatch waveforms (MMWs) to obtain weighted area-under-the-curve (AUC) functions. For each
participant and each amplitude modulation depth (AMD), (1) the AUC is calculated between 110 ms and 310 ms post-stimulus onset. (2) Separately for the negative and the positive
peaks, the MMWs were aligned (Fig. 2) by shifting the waveforms. The correlation coefficients between the template (MMW100) and MMWs for lower AMDs are calculated. Based on
the correlation coefficient, a weight is assigned (binary or exponential weighting function). The overall weight is the average of the obtained values for negative and positive peak
alignment. (3) The morphology weighted AUC values are obtained by multiplying the determined weights with their respective AUC values.
S.M. Waechter et al. / Hearing Research 359 (2018) 13e22
17
Fig. 2. Mismatch waveform (MMW) alignment for an example participant, allowing for latency shifts between differing amplitude modulation depths (AMDs); MMWs for 75%, 50%
and 25% AMD were aligned with the template (100% AMD) with regard to the negative peak (B) and the positive peak (C) of the MMW.
Correlation coefficients were calculated after the MMWs were
aligned. Each correlation coefficient was assigned a weight and to
obtain the final weight for each AUC value, the weights derived
from negative and positive peak alignments were averaged. The
morphology weighted AUC values (Fig. 3D) were obtained by
multiplying the unweighted AUC values (Fig. 3B) with the weights.
Weighting functions: Weight assignment was based on two
weighting approaches, binary and exponential weighting. The binary weighting represents an “all-or-nothing” approach in which a
weight of one was assigned to all MMWs with correlation coefficients of 0.85 or above and a weight of zero was assigned
otherwise (Fig. 3C). The threshold of 0.85 was determined empirically and preserves MMWs that show the characteristic waveform
with little variation, but suppresses MMWs that do not show high
similarity to the template. For the less stringent exponential
weighting, correlation coefficients were assigned into bins of 0.1
width. The first bin with correlation coefficients between one and
0.9 was assigned a weight of one, and with each bin the weight was
halved (i.e. 0.5, 0.25, 0.125) (Fig. 3C). All correlation coefficients
below 0.6 were assigned a weight of zero. Each weighting type was
applied individually throughout the data analysis and final correlations between BTs and NTs were compared to assess the influence
of the weighting approach.
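The two weighting functions can be written out as follows; whether the bin edges are inclusive is not specified in the text, so the boundary handling here is an assumption:

```python
def binary_weight(r):
    """All-or-nothing: weight 1 for correlation r of 0.85 or above, else 0."""
    return 1.0 if r >= 0.85 else 0.0

def exponential_weight(r):
    """Bins of width 0.1: weight 1 above 0.9, halved per lower bin, and 0
    below 0.6. Bin-edge inclusiveness is an assumption."""
    if r > 0.9:
        return 1.0
    if r > 0.8:
        return 0.5
    if r > 0.7:
        return 0.25
    if r > 0.6:
        return 0.125
    return 0.0

def weighted_auc(auc, r_neg, r_pos, weight_fn):
    """Scale an AUC value by the mean of the weights obtained from
    negative-peak and positive-peak alignment."""
    return auc * 0.5 * (weight_fn(r_neg) + weight_fn(r_pos))
```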
Neural threshold calculation: The NT was taken as the interpolated AMD at which the individual's weighted AUC curve dropped
below a derived intersection value (Fig. 3E). A range of different
intersection values was investigated.
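A sketch of that threshold calculation, assuming linear interpolation between neighbouring AMDs (the text specifies only that the AMD is interpolated):

```python
def neural_threshold(amds, auc, iv):
    """Neural threshold (NT): the highest interpolated AMD at which the
    weighted AUC curve drops below the intersection value (IV). Linear
    interpolation between neighbouring AMDs is an assumption; amds must
    be in descending order, e.g. [100, 75, 50, 25]."""
    for i in range(len(amds) - 1):
        if auc[i] >= iv > auc[i + 1]:      # curve crosses the IV here
            frac = (auc[i] - iv) / (auc[i] - auc[i + 1])
            return amds[i] + frac * (amds[i + 1] - amds[i])
    return None                            # no crossing in the tested range
```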
Correlation analysis between BTs and NTs: The potential linear
relationship between BTs and NTs was assessed with Pearson's
correlation coefficient r. Correlation analysis was carried out for a
range of potential intersection values with both weighting functions. For this analysis, all standard epochs were averaged for each
AMD. To verify the validity of the novel methodology, the entire
analysis was carried out for 300 permutations of different sub-sets
of 56 standards for an example intersection value of 0.35 and both
weighting functions.
3. Results
3.1. Psychoacoustics
The individual mean BTs were between 8.1% (−21.8 dB) and 16.7% (−15.5 dB) AMD. The group mean threshold was 12.1% (−18.3 dB) with a standard deviation of 2.4% (−33.4 dB). Behavioral
AM detection scores in the psychometric function showed expected
ceiling effects at the group level for AMDs of 25% and above, while
detection accuracy decreased significantly below 25% AMD
(Fig. 4A). Table 1 summarizes the group mean percentage correct
values and their standard deviations for the various AMDs. Additionally, the group mean hit rates, false alarm rates and the sensitivity index d′ are reported (Table 1). A non-parametric Friedman test revealed a statistically significant difference in task performance between differing AMDs (χ²(5) = 55.03, p-value < .001).
Post-hoc analysis with Wilcoxon signed-rank tests was conducted
with Bonferroni correction to adjust for multiple comparisons with
an adjusted significance level of 0.003 for 15 comparisons. There
were significant differences between 12.5% and all other AMDs
(Z ≥ 3.24, p-value ≤ .001) and between 10% and all other AMDs
(Z ≥ 3.24, p-value ≤ .001).
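The d′ values reported in Table 1 follow the standard computation with the extreme-rate correction described in Section 2.3; a minimal sketch (Python here; the study's analysis environment was MATLAB):

```python
from statistics import NormalDist

def d_prime(hits, n_signal, false_alarms, n_noise):
    """d' = z(hit rate) - z(false alarm rate). Extreme rates of 0 and 1 are
    replaced by 0.5/N and 1 - 0.5/N (Macmillan and Kaplan, 1985), where N
    is the number of possible hits or false alarms."""
    z = NormalDist().inv_cdf
    def rate(k, n):
        r = k / n
        if r == 0.0:
            return 0.5 / n
        if r == 1.0:
            return 1.0 - 0.5 / n
        return r
    return z(rate(hits, n_signal)) - z(rate(false_alarms, n_noise))
```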
3.2. Speech test
Speech-in-noise recognition scores for the native English
speakers revealed ceiling effects at the 10 dB SNR and close to
ceiling effects at the 5 dB SNR, but for 0 dB SNR a decline in speech
recognition and increased variation among participants occurred
(Fig. 4B, Table 2). A non-parametric Friedman test revealed significant differences between SNRs (χ²(2) = 24.00, p-value < .001).
Fig. 3. (A) Example area-under-the-curve (AUC) between 110 ms and 310 ms for an individual participant's mismatch waveform (MMW) for the 100% amplitude modulation depth
(AMD); (B) unprocessed AUC curves derived from individual MMWs for all participants; (C) exponential and binary weighting function assigning weights to AUC scores depending
on correlation coefficients between the associated MMW and the participant's template; (D) weighted AUC curves obtained by multiplying assigned weights with unweighted AUC
values for all participants' weighted AUC curves; results displayed for binary weighting function; (E) neural thresholds determined as the intersection point between weighted AUC
curves and the chosen intersection value (IV); the neural threshold (NT) represents the highest, interpolated AMD at which the AUC curve drops below the IV.
Fig. 4. (A) Psychometric function of amplitude modulation detection showing group mean scores and standard deviations (all participants, n = 15); (B) Boxplots indicating the 25th and 75th percentiles and the median speech-in-noise scores for three signal-to-noise ratios (SNRs) for the native speakers (n = 12); (C) Regression line and correlation results between speech-in-noise scores (0 dB SNR) and the 12.5% AMD of the psychometric function (native speakers, n = 12); both levels represent the tested level at which participants showed poorer task performance and increased performance variability.
Table 1
Group mean results of the psychometric function reporting the total percentage of correct responses (hits and correct rejections), the corresponding standard deviation (SD), hit rates, false alarm rates and the sensitivity index d′ for six amplitude modulation depths (AMDs), which are reported in percentage, expressed as 100*m, and in dB, expressed as 20log(m).

AMD [%]   AMD [dB]   Total correct [%]   SD     Hit rate   False alarm rate   d′
100       0          98.8                1.2    0.985      0.016              4.44
75        −2.5       98.7                1.2    0.983      0.018              4.41
50        −6.0       98.8                1.4    0.990      0.024              4.46
25        −12.0      97.9                2.2    0.973      0.024              4.20
12.5      −18.0      84.9                10.3   0.831      0.131              2.41
10        −20        66.7                9.8    0.494      0.170              1.01
Post-hoc analysis with Wilcoxon signed-rank tests was conducted
with Bonferroni correction to adjust for multiple comparisons with
an adjusted significance level of 0.017 for three comparisons. Significant differences were observed between all three conditions
(10 dB vs. 5 dB: Z = 3.06, p-value = .002; 10 dB vs. 0 dB: Z = 5.06, p-value = .002; 5 dB vs. 0 dB: Z = 3.06, p-value = .002). Non-native speakers showed large variations in their performance as
well as overall poorer performance for lower SNRs and were
therefore excluded from data analysis relating to speech recognition scores.
3.3. Electrophysiology
The group mean MMWs revealed a clear morphology for 100%
and 75% AMD, whereas the waveforms for 50% and 25% AMD only
showed random fluctuations (Fig. 5A). The individual morphology
weighted AUC values showed a strong decline from 100% to 75%
and from 75% to 50% AMD, but then remained constant at a low
(mostly zero) level for 50% and 25% AMD (Fig. 3D and Fig. 5B).
Statistical analysis by means of a non-parametric Friedman test
revealed a significant effect of AMD for the binary weighted MMW
AUC values (χ²(3) = 37.08, p-value < .001). Post-hoc analysis was
carried out with Wilcoxon signed-rank tests, and Bonferroni
correction provided an adjusted significance level of 0.008 for six
comparisons. Weighted MMW AUC values for 100% AMD differed
significantly from those for all other AMDs (100% vs. 25%: Z ¼ 3.41,
Table 2
Group mean data and their standard deviations (SDs) for the speech-in-noise
test; the table includes scores for native speakers (n¼12) at three tested
signal-to-noise ratios (SNRs).
SNR
Mean
SD
10 dB
5 dB
0 dB
98.2
94.0
67.8
1.65
1.67
7.54
p-value ¼ .001, 100% vs. 50%: Z ¼ 3.41, p-value ¼ .001, 100% vs.
75%: Z ¼ 3.35, p-value ¼ .001) and 75% AMD scores differed
significantly from those for 25% AMD (Z ¼ 2.80, p-value ¼ .005),
but not 50% AMD (Z ¼ 2.58, p-value ¼ .010). No significant difference was found between scores for 25% and 50% AMD
(Z ¼ 0.54, p-value ¼ .593).
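The Friedman test over the four repeated AMD conditions can be sketched as follows; the synthetic AUC values merely mimic the qualitative pattern of Fig. 5B (high weights at 100% and 75% AMD, mostly zero below), and none of the numbers are the study's:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
n = 15  # participants

# Synthetic weighted AUC values per participant for the four AMDs;
# the pattern mimics Fig. 5B but the values are invented.
auc_100 = rng.uniform(0.6, 1.0, n)
auc_75 = rng.uniform(0.2, 0.8, n)
auc_50 = np.zeros(n)
auc_50[:2] = rng.uniform(0.1, 0.3, 2)  # two non-zero weights, as observed
auc_25 = np.zeros(n)
auc_25[:2] = rng.uniform(0.1, 0.3, 2)

chi2, p = friedmanchisquare(auc_100, auc_75, auc_50, auc_25)
print(p < 0.01)  # a clear effect of AMD with this synthetic pattern
```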
For binary weighting, five out of the 15 MMW75 values were assigned a weight of zero, suggesting that these five participants showed a clear MMW response only for the MMW100. For the MMW50 and MMW25, only two non-zero weights were observed in each set of 15 MMWs (Fig. 5B). Closer analysis of the MMWs associated with non-zero weights at 25% and 50% AMD showed that participant 'NH2' demonstrated clearer MMWs than other participants and AUC values different from zero for all AMDs. Further inspection of the data suggested that the non-zero AUC values for participant 'NH4' at 50% AMD and for 'NH16' at 25% AMD were unexpected given the observed waveforms, since the MMWs did not resemble the template. However, minimal random fluctuations with the shape of the template in the region of interest led to high correlation values, thereby assigning greater weights to these AUC values.
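A minimal sketch of how such correlation-based morphology weighting could look. The mapping from template correlation to weight is a guess (the exact binary criterion and exponential weighting function are not specified in this excerpt), and all waveforms below are synthetic:

```python
import numpy as np

def morphology_weight(mmw: np.ndarray, template: np.ndarray,
                      mode: str = "binary", criterion: float = 0.5) -> float:
    """Weight an MMW's AUC by its resemblance to a template waveform.

    The correlation-to-weight mappings below are illustrative guesses,
    not the study's exact functions.
    """
    r = np.corrcoef(mmw, template)[0, 1]
    if mode == "binary":
        return 1.0 if r > criterion else 0.0
    # "exponential": smoothly de-emphasize poorly matching waveforms
    return float(np.exp(r - 1.0)) if r > 0 else 0.0

t = np.linspace(0, 1, 100)
template = np.sin(2 * np.pi * t)              # stand-in MMW template
noisy_match = template + 0.2 * np.cos(7 * t)  # clearly resembles the template
flat_noise = np.cos(23 * t)                   # random-looking fluctuation

print(morphology_weight(noisy_match, template))  # 1.0
print(morphology_weight(flat_noise, template))   # 0.0
```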
3.4. Correlation analysis
Speech test vs. psychoacoustics: Only conditions without evident floor or ceiling effects in the group mean scores were included in the correlation analysis, namely the 0 dB condition of the speech test, which was compared to the AM detection scores of the psychometric function at 10% and 12.5% AMD and to the BTs. Potential linear relationships between experimental measures were investigated with Pearson's correlation coefficients. No significant correlations were found between speech scores at 0 dB SNR and BTs (r = 0.15, p-value = .638) or AM detection at 10% AMD (r = 0.24, p-value = .461). Comparison of the speech scores at 0 dB SNR and the behavioral AM detection scores at 12.5% AMD (Fig. 4C) suggested a moderately strong linear relationship between the two measures (r = 0.65, p-value = .021), but the correlation did not remain significant after adjusting the significance level to 0.017 with the conservative Bonferroni correction for multiple comparisons.
Behavioral vs. neural thresholds: Individuals' BTs and NTs for AM detection showed statistically significant correlations for a range of tested intersection values (Table 3). For intersection values of 0.20 or below and of 0.50 and above, NT calculation failed for one or more participants. To validate the applied procedure, the correlation analysis was carried out for 300 permutations of different sub-sets of 56 standards and for an example intersection value of 0.35. The distributions of Pearson's correlation coefficients and their respective p-values across the 300 permutations are shown in Fig. A1 (Appendix). The central tendency of the right-skewed distributions of the correlation coefficients and p-values can be expressed by the median. For the specified intersection value of 0.35, median correlation coefficients of r = 0.603 (p-value = .018) and r = 0.559 (p-value = .033) were obtained for the linear relationship between BTs and NTs based on binary and exponential weighting, respectively. The non-parametric Mann-Whitney test showed that the distributions of Pearson's correlation coefficients between BTs and NTs across the 300 permutations differed significantly between binary and exponential morphology weighting (p < .001).

Fig. 5. (A) Group mean mismatch waveform (MMW) data for the four tested amplitude modulation depths (AMDs) with indicated region of interest (110 ms–310 ms); (B) Boxplots visualizing variation in group data for weighted area-under-the-curve (AUC) values (shown for binary weighting) across AMDs; (C) Example correlations between behavioral and neural thresholds (BT and NT, respectively) for the binary weighting ('o', dashed line) and exponential weighting ('+', solid line) approaches. NTs were calculated with an intersection value of 0.35 and all standard epochs were averaged to obtain the MMW.

Table 3
Overview of the correlations between neural thresholds (NTs) and behavioral thresholds. The NTs included in this analysis were estimated based on the reported intersection values (IVs), and all standard epochs were averaged to calculate the standard response. For high (0.50) and low (≤0.20) IVs, NT calculation failed for some participants; the number of affected participants is given in the 'no NT' columns. The IV 0.35 (bold) indicates the IV for which the correlation analysis was additionally carried out for 300 permutations of randomly chosen subsets of standard epochs (see Fig. A1, Appendix).

         Binary                      Exponential
IV       r       p      no NT       r       p      no NT
0.10     0.593   .026   1           0.562   .037   1
0.15     0.615   .019   1           0.579   .030   1
0.20     0.636   .015   1           0.609   .021   1
0.25     0.588   .021   0           0.558   .031   0
0.30     0.612   .015   0           0.580   .024   0
0.325    0.625   .013   0           0.592   .020   0
0.35     0.639   .010   0           0.605   .017   0
0.40     0.386   .156   0           0.339   .216   0
0.45     0.647   .009   0           0.611   .016   0
0.50     0.651   .012   1           0.603   .023   1
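The permutation validation can be sketched as follows. The behavioral thresholds and the NT re-estimation are placeholder stand-ins (`nts_from_subset` abstracts the actual subset-averaging and threshold pipeline); only the structure of the analysis — 300 subset permutations, one Pearson r each, summarized by the median — follows the text:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

# Hypothetical behavioral thresholds (AMD, %) for 15 participants.
bts = rng.uniform(8, 20, 15)

def nts_from_subset(perm_index):
    """Placeholder for re-estimating NTs from one random subset of the
    56 standard epochs; here simply BTs plus noise, for illustration."""
    return 2 * bts + rng.normal(0, 6, 15)

# Distribution of correlation coefficients across 300 subset permutations.
rs = np.array([pearsonr(bts, nts_from_subset(i))[0] for i in range(300)])
print(float(np.median(rs)))  # the median summarizes the r distribution
```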
4. Discussion

There were five main findings: (1) MMWs can be elicited by change detection from unmodulated to modulated noises. (2) The MMW amplitude decreases with decreasing AMD. (3) Morphology weighting of MMWs allows the objective estimation of NTs. (4) NTs are significantly correlated with BTs. (5) No significant correlations were observed between AM detection and speech scores at 0 dB SNR.

4.1. Objective measure

4.1.1. Mismatch waveforms
To our knowledge, this is the first application of the MMW to evaluate the perception of acoustic changes related to AM detection. Previous studies have assessed the MMW as a measure of auditory temporal resolution via gap detection (Desjardins et al., 1999; Trainor et al., 2001; Uther et al., 2003). Overall, the successful elicitation of neural responses provides evidence for the feasibility of applying the MMW to this type of acoustic change.

The morphology weighting approach reduced AUC values for the lower AMDs (Fig. 3B and D), showing that MMWs at lower AMDs do not strongly resemble the template, which is in line with the observed random fluctuations of the individual MMWs at 25% and 50% AMD. Fig. 6 compares the psychometric function of behavioral AM detection with the weighted AUCs of the neural responses at the corresponding AMDs, highlighting the difference in response development with decreasing AMD. AUC values declined at much higher AMDs than behavioral responses did. The psychometric function showed ceiling effects for AMDs of 25% and above and deteriorating performance for AMDs below 25%, which agrees with the literature for 4 Hz AM detection (Han and Dimitrijevic, 2015). In contrast, the MMW amplitudes decreased from 100% to 75% AMD, and for an AMD of 50% no clear MMW was detectable in the group mean data. Similarly, Han and Dimitrijevic (2015) showed declines in ACC amplitudes between 100% and 50% AMD, and only a weakly discernible ACC for 25% AMD at an AM rate of 4 Hz.

Fig. 6. Comparison of the group mean psychometric function of AM detection (dashed line, left y-axis) and the group mean weighted area-under-the-curve (AUC) values obtained from the neurophysiological data (solid line, right y-axis) for different AMDs.

MMWs have been elicited when acoustic changes were only just perceptible (Kraus et al., 1993) or even consciously imperceptible (Allen et al., 2000). In light of this, the disappearance of the MMW for AMDs at which behavioral performance showed ceiling effects raises questions about the underlying mechanisms that result in MMW elicitation for this acoustic change type. Auditory information may have to be accumulated over a longer time period to result in AM detection for low AMDs than for high AMDs. Increasing reaction times with decreasing AMD support this interpretation (Han and Dimitrijevic, 2015). In the case of the MMW, prolonged temporal integration of stimulus information for low AMDs may result in temporally jittered neural change detection, preventing a clear MMW.
4.1.2. Neural thresholds
The morphology weighting approach reduced AUC values at low
AMDs based on dissimilarities with the individual MMW template.
This allowed objective NT calculation, where the NT is determined
as the AMD at which the weighted AUC curve drops below a
specified intersection value. Correlation analyses revealed significant correlations between BTs and NTs for both weighting methods,
binary and exponential weighting, and for a range of intersection
values (Table 3). The intersection value of 0.4 constitutes an
exception with non-significant correlation results. Closer inspection of the data reveals that this is due to an outlier with a very low
NT (participant ‘NH2’). The ‘best’ intersection value of 0.35 was
selected to further validate the analysis procedure by repeating the
analysis with 300 permutations of randomly chosen sub-sets of 56
standards. The resulting distributions of correlation coefficients
and p-values showed that results are repeatable and do not
strongly rely on the choice of standard epochs. Non-parametric
statistical analysis of the distributions showed that binary
weighting yields significantly higher correlations than exponential
weighting, which is likely a result of the more stringent rejection of
MMWs with poor resemblance to the template.
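The NT read-out described above — the AMD at which the weighted AUC curve drops below the intersection value — admits a simple implementation. The linear interpolation between tested AMDs is an assumption (the exact read-out procedure is not spelled out here), and the example values are hypothetical:

```python
import numpy as np

def neural_threshold(amds, weighted_auc, intersection=0.35):
    """Estimate the neural threshold (NT) as the AMD at which the
    weighted-AUC-vs-AMD curve drops below the intersection value,
    using linear interpolation between tested AMDs.

    amds must be in descending order (e.g. [100, 75, 50, 25]); returns
    None when no crossing exists (the 'NT calculation failed' case).
    """
    amds = np.asarray(amds, dtype=float)
    auc = np.asarray(weighted_auc, dtype=float)
    if auc[0] < intersection:
        return None  # already below the criterion at the largest AMD
    for i in range(1, len(amds)):
        if auc[i] < intersection:
            # interpolate between the two bracketing AMDs
            frac = (auc[i - 1] - intersection) / (auc[i - 1] - auc[i])
            return float(amds[i - 1] + frac * (amds[i] - amds[i - 1]))
    return None  # never drops below the criterion

# Hypothetical weighted AUC values for one participant:
print(neural_threshold([100, 75, 50, 25], [0.9, 0.5, 0.0, 0.0]))  # ~67.5
```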
4.1.3. Intersection value
As stated in Section 4.1.2, the statistical significance of the correlation between BTs and NTs was not strongly dependent on the
intersection value. To objectively determine a specific intersection
value for data analysis, different approaches may be applied. Unless
a thresholding paradigm is employed, task difficulty levels in psychoacoustic paradigms are commonly selected with the goal of
deriving a measure of performance between floor and ceiling. Such
difficulty levels provide a means of comparing task performance
across participants and a means for performing correlation analysis
across paradigms. The 75% AMD fulfils these requirements for NTs:
some participants showed clear waveforms for the 75% AMD condition while in others MMW morphologies were poorer. At the 50%
AMD level, no clear MMWs were recognizable. Given the suitability
of the 75% AMD level for correlation analysis, one could determine
the intersection value as the group mean AUC value at this level.
This would provide a value of 0.325 (binary weighting) or 0.323
(exponential weighting), which lies in the range of the intersection
values with the highest correlations (see Table 3).
4.1.4. AM loudness cues
The challenge of loudness balancing AM stimuli is usually
overcome by energy adjustment and/or level roving for behavioral
testing (Viemeister, 1979; Bacon and Viemeister, 1985; Shen and
Richards, 2013; Shen, 2014). Unfortunately, level roving cannot be
employed in MMW paradigms as it may result in MMWs elicited
purely through intensity change detection (von Wedel, 1982;
Martin and Boothroyd, 2000; Harris et al., 2007). The loudness
may change with the overall presentation level, the modulation
rate (Zhang and Zeng, 1997; Moore et al., 1999) and the AMD
(Moore et al., 1999). Despite energy adjustment of AM stimuli,
subjective perception of loudness differences cannot be prevented
without individual behavioral loudness balancing, which would be
required for each tested AMD.
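The energy adjustment mentioned above can be sketched as follows. The sampling rate and the Gaussian noise carrier are placeholders (the study's LTASS-weighted carrier is not reproduced); RMS equalization removes the level increase that sinusoidal modulation would otherwise introduce (a factor of sqrt(1 + m²/2) for an independent carrier):

```python
import numpy as np

fs = 44100   # sampling rate (Hz); assumed, not from the paper
dur = 1.0    # stimulus duration (s)
fm = 8.0     # modulation rate (Hz)
m = 0.5      # modulation depth (50% AMD)

rng = np.random.default_rng(0)
t = np.arange(int(fs * dur)) / fs
carrier = rng.standard_normal(t.size)  # placeholder noise carrier

modulated = carrier * (1.0 + m * np.sin(2 * np.pi * fm * t))

def rms(x):
    return np.sqrt(np.mean(x ** 2))

# Energy adjustment: scale the modulated stimulus back to the
# carrier's RMS so both stimuli have equal energy.
modulated *= rms(carrier) / rms(modulated)

print(abs(rms(modulated) - rms(carrier)) < 1e-9)  # True: equal energy
```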
Moore et al. (1999) reported an average difference of approximately 1.5 dB in the RMS level required to achieve equal loudness
for unmodulated and modulated speech-shaped noise at a modulation rate of 8 Hz and with 100% AMD. For 50% AMD, the RMS-level
difference decreased to less than 0.5 dB. Neurophysiological studies
have reported ACC responses elicited by intensity changes of 2 dB in
a vowel change stimulus (Martin and Boothroyd, 2000) and for
pure tone intensity increments (Harris et al., 2007). These findings
do not support the interpretation that unwanted overall loudness
cues between modulated and unmodulated stimuli had a strong
influence on the MMW, but some influence cannot be ruled out.
4.1.5. Limitations
Some limitations of this study should be acknowledged. The use
of only 56 deviant presentations for each condition may be a confounding factor in NT estimation. Kraus et al. (1993) averaged
neural responses for 200 deviant presentations to show neural
change detection near the perception threshold. Increasing the
number of deviant trials would likely have a positive impact on the
SNR of the acquired neural responses. However, this study included
four acoustic change conditions (four AMDs), compared to one
acoustic change type in Kraus et al. (1993), which required a fine
balance between the number of deviant presentations and participant fatigue.
The low number of recording channels can be an advantage or a
limitation. The chosen single-channel setup is clinically friendly,
which is important for future extension to clinical cohorts. However, it also introduces the possibility of slight misplacement of the
recording electrode, resulting in altered MMW amplitudes across
participants, but not influencing recordings within participants.
The discrepancy between the magnitudes of BTs and NTs, despite significant correlations, needs to be explained and may be due to several factors: (1) the difference in experimental paradigm may in part account for a threshold difference (3AFC discrimination is easier than single-interval discrimination). (2) MMW amplitudes are known to decrease with increasing task difficulty; therefore, for the 50% and 25% AMD conditions the MMW may exist but not be distinguishable from the noise floor. The SNR of the EEG data can be improved by increasing the number of deviant repetitions, potentially resulting in distinguishable MMWs at low AMDs, which would in turn lead to lower NTs and decrease the gap between BTs and NTs. Future research should identify the origin of the large discrepancy between BTs and NTs.
4.2. Psychoacoustics
The AM detection thresholds found in this study were higher
than those commonly reported in the literature, although similar
AM detection thresholds were recently presented for 4-Hz AM,
with an average threshold of 13% for a stimulus duration of 1 s,
equating to 4 AM cycles, as in the present study (Han and
Dimitrijevic, 2015). Modulation rates below 10 Hz yield constant
AM thresholds, provided the stimulus duration is chosen sufficiently long with regard to the number of AM cycles at a given AM
rate (Viemeister, 1979; Bacon and Viemeister, 1985; Sheft and Yost,
1990). Previous studies investigating AM detection thresholds for
an 8-Hz modulation rate and broadband noise carriers for young
normal-hearing cohorts reported mean thresholds of approximately 8% (Jin et al., 2014) and 5%–6% (Viemeister, 1979; Bacon and
Viemeister, 1985; Takahashi and Bacon, 1992). In all reported
studies broadband noise stimuli, with and without AM, and with a
duration of 500 ms were presented monaurally via headphones
and reported thresholds provide an estimate of the AMD required
for 70.7% correct identification. Presentation levels differed across
studies, but do not significantly affect AM detection thresholds
unless presentation levels are very low (Viemeister, 1979). A potential cause for the higher thresholds reported here is the carrier
bandwidth. A reduced carrier bandwidth is associated with poorer
AM detection thresholds (Bacon and Viemeister, 1985; Bacon and
Gleitman, 1992; Strickland and Viemeister, 1997). The LTASS
weighted noise carrier in this study emphasized frequencies below
1 kHz while higher frequency content was lower in level. Similar to
band-limited carriers, this LTASS weighted carrier may result in
poorer AM detection thresholds than for broadband noise carriers.
Higher average AM detection thresholds may also be caused by the
level roving (±3 dB), as the loudness changes may distract from the
task at hand, particularly at low AMDs (Chatterjee and Oberzut,
2011).
4.3. Speech identification and AM detection
The lack of significant correlations between speech-in-noise
recognition and behavioral AM detection abilities is not surprising and is in line with the literature. Previous studies in normal-hearing cohorts have already demonstrated the difficulty in
teasing out relationships between speech measures and various
psychoacoustic measures (Strouse et al., 1998; Watson and Kidd,
2002; Goldsworthy et al., 2013). Watson and Kidd (2002) proposed that speech-in-noise recognition in normal-hearing cohorts
largely depends on a combination of pattern recognition abilities
and the ability to infer the meaning of degraded speech from
splinters of information. In contrast to this, deficits in spectral and
temporal auditory processing caused by hearing impairment may
negatively impact speech processing, which is supported by
significant correlations between psychoacoustic and speech measures (Dreschler and Plomp, 1980, 1985; Festen and Plomp, 1983;
Glasberg and Moore, 1989). Moreover, various studies for CI cohorts have shown significant correlations between speech measures and AM detection (Cazals et al., 1994; Fu, 2002; Luo et al.,
2008; Won et al., 2011; Gnansia et al., 2014; De Ruiter et al.,
2015), upholding the hypothesis that AM detection may play a
role in speech processing in the case of electric hearing. As stated
previously, the aim of the test battery design presented in this study was its future implementation in a CI user cohort; thus, the inclusion of the speech measure was deemed justified.
5. Conclusions
The morphology weighting procedure allowed the calculation of
NTs and may provide a useful analysis tool in CAEP research at the
individual level. Significant correlations between BTs and NTs
encourage further research into the application of the MMW as an
objective measure of low-rate AM detection. However, future work
should address the discrepancy between the magnitudes of BTs and
NTs.
Acknowledgments
We would like to thank all participants who volunteered their
time for the study. This work was supported by funding from
Trinity Centre for Bioengineering and Cochlear Ltd.
Appendix
Fig. A1. Distributions of Pearson's correlation coefficient r (left) between behavioral and neural thresholds of amplitude modulation detection and their respective p-values (right)
based on binary (top) and exponential (bottom) weighting functions for the morphology weighting. Distributions were based on 300 permutations of randomly chosen sub-sets of
56 standard epochs for the calculation of the mismatch waveform. The median of each distribution is indicated by the circle.
References
Allen, J., Kraus, N., Bradlow, A., 2000. Neural representation of consciously imperceptible speech sound differences. Percept. Psychophys. 62, 1383e1393.
Bacon, S.P., Viemeister, N.F., 1985. Temporal modulation transfer functions in
normal-hearing and hearing-impaired listeners. Int. J. Audiol. 24, 117e134.
Bacon, S.P., Gleitman, R.M., 1992. Modulation detection in subjects with relatively
flat hearing losses. J. Speech Hear. Res. 35, 642e653.
Byrne, D., et al., 1994. An international comparison of long-term average speech
spectra. J. Acoust. Soc. Am. 96, 2108e2120.
Cazals, Y., Pelizzone, M., Saudan, O., Boex, C., 1994. Low-pass filtering in amplitude
modulation detection associated with vowel and consonant identification in
subjects with cochlear implants. J. Acoust. Soc. Am. 96, 2048e2054.
Chatterjee, M., Oberzut, C., 2011. Detection and rate discrimination of amplitude
modulation in electrical hearing. J. Acoust. Soc. Am. 130, 1567e1580.
De Ruiter, A.M., Debruyne, J.A., Chenault, M.N., Francart, T., Brokx, J.P., 2015.
Amplitude modulation detection and speech recognition in late-implanted
prelingually and postlingually deafened cochlear implant users. Ear Hear. 36,
557e566.
Desjardins, R.N., Trainor, L.J., Hevenor, S.J., Polak, C.P., 1999. Using mismatch negativity to measure auditory temporal resolution thresholds. Neuroreport 10,
2079e2082.
Dimitrijevic, A., Alsamri, J., John, M.S., Purcell, D., George, S., Zeng, F.G., 2016. Human
envelope following responses to amplitude modulation: effects of aging and
modulation depth. Ear Hear. 37, e322e335.
Dreschler, W.A., Plomp, R., 1980. Relation between psychophysical data and speech
perception for hearing-impaired subjects. I. J. Acoust. Soc. Am. 68, 1608e1615.
Dreschler, W.A., Plomp, R., 1985. Relations between psychophysical data and speech
perception for hearing-impaired subjects. II. J. Acoust. Soc. Am. 78, 1261e1270.
Drullman, R., 1995. Temporal envelope and fine structure cues for speech intelligibility. J. Acoust. Soc. Am. 97, 585e592.
Edwards, E., Chang, E.F., 2013. Syllabic (approximately 2-5 Hz) and fluctuation
(approximately 1-10 Hz) ranges in speech and auditory processing. Hear. Res.
305, 113e134.
Festen, J.M., Plomp, R., 1983. Relations between auditory functions in impaired
hearing. J. Acoust. Soc. Am. 73, 652e662.
Fishman, Y.I., 2014. The mechanisms and meaning of the mismatch negativity. Brain
Topogr. 27, 500e526.
Fu, Q.J., 2002. Temporal processing and speech recognition in cochlear implant
users. Neuroreport 13, 1635e1639.
Glasberg, B.R., Moore, B.C., 1989. Psychoacoustic abilities of subjects with unilateral
and bilateral cochlear hearing impairments and their relationship to the ability
to understand speech. Scand. Audiol. Suppl. 32, 1e25.
Gnansia, D., Lazard, D.S., Leger, A.C., Fugain, C., Lancelin, D., Meyer, B., Lorenzi, C.,
2014. Role of slow temporal modulations in speech identification for cochlear
implant users. Int. J. Audiol. 53, 48e54.
Goldsworthy, R.L., Delhorne, L.A., Braida, L.D., Reed, C.M., 2013. Psychoacoustic and
phoneme identification measures in cochlear-implant and normal-hearing listeners. Trends Amplif. 17, 27e44.
Hall, J.W., Swanepoel, D.W., 2010. Rationale for objective hearing assessment. In:
Objective Assessment of Hearing. Plural Pub Inc, pp. 1e5.
Han, J.H., Dimitrijevic, A., 2015. Acoustic change responses to amplitude modulation: a method to quantify cortical temporal processing and hemispheric
asymmetry. Front. Neurosci. 9, 38.
Harris, K.C., Mills, J.H., Dubno, J.R., 2007. Electrophysiologic correlates of intensity
discrimination in cortical evoked potentials of younger and older adults. Hear.
Res. 228, 58e68.
He, C., Hotson, L., Trainor, L.J., 2009. Maturation of cortical mismatch responses to
occasional pitch change in early infancy: effects of presentation rate and
magnitude of change. Neuropsychologia 47, 218e229.
Hopkins, K., Moore, B.C.J., 2009. The contribution of temporal fine structure to the
intelligibility of speech in steady and modulated noise. J. Acoust. Soc. Am. 125,
442e446.
Jin, S.H., Liu, C., Sladen, D.P., 2014. The effects of aging on speech perception in
noise: comparison between normal-hearing and cochlear-implant listeners.
J. Am. Acad. Audiol. 25, 656e665.
Katayama, J., Polich, J., 1998. Stimulus context determines P3a and P3b. Psychophysiology 35, 23e33.
Kimura, M., Takeda, Y., 2013. Task difficulty affects the predictive process indexed by
visual mismatch negativity. Front. Hum. Neurosci. 7, 267.
Kraus, N., McGee, T., Micco, A., Sharma, A., Carrell, T., Nicol, T., 1993. Mismatch
negativity in school-age children to speech stimuli that are just perceptibly
different. Electroencephalogr. Clin. Neurophysiol. Evoked Potentials Section 88,
123e130.
Levitt, H., 1971. Transformed up-down methods in psychoacoustics. J. Acoust. Soc.
Am. 49, 467e477.
Lopez Valdes, A., Mc Laughlin, M., Viani, L., Walshe, P., Smith, J., Zeng, F.G.,
Reilly, R.B., 2014. Objective assessment of spectral ripple discrimination in
cochlear implant listeners using cortical evoked responses to an oddball
paradigm. PloS One 9, e90044.
Luke, R., Van Deun, L., Hofmann, M., van Wieringen, A., Wouters, J., 2015. Assessing
temporal modulation sensitivity using electrically evoked auditory steady state
responses. Hear. Res. 324, 37e45.
Luo, X., Fu, Q.J., Wei, C.G., Cao, K.L., 2008. Speech recognition and temporal
amplitude modulation processing by Mandarin-speaking cochlear implant
users. Ear Hear. 29, 957e970.
Macmillan, N.A., Kaplan, H.L., 1985. Detection theory analysis of group data: estimating sensitivity from average hit and false-alarm rates. Psychol. Bull. 98,
185e199.
Manju, V., Gopika, K.K., Arivudai Nambi, P.M., 2014. Association of auditory steady
state responses with perception of temporal modulations and speech in noise.
ISRN Otolaryngol. 2014, 374035.
Martin, B.A., Boothroyd, A., 2000. Cortical, auditory, evoked potentials in response
to changes of spectrum and amplitude. J. Acoust. Soc. Am. 107, 2155e2161.
Mc Laughlin, M., Lopez Valdes, A., Reilly, R.B., Zeng, F.G., 2013. Cochlear implant
artifact attenuation in late auditory evoked potentials: a single channel
approach. Hear. Res. 302, 84e95.
Moore, B.C.J., Vickers, D.A., Baer, T., Launer, S., 1999. Factors affecting the loudness of
modulated sounds. J. Acoust. Soc. Am. 105, 2757e2772.
Näätänen, R., Paavilainen, P., Rinne, T., Alho, K., 2007. The mismatch negativity (MMN) in basic research of central auditory processing: a review. Clin. Neurophysiol. 118, 2544–2590.
Oxenham, A.J., Kreft, H.A., 2014. Speech perception in tones and noise via cochlear
implants reveals influence of spectral resolution on temporal processing.
Trends Hear. 18, 1e14.
Picton, T., 2013. Hearing in time: evoked potential studies of temporal processing.
Ear Hear. 34, 385e401.
Purcell, D.W., John, S.M., Schneider, B.A., Picton, T.W., 2004. Human temporal
auditory acuity as assessed by envelope following responses. J. Acoust. Soc. Am.
116, 3581e3593.
Rajan, G., et al., 2017. Hearing preservation cochlear implantation in children: the
HEARRING Group consensus and practice guide. Cochlear Implants Int. 1e13.
Shannon, R.V., Zeng, F.G., Kamath, V., Wygonski, J., Ekelid, M., 1995. Speech recognition with primarily temporal cues. Science New York NY 270, 303e304.
Sheft, S., Yost, W.A., 1990. Temporal integration in amplitude modulation detection.
J. Acoust. Soc. Am. 88, 796e805.
Shen, Y., 2014. Gap detection and temporal modulation transfer function as
behavioral estimates of auditory temporal acuity using band-limited stimuli in
young and older adults. J. Speech Lang. Hear. Res. JSLHR 57, 2280e2292.
Shen, Y., Richards, V.M., 2013. Temporal modulation transfer function for efficient
assessment of auditory temporal resolution. J. Acoust. Soc. Am. 133, 1031e1042.
Spahr, A.J., Dorman, M.F., Litvak, L.M., Van Wie, S., Gifford, R.H., Loizou, P.C.,
Loiselle, L.M., Oakes, T., Cook, S., 2012. Development and validation of the AzBio
sentence lists. Ear Hear. 33, 112e117.
Strickland, E.A., Viemeister, N.F., 1997. The effects of frequency region and bandwidth on the temporal modulation transfer function. J. Acoust. Soc. Am. 102,
1799e1810.
Strouse, A., Ashmead, D.H., Ohde, R.N., Grantham, D.W., 1998. Temporal processing
in the aging auditory system. J. Acoust. Soc. Am. 104, 2385e2399.
Takahashi, G.A., Bacon, S.P., 1992. Modulation detection, modulation masking, and
speech understanding in noise in the elderly. J. Speech Hear. Res. 35, 1410e1421.
Tiitinen, H., May, P., Reinikainen, K., Näätänen, R., 1994. Attentive novelty detection in humans is governed by pre-attentive sensory memory. Nature 372, 90–92.
Trainor, L.J., Samuel, S.S., Desjardins, R.N., Sonnadara, R.R., 2001. Measuring temporal resolution in infants using mismatch negativity. Neuroreport 12,
2443e2448.
Uther, M., Jansen, D.H., Huotilainen, M., Ilmoniemi, R.J., Näätänen, R., 2003. Mismatch negativity indexes auditory temporal resolution: evidence from event-related potential (ERP) and event-related field (ERF) recordings. Cognit. Brain Res. 17, 685–691.
Viemeister, N.F., 1979. Temporal modulation transfer functions based upon modulation thresholds. J. Acoust. Soc. Am. 66, 1364e1380.
von Wedel, H., 1982. Cortical evoked potentials in response to brief modulation of
signal amplitude. Experiments on auditory temporal resolution. Arch. Oto
Rhino Laryngol. 234, 235e243.
Watson, C.S., Kidd, G.R., 2002. On the lack of association between basic auditory
abilities, speech processing, and other cognitive skills. Semin. Hear. 23, 83e94.
Won, J.H., Drennan, W.R., Nie, K., Jameyson, E.M., Rubinstein, J.T., 2011. Acoustic
temporal modulation detection and speech perception in cochlear implant
listeners. J. Acoust. Soc. Am. 130, 376e388.
Xu, L., Thompson, C.S., Pfingst, B.E., 2005. Relative contributions of spectral and
temporal cues for phoneme recognition. J. Acoust. Soc. Am. 117, 3255e3267.
Zhang, C., Zeng, F.G., 1997. Loudness of dynamic stimuli in acoustic and electric
hearing. J. Acoust. Soc. Am. 102, 2925e2934.