EP3332558B1 - Event detection for playback management in an audio device - Google Patents
- Publication number
- EP3332558B1 (application EP16763354.4A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- sound
- ambient sound
- field
- input signal
- ambient
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor ; Earphones; Monophonic headphones
- H04R1/1083—Reduction of ambient noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/002—Damping circuit arrangements for transducers, e.g. motional feedback circuits
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/81—Detection of presence or absence of voice signals for discriminating voice from music
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2410/00—Microphones
- H04R2410/05—Noise reduction with a separate noise microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
Definitions
- the field of representative embodiments of this disclosure relates to methods, apparatuses, or implementations concerning or relating to playback management in an audio device.
- Applications include, but are not limited to, detection of certain ambient events, such as near-field sound, proximity sound, and tonal alarms, using spatial processing based on signals received from multiple microphones.
- U.S. Pat. No. 8,804,974 teaches ambient event detection in a personal audio device which can then be used to implement an event-based modification of the playback content.
- the above-mentioned references also teach the use of microphones to detect various acoustic events.
- U.S. App. Ser. No. 14/324,286, filed on July 7, 2014 teaches using a speech detector as an event detector to adjust the playback signal during a conversation.
- one or more disadvantages and problems associated with existing approaches to event detection for playback management in a personal audio device may be reduced or eliminated.
- a method for processing audio information in an audio device is provided as defined in appended claim 1 and an integrated circuit for implementing at least a portion of an audio device is provided as defined in appended claim 2. Further advantageous aspects are defined in the dependent claims.
- systems and methods may use at least three different audio event detectors in an automatic playback management framework.
- Such audio event detectors for an audio device may include: a near-field detector that detects sound in the near-field of the audio device, such as when a user of the audio device (e.g., a user that is wearing or otherwise using the audio device) speaks; a proximity detector that detects sound in proximity to the audio device, such as when another person in proximity to the user of the audio device speaks; and a tonal alarm detector that detects acoustic alarms originating in the vicinity of the audio device.
- Figure 1 illustrates an example of a use case scenario wherein such detectors may be used in conjunction with a playback management system to enhance a user experience, in accordance with embodiments of the present disclosure.
- Figure 2 illustrates an example playback management system that modifies a playback signal based on a decision from an event detector 2, in accordance with embodiments of the present disclosure.
- Signal processing functionality in a processor 50 may comprise an acoustic echo canceller 1 that may cancel an acoustic echo that is received at microphones 52 due to an echo coupling between an output audio transducer 51 (e.g., loudspeaker) and microphones 52.
- the echo reduced signal may be communicated to event detector 2 which may detect one or more various ambient events, including without limitation a near-field event (e.g., including but not limited to speech from a user of an audio device) detected by near-field detector 3, a proximity event (e.g., including but not limited to speech or other ambient sound other than near-field sound) detected by proximity detector 4, and/or a tonal alarm event detected by alarm detector 5.
- an event-based playback control 6 may modify a characteristic of audio information (shown as "playback content" in Figure 2 ) reproduced to output audio transducer 51.
- Audio information may include any information that may be reproduced at output audio transducer 51, including without limitation, downlink speech associated with a telephonic conversation received via a communication network (e.g., a cellular network) and/or internal audio from an internal audio source (e.g., music file, video file, etc.).
- Figure 3 illustrates an example event detector, in accordance with embodiments of the present disclosure.
- the example event detector may comprise a voice activity detector 10, a music detector 9, a direction of arrival estimator 7, a near-field spatial information extractor 8, a background noise level estimator 11, and decision fusion logic 12 that uses information from voice activity detector 10, music detector 9, direction of arrival estimator 7, near-field spatial information extractor 8, and background noise level estimator 11 to detect audio events, including without limitation, near-field sound, proximity sound other than near-field sound, and a tonal alarm.
- Near-field detector 3 may detect near-field sounds including speech. When such near-field sound is detected, it may be desirable to modify audio information reproduced to output audio transducer 51, as detection of near-field sound may indicate that a user is participating in a conversation. Such near-field detection may need to be able to detect near-field sound in acoustically noisy conditions and be resilient to false detection of near-field sounds in very diverse background noise conditions (e.g., background noise in a restaurant, acoustical noise when driving a car, etc.). As described in greater detail below, near-field detection may require spatial sound processing using a plurality of microphones 52. In some embodiments, such near-field sound detection may be implemented in a manner identical or similar to that described in U.S. Pat. No. 8,565,446 and/or U.S. App. Ser. No. 13/199,593.
- Proximity detector 4 may detect ambient sounds (e.g., speech from a person in proximity to a user, background music, etc.) other than near-field sounds. As described in greater detail below, because it may be difficult to differentiate proximity sounds from non-stationary background noise and background music, proximity detector 4 may utilize a music detector and noise level estimation to disable proximity detection in order to avoid a poor user experience due to false detection of proximity sounds. In some embodiments, such proximity sound detection may be implemented in a manner identical or similar to that described in U.S. Pat. No. 8,126,706, U.S. Pat. No. 8,565,446, and/or U.S. App. Ser. No. 13/199,593.
- Tonal alarm detector 5 may detect tonal alarms (e.g., sirens) proximate to an audio device. To provide the best user experience, it may be desirable that tonal alarm detector 5 ignore certain alarms (e.g., feeble or low-volume alarms). As described in greater detail below, tonal alarm detection may require spatial sound processing using a plurality of microphones 52. In some embodiments, such tonal alarm detection may be implemented in a manner identical or similar to that described in U.S. Pat. No. 8,126,706 and/or U.S. App. Ser. No. 13/199,593.
- FIG. 4 illustrates functional blocks of a system for deriving near-field spatial statistics that may be used to detect audio events, in accordance with embodiments of the present disclosure.
- Level analysis 41 may be performed on signals received by microphones 52 by estimating the inter-microphone level difference ( imd ) between the near and far microphones (e.g., as described in U.S. App. Ser. No. 13/199,593).
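As a rough illustration of the inter-microphone level difference statistic, the following sketch compares frame energies from a near and a far microphone. The function name and the decibel formulation are illustrative assumptions, not taken from the referenced application:

```python
import numpy as np

def inter_mic_level_difference(near_frame, far_frame, eps=1e-12):
    """Estimate the inter-microphone level difference (imd), in dB, between
    a frame from the microphone nearest the user's mouth and a frame from a
    farther microphone. Near-field sound decays quickly with distance, so it
    tends to produce a larger level difference than far-field sound."""
    near_power = np.mean(near_frame ** 2)
    far_power = np.mean(far_frame ** 2)
    return 10.0 * np.log10((near_power + eps) / (far_power + eps))
```

A near-field talker might yield several dB of imd, while diffuse far-field sound yields a value near zero.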
- Cross-correlation analysis 13 may be performed on signals received by microphones 52 to obtain the direction of arrival information DOA of ambient sound that impinges on microphones 52 (e.g., as described in U.S. Pat. No. 8,565,446 ).
- a maximum normalized correlation value normMaxCorr may also be obtained (e.g., as described in U.S. App. Ser. No. 13/199,593 ).
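A minimal sketch of deriving a direction-of-arrival estimate and the maximum normalized correlation value from a two-microphone frame follows. The function name, the array geometry, and the lag-search details are illustrative assumptions; the referenced patents describe their own formulations:

```python
import numpy as np

def doa_stats(mic1, mic2, fs, mic_spacing_m, c=343.0):
    """Cross-correlate two microphone signals to estimate the inter-mic
    time delay (hence a direction-of-arrival angle) and the maximum
    normalized correlation value (normMaxCorr). A high normMaxCorr
    suggests a single coherent source; low values suggest diffuse sound."""
    mic1 = mic1 - np.mean(mic1)
    mic2 = mic2 - np.mean(mic2)
    corr = np.correlate(mic1, mic2, mode="full")
    norm = np.sqrt(np.sum(mic1 ** 2) * np.sum(mic2 ** 2)) + 1e-12
    lags = np.arange(-len(mic2) + 1, len(mic1))
    # Only delays that are physically possible for the given spacing
    max_lag = int(round(fs * mic_spacing_m / c))
    valid = np.abs(lags) <= max_lag
    best = np.argmax(np.abs(corr[valid]))
    delay = lags[valid][best] / fs
    norm_max_corr = np.abs(corr[valid][best]) / norm
    # sin(theta) = c * delay / spacing, clipped to a valid range
    sin_theta = np.clip(c * delay / mic_spacing_m, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta)), norm_max_corr
```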
- Voice activity detector 10 may detect the presence of speech and generate a signal speechDet indicative of the presence or absence of speech in the ambient sound (e.g., as described in the probabilistic speech presence/absence approach of U.S. Pat. No. 7,492,889).
- Beamformers 15 may, based on signals from microphones 52, generate a near-field signal estimate and an interference signal estimate which may be used by a noise analysis 14 to determine a level of noise noiseLevel in the ambient sound and an interference to near-field signal ratio idr.
- a voice activity detector 36 may use the interference estimate to detect ( proxSpeechDet ) any speech signal that does not originate from the desired signal direction.
- Noise analysis 14 may be performed based on the direction of arrival estimate DOA by updating interference signal energy whenever the direction of arrival estimate DOA of the ambient sound is outside the acceptance angle of the near-field sound.
- the direction of arrival of the near-field sounds may be known a priori for a given microphone array configuration in the industrial design of a personal audio device.
- Figure 5 illustrates example fusion logic for detecting near-field sound, in accordance with embodiments of the present disclosure. As shown in Figure 5 , near-field speech may be detected when all the following criteria are satisfied:
- thresholds idrThres and imdTh may be dynamically adjusted based on a background noise level estimate.
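The exact fusion criteria of Figure 5 are not reproduced in this text. A hypothetical sketch combining the statistics named in this section (all threshold values and the function signature are illustrative assumptions) might look like:

```python
def detect_near_field(speech_det, doa_deg, norm_max_corr, idr_db, imd_db,
                      accept_angle_deg=30.0, corr_thres=0.7,
                      idr_thres_db=0.0, imd_thres_db=3.0):
    """Hypothetical fusion: flag near-field speech only when voice activity
    is present, the sound arrives within the acceptance angle, the
    correlation is high (coherent source), the interference-to-desired
    ratio (idr) is low, and the inter-microphone level difference (imd)
    is large. Thresholds here are illustrative, not from the patent."""
    return (speech_det
            and abs(doa_deg) <= accept_angle_deg
            and norm_max_corr >= corr_thres
            and idr_db <= idr_thres_db
            and imd_db >= imd_thres_db)
```

Consistent with the text, idrThres and imdTh could be made functions of the background noise level estimate rather than constants.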
- Proximity detection of proximity detector 4 may be different than near-field sound detection of near-field detector 3 because the signal characteristics of proximity speech may be very similar to ambient signals such as music and noise. Accordingly, proximity detector 4 must avoid false detection of proximity speech in order to achieve an acceptable user experience. To that end, a music detector 9 may be used to disable proximity detection whenever there is music in the background. Similarly, proximity detector 4 may be disabled whenever the background noise level is above a certain threshold. The threshold value for background noise may be determined a priori such that the likelihood of false detection below the threshold level is very low.
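The gating described above can be sketched as a simple enable flag; the threshold value and function name are illustrative assumptions:

```python
def proximity_detection_enabled(music_detected, noise_level_db,
                                noise_thres_db=-40.0):
    """Disable proximity detection whenever background music is detected
    or the background noise level exceeds a predetermined threshold,
    to avoid false detections of proximity speech. The threshold here
    is illustrative, not a value from the patent."""
    return (not music_detected) and noise_level_db < noise_thres_db
```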
- Figure 6 illustrates example fusion logic for detecting proximity sound (e.g., speech), in accordance with embodiments of the present disclosure. Moreover, there may exist many environmental noise sources that generate acoustic stimuli that are transient in nature.
- a spectral flatness measure (SFM) statistic from the music detector 9 may be used to distinguish speech from transient noises.
- the SFM may be tracked over a period of time, and the difference between the maximum and the minimum SFM value over the same duration, defined as sfmSwing, may be calculated.
- the value of sfmSwing may generally be small for transient noise signals, as the spectral content of these signals is wideband in nature and they tend to be stationary for a short interval of time (300-500 ms).
- the value of sfmSwing may be higher for speech signals because the spectral content of speech may vary faster than that of transient signals.
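A minimal sketch of the SFM and sfmSwing statistics, assuming a standard geometric-over-arithmetic-mean flatness definition (the framing and FFT details are illustrative assumptions):

```python
import numpy as np

def spectral_flatness(frame):
    """Spectral flatness measure (SFM): geometric mean over arithmetic mean
    of the power spectrum. Near 1 for noise-like spectra, near 0 for tonal."""
    power = np.abs(np.fft.rfft(frame)) ** 2 + 1e-12
    geo_mean = np.exp(np.mean(np.log(power)))
    return geo_mean / np.mean(power)

def sfm_swing(frames):
    """Difference between the max and min SFM over a window of frames.
    Speech changes spectral shape quickly, giving a large swing; short
    wideband transients stay roughly stationary, giving a small swing."""
    sfms = [spectral_flatness(f) for f in frames]
    return max(sfms) - min(sfms)
```

Frames whose spectral shape alternates (e.g., tonal then wideband) produce a larger swing than a steady wideband signal.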
- the music detector taught in U.S. Pat. No. 8,126,706 may be used to implement music detector 9 to detect the presence of background music.
- Another embodiment of the proximity speech detector is shown in Figure 7 , in accordance with embodiments of the present disclosure. According to this embodiment, proximity speech may be detected if the following conditions are met:
- the following conditions may be indicative of proximity speech, in order to improve the detection rate of proximity speech without increasing occurrence of a false alarm (e.g., due to background noise conditions):
- Tonal alarm detector 5 may be configured to detect alarm signals that are tonal in nature and whose sonic bandwidth is also narrow (e.g., siren, buzzer).
- the tonality of an ambient sound may be measured by splitting the time domain signal into multiple sub-bands through a time-to-frequency domain transformation, and the spectral flatness measure, depicted in Figure 6 as signal sfm[] generated by music detector 9, may be computed in each sub-band.
- Spectral flatness measures sfm[] from all sub-bands may be evaluated, and a tonal alarm event may be detected if the spectrum is flat in most sub-bands but not in all sub-bands.
- near-field spatial statistics 8 of Figure 3 may be used to differentiate the far-field alarm signals from near-field signals.
- Figure 8 illustrates example fusion logic for detecting a tonal alarm event (e.g. siren, buzzer), in accordance with embodiments of the present disclosure. As shown in Figure 8 , a tonal alarm event may be detected when all the following criteria are satisfied:
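The criteria list of Figure 8 is not reproduced in this text. A hypothetical sketch consistent with the description above (the flatness threshold and flat-fraction are illustrative assumptions) might be:

```python
def detect_tonal_alarm(sfm_per_band, near_field_detected,
                       flat_thres=0.5, min_flat_fraction=0.6):
    """Flag a tonal alarm when the spectral flatness measure is high
    (flat) in most sub-bands but at least one sub-band is tonal (low
    flatness), and the sound is not a near-field source, so that
    far-field alarms are differentiated from near-field signals.
    Threshold values are illustrative, not from the patent."""
    flat = [s >= flat_thres for s in sfm_per_band]
    mostly_flat = sum(flat) >= min_flat_fraction * len(sfm_per_band)
    has_tonal_band = not all(flat)
    return mostly_flat and has_tonal_band and not near_field_detected
```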
- FIG. 9 illustrates an example timing diagram illustrating hold-off and hang-over logic that may be applied on an instantaneous audio event detection signal to generate a validated audio event signal, in accordance with embodiments of the present disclosure.
- hold-off logic may generate a validated audio event signal in response to instantaneous detection of an audio event (e.g., near-field sound, proximity sound, tonal alarm event) being persistent for at least a predetermined time, while hang-over logic may continue to assert the validated audio event signal until the instantaneous detection of an audio event has ceased for a second predetermined time.
- the following pseudo-code may demonstrate application of the hold-off and hang-over logic to reduce false detection of audio events, in accordance with embodiments of the present disclosure.
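The referenced pseudo-code does not survive in this text. A minimal sketch of hold-off/hang-over validation consistent with the description above (counter lengths and naming are illustrative assumptions) might be:

```python
class EventValidator:
    """Hold-off / hang-over logic: the validated event signal asserts only
    after the instantaneous detection has been continuously present for
    `holdoff` frames, and it stays asserted until the detection has been
    continuously absent for `hangover` frames."""

    def __init__(self, holdoff, hangover):
        self.holdoff = holdoff
        self.hangover = hangover
        self.on_count = 0
        self.off_count = 0
        self.validated = False

    def update(self, instantaneous):
        if instantaneous:
            self.on_count += 1
            self.off_count = 0
            if self.on_count >= self.holdoff:
                self.validated = True   # persistent long enough: assert
        else:
            self.off_count += 1
            self.on_count = 0
            if self.validated and self.off_count >= self.hangover:
                self.validated = False  # absent long enough: de-assert
        return self.validated
```

Per the text, each detector could use its own holdoff and hangover values.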
- a validated event may be further validated before generating the playback mode switching control.
- the following pseudo-code may demonstrate application of the hold-off and hang-over logic for gracefully switching between a conversational mode (e.g., in which audio information reproduced to output audio transducer 51 may be modified in response to an audio event) and a normal playback mode (e.g., in which the audio information reproduced to output audio transducer 51 is unmodified).
- Figure 10 illustrates different audio event detectors having hold-off and hang-over logic, in accordance with embodiments of the present disclosure.
- the hold-off periods and/or hang-over periods for each detector may be set differently.
- the playback management may be controlled differently based on the type of detected event.
- a playback gain (and hence the audio information reproduced at output audio transducer 51) may be attenuated whenever one or more of the audio events is detected.
- a playback gain may be smoothed using a first order exponential averaging filter represented by the following pseudo-code:
- the smoothing parameters alpha and beta may be set at different values to adjust a gain ramping rate.
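The smoothing pseudo-code itself is not reproduced in this text. A sketch of such a first-order exponential averaging filter, assuming alpha governs the attenuation (attack) ramp and beta the release ramp, might be:

```python
def smooth_gain(current_gain, target_gain, alpha, beta):
    """First-order exponential smoothing of the playback gain: one
    coefficient is applied when ramping down toward a lower target (an
    audio event was detected) and another when ramping back up, so the
    two ramp rates can differ. The alpha/beta roles assumed here are an
    interpretation of the text, and values are design choices."""
    coeff = alpha if target_gain < current_gain else beta
    return coeff * current_gain + (1.0 - coeff) * target_gain
```

Calling this once per frame walks the applied gain toward the target at the chosen rate.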
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Circuit For Audible Band Transducer (AREA)
Description
- Personal audio devices have become prevalent, and they are used in diverse ambient environments. The headphones used in these audio devices have become advanced such that the occlusion caused by either passive or active methods prevents a user from keeping track of the ambient sound field external to the audio device. Even though increased isolation and uninterrupted listening are preferable in most cases, sometimes, for safety or an enhanced user experience, it is imperative that certain ambient events are heard by the user and an appropriate action is taken in response to that event. For example, if the user is listening to music through his or her headset and is interrupted by someone attempting to start a conversation, it may be difficult to maintain the conversation unless the user pauses the playback signal or reduces its volume. For example, U.S. Pat. No. 7,903,825 proposes an audio device in which the playback signal is modified depending on the ambient acoustic field. As another example, U.S. Pat. No. 8,804,974 teaches ambient event detection in a personal audio device which can then be used to implement an event-based modification of the playback content. The above-mentioned references also teach the use of microphones to detect various acoustic events. As a further example, U.S. App. Ser. No. 14/324,286, filed on July 7, 2014, teaches using a speech detector as an event detector to adjust the playback signal during a conversation. U.S. Pat. No. 8,565,446 teaches the use of a direction of arrival (DOA) estimate and an interference-to-desired (near-field) speech signal ratio estimate from a set of plural microphones to detect desired speech in the presence of non-stationary background noise to control a speech enhancement algorithm in a noise reduction echo cancellation (NREC) system. Similarly, U.S. App. Ser. No. 13/199,593 teaches that a maximum of the normalized cross-correlation statistic that is derived through a cross-correlation analysis of plural microphones may be an effective discriminator to detect near-field speech. A spectral flatness measure-based music detector for an NREC system is proposed in U.S. Pat. No. 8,126,706 to differentiate the presence of background noise from background music. Further examples of audio processing devices and methods are known from US 2014/270200 A1.
- In accordance with the teachings of the present disclosure, one or more disadvantages and problems associated with existing approaches to event detection for playback management in a personal audio device may be reduced or eliminated.
- A method for processing audio information in an audio device is provided as defined in appended claim 1, and an integrated circuit for implementing at least a portion of an audio device is provided as defined in appended claim 2. Further advantageous aspects are defined in the dependent claims.
- Technical advantages of the present disclosure may be readily apparent to one of ordinary skill in the art from the figures, description and claims included herein. The objects and advantages of the embodiments will be realized and achieved at least by the elements, features, and combinations particularly pointed out in the claims.
- It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the claims set forth in this disclosure.
- A more complete understanding of the present embodiments and certain advantages thereof may be acquired by referring to the following description taken in conjunction with the accompanying drawings, in which like reference numbers indicate like features, and wherein:
- Figure 1 illustrates an example of a use case scenario wherein such detectors may be used in conjunction with a playback management system to enhance a user experience, in accordance with embodiments of the present disclosure;
- Figure 2 illustrates an example playback management system that modifies a playback signal based on a decision from an event detector, in accordance with embodiments of the present disclosure;
- Figure 3 illustrates an example event detector, in accordance with embodiments of the present disclosure;
- Figure 4 illustrates functional blocks of a system for deriving near-field spatial statistics that may be used to detect audio events, in accordance with embodiments of the present disclosure;
- Figure 5 illustrates example fusion logic for detecting near-field sound, in accordance with embodiments of the present disclosure;
- Figure 6 illustrates example fusion logic for detecting proximity sound, in accordance with embodiments of the present disclosure;
- Figure 7 illustrates an embodiment of a proximity speech detector, in accordance with embodiments of the present disclosure;
- Figure 8 illustrates example fusion logic for detecting a tonal alarm event, in accordance with embodiments of the present disclosure;
- Figure 9 illustrates an example timing diagram illustrating hold-off and hang-over logic that may be applied on an instantaneous audio event detection signal to generate a validated audio event signal, in accordance with embodiments of the present disclosure; and
- Figure 10 illustrates different audio event detectors having hold-off and hang-over logic, in accordance with embodiments of the present disclosure.
Figure 1 illustrates an example of a use case scenario wherein such detectors may be used in conjunction with a playback management system to enhance a user experience, in accordance with embodiments of the present disclosure. -
Figure 2 illustrates an example playback management system that modifies a playback signal based on a decision from anevent detector 2, in accordance with embodiments of the present disclosure. Signal processing functionality in aprocessor 50 may comprise anacoustic echo canceller 1 that may cancel an acoustic echo that is received atmicrophones 52 due to an echo coupling between an output audio transducer 51 (e.g., loudspeaker) andmicrophones 52. The echo reduced signal may be communicated toevent detector 2 which may detect one or more various ambient events, including without limitation a near-field event (e.g., including but not limited to speech from a user of an audio device) detected by near-field detector 3, a proximity event (e.g., including but not limited to speech or other ambient sound other than near-field sound) detected byproximity detector 4, and/or a tonal alarm event detected byalarm detector 5. If an audio event is detected, an event-basedplayback control 6 may modify a characteristic of audio information (shown as "playback content" inFigure 2 ) reproduced tooutput audio transducer 51. Audio information may include any information that may be reproduced atoutput audio transducer 51, including without limitation, downlink speech associated with a telephonic conversation received via a communication network (e.g., a cellular network) and/or internal audio from an internal audio source (e.g., music file, video file, etc.). -
Figure 3 illustrates an example event detector, in accordance with embodiments of the present disclosure. As shown in Figure 3, the example event detector may comprise a voice activity detector 10, a music detector 9, a direction of arrival estimator 7, a near-field spatial information extractor 8, a background noise level estimator 11, and decision fusion logic 12 that uses information from voice activity detector 10, music detector 9, direction of arrival estimator 7, near-field spatial information extractor 8, and background noise level estimator 11 to detect audio events, including without limitation near-field sound, proximity sound other than near-field sound, and a tonal alarm.

Near-field detector 3 may detect near-field sounds, including speech. When such near-field sound is detected, it may be desirable to modify audio information reproduced to output audio transducer 51, as detection of near-field sound may indicate that a user is participating in a conversation. Such near-field detection may need to detect near-field sound in acoustically noisy conditions and be resilient to false detection of near-field sounds in very diverse background noise conditions (e.g., background noise in a restaurant, acoustical noise while driving a car, etc.). As described in greater detail below, near-field detection may require spatial sound processing using a plurality of microphones 52. In some embodiments, such near-field sound detection may be implemented in a manner identical or similar to that described in U.S. Pat. No. 8,565,446 and/or U.S. App. Ser. No. 13/199,593.
Proximity detector 4 may detect ambient sounds (e.g., speech from a person in proximity to a user, background music, etc.) other than near-field sounds. As described in greater detail below, because it may be difficult to differentiate proximity sounds from non-stationary background noise and background music, proximity detector 4 may utilize a music detector and a noise level estimate to disable its proximity detection in order to avoid a poor user experience due to false detection of proximity sounds. In some embodiments, such proximity sound detection may be implemented in a manner identical or similar to that described in U.S. Pat. No. 8,126,706, U.S. Pat. No. 8,565,446, and/or U.S. App. Ser. No. 13/199,593.
Tonal alarm detector 5 may detect tonal alarms (e.g., sirens) proximate to an audio device. To provide a good user experience, it may be desirable that tonal alarm detector 5 ignore certain alarms (e.g., feeble or low-volume alarms). As described in greater detail below, tonal alarm detection may require spatial sound processing using a plurality of microphones 52. In some embodiments, such tonal alarm detection may be implemented in a manner identical or similar to that described in U.S. Pat. No. 8,126,706 and/or U.S. App. Ser. No. 13/199,593.
Figure 4 illustrates functional blocks of a system for deriving near-field spatial statistics that may be used to detect audio events, in accordance with embodiments of the present disclosure. Level analysis 41 may be performed on microphones 52 by estimating the inter-microphone level difference (imd) between the near and far microphones (e.g., as described in U.S. App. Ser. No. 13/199,593). Cross-correlation analysis 13 may be performed on signals received by microphones 52 to obtain the direction of arrival information DOA of ambient sound that impinges on microphones 52 (e.g., as described in U.S. Pat. No. 8,565,446). In cross-correlation analysis 13, a maximum normalized correlation value normMaxCorr may also be obtained (e.g., as described in U.S. App. Ser. No. 13/199,593). Voice activity detector 10 may detect the presence of speech and generate a signal speechDet indicative of the presence or absence of speech in the ambient sound (e.g., as described in the probabilistic speech presence/absence approach of U.S. Pat. No. 7,492,889). Beamformers 15 may, based on signals from microphones 52, generate a near-field signal estimate and an interference signal estimate, which may be used by a noise analysis 14 to determine a level of noise noiseLevel in the ambient sound and an interference to near-field signal ratio idr. U.S. Pat. No. 8,565,446 describes an example approach for estimating the interference to near-field signal ratio idr using a pair of beamformers 15. A voice activity detector 36 may use the interference estimate to detect (proxSpeechDet) any speech signal that does not originate from the desired signal direction. Noise analysis 14 may be performed based on the direction of arrival estimate DOA by updating the interference signal energy whenever the direction of arrival estimate DOA of the ambient sound is outside the acceptance angle of the near-field sound.
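As an illustration of two of the statistics above, an inter-microphone level difference (imd) and a maximum normalized cross-correlation (normMaxCorr) can be computed per frame as follows; these are generic textbook definitions sketched for clarity, not the implementations of the cited patents:

```python
import numpy as np

def near_field_stats(near_mic, far_mic, eps=1e-12):
    """Frame-based imd (dB) and normMaxCorr sketch. A near-field talker
    yields a large positive imd; the peak-correlation lag is a
    direction-of-arrival cue."""
    # inter-microphone level difference in dB
    imd = 10.0 * np.log10((np.mean(near_mic ** 2) + eps) /
                          (np.mean(far_mic ** 2) + eps))
    # peak of the cross-correlation, normalized by the signal energies
    xcorr = np.correlate(near_mic, far_mic, mode="full")
    norm = np.sqrt(np.sum(near_mic ** 2) * np.sum(far_mic ** 2)) + eps
    norm_max_corr = np.max(np.abs(xcorr)) / norm
    lag = np.argmax(np.abs(xcorr)) - (len(far_mic) - 1)
    return imd, norm_max_corr, lag
```

For identical signals at both microphones the sketch returns imd of 0 dB and normMaxCorr of 1 at zero lag; a 6 dB imd corresponds to the far microphone receiving half the amplitude.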
The direction of arrival of the near-field sounds may be known a priori for a given microphone array configuration in the industrial design of a personal audio device.

The various statistics generated by the system of
Figure 4 may then be used to detect the presence of near-field sound. Figure 5 illustrates example fusion logic for detecting near-field sound, in accordance with embodiments of the present disclosure. As shown in Figure 5, near-field speech may be detected when all of the following criteria are satisfied:
- Direction of arrival estimate DOA of ambient sound is within an acceptance angle of near-field sound (block 16);
- Maximum normalized cross-correlation statistic normMaxCorr is greater than a threshold normMaxCorrThres1 (block 17);
- Interference to near-field desired signal ratio idr is smaller than a threshold idrThres1 (block 18);
- Voice activity is detected as indicated by signal speechDet (block 19); and
- Inter-microphone level difference statistic imd is greater than a threshold imdTh (block 42).
- In some embodiments, thresholds idrThres and imdTh may be dynamically adjusted based on a background noise level estimate.
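The five criteria of Figure 5 combine as a logical AND, which can be sketched as below; the threshold values and acceptance angle are illustrative placeholders, not values given in the disclosure:

```python
def detect_near_field(doa, norm_max_corr, idr, speech_det, imd,
                      accept_angle=(-30.0, 30.0),
                      norm_max_corr_thres1=0.6,
                      idr_thres1=0.0,
                      imd_th=3.0):
    """Near-field fusion logic of Figure 5: all five criteria must hold.
    Units: doa in degrees, idr and imd in dB (assumed conventions)."""
    return (accept_angle[0] <= doa <= accept_angle[1]   # block 16
            and norm_max_corr > norm_max_corr_thres1    # block 17
            and idr < idr_thres1                        # block 18
            and speech_det                              # block 19
            and imd > imd_th)                           # block 42
```

Dynamic adjustment of idrThres1 and imdTh based on the background noise estimate would simply replace the fixed defaults with per-frame values.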
Proximity detection of proximity detector 4 may be different from near-field sound detection of near-field detector 3 because the signal characteristics of proximity speech may be very similar to ambient signals such as music and noise. Accordingly, proximity detector 4 must avoid false detection of proximity speech in order to achieve an acceptable user experience. To that end, a music detector 9 may be used to disable proximity detection whenever there is music in the background. Similarly, proximity detector 4 may be disabled whenever the background noise level is above a certain threshold. The threshold value for background noise may be determined a priori such that the likelihood of false detection below the threshold level is very low. Figure 6 illustrates example fusion logic for detecting proximity sound (e.g., speech), in accordance with embodiments of the present disclosure. Moreover, there may exist many environmental noise sources that generate acoustic stimuli that are transient in nature. These noise types can be falsely detected as speech by the speech detector. To reduce the likelihood of false detection, a spectral flatness measure (SFM) statistic from music detector 9 may be used to distinguish speech from transient noises. For example, the SFM may be tracked over a period of time, and the difference between the maximum and the minimum SFM value over that duration, defined as sfmSwing, may be calculated. The value of sfmSwing may generally be small for transient noise signals because the spectral content of these signals is wideband in nature and tends to be stationary for a short interval of time (300-500 ms). The value of sfmSwing may be higher for speech signals because the spectral content of speech may vary faster than that of transient signals. As shown in Figure 6, proximity sound (e.g., speech) may be detected when all of the following criteria are satisfied:
- Music is not detected in the background (block 20);
- Direction of arrival estimate DOA is within an acceptance angle of proximity sound (block 21);
- Maximum normalized cross-correlation statistic normMaxCorr is greater than a threshold, normMaxCorrThres2 (block 22);
- The background noise level noiseLevel is below a threshold noiseLevelTh (block 23);
- Proximity voice activity is detected, as indicated by signal proxSpeechDet (block 19);
- SFM variation statistic sfmSwing is greater than a threshold sfmSwingTh (block 37);
- Interference to near-field desired signal ratio idr is greater than a threshold idrThres2 (block 40); and
- Inter-microphone level difference statistic imd is close to 0 dB (block 43).
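The sfmSwing statistic and the Figure 6 fusion can be sketched as below; the window length, thresholds, and the dB convention for noiseLevel are assumed placeholders rather than values from the disclosure:

```python
from collections import deque

class SfmSwingTracker:
    """Track sfmSwing = max - min of the spectral flatness measure over a
    short sliding window of frames (window length is an assumed value)."""
    def __init__(self, window_frames=30):
        self.history = deque(maxlen=window_frames)

    def update(self, sfm):
        self.history.append(sfm)
        return max(self.history) - min(self.history)

def detect_proximity(music_det, doa, norm_max_corr, noise_level,
                     prox_speech_det, sfm_swing, idr, imd,
                     accept_angle=(-90.0, 90.0), norm_max_corr_thres2=0.4,
                     noise_level_th=-40.0, sfm_swing_th=0.2,
                     idr_thres2=0.0, imd_tol_db=1.5):
    """Proximity-sound fusion logic of Figure 6: all criteria must hold."""
    return (not music_det                                  # block 20
            and accept_angle[0] <= doa <= accept_angle[1]  # block 21
            and norm_max_corr > norm_max_corr_thres2       # block 22
            and noise_level < noise_level_th               # block 23
            and prox_speech_det                            # block 19
            and sfm_swing > sfm_swing_th                   # block 37
            and idr > idr_thres2                           # block 40
            and abs(imd) < imd_tol_db)                     # block 43
```

Note how "imd close to 0 dB" is the mirror image of the near-field test: a proximity talker excites both microphones at roughly equal level.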
In some embodiments, the music detector taught in U.S. Pat. No. 8,126,706 may be used to implement music detector 9 to detect the presence of background music. Another embodiment of the proximity speech detector is shown in Figure 7, in accordance with embodiments of the present disclosure. According to this embodiment, proximity speech may be detected if the following conditions are met:
- Interference to near-field desired signal ratio idr is greater than a threshold idrThres2 (block 39);
- Proximity voice activity is detected (block 27);
- Maximum normalized cross-correlation statistic normMaxCorr is greater than a threshold, normMaxCorrThres3 (block 28);
- Direction of arrival estimate DOA is within an acceptance angle of proximity sound (block 29);
- Music is not detected in the background (block 30);
- Low or medium level background or no background noise is present (block 31). This condition is verified by comparing the estimated background noise level with a threshold, noiseLevelThLo. If a low noise level is detected, then the following two conditions are further tested to confirm the presence of proximity speech:
- SFM variation statistic sfmSwing is greater than a threshold sfmSwingTh (block 38);
- Inter-microphone level difference statistic imd is close to 0 dB (block 44).
If the above-mentioned background noise level condition is not satisfied at block 31, then the following conditions may be indicative of proximity speech, in order to improve the detection rate of proximity speech without increasing the occurrence of false alarms (e.g., due to background noise conditions):
- Stationary background noise is present (block 32). The stationary background noise may be detected by calculating the ratio of the peak to the root-mean-square value of the SFM generated by music detector 9 over a period of time. Specifically, if this ratio is high, then non-stationary noise may be present, as the spectral flatness measure of non-stationary noise tends to change faster than that of stationary noise;
- High noise level is present (block 32). The high-noise condition may be detected if the estimated background noise is greater than a threshold, noiseLevelLo, and smaller than a threshold, noiseLevelHi.
If the above stationary noise and direction of arrival conditions are not satisfied at block 32, then the presence of both of the following conditions may indicate the presence of proximity speech:
- Close-talking proximity talker is present (block 33). A close-talking proximity talker may be detected when the maximum normalized cross-correlation statistic normMaxCorr is greater than a threshold, normMaxCorrThres4 (the threshold normMaxCorrThres4 may be greater than normMaxCorrThres3 to indicate the presence of a close talker);
- Low- or medium- or high-level background or no background noise is present (block 34). This condition may be detected if the estimated background noise level is less than a threshold noiseLevelThHi.
If the above-mentioned direction of arrival condition is not satisfied at block 29, then the presence of the following conditions may be indicative of proximity speech:
- The absence of music (block 35);
- Close-talking proximity talker is present (block 33). A close-talking proximity talker may be detected when the maximum normalized cross-correlation statistic normMaxCorr is greater than a threshold, normMaxCorrThres4 (the threshold normMaxCorrThres4 may be greater than normMaxCorrThres3 to indicate the presence of close talker);
- Low- or medium- or high-level background or no background noise is present (block 34). This condition may be detected if the estimated background noise level is less than a threshold noiseLevelThHi.
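One possible reading of the Figure 7 branching is sketched below; the thresholds, the dB convention for the noise level, and the exact fallback ordering are assumptions made for illustration, since the disclosure gives no numeric values:

```python
def detect_proximity_fig7(idr, prox_speech_det, norm_max_corr, doa_ok,
                          music_det, noise_level, sfm_swing, imd,
                          stationary_noise,
                          idr_thres2=0.0, nmc_thres3=0.4, nmc_thres4=0.6,
                          noise_lo=-50.0, noise_hi=-30.0,
                          sfm_swing_th=0.2, imd_tol_db=1.5):
    """Hierarchical proximity-speech logic sketched from Figure 7."""
    # common preconditions (blocks 39, 27, 28)
    if not (idr > idr_thres2 and prox_speech_det
            and norm_max_corr > nmc_thres3):
        return False
    close_talker = norm_max_corr > nmc_thres4            # block 33
    fallback = close_talker and noise_level < noise_hi   # blocks 33-34
    if not doa_ok:                                       # block 29 fails
        return (not music_det) and fallback              # blocks 35, 33, 34
    if music_det:                                        # block 30
        return False
    if noise_level < noise_lo:                           # block 31: low noise
        return sfm_swing > sfm_swing_th and abs(imd) < imd_tol_db  # 38, 44
    if stationary_noise and noise_lo < noise_level < noise_hi:     # block 32
        return True
    return fallback                                      # blocks 33-34
```

The progressively relaxed fallbacks trade a stricter correlation test (the close-talker threshold) for looser noise and direction conditions, which matches the stated goal of raising the detection rate without raising the false-alarm rate.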
Tonal alarm detector 5 may be configured to detect alarm signals that are tonal in nature and whose sonic bandwidth is narrow (e.g., siren, buzzer). In some embodiments, the tonality of an ambient sound may be measured by splitting the time-domain signal into multiple sub-bands through a time-to-frequency-domain transformation, and the spectral flatness measure, depicted in Figure 6 as signal sfm[] generated by music detector 9, may be computed in each sub-band. Spectral flatness measures sfm[] from all sub-bands may be evaluated, and a tonal alarm event may be detected if the spectrum is flat in most sub-bands but not in all sub-bands. Moreover, in a playback management system, it may not be necessary to detect far-field alarm signals. Accordingly, near-field spatial statistics 8 of Figure 3 may be used to differentiate far-field alarm signals from near-field signals. Figure 8 illustrates example fusion logic for detecting a tonal alarm event (e.g., siren, buzzer), in accordance with embodiments of the present disclosure. As shown in Figure 8, a tonal alarm event may be detected when all of the following criteria are satisfied:
- Direction of arrival estimate DOA is within an acceptance angle of the alarm signal (block 24);
- Maximum normalized cross-correlation statistic normMaxCorr is greater than a threshold, normMaxCorrThres5 (block 25); and
- Spectral flatness measure sfm[] indicates that the noise spectrum is flat in most sub-bands but not all (block 26).
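The sub-band spectral-flatness test of block 26, combined with the DOA and normMaxCorr criteria, can be sketched as follows; the FFT framing, band count, and thresholds are assumptions for illustration only:

```python
import numpy as np

def spectral_flatness(power_spectrum, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0..1):
    near 1 for flat (noise-like) bands, near 0 for tonal bands."""
    p = np.asarray(power_spectrum, dtype=float) + eps
    return np.exp(np.mean(np.log(p))) / np.mean(p)

def detect_tonal_alarm(frame, doa_ok, norm_max_corr,
                       n_bands=8, nmc_thres5=0.5, flat_th=0.25):
    """Fusion sketch of Figure 8: a narrow tonal alarm makes a small
    number of sub-bands tonal (low SFM) while the rest stay flat."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    bands = np.array_split(spec, n_bands)
    sfm = np.array([spectral_flatness(b) for b in bands])
    flat = sfm > flat_th
    # block 26: flat in most sub-bands but not in all of them
    mostly_flat = flat.sum() >= n_bands - 2 and not flat.all()
    return bool(doa_ok                          # block 24
                and norm_max_corr > nmc_thres5  # block 25
                and mostly_flat)                # block 26
```

A broadband noise frame is flat in every sub-band and is rejected, while a sinusoid plus light noise collapses the SFM in exactly the band that carries the tone.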
In practice, the instantaneous audio event detections of near-field detector 3, proximity detector 4, and tonal alarm detector 5 as shown in Figures 5, 6, 7, and 8 may indicate false audio events. Accordingly, it may be desirable to validate an instantaneous audio event detection signal before communicating an event detection signal to playback control block 6. Figure 9 illustrates an example timing diagram of hold-off and hang-over logic that may be applied to an instantaneous audio event detection signal to generate a validated audio event signal, in accordance with embodiments of the present disclosure. As shown in Figure 9, hold-off logic may generate a validated audio event signal in response to instantaneous detection of an audio event (e.g., near-field sound, proximity sound, tonal alarm event) being persistent for at least a predetermined time, while hang-over logic may continue to assert the validated audio event signal until the instantaneous detection of an audio event has ceased for a second predetermined time.
A validated event may be further validated before generating the playback mode switching control. For example, the following pseudo-code may demonstrate application of the hold-off and hang-over logic for gracefully switching between a conversational mode (e.g., in which audio information reproduced to output audio transducer 51 may be modified in response to an audio event) and a normal playback mode (e.g., in which the audio information reproduced to output audio transducer 51 is unmodified).
Figure 10 illustrates different audio event detectors having hold-off and hang-over logic, in accordance with embodiments of the present disclosure. The hold-off and/or hang-over periods for each detector may be set differently. In addition, in some embodiments, playback management may be controlled differently based on the type of detected event. In these and other embodiments, as shown in Figure 9, a playback gain (and hence the audio information reproduced at output audio transducer 51) may be attenuated whenever one or more of the audio events is detected. In these and other embodiments, in order to provide a smooth gain transition, the playback gain may be smoothed using a first-order exponential averaging filter represented by the following pseudo-code. The smoothing parameters alpha and beta may be set at different values to adjust the gain ramping rate.
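The gain-smoothing pseudo-code is likewise not reproduced in this text; a first-order exponential smoother consistent with the description might look like the following, where applying alpha when attenuating and beta when recovering is an assumed interpretation of the two parameters:

```python
def smooth_gain(current_gain, target_gain, alpha=0.98, beta=0.90):
    """One first-order exponential smoothing step for the playback gain.
    alpha (slower) ramps the gain down toward an attenuated target;
    beta (faster) ramps it back up (coefficient values are assumed)."""
    coeff = alpha if target_gain < current_gain else beta
    return coeff * current_gain + (1.0 - coeff) * target_gain
```

Applied once per frame, the gain ramps smoothly toward the attenuated target while an event is asserted and back toward unity afterward.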
Although this disclosure makes reference to specific embodiments, certain modifications and changes can be made to those embodiments without departing from the scope of the appended claims.
Claims (14)
- A method for processing audio information in an audio device, comprising:
receiving a playback signal comprising audio information;
based on the playback signal, generating an audio output signal for communication to at least one transducer (51) of the audio device;
receiving at least one input signal indicative of ambient sound external to the audio device, wherein the at least one input signal comprises signals from multiple microphones (52);
determining near-field spatial statistics for the ambient sound;
detecting, based on the at least one input signal and the determined near-field spatial statistics for the ambient sound, a near-field sound from a user of the audio device and a proximity sound other than a near-field sound in the ambient sound;
modifying a characteristic of the audio output signal in response to detection of the near-field sound and/or detection of the proximity sound;
determining a characteristic of the ambient sound, wherein determining the characteristic of the ambient sound comprises determining that the ambient sound includes background music and/or determining that a background noise level of acoustic noise in the ambient sound is above a threshold background noise level; and
in response to said determined characteristic of the ambient sound, disabling the detection of the proximity sound to prevent false detection of proximity sound.
- An integrated circuit for implementing at least a portion of an audio device, comprising:
an input configured to receive a playback signal comprising audio information;
an audio output configured to, based on the playback signal, generate an audio output signal for communication to at least one transducer of the audio device;
a microphone input configured to receive at least one input signal indicative of ambient sound external to the audio device, wherein the at least one input signal comprises signals from multiple microphones (52); and
a processor (50) configured to:
determine near-field spatial statistics for the ambient sound;
detect, based on the at least one input signal and the determined near-field spatial statistics for the ambient sound, a near-field sound from a user of the audio device and a proximity sound other than a near-field sound in the ambient sound;
modify a characteristic of the audio output signal in response to detection of the near-field sound and/or detection of the proximity sound;
determine a characteristic of the ambient sound, wherein determining the characteristic of the ambient sound comprises determining that the ambient sound includes background music and/or determining that a background noise level of acoustic noise in the ambient sound is above a threshold background noise level; and
in response to said determined characteristic of the ambient sound, disable the detection of the proximity sound to prevent false detection of proximity sound.
- The integrated circuit of Claim 2, the processor (50) further configured to:
determine from the at least one input signal a direction of the ambient sound; and identify the ambient sound as a near-field sound from a user of the audio device in response to the direction of the ambient sound indicating that the ambient sound is sound from a user of the audio device. - The integrated circuit of Claim 2 or Claim 3, wherein modifying the characteristic of the audio output signal comprises attenuating the audio output signal.
- The integrated circuit of any of Claims 2 to 4, the processor (50) further configured to modify the characteristic of the audio output signal in response to a detection of the near-field sound being persistent for at least a predetermined time; and to detect from the at least one input signal absence of the near-field sound in the ambient sound and cease modifying the characteristic of the audio output signal in response to the absence of the near-field sound for at least a second predetermined time.
- The integrated circuit of any of Claims 2 to 5, the processor (50) further configured to:
determine from the at least one input signal a direction of the ambient sound; and
identify the sound as a proximity sound in response to the direction of the ambient sound indicating that the ambient sound is sound other than the near-field sound.
- The integrated circuit of any of Claims 2 to 6, the processor (50) further configured to:
detect from the at least one input signal whether the ambient sound comprises a tonal alarm; and
modify the characteristic of the audio output signal in response to detection of the tonal alarm in the ambient sound.
- The integrated circuit of Claim 7, wherein detecting the tonal alarm in the ambient sound comprises:
detecting from the at least one input signal a direction of the ambient sound;
detecting from the at least one input signal a spectral flatness measure of the ambient sound; and
detecting the tonal alarm based on the direction of the ambient sound, the presence or absence of background noise, and the near-field spatial statistics.
- The integrated circuit of any of Claims 2 to 8, wherein:
the at least one input signal comprises a first microphone signal indicative of ambient sound at a first microphone and a second microphone signal indicative of ambient sound at a second microphone; and
the near-field spatial statistics comprise at least one of:
a correlation between the first microphone signal and the second microphone signal;
an interference-to-signal ratio associated with near-field sound; and
an inter-microphone level difference between the first microphone signal and the second microphone signal.
- The integrated circuit of any of Claims 2 to 9, wherein detecting the near-field spatial statistics of the ambient sound comprises detecting whether a normalized cross-correlation statistic is greater than a threshold.
- The integrated circuit of any of Claims 2 to 10, wherein detecting the near-field sound from a user of the audio device in the ambient sound comprises:
detecting from the at least one input signal a direction of the ambient sound;
detecting from the at least one input signal a presence of speech in the ambient sound; and
detecting the near-field sound based on the direction, the presence or absence of the speech, and the near-field spatial statistics of the ambient sound.
- The integrated circuit of Claim 2, the processor (50) further configured to:
detect from the at least one input signal a direction of the ambient sound;
detect from the at least one input signal a presence of background noise in the ambient sound;
detect from the at least one input signal a presence of speech in the ambient sound other than from a user of the audio device;
detect from the at least one input signal a volume of the ambient sound; and
detect the proximity sound based on the direction, the presence or absence of background noise, the presence or absence of the speech, the volume, and the near-field spatial statistics of the ambient sound.
- The integrated circuit of Claim 12, the processor (50) further configured to:
detect variation in spectral content of the ambient sound; and
detect the proximity sound based on the direction, the presence or absence of background noise, the presence or absence of the speech, the volume, the near-field spatial statistics of the ambient sound, and the spectral content of the ambient sound.
- The integrated circuit of Claim 13, wherein detecting the presence of speech other than from the user of the audio device in the ambient sound comprises:
detecting from the at least one input signal a spectral flatness measure of the ambient sound, wherein detecting the spectral flatness measure of the ambient sound comprises detecting variation in spectral content of the ambient sound.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562202303P | 2015-08-07 | 2015-08-07 | |
US201562237868P | 2015-10-06 | 2015-10-06 | |
US201662351499P | 2016-06-17 | 2016-06-17 | |
PCT/US2016/045834 WO2017027397A2 (en) | 2015-08-07 | 2016-08-05 | Event detection for playback management in an audio device |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3332558A2 EP3332558A2 (en) | 2018-06-13 |
EP3332558B1 true EP3332558B1 (en) | 2021-12-01 |
Family
ID=62079093
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP16763354.4A Active EP3332558B1 (en) | 2015-08-07 | 2016-08-05 | Event detection for playback management in an audio device |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP3332558B1 (en) |
CN (1) | CN108141694B (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4306115A (en) * | 1980-03-19 | 1981-12-15 | Humphrey Francis S | Automatic volume control system |
US20040017921A1 (en) * | 2002-07-26 | 2004-01-29 | Mantovani Jose Ricardo Baddini | Electrical impedance based audio compensation in audio devices and methods therefor |
JP4173765B2 (en) * | 2003-05-02 | 2008-10-29 | アルパイン株式会社 | Hearing loss prevention device |
US8150044B2 (en) * | 2006-12-31 | 2012-04-03 | Personics Holdings Inc. | Method and device configured for sound signature detection |
JP5499633B2 (en) * | 2009-10-28 | 2014-05-21 | ソニー株式会社 | REPRODUCTION DEVICE, HEADPHONE, AND REPRODUCTION METHOD |
US9270244B2 (en) * | 2013-03-13 | 2016-02-23 | Personics Holdings, Llc | System and method to detect close voice sources and automatically enhance situation awareness |
US9338551B2 (en) * | 2013-03-15 | 2016-05-10 | Broadcom Corporation | Multi-microphone source tracking and noise suppression |
2016
- 2016-08-05 EP EP16763354.4A patent/EP3332558B1/en active Active
- 2016-08-05 CN CN201680058340.7A patent/CN108141694B/en active Active
Also Published As
Publication number | Publication date |
---|---|
EP3332558A2 (en) | 2018-06-13 |
CN108141694A (en) | 2018-06-08 |
CN108141694B (en) | 2021-03-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: UNKNOWN |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20180131 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20191122 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: GRANT OF PATENT IS INTENDED |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: H04S 7/00 20060101AFI20210622BHEP Ipc: H04R 1/10 20060101ALI20210622BHEP Ipc: H04R 3/00 20060101ALI20210622BHEP Ipc: G10L 25/78 20130101ALI20210622BHEP Ipc: G10L 25/81 20130101ALI20210622BHEP Ipc: G10L 25/84 20130101ALI20210622BHEP |
|
INTG | Intention to grant announced |
Effective date: 20210707 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: REF Ref document number: 1452926 Country of ref document: AT Kind code of ref document: T Effective date: 20211215 Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602016066829 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: LT Ref legal event code: MG9D |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: MP Effective date: 20211201 |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 1452926 Country of ref document: AT Kind code of ref document: T Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220301 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220301 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220302 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220401
Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
|
REG | Reference to a national code |
Ref country code: DE
Ref legal event code: R097
Ref document number: 602016066829
Country of ref document: DE
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220401 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201
|
26N | No opposition filed |
Effective date: 20220902 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20220825 Year of fee payment: 7 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
REG | Reference to a national code |
Ref country code: CH
Ref legal event code: PL
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20220805
Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20220831
Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20220831
|
REG | Reference to a national code |
Ref country code: BE
Ref legal event code: MM
Effective date: 20220831
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
P01 | Opt-out of the competence of the unified patent court (upc) registered |
Effective date: 20230321 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20220805 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20220831 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20160805 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20230831 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211201 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20240828 Year of fee payment: 9 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20240827 Year of fee payment: 9 |