CN118865758A - Method, device, medium, program product and equipment for time alignment of cabin sound events - Google Patents
- Publication number
- CN118865758A (application CN202411113493.XA)
- Authority
- CN
- China
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/01—Correction of time axis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/04—Time compression or expansion
- G10L21/055—Time compression or expansion for synchronising with other signals, e.g. video signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/06—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/45—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T90/00—Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation
Description
Technical Field
The present invention relates to the field of aviation technology, and in particular to a method, device, medium, program product, and equipment for time alignment of cabin sound events.
Background Art
In the modern aviation industry, flight safety is of vital importance. To ensure it, aircraft are equipped with various monitoring devices, including the cockpit voice recorder (CVR) and the flight data recorder (FDR). These devices record key information during flight for investigation and analysis in the event of an accident.
The CVR records sounds in the cockpit, including pilot conversations, warning tones, and other sounds, while the FDR records flight parameters such as speed, altitude, and heading. CVR and FDR data are stored separately, and CVR recordings usually carry no timestamps. In an accident investigation, aligning the CVR recording with the FDR data to determine exactly when a recorded sound occurred is therefore a challenging task.
At present, aligning CVR and FDR data relies mainly on manual analysis: highly trained, experienced investigators listen to the CVR recording, identify key events, and then infer from the FDR data when those events occurred. This process is both time-consuming and prone to human error.
Summary of the Invention
The purpose of the embodiments of the present invention is to provide a method, device, medium, program product, and equipment for time alignment of cabin sound events, capable of quickly aligning cabin sound data with flight data and accurately obtaining the occurrence times of key events in the cabin sound.
An embodiment of the first aspect of the present invention provides a cabin sound event time alignment method, including:
acquiring cabin sound data and flight data to be aligned;
marking all flight events in the flight data and generating a first event time series;
slicing the cabin sound data and extracting features based on a preset time window to obtain a number of audio feature sequences; inputting each audio feature sequence into a cabin sound event recognition model to obtain a corresponding recognition result, and integrating all recognition results to obtain a second event time series;
truncating the first event time series based on the time range of the second event time series to obtain a first event time subsequence;
performing dynamic time warping on the first event time subsequence and the second event time series, and mapping the time information of all flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events.
Optionally, after acquiring the cabin sound data and flight data to be aligned, the method further includes:
performing a first preprocessing on the flight data to obtain preprocessed flight data, where the first preprocessing includes data cleaning and data conversion;
performing a second preprocessing on the cabin sound data to obtain preprocessed cabin sound data, where the second preprocessing includes noise reduction and signal enhancement.
Optionally, marking all flight events in the flight data and generating a first event time series includes:
parsing the flight data with the analysis system of a quick access recorder, and marking all flight events in the flight data to obtain the corresponding first event time series.
Optionally, slicing the cabin sound data and extracting features based on the preset time window to obtain a number of audio feature sequences includes:
dividing the cabin sound data according to the preset time window to obtain a number of audio segments;
extracting features from each audio segment to obtain the corresponding audio feature sequence, where each audio feature sequence includes Mel-frequency cepstral coefficients and Mel-spectrum data.
Optionally, the cabin sound event recognition model is obtained by the following steps:
obtaining N cabin sound sample data items, where N ≥ 1;
performing event labeling on all cabin sound sample data, and dividing all event-labeled cabin sound sample data according to the preset time window to obtain a number of audio sample segments and corresponding event labels;
extracting features from each audio sample segment to obtain the corresponding sample feature sequence;
creating an initial recognition model based on a long short-term memory (LSTM) network architecture, using all sample feature sequences and corresponding event labels as training data, and training the initial recognition model to obtain the cabin sound event recognition model.
Optionally, performing dynamic time warping on the first event time subsequence and the second event time series, and mapping the time information of all flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events, includes:
constructing a cumulative distance matrix based on the first event time subsequence and the second event time series;
backtracking a path through the cumulative distance matrix to obtain the shortest warping path;
based on the shortest warping path, mapping the time information of all flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events.
An embodiment of the second aspect of the present invention provides a cabin sound event time alignment device for implementing the cabin sound event time alignment method of any one of the first aspect, the device including:
a data acquisition module, configured to acquire the cabin sound data and flight data to be aligned;
a flight event marking module, configured to mark all flight events in the flight data and generate a first event time series;
a cabin sound event recognition module, configured to slice the cabin sound data and extract features based on a preset time window to obtain a number of audio feature sequences, input each audio feature sequence into a cabin sound event recognition model to obtain a corresponding recognition result, and integrate all recognition results to obtain a second event time series;
a sequence truncation module, configured to truncate the first event time series based on the time range of the second event time series to obtain a first event time subsequence;
a sequence alignment module, configured to perform dynamic time warping on the first event time subsequence and the second event time series, and map the time information of all flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events.
An embodiment of the third aspect of the present invention provides a computer-readable storage medium including a stored computer program, where, when running, the computer program controls the device on which the computer-readable storage medium resides to execute the cabin sound event time alignment method of any one of the first aspect.
An embodiment of the fourth aspect of the present invention provides a computer program product, including a computer program which, when executed by a processor, implements the cabin sound event time alignment method of any one of the first aspect.
An embodiment of the fifth aspect of the present invention provides an electronic device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor, when executing the computer program, implements the cabin sound event time alignment method of any one of the first aspect.
Compared with the prior art, the embodiments of the present invention provide a method, device, medium, program product, and equipment for time alignment of cabin sound events. The method includes: acquiring cabin sound data and flight data to be aligned; marking all flight events in the flight data and generating a first event time series; slicing the cabin sound data and extracting features based on a preset time window to obtain a number of audio feature sequences; inputting each audio feature sequence into a cabin sound event recognition model to obtain a corresponding recognition result, and integrating all recognition results to obtain a second event time series; truncating the first event time series based on the time range of the second event time series to obtain a first event time subsequence; and performing dynamic time warping on the first event time subsequence and the second event time series, and mapping the time information of all flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events. The present invention can quickly align cabin sound data with flight data and accurately obtain the occurrence times of key events in the cabin sound.
Brief Description of the Drawings
FIG. 1 is a flow chart of an embodiment of the cabin sound event time alignment method provided by the present invention.
FIG. 2 is a schematic structural diagram of an embodiment of the cabin sound event time alignment device provided by the present invention.
FIG. 3 is a schematic structural diagram of an embodiment of an electronic device provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Refer to FIG. 1, which is a flow chart of an embodiment of the cabin sound event time alignment method provided by the present invention.
An embodiment of the first aspect of the present invention provides a cabin sound event time alignment method, including steps S1 to S5, as follows:
Step S1: acquire the cabin sound data and flight data to be aligned;
Step S2: mark all flight events in the flight data and generate a first event time series;
Step S3: slice the cabin sound data and extract features based on a preset time window to obtain a number of audio feature sequences; input each audio feature sequence into a cabin sound event recognition model to obtain a corresponding recognition result, and integrate all recognition results to obtain a second event time series;
Step S4: truncate the first event time series based on the time range of the second event time series to obtain a first event time subsequence;
Step S5: perform dynamic time warping on the first event time subsequence and the second event time series, and map the time information of all flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events.
It should be noted that, in step S4, because the CVR has limited storage capacity, it retains only the most recent two hours of audio; if the flight lasts longer than two hours, the earlier CVR recordings are overwritten by later ones. By contrast, the FDR retains the most recent 25 hours of flight data. In addition, airlines and aircraft manufacturers ensure a degree of synchronization between the CVR and FDR in power supply and signal input so that consistent data records are available for incident investigations; that is, the CVR and FDR stop recording at essentially the same time, because both are usually powered by the same aircraft electrical system. On this basis, the embodiment of the present invention takes the time range of the second event time series (the duration of the cabin sound data recorded by the CVR) and truncates a subsequence of the same duration from the end of the first event time series as the first event time subsequence. The time granularity of the first event time subsequence and the second event time series is identical.
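The truncation in step S4 can be sketched as follows; the function and event names, timestamps, and list-of-tuples layout are illustrative assumptions, not taken from the patent:

```python
def truncate_to_cvr_window(fdr_events, fdr_end_s, cvr_duration_s):
    """Keep only the FDR events inside the trailing window covered by the CVR."""
    window_start = fdr_end_s - cvr_duration_s
    return [(t, e) for (t, e) in fdr_events if t >= window_start]

# Illustrative data: (time in seconds since FDR start, event key).
fdr_events = [(100, "takeoff"), (5000, "overspeed"),
              (8900, "engine_warning"), (9000, "landing")]
# Both recorders stopped at t = 9000 s; the CVR covers only the last 2 h (7200 s).
subsequence = truncate_to_cvr_window(fdr_events, fdr_end_s=9000, cvr_duration_s=7200)
# "takeoff" at t = 100 s falls outside the window and is dropped.
```

Because both recorders stop together, anchoring the window at the end of the FDR series is what makes the two sequences comparable.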
In a specific implementation, first, the cabin sound data (audio from the aircraft cockpit) and the flight data (flight parameters recorded during aircraft operation) to be aligned are collected. Then, all flight events in the flight data are marked; these include, but are not limited to, takeoff, landing, overspeed, turbulence, rapid climb/descent, and engine failure. After marking, a time series containing all flight events is generated as the first event time series, which explicitly records each flight event and its specific time of occurrence; the length of the first event time series matches the duration of the flight data recording. Similarly, the cabin sound events occurring in the cabin sound data must be determined to form the second event time series; these include, but are not limited to, cockpit alarm tones and key events extracted from pilot conversations, such as overspeed, climbing too high, descending too low, engine warnings, and collisions with objects. The second event time series matches the duration of the cabin sound recording, and cabin sound events and flight events share the same event key-value space. Next, a first event time subsequence of the same duration as the second event time series is truncated from the end of the first event time series, ensuring that the alignment focuses only on the relevant period and improving alignment accuracy and efficiency. Finally, the first event time subsequence and the second event time series are optimally aligned using the dynamic time warping (DTW) algorithm, and the time information of all flight events in the first event time subsequence is mapped onto the second event time series to obtain the precise times of all cabin sound events.
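The DTW alignment and time mapping described above can be illustrated with a minimal sketch. It uses a 0/1 label-match cost, which is an assumption for demonstration (the patent does not specify the distance function), and invented event names and timestamps:

```python
def dtw_align(seq_a, seq_b):
    """Align two symbolic event sequences; return the warping path as (i, j) pairs."""
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # Cumulative distance matrix with a (0, 0) boundary cell.
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if seq_a[i - 1] == seq_b[j - 1] else 1.0
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Path backtracking from the bottom-right corner along minimal cumulative cost.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
        if step == D[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif step == D[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return list(reversed(path))

flight_events = ["takeoff", "climb", "overspeed", "landing"]  # first event time subsequence
cabin_events = ["takeoff", "overspeed", "landing"]            # second event time series
path = dtw_align(flight_events, cabin_events)

# Map each cabin event to the time of the first flight event matched to it.
flight_times = [30, 300, 5000, 9000]                          # illustrative timestamps (s)
cabin_times = {}
for i, j in path:
    cabin_times.setdefault(j, flight_times[i])
```

The "climb" event, absent from the cabin sound sequence, is simply absorbed by the warping path rather than breaking the alignment, which is the property that makes DTW suitable here.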
As can be seen from the above technical solution, the cabin sound event time alignment method provided by the embodiments of the present invention can quickly establish a precise temporal relationship between the CVR and FDR, so that pilot conversations and aircraft flight data can be observed on the same timeline. This is very helpful for understanding the pilots' decision-making process and the aircraft's response when critical events occur. In addition, embodiments of the present invention can also analyze whether key events are missing from the CVR data and predict when those missing key events occurred and the related issues, thereby greatly improving the efficiency and accuracy of flight accident investigations and making an important contribution to aviation safety.
In an optional embodiment, after acquiring the cabin sound data and flight data to be aligned, the method further includes:
performing a first preprocessing on the flight data to obtain preprocessed flight data, where the first preprocessing includes data cleaning and data conversion;
performing a second preprocessing on the cabin sound data to obtain preprocessed cabin sound data, where the second preprocessing includes noise reduction and signal enhancement.
Embodiments of the present invention may use Python tools to preprocess the flight data and the cabin sound data separately. Specifically, after the flight data to be aligned is obtained from the FDR, it undergoes data cleaning (e.g., removing invalid or erroneous records) and data conversion (e.g., converting it into the format required for analysis) to improve its quality. The cabin sound data obtained from the CVR often contains various background noises, such as airflow noise; noise reduction and signal enhancement effectively reduce the background noise components and improve the clarity and recognizability of the useful signal in the cabin sound data.
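A minimal sketch of the two preprocessing stages is shown below. The altitude range, sentinel value, and the spectral-gating approach to noise reduction are assumptions for illustration; the patent does not prescribe specific algorithms:

```python
import numpy as np

def clean_flight_data(samples, lo=-1000.0, hi=60000.0):
    """Data cleaning: drop readings outside a plausible altitude range (ft)."""
    arr = np.asarray(samples, dtype=float)
    return arr[(arr >= lo) & (arr <= hi)]

def spectral_gate(signal, threshold_ratio=0.1):
    """Noise reduction: zero FFT bins whose magnitude is below a fraction of the peak."""
    spec = np.fft.rfft(signal)
    mag = np.abs(spec)
    spec[mag < threshold_ratio * mag.max()] = 0.0
    return np.fft.irfft(spec, n=len(signal))

cleaned = clean_flight_data([35000.0, -99999.0, 35010.0])  # sentinel value removed
denoised = spectral_gate(np.ones(8))                        # a pure DC signal passes through
```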
In an optional embodiment, marking all flight events in the flight data and generating a first event time series includes:
parsing the flight data with the analysis system of a quick access recorder, and marking all flight events in the flight data to obtain the corresponding first event time series.
Through the analysis system of a quick access recorder (QAR), embodiments of the present invention can automatically identify all flight events in the flight data. For example, the Airborne Flight Analysis System Environment (AirFASE) parses the flight data collected by the FDR and automatically identifies and classifies the key events (flight events) that occurred during the flight and the time periods in which they occurred, such as takeoff, landing, overspeed, turbulence, rapid climb/descent, and engine failure, attaching a timestamp to each flight event to form the first event time series; the duration of the first event time series still matches that of the flight data.
AirFASE works by applying multiple built-in flight data analysis algorithms and a rich set of ready-to-use functions to analyze the flight data in depth (flight parameters such as speed, altitude, attitude, and engine status) and automatically identify the key events (flight events) of the flight, enabling accurate and efficient flight monitoring and analysis.
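AirFASE's internal rules are proprietary, but the idea of rule-based event marking over flight parameters can be sketched as follows; the 350 kt limit and the one-sample-per-second layout are invented for illustration:

```python
def mark_overspeed(speed_series_kts, limit_kts=350.0):
    """Mark every second at which indicated airspeed exceeds the limit."""
    return [(t, "overspeed") for t, v in enumerate(speed_series_kts) if v > limit_kts]

# One airspeed sample per second of flight; seconds 1 and 2 exceed the limit.
events = mark_overspeed([340, 352, 360, 345])
```

A real analysis system would combine many such parameter rules (altitude rate, engine parameters, attitude) into the timestamped first event time series.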
In an optional embodiment, slicing the cabin sound data and extracting features based on the preset time window to obtain a number of audio feature sequences includes:
dividing the cabin sound data according to the preset time window to obtain a number of audio segments;
extracting features from each audio segment to obtain the corresponding audio feature sequence, where each audio feature sequence includes Mel-frequency cepstral coefficients and Mel-spectrum data.
In a specific implementation, the size of the preset time window is first determined according to the analysis requirements, so that the audio segment corresponding to the window captures enough audio information (to be input into the cabin sound event recognition model for prediction) while covering at most one cabin sound event, which facilitates accurate recognition. Then, according to the preset time window, the long cabin sound recording is divided into a number of independent audio segments; the start and end times of each segment are also recorded, so that when the recognition results are later stitched together, the specific time period in the second event time series corresponding to each recognized cabin sound event can be determined. The segmentation may be non-overlapping or overlapping (e.g., using a sliding window), the choice depending on the requirements of the actual application. Finally, features are extracted from each audio segment (acoustic feature extraction), including but not limited to Mel-frequency cepstral coefficients (MFCC) or the Mel spectrum; the acoustic features extracted from each segment form the corresponding audio feature sequence.
Each audio feature sequence serves as input to the cabin sound event recognition model, which identifies the corresponding cabin sound event; these events include, but are not limited to, cockpit alarm tones and key events extracted from pilot conversations, such as overspeed, climbing too high, descending too low, engine warnings, and collisions with objects. For key events (cabin sound events), the cabin sound event recognition model outputs a specific key value (label number); for non-key events, the model outputs None.
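The slicing with start/end bookkeeping can be sketched as below; a 1 s window at 16 kHz is assumed for illustration. In practice, the MFCCs for each slice would then be computed with a library such as librosa (`librosa.feature.mfcc`), which is omitted here to keep the sketch dependency-free:

```python
import numpy as np

def slice_audio(signal, sr, window_s=1.0, hop_s=1.0):
    """Split audio into windows, keeping (start_time_s, end_time_s, samples)."""
    win, hop = int(window_s * sr), int(hop_s * sr)
    slices = []
    for start in range(0, len(signal) - win + 1, hop):
        slices.append((start / sr, (start + win) / sr, signal[start:start + win]))
    return slices

audio = np.zeros(16000 * 3)            # 3 s of audio at 16 kHz (silence as a stand-in)
segments = slice_audio(audio, sr=16000)
# Three non-overlapping 1 s segments with recorded boundaries: (0,1), (1,2), (2,3).
# Setting hop_s < window_s instead yields the overlapping, sliding-window variant.
```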
在一个可选的实施例中,所述舱音事件识别模型由以下步骤获取:In an optional embodiment, the cabin sound event recognition model is obtained by the following steps:
obtaining N cabin sound sample data items, where N ≥ 1;
performing event labeling on all the cabin sound sample data, and dividing all the event-labeled cabin sound sample data according to the preset time window to obtain a number of audio sample segments and the corresponding event labels;
extracting features from each audio sample segment to obtain the corresponding sample feature sequence;
creating an initial recognition model based on a long short-term memory network architecture; using all the sample feature sequences and corresponding event labels as training data, and training the initial recognition model to obtain the cabin sound event recognition model.
In a specific implementation, the training data is obtained as follows. First, a large amount of cabin sound sample data is collected from the CVR; these samples typically involve cockpit conversations (including speaker identity, conversation content, tone, and emotion) and cockpit alarm sounds. The collected samples are then transcribed into corresponding text samples, either by professional transcriptionists or automatically via a speech recognition algorithm. The transcribed text samples should contain detailed descriptions of all conversations, alarm sounds, and other relevant sounds. Next, each text sample is annotated in detail: the key events (cabin sound events) and their specific occurrence times are marked, and the cabin sound events occurring in each audio segment are also annotated in the corresponding cabin sound sample data (with event labels attached), thereby completing the event annotation of all cabin sound sample data. Event annotation is a specialized, meticulous, and time-consuming process, but it is crucial for building an efficient and accurate machine learning model, as it greatly improves the model's learning efficiency and prediction accuracy.
It should be noted that, before transcribing the cabin sound sample data into text samples, this embodiment of the invention also applies preprocessing operations to each sample, such as noise reduction and signal enhancement. After all cabin sound samples have been event-labeled, the labeled data is divided according to the preset time window to obtain a number of audio sample segments and corresponding event labels; features are extracted from each audio sample segment to obtain the corresponding sample feature sequence. Finally, all sample feature sequences and corresponding event labels serve as the training data for the initial recognition model.
This embodiment of the invention trains the model with supervised learning: the sample feature sequences serve as model inputs, and the corresponding event labels as model outputs. The process is described below, from model architecture through model evaluation:
Model architecture: The initial recognition model uses a long short-term memory (LSTM) network as its core architecture. This architecture includes a deep LSTM network (i.e., multiple stacked LSTM layers) to effectively capture complex dependencies in time-series data. To prevent overfitting, dropout layers are introduced for regularization; randomly dropping connections in the network strengthens the model's generalization ability. The deep LSTM network may in particular be a bidirectional LSTM, which processes past and future context simultaneously and improves model performance.
Training strategy: The Adam optimizer is used; its automatic learning-rate adjustment accelerates model convergence. A cross-entropy loss function (suitable for classification problems) is adopted so that the model can identify and classify the key events (cabin sound events) in CVR recordings. An early-stopping strategy is implemented: training stops when the model's performance on the validation set no longer improves, avoiding overfitting. In addition, data augmentation (such as adjusting audio speed and volume) is used to improve the model's robustness and generalization ability.
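The early-stopping rule mentioned above can be sketched as a simple patience check. This is an illustrative sketch, not the patent's implementation; the patience and tolerance values are hypothetical:

```python
def should_stop(val_losses, patience=3, min_delta=1e-4):
    """Early-stopping check: return True when the validation loss has not
    improved by at least min_delta for `patience` consecutive epochs."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])   # best loss before the window
    recent_best = min(val_losses[-patience:])   # best loss inside the window
    return recent_best > best_before - min_delta

# Validation loss plateaus after epoch 3 -> stop once 3 stale epochs accrue
history = [0.90, 0.70, 0.55, 0.55, 0.56, 0.55]
print(should_stop(history))  # True: no improvement in the last 3 epochs
```

In practice the check would run once per epoch inside the training loop, restoring the model weights from the best validation epoch when it fires.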
Evaluation and optimization: Multi-dimensional metrics such as accuracy, recall, and F1 score are used to comprehensively evaluate model performance. Cross-validation is applied to ensure the reliability of the evaluation results and the stability of the model. Based on the evaluation feedback, the LSTM network structure, training process, and regularization strategy are continuously adjusted so that the model achieves optimal performance.
Validation and testing: The dataset is divided into a training set, a validation set, and a test set. The training set is used to train the model, the validation set to tune the model's hyperparameters and prevent overfitting, and the test set to evaluate the model's final performance.
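The three-way split described above can be sketched as follows; the 70/15/15 fractions and fixed seed are hypothetical choices for the example, not values stated in the source:

```python
import random

def split_dataset(samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle once, then carve out validation and test sets;
    the remainder is the training set used to fit the model."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

A fixed seed matters here: the same split must be reused across hyperparameter-tuning runs, otherwise validation scores are not comparable.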
In this way, a cabin sound event recognition model is obtained that both accurately predicts key events (cabin sound events) in cabin sound data and generalizes well. Once trained, the model can be applied to new cabin sound data to automatically identify the various key events (cabin sound events) it contains.
In addition, the cabin sound event recognition model can be deployed in a concrete test scenario to verify its overall behavior, for example by uploading it to a server and configuring the necessary runtime environment there. After the model is loaded on the server, it is first verified that it loads correctly. Preliminary functional tests follow, checking that data input and output behave normally, that the model runs properly, and that it does not crash during operation. At the same time, its running speed and resource usage are evaluated to determine whether it meets the performance requirements of an actual deployment. The cabin sound event recognition model of the present invention can be deployed directly on the server: each time new cabin sound data is about to overwrite old cabin sound data, the old data is first uploaded to the model on the server, which automatically identifies all cabin sound events occurring in it and saves them on the server as the corresponding second event time series.
In this way, an accident investigation is no longer limited to the last two hours of cabin sound data; cabin sound event information covering the same duration as the FDR data becomes available, allowing 25 hours of cabin sound events and flight events to be time-aligned. This provides more comprehensive data support for accident analysis and further improves the accuracy of accident investigations.
In an optional embodiment, performing dynamic time warping on the first event time subsequence and the second event time series, and mapping the time information of all the flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events, includes:
constructing a cumulative distance matrix based on the first event time subsequence and the second event time series;
performing path backtracking in the cumulative distance matrix to obtain the shortest warping path;
based on the shortest warping path, mapping the time information of all the flight events in the first event time subsequence onto the second event time series to obtain the times corresponding to all cabin sound events.
Specifically, after the first event time subsequence and the second event time series are obtained, a dynamic time warping algorithm automatically aligns the key events in the CVR data (the second event time series) with the key events in the FDR data (the first event time subsequence), thereby precisely determining the time point of each key event (cabin sound event) in the CVR. The data elements contained in the first event time subsequence and the second event time series are the key values (label numbers) of the events.
The DTW algorithm measures and aligns the similarity of two time series. By nonlinearly stretching or compressing the sequences along the time axis, it finds a "warping" that maximizes the similarity (equivalently, minimizes the difference) between the warped sequences. In a specific implementation, the DTW algorithm first computes the cumulative distance matrix between the two event time series, then finds the shortest warping path through that matrix as the optimal alignment path. Finally, the two event time series are warped according to the optimal alignment path so that their elements are aligned on the time axis. That is, the shortest warping path yields the best alignment of the first event time subsequence and the second event time series on the time axis, so that the time information of all flight events in the first event time subsequence can be mapped onto the second event time series, giving the precise time of every cabin sound event.
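The two DTW steps just described (building the cumulative distance matrix, then backtracking the warping path) can be sketched with a toy implementation over event label sequences. This is illustrative only, not the patent's code; the event label numbers are hypothetical, and a simple 0/1 match cost stands in for whatever distance the real system uses:

```python
import numpy as np

def dtw_align(fdr_events, cvr_events):
    """Toy DTW over two label sequences: build the cumulative distance
    matrix D, then backtrack from its corner to recover the warping path
    as a list of (fdr_index, cvr_index) pairs."""
    n, m = len(fdr_events), len(cvr_events)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if fdr_events[i - 1] == cvr_events[j - 1] else 1.0
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Path backtracking from the bottom-right corner of the matrix.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# Hypothetical labels: 1 = overspeed, 2 = engine warning, 3 = altitude alert
fdr = [1, 2, 1, 3]   # FDR records both overspeed events
cvr = [1, 2, 3]      # CVR captured only one of them
print(dtw_align(fdr, cvr))
```

Each (fdr_index, cvr_index) pair on the path ties a CVR event to an FDR event, whose FDR timestamp can then be assigned to the cabin sound event; pairs where one CVR element absorbs several FDR elements flag the kind of missing-event case discussed below.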
In addition, after the key events in the CVR data (the second event time series) are automatically aligned with the key events in the FDR data (the first event time subsequence), it is possible to determine whether key events are missing from the CVR data, and to infer when the missing key events occurred and what they were.
During a flight, a given key event may occur multiple times, while the CVR may not record the event the same number of times: the CVR records sound events in the cockpit, so its record count may depend on whether the pilot verbally acknowledged the event and whether a corresponding warning sounded in the cockpit. For example, if an overspeed event occurs twice during a flight, the FDR data will contain two overspeed events, while the CVR may record only one, perhaps because only one overspeed was accompanied by an audible warning or a verbal acknowledgment by the pilot. In that case, the CVR data alone cannot determine which FDR overspeed event the recorded overspeed (cabin sound event) corresponds to. However, by automatically aligning the key events in the CVR data (the second event time series) with those in the FDR data (the first event time subsequence), even when the events are offset in time, it is still possible after alignment to identify precisely which FDR event the overspeed recorded in the CVR corresponds to, to determine whether key events are missing from the CVR data, and to infer when the missing key events occurred.
Refer to FIG. 2, a schematic structural diagram of an embodiment of the cabin sound event time alignment device provided by the present invention.
An embodiment of the second aspect of the present invention provides a cabin sound event time alignment device for implementing the cabin sound event time alignment method described in any embodiment of the first aspect, the device comprising:
a data acquisition module 11, configured to acquire the cabin sound data and flight data to be aligned;
a flight event marking module 12, configured to mark all flight events in the flight data and generate a first event time series;
a cabin sound event recognition module 13, configured to slice the cabin sound data and extract features based on a preset time window, obtaining a number of audio feature sequences; to input each audio feature sequence into the cabin sound event recognition model to obtain a corresponding recognition result; and to integrate all the recognition results into a second event time series;
a sequence interception module 14, configured to intercept the first event time series based on the time range of the second event time series, obtaining a first event time subsequence;
a sequence alignment module 15, configured to perform dynamic time warping on the first event time subsequence and the second event time series, and to map the time information of all the flight events in the first event time subsequence onto the second event time series, obtaining the times corresponding to all cabin sound events.
It should be noted that the cabin sound event time alignment device provided by this embodiment of the second aspect can implement the entire flow of the cabin sound event time alignment method described in any embodiment of the first aspect. The functions of each module and unit in the device, and the technical effects achieved, correspond to those of the method described above and are not repeated here.
An embodiment of the third aspect of the present invention provides a computer-readable storage medium comprising a stored computer program, wherein the computer program, when running, controls the device on which the computer-readable storage medium resides to execute the cabin sound event time alignment method described in any embodiment of the first aspect.
An embodiment of the fourth aspect of the present invention provides a computer program product comprising a computer program which, when executed by a processor, implements the cabin sound event time alignment method described in any embodiment of the first aspect.
Refer to FIG. 3, a schematic structural diagram of an embodiment of an electronic device provided by the present invention.
An embodiment of the fifth aspect of the present invention provides an electronic device comprising a processor 21, a memory 22, and a computer program stored in the memory 22 and configured to be executed by the processor 21, wherein the processor 21 implements the cabin sound event time alignment method described in any embodiment of the first aspect when executing the computer program.
Preferably, the computer program can be divided into one or more modules/units (such as computer program 1, computer program 2, ...), which are stored in the memory 22 and executed by the processor 21 to carry out the present invention. The one or more modules/units can be a series of computer program instruction segments capable of performing specific functions, the segments describing the execution of the computer program in the electronic device.
The processor 21 can be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor can be a microprocessor, or the processor 21 can be any conventional processor. The processor 21 is the control center of the electronic device and connects the various parts of the electronic device through various interfaces and lines.
The memory 22 mainly includes a program storage area and a data storage area, where the program storage area can store the operating system and the applications required for at least one function, and the data storage area can store related data. In addition, the memory 22 can be a high-speed random-access memory, or a non-volatile memory such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card; alternatively, the memory 22 can be another volatile solid-state storage device.
It should be noted that the above electronic device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the schematic diagram shown in FIG. 3 is merely a structural example of the electronic device and does not limit its structure; the electronic device may include more or fewer components than shown, combine certain components, or use different components.
The above is only a preferred embodiment of the present invention. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications without departing from the technical principles of the present invention, and these improvements and modifications should also be regarded as falling within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411113493.XA CN118865758B (en) | 2024-08-14 | 2024-08-14 | Method, apparatus, medium, program product and device for cabin audio event time alignment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118865758A true CN118865758A (en) | 2024-10-29 |
CN118865758B CN118865758B (en) | 2025-04-08 |
Family
ID=93179131
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411113493.XA Active CN118865758B (en) | 2024-08-14 | 2024-08-14 | Method, apparatus, medium, program product and device for cabin audio event time alignment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118865758B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007035183A2 (en) * | 2005-04-13 | 2007-03-29 | Pixel Instruments, Corp. | Method, system, and program product for measuring audio video synchronization independent of speaker characteristics |
US20110230987A1 (en) * | 2010-03-11 | 2011-09-22 | Telefonica, S.A. | Real-Time Music to Music-Video Synchronization Method and System |
US20140195139A1 (en) * | 2013-01-08 | 2014-07-10 | The Mitre Corporation | Audio Monitor and Event-Conflict Signaling System |
US20180063106A1 (en) * | 2016-08-25 | 2018-03-01 | International Business Machines Corporation | User authentication using audiovisual synchrony detection |
US10854092B1 (en) * | 2019-09-20 | 2020-12-01 | Honeywell International Inc. | Method and system to improve the situational awareness of all aerodrome ground operations including all turnaround airport collaborative decision making (A-CDM) milestones in the cockpit |
US20210035453A1 (en) * | 2019-08-01 | 2021-02-04 | Honeywell International Inc. | Systems and methods to utilize flight monitoring data |
CN114282811A (en) * | 2021-12-24 | 2022-04-05 | 中国民航科学技术研究院 | Standardized business jet flight risk monitoring system and method based on cross-model SOPs |
US20240020525A1 (en) * | 2022-07-13 | 2024-01-18 | Robert Bosch Gmbh | Systems and methods for automatic alignment between audio recordings and labels extracted from a multitude of asynchronous sensors in urban settings |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |