CN104281682A - File Classification System and Method - Google Patents
File Classification System and Method
- Publication number
- CN104281682A (application CN201410524658.2A)
- Authority
- CN
- China
- Prior art keywords
- audio
- audio frequency
- image file
- file
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Technical Field
The present invention relates to classification technology, and in particular to a file classification system and a file classification method.
Background
Audio and video recording technology has been under development for some time. Most of that work has gone into improving how sound or images are captured; comparatively little has addressed how finished recordings are sorted and stored.
In general, the audio files or video files produced by a recording are stored in the same location, and they are usually named with similar combinations of letters and/or sequentially incremented numbers. Unless the user renames a file, its content is hard to determine from the file name alone. Over time, once the files have piled up and the user has not organized them regularly, finding a specific file among them is no easy task.
Summary of the Invention
To solve the problem that files are not properly classified at recording time, one aspect of the present invention provides a file classification system comprising a storage device, a receiving device, and a processor. The storage device stores at least one identification audio, and the receiving device obtains an audio file or a video file. The processor compares a related audio associated with the audio file or video file against the at least one identification audio to produce a processing result, and then automatically classifies the audio file or video file according to that processing result.
Another aspect of the present invention provides a file classification method comprising the following steps: (A) storing at least one identification audio; (B) obtaining an audio file or a video file; (C) comparing a related audio associated with the audio file or video file against the at least one identification audio to produce a processing result, and then automatically classifying the audio file or video file according to that processing result.
In summary, the present invention sets out to improve how files are classified and provides a voice-based filing and identification mechanism that classifies files quickly and spares the user tedious operations.
The description above is elaborated in the embodiments that follow, which further explain the technical solution of the present invention.
Brief Description of the Drawings
To make the above and other objects, features, advantages, and embodiments of the present invention easier to understand, the accompanying drawings are described as follows:
FIG. 1 is a schematic diagram of a file classification system according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a file classification system according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a file classification system according to a third embodiment of the present invention;
FIG. 4 is a schematic diagram of a file classification system according to a fourth embodiment of the present invention; and
FIG. 5 is a flowchart of a file classification method according to a fifth embodiment of the present invention.
Detailed Description
To make the description of the present invention more detailed and complete, reference may be made to the drawings and the embodiments described below. The embodiments provided are not intended to limit the scope of the present invention, nor is the description of the steps intended to limit the order in which they are performed; any device with equivalent effect produced by recombining the elements falls within the scope of the present invention.
Refer to FIG. 1, which illustrates a file classification system 100 according to the first embodiment of the present invention. The file classification system 100 comprises a storage device 130, a receiving device 140, and a processor 120. In practice, the storage device 130 may be a hard disk, flash memory, or another storage medium. The receiving device 140 may be at least one transmission port, wired and/or wireless and digital and/or analog as actually required (e.g., HDMI, USB, 3.5 mm), connected to a built-in or external audio and/or video recording device. The processor 120 may be a central processing unit, a microcontroller, or other circuitry.
Before recording, a user (e.g., a speaker) may store a pre-recorded segment of speech in the storage device 130 as the identification audio 132. After recording is complete, the receiving device 140 receives the audio file or video file 110. Once the file classification system 100 prompts for it, or the speaker selects voice-based classification, the processor 120 compares a related audio associated with the audio file or video file 110 against the identification audio 132 to produce a processing result 122, which carries suggested classification information. The processor 120 then automatically classifies the audio file or video file 110 according to the processing result 122. For example, if the related audio matches the identification audio 132, the audio file or video file 110 is a recording of the speaker in person, so the processor 120 automatically files it under the category that the speaker defined in the storage device 130, sparing the user tedious manual operations.
To further explain how the identification audio 132 is recorded, refer to FIG. 2, which illustrates a file classification system 200 according to the second embodiment of the present invention. Apart from the added audio recording device 250, the hardware of the file classification system 200 of FIG. 2 is substantially the same as that of the file classification system 100 of FIG. 1. In practice, the audio recording device 250 may be a microphone or another sound pickup device. It may also, as actually required, be integrated into a single device with the recording device built into or externally connected to the receiving device 140.
Before the audio file or video file 110 is recorded, a user (e.g., a speaker) may use the audio recording device 250 built into the file classification system 200 to pre-record the identification audio 132 as a reference sample for voice identification, which is simple and convenient. After recording is complete, the processor 120 compares the related audio associated with the audio file or video file 110 against the identification audio 132 to produce the processing result 122, and then automatically classifies the audio file or video file 110 according to the processing result 122.
Various examples of the related audio associated with the audio file or video file 110 are described below with reference to FIG. 3 and FIG. 4. Refer first to FIG. 3, which illustrates a file classification system 300 according to the third embodiment of the present invention. Apart from the added audio capture device 360, the hardware of the file classification system 300 of FIG. 3 is substantially the same as that of the file classification system 200 of FIG. 2. In practice, the audio capture device 360 may be a sound card, an audio processing chip, or a similar component.
After the audio file or video file 110 has been recorded, the audio capture device 360 extracts a pending audio (pending audio signal) 362 from the audio file or video file 110 to serve as the related audio associated with that file. The processor 120 then compares the pending audio 362 against the identification audio 132 to produce the processing result 122, and automatically classifies the audio file or video file 110 according to the processing result 122.
As for the specific manner of classification, in one embodiment the speaker may, before recording, operate the file classification system 300 or an external computer device to define a dedicated category 334 (e.g., a folder in the Windows operating system). The storage device 130 records the category 334, and the processor 120 establishes an association between the path of the category 334 and the identification audio 132. After the audio file or video file 110 has been recorded, the processor 120 analyzes and compares the acoustic features of the pending audio 362 and of the identification audio 132. When the acoustic features of the pending audio 362 match those of the identification audio 132, the processor 120 files the audio file or video file 110 under the category 334.
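The association between a category path and an identification audio, and the match that triggers filing, can be illustrated with a minimal Python sketch. It is not the patented implementation: the feature vectors, the cosine-similarity measure, the 0.9 threshold, and the names `Category`, `cosine_similarity`, and `pick_category` are all assumptions introduced for illustration, since the embodiment does not specify how acoustic features are encoded or compared.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Sequence


@dataclass
class Category:
    """A user-defined category: a folder path tied to identification-audio features."""
    path: str
    id_features: Sequence[float]  # hypothetical acoustic feature vector


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_category(
    pending: Sequence[float],
    categories: Sequence[Category],
    threshold: float = 0.9,  # assumed cut-off for "matches"
) -> Optional[Category]:
    """Return the best-matching category at or above the threshold, else None."""
    best: Optional[Category] = None
    best_score = threshold
    for category in categories:
        score = cosine_similarity(pending, category.id_features)
        if score >= best_score:
            best, best_score = category, score
    return best
```

A caller would move or link the recorded file into `pick_category(...).path` when the result is not `None`, and leave the file unclassified otherwise.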
Alternatively, in another embodiment, after the audio file or video file 110 has been recorded, the processor 120 analyzes and compares the semantic features of the pending audio 362 and of the identification audio 132. When the semantic features of the pending audio 362 match those of the identification audio 132, the processor 120 files the audio file or video file 110 under the category 334.
In the voice identification mechanism of the file classification system 300, the acoustic features mentioned above, such as the proportion of human voice to background sound, can help decide whether to perform scene identification or voice identification. Scene identification infers surrounding objects and ongoing events from the nature of the background sound; voice identification uses timbre characteristics of the human voice, such as a voiceprint, as the basis for comparison. Semantic features can be obtained, for example, by recognizing keywords in the audio, or from commonly used phrases, names, and the like. The acoustic and semantic features are not limited to those listed; any acoustic or semantic feature that can serve as a basis for scene identification falls within the scope of the present invention.
The pending audio 362 is one example of the related audio associated with the audio file or video file 110. For another example, refer to FIG. 4, which illustrates a file classification system 400 according to the fourth embodiment of the present invention. The hardware architecture of the file classification system 400 of FIG. 4 is substantially the same as that of the file classification system 200 of FIG. 2.
After the audio file or video file 110 has been recorded, the speaker reads out a sentence as a supplementary audio 452, and the audio recording device 250 receives the supplementary audio 452 as the related audio associated with the audio file or video file 110. The processor 120 then compares the supplementary audio 452 against the identification audio 132 to produce the processing result 122, and automatically classifies the audio file or video file 110 according to the processing result 122.
As for the specific manner of classification, in one embodiment the speaker may, before recording, operate the file classification system 400 or an external computer device to define a dedicated category 434 (e.g., a folder in the Windows operating system). The storage device 130 records the category 434, and the processor 120 establishes an association between the path of the category 434 and the identification audio 132. After the audio file or video file 110 has been recorded, the processor 120 analyzes and compares the acoustic features of the supplementary audio 452 and of the identification audio 132. When the acoustic features of the supplementary audio 452 match those of the identification audio 132, the processor 120 files the audio file or video file 110 under the category 434.
Alternatively, in another embodiment, after the audio file or video file 110 has been recorded, the processor 120 analyzes and compares the semantic features of the supplementary audio 452 and of the identification audio 132. When the semantic features of the supplementary audio 452 match those of the identification audio 132, the processor 120 files the audio file or video file 110 under the category 434.
In the voice identification mechanism of the file classification system 400, acoustic features such as the proportion of human voice to background sound can help decide whether to perform scene identification or voice identification. Scene identification infers surrounding objects and ongoing events from the nature of the background sound; voice identification uses timbre characteristics of the human voice, such as a voiceprint, as the basis for comparison. Semantic features can be obtained, for example, by recognizing keywords in the audio, or from commonly used phrases, names, and the like. The acoustic and semantic features are not limited to those listed; any acoustic or semantic feature that can serve as a basis for scene identification falls within the scope of the present invention.
FIG. 5 is a flowchart of a file classification method 500 according to the fifth embodiment of the present invention. The file classification method 500 may be implemented by a computer system, such as the file classification systems 100, 200, 300, and 400 described above. Some of its functions may also be implemented as at least one computer program stored on a computer-readable recording medium; the program has a plurality of instructions that, when executed on a computer, cause the computer to perform the file classification method 500.
As shown in FIG. 5, the file classification method 500 comprises steps S502 to S506. Those skilled in the art will understand that, unless their order is specifically stated, the steps mentioned in this embodiment may be reordered as actually required and may even be performed simultaneously or partly simultaneously. The hardware that carries out these steps has been disclosed in the embodiments above and is not described again here.
First, before recording, step S502 pre-records and stores an identification audio as a reference sample for voice identification; the identification audio may be a segment of speech by a user (e.g., a speaker). After recording is complete, step S504 receives the audio file or video file, at which point the system may prompt for, or the speaker may select, voice-based classification. Step S506 then compares a related audio associated with the audio file or video file against the identification audio to produce a processing result, and automatically classifies the audio file or video file according to that result. In this way, the voice-based filing and identification mechanism of the file classification method 500 classifies files quickly and spares the user tedious manual operations.
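The flow of steps S502 to S506 can be outlined in Python as follows. This is only a sketch under stated assumptions: audio is stood in for by plain strings, the comparison is injected as a caller-supplied function, and the class name `FileClassifier` and its methods are hypothetical names that do not appear in the original disclosure.

```python
from typing import Callable, Dict, Optional


class FileClassifier:
    """Illustrative outline of steps S502 to S506; all names are hypothetical."""

    def __init__(self, matches: Callable[[str, str], bool]) -> None:
        # `matches(related_audio, identification_audio)` stands in for the
        # acoustic or semantic comparison, which the method leaves open.
        self._matches = matches
        self._id_audio: Dict[str, str] = {}    # category -> identification audio
        self.classified: Dict[str, str] = {}   # file name -> category

    def store_identification_audio(self, category: str, id_audio: str) -> None:
        """S502: store a pre-recorded reference sample for a category."""
        self._id_audio[category] = id_audio

    def classify(self, file_name: str, related_audio: str) -> Optional[str]:
        """S504 and S506: receive a file, compare its related audio, then file it."""
        for category, id_audio in self._id_audio.items():
            if self._matches(related_audio, id_audio):
                self.classified[file_name] = category
                return category
        return None  # no identification audio matched; the file stays unclassified
```

Passing the comparison in as a function keeps the outline neutral between the acoustic and semantic variants of the method.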
Specifically, before recording, the speaker may operate the computer system to define a dedicated category (e.g., a folder in the Windows operating system), and step S502 establishes an association between the path of that category and the identification audio. In step S506, if the related audio associated with the audio file or video file matches the identification audio, the file is a recording of the speaker in person, so the processor 120 automatically files the audio file or video file 110 under the category that the speaker defined in the storage device 130.
The related audio can be implemented in at least two ways. In the first way, in one embodiment, step S506 extracts a pending audio from the audio file or video file to serve as the related audio, then analyzes and compares the acoustic features of the pending audio and of the at least one identification audio. When the acoustic features of the pending audio match those of the at least one identification audio, the audio file or video file is filed under the category.
Alternatively, in another embodiment, step S506 analyzes and compares the semantic features of the pending audio and of the at least one identification audio. When the semantic features of the pending audio match those of the at least one identification audio, the audio file or video file is filed under the category.
In the second way of implementing the related audio, in one embodiment, step S506 receives a supplementary audio to serve as the related audio; the supplementary audio may be a sentence that the speaker reads out after recording is complete. Step S506 then analyzes and compares the acoustic features of the at least one identification audio and of the supplementary audio. When the acoustic features of the at least one identification audio match those of the supplementary audio, the audio file or video file is filed under the category.
Alternatively, in another embodiment, step S506 analyzes and compares the semantic features of the at least one identification audio and of the supplementary audio. When the semantic features of the at least one identification audio match those of the supplementary audio, the audio file or video file is filed under the category.
When step S506 analyzes and compares acoustic features, it may use the audio's frequency, frequency spectrum, amplitude, phase, duration, voiceprint (voice print), or any combination of these, as well as the results of mathematical operations or of a time-domain to frequency-domain transform, as the basis for analysis and comparison; all of these fall within the scope of the present invention. Scene identification may recognize surrounding objects, for example the footsteps of different kinds of shoes, different vehicles, or the calls of different animals, and may also recognize events, for example wind, rain, doors opening and closing, or different types of music. Voice identification may distinguish individual speakers by pitch, accent, rhythm, volume, tone quality, and the like. Alternatively, the user may define how strong an acoustic match must be and the order in which the conditions are evaluated, and store these settings in the storage device. Any acoustic feature that can serve as a basis for scene identification or voice identification falls within the scope of the present invention.
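As one illustration of a time-domain to frequency-domain transform used as a comparison basis, the following Python sketch computes a magnitude spectrum with a naive discrete Fourier transform and reports the strongest frequency bin. The choice of a plain DFT, the use of the dominant bin as the compared feature, and the names `magnitude_spectrum`, `dominant_bin`, and `sine` are assumptions for illustration; the disclosure does not prescribe a particular transform or feature.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(samples: Sequence[float]) -> List[float]:
    """Naive DFT magnitudes for bins 0..n//2 (O(n^2); fine for short clips)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc))
    return spectrum


def dominant_bin(samples: Sequence[float]) -> int:
    """Index of the strongest non-DC frequency bin (needs at least 2 samples)."""
    spectrum = magnitude_spectrum(samples)
    return max(range(1, len(spectrum)), key=spectrum.__getitem__)


def sine(freq_bin: int, n: int = 64) -> List[float]:
    """Synthetic test tone that completes `freq_bin` cycles over n samples."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]
```

Two clips whose dominant bins agree would count as matching on this single feature; a practical system would likely combine several of the features listed above.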
When step S506 analyzes semantic features, one approach is to recognize keywords in the audio. For scene identification these may be, for example, the terminology of different sports or the expressions used on different occasions (speeches, weddings, graduation ceremonies, concerts, and so on); for voice identification they may be commonly used phrases, names, terms of relationship, and the like. Degrees of matching are illustrated as follows: hypernyms and hyponyms, synonyms, closely related concepts, translations of a word in different languages, and names or partial names in different languages may all be defined as a strong match. The user may define the strength of matching and the order in which the conditions are evaluated as required, and store these settings in the storage device.
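The keyword matching just described can be illustrated with a short Python sketch in which hypernym and hyponym pairs, synonyms, and translations are all treated as strong matches. The relation table `RELATED_TERMS`, its entries, and the function names are hypothetical; in the described method these relations and their strengths would be user-defined and kept in the storage device, and the keywords would come from a speech recognizer that is outside the scope of this sketch.

```python
from typing import Dict, Iterable, Set

# Hypothetical, user-editable relation table (all entries lowercase):
# each keyword maps to terms that count as a strong match for it.
RELATED_TERMS: Dict[str, Set[str]] = {
    "sport": {"basketball", "football"},       # hypernym -> hyponyms
    "wedding": {"marriage ceremony", "婚礼"},   # synonym and translation
}


def normalize(term: str) -> str:
    """Trim whitespace and lowercase so comparisons ignore case."""
    return term.strip().lower()


def terms_match(a: str, b: str, related: Dict[str, Set[str]] = RELATED_TERMS) -> bool:
    """True if the terms are equal or related in either direction."""
    a, b = normalize(a), normalize(b)
    if a == b:
        return True
    return b in related.get(a, set()) or a in related.get(b, set())


def semantic_match(
    id_keywords: Iterable[str],
    heard_keywords: Iterable[str],
    related: Dict[str, Set[str]] = RELATED_TERMS,
) -> bool:
    """True if any identification keyword matches any recognized keyword."""
    heard = list(heard_keywords)
    return any(terms_match(i, h, related) for i in id_keywords for h in heard)
```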
Step S506 may be configured either to classify automatically according to the processing result, or to suggest a category so that the user can confirm the processing result; the behavior can be adjusted flexibly to suit the use case. One example form of classification is filing into a folder, such as a folder in the Windows operating system.
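The two configurations of step S506, automatic filing and suggestion with user confirmation, can be sketched in Python as follows. The `Decision` record, the `decide` function, and the `confirm` callback are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    """Outcome for one file: the suggested category and whether it was applied."""
    file_name: str
    category: Optional[str]
    applied: bool


def decide(
    file_name: str,
    suggested_category: Optional[str],
    auto: bool = True,
    confirm: Optional[Callable[[str, str], bool]] = None,
) -> Decision:
    """Apply the suggestion directly (auto) or only after the user confirms it."""
    if suggested_category is None:
        return Decision(file_name, None, False)  # nothing matched
    if auto:
        return Decision(file_name, suggested_category, True)
    # Suggest-only mode: without a confirm callback the suggestion is not applied.
    accepted = bool(confirm and confirm(file_name, suggested_category))
    return Decision(file_name, suggested_category, accepted)
```

In suggest-only mode the `confirm` callback would typically show the proposed folder to the user and return their answer.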
In summary, through the embodiments above the present invention files each recording under an appropriate category as soon as it is made. This solves the problem that a specific file is hard to find quickly when recordings have gone unorganized for a long time or have grown numerous, and it removes the burden of classifying files by hand.
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit it. Anyone skilled in the art may make various changes and refinements without departing from the spirit and scope of the present invention, so the scope of protection of the present invention is defined by the claims.
Claims (16)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410524658.2A CN104281682A (en) | 2014-09-30 | 2014-09-30 | File Classification System and Method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410524658.2A CN104281682A (en) | 2014-09-30 | 2014-09-30 | File Classification System and Method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104281682A true CN104281682A (en) | 2015-01-14 |
Family
ID=52256555
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410524658.2A Pending CN104281682A (en) | 2014-09-30 | 2014-09-30 | File Classification System and Method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104281682A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104731979A (en) * | 2015-04-16 | 2015-06-24 | 广东欧珀移动通信有限公司 | A method and device for saving all exclusive information resources of a specific person |
CN104731927A (en) * | 2015-03-27 | 2015-06-24 | 努比亚技术有限公司 | Sound recording file classifying method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1619640A (en) * | 2003-11-21 | 2005-05-25 | 先锋株式会社 | Automatic musical composition classification device and method |
CN101149950A (en) * | 2007-11-15 | 2008-03-26 | 北京中星微电子有限公司 | Media player for implementing classified playing and classified playing method |
US20110238422A1 (en) * | 2010-03-29 | 2011-09-29 | Schaertel David M | Method for sonic document classification |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110557589B (en) | System and method for integrating recorded content | |
CN108305615B (en) | Object identification method and device, storage medium and terminal thereof | |
CN103035247B (en) | Based on the method and device that voiceprint is operated to audio/video file | |
WO2019148586A1 (en) | Method and device for speaker recognition during multi-person speech | |
US11693988B2 (en) | Use of ASR confidence to improve reliability of automatic audio redaction | |
US8983836B2 (en) | Captioning using socially derived acoustic profiles | |
CN108242238B (en) | A method and device for generating audio files, and terminal equipment | |
US20160179831A1 (en) | Systems and methods for textual content creation from sources of audio that contain speech | |
US20160283185A1 (en) | Semi-supervised speaker diarization | |
CN106710593B (en) | A method, terminal and server for adding account | |
CN109361825A (en) | Meeting summary recording method, terminal and computer storage medium | |
CN103165131A (en) | Voice processing system and voice processing method | |
JP5779032B2 (en) | Speaker classification apparatus, speaker classification method, and speaker classification program | |
TWI536366B (en) | Spoken vocabulary generation method and system for speech recognition and computer readable medium thereof | |
Khan et al. | A novel audio forensic data-set for digital multimedia forensics | |
CN108831456B (en) | Method, device and system for marking video through voice recognition | |
TWI413106B (en) | Electronic recording apparatus and method thereof | |
US20160093299A1 (en) | File classifying system and method | |
CN116959498A (en) | Music adding method, device, computer equipment and computer readable storage medium | |
Pandey et al. | Cell-phone identification from audio recordings using PSD of speech-free regions | |
CN104281682A (en) | File Classification System and Method | |
JP6344849B2 (en) | Video classifier learning device and program | |
CN114121023A (en) | Speaker separation method, speaker separation device, electronic equipment and computer readable storage medium | |
CN114203180A (en) | Conference summary generation method and device, electronic equipment and storage medium | |
JP5997813B2 (en) | Speaker classification apparatus, speaker classification method, and speaker classification program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20150114 |