CN213694055U - Voice acquisition equipment - Google Patents
Voice acquisition equipment
- Publication number
- CN213694055U (application number CN202023183752.6U)
- Authority
- CN
- China
- Prior art keywords
- voice
- audio
- display
- entries
- entry
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- User Interface Of Digital Computer (AREA)
Abstract
The utility model discloses a voice acquisition device that addresses the heavy workload and low efficiency of manually collecting and labeling audio. An input device records the speaker's personal information in advance; a display shows the entries to be recorded and the display mode of the entries; an audio collector captures the speech the speaker utters for the entry shown on the display, using a set sampling frequency, sampling bit depth, and number of channels; an audio processor recognizes the speech captured by the audio collector and compares the recognized speech with the entry shown on the display; and a memory automatically stores the audio files transmitted by the audio processor and names them in the form entry-personal information. The device improves the efficiency of voice collection and labeling, reduces manual work, and saves time and cost.
Description
Technical Field
The utility model belongs to the technical field of voice acquisition, and in particular relates to a voice acquisition device.
Background
Sound is a wave produced by the vibration of an object. When an object vibrates, the surrounding air is repeatedly compressed and rarefied, and the disturbance spreads outward as a sound wave. The range of frequencies a person can hear is 20 Hz to 20 kHz. The three elements of audible sound are loudness, pitch, and timbre. Loudness is the strength of the sound and depends on the amplitude of the sound wave; pitch depends on the frequency of the sound, with a high frequency giving a high pitch and a low frequency a low pitch; timbre is determined by the overtones mixed with the fundamental tone. Each fundamental tone has its own natural frequency and overtones of differing strength, so each sound has a distinctive timbre.
Audio technologies include audio acquisition (analog-to-digital conversion into digital signals that a computer can process), speech encoding and decoding, text-to-speech conversion, music synthesis, speech recognition and understanding, audio data transmission, audio-video synchronization, audio effects and editing, and so on. Two methods are commonly used for computer speech output: recording and playback, and text-to-speech conversion.
There are three common ways of collecting audio data: obtaining existing audio directly, capturing and clipping sound with audio processing software, and recording sound with a microphone.
For microphone recording, one common approach at present is for a person to read entries from a sheet of paper and, after pronouncing each entry, to save and name the audio, which is extremely inefficient. The other is to pronounce all of the entries in one pass and then cut and label the audio manually. Both approaches require a great deal of labor and time, are inefficient, and cannot meet users' needs.
Summary of the Utility Model
The purpose of the utility model is to provide a voice acquisition device that solves the heavy workload and low efficiency of collecting and labeling audio manually.
To solve the above problem, the technical solution of the utility model is as follows:
A voice acquisition device, comprising:
an input device for recording the speaker's personal information in advance, the personal information comprising gender, age, and region;
a display for displaying the entries to be recorded and the display mode of the entries;
an audio collector for capturing, at a set sampling frequency, sampling bit depth, and number of channels, the speech uttered by the speaker for the entry shown on the display;
an audio processor for recognizing the speech captured by the audio collector and comparing the recognized speech with the entry shown on the display; and
a memory for automatically storing the audio files transmitted by the audio processor and naming the audio files in the form entry-personal information.
According to an embodiment of the utility model, the input device is a touch screen with an input method or with personal-information options.
According to an embodiment of the utility model, the input device is a keyboard.
According to an embodiment of the utility model, the display is provided with a data input interface for importing the entries to be recorded.
According to an embodiment of the utility model, the display is provided with an entry list selection key and a display mode selection key.
According to an embodiment of the utility model, the audio processor comprises a pause detection unit and an entry comparison unit;
the pause detection unit detects whether the speech captured by the audio collector has paused for a preset duration and, if so, stops the capture and performs speech recognition;
the entry comparison unit compares the speech recognized following pause detection with the entry shown on the display and judges whether they are consistent; if so, the audio is labeled and transmitted to the memory; if not, the speech is discarded.
By adopting the above technical solution, the utility model has the following advantages and positive effects compared with the prior art:
1) To address the heavy workload and low efficiency of manually collecting and labeling audio, the voice acquisition device in an embodiment of the utility model records the speaker's personal information in advance through the input device; the display shows the entries to be recorded and the display mode of the entries; the audio collector captures the speech the speaker utters for the entry shown on the display, using the set sampling frequency, sampling bit depth, and number of channels; the audio processor recognizes the speech captured by the audio collector and compares the recognized speech with the entry shown on the display; and the memory automatically stores the audio files transmitted by the audio processor and names them in the form entry-personal information. This improves the efficiency of voice collection and labeling, reduces manual work, and saves time and cost.
2) In the voice acquisition device of an embodiment of the utility model, the memory names audio files automatically according to the speaker's personal information, so the files need not be named manually one by one, which greatly reduces labor cost and also makes later screening of the audio files easier.
Drawings
Fig. 1 is a schematic diagram of a voice acquisition device in an embodiment of the utility model.
Description of reference numerals:
1: input device; 2: display; 3: audio collector; 4: audio processor; 5: memory.
Detailed Description
The voice acquisition device of the utility model is described in further detail below with reference to the accompanying drawing and specific embodiments. The advantages and features of the utility model will become clearer from the following description and the claims.
This embodiment provides a voice acquisition device. Referring to Fig. 1, the voice acquisition device includes the components described in the following paragraphs.
The input device 1 records the speaker's personal information in advance; the personal information includes name, gender, age, and region. The personal information is later used to name the audio files, which makes subsequent screening or retrieval of the audio files easier. In practice, the input device 1 may be a touch screen with an input method or with personal-information options, or it may be a keyboard.
The display 2 displays the entries to be recorded and the display mode of the entries, prompting the user with the entry that needs to be recorded. The display 2 is provided with a data input interface for importing the entries to be recorded, and with an entry list selection key and a display mode selection key. Entries are imported through the data input interface; the user selects the entries to be recorded with the entry list selection key and uses the display mode selection key to choose between sequential and random display. In addition, function keys for adding, deleting, searching, or modifying entries can be provided on the display 2 as needed.
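The two display modes can be illustrated with a short Python sketch. The function name `order_entries`, the mode strings, and the optional `seed` parameter are illustrative assumptions, not part of the patent, which only states that entries are shown in sequence or at random.

```python
import random
from typing import List, Optional


def order_entries(entries: List[str], mode: str = "sequential",
                  seed: Optional[int] = None) -> List[str]:
    """Return the entries in the order the display would present them.

    mode is "sequential" (imported order) or "random" (shuffled order).
    The seed is a hypothetical convenience for reproducible shuffles.
    """
    if mode == "sequential":
        return list(entries)
    if mode == "random":
        shuffled = list(entries)          # copy so the imported list is untouched
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown display mode: {mode!r}")
```

Copying the list before shuffling keeps the imported entry list intact, so the user could switch back to sequential display without re-importing.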
The audio collector 3 captures, at the set sampling frequency, sampling bit depth, and number of channels, the speech uttered by the speaker for the entry shown on the display 2. The sampling frequency of the audio collector 3 can be set to 16 kHz, the sampling bit depth to 16 bits (high-fidelity sound quality), and the number of channels to one (mono). The sampling frequency, bit depth, and number of channels may of course be set to other values as needed.
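As a rough illustration of these three capture parameters, the following Python sketch writes 16 kHz, 16-bit, mono PCM into a WAV container using the standard-library `wave` module. The WAV container, the helper names, and the synthetic sine tone that stands in for microphone input are all assumptions; the patent does not specify a file format or capture API.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 16_000     # sampling frequency from the embodiment
SAMPLE_WIDTH_BYTES = 2      # 16-bit samples
NUM_CHANNELS = 1            # mono


def write_wav(pcm_frames: bytes, target) -> None:
    """Write raw little-endian PCM to a path or a writable binary file object."""
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(NUM_CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm_frames)


def sine_tone(freq_hz: float, seconds: float) -> bytes:
    """Synthetic stand-in for microphone input: a sine tone as 16-bit PCM."""
    n = int(SAMPLE_RATE_HZ * seconds)
    samples = (int(12000 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ))
               for i in range(n))
    return struct.pack(f"<{n}h", *samples)


def wav_bytes(pcm_frames: bytes) -> bytes:
    """Return a complete in-memory WAV file for the given PCM data."""
    buffer = io.BytesIO()
    write_wav(pcm_frames, buffer)
    return buffer.getvalue()
```

With these settings, uncompressed audio occupies 16,000 samples/s × 2 bytes × 1 channel = 32,000 bytes per second.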
The audio processor 4 recognizes the speech captured by the audio collector 3 and compares the recognized speech with the entry shown on the display 2. The audio processor 4 comprises a pause detection unit and an entry comparison unit. The pause detection unit detects whether the speech captured by the audio collector 3 has paused for a preset duration and, if so, stops the capture and performs speech recognition. It decides whether a frame is silent (a pause) from the energy of each frame of speech data: a frame whose energy is relatively small is judged to be silent. Once the silence has lasted for the preset duration (for example, 2 s), voice capture stops and recognition of the captured speech begins.
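The energy-based pause detection described above might look like the following Python sketch. The 20 ms frame length, the mean-square energy measure, and the threshold value are assumptions chosen for illustration; the patent only says that low-energy frames count as silence and gives 2 s as an example duration.

```python
import struct
from typing import Iterable, Optional

SAMPLE_RATE_HZ = 16_000
FRAME_MS = 20                                      # assumed frame length
FRAME_SAMPLES = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 320 samples per frame


def frame_energy(frame: bytes) -> float:
    """Mean squared amplitude of one frame of 16-bit little-endian mono PCM."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return sum(s * s for s in samples) / n


def find_pause(frames: Iterable[bytes],
               energy_threshold: float = 1.0e5,    # assumed, needs tuning
               pause_seconds: float = 2.0) -> Optional[int]:
    """Return the index of the frame at which a run of consecutive silent
    frames first reaches pause_seconds, or None if no such pause occurs."""
    frames_needed = int(pause_seconds * 1000 / FRAME_MS)
    silent_run = 0
    for index, frame in enumerate(frames):
        if frame_energy(frame) < energy_threshold:
            silent_run += 1
            if silent_run >= frames_needed:
                return index          # capture would stop here
        else:
            silent_run = 0            # speech resumed, restart the count
    return None
```

In this simplified form, silence before the speaker starts talking also counts toward the pause; a practical device would probably wait for the first voiced frame before starting the count.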
The entry comparison unit compares the speech recognized following pause detection with the entry shown on the display 2 and judges whether they are consistent. If they are, the audio is labeled and transmitted to the memory 5, and the display 2 shows the next entry in sequence; if not, the speech is discarded and the display 2 shows the same entry again.
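A minimal Python sketch of this accept-or-discard decision follows. It assumes the recognizer returns plain text, and the normalization step (lower-casing and dropping spaces and punctuation) is an added assumption; the patent only requires that the recognized speech be "consistent" with the displayed entry.

```python
from typing import Optional, Tuple


def normalize(text: str) -> str:
    """Lower-case and keep only letters and digits, so that spacing and
    punctuation differences do not cause a mismatch (an assumed rule)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def matches_entry(recognized_text: str, displayed_entry: str) -> bool:
    """True when the recognized text is consistent with the displayed entry."""
    return normalize(recognized_text) == normalize(displayed_entry)


def label_or_discard(recognized_text: str, displayed_entry: str,
                     audio: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (label, audio) for the memory on a match, or None to discard."""
    if matches_entry(recognized_text, displayed_entry):
        return (displayed_entry, audio)
    return None
```

Returning `None` on a mismatch lets the caller decide to show the same entry again, as the embodiment describes.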
The memory 5 automatically stores the audio files transmitted by the audio processor 4 and names them in the form entry-personal information. For example, if the speaker is male, 18 years old, and from Hangzhou, Zhejiang, and the entry being recorded is "turn on the lighting" (dakaizhaoming), the audio file is named dakaizhaoming-Y18-zhe A-X. In the gender field, X denotes female and Y denotes male; the number is the age; zhe A denotes Hangzhou, Zhejiang; and the final X is the number of times the entry has been recorded.
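The naming rule in this example can be sketched in Python as below. The `SpeakerInfo` class, the field names, and the `.wav` extension are illustrative assumptions; the patent gives only the pattern entry, gender code plus age, region code, and recording count, joined by hyphens.

```python
from dataclasses import dataclass

# Gender codes as given in the embodiment: X for female, Y for male.
GENDER_CODES = {"female": "X", "male": "Y"}


@dataclass(frozen=True)
class SpeakerInfo:
    gender: str        # "female" or "male"
    age: int
    region_code: str   # e.g. "zhe A" for Hangzhou, Zhejiang


def audio_file_name(entry_key: str, speaker: SpeakerInfo, take: int,
                    extension: str = ".wav") -> str:
    """Build a name of the form entry-<gender code><age>-<region>-<take>.

    The extension is an assumption; the source example shows none.
    """
    gender_code = GENDER_CODES[speaker.gender]
    return (f"{entry_key}-{gender_code}{speaker.age}-"
            f"{speaker.region_code}-{take}{extension}")
```

For the speaker in the example above, `audio_file_name("dakaizhaoming", SpeakerInfo("male", 18, "zhe A"), 1)` yields `dakaizhaoming-Y18-zhe A-1.wav`.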
The operation of the voice acquisition device is briefly described as follows:
First, the speaker's personal information, such as gender, age, and region, is entered. The speaker then reads aloud the entry shown on the display, while the audio collector and the audio processor capture and evaluate the speech, chiefly to determine whether the speaker has paused and whether the spoken entry matches the entry prompted on the display. When the audio processor determines that the speaker has paused, recording stops and speech recognition is run on the recording to judge whether its content is consistent with the prompted entry. If it is, the recording is saved and named, following the naming rule, according to the speaker's recorded gender, age, and region, and once it has been saved the display shows the next entry. If the recorded content is not consistent with the prompted entry, the audio is discarded and the display shows the same entry again. The audio processor also checks whether the entry on the display is the last one and, if so, ends the recording session.
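The overall flow can be summarized as a small Python control loop. The callables `capture`, `recognize`, and `store` are hypothetical stand-ins for the audio collector, the speech recognizer, and the memory, and `max_attempts` is an added assumption that keeps the sketch from looping forever; the patent itself simply shows the same entry again after a mismatch.

```python
from typing import Callable, Dict, List


def run_session(entries: List[str],
                capture: Callable[[str], bytes],
                recognize: Callable[[bytes], str],
                store: Callable[[str, int, bytes], None],
                max_attempts: int = 3) -> Dict[str, int]:
    """Walk through the entry list once and return the takes stored per entry.

    capture(entry) is assumed to block until the pause detector stops the
    recording, and to return the captured audio.
    """
    takes: Dict[str, int] = {}
    for entry in entries:                  # the display shows this entry
        for _ in range(max_attempts):
            audio = capture(entry)
            if recognize(audio) == entry:  # content matches the prompt
                takes[entry] = takes.get(entry, 0) + 1
                store(entry, takes[entry], audio)
                break                      # move on to the next entry
            # Mismatch: the audio is dropped and the same entry is shown again.
    return takes                           # the last entry ends the session
```

Passing the three device functions in as parameters keeps the loop testable without a microphone or a recognizer.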
On the one hand, the voice acquisition device in this embodiment enables efficient voice collection and greatly improves collection efficiency; on the other hand, the collected audio files are named automatically according to the speaker's personal information, so nobody has to name them one by one, which greatly reduces labor cost.
The embodiments of the utility model have been described in detail above with reference to the accompanying drawing, but the utility model is not limited to these embodiments. Changes made to the utility model remain within its scope of protection provided they fall within the scope of the claims and their equivalents.
Claims (6)
1. A voice acquisition device, comprising:
an input device for recording the speaker's personal information in advance, the personal information comprising gender, age, and region;
a display for displaying the entries to be recorded and the display mode of the entries;
an audio collector for capturing, at a set sampling frequency, sampling bit depth, and number of channels, the speech uttered by the speaker for the entry shown on the display;
an audio processor for recognizing the speech captured by the audio collector and comparing the recognized speech with the entry shown on the display; and
a memory for automatically storing the audio files transmitted by the audio processor and naming the audio files in the form entry-personal information.
2. The voice acquisition device of claim 1, wherein the input device is a touch screen with an input method or with personal-information options.
3. The voice acquisition device of claim 1, wherein the input device is a keyboard.
4. The voice acquisition device of claim 1, wherein the display is provided with a data input interface for importing the entries to be recorded.
5. The voice acquisition device of claim 4, wherein the display is provided with an entry list selection key and a display mode selection key.
6. The voice acquisition device of claim 1, wherein the audio processor comprises a pause detection unit and an entry comparison unit;
the pause detection unit detects whether the speech captured by the audio collector has paused for a preset duration and, if so, stops the capture and performs speech recognition;
the entry comparison unit compares the speech recognized following pause detection with the entry shown on the display and judges whether they are consistent; if so, the audio is labeled and transmitted to the memory; if not, the speech is discarded.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202023183752.6U CN213694055U (en) | 2020-12-25 | 2020-12-25 | Voice acquisition equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202023183752.6U CN213694055U (en) | 2020-12-25 | 2020-12-25 | Voice acquisition equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN213694055U (en) | 2021-07-13 |
Family
ID=76740731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202023183752.6U Active CN213694055U (en) | 2020-12-25 | 2020-12-25 | Voice acquisition equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN213694055U (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115527521A (en) * | 2022-08-29 | 2022-12-27 | 北京探境科技有限公司 | Voice data acquisition and recognition method and device |
Legal Events
Code | Title | Description |
---|---|---|
GR01 | Patent grant | |
CP03 | Change of name, title or address | Address after: Unit G4-202-059, Artificial Intelligence Industrial Park, No. 88 Jinjihu Avenue, Suzhou Industrial Park, Suzhou Area, China (Jiangsu) Pilot Free Trade Zone, Suzhou City, Jiangsu Province, 215124. Patentee after: Shencong Semiconductor (Jiangsu) Co.,Ltd. Address before: 200232 room 3712, 3 / F, 2879 Longteng Avenue, Xuhui District, Shanghai. Patentee before: Shanghai shencong Semiconductor Co.,Ltd. |