
CN213694055U - Voice acquisition equipment - Google Patents

Info

Publication number
CN213694055U
Authority
CN
China
Prior art keywords
voice
audio
display
entries
entry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202023183752.6U
Other languages
Chinese (zh)
Inventor
邹凯文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shencong Semiconductor Jiangsu Co ltd
Original Assignee
Shanghai Shencong Semiconductor Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shencong Semiconductor Co ltd
Priority to CN202023183752.6U
Application granted
Publication of CN213694055U
Legal status: Active
Anticipated expiration

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The utility model discloses a voice acquisition device that addresses the heavy workload and low efficiency of collecting and labeling audio manually. An input device records the speaker's personal information in advance; a display shows the entries to be recorded and the display mode of the entries; an audio collector captures the speech uttered by the speaker for the entry shown on the display, using the set sampling frequency, sampling bit depth and number of channels; an audio processor recognizes the speech captured by the audio collector and compares the recognized speech with the entry shown on the display; and a memory automatically stores the audio files sent by the audio processor and names them in the form entry-personal information. This improves the efficiency of voice collection and labeling, reduces manual work, and saves time and cost.

Description

Voice acquisition equipment
Technical Field
The utility model belongs to the technical field of voice acquisition, and in particular relates to a voice acquisition device.
Background
Sound is a wave produced by the vibration of an object. When an object vibrates, the surrounding air is alternately compressed and rarefied, and this disturbance spreads outward as a sound wave. The frequency range audible to humans is 20 Hz to 20 kHz. The three elements of audible sound are intensity, pitch and timbre. Intensity is the loudness of the sound and depends on the amplitude of the sound wave; pitch is related to the frequency of the sound, with a high frequency giving a high pitch and a low frequency giving a low pitch; timbre is determined by the overtones mixed with the fundamental tone. Each fundamental tone has its own frequency and overtones of differing strength, so each sound has its own characteristic timbre.
Audio technologies include audio acquisition (analog-to-digital conversion into digital signals that a computer can process), speech encoding/decoding, text-to-speech conversion, music synthesis, speech recognition and understanding, audio data transmission, audio-video synchronization, audio effects and editing, and the like. Two methods are commonly used to implement computer speech output: recording/playback and text-to-speech conversion.
There are three common ways to collect audio data: obtaining existing audio directly, capturing sound with audio processing software, and recording sound with a microphone.
For microphone recording, the common practice today is for a person to read entries from a sheet of paper: pronounce one entry, then save and name one audio file, which is extremely inefficient. Alternatively, all entries are read in one pass and the audio is then cut and labeled by hand. Both approaches require a great deal of labor and time, are inefficient, and cannot meet practical needs.
Summary of the Utility Model
The purpose of the utility model is to provide a voice acquisition device that solves the problem of heavy workload and low efficiency in manually collecting and labeling audio.
To solve the above problem, the technical solution of the utility model is:
a voice acquisition device comprising:
an input device for recording the personal information of a speaker in advance, the personal information comprising gender, age and region;
a display for displaying the entries to be recorded and the display mode of the entries;
an audio collector for capturing the speech uttered by the speaker for the entry shown on the display, according to the set sampling frequency, the set sampling bit depth and the set number of channels;
an audio processor for recognizing the speech captured by the audio collector and comparing the recognized speech with the entry shown on the display; and
a memory for automatically storing the audio files sent by the audio processor and naming the audio files in the form entry-personal information.
According to an embodiment of the utility model, the input device is a touch screen provided with an input method or personal information options.
According to an embodiment of the utility model, the input device is a keyboard.
According to an embodiment of the utility model, the display is provided with a data input interface for importing the entries to be recorded.
According to an embodiment of the utility model, the display is provided with an entry list selection key and a display mode selection key.
According to an embodiment of the utility model, the audio processor includes a pause detection element and an entry comparison element;
the pause detection element detects whether the speech captured by the audio collector has paused for a preset duration and, if so, stops the capture and starts speech recognition;
the entry comparison element compares the speech recognized after the pause is detected with the entry shown on the display and judges whether they are consistent; if so, the audio is labeled and sent to the memory; if not, the speech is discarded.
By adopting the above technical solution, the utility model has the following advantages and positive effects compared with the prior art:
1) To address the heavy workload and low efficiency of manually collecting and labeling audio, the voice acquisition device in an embodiment of the utility model records the speaker's personal information in advance through the input device; the display shows the entries to be recorded and the display mode of the entries; the audio collector captures the speech uttered by the speaker for the entry shown on the display, using the set sampling frequency, sampling bit depth and number of channels; the audio processor recognizes the speech captured by the audio collector and compares the recognized speech with the entry shown on the display; and the memory automatically stores the audio files sent by the audio processor and names them in the form entry-personal information. This improves the efficiency of voice collection and labeling, reduces manual work, and saves time and cost.
2) In the voice acquisition device of an embodiment of the utility model, the memory names audio files automatically from the speaker's personal information, so the files do not have to be named manually one by one; this greatly reduces labor cost and makes later screening of the audio files easier.
Drawings
Fig. 1 is a schematic diagram of a voice collecting device in an embodiment of the present invention.
Description of reference numerals:
1: an input device; 2: a display; 3: an audio collector; 4: an audio processor; 5: a memory.
Detailed Description
The following describes a voice collecting device according to the present invention in further detail with reference to the accompanying drawings and specific embodiments. The advantages and features of the present invention will become more fully apparent from the following description and appended claims.
This embodiment provides a voice acquisition device. Referring to Fig. 1, the voice acquisition device includes:
An input device 1 for recording the personal information of the speaker in advance. The personal information includes name, gender, age and region. It is used later to name the audio files, which facilitates subsequent screening or retrieval of the audio files. In practice, the input device 1 may be a touch screen provided with an input method or personal information options, or may be a keyboard.
A display 2 for displaying the entries to be recorded and the display mode of the entries. The display 2 prompts the user with the entry to be recorded. The display 2 is provided with a data input interface for importing the entries to be recorded, and with an entry list selection key and a display mode selection key. Entries are imported through the data input interface; the user selects the entries to be recorded with the entry list selection key and chooses sequential or random display with the display mode selection key. In addition, function keys for adding, deleting, searching or modifying entries may be provided on the display 2 as needed.
An audio collector 3 for capturing the speech uttered by the speaker for the entry shown on the display 2, according to the set sampling frequency, the set sampling bit depth and the set number of channels. The sampling frequency of the audio collector 3 may be set to 16 kHz, the sampling bit depth to 16 bits (high-fidelity sound quality), and the number of channels to mono. The sampling frequency, bit depth and number of channels may of course be set to other values as needed.
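As a rough illustration of the settings named above (16 kHz, 16 bits, mono), the following sketch writes uncompressed PCM audio with Python's standard `wave` module and computes the resulting data rate. The helper names `write_wav` and `make_tone`, and the use of a WAV container at all, are assumptions made for illustration; the utility model does not specify a file format.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 16_000   # sampling frequency from the embodiment
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
NUM_CHANNELS = 1          # mono


def bytes_per_second(rate=SAMPLE_RATE_HZ, width=SAMPLE_WIDTH_BYTES,
                     channels=NUM_CHANNELS):
    """Uncompressed PCM data rate for the given settings."""
    return rate * width * channels


def write_wav(target, samples):
    """Write signed 16-bit samples to `target` (a path or a binary file object)."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(NUM_CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        # "<h" = little-endian signed 16-bit, the layout WAV uses for 16-bit PCM
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def make_tone(duration_s=0.5, freq_hz=440.0, amplitude=10_000):
    """Generate a sine tone as a stand-in for microphone input (illustrative)."""
    n = int(SAMPLE_RATE_HZ * duration_s)
    return [int(amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ))
            for i in range(n)]


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_wav(buffer, make_tone())
    print(bytes_per_second(), "bytes per second of audio")
```

At these settings one second of speech occupies 16,000 samples × 2 bytes × 1 channel = 32,000 bytes, which is why short entries produce small files.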
An audio processor 4 for recognizing the speech captured by the audio collector 3 and comparing the recognized speech with the entry shown on the display 2. The audio processor 4 includes a pause detection element and an entry comparison element. The pause detection element detects whether the speech captured by the audio collector 3 has paused for a preset duration and, if so, stops the capture and starts speech recognition. It judges silence (a pause) from the energy of each frame of speech data: a frame whose energy is relatively small is treated as silent. Once the silence lasts for the preset duration (for example 2 s), voice capture stops and recognition of the captured speech begins.
The entry comparison element compares the speech recognized after the pause is detected with the entry shown on the display 2 and judges whether they are consistent. If so, the audio is labeled and sent to the memory 5, and the display 2 shows the next entry in sequence; if not, the speech is discarded and the display 2 shows the same entry again.
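The frame-energy pause detection described above can be sketched as follows. This is a minimal illustration, not the device's actual implementation: the 20 ms frame length and the energy threshold are assumed values (the text gives only the 2 s example duration), and `find_pause` is a hypothetical helper name.

```python
import math
from typing import Optional, Sequence

SAMPLE_RATE_HZ = 16_000
FRAME_MS = 20               # assumed frame length; not given in the source
ENERGY_THRESHOLD = 1.0e5    # assumed mean-square energy threshold
PAUSE_SECONDS = 2.0         # example preset duration from the text


def frame_energy(frame: Sequence[int]) -> float:
    """Mean squared amplitude of one frame of samples."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def find_pause(samples: Sequence[int],
               sample_rate: int = SAMPLE_RATE_HZ,
               frame_ms: int = FRAME_MS,
               threshold: float = ENERGY_THRESHOLD,
               pause_seconds: float = PAUSE_SECONDS) -> Optional[int]:
    """Return the sample index where capture would stop, or None.

    Capture stops once consecutive low-energy frames span `pause_seconds`.
    """
    frame_len = sample_rate * frame_ms // 1000
    frames_needed = math.ceil(pause_seconds * 1000 / frame_ms)
    quiet_run = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if frame_energy(frame) < threshold:
            quiet_run += 1
            if quiet_run >= frames_needed:
                return start + frame_len
        else:
            quiet_run = 0   # speech resumed, so restart the silence count
    return None
```

A fixed threshold is the simplest choice; a practical detector would likely calibrate it against background noise and ignore silence that occurs before the speaker has started, neither of which the source describes.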
A memory 5 for automatically storing the audio files sent by the audio processor 4 and naming the audio files in the form entry-personal information. For example, if the speaker is male, aged 18 and from Hangzhou, Zhejiang, and the entry being recorded is "turn on the lighting" (pinyin: dakaizhaoming), the audio file is named dakaizhaoming-Y18-zhe A-X. Here Y denotes male (X denotes female), the number is the age, "zhe A" denotes Hangzhou, Zhejiang, and the trailing X stands for the number of times the entry has been recorded.
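The naming rule in this example can be sketched in a few lines. The function names `build_audio_name` and `parse_audio_name` are hypothetical, and treating the trailing field as an integer recording count is an assumption based on the example; no file extension is added because the source does not give one.

```python
GENDER_CODES = {"female": "X", "male": "Y"}  # codes as given in the example


def build_audio_name(entry: str, gender: str, age: int,
                     region: str, take: int) -> str:
    """Compose a name of the form entry-personal information."""
    code = GENDER_CODES[gender]
    return f"{entry}-{code}{age}-{region}-{take}"


def parse_audio_name(name: str) -> dict:
    """Split a name back into its fields, e.g. for later screening."""
    # Split from the right so a hyphen inside the entry text is tolerated.
    entry, person, region, take = name.rsplit("-", 3)
    gender = {v: k for k, v in GENDER_CODES.items()}[person[0]]
    return {
        "entry": entry,
        "gender": gender,
        "age": int(person[1:]),
        "region": region,
        "take": int(take),
    }


if __name__ == "__main__":
    name = build_audio_name("dakaizhaoming", "male", 18, "zhe A", 1)
    print(name)
    print(parse_audio_name(name))
```

Because every field sits at a fixed position, files can be filtered by gender, age or region from the name alone, which is the screening benefit the description points to. The sketch assumes the region code itself contains no hyphen.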
The operation of the speech acquisition device is briefly described as follows:
First, the speaker's personal information, such as gender, age and region, is entered. The speaker then reads aloud the entry shown on the display, and the audio collector and audio processor capture and evaluate the speech, mainly judging whether the speaker has paused and whether the spoken entry matches the entry prompted on the display. When the audio processor judges that the speaker has paused, recording stops and speech recognition is run on the recording to determine whether its content matches the prompted entry. If it matches, the recording is saved and named according to the naming rule using the entered gender, age and region, and the display shows the next entry once saving is complete. If it does not match, the audio is discarded and the display shows the same entry again. The audio processor also checks whether the entry on the display is the last one; if so, recording ends.
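The control flow just described can be summarized as a loop. In the sketch below, `capture`, `recognize` and `name_for` are hypothetical callables standing in for the audio collector, the speech recognizer and the naming rule; `max_attempts` is an added safeguard against an endless retry loop and is not part of the source, which simply repeats the entry until it matches.

```python
from typing import Callable, Dict, Iterable


def run_session(
    entries: Iterable[str],
    capture: Callable[[str], bytes],
    recognize: Callable[[bytes], str],
    name_for: Callable[[str], str],
    max_attempts: int = 3,
) -> Dict[str, bytes]:
    """Walk through the entries, keeping only recordings that match.

    capture(entry)   -> audio recorded until a pause is detected
    recognize(audio) -> recognized text
    name_for(entry)  -> file name under which the recording is stored
    """
    stored: Dict[str, bytes] = {}
    for entry in entries:                     # the loop ends after the last entry
        for _ in range(max_attempts):
            audio = capture(entry)            # display shows the entry, speaker reads it
            if recognize(audio) == entry:     # entry comparison
                stored[name_for(entry)] = audio
                break                         # display moves on to the next entry
            # Mismatch: the audio is discarded and the same entry is shown again.
    return stored
```

Passing the hardware-facing steps in as callables keeps the loop testable without a microphone or a recognizer; an exact string match is the simplest possible comparison and a real system might normalize the recognized text first.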
On the one hand, the voice acquisition device of this embodiment collects speech efficiently, greatly improving acquisition efficiency; on the other hand, the collected audio files are named automatically from the speaker's personal information, so nobody has to name them one by one, which greatly reduces labor cost.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments. Even if various changes are made to the present invention, the changes are still within the scope of the present invention if they fall within the scope of the claims and their equivalents.

Claims (6)

1. A speech acquisition device, comprising:
an input device for recording the personal information of a speaker in advance, the personal information comprising gender, age and region;
a display for displaying the entries to be recorded and the display mode of the entries;
an audio collector for capturing the speech uttered by the speaker for the entry shown on the display, according to the set sampling frequency, the set sampling bit depth and the set number of channels;
an audio processor for recognizing the speech captured by the audio collector and comparing the recognized speech with the entry shown on the display; and
a memory for automatically storing the audio files sent by the audio processor and naming the audio files in the form entry-personal information.
2. The speech acquisition device of claim 1 wherein the input device is a touch screen with input methods or personal information options.
3. The speech acquisition device of claim 1 wherein the input device is a keyboard.
4. The speech acquisition device of claim 1 wherein the display is provided with a data input interface for importing entries to be recorded.
5. The speech acquisition device of claim 4 wherein the display is provided with an entry list selection key and a display mode selection key.
6. The speech acquisition device of claim 1 wherein the audio processor comprises a pause detection element and an entry comparison element;
the pause detection element detects whether the speech captured by the audio collector has paused for a preset duration and, if so, stops the capture and starts speech recognition;
the entry comparison element compares the speech recognized after the pause is detected with the entry shown on the display and judges whether they are consistent; if so, the audio is labeled and sent to the memory; if not, the speech is discarded.
CN202023183752.6U 2020-12-25 2020-12-25 Voice acquisition equipment Active CN213694055U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202023183752.6U CN213694055U (en) 2020-12-25 2020-12-25 Voice acquisition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202023183752.6U CN213694055U (en) 2020-12-25 2020-12-25 Voice acquisition equipment

Publications (1)

Publication Number Publication Date
CN213694055U (en) 2021-07-13

Family

ID=76740731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202023183752.6U Active CN213694055U (en) 2020-12-25 2020-12-25 Voice acquisition equipment

Country Status (1)

Country Link
CN (1) CN213694055U (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115527521A (en) * 2022-08-29 2022-12-27 北京探境科技有限公司 Voice data acquisition and recognition method and device

Similar Documents

Publication Publication Date Title
US10977299B2 (en) Systems and methods for consolidating recorded content
CN108305632B (en) Method and system for forming voice abstract of conference
CN1121108C (en) Portable cellular phone
EP0887788B1 (en) Voice recognition apparatus for converting voice data present on a recording medium into text data
CN103035247B (en) Based on the method and device that voiceprint is operated to audio/video file
US7603273B2 (en) Simultaneous multi-user real-time voice recognition system
CN102903375B (en) Music player and player method
WO2020098115A1 (en) Subtitle adding method, apparatus, electronic device, and computer readable storage medium
US20110208330A1 (en) Sound recording device
CN108242238B (en) A method and device for generating audio files, and terminal equipment
WO2016197708A1 (en) Recording method and terminal
CN113271386B (en) Howling detection method and device, storage medium and electronic equipment
CN110472097A (en) Melody automatic classification method, device, computer equipment and storage medium
CN114373478A (en) Song audio labeling and alignment model training method, equipment and storage medium
Stockdale Tools for digital audio recording in qualitative research
CN213694055U (en) Voice acquisition equipment
CN116994597B (en) Audio processing system, method and storage medium
CN105487788B (en) A kind of music information real time acquiring method and device
JP2011090483A (en) Information processing apparatus and program
JPH08249343A (en) Device and method for speech information acquisition
CN114387994A (en) Audio data acquisition method and device
CN114333839A (en) Model training material selection method, device, electronic device and storage medium
CN100458914C (en) Speech recognition system and method
Tucker et al. Novel techniques for time-compressing speech: an exploratory study
JPH10133678A (en) Audio playback device

Legal Events

Date Code Title Description
GR01 Patent grant
CP03 Change of name, title or address

Address after: Unit G4-202-059, Artificial Intelligence Industrial Park, No. 88 Jinjihu Avenue, Suzhou Industrial Park, Suzhou Area, China (Jiangsu) Pilot Free Trade Zone, Suzhou City, Jiangsu Province, 215124

Patentee after: Shencong Semiconductor (Jiangsu) Co.,Ltd.

Address before: 200232 room 3712, 3 / F, 2879 Longteng Avenue, Xuhui District, Shanghai

Patentee before: Shanghai shencong Semiconductor Co.,Ltd.