
CN112599119A - Method for establishing and analyzing speech library of dysarthria of motility under big data background - Google Patents

Info

Publication number
CN112599119A
Authority
CN
China
Prior art keywords
voice
speech
dysarthria
big data
text
Prior art date
Legal status
Granted
Application number
CN202011546906.5A
Other languages
Chinese (zh)
Other versions
CN112599119B
Inventor
Ma Chun (马春)
Du Wei (杜炜)
Jin Li (金力)
Kan Junling (阚峻岭)
Current Assignee
Anhui University of Traditional Chinese Medicine AHUTCM
Original Assignee
Anhui University of Traditional Chinese Medicine AHUTCM
Priority date
Filing date
Publication date
Application filed by Anhui University of Traditional Chinese Medicine AHUTCM
Publication of CN112599119A
Application granted
Publication of CN112599119B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/65 Clustering; Classification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/34 Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Theoretical Computer Science (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Pathology (AREA)
  • Primary Health Care (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Machine Translation (AREA)

Abstract


The invention relates to a method for establishing and analyzing a speech database for motor dysarthria in a big-data context, comprising the following steps: design of the pronunciation text; speech recording; acoustic parameter analysis of the speech files; establishment of a database management system; and data analysis with big-data technology. The invention aims to study the speech characteristics of patients with motor dysarthria caused by nervous system diseases. Relying on the advantages of an open network platform, it can measure large-scale groups, collect related information, and build speech libraries covering Mandarin, dialects, healthy speakers, and patients; on this basis, it establishes a word library suitable for diagnosing patients with motor dysarthria.


Description

Method for establishing and analyzing a speech library for motor dysarthria in a big-data context
Technical Field
The invention relates to a method for establishing and analyzing a speech library for motor dysarthria in a big-data context.
Background
(1) Current research on motor dysarthria:
Motor dysarthria refers to a group of speech disorders caused by disturbed muscular control due to damage to the central or peripheral nervous system. It typically manifests as slowed, weakened, inaccurate, and uncoordinated movement of the speech musculature, and may also affect respiration, resonance, laryngeal phonation, articulation, and prosody; clinically it is usually referred to simply as dysarthria. Common causes include brain trauma, cerebral palsy, amyotrophic lateral sclerosis, multiple sclerosis, stroke, Parkinson's disease, and spinocerebellar ataxia. By neuroanatomy and speech acoustics, dysarthria can be classified into flaccid, spastic, ataxic, hyperkinetic, and mixed types. Among the communication disorders associated with brain damage, dysarthria has an incidence of up to 54%. Clinically, examination of voice, resonance, prosody, and related aspects can reflect the speech acoustics of dysarthria from both subjective and objective angles, which helps provide targeted treatment and clarify its acoustic-pathological mechanism comprehensively and scientifically.
Few domestic or foreign studies have reported the overall incidence of motor dysarthria. In a study of 125 Parkinson's disease patients, Miller et al. found that 69.6% had mean speech intelligibility below that of normal controls, with 51.2% more than one standard deviation below, indicating a high incidence of dysarthria in Parkinson's patients. Bogousslavsky et al. screened 1000 patients with first-ever stroke and found speech impairment in up to 46%, of whom 12.4% were diagnosed with dysarthria. Hartelius et al. likewise found a 51% prevalence of dysarthria among patients with multiple sclerosis. These figures indicate that dysarthria is common. At present there is no unified assessment method for dysarthria in China, and motor dysarthria has no dedicated assessment standard; existing assessment methods or their modified versions, together with the dysarthria examination of the China Rehabilitation Research Center, are mostly used, with clinicians or rehabilitation physicians examining, scoring, recording, and rating the degree and type of dysarthria.
(2) Current research on domestic speech libraries:
With the development of information technology and computer science, speech technology has made interaction between machine behavior and human natural language possible, and research on both speech synthesis and speech recognition necessarily depends on the construction of a high-quality back-end speech corpus. Foreign speech libraries are relatively mature; research on Chinese speech libraries has advanced rapidly over the last decade, and speech libraries have been designed and built in different languages and cultural contexts. The construction of speech libraries for dysarthria, however, is still under investigation.
Domestic research on evaluating articulation mainly focuses on subjective assessment, and only a few researchers distinguish the concepts of articulation and voice. Huang Zhaying et al. proposed a "Chinese word list for testing articulation ability" containing 50 words; by evaluating the pronunciation of the 50 test words, a speech rehabilitation therapist can comprehensively assess the articulation of 21 initials and 4 tones, while place-of-articulation ability is assessed through 18 place-of-articulation contrasts and 37 minimal pairs. Chen Sanding et al. evaluated the initials, finals, and tones of 50 deaf children speaking Mandarin, revealed the developmental pattern of their articulation, and further proposed rehabilitation principles of early intervention, proper sequencing, error tolerance, and consolidation. Dr. Zhang Jing of East China Normal University studied the main error trends of hearing-impaired children in consonant articulation, analyzed the causes, and proposed a corresponding consonant-phoneme treatment framework for hearing-impaired children.
(3) Current research on big data in the medical field:
A popular current definition of big data is: data that exceeds the capture, storage, processing, and analysis capabilities of typical database software tools. Big data differs from traditional concepts such as very-large-scale data and mass data, and has four basic characteristics: volume, variety, velocity, and value. Kayyali B et al. studied the impact of big data on the U.S. medical industry, indicating that its value to the industry will become more and more significant over time. At present, big data in the medical field mainly come from pharmaceutical enterprises, clinical diagnosis data, patient medical data, health management, and social network data. For example, drug development is a data-intensive process; even for small and medium-sized enterprises, drug-development data run to terabytes and above. Hospital data also grow very quickly: a single dual-source CT examination images a patient about 3000 times and generates roughly 1.5 GB of image data, a standard pathological examination image is about 5 GB, and with clinical and electronic medical-record data added, the data grow rapidly every day. Research methods based on massive big-data analysis have prompted reflection on scientific methodology: without direct contact with the research object, new findings can be obtained by directly analyzing and mining mass data, which may give rise to a new mode of scientific research.
The establishment of a speech corpus is a complicated problem, and later refinement of the corpus still needs improvement, for example making full use of existing inter-word tone sandhi rules and reflecting the actual conditions of tone change and the neutral tone as far as possible. Deficiencies of the corpus can be offset in the preprocessing stage by improving the utilization of the existing material. For these reasons, the speech library should be an open database, so that it can be extended and modified at any time. Because speech conditions differ, building a specific speech corpus will encounter various difficulties; the issues discussed here are only one approach to corpus construction, which will hopefully provide data support for speech research and play a role in better developing the language and improving the speech corpus.
In addition, large data volume is undoubtedly a great advantage of network big-data analysis, but guaranteeing the quality of mass data and implementing its cleaning, management, and analysis are also major technical difficulties of this research. Massive network big data are multi-source and heterogeneous, interactive, time-sensitive, bursty, and noisy, so they combine huge value with heavy noise and low value density. This poses a significant challenge for ensuring data quality in network big-data analysis.
Disclosure of Invention
The invention provides a method for establishing and analyzing a speech library for motor dysarthria in a big-data context, addressing the technical problem that, although large data volume is a major advantage of network big-data analysis, guaranteeing the quality of mass data and implementing its cleaning, management, and analysis remain major technical difficulties.
In order to solve the technical problems, the invention adopts the following scheme:
a method for establishing and analyzing a speech library of dysarthria with motility under a big data background comprises the following steps: step 1, designing a pronunciation text;
step 2, recording voice;
step 3, marking the voice file;
step 4, analyzing acoustic parameters of the voice file;
step 5, establishing a database management system;
and step 6, analyzing data with big-data technology.
Preferably, the data analysis of the big data technology in step 6 is based on a speech classification mechanism of a Hadoop platform, and specifically includes the following sub-steps:
step 61, collecting a plurality of patient voice files, segmenting and labeling voice segments, constructing a voice database, analyzing the extracted acoustic parameters, and acquiring effective characteristics of voice classification;
step 62, based on a Hadoop platform, subdividing the big data voice classification problem by adopting a Map function, and performing voice classification solution on the subproblems in a multi-node parallel and distributed manner to obtain corresponding voice classification results;
and step 63, finally, combining the voice classification results of the sub-problems by using a Reduce function so as to adapt to the online requirement of the big data voice classification.
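The Map/Reduce decomposition of steps 61 to 63 can be sketched in miniature. The following Python sketch is an illustrative assumption, not the patent's implementation: the mean-F0 threshold rule stands in for the real acoustic classifier, and the chunks are processed sequentially here, whereas Hadoop would run them on parallel Task Tracker nodes.

```python
# Toy stand-in for the per-node classifier of steps 61-63; the threshold
# rule and feature (mean F0 in Hz) are illustrative assumptions.
def classify_chunk(chunk):
    """Map step: solve one sub-problem, a list of (file_id, mean_f0) pairs."""
    results = []
    for file_id, mean_f0 in chunk:
        label = "dysarthric" if mean_f0 < 100.0 else "typical"
        results.append((file_id, label))
    return results

def reduce_results(partials):
    """Reduce step: merge the per-chunk classification results (step 63)."""
    merged = {}
    for part in partials:
        merged.update(part)  # dict.update accepts an iterable of pairs
    return merged

def classify_big_data(samples, n_chunks=4):
    """Subdivide the classification problem (the Map function's job) and
    solve each chunk; on Hadoop the chunks would run on separate nodes."""
    chunks = [samples[i::n_chunks] for i in range(n_chunks)]
    partials = [classify_chunk(c) for c in chunks if c]
    return reduce_results(partials)
```

The key property mirrored here is that each sub-problem can be solved independently, so the Reduce merge is a simple union of results.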
Preferably, the designing of the pronunciation text in step 1 includes selecting the pronunciation text, and the selection principle of the corpus of pronunciation texts includes one or more of the following:
a. the single characters in the corpus are required to contain all the phonological phenomena as much as possible, so that the phonetic system characteristics of the voices of different patients can be reflected better and more conveniently;
b. the vocabulary in the corpus is based on the standard Mandarin survey word list, so that it can be conveniently compared with standard Mandarin speech;
c. sentences in the corpus are mainly obtained by carrying out dialogue with the patient according to a plurality of related topics, so that the method is more suitable for the real situation faced by speech recognition; "several related topics" include daily life topics or medical history topics, such as queries for time to first onset and medical history.
d. Sentences in the corpus are complete in content and semanteme, so that prosodic information of one sentence can be reflected as much as possible;
e. triphones are selected without classification, which effectively alleviates the problem of sparse training data.
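Principle a above, covering all phonological phenomena, can be checked mechanically. A minimal sketch, assuming the word list is already transcribed in pinyin; the `INITIALS` table lists the 21 standard Mandarin initials, and the coverage test itself is illustrative rather than from the patent:

```python
# The 21 Mandarin initials (shengmu) of standard pinyin.
INITIALS = ["b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
            "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s"]

def initial_of(pinyin):
    """Return the initial of a pinyin syllable ('' for zero-initial)."""
    # Try two-letter initials first so 'zh' is not misread as 'z'.
    for ini in sorted(INITIALS, key=len, reverse=True):
        if pinyin.startswith(ini):
            return ini
    return ""

def missing_initials(word_list):
    """Which initials does the candidate corpus fail to cover?"""
    covered = {initial_of(p) for p in word_list}
    return [ini for ini in INITIALS if ini not in covered]
```

The same pattern extends to finals and tones once those fields are in the transcription.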
Preferably, the designing of the pronunciation text in step 1 further includes compiling the pronunciation text, and the compiling principle of the pronunciation text includes one or more of the following:
a. a single-word part: taking the initial consonants, the simple or compound vowels and some commonly used characters of the tone listed in the survey word list as the language materials used for the main recording of the voice library;
b. vocabulary part: based on, but not limited to, a four-thousand-word list, related words are recorded according to existing conclusions about the relevant sound system; the speech characteristics, including segmental and suprasegmental features, can be comprehensively reflected, and example words may be added to reflect particularly distinctive speech phenomena. "Recording related words according to conclusions about the relevant sound system" refers to a general vocabulary summarized from the sounds, combination rules, rhythm, and intonation used within the same language.
A distinctive speech phenomenon refers to cases that dialect speakers easily misread, such as difficulty distinguishing dental from retroflex sibilants, or failure to distinguish f from h.
c. sentence material part: the amount of material is determined by each speaker's command of the language; the selected material should be representative while covering as wide a range as possible. "Representative" here means general sentences that characterize dysarthric speech.
d. natural conversation part: 20-40 minutes of the speaker's speech are recorded in question-answer and free-conversation form on everyday topics, covering everyday spoken words that differ from standard Mandarin; the speaker is required to speak in dialect.
Preferably, the voice recording in step 2 includes selection of the speaker. The speaker should be a native speaker with clear articulation, a moderate speech rate (controlled at about 150 characters per minute), and proficient command of the local language, who is willing to cooperate actively with the investigation; the speaker's language environment should be relatively stable, and the speaker should have some degree of education. The voice recording also includes speech collection through a recording device, in two modes: one is reading with a prompt text, where the prompt is written Chinese material that the speaker converts into his or her own native language and reads aloud; the other is natural speech, where the speaker, given prompts, tells folk stories, describes local life, and hums local folk songs.
Preferably, the acoustic-parameter analysis of the voice files in step 4 includes speech annotation of the voice library. Basic annotation comprises segmentation and alignment of the initial and final of each syllable, plus their labeling, in two parts. The first part is character annotation: Chinese characters plus pinyin transcribe the pronunciation, and the speech information is recorded in Chinese characters both for the recognition system and as material for linguistic research; the character annotation must mark basic character information and paralinguistic phenomena, which in basic annotation can be represented by general-purpose symbols. The second part is syllable annotation: standard Mandarin syllable labels are used, with the tone marked by a digit. In the tone notation, 0 denotes the neutral tone, 1 the first tone (yinping, high level), 2 the second tone (yangping, rising), 3 the third tone (shangsheng, dipping), and 4 the fourth tone (qusheng, falling).
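The "pinyin plus tone digit" syllable labels described above can be parsed with a few lines of code. This sketch assumes labels such as "bao4"; the function and table names are hypothetical:

```python
# Mandarin tone digits as used in the annotation scheme above.
TONE_NAMES = {0: "neutral", 1: "yinping (high level)", 2: "yangping (rising)",
              3: "shangsheng (dipping)", 4: "qusheng (falling)"}

def parse_syllable_label(label):
    """Split a 'pinyin+tone digit' label, e.g. 'bao4' -> ('bao', 4)."""
    if not label or not label[-1].isdigit():
        raise ValueError(f"label must end with a tone digit 0-4: {label!r}")
    base, tone = label[:-1], int(label[-1])
    if tone not in TONE_NAMES:
        raise ValueError(f"tone digit out of range: {tone}")
    return base, tone
```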
Preferably, the acoustic-parameter analysis of the voice files in step 4 further includes extraction of acoustic parameters. First, the recorded speech is segmented and silent sections are removed, so that the analyzed objects are single words, phrases, sentences, and conversations; then the start and end points of the speech signal are determined in the waveform data and the speech is annotated; finally, fundamental-frequency and formant acoustic parameters are obtained with an autocorrelation algorithm.
Preferably, the establishment of the database management system in step 5 includes selection of the database; an SQL database management system, which is easier to implement, is selected.
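A SQL layout for such a system might look as follows. The schema below (SQLite for brevity) is entirely an assumption for illustration; the patent does not specify table or column names:

```python
import sqlite3

# Hypothetical minimal schema linking speakers, recordings, and the
# extracted acoustic parameters described in step 4.
SCHEMA = """
CREATE TABLE speaker (
    speaker_id INTEGER PRIMARY KEY,
    dialect    TEXT,
    diagnosis  TEXT          -- e.g. 'healthy', 'motor dysarthria'
);
CREATE TABLE recording (
    recording_id INTEGER PRIMARY KEY,
    speaker_id   INTEGER REFERENCES speaker(speaker_id),
    text_type    TEXT,       -- 'single word', 'vocabulary', 'sentence', 'dialogue'
    wav_path     TEXT
);
CREATE TABLE acoustic_param (
    recording_id INTEGER REFERENCES recording(recording_id),
    f0_mean      REAL,       -- fundamental frequency (Hz)
    formant1     REAL,       -- first formant (Hz)
    formant2     REAL        -- second formant (Hz)
);
"""

def create_db(path=":memory:"):
    """Create the speech-library database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```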
A big-data voice-classification method based on the Hadoop platform comprises: building a voice library with the establishing method above; on this basis, using a Map function on the Hadoop platform to subdivide the big-data voice-classification problem and solving the sub-problems in a multi-node, parallel, distributed manner to obtain the corresponding voice-classification results; and finally merging the sub-problem results with a Reduce function to meet the online requirements of big-data voice classification.
The method comprises the following specific steps:
(1) the Client submits a voice classification task to a Job Tracker of the Hadoop platform, and the Job Tracker copies voice characteristic data to a local distributed file processing system;
(2) initializing voice classified tasks, putting the tasks into a Task queue, and distributing the tasks to corresponding nodes, namely a Task Tracker, by a Job Tracker according to the processing capacity of different nodes;
(3) each Task Tracker adopts a support vector machine to fit the relation between the voice features to be classified and a voice feature library according to the distributed tasks to obtain the corresponding categories of the voice;
(4) taking the corresponding class of the voice as Key/Value, and storing the Key/Value into a local file disk;
(5) if the Key/Value of the voice classification intermediate result is the same, merging the intermediate result, delivering the merged result to Reduce for processing to obtain a voice classification result, and writing the result into a distributed file processing system;
(6) the Job Tracker clears the task state, and the user obtains the voice-classification result from the distributed file processing system.
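Steps (4) and (5) amount to the classic MapReduce shuffle: intermediate results with the same key are grouped before the Reduce step. A dictionary-based sketch with hypothetical function names:

```python
from collections import defaultdict

def shuffle(intermediate_pairs):
    """Group intermediate (class_label, file_id) pairs by key, as in step (5)."""
    groups = defaultdict(list)
    for key, value in intermediate_pairs:
        groups[key].append(value)
    return groups

def reduce_counts(groups):
    """Reduce step: summarize each class with its member files and count."""
    return {key: {"files": sorted(vals), "count": len(vals)}
            for key, vals in groups.items()}
```

In Hadoop this grouping happens transparently between the Map and Reduce phases; the sketch only makes the data movement explicit.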
The method for establishing and analyzing a speech library for motor dysarthria in a big-data context has the following beneficial effects:
(1) The invention studies the speech characteristics of patients with motor dysarthria caused by nervous system diseases; relying on the advantages of an open network platform, it can measure large-scale groups and collect related information, build speech libraries covering Mandarin, dialects, healthy speakers, and patients, and on this basis establish a word library suitable for diagnosing patients with motor dysarthria.
(2) As the voice library is continuously expanded, a rich data resource center is finally established from information such as Mandarin, dialects, different medical histories, and different disease conditions, providing an online self-diagnosis channel for patients with nervous system diseases, assisting doctors in clinical diagnosis and treatment, and providing a rich and accurate data platform for quantifying the conditions of nervous system diseases.
(3) On the basis of the voice library and the Hadoop platform, a Map function subdivides the big-data voice-classification problem, and the sub-problems are solved in a multi-node, parallel, distributed manner to obtain the corresponding voice-classification results; finally, a Reduce function merges the sub-problem results to meet the online requirements of big-data voice classification.
Drawings
FIG. 1: an example of the speech annotation of "bao" in an embodiment of the invention.
FIG. 2: formant data for the speech "bao" in an embodiment of the invention.
FIG. 3: the basic framework of the Hadoop platform in the embodiment of the invention.
FIG. 4 shows a big data voice classification process based on a Hadoop platform.
Detailed Description
The invention is further illustrated below with reference to fig. 1 to 4:
the voice library is composed of an unvoiced sound library, a voiced sound library, a tone library, a voice synthesis program and a Chinese-pinyin conversion program.
1. Establishing an unvoiced sound library:
according to the characteristics of unvoiced sound, the quality of synthesized speech is improved. The unvoiced sound library is established by adopting a direct sampling method. That is, the unvoiced parts in front of the voiced speech segments in various pinyin combinations are sampled to form an unvoiced speech library. Since the unvoiced sounds in 1 syllable actually occupy only a small part, the unvoiced sound library constituted by unvoiced sounds extracted from 400 unvoiced syllables actually occupies a small storage space.
2. Establishing a voiced sound library:
voiced sounds are synthesized by a voiced synthesis program calling VTFR synthesis for voiced sounds. The voiced sound library is actually composed of VTFRs of various voiced sounds, VTFRs of various voiced sounds are sequentially extracted by adopting a VTFR extracting program, and the VTFRs of various voiced sounds and a voiced sound synthesizing program are stored in 1 data packet, so that the voiced sound library is formed. The actually extracted VTFR is only 1 curve, and the space occupied by the voiced sound library formed by the curve is very small.
The establishment of the speech corpus mainly comprises the following five processes: designing the pronunciation text; recording speech; parameter analysis of the speech files; establishing the database management system; and data analysis with big-data technology.
1. Designing a pronunciation text;
1.1 selection of pronunciation text:
How to select corpora is the key to corpus construction. To ensure orderly and effective database building and corpus quality, selection principles were formulated before construction. They are: first, the single characters in the corpus should cover all phonological phenomena as far as possible, so as to reflect the phonological characteristics of dialect speech better and more conveniently; second, the vocabulary is based on the standard Mandarin survey word list, for easy comparison with standard Mandarin; third, sentences are mainly drawn from spoken-language corpora, matching the real conditions faced by speech recognition; fourth, sentences are complete in content and semantics, so as to reflect the prosodic information of a sentence as far as possible; fifth, triphones are selected without classification, which effectively alleviates the problem of sparse training data.
1.2, preparation of pronunciation texts:
The formulation of pronunciation texts is one of the key links in establishing a voice database. When determining the pronunciation materials, the pronunciation texts are selected in four parts. The first is the single-word part: the initials, finals, and tones of commonly used characters listed in the survey word list serve as the main recorded material of the voice library. The second is the vocabulary part: based on, but not limited to, a four-thousand-word list, related words are recorded according to existing conclusions about the relevant sound system; the speech characteristics, including segmental and suprasegmental features, can be comprehensively reflected, and example words may be added to reflect particularly distinctive speech phenomena. The third is the sentence material part, in which the amount of material is determined by each speaker's command of the language, and the selected material should be representative while covering as wide a range as possible. The fourth is the natural conversation part, which records about half an hour of the speaker's speech on everyday topics in question-answer and free-conversation form, covering everyday spoken words that differ from standard Mandarin, with the speaker required to speak in dialect.
2. Voice recording:
2.1 determination of speaker:
The principle for selecting speakers is to choose native speakers with clear articulation and a moderate speaking rate who use the local language proficiently and are willing to cooperate actively with the survey; their language environment should be stable and they should have a certain level of education.
2.2 voice collection:
The speaking mode used during recording directly determines what the voice library can be used for. Because of the particular nature of the collected corpus, two modes are adopted according to the research purpose: one is reading aloud from prompt text, where the prompt is written Chinese material that the speaker converts into his or her own native variety and reads aloud; the other is natural speech, where, guided by prompts, the speaker tells folk stories, describes local living conditions, hums local folk songs, and so on.
3. Parameter analysis for the speech file:
After the pronunciation text has been recorded, the voice data must be analyzed to obtain the different features of the speech signal; this is key to designing the speech corpus and a necessary basis for later speech processing. Since the invention focuses on speech information, the basic attributes of the speech waveform must be annotated and the related acoustic parameters extracted at the same time.
3.1 information annotation of the voice library:
Speech annotation uses the Praat software and performs hierarchical annotation with reference to the Chinese segmental labeling system SAMPA-C. The annotation of the voice library comprises text annotation and syllable annotation; the syllable "bao" is taken as an example, as shown in fig. 1.
The first part is text annotation: Chinese characters plus pinyin constitute the transcription of the pronunciation, recording the speech information in Chinese characters both for use by the recognition system and as material for linguistic research. The text annotation must mark the basic textual information as well as paralinguistic phenomena; paralinguistic phenomena in the basic annotation can be represented with universal paralinguistic symbols.
The second part is syllable annotation. Mandarin syllables are annotated with standard Mandarin syllable labels, and the annotation carries tone marks: 0 denotes the neutral tone, 1 the first tone (yin-ping), 2 the second tone (yang-ping), 3 the third tone (shang), and 4 the fourth tone (qu).
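As a minimal sketch of how such tonal syllable labels might be handled programmatically (the label format `syllable + tone digit` follows the scheme above; the function name and example syllables are illustrative, not from the patent):

```python
# Map the numeric tone marks of the annotation scheme to tone names.
TONE_NAMES = {
    0: "neutral tone",
    1: "yin-ping (first tone)",
    2: "yang-ping (second tone)",
    3: "shang (third tone)",
    4: "qu (fourth tone)",
}

def describe_label(label: str) -> str:
    """Split a tonal syllable label such as 'bao4' into syllable + tone name."""
    syllable, tone_digit = label[:-1], int(label[-1])
    return f"{syllable}: {TONE_NAMES[tone_digit]}"

print(describe_label("bao4"))  # the example syllable from fig. 1
```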
3.2 extraction of acoustic parameters:
For the recorded speech signals, the acoustic parameters of each segment must be extracted. In practice, the recorded speech is first segmented and silent sections are removed, to ensure that the objects of analysis are single characters; then the start and end points of the speech signal are determined in the waveform data, and the range of the vowels is marked; finally, the corresponding fundamental frequency and formant data are obtained with an autocorrelation algorithm, taking the syllable "bao" as an example, as shown in fig. 2.
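The autocorrelation step for the fundamental frequency can be sketched as follows; this is a minimal pure-Python illustration on a synthetic vowel-like frame, where the sampling rate, pitch search range, and test signal are assumptions for the demo, not parameters taken from the patent:

```python
import math

def estimate_f0(signal, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate the fundamental frequency by finding the lag that maximizes
    the autocorrelation of the frame within a plausible pitch range."""
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return sample_rate / best_lag

# Synthetic 200 Hz "vowel": a fundamental plus one weaker harmonic.
sr = 16000
frame = [math.sin(2 * math.pi * 200 * t / sr)
         + 0.5 * math.sin(2 * math.pi * 400 * t / sr)
         for t in range(1024)]
f0 = estimate_f0(frame, sr)
print(round(f0))  # close to 200 Hz
```

Production tools such as Praat use a refined version of this idea (windowing, interpolation of the autocorrelation peak); the sketch only shows the core lag search.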
4. Establishing a database management system:
4.1 database selection
As for the choice of database: the voice database must store a large amount of speech waveform data, which is characterized by high volume and variable length, while the requirements for transaction processing and recovery, security, network support, and so on are comparatively low. Therefore, a relatively easy-to-implement SQL database management system can be chosen.
4.2 creation of database management System
The database management system for the speech corpus must store four kinds of material: first, speaker-attribute material, such as the speaker's age, gender, education, command of Chinese, and use of the native language; second, pronunciation-text material, recording and storing the speaker's pronunciations together with the corresponding text material such as dialect pronunciations and Mandarin IPA transcriptions; third, actual speech data, which mainly stores the raw parameters of the recorded speech waveforms; fourth, acoustic-analysis parameter data, i.e. the acoustic parameters extracted from the processed speech waveforms.
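A minimal sketch of such a four-table design, using Python's built-in sqlite3 as a stand-in for the SQL database management system the text mentions (all table and column names here are illustrative assumptions, not from the patent):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.executescript("""
CREATE TABLE speaker (            -- 1. speaker-attribute material
    id INTEGER PRIMARY KEY,
    age INTEGER, gender TEXT, education TEXT,
    mandarin_level TEXT, native_language_use TEXT
);
CREATE TABLE pronunciation_text ( -- 2. pronunciation-text material
    id INTEGER PRIMARY KEY,
    speaker_id INTEGER REFERENCES speaker(id),
    hanzi TEXT, dialect_ipa TEXT, mandarin_ipa TEXT
);
CREATE TABLE speech_data (        -- 3. raw waveform parameters
    id INTEGER PRIMARY KEY,
    text_id INTEGER REFERENCES pronunciation_text(id),
    sample_rate INTEGER, waveform BLOB
);
CREATE TABLE acoustic_params (    -- 4. extracted acoustic parameters
    id INTEGER PRIMARY KEY,
    speech_id INTEGER REFERENCES speech_data(id),
    f0_hz REAL, f1_hz REAL, f2_hz REAL
);
""")
conn.execute("INSERT INTO speaker VALUES (1, 63, 'M', 'primary', 'fluent', 'daily')")
conn.execute("INSERT INTO pronunciation_text VALUES (1, 1, '包', 'pau', 'pau')")
row = conn.execute("SELECT hanzi FROM pronunciation_text WHERE speaker_id = 1").fetchone()
print(row[0])
```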
5. Data analysis for big data technology
Big data refers to data sets so large that they far exceed the capabilities of traditional database software tools for acquisition, storage, management, and analysis; it has four characteristics: large scale, rapid circulation, diverse types, and low value density. The strategic significance of big data technology lies not in holding huge amounts of data but in the specialized processing of meaningful data. In other words, if big data is compared to an industry, the key to profitability is to improve the "processing capability" of the data and achieve "value-adding" through that processing. In corpus construction, the major value of big data technology is that targeted analysis of the data makes it possible to assess the quality of the speech elements in the corpus, so that the corpus becomes more complete.
Sharing the corpus through a network platform facilitates testing with different populations and yields more data samples, enriching the voice library. In the future, more targeted corpora of motor dysarthria patients can be established for different regions and dialects, providing richer and more reliable data samples for subsequent automatic recognition of disease classification and grading.
As shown in fig. 3, a speech classification mechanism based on the Hadoop platform is proposed. It comprises collecting a large number of patient speech files, constructing a speech database, and extracting effective features for speech classification; then, on the Hadoop platform, the big data speech classification problem is subdivided with the Map function, and the sub-problems are solved in parallel and in a distributed manner on multiple nodes to obtain the corresponding speech classification results; finally, the Reduce function combines the classification results of the sub-problems so as to meet the online requirements of big data speech classification.
As shown in fig. 4, the big data speech classification process based on the Hadoop platform includes the following specific steps:
(1) The Client submits a speech classification task to the Job Tracker of the Hadoop platform, and the Job Tracker copies the speech feature data to the local distributed file system;
(2) the speech classification task is initialized and placed in the task queue, and the Job Tracker distributes tasks to the corresponding nodes, i.e. the Task Trackers, according to the processing capacity of the different nodes;
(3) according to its assigned tasks, each Task Tracker uses a support vector machine to fit the relation between the speech features to be classified and the speech feature library, obtaining the category of each utterance;
(4) the category of each utterance is stored to local disk as a Key/Value pair;
(5) intermediate results with the same Key are merged and handed to Reduce for processing to obtain the speech classification result, which is written to the distributed file system;
(6) the Job Tracker clears the task state, and the user obtains the speech classification result from the distributed file system.
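The Map/Reduce steps above can be sketched at toy scale in plain Python: a nearest-centroid rule stands in for the support vector machine of step (3), the distributed nodes are simulated by processing chunks independently, and all names, feature vectors, and category labels are illustrative assumptions, not data from the patent:

```python
from collections import defaultdict

# Toy "speech feature library": one reference (centroid) vector per category.
FEATURE_LIBRARY = {"normal": [0.0, 0.0], "dysarthric": [1.0, 1.0]}

def map_classify(chunk):
    """Map phase (steps 1-4): classify each feature vector in a chunk,
    emitting (category, 1) Key/Value pairs. A nearest-centroid rule
    stands in for the SVM."""
    pairs = []
    for vec in chunk:
        label = min(FEATURE_LIBRARY,
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(vec, FEATURE_LIBRARY[c])))
        pairs.append((label, 1))
    return pairs

def reduce_merge(all_pairs):
    """Reduce phase (step 5): merge intermediate results sharing the same Key."""
    counts = defaultdict(int)
    for key, value in all_pairs:
        counts[key] += value
    return dict(counts)

# Split the "big data" problem into chunks processed independently,
# then merge the per-chunk intermediate results.
chunks = [[[0.1, 0.2], [0.9, 1.1]], [[0.8, 0.7], [0.05, 0.0]]]
intermediate = [pair for chunk in chunks for pair in map_classify(chunk)]
result = reduce_merge(intermediate)
print(result)
```

On a real cluster, each chunk would run on its own Task Tracker and the shuffle would group Keys across nodes; the sequential loop here only mirrors the data flow.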
The invention has been described above with reference to the accompanying drawings. Obviously, its implementation is not limited to the manner described: various modifications of the method concept and technical solution of the invention, as well as direct applications of the concept and solution to other fields without modification, all fall within the scope of protection of the invention.

Claims (9)

1. A method for establishing and analyzing a speech database for motor dysarthria in a big data context, comprising the following steps:
Step 1, design of the pronunciation text;
Step 2, voice recording;
Step 3, annotation of the voice files;
Step 4, acoustic parameter analysis of the voice files;
Step 5, establishment of the database management system;
Step 6, data analysis with big data technology.

2. The method for establishing and analyzing a speech database for motor dysarthria in a big data context according to claim 1, characterized in that the data analysis with big data technology in step 6 is based on a speech classification mechanism on the Hadoop platform, comprising the following sub-steps:
Step 61, collecting a plurality of patient voice files, segmenting and annotating the speech, constructing a speech database, and analyzing the extracted acoustic parameters to obtain effective features for speech classification;
Step 62, on the Hadoop platform, subdividing the big data speech classification problem with the Map function and solving the sub-problems in parallel and in a distributed manner on multiple nodes to obtain the corresponding speech classification results;
Step 63, finally combining the speech classification results of the sub-problems with the Reduce function so as to meet the online requirements of big data speech classification.

3. The method according to claim 1 or 2, characterized in that the design of the pronunciation text in step 1 includes the selection of the pronunciation text, and the selection principles of its corpus include one or more of the following:
a. the single characters in the corpus should cover all phonological phenomena as far as possible, so as to reflect the phonological characteristics of different patients' speech better and more conveniently;
b. the vocabulary in the corpus is based on the standard Chinese survey word list, so it can be conveniently compared with Mandarin Chinese;
c. the sentences in the corpus are mainly obtained from conversations with patients on several related topics, so they better match the real situations faced by speech recognition;
d. the sentences in the corpus are complete in content and semantics, so they can reflect the prosodic information of a whole sentence as far as possible;
e. triphones are selected without categorization, which effectively alleviates the problem of sparse training data.

4. The method according to claim 3, characterized in that the design of the pronunciation text in step 1 also includes the preparation of the pronunciation text, whose principles include one or more of the following:
a. single-character part: the initials, finals, and some common characters for each tone listed in the survey word list serve as the main recording material of this voice library;
b. vocabulary part: based on at least a four-thousand-word list, relevant words are recorded according to existing conclusions about the relevant sound system, striving to reflect the phonetic characteristics comprehensively, including segmental and suprasegmental quality; for particularly distinctive phonetic phenomena, example words can be added to reflect their characteristics;
c. sentence-material part: the amount of material is determined by each speaker's command of the language; the selection must both keep the range of the material as wide as possible and make it representative to a certain extent;
d. natural-conversation part: on topics of daily life, 20-40 minutes of speech material is recorded from each speaker in the form of question answering and free conversation; it involves everyday spoken words that differ from Mandarin usage, and the speaker is required to say them in dialect.

5. The method according to claim 4, characterized in that the voice recording in step 2 includes the determination of the speakers, the selection principle being to choose native speakers with clear articulation and a moderate speaking rate who use the local language proficiently and are willing to cooperate actively with the survey, whose language environment is relatively stable, and who have a certain level of education;
and/or, the voice recording also includes voice collection with a voice collector, which adopts two modes: one is reading aloud from prompt text, where the prompt is written Chinese material that the speaker converts into his or her own native variety and reads aloud; the other is natural speech, where, guided by prompts, the speaker tells folk stories, describes local living conditions, and hums local folk songs.

6. The method according to any one of claims 1-5, characterized in that the acoustic parameter analysis of the voice files in step 4 includes the phonetic annotation of the voice library; basic annotation includes the segmentation and alignment of the initials and finals of each syllable as well as tone annotation, and comprises two parts:
the first part is text annotation: Chinese characters plus pinyin constitute the transcription of the pronunciation, recording the speech information in Chinese characters both for use by the recognition system and as material for linguistic research; the text annotation must mark the basic textual information as well as paralinguistic phenomena, and paralinguistic phenomena in the basic annotation can be represented with universal paralinguistic symbols;
the second part is syllable annotation: Mandarin syllables are annotated with standard Mandarin syllable labels, and the annotation carries tone marks, in which 0 denotes the neutral tone, 1 the first tone (yin-ping), 2 the second tone (yang-ping), 3 the third tone (shang), and 4 the fourth tone (qu).

7. The method according to claim 6, characterized in that the acoustic parameter analysis of the voice files in step 4 also includes the extraction of acoustic parameters:
first, the recorded speech is segmented and silent sections are removed, to ensure that the objects of analysis are single characters, phrases, sentences, or dialogues; then the start and end points of the speech signal are determined in the waveform data and the speech is annotated; finally, the corresponding fundamental frequency and formant acoustic parameter data are obtained with an autocorrelation algorithm.

8. The method according to any one of claims 1-7, characterized in that the establishment of the database management system in step 5 includes the selection of the database: a relatively easy-to-implement SQL database management system is chosen.

9. The method according to claim 8, characterized in that four kinds of material must be stored when establishing the database management system in step 5: first, speaker-attribute material; second, pronunciation-text material, recording and storing the patients' pronunciation material together with the corresponding text material such as pronunciations and Mandarin IPA transcriptions; third, actual speech data, used to save the raw parameters of the recorded speech waveforms; fourth, acoustic-analysis parameter data, i.e. the storage of the acoustic parameters extracted from the processed speech waveforms.
CN202011546906.5A 2020-05-12 2020-12-24 Method for establishing and analyzing mobility dysarthria voice library in big data background Active CN112599119B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010395558X 2020-05-12
CN202010395558 2020-05-12

Publications (2)

Publication Number Publication Date
CN112599119A true CN112599119A (en) 2021-04-02
CN112599119B CN112599119B (en) 2023-12-15

Family

ID=75200795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011546906.5A Active CN112599119B (en) 2020-05-12 2020-12-24 Method for establishing and analyzing mobility dysarthria voice library in big data background

Country Status (1)

Country Link
CN (1) CN112599119B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450777A (en) * 2021-05-28 2021-09-28 华东师范大学 End-to-end sound barrier voice recognition method based on comparison learning
CN113889096A (en) * 2021-09-16 2022-01-04 北京捷通华声科技股份有限公司 Method and device for analyzing sound library training data
CN114566248A (en) * 2022-01-18 2022-05-31 华东师范大学 Intelligent pushing method for Chinese sound construction training scheme
CN114999468A (en) * 2022-05-20 2022-09-02 河北科技大学 Speech recognition algorithm and device for aphasia patients based on speech features

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6067520A (en) * 1995-12-29 2000-05-23 Lee And Li System and method of recognizing continuous mandarin speech utilizing chinese hidden markou models
CN102799684A (en) * 2012-07-27 2012-11-28 成都索贝数码科技股份有限公司 Video-audio file catalogue labeling, metadata storage indexing and searching method
CN103405217A (en) * 2013-07-08 2013-11-27 上海昭鸣投资管理有限责任公司 System and method for multi-dimensional measurement of dysarthria based on real-time articulation modeling technology
CN105740397A (en) * 2016-01-28 2016-07-06 广州市讯飞樽鸿信息技术有限公司 Big data parallel operation-based voice mail business data analysis method
CN106128450A (en) * 2016-08-31 2016-11-16 西北师范大学 The bilingual method across language voice conversion and system thereof hidden in a kind of Chinese
CN110111780A (en) * 2018-01-31 2019-08-09 阿里巴巴集团控股有限公司 Data processing method and server





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant