Summary of the invention
The present invention overcomes two defects of existing voice disorder measurement systems, namely that they measure along a single dimension and that their characteristic parameters are poorly targeted, by proposing a multi-dimensional voice disorder measurement system and method based on real-time voice conceptual modeling technology.
The present invention proposes a voice disorder measurement system based on real-time voice conceptual modeling technology, comprising: an acquisition device, which collects phoneme data; a computing device, which calculates parameter values of the phoneme data; and a judgment device, which compares each parameter value with the standard value range for the phoneme data. If the parameter value lies within the standard value range, no voice disorder is indicated; if the parameter value lies below or above the standard value range, a voice disorder is indicated and its type is determined.
In the voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention, the acquisition device comprises an omnidirectional microphone and an oral-nasal separation acquisition device.
In the voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention, the phoneme data comprise segmental phoneme data, suprasegmental phoneme data, hybrid segmental phoneme data and separate-type segmental phoneme data.
In the voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention, the segmental phoneme data comprise data related to speech intelligibility.
In the voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention, the suprasegmental phoneme data comprise data related to the pitch variation rate.
In the voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention, the hybrid segmental phoneme data comprise the voiceless-to-voiced ratio, the articulation manner ratio, the place of articulation ratio, the acoustic cue and the aspiration time ratio.
In the voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention, the separate-type segmental phoneme data comprise the nasal flow, the nasal linear prediction spectrum and the nasal power spectrum.
The voice disorder measurement system based on real-time voice conceptual modeling technology proposed by the present invention further comprises a monitoring device, which monitors the rehabilitation effect on the voice disorder according to the differences between the phoneme data obtained in successive acquisitions.
The present invention also proposes a voice disorder measurement method based on real-time voice conceptual modeling technology, comprising the following steps. Step 1: collect phoneme data. Step 2: calculate the parameter values of the phoneme data. Step 3: compare each parameter value with the standard value range; if the parameter value lies within the standard value range, no voice disorder is indicated; if the parameter value lies below or above the standard value range, a voice disorder is indicated and its type is determined.
The voice disorder measurement method based on real-time voice conceptual modeling technology proposed by the present invention further comprises Step 4: monitor the rehabilitation effect on the voice disorder according to the differences between the phoneme data recorded in repeated measurements.
Based on real-time voice conceptual modeling technology, the present invention breaks through the limitation of objectively measuring voice disorders along a single dimension and adds the detection and analysis of separated nasal signals. It analyzes speech along several dimensions (segmental phonemes, suprasegmental phonemes, hybrid segmental phonemes and separate-type segmental phonemes) and extracts characteristic parameters from each dimension for the objective measurement of voice disorders, making the assessment and training of users with voice disorders more effective.
The present invention uses real-time voice conceptual modeling technology to monitor the effect of speech rehabilitation. The monitoring comprises statistical test analysis of the measurement data and single-subject experiments, so that the periodic measurement results of the multi-dimensional characteristic parameters obtained during voice disorder rehabilitation training are used to monitor the effect of that training, and the rehabilitation training strategy can receive timely feedback and adjustment.
Based on real-time voice conceptual modeling technology, the present invention establishes a comprehensive and targeted multi-dimensional voice disorder measurement method. It integrates previously scattered objective measurement techniques and adds the monitoring of rehabilitation effect, forming a complete, comprehensive voice disorder measurement and evaluation system that lays a foundation for the rehabilitation of users with voice disorders.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the following specific embodiments and the drawings. Except for the content specifically mentioned below, the processes, conditions and experimental methods for implementing the present invention are general knowledge and common knowledge in the art, and the present invention places no particular limitation on them.
Figure 10 shows the multi-dimensional voice disorder measurement system of the present invention, which comprises an acquisition device 1, a computing device 2 and a judgment device 3.
The acquisition device 1 comprises an omnidirectional microphone and an oral-nasal separation acquisition device. Using voice conceptual modeling technology, the acquisition device 1 collects phoneme data in two dimensions, voice and nasal sound, through the omnidirectional microphone and the oral-nasal separation acquisition device respectively.
The computing device 2 is connected to the acquisition device 1 and calculates the parameter values of the phoneme data.
The judgment device 3 is connected to the computing device 2 and stores the standard value ranges associated with the phoneme data. A standard value range represents the normal interval of a given phoneme parameter for a given population group at a given age. The judgment device 3 uses the standard value range to judge whether a parameter value of the phoneme data is abnormal: if the parameter value lies within the standard value range, the phoneme data are normal; if the parameter value lies above or below the standard value range, a voice disorder exists.
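The range comparison performed by device 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name judge_parameter, the returned labels and the decision to treat the range endpoints as normal are all choices made here for illustration.

```python
from typing import Tuple


def judge_parameter(value: float, standard_range: Tuple[float, float]) -> str:
    """Compare one parameter value with its standard value range.

    Assumption: the endpoints count as normal (the text only says
    "within", "above" and "below" the range).
    """
    low, high = standard_range
    if low > high:
        raise ValueError("standard_range must be given as (low, high)")
    if value < low:
        return "below range"  # abnormal: a voice disorder is indicated
    if value > high:
        return "above range"  # abnormal: a voice disorder is indicated
    return "normal"
```

For instance, with the 30 Hz to 70 Hz range quoted in Embodiment 2, judge_parameter(9.24, (30.0, 70.0)) returns "below range".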
Further, the multi-dimensional voice disorder measurement system of the present invention also comprises a monitoring device 4. The monitoring device 4 uses the differences between the phoneme data obtained in repeated measurements to monitor the rehabilitation effect before and after voice rehabilitation training.
Figure 11 shows the flow for obtaining the various phoneme data. The phoneme data comprise segmental phoneme data, suprasegmental phoneme data, hybrid segmental phoneme data and separate-type segmental phoneme data.
The segmental phoneme data comprise data related to speech intelligibility. The parameter values of the segmental phoneme data include word intelligibility, sentence intelligibility, continuous-speech intelligibility and the like; taken together, they reflect the clarity of the user's speech. By comparing the parameter values of several segmental phonemes with the corresponding standard ranges, speech intelligibility can be assessed from several aspects.
The suprasegmental phoneme data comprise data related to the pitch variation rate. For suprasegmental phonemes, the pitch variation rate is the standard deviation of the fundamental frequency of the speech signal. Its measurement covers rising-tone, falling-tone and rise-fall-tone ability: the user phonates in a rising, a falling and a rise-fall tone respectively, and the standard deviation of the fundamental frequency of each resulting speech signal reflects the ability to vary intonation in continuous speech, and thus the fluency of the speech.
The hybrid segmental phoneme data comprise the voiceless-to-voiced ratio, the articulation manner ratio, the place of articulation ratio, the acoustic cue, the aspiration time ratio and the like. The voiceless-to-voiced ratio is the ratio of the duration of voiceless sounds to the duration of voiced sounds in the input speech segment; it reflects the ability to switch between voiceless and voiced sounds in continuous speech. The articulation manner ratio is the ratio of the duration of a given manner of articulation to the duration of the whole input speech segment; it reflects the accuracy of the manner of articulation in continuous speech. The place of articulation ratio is the ratio of the duration of a given place of articulation to the duration of the whole input speech segment; it reflects the accuracy of the place of articulation in continuous speech. The acoustic cue is the formant trajectory of the transition from a voiceless sound to a voiced sound in the input speech segment; it reflects the accuracy of voiceless-sound pronunciation and of the voiceless-to-voiced transition in continuous speech. The aspiration time ratio is the ratio of the duration of aspirated sounds to the duration of the whole speech segment; it reflects the accuracy of aspiration in continuous speech.
The separate-type segmental phoneme data comprise data related to the nasal flow, the nasal linear prediction spectrum and the nasal power spectrum. The nasal flow is the ratio of the nasal sound pressure level to the output sound pressure level (the sum of the oral sound pressure level and the nasal sound pressure level); it reflects the ability to produce nasal sounds in continuous speech. The nasal linear prediction spectrum is a spectrum in which the formant frequencies, bandwidths and amplitudes of nasal sounds can be observed; it reflects the accuracy of the vocal tract shape during nasal pronunciation in continuous speech. The nasal power spectrum is a spectrum in which the harmonic structure of nasal sounds and the regions where their energy is concentrated can be observed; it reflects how energy varies with frequency during nasal pronunciation in continuous speech.
Each of the above kinds of phoneme data (segmental, suprasegmental, hybrid segmental and separate-type segmental) indicates a different type of voice disorder. Figure 9 shows the flow of the multi-dimensional voice disorder measurement method of the present invention based on real-time voice conceptual modeling technology. Through multi-dimensional measurement and comprehensive analysis of the phoneme data, the method judges whether a voice disorder exists and, if so, which type.
Embodiment 1: speech intelligibility measurement
In this embodiment, a microphone records, through line input, the speech read aloud by the user in order to collect the segmental phoneme data of that speech. The test speech material contains a number of scoring points against which speech intelligibility can be scored, as shown in Figure 1. After all scoring points have been scored, the computing device 2 calculates the percentage of correctly produced scoring points out of the total number of scoring points. This percentage is the parameter value of the segmental phoneme data, and covers parameters such as word intelligibility, sentence intelligibility and continuous-speech intelligibility, as shown in Figure 2. Using the user's information, the judgment device 3 compares each of the user's parameter values with the standard value range corresponding to the user's age and sex. If the parameter value lies within the standard value range, the user's speech intelligibility is normal; if the parameter value lies below the standard value range, a voice disorder exists, specifically a speech intelligibility disorder.
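The percentage described in Embodiment 1 can be sketched as follows. The function name intelligibility_percent and the representation of scoring results as a sequence of booleans are assumptions made for illustration; the text does not say how scoring results are stored.

```python
from typing import Sequence


def intelligibility_percent(scored_points: Sequence[bool]) -> float:
    """Percentage of scoring points judged correct.

    scored_points holds one boolean per scoring point in the test
    material (True = produced correctly). This layout is hypothetical.
    """
    if len(scored_points) == 0:
        raise ValueError("at least one scoring point is required")
    correct = sum(1 for ok in scored_points if ok)
    return 100.0 * correct / len(scored_points)
```

The same function could be applied separately to the scoring points for words, sentences and continuous speech to obtain the three intelligibility parameters.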
Embodiment 2: speech fluency measurement
The test speech material is produced in a rising tone, a falling tone and a rise-fall tone respectively. The acquisition device 1 collects the speech that the user utters according to the test speech material to obtain suprasegmental phoneme data, and the computing device 2 calculates the standard deviation of the fundamental frequency of these data to obtain the pitch variation rate.
This embodiment takes the rising-tone measurement as an example. The test speech material is the vowel /i--i/ produced in a rising tone, gliding from the lowest pitch to the highest pitch the user can reach. The user phonates according to this test material, and the acquisition device 1 obtains the phoneme data in the speech signal. Preferably, during acquisition by the acquisition device 1 the user's speech undergoes preprocessing such as pre-filtering, amplification, DC offset removal and normalization, in order to raise the signal-to-noise ratio of the speech signal and improve measurement accuracy. The computing device 2 then performs voiced/voiceless discrimination on the phoneme data, calculates the fundamental frequency values of the voiced part, and computes the standard deviation of those values. The judgment device 3 compares the user's fundamental frequency standard deviation with the corresponding standard value range. If the standard deviation lies within the standard value range, the user's rising-tone ability is normal; otherwise, a standard deviation that is too large or too small indicates that the user's rising-tone ability is impaired, which reveals a fluency disorder.
For example, Figure 3 shows the pitch variation rate of a patient with abnormal rising-tone ability; the standard deviation of the fundamental frequency is 9.24 Hz. The standard value range for normal speakers is 30 Hz to 70 Hz, so this patient's value falls far below the reference range. A voice disorder is therefore judged to exist, specifically an abnormal pitch variation rate.
Falling-tone and rise-fall-tone ability are measured in the same way. The three results reflect different pitch variation abilities; if any one of the values is abnormal, the user's pitch variation ability is impaired and the fluency of the speech is abnormal.
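The calculation in Embodiment 2 can be sketched as follows, assuming the fundamental frequency has already been tracked frame by frame. Several points are assumptions made here and are not stated in the text: the function names, the convention that voiceless frames carry an F0 of 0, the use of the population standard deviation, and the reuse of the 30 Hz to 70 Hz range for all three tone tasks (the text quotes it only as the normal range in the rising-tone example).

```python
import statistics
from typing import Dict, Sequence


def pitch_variation_rate(f0_hz: Sequence[float]) -> float:
    """Standard deviation of F0 over the voiced frames only.

    Assumption: voiceless frames are marked with F0 == 0 and skipped.
    """
    voiced = [f for f in f0_hz if f > 0.0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return statistics.pstdev(voiced)  # population SD (an assumption)


def tonal_ability_normal(sd_by_task: Dict[str, float],
                         low_hz: float = 30.0,
                         high_hz: float = 70.0) -> bool:
    """True only if every tone task falls inside the standard range.

    One abnormal task is enough to flag a pitch variation disorder.
    """
    return all(low_hz <= sd <= high_hz for sd in sd_by_task.values())
```

With the rising-tone value of 9.24 Hz from the example above, tonal_ability_normal reports an abnormality regardless of the other two tasks.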
With reference to Embodiments 3 to 7 below, the hybrid segmental phoneme characteristic parameters are measured, and the user's various hybrid segmental phoneme data are then analyzed to determine the severity of the user's hybrid segmental phoneme disorder.
Embodiment 3: voiceless/voiced switching ability measurement
The voiceless-to-voiced ratio is the ratio of the duration of voiceless sounds to the duration of voiced sounds in a speech segment. The acquisition device 1 collects hybrid segmental phoneme data from the user's speech segment. The computing device 2 performs voiced/voiceless discrimination on these data to obtain the durations of the voiceless part and the voiced part, and divides the former by the latter to obtain one parameter value of the hybrid segmental phoneme data, namely the voiceless-to-voiced ratio.
This embodiment takes as an example the speech material "handcuff" (kao4), uttered by the user, which contains both a voiceless sound and a voiced sound. After the acquisition device 1 collects the phoneme data, the computing device 2 performs voiced/voiceless discrimination, detects and marks the voiceless and voiced parts, then calculates the total duration of each part and takes their ratio to obtain the voiceless-to-voiced ratio, as shown in Figure 4. The judgment device 3 compares the user's voiceless-to-voiced ratio with the standard value range. If the ratio falls within the standard value range, the user's voiceless/voiced switching ability is normal; otherwise, a ratio that is too large or too small indicates a voice disorder, specifically impaired voiceless/voiced switching ability.
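Once each analysis frame has been marked as voiceless or voiced, the ratio in Embodiment 3 reduces to a count of frames. The sketch below assumes equal-length frames and a hypothetical labelling ("U" for voiceless, "V" for voiced, anything else such as "S" for silence is ignored); the discrimination step itself is not shown and is not specified in the text.

```python
from typing import Sequence


def voiceless_voiced_ratio(frame_labels: Sequence[str]) -> float:
    """Duration of voiceless sound divided by duration of voiced sound.

    Frames are assumed to be of equal length, so frame counts are
    proportional to durations and the frame length cancels out.
    """
    voiceless = sum(1 for lab in frame_labels if lab == "U")
    voiced = sum(1 for lab in frame_labels if lab == "V")
    if voiced == 0:
        raise ValueError("no voiced frames: ratio is undefined")
    return voiceless / voiced
```

The resulting value would then be compared with the standard value range in the same way as any other parameter.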
Embodiment 4: articulation manner measurement
The articulation manner ratio is the ratio of the duration of a given manner of articulation to the duration of the whole input speech segment. The duration of the specific manner of articulation and the duration of the whole speech segment are obtained by analyzing the sound types in the speech and calculating their durations and boundaries. The sound-type analysis includes sonorant detection, vowel detection, voiced-consonant detection, nasal detection, voiced-fricative detection, plosive detection and voiceless-sound detection.
This embodiment takes the measurement of the plosive articulation manner ratio as an example. The user utters the speech material "bag (bao1)/throw (pao1)", which contains plosives, and the acquisition device 1 collects the hybrid segmental phoneme data in the user's speech. The computing device 2 performs voiced/voiceless discrimination on these data, including sonorant detection, vowel detection and voiceless-sound detection, to obtain the sound-type analysis result shown in Figure 5. The computing device 2 then marks the sound types and determines their boundaries, calculates the duration of the voiceless plosive part and the total duration of the whole speech material, and takes their ratio to obtain the plosive ratio, which is one parameter value of the hybrid segmental phoneme data.
The judgment device 3 compares the user's plosive ratio with the standard value range. If the plosive ratio falls within the standard value range, the user's manner of articulation is normal; otherwise, a plosive ratio that is too large or too small indicates a voice disorder, specifically an articulation manner disorder.
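After the sound types have been marked and their boundaries fixed, the plosive ratio of Embodiment 4 is a duration ratio over labelled segments. The sketch below assumes a hypothetical segment layout of (label, start time, end time) in seconds and takes the whole-segment duration as the span from the earliest start to the latest end; neither detail comes from the text.

```python
from typing import Sequence, Tuple

# (label, start_s, end_s); this layout is an assumption for illustration.
Segment = Tuple[str, float, float]


def duration_ratio(segments: Sequence[Segment], target_label: str) -> float:
    """Duration of one sound type divided by the whole segment duration."""
    if not segments:
        raise ValueError("no segments given")
    for _, start, end in segments:
        if end < start:
            raise ValueError("segment ends before it starts")
    target = sum(end - start
                 for label, start, end in segments
                 if label == target_label)
    whole = max(end for _, _, end in segments) - min(s for _, s, _ in segments)
    if whole <= 0.0:
        raise ValueError("whole segment has zero duration")
    return target / whole
```

Because the function only depends on the labels, the same calculation could serve for any ratio defined against the duration of the whole speech segment.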
Embodiment 5: aspiration ratio measurement
The aspiration ratio is the ratio of the duration of aspirated sounds to the duration of the whole speech segment. The acquisition device 1 collects the hybrid segmental phoneme data in the user's speech signal. The computing device 2 analyzes the sound types and calculates durations and boundaries to obtain the duration of aspiration and the duration of the whole speech segment, and from these the aspiration ratio, which is one parameter value of the hybrid segmental phoneme data.
For example, the speech material "throw/handcuff", uttered by the user and containing aspirated stops, is collected. The computing device 2 performs voiced/voiceless discrimination on it, then performs voiceless-sound detection and aspiration detection to obtain the analysis result and the boundaries of the aspirated stop. It then calculates the duration of the aspirated stop and the total duration of the whole word, and divides the former by the latter; the result is the aspiration ratio.
The judgment device 3 compares the user's aspiration ratio with the standard value range. If the aspiration ratio falls within the standard value range, the user's aspiration is normal; otherwise, an aspiration ratio that is too large or too small indicates a voice disorder, specifically an aspiration disorder.
Embodiment 6: acoustic cue measurement
This embodiment takes as an example the speech material "take", uttered by the user, which contains a voiceless sound. Figure 6 shows the waveform of the corresponding phoneme data collected by the acquisition device 1. The second panel is the formant track plot of the corresponding spectrogram; it contains the first three formants of the speech signal, which from bottom to top are the first, second and third formants. The horizontal axis is time and the vertical axis is the formant frequency. Because the second formant mainly reflects the structure and shape of the oral cavity, the second formant is used to reflect the place of articulation.
In this embodiment, the acquisition device 1 collects the hybrid segmental phoneme data in speech material that combines a voiceless sound with a vowel. Because voiceless sounds have no formants, the computing device 2 takes the starting value of the second formant as the cue value, which is one parameter value of the hybrid segmental phoneme data. In Figure 6 the cue value is the starting frequency of the second horizontal line, 801 Hz; the third panel shows the intensity of the corresponding formant. The standard value range of the cue value for "take" is 1200 Hz to 2400 Hz, so this user's value falls far below the standard range. A voice disorder is therefore judged to exist, specifically an abnormal acoustic cue.
Embodiment 7: place of articulation ratio measurement
The place of articulation ratio is the ratio of the duration of a given place of articulation to the duration of the whole input speech segment. The acquisition device 1 collects the hybrid segmental phoneme data in the user's speech signal. The computing device 2 analyzes the formants and determines the place of articulation and its boundaries to obtain the duration of the specific place of articulation and the duration of the whole speech segment, and divides the former by the latter. The result is the place of articulation ratio, which is one parameter value of the hybrid segmental phoneme data.
For example, the user utters the speech material "throw/cover", which contains apical (tongue-tip) plosives, and the acquisition device 1 collects the hybrid segmental phoneme data in the speech material. The computing device 2 extracts and tracks the formants to obtain the frequency of each formant segment and the frequency at its starting point, and from these determines the position and boundaries of the apical plosive. It then calculates the duration of the apical plosive and the total duration of the whole word and takes their ratio, obtaining the place of articulation ratio of the apical plosive, which is one parameter value of the hybrid segmental phoneme data.
The judgment device 3 compares the place of articulation ratio with the standard value range. If the ratio falls within the standard value range, the user's place of articulation is normal; otherwise, a ratio that is too large or too small indicates a voice disorder, specifically a place of articulation disorder.
With reference to Embodiment 8 below, the separate-type segmental phoneme characteristic parameters are measured, and the user's various separate-type segmental phoneme data are then analyzed to determine the severity of the user's separate-type segmental phoneme disorder.
Embodiment 8: nasal flow measurement
Figure 7 shows the waveforms used to detect nasal flow in one embodiment, displayed in a main window and a secondary window. The three plots in the main window are, from top to bottom, the nasal flow plot, the nasal speech signal waveform and the oral speech signal waveform. The secondary window shows, from top to bottom, the nasal speech signal waveform and the oral speech signal waveform. The horizontal axis of every plot is time. The nasal flow plot corresponds to the waveforms below it; the nasal flow value differs from one point in time to another.
The acquisition device 1 collects the separate-type segmental phoneme data for nasal flow while the user is phonating, and the computing device 2 calculates the nasal and oral sound pressure levels from these data to obtain the nasal flow measurement. Specifically, the computing device 2 first preprocesses each channel of the two-channel speech signal recorded with separate oral and nasal channels, then calculates the nasal signal energy and the oral signal energy, and finally divides the nasal signal energy by the sum of the nasal and oral energies to obtain the nasal flow, which is one parameter value of the separate-type segmental phoneme data. In this embodiment the measured mean nasal flow is 49%. The standard value range for the mean nasal flow is 18% to 34%, so the measured value lies far above the reference range. The judgment device 3 therefore judges that a voice disorder exists, specifically abnormal nasal flow.
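The final energy-ratio step of Embodiment 8 can be sketched as follows. The function name, the use of the sum of squared samples as the signal energy, and the assumption that both channels are already preprocessed and of equal length are illustrative choices, not details given in the text.

```python
from typing import Sequence


def nasal_flow_percent(nasal: Sequence[float], oral: Sequence[float]) -> float:
    """Nasal energy as a percentage of total (nasal + oral) energy.

    Both inputs are assumed to be preprocessed sample sequences from the
    separated nasal and oral channels, covering the same time span.
    """
    if len(nasal) != len(oral):
        raise ValueError("channels must have the same number of samples")
    nasal_energy = sum(x * x for x in nasal)  # energy as sum of squares
    oral_energy = sum(x * x for x in oral)
    total = nasal_energy + oral_energy
    if total == 0.0:
        raise ValueError("both channels are silent")
    return 100.0 * nasal_energy / total
```

Applied over successive short windows rather than the whole recording, the same function would give a nasal flow curve over time, whose mean could then be compared with the 18% to 34% range quoted above.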
The speech material for the above parameters may be the same or different, as required. For example, if the user has reduced nasal function, speech content containing a large number of nasal sounds is selected as the oral-nasal separated acoustic input, each of the above parameters is analyzed and measured on it, and the measurement results for the user's separate-type segmental phonemes are obtained.
The monitoring device 4 measures the above parameters periodically during voice disorder rehabilitation training in order to monitor the effect of the training, so that the training strategy can be adjusted in time. It mainly uses statistical test analysis and single-subject techniques to track the measurements. Figure 8 is a schematic diagram of the data measured by the monitoring device in one embodiment. The statistical test analysis of the periodic measurement results comprises an autocorrelation test, a significance test and regression analysis: the autocorrelation test verifies the validity of the measurement results, the significance test verifies the difference between the measurement results before and after intervention training, and the regression analysis verifies the trend of the measurement results.
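Two of the checks named above can be sketched for a series of periodic measurements of one parameter. The text does not specify which autocorrelation statistic or regression model is used, so the lag-1 autocorrelation coefficient and an ordinary least-squares slope against the session index are assumptions made here; the significance test is not shown.

```python
from typing import Sequence


def lag1_autocorrelation(values: Sequence[float]) -> float:
    """Lag-1 autocorrelation coefficient of a measurement series."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three measurements")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0.0:
        raise ValueError("series is constant")
    num = sum((values[t] - mean) * (values[t + 1] - mean)
              for t in range(n - 1))
    return num / denom


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of the series against session index 0, 1, 2..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx
```

A negative slope for a parameter that started above its standard range, or a positive slope for one that started below it, would suggest that the measurements are moving toward the normal interval over the course of training.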
The scope of protection of the present invention is not limited to the above embodiments. Variations and advantages that a person skilled in the art can conceive of without departing from the spirit and scope of the inventive concept are all included in the present invention, and the scope of protection is defined by the appended claims.