GB2282456A - Wave analysis - Google Patents
- Publication number
- GB2282456A
- Authority
- GB
- United Kingdom
- Prior art keywords
- amplitude
- frequency
- time
- wave data
- event
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
A method for determining a similarity between predetermined wave data and discrete portions of a continuous wave data is disclosed. The predetermined wave data is inputted and a first information relevant to each time strip of the wave data is generated and stored. Then the continuous stream of wave data is inputted and a second signal relevant to each time strip of the continuous wave data is generated. The two kinds of information are compared to obtain a similarity score for each strip. The information may relate to amplitude-independent, frequency-distribution widths, settle times or width forms or amplitude-independent slopes for the partials of each time strip. A plurality of similarity scores may be derived for event recognition.
Description
METHOD AND APPARATUS FOR WAVE ANALYSIS AND EVENT RECOGNITION
Computer program source code listings with labeled titles and general functional descriptions comprising sequences of instructions, routines, and the like are provided in the microfiche appendix, which is incorporated herein by reference. The program listings in the microfiche appendix are directed to unobvious software utilized in practicing the preferred embodiment of the invention. This software is unique to the invention. Other software used in practicing the invention is either commercially available or within the skill of practitioners in the programming arts. The microfiche appendix consists of 2 sheets of microfiche, each sheet having 70 frames, not all of which have been used.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
The invention relates to analysis of continuous wave data and, more particularly, to a method and apparatus for determining a similarity between discrete portions of a continuous stream of wave data and a set of data.
The need for recognizing and identifying individual events contained within a continuous stream of wave data arises in a broad range of technologies such as the conversion of music into an interactive database of musical events, the recognition of potentially dangerous patterns in a continuous stream of wave data produced by medical monitoring devices, the recognition of specific objects in radar, sonar, laser, and other wave reflections by surveillance or navigational devices, and the like. Event recognition in this context means, in general, the separation and identification of specific wave patterns contained in complex continuous wave data.
Waves are a fundamental means of describing the movement of energy in the physical world. A wave consists of a travelling pattern of energies which fits within a certain distance, or wavelength. The pattern described within a wavelength is a cycle, and a stationary observer perceives the arrival of a repeating wave pattern as a certain number of cycles per second, or frequency. A waveform is defined as one cycle of a wave pattern. There are no theoretical limits to the frequency of a wave pattern, although the limits of current detection apparatus range from ultra low frequency waves (0.001 Hz) to hard cosmic rays (one thousand billion billion Hz). Practitioners in the art will recognize the invention can be applied to any wavelength capable of analysis using state of the art detection and processing equipment.
It is known that a complex waveform can be completely described by its component sinewaves. Complex waves from multiple wave sources can be combined to form even more complex waves containing multiple wave events. In Digital Filters and Signal Processing, Kluwer Academic Publishers, Boston, 1986, Roland Jackson describes how waves of related frequencies combine. Component sinewaves of complex waveforms add together, cancel each other, attract each other, and modify each other through interference. Yet despite this complexity of wave combinations, a wave event generated by a particular source usually maintains its own identity, even when combined with many other simultaneous complex waves.
The human perception of sound is the best example of this phenomenon.
An instrument, such as a flute, is easily distinguished by an average listener from a second instrument, such as a timpani, even though both instruments are played simultaneously. Such human perception of sound waves suggests that separation and identification of waves of many frequencies should be possible.
Consequently, for many years researchers have studied the perception of sound as the quintessential guide toward understanding complex wave analysis and event recognition. Considerable technology exists in the field of acoustic wave analysis, and technology related to spectral analysis and pattern recognition exists in other fields. The use of acoustic wave analysis as illustrated by the preferred embodiment of the invention serves to demonstrate that the invention is useful in many fields for spectral analysis and pattern recognition techniques. Since the complexity of acoustic patterns requires attention to a level of detail exceeding the detail required in many other fields, the application of the preferred embodiment to acoustic patterns serves as a comprehensive example of the practice of the invention.
In current practices of analysis of continuous audio data, an instrument playing a note is considered a wave source. The notes played by the instrument are considered the individual events. As time progresses forward, the notes played produce a continuous stream of complex wave data. The continuous wave data is digitally sampled at precise intervals and the resulting stream of digits is processed by a computer. Most analysis and event recognition techniques try to transform the continuous stream of sampled wave data into frequency, amplitude, and time spectra before attempting any form of pattern recognition. The goals of existing pattern recognition procedures are the identification of the pitch, amplitude, and timing of the note event.
James A. Moorer describes one type of complex continuous wave data analysis in his paper, "On the Segmentation and Analysis of Continuous Musical Sound by Digital Computer," CCRMA, Department of Music, Report No. STAN-M-3. Moorer uses comb filters to preprocess the music wave data and then refines the data with band-pass filters. Moorer relies on harmonic ratios for his preprocessing and pattern recognition procedures and constructs linked lists of all potential notes based on occurrences of events conforming to predetermined harmonic ratios. Moorer places extreme constraints on the structure and instrumentation of the music which can be analyzed, since the optimum comb step depends on predictable and stable harmonics at integer harmonic ratios. However, real music rarely has predictable and stable harmonics and does not conform to the compositional constraints imposed by Moorer.
The transformation of digitally sampled data into accurate frequency, amplitude, and time spectra is a difficult and computationally intensive process that has received considerable research. The nature of the difficulty lies in the fact that, for any single analysis step, the more accurate the frequency resolution becomes, the less accurate the time resolution becomes. The reverse condition is equally true. In "The Bounded-Q Frequency Transform," CCRMA, Department of Music, Report No. STAN-M-28, Kyle L. Kashima, et al., present a method of frequency analysis called the Bounded-Q Frequency Transform, which "lowpass-filters" and "sub-samples" input data and applies a Fast Fourier Transform [FFT] to the resulting data for each octave of output desired. Although the effective frequency resolution is good and the computational time is relatively low, the final results distort time to such a degree that the results are not sufficiently accurate if the timing accuracy of events is important, which it is in music and most other fields.
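The trade-off between frequency resolution and time resolution described above can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not taken from the specification: the function name and the 44100 Hz sampling rate are assumptions chosen for illustration. For a single FFT of length N at sampling rate fs, the bin spacing is fs/N and the span of time covered is N/fs, so improving one necessarily worsens the other.

```python
def fft_resolution(sample_rate_hz, fft_length):
    """Return (bin spacing in Hz, window duration in seconds) for one FFT."""
    bin_spacing_hz = sample_rate_hz / fft_length
    window_seconds = fft_length / sample_rate_hz
    return bin_spacing_hz, window_seconds


# Hypothetical sampling rate; the specification does not fix one here.
SAMPLE_RATE_HZ = 44100.0

for n in (256, 1024, 4096):
    df, dt = fft_resolution(SAMPLE_RATE_HZ, n)
    # Longer FFTs give finer frequency bins but cover a longer span of time.
    print(f"N={n:5d}  bin spacing={df:8.3f} Hz  window={dt * 1000.0:7.2f} ms")
```

The product of the two returned values is always 1, which is one way of stating why no single FFT length can be accurate in both frequency and time.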
Other attempts to transform digitally sampled data into frequency, amplitude, and time spectra are described by Lawrence R. Rabiner, et al., in Theory and Application of Digital Signal Processing, Prentice-Hall Inc., Englewood Cliffs, NJ., 1975. The use of the FFT with a high degree of overlap is described as well as the principle of a bank of band-pass filters. Rabiner's work does not recommend a bank of band-pass filters due to the tremendous computation time required. The inherent problem with Rabiner et al.'s recommended FFT technique is that the frequency spacing is not suited to most analysis needs since output from the FFT is linearly spaced and does not lend itself to more generalized needs.
John Chowning, et al., propose a study of continuous wave analysis and event recognition in their paper, "Intelligent Analysis of Composite Acoustic Signals," CCRMA, Department of Music, Report No. STAN-M-36. They propose using "simulated real-time problem solving heuristics" to determine strategies for allocating resources and controlling feedback loops, and propose a "system learning coprocessor" for parameter adjustment and various forms of pattern recognition. The multi-rate signal processing they propose is a form of the Bounded-Q Frequency Transform. Thus, the proposed system would suffer time distortion. Although this paper was only a proposal for continued funding, it does draw attention to the need for a process that can learn and retain a knowledge base of the facts learned.
In "Techniques for Note Identification in Polyphonic Music," CCRMA, Department of Music, Report No. STAN-M-29, Cris Chafe uses a moving-average technique to identify the beginnings of events prior to application of the Bounded-Q Frequency Transform. Although this technique can work well for some extremely simple forms of music, it does not perform well for dense or complex music. In addition to the weakness of the Bounded-Q Frequency Transform, Chafe ignores the unstable portion of the note event, thereby failing to utilize a tremendous portion of the vital data.
Clearly, a new and more comprehensive method and apparatus for complex continuous wave analysis and event recognition are needed to efficiently transform sampled wave data into frequency, amplitude, and time spectra without substantially distorting frequency or time. The method and apparatus should evaluate all aspects of spectral data without ignoring any portion of the data, despite any instability in the data, as well as readily adapt to a broad range of frequencies, timing, and analysis needs. The preferred embodiment of the invention addresses these needs and provides an efficient solution to these problems.
The major advantages of the preferred embodiment of the present invention are the speed, accuracy, flexibility, and consistency of the conversion of sampled wave data to frequency, amplitude, and time spectra and the extraordinary accuracy and flexibility of its event recognition. In addition, the capacity of the preferred embodiment to be configured to meet virtually any frequency and time requirements, as well as virtually any definable recognized wave event, will enable researchers in a broad range of scientific investigation to utilize the invention effectively.
The preferred embodiment of the present invention accomplishes wave analysis and event recognition with a greater degree of efficiency and accuracy than possible with any of the prior art. More specifically, by using the principles of the preferred embodiment the conversion of wave data to frequency, amplitude, and time spectra is many times faster and more accurate than using the best of the prior art. In addition, the preferred embodiment performs event recognition with a degree of accuracy and flexibility that effectively removes the restrictions imposed by the best attempts of the prior art. Furthermore, the preferred embodiment of the invention is "virtual" in the sense that its frequency scales, time scales, amplitude scales, conversion procedures, pattern recognition procedures, and file structures can be configured for a variety of applications.
According to the present invention there is provided a method for determining a similarity between predetermined wave data and discrete portions of a continuous stream of wave data, the method comprising the steps of (a) inputting as a first input predetermined spectral wave data, the predetermined spectral wave data including a plurality of time strips, with each time strip including at least a first partial and a second partial; (b) generating a first signal that includes first distortion information for each time strip of the predetermined spectral wave data; (c) storing the first distortion information for each time strip of the predetermined spectral wave data; (d) inputting as a second input a continuous stream of wave data; (e) generating from the second input spectral wave data that includes a plurality of time strips, with each time strip further including at least a third partial and a fourth partial; (f) generating a second signal that includes second distortion information for each time strip of the spectral wave data; and (g) comparing the second distortion information with the stored first distortion information to obtain a similarity score for each time strip.
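Steps (c) and (g) above, storing per-strip information and comparing it to obtain a similarity score for each time strip, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the specification does not define the scoring formula at this point, so the representation of each strip's distortion information as a list of numbers, the function names, and the score 1 / (1 + mean absolute difference) are all hypothetical.

```python
def strip_similarity(stored, observed):
    """Score two equal-length feature lists in (0, 1]; 1.0 means identical.

    The scoring rule is an assumption made for illustration only.
    """
    if not stored or len(stored) != len(observed):
        raise ValueError("feature lists must be non-empty and of equal length")
    mean_abs_diff = sum(abs(a - b) for a, b in zip(stored, observed)) / len(stored)
    return 1.0 / (1.0 + mean_abs_diff)


def score_time_strips(stored_strips, observed_strips):
    """Compare stored and observed information strip by strip."""
    return [strip_similarity(s, o) for s, o in zip(stored_strips, observed_strips)]


# Hypothetical per-strip values for two time strips of a stored event
# and the corresponding strips from a continuous stream.
stored = [[0.25, 0.50], [0.75, 0.10]]
stream = [[0.25, 0.50], [0.70, 0.30]]
print(score_time_strips(stored, stream))
```

One score is produced per time strip, so a run of high scores over consecutive strips can later be examined as a trend.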
In an embodiment of the invention, there is provided a method for computer automated detection and identification of one or more events from one or more continuous streams of wave data, each event having frequency, amplitude, and time components. The method comprises the steps of transforming a continuous stream of wave data into frequency, amplitude, and time spectral information to minimize computational time and to optimize frequency, amplitude, and time accuracy; extracting identifying characteristics from the spectral information obtained from the continuous stream of wave data; generating and retaining a Fingerprint from the extracted identifying characteristics of a selected known event comprising a source, a time, and a fundamental frequency which are determinable by a user; comparing the identifying characteristics of the Fingerprint to corresponding identifying characteristics extracted from the spectral information obtained from one or more continuous streams of wave data and producing a database of sets of scores describing a level of correspondence between the identifying characteristics of the Fingerprint and corresponding identifying characteristics of a potential event; and selecting from the database sets of scores having a preselected level of correspondence to the identifying characteristics of the Fingerprint and interpreting spectral information associated with the selected sets of scores to determine the presence of a specific event, its start-time, its amplitude, and its duration.
The transforming step can comprise interpreting output of an FFT function to derive amplitudes of selected frequency components using polynomial interpolation; interpreting output of an FFT function to predict an amplitude of a selected frequency component by searching for an amplitude peak inside a calculated bin range; providing frequency, amplitude, and time accuracy by combining spectral information produced by a plurality of different length FFT's using frequency limiting; and providing frequency, amplitude, and time accuracy by combining spectral information produced by different length FFT's using amplitude averaging and amplitude clipping.
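Searching for an amplitude peak inside a bin range and refining it by polynomial interpolation can be sketched as below. This is an assumption-laden illustration rather than the method of the microfiche appendix: it uses a second-order (parabolic) fit through the peak bin and its two neighbours, which is one common choice of polynomial, and the function names and sample magnitudes are hypothetical.

```python
def find_peak_bin(magnitudes, lo, hi):
    """Index of the largest magnitude among bins lo..hi inclusive."""
    return max(range(lo, hi + 1), key=lambda k: magnitudes[k])


def interpolate_peak(magnitudes, k):
    """Fit a parabola through bins k-1, k, k+1.

    Returns (fractional bin position, interpolated peak height).
    Falls back to the raw bin at the edges or when the three points are collinear.
    """
    if k <= 0 or k >= len(magnitudes) - 1:
        return float(k), magnitudes[k]
    a, b, c = magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        return float(k), b
    offset = 0.5 * (a - c) / denom
    return k + offset, b - 0.25 * (a - c) * offset


# Hypothetical magnitudes whose true peak lies between bins 5 and 6.
mags = [10.0 - (x - 5.3) ** 2 for x in range(11)]
k = find_peak_bin(mags, 3, 8)
position, height = interpolate_peak(mags, k)
print(k, position, height)
```

Multiplying the fractional bin position by the bin spacing of the FFT gives an estimated frequency that is finer than the bin spacing itself.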
The extracting identifying characteristics step can comprise extracting amplitude-independent slopes of partials over time; extracting amplitude-independent slopes existing between coterminous partials; extracting amplitude-independent, frequency-distribution widths for partials; extracting amplitude-independent, frequency-distribution width settle-times for partials over time; extracting amplitude-independent, frequency-distribution width forms for partials; determining a frequency position for a most prevalent amplitude peak of a partial and a relative drift of individual amplitude peaks from the most prevalent amplitude peak; and comparing an accumulated total amplitude for all of a plurality of partials for a predetermined number of time positions to an accumulated total amplitude derived from an amplitude reference Fingerprint having a known amplitude level and calculating a resulting relative amplitude level.
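One way to make the slope of a partial over time independent of amplitude is to normalize the partial by its own peak before differencing. The sketch below assumes this normalization; the specification does not state the formula here, and the function name and sample values are hypothetical.

```python
def amplitude_independent_slopes(amplitudes):
    """Slopes between consecutive time strips of one partial.

    The amplitudes are first divided by the partial's own peak (an assumed
    normalization), so a louder copy of the same event gives the same slopes.
    """
    if not amplitudes:
        return []
    peak = max(amplitudes)
    if peak <= 0:
        return [0.0] * (len(amplitudes) - 1)
    normalized = [a / peak for a in amplitudes]
    return [later - earlier for earlier, later in zip(normalized, normalized[1:])]


# Hypothetical amplitudes of one partial over four time strips,
# and the same event recorded eight times louder.
quiet = [1.0, 2.0, 4.0, 3.0]
loud = [8.0 * a for a in quiet]
print(amplitude_independent_slopes(quiet))
print(amplitude_independent_slopes(loud))
```

Both calls print the same list, which is the property the term "amplitude-independent" asks for.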
The Fingerprint generating and retaining step can comprise generating a Fingerprint from spectral information by extracting the identifying characteristics of the selected known event and retaining at least one of the identifying characteristics, and generating a Fingerprint from more than one selected known event, each having the same fundamental frequency and source, and retaining a set of identifying characteristics which represent an average of at least one of the identifying characteristics.
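Generating a Fingerprint from more than one known event by averaging can be sketched as follows. The data layout is an assumption made for illustration: each Fingerprint is represented as a dictionary mapping a characteristic name to a list of numbers, and every Fingerprint is assumed to hold the same names with lists of the same length. The names used are hypothetical.

```python
def average_fingerprints(fingerprints):
    """Average each identifying characteristic element by element."""
    if not fingerprints:
        raise ValueError("need at least one fingerprint")
    averaged = {}
    for name in fingerprints[0]:
        # Group the i-th value of this characteristic across all fingerprints.
        columns = zip(*(fp[name] for fp in fingerprints))
        averaged[name] = [sum(col) / len(fingerprints) for col in columns]
    return averaged


# Two hypothetical recordings of the same source at the same fundamental.
first = {"time_slopes": [0.25, 0.50], "widths": [2.0, 4.0]}
second = {"time_slopes": [0.75, 1.00], "widths": [4.0, 6.0]}
print(average_fingerprints([first, second]))
```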
The comparing step can comprise transposing selected frequency components of the identifying characteristics of the Fingerprint to corresponding frequency criteria of the spectral information; extracting at least one identifying characteristic from spectral information that corresponds to at least one of the identifying characteristics selected from the Fingerprint; comparing the extracted identifying characteristics of the spectral information to the corresponding identifying characteristics of the Fingerprint; and producing a set of scores which reflects the degree of amplitude-independent correspondence.
The selecting and interpreting step can comprise determining the presence of a specific event, accumulating a series of scores for a fundamental frequency associated with the specific event, and analyzing a trend of the series of scores over time to determine precisely where in time the specific event begins; further analyzing the series of scores of the specific event to determine if a separate potential event, having partials related to the partials of the specific event, exists within the time covered by the series of scores of the specific event, thereby revealing the possibility of an incorrect interpretation of the series of scores; further accumulating an amplitude for each partial of the specific event for each time location forward in time from the start of the specific event, comparing an accumulated total of the amplitudes of the partials to an amplitude total for partials of the Fingerprint, and calculating a relative amplitude for the specific event; further determining the duration of the specific event by examining the amplitude level of the specific event's partials over time to determine if and when the amplitude level falls below a minimum selected threshold; further determining the duration of the specific event by comparing the amplitude level of each partial of the specific event over time against an average of the amplitudes of the specific event taken at precedent time locations to determine whether a new event has interrupted the duration of the specific event; and further determining the duration of the specific event by testing for a drift of the amplitude peak over time to identify larger drift variations that can follow a predetermined time period of smaller drift variations, thereby indicating that a new event has interrupted the partial of the specific event.
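Of the duration tests listed above, the simplest is the threshold test: follow the amplitude level forward from the start of the event until it falls below a minimum selected threshold. A minimal sketch follows, assuming the amplitude level has already been reduced to one number per time strip; the function name, the sample levels, and the threshold value are hypothetical.

```python
def event_duration(levels, start_index, threshold):
    """Number of time strips from start_index until the level first drops below threshold.

    If the level never drops below the threshold, the event is taken to last
    to the end of the available data.
    """
    for offset, level in enumerate(levels[start_index:]):
        if level < threshold:
            return offset
    return len(levels) - start_index


# Hypothetical amplitude levels, one per time strip; the event starts at strip 1.
levels = [0.0, 0.9, 0.7, 0.4, 0.05, 0.0]
print(event_duration(levels, start_index=1, threshold=0.1))
```

The result is a count of time strips; converting it to seconds requires the time spacing of the strips, which depends on how the spectral data was generated.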
Another embodiment of the invention provides for synchronizing to an external time-base the start-times associated with a plurality of specific events, comprising the steps of generating and storing a table of elapsed sample count for each of a plurality of event timing requests; calculating an event timing request count for the start-time of the specific event and indexing the elapsed sample count table with the event timing request count to retrieve an elapsed sample count; calculating a time offset between the start-time of the specific event and an event timing request time calculated from the event timing request count; using the time offset, further calculating a relative time offset from the elapsed sample count; calculating a synchronized time by adding the elapsed sample count to the relative time offset; and assigning the synchronized time to the specific event.
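The synchronizing steps above can be sketched as follows. This is one possible reading only, and several details are assumptions: event timing requests are taken to arrive at a fixed nominal interval, the table holds the elapsed sample count recorded at each request, and the relative time offset is obtained by linear interpolation between two adjacent table entries. The function and parameter names are hypothetical.

```python
def synchronize(start_time_s, request_interval_s, elapsed_samples):
    """Map an event start-time onto the external time-base, in samples.

    elapsed_samples[i] is the elapsed sample count stored for timing request i.
    """
    # Event timing request count for the start-time of the event.
    index = int(start_time_s // request_interval_s)
    if index < 0 or index + 1 >= len(elapsed_samples):
        raise IndexError("start time falls outside the timing table")
    # Offset between the start-time and the time of that timing request.
    request_time_s = index * request_interval_s
    time_offset_s = start_time_s - request_time_s
    # Relative offset, scaled by the samples that actually elapsed (assumed linear).
    samples_per_request = elapsed_samples[index + 1] - elapsed_samples[index]
    relative_offset = time_offset_s / request_interval_s * samples_per_request
    # Synchronized time = elapsed sample count + relative offset.
    return elapsed_samples[index] + relative_offset


# Hypothetical table: the external clock ran unevenly between requests.
table = [0, 1000, 2100, 3000]
print(synchronize(0.75, 0.5, table))
```

With the values shown, the start-time falls halfway between requests 1 and 2, so the result lies halfway between the stored counts 1000 and 2100.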
Further embodiments of the invention also provide products produced by these processes, such as databases, representations, and derivatives thereof, audio and visual recordings and the like.
Another embodiment of the invention provides an apparatus for computer automated detection and identification of one or more events from one or more continuous streams of wave data, each event having frequency, amplitude, and time components. The apparatus comprises structure for transforming a continuous stream of wave data into frequency, amplitude, and time spectral information to minimize computational time and to optimize frequency, amplitude, and time accuracy; structure for extracting identifying characteristics from the spectral information obtained from the continuous stream of wave data; structure for generating and retaining a Fingerprint from the extracted identifying characteristics of a selected known event comprising a source, a time, and a fundamental frequency which are determinable by a user; structure for comparing the identifying characteristics of the Fingerprint to corresponding identifying characteristics extracted from the spectral information obtained from one or more continuous streams of wave data and for producing a database of sets of scores describing a level of correspondence between the identifying characteristics of the Fingerprint and corresponding identifying characteristics of a potential event; and structure for selecting from the database sets of scores having a preselected level of correspondence to the identifying characteristics of the Fingerprint and for interpreting spectral information associated with the selected sets of scores to determine the presence of a specific event, its start-time, its amplitude, and its duration.
The wave data transforming structure can comprise means for interpreting the output of an FFT function to derive amplitudes of selected frequency components using polynomial interpolation; means for interpreting the output of an FFT function to predict an amplitude of a selected frequency component by searching for an amplitude peak inside a calculated bin range; structure for providing frequency, amplitude, and time accuracy by combining spectral information produced by a plurality of different length FFT's using frequency limiting; and structure for providing frequency, amplitude, and time accuracy by combining spectral information produced by different length FFT's using amplitude averaging and amplitude clipping structure.
The identifying characteristics extracting structure can comprise structure for extracting amplitude-independent slopes of partials over time; structure for extracting amplitude-independent slopes existing between coterminous partials; structure for extracting amplitude-independent, frequency-distribution widths for partials; structure for extracting amplitude-independent, frequency-distribution width settle-times for partials over time; structure for extracting amplitude-independent, frequency-distribution width forms for partials; structure for determining a frequency position for a most prevalent amplitude peak of a partial and a relative drift of individual amplitude peaks from the most prevalent amplitude peak; and structure for comparing an accumulated total amplitude for all of a plurality of partials for a predetermined number of time positions to an accumulated total amplitude derived from an amplitude reference Fingerprint having a known amplitude level and structure for calculating a resulting relative amplitude level.
The Fingerprint generating and retaining structure can comprise structure for generating a Fingerprint from spectral information by extracting the identifying characteristics of the selected known event and retaining at least one of the identifying characteristics, and structure for generating a Fingerprint from more than one selected known event, each having the same fundamental frequency and source, and structure for retaining a set of identifying characteristics which represent an average of at least one of the identifying characteristics.
The comparing structure can comprise structure for transposing selected frequency components of the identifying characteristics of the Fingerprint to corresponding frequency criteria of the spectral information; structure for extracting at least one identifying characteristic from the spectral information that corresponds to at least one of the identifying characteristics selected from the Fingerprint; structure for comparing the extracted identifying characteristics of the spectral information to the corresponding identifying characteristics of the Fingerprint; and structure for producing a set of scores which reflects the degree of amplitude-independent correspondence.
The selecting and interpreting structure can comprise structure for determining the presence of a specific event, structure for accumulating a series of scores for a fundamental frequency associated with the specific event, and structure for analyzing the trend of the series of scores over time to determine precisely where in time the specific event begins. In addition, structure for further analyzing the series of scores of the specific event to determine if a separate potential event, having partials related to the partials of the specific event, exists within the time covered by the series of scores of the specific event, thereby revealing the possibility of an incorrect interpretation of the series of scores; structure for accumulating an amplitude for each partial of the specific event for each time location forward in time from the start of the specific event; structure for comparing an accumulated total of the amplitudes of the partials to an amplitude total for partials of the Fingerprint, and structure for calculating a relative amplitude for the specific event; structure for determining the duration of the specific event by examining the amplitude level of the specific event's partials over time to determine if and when the amplitude level falls below a minimum selected threshold; structure for determining the duration of the specific event by comparing the amplitude level of each partial of the specific event over time against an average of the amplitudes of the specific event taken at precedent time locations to determine whether a new event has interrupted the duration of the specific event; and structure for determining the duration of the specific event by testing for a drift of the amplitude peak over time to identify larger drift variations that can follow a predetermined time period of smaller drift variations, thereby indicating that a new event has interrupted the partial of the specific event, can also be provided.
Another embodiment of the invention provides for synchronizing to an external time-base the start-times associated with a plurality of specific events.
The apparatus comprises structure for generating and storing a table of elapsed sample count for each of a plurality of event timing requests; structure for calculating an event timing request count for the start-time of the specific event and for indexing the elapsed sample count table with the event timing request count to retrieve an elapsed sample count; structure for calculating a time offset between the start-time of the specific event and an event timing request time calculated from the event timing request count; structure for using the time offset to further calculate a relative time offset from the elapsed sample count; structure for calculating a synchronized time by adding the elapsed sample count to the relative time offset; and structure for assigning the calculated synchronized time to the specific event.
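The table lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the Source Code Listings: it assumes event timing requests arrive at a fixed interval on the external time-base, that the start-time is expressed in seconds on that time-base, and that the synchronized time is expressed in samples. The names `build_sync_table`, `synchronized_sample`, `request_interval`, and `sample_rate` are hypothetical.

```python
def build_sync_table(elapsed_counts):
    """Keep one elapsed sample count per event timing request.

    The list index plays the role of the event timing request count.
    """
    return list(elapsed_counts)


def synchronized_sample(start_time, request_interval, sample_rate, sync_table):
    """Map an event start-time (seconds, external time-base) to a sample position.

    Assumes requests arrive every request_interval seconds (hypothetical).
    """
    # Event timing request count: the last request at or before the start-time.
    request_count = int(start_time // request_interval)
    request_count = min(request_count, len(sync_table) - 1)
    # Elapsed sample count stored for that request.
    elapsed_samples = sync_table[request_count]
    # Time offset between the start-time and the request time.
    time_offset = start_time - request_count * request_interval
    # Relative offset, converted from seconds to samples.
    relative_offset = time_offset * sample_rate
    # Synchronized position: elapsed sample count plus relative offset.
    return elapsed_samples + relative_offset
```

Because the elapsed sample count is read back from the table rather than computed from the nominal sample rate, any drift between the sampling device and the external clock is corrected at every request boundary.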
The objects, advantages, novel features, and further scope of applicability of embodiments of the present invention are set forth in part in the detailed description to follow, taken in conjunction with the accompanying drawings, and in part will become apparent to those skilled in the art upon examination of the following, or may be learned by practice of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Examples of embodiments of the present invention w
Figs. 3 through 13c are block diagrams of program structure for the primary procedures. The core functions from each of the primary procedures are shown; in cases where core functions are substantially similar, one is illustrated as representative of the others; each block diagram is titled with the label of its parent procedure and includes a general functional description;
Fig. 3 CONVERSION (Generate one time strip of a Spectral File);
Fig. 4 COMPARISON (Initial database generation);
Fig. 5 COMPARISON (Time slope compare);
Fig. 6 COMPARISON (Frequency distribution width compare);
Fig. 7 COMPARISON (Find peak);
Fig. 8 COMPARISON (Find frequency distribution width);
Fig. 9 INTERPRETATION (Control the interpretation of scores);
Fig. 10 INTERPRETATION (Evaluate a series of scores);
Fig. 11 INTERPRETATION (Find the duration of an event);
Fig. 12 FINGERPRINT (Make Fingerprint);
Fig. 13A SYNCHRONIZATION (Record file, generate sync table);
Fig. 13B SYNCHRONIZATION (Event time request processing);
Fig. 13C SYNCHRONIZATION (Calc sync time from sync table); and
Fig. 14 APPARATUS OVERVIEW is a block diagram showing the flow of signals and data among individual apparatus components, some of which are optional, in accordance with a particular application.
Relevant program source code with labeled titles and general functional descriptions directly corresponding to the titles and functional descriptions of the block diagrams is provided in the microfiche appendix. Steps represented in the block diagrams are labeled with a number corresponding to a Source Code Listings number.
The Source Code Listings are consistent with the standard ANSI C language. The block diagrams provide overviews of process flow while the following discussion and Source Code Listings disclose in detail processes involved in the various embodiments.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
OF THE INVENTION (BEST MODE FOR CARRYING OUT THE INVENTION)
To understand the operation of an embodiment of the invention on complex waves, the nature of the vibrating source responsible for generating the wave, as well as the receiving and filtering processes, is examined.
As a preliminary step, the signal containing the wave data is digitally sampled at a rate allowing the highest desired frequency to be represented by at least two samples. The samples are processed in small groups of contiguous samples called "time strips". Each time strip corresponds to a specific duration of time.
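The sampling constraint and the grouping into time strips can be illustrated with a short sketch. It assumes non-overlapping strips of a fixed length, which the text does not specify; the function names are hypothetical.

```python
def min_sample_rate(highest_frequency_hz):
    """Lowest rate giving two samples per cycle of the highest desired frequency."""
    return 2.0 * highest_frequency_hz


def split_into_time_strips(samples, strip_length):
    """Group contiguous samples into non-overlapping strips (an assumption).

    A trailing group shorter than strip_length is dropped.
    """
    full = len(samples) - len(samples) % strip_length
    return [samples[i:i + strip_length] for i in range(0, full, strip_length)]


def strip_duration(strip_length, sample_rate_hz):
    """Duration in seconds covered by one time strip."""
    return strip_length / sample_rate_hz
```

For example, a highest desired frequency of 20 kHz calls for a rate of at least 40 kHz, and a strip of 1024 samples at 32 kHz covers 32 ms.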
Before a vibrating source begins emitting waves it is at rest and must be excited by a force. A vibrating source responds to an exciting force with a unique pattern of chaotic frequencies settling toward a resonant frequency. At the start of this excitation the vibrating source is unstable and generates many frequencies for a brief time. As the excitation continues, the chaotic frequencies begin to group and emphasize a "peak" frequency. At first the peaks tend to "drift" in frequency, but they eventually settle on a resonant frequency. The duration from excitation to settle point is called the "settle time." Normally the settle time extends across several time strips. When a time strip contains a wave which is settling, many frequencies that are distributed between frequency limits occur (these limits being defined by a low and a high frequency point). The width of this frequency distribution is referred to as the "frequency distribution width." The shape of the frequency distribution width can be analyzed to produce a frequency "width form". A pattern emerges as the frequency distribution width, width form, and its associated peaks progress through multiple time strips toward a settle point.
Frequently, complex waves are composed of mathematically related sine waves, which are referred to as "partials." Partials of different frequencies share a common "fundamental" frequency (the fundamental is also considered one of the partials). Adjacent partials sharing the same fundamental are referred to as "coterminous partials". Harmonics are one example of partials. Each partial has its own settle time, frequency distribution width, width form, and peak drift. The individual partial's amplitude curve is defined as the "time slope".
The amplitude relationship between coterminous partials is defined as the "frequency slope." Time slope and frequency slope provide identifying information when extracted and used in practicing the preferred embodiment of the invention.
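As an illustration of these definitions, the sketch below generates harmonic partials of a fundamental and expresses the frequency slope as the amplitude ratio of each partial to its lower coterminous neighbor. Representing the slope as a ratio is an assumption made here because a ratio is unchanged when the whole event is louder or softer; the function names are hypothetical.

```python
def harmonic_partials(fundamental_hz, count):
    """Frequencies of the first `count` harmonics (the fundamental is partial 1)."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def frequency_slope(partial_amplitudes):
    """Amplitude ratio between each pair of coterminous (adjacent) partials.

    A zero-amplitude lower partial yields None for that pair.
    """
    slopes = []
    for lower, upper in zip(partial_amplitudes, partial_amplitudes[1:]):
        slopes.append(upper / lower if lower != 0 else None)
    return slopes
```

Scaling every partial by the same factor leaves the result unchanged, which is the property an amplitude-independent comparison relies on.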
In order to analyze the digitally sampled wave the user configures an embodiment of the invention with a lower and upper frequency limit suitable to the wave data, and then selects appropriate filtering methods in order to divide the specified frequency range into discrete frequency steps.
Each filter is centered on a specific frequency. A filter's purpose is to provide the amplitude of the filter's center frequency for a specific time strip. One or more cycles of a wave must elapse before a filter can state the amplitude of its center frequency with any certainty.
From the point of time a wave enters a filter up to the point of time one cycle has elapsed, the filter output is, of course, in error and results in increased filter "uncertainty width" and incorrect amplitude values.
Filters suffer from an additional phenomenon known as "leakage." Since no filter is sufficiently narrow to separate out a single frequency from a complex wave, all filters allow some portion of the neighboring frequencies to leak through. This leakage can lead to false interpretations of a filter's output if the leakage is not calculated and the resulting error factor recognized.
In practicing the preferred embodiment of the invention, one takes into account the nature of vibrating sources and their resulting waves as well as errors introduced by the receiving and filtering steps. The waves' partial ratios and amplitude slopes provide a tremendous amount of analysis data, but not enough detail to draw unequivocal conclusions about the characteristics of the vibrating source. The frequency distribution widths, width forms, peaks, settle times, and corresponding progressions across time provide the additional data necessary for this embodiment of the invention to achieve its extraordinary accuracy. Since these characteristics are applicable for any frequency, this embodiment can be practiced utilizing events comprising virtually any frequencies.
This embodiment of the invention extracts the location in time that an event occurs and identifies the signal source, its fundamental frequency, amplitude and duration. In order to extract this information a wave whose attributes are known is first analyzed and stored along with its known attributes as a "Fingerprint File." The embodiment in turn compares a Fingerprint of a known fundamental frequency to a source wave by transposing the Fingerprint's frequency components to match the fundamental frequency of the source wave. Frequency ratios, amplitude ratios, frequency distribution widths, width forms, peaks, and settle times (i.e., "Fingerprint attributes") are taken from the Fingerprint and compared to the same attributes of the source wave. Each comparison generates a "set of scores" which is stored in a "Score File" for subsequent interpretation.
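A minimal sketch of the transposition and of one amplitude-independent comparison follows. The scoring formula (one minus the mean absolute difference of peak-normalized amplitudes) is a hypothetical stand-in chosen for illustration and is not taken from the Source Code Listings; the function names are likewise hypothetical.

```python
def transpose_fingerprint(partial_freqs, fingerprint_fundamental, target_fundamental):
    """Scale the Fingerprint's partial frequencies to a new fundamental."""
    ratio = target_fundamental / fingerprint_fundamental
    return [f * ratio for f in partial_freqs]


def amplitude_ratio_score(fingerprint_amps, source_amps):
    """Score in [0, 1] comparing amplitude ratios, independent of overall level.

    Both lists are normalized to their own peak before comparison
    (hypothetical scoring rule; amplitudes are assumed non-negative).
    """
    fp_peak = max(fingerprint_amps)
    src_peak = max(source_amps)
    if fp_peak == 0 or src_peak == 0:
        return 0.0
    diffs = [abs(f / fp_peak - s / src_peak)
             for f, s in zip(fingerprint_amps, source_amps)]
    return 1.0 - sum(diffs) / len(diffs)
```

A source wave whose partials have the same proportions as the Fingerprint scores 1.0 regardless of how loud it is.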
The Score File is processed through a statistical collection procedure which calculates the distribution of score density patterns.
From these score patterns a user selects a range of scores to be examined based on the density patterns observed.
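The statistical collection step can be pictured as a simple histogram of scores, from which a range is then chosen. The sketch assumes integer scores and a fixed bucket width; both the representation and the names are hypothetical.

```python
def score_density(scores, bucket_width):
    """Count how many integer scores fall into each bucket of the given width."""
    counts = {}
    for score in scores:
        bucket = score // bucket_width
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts


def select_score_range(scores, low, high):
    """Keep only the scores inside the user-selected range (inclusive)."""
    return [s for s in scores if low <= s <= high]
```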
The results of previous interpretations establish tractable rules whereby the embodiment "learns" how to interpret better a specific progression and relationship of scores. If the embodiment's interpretation rules determine that an event is questionable, then it initiates a series of feedback steps so that the user may assist in the embodiment's learning. Once an event is identified, its duration and evolution are tracked by the embodiment until the event is over.
At any point during score generation or score scanning a pre-defined message may be sent to the user. Messages are sent based on critical threshold conditions which are selected and established by a user during Setup. These messages serve as warning flags or indicate a need for user intervention.
Embodiments of the invention are also directed to products produced by the various processes. Such products include but are not limited to databases, representations, and derivatives thereof, audio recordings, video recordings, and the like.
Although the preferred embodiment is generally addressed to acoustic wave processing by way of example, those skilled in the art will appreciate that other embodiments of the invention will be useful for processing waves of virtually any frequency contemplated for analysis. The particular designs and modifications for embodiments suitable for particular frequency ranges will be apparent to those skilled in the art from the embodiments herein described. Many of the various designs used by practitioners to analyze different frequency ranges will be a matter of choice.
Fig. 1 (PROCEDURE DATA FLOW) and Fig. 2 (PROCEDURE CONTROL FLOW) are block diagrams of the procedures, data files, and devices and of the subsequent flow of control information and pertinent data files. One first defines the input and output devices and configures the procedures and file structures to the appropriate values.
Fig. 1. and Fig. 2 INPUT DEVICES (10.0): "INPUT DEVICES" refers to (a) the devices to digitize the source wave and to time-correlate an external high resolution clock to the sampling process and (b) the device controls used by the embodiment to control the sampling process. In many applications an analog-to-digital converter discretely samples an incoming source wave at a specific rate. INPUT DEVICES are also used to time-correlate the high-resolution clock to the sampling process. Typically this device counts or reads elapsed time and generates an "event timing request" used by the SYNCHRONIZE (3.0) procedures. The INPUT DEVICES are controlled by the embodiment's "start sampling", "stop sampling", and "inquire elapsed samples" commands.
Fig. 1. and Fig. 2 INPUT (1.0): "INPUT" refers to the procedures which control the recording of samples from the INPUT DEVICES. A Sample File is a continuous digital recording of the source wave. Recording of the source wave is not always necessary since an apparatus in accordance with embodiments of the invention can be configured to process previously recorded continuous wave source data.
Fig. 1. and Fig. 2 SETUP (2.0): The user selects the configuration parameters for SETUP in order to specify how the preferred embodiment displays, processes, compares, stores, verifies, and interprets the source wave and resulting files. The capabilities of the INPUT DEVICES determine the lowest and highest frequencies and amplitudes to be processed. These frequency and amplitude ranges are divided into smaller frequency and amplitude divisions by overlaying a user-defined frequency scale. The sample rate of the selected input device determines the smallest division of time that can be processed. The user can select for display or for processing any division of time as small as the duration of one sample.
The embodiment's files can be visually displayed on the DISPLAY (9.0) VDU terminal. The user controls which portions of the frequency range and time divisions are displayed. All step and control parameters for all procedures are defined in SETUP (2.0) and stored in the Setup File (2.1).
Each procedure receives input or control information from a wave source, a user, or a file produced by a previous procedure. Each procedure produces a file or controls the output of information from the apparatus.
Fig. 1 and Fig. 2 CONVERSION (4.0):
"CONVERSION" refers to the production of a Spectral File (4.1) from a Sample File (1.1) input. The apparatus can accept file input which is already in spectral form, or it can generate Spectral Files (4.1) through the use of external band-pass filters, or it can use a novel adaptation and interpretation of the Fast Fourier Transform (FFT), the "Interpolated FFT Filter Bank." The goal of filtering is to produce a Spectral File (4.1) with accurate frequency and time resolution for further processing. SETUP (2.0) defines the filtering process to be used. In most cases the user selects the apparatus' Interpolated FFT Filter Bank since it is significantly faster and more accurate than other filtering methods.
This embodiment of the invention uses a relatively small number of sample points for the FFT if the frequency is high and a relatively large number of sample points for the FFT if the frequency is low. The results from these multiple FFTs are combined by a series of frequency-limiting and amplitude averaging and clipping procedures, called the "Multi-Pass Procedure." Frequency-limiting defines the optimal frequency range for the different length FFTs ("length" being a measure of the number of sample points). The amplitude averaging and clipping procedures process the shorter length FFTs by averaging individual amplitudes with corresponding amplitudes from the longer length FFTs and clipping the resulting averaged amplitude so it doesn't exceed the calculated amplitude levels taken from the shorter length FFTs.
An FFT produces a graph of amplitude values linearly spaced across a frequency range. Upon this linear scale the user can define and then overlay virtually any conceivable frequency scale. The individual frequencies contained in the user-defined frequency scale and the individual frequencies the FFT produces are referred to as "points".
The embodiment finds a series of FFT points which surround the desired frequency scale point and performs a curve-fit, using polynomial interpolation on these FFT points. The desired frequency point is then interpolated from the curve-fitted FFT points.
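The idea of reading an amplitude at a fractional FFT position can be sketched as below. Note the substitution: the procedure description names a chi-squared orthogonal polynomial curve fit, whereas this sketch uses plain Lagrange interpolation through the surrounding points, which is simpler to show. The function name and the `width` parameter are hypothetical.

```python
def interpolate_fft_point(magnitudes, position, width=4):
    """Estimate the amplitude at a fractional FFT bin position.

    Fits the unique polynomial through `width` surrounding bins
    (Lagrange form) and evaluates it at `position`.
    """
    n = len(magnitudes)
    if n < width:
        raise ValueError("need at least `width` FFT points")
    # Choose `width` contiguous bins roughly centred on the position,
    # clamped so the window stays inside the array.
    start = int(position) - (width // 2 - 1)
    start = max(0, min(start, n - width))
    bins = range(start, start + width)
    total = 0.0
    for i in bins:
        term = magnitudes[i]
        for j in bins:
            if j != i:
                term *= (position - j) / (i - j)
        total += term
    return total
```

At an integer position the function returns the stored FFT amplitude itself; between bins it returns the value of the fitted curve.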
In the higher frequency ranges an FFT produces multiple points for each requested point in the user-defined frequency scale. The embodiment analyzes the number of multiple points per requested point, calculates the "bin range", a search range above and below the requested point, and scans this bin range to find a maximum amplitude peak, which is referred to as the "peak-in-bin".
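A peak-in-bin search of this kind can be sketched in a few lines. The sketch assumes the bin range is given as a whole number of FFT points on either side of the requested point; the function name is hypothetical.

```python
def peak_in_bin(magnitudes, center, bin_range):
    """Return (index, amplitude) of the largest value within
    `bin_range` FFT points above and below `center`."""
    low = max(0, center - bin_range)
    high = min(len(magnitudes) - 1, center + bin_range)
    best = low
    for k in range(low, high + 1):
        if magnitudes[k] > magnitudes[best]:
            best = k
    return best, magnitudes[best]
```

Returning the frequency location together with the amplitude mirrors the description, which retains both.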
The final Spectral File (4.1) is passed to both the COMPARISON (6.0) and FINGERPRINT (5.0) procedures for further processing (see below).
Fig. 1. and Fig. 2 SYNCHRONIZE (3.0): "SYNCHRONIZE" refers to the accurate time alignment of timing data when the embodiment collects and examines multiple sources of wave data. The embodiment is often required to synchronize to a common "external time-base" the wave data that are acquired at different times and perhaps from different locations. The external time-base is generated from a high resolution clock generator/reader capable of precise timing divisions. This clock generator or reader should be capable of generating an event timing request to which the SYNCHRONIZE (3.0) procedures can respond. The input sampling device should be capable of responding to a "start" command, a "stop" command, and an "elapsed sample inquiry" request.
The novel synchronization procedures begin simultaneously with the "start" command. Upon receiving an event timing request from the INPUT device the synchronization procedures obtain the elapsed number of samples, the "elapsed sample count", from the INPUT device and accumulate a count of the event timing requests and associated elapsed samples in the Time File (3.1). This file is passed to the INTERPRETATION (7.0) procedures which index, read, and interpret timing information from the Time File in order to calculate the correct timing information for the Destination File (7.2).
Fig. 1. and Fig. 2 FINGERPRINT (5.0): "FINGERPRINT" refers to the generation of a known set of facts about an event extracted from a Spectral File (4.1). The FINGERPRINT procedures take a Spectral File of a source wave from a known vibrational source and calculate, extract, and store (in the Fingerprint File) the Spectral File's frequency and amplitude slope ratios, peak locations, settle times, frequency distribution widths, and width forms.
The FINGERPRINT procedures calculate a Fingerprint amplitude level by summing the individual associated partial amplitudes and then comparing this amplitude total to the amplitude total of a previously stored Fingerprint, the "amplitude reference Fingerprint", which has been generated from a known event with a known amplitude. The Fingerprint amplitude level and the information identifying the vibration source and fundamental frequency are then stored in the Fingerprint File.
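The amplitude level calculation can be sketched as a ratio of two sums. Treating the comparison as a linear ratio scaled by the reference's known amplitude is an assumption made for illustration; the function and parameter names are hypothetical.

```python
def fingerprint_amplitude_level(partial_amps, reference_partial_amps,
                                reference_level=1.0):
    """Amplitude level of a Fingerprint relative to the amplitude
    reference Fingerprint, whose known level is `reference_level`.

    Assumes a linear relation between summed partial amplitude and level.
    """
    total = sum(partial_amps)
    reference_total = sum(reference_partial_amps)
    if reference_total == 0:
        raise ValueError("reference Fingerprint has no amplitude")
    return reference_level * total / reference_total
```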
Once an initial Fingerprint File (5.1) is produced by the FINGERPRINT procedures the FEEDBACK (8.0) procedures automatically modify and evolve new Fingerprints based on the parameters of the initial Fingerprint File. The Fingerprint File is then passed to the COMPARISON (6.0) and INTERPRETATION (7.0) procedures for further processing (see below).
Fig. 1. and Fig. 2 COMPARISON (6.0): "COMPARISON" refers to the unique evaluation of differences and similarities between the attributes of the Fingerprint File (5.1) and the Spectral File (4.1). The COMPARISON procedures yield a unique score for each attribute of the Fingerprint File.
The Spectral File (4.1) is stored as a series of time strips with each strip containing the amplitudes of the defined frequency scale. COMPARISON procedures step through each of the Spectral File's frequency points and compare the spectral information to the comparable attributes stored in the Fingerprint File, and then proceed to the next time strip, repeating the process until the entire file is processed.
The Fingerprint File is indexed by the specific frequency point (in the Spectral File) that is under examination, and a single Fingerprint is extracted and transposed to meet the requirements of the specific Spectral File frequency point. The Spectral File is scanned starting from the point under examination, and the collected frequency and amplitude values are converted to a format matching the format of the Fingerprint File. Then the converted Spectral File information is compared against the Fingerprint's corresponding attributes, and a set of scores is generated from the results and stored in the Score File.
Fig. 1. and Fig. 2 INTERPRETATION (7.0): "INTERPRETATION" refers to the novel process of evaluating the individual scores from the Score File (6.1) and reaching conclusions about the attributes and behavior of each event recognized in the source wave. INTERPRETATION procedures search for scores which indicate that a potential event is starting and then gather a "series of scores" related to the potential event's fundamental frequency from time strips both before and after the time strip associated with the potential event.
INTERPRETATION of the series of scores is controlled by tractable rules which determine the meaning assigned to specific sequences and relationships of scores. Once the INTERPRETATION procedures determine an event has started and when, i.e., its "start-time", they then calculate amplitude and track the event's duration.
The INTERPRETATION procedures contain utilities to scan the Score File (6.1) and gather statistics about the distribution of scores. The user can display these distribution graphs and select a range of scores to examine. Through a set of interactive procedures the user can dynamically modify and add rules and conditions to the Rule File (7.1) pertinent to the data being analyzed.
In turn, INTERPRETATION's "feedback" procedures inform the user of questionable conditions arising from application of these rules to the data. Thus, the evolving "knowledge base" contained in the Rule File, in conjunction with the score ranges, comprises a set of rules that govern the INTERPRETATION, FINGERPRINT, and COMPARISON procedures.
These rules are stored and available to the user for future use.
INTERPRETATION's "duration tracking" procedures lock onto the event's partials and track the event until new events completely interrupt all partials or a minimum amplitude threshold is crossed.
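The two stopping conditions can be sketched as follows. The sketch makes several assumptions that are not in the text: the amplitude level is taken as the sum of the partial amplitudes in a time strip, and a partial counts as interrupted when its amplitude exceeds `jump_factor` times the average of its preceding amplitudes. The function name and `jump_factor` are hypothetical.

```python
def event_duration(partial_tracks, min_amplitude, jump_factor=2.0):
    """Number of time strips an event lasts, counted from its start.

    partial_tracks holds one amplitude list per partial, one value per
    time strip. Tracking stops when the summed level drops below
    min_amplitude or when every partial is interrupted by a jump above
    jump_factor times its running average (hypothetical rule).
    """
    strips = len(partial_tracks[0])
    for t in range(strips):
        level = sum(track[t] for track in partial_tracks)
        if level < min_amplitude:
            return t
        if t > 0:
            interrupted = 0
            for track in partial_tracks:
                average = sum(track[:t]) / t
                if track[t] > jump_factor * average:
                    interrupted += 1
            if interrupted == len(partial_tracks):
                return t
    return strips
```

Requiring every partial to be interrupted before ending the event reflects the wording "completely interrupt all partials"; a new event that overlaps only some partials does not end the track.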
INTERPRETATION's "amplitude extraction" procedures also lock onto the partials and compare the amplitude levels of the Spectral File's partials to the amplitude levels of the Fingerprint's partials. When the current event is finished, the collected amplitude data are compared to the Fingerprint's amplitude setting, and an event's "relative amplitude" is determined.
INTERPRETATION creates a Destination File (7.2) which is a database of conclusions made by its procedures. This file is then accessed by FEEDBACK (8.0) which sequences the results for the user to verify.
Furthermore, the flow of output from the INTERPRETATION to the FEEDBACK (8.0) procedures is controlled by SETUP (2.0) or the user so that new Fingerprints can be generated if comparison averages fall below an established threshold.
Fig. 1. and Fig. 2 FEEDBACK (8.0): "FEEDBACK" refers to the procedures which present results from previous procedures to the user for verification. In turn, the user specifies the conditions under which FEEDBACK is activated. FEEDBACK reads the appropriate portions of the Destination File (7.2) and converts them to the format appropriate for DISPLAY (9.0) and OUTPUT DEVICES (10.1). FEEDBACK also controls the distribution of the Device Files (8.1) and Message Files (8.2) to the appropriate procedures.
FEEDBACK procedures receive responses from the user and, in turn, control FINGERPRINT (5.0), CONVERSION (4.0), COMPARISON (6.0), and INTERPRETATION (7.0) for the purpose of regenerating more accurate files. FEEDBACK also controls the generation and evolution of Rule Files (7.1).
Fig. 1. and Fig. 2 OUTPUT DEVICES (10.1): "OUTPUT DEVICES" refers to the selected devices used by the FEEDBACK (8.0) procedures to verify results. Different applications use different devices since the frequency requirements of these applications vary and the means of presenting results necessarily differ. In principle the OUTPUT DEVICES should be capable of receiving a file to be played or displayed and should include the controls to position, start, and stop the output of the Device Files (8.1).
Figs. 3-13.
(PROCEDURES)
The DESCRIPTION OF PROCEDURES section (below) further describes the essential processes involved in the key procedures: namely, CONVERSION 4.0, FINGERPRINT 5.0, COMPARISON 6.0, INTERPRETATION 7.0, and SYNCHRONIZATION 3.0. Since all other procedures cover ancillary functions for controlling particular device configurations, data flow, and information display, these other procedures are not described in detail.
Fig. 14 APPARATUS OVERVIEW
The individual components of an exemplary apparatus in accordance with the invention generally comprise a device for converting, such as A/D Converter 3, continuous wave data from a wave source 1 into discrete digital samples which are processed by a computer 6 and associated coprocessors 7, if any. The results are sent from the computer 6 to the D/A Converter 3, a graphics card and monitor 9, and into storage 10.
Alternatively, a direct digital input 5 can enter wave data directly into the computer 6 without utilizing an A/D converter. A clock 2 can generate and read high speed timing information and signal the computer to respond to timing information by controlling a timing request generator 4, which in turn controls the computer.
For acoustical wave analysis, the wave source 1 may be, for example, a multi-track tape recorder capable of processing tape having several tracks of audio signal, as well as an SMPTE time track. The audio signal is fed from the tape recorder into the analog-to-digital converter 3, which may be, for example, an ARIEL DSP-16, which generates digital samples to be read by the computer 6, which may be, for example, an IBM or compatible PC 33MHz 80386 computer system, functioning under an operating system, such as MS-DOS 3.3. The computer system stores the digital samples on a storage device 10, which may be, for example, a hard disk drive. As stated previously, the A/D converter may be bypassed by using a direct digital input device 5, such as an AES/EBU compatible input port. The processing power of the computer may be supplemented by coprocessors 7, such as an Eighteen Eight Laboratories PL1250 floating-point array coprocessor, and by a multi-tasking operating system, such as MICROSOFT Windows 3.0. The computer system generates signals to the graphics card and monitor 9, such as a VIDEO SEVEN 1024 x 768 high-resolution VGA video adapter and an NEC multisync high-resolution color monitor. The computer system sends digital samples to the D/A converter 3, such as the ARIEL DSP-16. The high speed clock generator/reader 2, which may be, for example, a Roland SBX-80 SMPTE synchronizer, reads the SMPTE time code from the tape recorder and controls the timing request generator 4, which may be, for example, a Roland MPU-401 MIDI interface, directly interfaced to the computer. The user controls the apparatus through commands entered through the computer keyboard 8.
DESCRIPTION OF PROCEDURES
CONVERSION (Generate one time strip of a Spectral File)
NAME: Multi_PassSpectral() FIG. 3
This procedure novelly generates multiple Spectral time strips of varying time and frequency qualities and then uniquely combines them into a single spectral time strip. To describe the process the program source code illustrates spectral generation for a single time strip.
Typically, a Spectral File is composed of many time strips.
1. Reads the Setup File in order to extract the control
parameters, frequency scales, and file names required by the
Conversion procedures.
2. Sorts the file names in order of ascending FFT size to place
the smaller FFT-sized Spectral time strips at the beginning
of the subsequent spectral-combine procedures.
3. Begins the loop for controlling the generation of multiple
Spectral time strips of different FFT sizes.
4. Accesses the Setup File to retrieve variables specific to
the FFT size used in the generation of the current Spectral
time strip.
5. Overlays the frequency scale requested by the user upon the FFT's frequency points and calculates floating point
locations to determine the positions in the FFT at which the
requested frequency scale points are to be extracted by
subsequent procedures. The results of an FFT may be viewed
as a series of amplitudes at specific frequency points.
However, the application of FFTs inherently produces too few
frequency points in the lower frequencies for adequate
interpretation and so many frequency points in the higher
frequenci
6. Loops through all frequencies in the requested frequency
scale.
7. Calculates and stores the bin-width and bin-width start
locations for each requested frequency scale point.
8-9. Tests whether the bin-width is greater than a minimum
bin-width requirement, and, if so, retains the frequency
location at which to start peak-in-bin searching.
10. Loads a group of contiguous samples of a specific length
derived from the Setup procedures into an array for FFT
processing.
11. Applies an apodizing function to the sample array to reduce
FFT analysis errors that occur at the edges of sample
windows (a Hanning window is shown for illustration).
12. Performs an FFT of the requested size on the sample array.
13. Calculates the complex magnitude for each pair of real and imaginary parts contained in the FFT.
14-18. Performs a polynomial interpolation on the current time
strip. Polynomial interpolation uses the InterpArray[] calculated in block 5 (above) to determine the exact
position in the FFT from which to derive the individual amplitudes. The
resulting amplitudes are placed in the TimeStrip array.
14. Loops through all frequency points in the requested
frequency scale.
15-16. Loops through the surrounding FFT points at a width equal to
the requirements of the curve-fitting procedure and collects
the amplitudes.
17. Performs a chi-squared orthogonal polynomial curve fit on
the collected amplitudes and extracts the amplitude at the
requested frequency offset position.
18. Saves the resulting amplitude value in the TimeStrip array.
19-22. Performs the peak-in-bin function on the current time strip.
19. Loops through the peak-in-bin frequency range one frequency
location at a time.
20. Indexes the peak-in-bin start location, sets the width of
the search, and loops through the search range.
21-22. Tests for the highest amplitude level in the search range
and retains it along with the frequency location.
23-37. Combines multiple Spectral File time strips in a successive
manner whereby each additional time strip further refines
the previous combinations. Spectral time strips generated
by an FFT of a shorter duration contribute more to the time
resolution of the final time strip, while time strips
generated by an FFT of a longer duration contribute more to
the frequency resolution of the final time strip.
23-24. Tests if the current time strip is the first time strip in
this combination, and, if so, copies the entire time strip
to the destination time strip.
25-26. Loads from the Setup File the variables which control the
subsequent combination and loads a corresponding time strip
from the previous combination.
27-28. Examines whether any portion of the current time strip
exists below the combine start location, and, if so, copies
this lower range to the destination time strip.
29-30. Examines whether any portion of the current time strip
exists above the combine end location, and, if so, copies
this upper range to the destination time strip.
31. Examines whether a combination of time strips is required,
and, if not, bypasses the combination process.
32. Loops through all of the frequencies in the overlap region.
33-34. Tests whether the previous amplitude or the current
amplitude is zero, and, if either is the case, sets both
amplitudes to zero.
35-36. Averages the two amplitudes together and clips the average
to the previous amplitude, thus protecting the timing
information from becoming distorted by the increasing time
inaccuracies of the larger-dimensioned FFT-generated
Spectral Files.
37. Saves the result in the current time strip.
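Steps 27 through 37 can be sketched as a single function that merges the current time strip with the previous combination. This is one reading of the steps, not the code from the Source Code Listings: it assumes both strips share the same frequency scale, that `combine_start` and `combine_end` are inclusive indices bounding the overlap region, and that the ranges outside the overlap are taken from the current strip. All names are hypothetical.

```python
def combine_time_strips(previous, current, combine_start, combine_end):
    """Merge the current time strip into the previous combination.

    Outside [combine_start, combine_end] the current strip is copied.
    Inside, a zero in either strip forces zero; otherwise the two
    amplitudes are averaged and clipped to the previous amplitude.
    """
    combined = list(current)
    for k in range(combine_start, combine_end + 1):
        if previous[k] == 0.0 or current[k] == 0.0:
            combined[k] = 0.0
        else:
            average = (previous[k] + current[k]) / 2.0
            # Clip so the longer FFT cannot raise the level above the
            # shorter FFT's value, which carries the better timing.
            combined[k] = min(average, previous[k])
    return combined
```

Since the strips are sorted by ascending FFT size, `previous` always comes from the shorter FFTs, so the clip keeps energy that the longer FFT smears across time from appearing earlier or later than the shorter FFT reports.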
COMPARISON (Initial database generation)
NAME: InitDatabaseGenerate() FIG. 4
COMPARISON derives a Score File from a Spectral File by comparing the Spectral information to stored Fingerprint information. This initial database generation step constructs the database which is subsequently processed by further COMPARISON procedures. Acting as a preprocessor, the Initial Database Generation step uniquely determines which areas of the Spectral File justify further comparison in order to reduce subsequent calculations.
1. Loops the variable "Fundamental" through a selected range of
fundamental frequencies.
2. Calculates the Fingerprint's "FingerHeader" and "FingerData" addresses for Fundamental, then calculates the transposition
between Fundamental and FingerHeader's fundamental
frequency.
3. Loops the variable "StripNumber" through a selected range of
time strips.
4. Loops the variable "CompletedPartials" through the partials
listed in FingerHeader.
5. Calculates the Fingerprint's "FingerPartial" offset address
for CompletedPartials.
6. Loops the variable "CompareStrip" through a selected number
of time strips offset from StripNumber.
7. Calculates the Spectral File's "SpectralCenter" address for
the frequency locations associated with CompareStrip.
8. Searches the higher and lower frequencies surrounding SpectralCenter to find a peak amplitude and corresponding
frequency location.
9. Tests if Fundamental is locked onto a neighboring frequency
and branches accordingly.
10-11. If a valid peak is found, compares the Fingerprint drift to
the Spectral drift and calculates a drift variation score.
12. Compares the Fingerprint form to the Spectral form and
derives a form variation score.
13. Compares the Fingerprint amplitude to the Spectral amplitude
and derives an amplitude variation score.
14. Adjusts each partial's drift variation score based on
frequency and time constraints.
15. Calculates the final scores for amplitude, drift, and form
variation.
16-17. Tests if the variation scores are high enough to justify
acceptance of the final scores, and, if so, saves the event
into the Score database.
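The nested loop structure of the preprocessor can be sketched in Python as follows. This is a heavily simplified illustration, not the patented routine: only an amplitude score is computed, the transposition is modelled as an additive bin shift, and the names `init_database`, `amplitude_score`, `search`, and `min_score` are all hypothetical.

```python
def amplitude_score(finger_amp, spectral_amp):
    """Return 1.0 for equal amplitudes, falling toward 0.0 as they diverge."""
    largest = max(finger_amp, spectral_amp)
    if largest == 0:
        return 0.0
    return min(finger_amp, spectral_amp) / largest


def init_database(spectral, finger_partials, fundamentals, search=0, min_score=0.8):
    """Scan a Spectral File and keep only the locations worth comparing further.

    spectral        -- spectral[strip][bin] amplitudes (assumed layout)
    finger_partials -- list of (bin_offset, amplitude) pairs (assumed layout)
    fundamentals    -- iterable of candidate fundamental bins
    """
    scores = []
    for fundamental in fundamentals:                    # step 1
        for strip_number, strip in enumerate(spectral):  # step 3
            total = 0.0
            for offset, finger_amp in finger_partials:   # step 4
                center = fundamental + offset            # step 7 (transposed position)
                lo = max(0, center - search)
                hi = max(lo, min(len(strip), center + search + 1))
                peak = max(strip[lo:hi], default=0)      # step 8 (peak near center)
                total += amplitude_score(finger_amp, peak)  # step 13
            score = total / len(finger_partials)         # step 15
            if score >= min_score:                       # steps 16-17
                scores.append((strip_number, fundamental, score))
    return scores


spectral = [[0, 0, 0, 0, 0, 0, 0, 0],
            [0, 0, 10, 0, 0, 5, 0, 0]]
print(init_database(spectral, [(0, 10), (3, 5)], range(5)))
```

Only locations whose score reaches `min_score` are stored, which mirrors the stated purpose of reducing subsequent calculations.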
COMPARISON (Time slope compare)
NAME: CompareTimeSlope() FIG. 5
Time Slope refers to the slope between two time strips at given frequency positions, as calculated from the amplitudes at those time strip and frequency locations. The Time Slope Comparison step uniquely determines a score which reflects how closely the time slopes for the
Fingerprint and Spectral File match.
1. Loops the variable "CompletedPartials" through a selected
range of frequency offsets.
2. Exits if the number of failed partials exceeds a certain
limit.
3. Calculates the variable "FundamentalOffset" based on the
current value of CompletedPartials by using the
FundOffsetTable.
4. Derives the search-width-table address and Fingerprint data
addresses.
5. Calculates the Spectral File's "SpectralCenter" address for
the frequency locations associated with the current value of
the variable "TimeStrip" and FundamentalOffset.
6. Searches higher and lower frequencies surrounding SpectralCenter to find a peak amplitude and corresponding
frequency location.
7. Loops the variable "CompareStrip" through a selected number
of time strips offset from TimeStrip.
8. Calculates the Fingerprint data offset address for the
current TimeStrip and CompletedPartials.
9. Finds the peak amplitude for the next time strip.
10-11. Calculates the time slope for two points in the Spectral File
and adjusts its variation percentage based on the given
slope's frequency, time strip, and amplitude.
12-13. Calculates the time slope for two points in the Fingerprint
File and adjusts its variation percentage based on the given
slope's frequency, time strip, and amplitude.
14-15. Derives a variation score by comparing the Spectral File's
variation to the Fingerprint File's variation and compares
that score to a minimum acceptable score.
16. Adds the time strip variation score to the partial variation
score total.
17. Swaps the amplitude values for the next time strip slope
calculation, thereby eliminating one amplitude extraction
step.
18. Adds the variation score for one partial to the total
variation score.
19. Increments the failed partial total if the number of failed
time strips exceeds an established limit.
20. Calculates and saves the final time slope variation for all
partials in the comparison.
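A time slope comparison for a single partial might be sketched in Python as below. The normalisation of the slope by the larger of the two amplitudes (to make it amplitude-independent) and the mapping of slope differences onto a 0 to 1 score are assumptions; the names `time_slope` and `time_slope_score` are hypothetical.

```python
def time_slope(amp_now, amp_next):
    """Relative amplitude change between two adjacent time strips, in [-1, 1]."""
    largest = max(amp_now, amp_next)
    if largest == 0:
        return 0.0
    return (amp_next - amp_now) / largest


def time_slope_score(spectral_amps, finger_amps):
    """Score how closely the time slopes of one partial match (1.0 = identical).

    Each argument lists the peak amplitude of the partial at successive
    time strips (assumed layout).
    """
    steps = min(len(spectral_amps), len(finger_amps)) - 1
    if steps < 1:
        return 0.0
    total = 0.0
    spec_now, fing_now = spectral_amps[0], finger_amps[0]
    for strip in range(1, steps + 1):
        spec_next, fing_next = spectral_amps[strip], finger_amps[strip]
        difference = abs(time_slope(spec_now, spec_next)
                         - time_slope(fing_now, fing_next))
        # Slopes lie in [-1, 1], so the difference lies in [0, 2].
        total += 1.0 - difference / 2.0
        # As in step 17: reuse the "next" amplitudes as the new "current"
        # ones, which saves one amplitude extraction per time strip.
        spec_now, fing_now = spec_next, fing_next
    return total / steps


print(time_slope_score([2, 4, 8], [1, 2, 4]))
print(time_slope_score([4, 2], [2, 4]))
```

The first call compares two envelopes that differ only in overall level and therefore match fully; the second compares a falling envelope with a rising one.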
COMPARISON (Frequency distribution width compare) NAME: CompareWidth()
FIG. 6
Frequency Distribution Width refers to a found peak's full width above and below the peak's center. The Frequency Distribution Width
Comparison step novelly determines a score that reflects how closely the
Frequency Distribution Widths for the Fingerprint File and Spectral File match each other at given time and frequency locations.
1. Loops the variable "CompletedPartials" through the partials
listed in FingerHeader.
2. Loops the variable "CompareStrip" through a selected range
of time strips offset from the variable "TimeStrip".
3. Calculates the Fingerprint data address offsets.
4-5. Searches higher and lower frequencies surrounding
SpectralCenter to find a peak amplitude and corresponding
frequency location and sets form variables to NO FORM if no
peak amplitude is found.
6. Abandons the width search if previous steps found a
"perfect" width.
7. Determines an accurate width for the peak, a description of
the form of that peak, and a description of the form of the
above and below segments of that peak.
8. Compares the width form of the Spectral File to the width
form of the Fingerprint File and derives a width form
variation score.
9. Calculates the width variation score for the current time
strip.
10. Compares variation scores to minimum acceptance scores.
11. Adds the variation for one partial to the partial variation
total.
12. Increments the failed partial total if the number of failed
time strips exceeds an established limit.
13. Calculates and stores the width variation score for all
partials in the comparison.
COMPARISON (Find peak) NAME: FindPeak()
FIG. 7
FindPeak novelly locates the most clearly defined amplitude peak within a set frequency range around a center position. The search range is uniquely dynamically controlled based on the current time strip and frequency position.
1. Calculates the frequency search range for this peak search
and sets the variables "UpSearchRange" and "DnSearchRange" to the calculated search range.
2. Calculates the variable "TotalSearchRange" for further
Comparison calculations.
3. Searches for a peak above and a peak below the center
frequency within the frequency search range.
4. Searches above the center position for the nearest peak
using slope direction flags and threshold flags.
5. Determines the amplitude of the current frequency position
in the Spectral File.
6. Calculates the direction of the amplitude slope for
neighboring frequency positions as being either UP, DOWN, or
PLATEAU. PLATEAU refers to areas of relatively level
amplitude and slowly changing slope.
7. Compares the Spectral File amplitude to a minimum threshold
for the current frequency position, and stops searching if
the amplitude level falls below the minimum threshold.
8. Searches below the center position for the nearest peak
using slope direction flags and threshold flags.
9. Disables the search if the number of consecutive down slopes
exceeds an established down slope tolerance.
10. Updates the peak amplitude and location if the most recently
found amplitude is larger than the stored peak amplitude.
11. Sets the variable "PerfectWidth" to TRUE if a perfect width
is found, thus implying that above and below the peak the
amplitudes slope down to a minimum threshold.
12. Derives the form of the found peak from the search data
under analysis.
13. Derives the form of the above and below slopes around the
peak position based on search variables and width data.
14. Saves all peak data.
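The peak search described above can be sketched in Python under simplifying assumptions: a fixed search range and threshold stand in for the dynamically controlled values, and the names `find_peak`, `search_side`, `plateau_tolerance`, and `down_tolerance` are hypothetical.

```python
UP, DOWN, PLATEAU = "UP", "DOWN", "PLATEAU"


def slope_direction(from_amp, to_amp, plateau_tolerance):
    """Classify the amplitude slope between neighbouring frequency positions."""
    if abs(to_amp - from_amp) <= plateau_tolerance:
        return PLATEAU
    return UP if to_amp > from_amp else DOWN


def search_side(strip, center, step, search_range, threshold,
                plateau_tolerance, down_tolerance):
    """Walk away from center (step = +1 above, -1 below) looking for a peak."""
    best_pos, best_amp = center, strip[center]
    consecutive_downs = 0
    pos = center
    for _ in range(search_range):
        nxt = pos + step
        # Stop at the array edge or when the amplitude drops below threshold.
        if nxt < 0 or nxt >= len(strip) or strip[nxt] < threshold:
            break
        if slope_direction(strip[pos], strip[nxt], plateau_tolerance) == DOWN:
            consecutive_downs += 1
            # Too many consecutive down slopes disables the search.
            if consecutive_downs > down_tolerance:
                break
        else:
            consecutive_downs = 0
        # Keep the largest amplitude seen so far.
        if strip[nxt] > best_amp:
            best_pos, best_amp = nxt, strip[nxt]
        pos = nxt
    return best_pos, best_amp


def find_peak(strip, center, search_range, threshold=1,
              plateau_tolerance=0, down_tolerance=1):
    """Return (position, amplitude) of the best peak near center, or None."""
    above = search_side(strip, center, +1, search_range, threshold,
                        plateau_tolerance, down_tolerance)
    below = search_side(strip, center, -1, search_range, threshold,
                        plateau_tolerance, down_tolerance)
    best = above if above[1] >= below[1] else below
    return best if best[1] >= threshold else None


print(find_peak([0, 0, 3, 9, 4, 0, 0], center=2, search_range=3))
```

Returning `None` when nothing exceeds the threshold corresponds loosely to the "no peak found" case that later steps record as having no form.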
COMPARISON (Find frequency distribution width)
NAME: FindFrequencyWidth() FIG. 8
Find Frequency Distribution Width uniquely determines the extent and shape of the peak in frequency space once the center position has been determined. Information is produced concerning the above peak and below peak width and form. If a better peak is found within the new search range, then the peak position and amplitude are updated.
1. Calculates the extent of the search in both the above and
the below directions.
2. Loops the variable "Both" through the two possible search
directions, ABOVE and BELOW.
3. Stops the search if the original peak search found a clearly
defined width.
4. Loops the variable "Search" through the maximum search range
allowed for either direction.
5. Abandons the search if the current value of Search exceeds
the search limit for this direction.
6. Determines the amplitude at the current peak position.
7. Calculates the direction of the frequency slope and places
this value in the variable "DirectionCase".
8. Branches depending on the calculated slope direction found
in DirectionCase.
9. If the slope trend is downward, tests the peak amplitude to
determine if it falls below a dynamically established
threshold.
10. If the slope trend is upward, tests the slope rate of change
for an interrupt condition. An interrupt condition implies
that a slope which once had a downward trend suddenly
changed direction at a rate exceeding a dynamically
established limit.
11-12. Calculates the fadeout position on completion of the search
or if the search has found an interrupt.
13. Updates the peak amplitude value and position if the search
has not completed and there has not been an interrupt.
14. Updates the peak amplitude value and position if the slope
direction is a PLATEAU.
15. Tests if the width search is finished and branches
accordingly.
16. Sets the final width form to an "up only" condition if the
slope within the search range is only an up slope to the
edge of the search range.
17. Sets the final width form to a "down to edge" condition if
the slope within the search range is only a down slope to
the edge of the search range.
18. Sets the final width form to a plateau condition, if the
nature of the slope within the search range is neither "up to edge" nor "down to edge".
19. Saves the current peak form in the final peak form
variables.
20. Retains the current width form in the final width form
variables.
21. Retains the final peak amplitude and position.
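A reduced Python sketch of the width search follows. It walks outward from an already located peak in both directions and stops on a fade-out (amplitude below a threshold derived from the peak) or an interrupt (a downward trend that suddenly turns upward). The thresholds `fade_fraction` and `interrupt_rise`, the result labels, and the function names are assumptions; peak updating and the final width-form classification of steps 13 to 20 are omitted.

```python
def side_width(strip, peak, step, max_search,
               fade_fraction=0.25, interrupt_rise=0.1):
    """Measure the width on one side of a peak (step = +1 above, -1 below).

    Returns (width_in_bins, reason) where reason is "FADEOUT",
    "INTERRUPT", or "EDGE" (search limit or array edge reached).
    """
    peak_amp = strip[peak]
    fade_level = peak_amp * fade_fraction   # threshold derived from the peak
    went_down = False
    pos = peak
    for width in range(1, max_search + 1):
        nxt = pos + step
        if nxt < 0 or nxt >= len(strip):
            return width - 1, "EDGE"
        change = strip[nxt] - strip[pos]
        if change < 0:
            went_down = True
            # Downward trend: test for fade-out below the threshold.
            if strip[nxt] < fade_level:
                return width, "FADEOUT"
        elif went_down and change > peak_amp * interrupt_rise:
            # Upward turn after a downward trend: treat as an interrupt.
            return width - 1, "INTERRUPT"
        pos = nxt
    return max_search, "EDGE"


def find_width(strip, peak, max_search):
    """Search both directions, BELOW and ABOVE, from the peak position."""
    return {"below": side_width(strip, peak, -1, max_search),
            "above": side_width(strip, peak, +1, max_search)}


print(find_width([1, 2, 6, 10, 6, 4, 9, 9], peak=3, max_search=4))
```

In the example the lower side fades out two bins below the peak, while the upper side is cut short two bins above it by a neighbouring rise.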
INTERPRETATION (Control the interpretation of scores)
NAME: InterpretScores() FIG. 9
Interpretation of scores comprises two categories. One involves scanning the Score File generated by the Comparison process to determine local maximums where the Fingerprint and Spectral Files match best. The other involves novelly tracing the individual frequency components of the matched source forward in time from the lock position until they either fade out or are interrupted, thus allowing accurate determination of the duration of the tracked event. The start and end times, as well as performance parameters determined during the scan, are then stored in the final database.
1. Loads the current score from the Comparison File into the
CompareScore structure.
2. Loops the variable ScoreCount through all scores in the
Comparison File which fall between the established start and
end time strips.
3. Tests if the current score's fundamental frequency falls
between the established start and end frequencies and
branches accordingly.
4. Evaluates the score and sets a flag if the score justifies
further search.
5. Checks if the score falls within the frequency range of
another search in progress.
6. Tests if the preliminary evaluation flag is TRUE and if this
score is not in conflict with current searches in progress,
and branches accordingly.
7. Loads a series of scores from the Comparison File which fall
within a frequency and time strip range related to the
current score.
8-9. Analyzes the series of scores to locate the most likely
start of the event. If a clear event start is located, the "SeriesTest" flag is set to TRUE and tested.
10. Allocates resources in the duration database to track this
event.
11. Tracks the partials of the event until the event's duration
is determined.
12. Saves the found event if the event's duration exceeds an
established lower limit.
13. Analyzes a time uncertainty and updates the knowledge base
if the SeriesTest flag indicates a time uncertainty.
14. Analyzes a poor score and updates the knowledge base if the
SeriesTest flag indicates a poor score.
15. Increments the current score count and loads in the next
score from the Comparison File.
INTERPRETATION (Evaluate a series of scores)
NAME: EvaluateSeries()
FIG. 10
Evaluate a Series of Scores novelly gathers a series of scores of related frequencies over a limited range of time strips, then uniquely locates the most likely score maximum within that series, and novelly calculates the equivalent time strip based on the offset into the score series. It also novelly provides useful information concerning potential errors in the analysis.
1. Loops the variable "SeriesLoop" through each entry in the series.
2. Records the series location of the current score if it is
the maximum score in the series search.
3. Continues searching for the maximum score if the current
score falls below the found maximum score.
4. Accepts the most recent maximum score if the search has
reached an established search range limit without
encountering a better maximum for this score attribute.
5. Continues this process for all other score attributes.
6. Calculates a lock position using a cumulative buffer. Adds
a score to the cumulative buffer at the series offset where
an attribute's score reaches a maximum. If several attributes lock on the same location the lock position will
show an excellent cumulative score.
7. Calculates the best cumulative locking position by scanning
the cumulative buffer.
8. Searches the cumulative buffer again and finds a secondary
locking position.
9. Calculates the time strip which corresponds to the best
locking position found in the cumulative buffer.
10. Records the best lock position.
11. Sets flags indicating that this region must be analyzed
further if the cumulative buffer value at the lock position
falls within an established uncertainty limit.
12. Sets flags indicating that an event has not been found
within the time strip range of the Series array if the
cumulative buffer value at the lock position falls below an
established threshold.
13. Sets flags indicating that the exact lock position cannot be
determined if a secondary lock position exists at an
established distance separate from the primary lock
position.
14. Sets flags indicating that the lock position is acceptable
if a clear lock has been attained.
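The cumulative-buffer idea in steps 6 to 14 can be sketched in Python. Here each score attribute casts one vote at the series offset where it peaks; the vote weight, the thresholds `min_votes` and `separation`, the status labels, and the function name `evaluate_series` are assumptions for illustration.

```python
def evaluate_series(series, first_strip=0, min_votes=2, separation=2):
    """Locate the most likely lock position in a series of scores.

    series -- dict mapping an attribute name (e.g. "amplitude", "drift",
              "form") to a list of scores, one per series offset.
    Returns (time_strip, status).
    """
    length = len(next(iter(series.values())))
    cumulative = [0] * length
    # Steps 1-6: each attribute adds to the buffer where its score peaks.
    for scores in series.values():
        best_offset = max(range(length), key=lambda i: scores[i])
        cumulative[best_offset] += 1
    # Step 7: the best cumulative locking position.
    primary = max(range(length), key=lambda i: cumulative[i])
    # Step 8: a secondary lock, at least `separation` offsets away.
    distant = [i for i in range(length)
               if abs(i - primary) >= separation and cumulative[i] > 0]
    secondary = max(distant, key=lambda i: cumulative[i], default=None)
    # Steps 11-14: flag the quality of the lock.
    if cumulative[primary] < min_votes:
        status = "NO_EVENT"
    elif secondary is not None:
        status = "UNCERTAIN"
    else:
        status = "LOCKED"
    # Step 9: convert the series offset back into a time strip.
    return first_strip + primary, status


series = {"amplitude": [1, 5, 3, 2], "drift": [2, 6, 1, 0], "form": [0, 4, 9, 1]}
print(evaluate_series(series, first_strip=40))
```

When several attributes peak at the same offset, that offset accumulates the highest buffer value, which is the situation the text describes as an excellent cumulative score.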
INTERPRETATION (Find the duration of an event)
NAME: FindDuration()
FIG. 11
Once an event has started, its duration is tracked by the
FindDuration routine which novelly monitors the Spectral File's partials at the frequency locations specified by the FingerPrint File and detects when the partial fades out or is interrupted by another event. As each partial is canceled by fade-out or by interruption, the total partial count is decremented. As active partials decrease, flags are set to indicate "fragile" duration tracking. Interpretation routines monitor the stability of the duration tracking for the current event. When active partials fall below a user-defined minimum, the event is flagged as complete and all duration tracking resources are freed.
1. Tests if this event is a continuation of a preceding
duration scan. If so, fade-out and interrupt entries in the
database are tested, and the FadeScore and InterruptScore
are appropriately updated with the correct number of
inactive partials.
2. Controls the loop through all of the partials.
3. Calculates the bit mask and database indexes needed by
subsequent Interpretation steps.
4-5. Tests if the current partial has already faded out or has
been interrupted, and, if so, increments a completed partial
count. Before an interrupt condition can occur a series of
down slopes must take place. By searching for an initial
down slope the duration tracking routines allow the initial
attack portion of an event to transpire successfully and to
start to decay in amplitude.
6. Examines the duration tracking database and determines if a
down slope occurred during a previous duration scan.
7. Calculates the memory address of the beginning of the
Fingerprint data.
8. Controls the loop through the time strips.
9. Accumulates a Fingerprint amplitude total used in
determining the current event's amplitude.
10. Retrieves the amplitude of the Spectral File for the current
position in frequency and time.
11. Accumulates the Spectral File amplitude total used to
calculate an event amplitude.
12-14. Compares the Spectral amplitude to the Fingerprint
amplitude, and, if the Spectral amplitude falls below a
user-defined percentage of the Fingerprint amplitude, a
bad-partial count is incremented. If the bad-partial count
exceeds a rejection threshold, the partial is marked as
completed and a FadeScore count is incremented. If enough
partials fall below the rejection threshold and if this
duration scan is the first for the current event, then
FindDuration marks this event as invalid.
15-18. Sets bits in the duration tracking database if the current
amplitude falls substantially below the calculated running
amplitude average and a down slope is found.
19-20. Tests if the current Spectral amplitude falls below the
fade-out amplitude, and, if so, sets fade-out bits in the
duration tracking database.
21-22. Tests if the general slope of the last few amplitude levels
is trending in an upward direction. If so, a more extensive
amplitude trend is calculated.
23-24. Sets bits in the duration tracking database and increments
the interrupt score if the current amplitude level exceeds
the trend level plus an offset amount.
25. Accumulates a wild frequency drift count and a bad-form
count. Since certain event combinations can destroy the
recognizable fade-out and interrupt characteristics of the
Spectral File, a third test can be made to test for wild
frequency drift and dramatic form variations since these
conditions indicate a new event obscuring the tracking
capabilities of FindDuration.
26-27. Tests if the wild-drift count or bad-form count exceeds an
established minimum, and, if so, sets bits in the duration
tracking database.
28. Totals the amplitudes for the current partial.
29. Calculates an event amplitude if this is the first duration
scan.
30. Assigns the results of the duration scan to the variables
returned from FindDuration.
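Duration tracking can be illustrated with a reduced Python sketch. It follows each partial forward in time and retires it on fade-out or interrupt; the event ends when too few partials remain active. The previous amplitude stands in for the calculated trend level, and the names `find_duration`, `fade_level`, `interrupt_offset`, and `min_active` are hypothetical. The wild-drift, bad-form, and Fingerprint amplitude tests are omitted.

```python
def find_duration(partials, fade_level=1.0, interrupt_offset=2.0, min_active=1):
    """Track an event's partials from the lock position until the event ends.

    partials -- list of amplitude lists, one per partial, indexed by
                time strip starting at the lock position (assumed layout).
    Returns (end_strip, ended) where ended maps a partial index to
    ("FADE" | "INTERRUPT", strip).
    """
    n_strips = len(partials[0])
    active = [True] * len(partials)
    seen_down = [False] * len(partials)
    ended = {}
    for strip in range(1, n_strips):
        for p, amps in enumerate(partials):
            if not active[p]:
                continue
            prev, cur = amps[strip - 1], amps[strip]
            # A down slope must occur before an interrupt can be declared,
            # so the initial attack portion is allowed to pass.
            if cur < prev:
                seen_down[p] = True
            if cur < fade_level:
                active[p] = False
                ended[p] = ("FADE", strip)
            elif seen_down[p] and cur > prev + interrupt_offset:
                active[p] = False
                ended[p] = ("INTERRUPT", strip)
        # The event is complete when active partials fall below the minimum.
        if sum(active) < min_active:
            return strip, ended
    return n_strips, ended


partials = [[8, 9, 6, 3, 0.5, 0.2],   # decays and fades out
            [5, 6, 4, 3, 7, 7]]       # decays, then is interrupted
print(find_duration(partials))
```

In the example both partials are retired at time strip 4, one by fade-out and one by interruption, so the event is reported as ending there.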
FINGERPRINT (Make Fingerprint) NAME: MakeFingerprint()
FIG. 12
MakeFingerprint uniquely extracts and saves the identifying characteristics of a Spectral File from a known source. The Spectral
File is scanned at each partial's frequency position for frequency peaks and general drift centers are determined. The Spectral File is analyzed a second time, and the peak amplitudes, drifts, widths, and details of peak structure are saved for each peak. The final result is a
Fingerprint File containing the essential information describing the known source.
1. Initializes the variables for the Fingerprint generation
process.
2. Loops the variable "FreqLoop" through all frequencies in the
Fingerprint generation frequency range and fills the PartialOnOff map.
3. Loops the variable "TimeLoop" through a set range of time
strips.
4. Finds the amplitude peak nearest in frequency to FreqLoop.
5. Sets the above and below form variables to the NO FORM state
if an accurate peak cannot be found, thus indicating that
the peak has no discernable form.
6. Finds the frequency width, form, and drift.
7. Disables the analysis of this partial if the drift is
extremely erratic over several time strips. Erratic drift
implies an uncertain frequency position for the partial.
8. Marks the current time strip as being the end of the
drift-center determination range unless the current
amplitude exceeds a dynamic threshold, in which case the
amplitude is included in the determination of the drift
center position.
9. Calculates the increment to the "PartialAmp" value based on
a comparison of the current partial to a known standard
amplitude.
10. Finds the common drift center for the partial under
analysis.
11. Logs the partial in the PartialOnOff map without drift if
the finished partial is the fundamental.
12. Logs the partial in the PartialOnOff map if the number of
strips used in the calculation of the common drift center
exceeds the minimum number of required time strips.
13. Calculates the next recommended frequency to analyze from a
map of partial ratios.
14. Saves the calculated PartialOnOff map and associated
variables in the Fingerprint File.
15. Loops the variable FreqLoop through all frequencies in the
Fingerprint generation range and fills the Fingerprint with
the correct amplitude levels.
16-17. Loops the variable TimeLoop through an established range of
time strips if the PartialOnOff map indicates that the
current frequency corresponds to a listed partial.
18. Finds the amplitude peak closest to the partial under
analysis.
19. Sets the Form variables to NO FORM if an accurate peak
cannot be found, thus implying that the peak has no form.
20. Finds the frequency width, form, and drift.
21. Saves the completed Fingerprint partial data in the
Fingerprint File.
SYNCHRONIZATION (General discussion.)
To facilitate the ensuing discussion of the Synchronization procedures, devices and control procedures common to the music and recording industry are included for purposes of illustration and in no way imply any limit on the general applicability of the invention.
Recording processes in the music industry commonly utilize a standard synchronizing timing protocol generator/reader as a high-resolution clocking device. Such devices are typically capable of generating MIDI real-time clock signals that can be passed to an apparatus in accordance with the invention via a MIDI serial interface that is responsible for generating the timing requests to which the invention responds.
For digital sampling purposes an analog-to-digital converter controlled by a dedicated signal processing device is suggested.
Whatever the field of application, the sampling device selected should be capable of sampling at a rate corresponding to the demands of the waveform being recorded and of responding to START RECORD, STOP RECORD, and ELAPSED SAMPLE requests from embodiments of the invention.
SYNCHRONIZATION (Record a Sample File, and generate sync table) NAME: RecordAndSync() FIG. 13A
1-3. Allocates memory for the synchronization table, initializes
the record variables and counters, enables the interrupts
(timing requests), sets the interrupt rate, and initializes
the interrupt processing routines.
4. Starts the recording process.
5-7. Sends an END RECORD command and disables the interrupt
processing routines if a stop-recording request is received
from the user.
8. Saves the synchronization table.
SYNCHRONIZATION (Event time request processing)
NAME: proc int FIG. 13B
These routines utilize assembly language in order to provide a quick, accurate, and precise response to interrupts.
1-2. Retrieves the data placed in the data port by the interrupt
generating process and verifies the presence of a clock
identifier command.
3-4. Retrieves the elapsed sample count from the sampling device,
clears the sampling device's elapsed sample counter, and
stores the elapsed sample count in the synchronization
table.
SYNCHRONIZATION (Calculate synchronized time from sync table)
NAME: GetSyncTimeData()
FIG. 13C
1. Calculates the absolute time of the requested StartTimeStrip as well as the absolute interrupt count that would be
associated with this absolute time if the measurement and
recording of the passage of time were constant. Also
calculates the difference between (a) the time associated
with the interrupt count which falls immediately before the
requested StartTimeStrip (LowerInterrupt) and (b) the
absolute time of the requested StartTimeStrip.
2. Searches the synchronization table to find the sample count associated with the LowerInterrupt and the sample count
associated with the interrupt immediately following the
absolute interrupt.
3. Calculates the number of samples between the lower sample
count and the upper sample count and multiplies this
quantity by the time difference calculated in block 1
(above) thus resulting in the sample offset from the lower
interrupt sample count.
4. Adds the lower interrupt sample count to the sample offset
to calculate the final sample count.
5. Multiplies the final sample count by the time of one sample
in order to calculate the final synchronized time.
Timing accuracy of these processes can be increased by decreasing the time interval between interrupts.
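The five steps above amount to a linear interpolation within the synchronization table, which can be sketched in Python as follows. The sketch assumes that the table holds the elapsed sample count recorded at each interrupt (the counter being cleared each time, as in FIG. 13B) and that the time difference of step 1 is expressed as a fraction of one interrupt interval; the function and parameter names are hypothetical.

```python
from itertools import accumulate


def get_sync_time(sync_table, start_time_strip, strip_seconds,
                  interrupt_seconds, sample_seconds):
    """Return the synchronized time (seconds) of a requested time strip.

    sync_table -- elapsed sample counts, one per interrupt (assumed layout)
    """
    # Running totals: cumulative[k] is the sample count at interrupt k.
    cumulative = [0] + list(accumulate(sync_table))
    # Step 1: nominal time, the interrupt just before it, and the
    # fractional position between that interrupt and the next one.
    absolute_time = start_time_strip * strip_seconds
    lower_interrupt = int(absolute_time // interrupt_seconds)
    if lower_interrupt + 1 >= len(cumulative):
        raise ValueError("requested time lies beyond the synchronization table")
    fraction = (absolute_time - lower_interrupt * interrupt_seconds) / interrupt_seconds
    # Step 2: sample counts at the bracketing interrupts.
    lower_samples = cumulative[lower_interrupt]
    upper_samples = cumulative[lower_interrupt + 1]
    # Step 3: sample offset from the lower interrupt.
    sample_offset = (upper_samples - lower_samples) * fraction
    # Step 4: final sample count.
    final_samples = lower_samples + sample_offset
    # Step 5: convert samples to time.
    return final_samples * sample_seconds


print(get_sync_time([1024, 1040, 1008], start_time_strip=6,
                    strip_seconds=0.25, interrupt_seconds=1.0,
                    sample_seconds=1 / 1024))
```

In the example the requested time strip nominally starts at 1.5 s, but because 1040 samples rather than 1024 elapsed during the second interrupt interval, the synchronized time comes out slightly later.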
Although the invention has been described with reference to a preferred embodiment, other embodiments can achieve the same results.
Variations and modifications of the present invention will be obvious to those skilled in the art, and the appended claims are intended to cover all such modifications and equivalents.
The further features described herein are claimed in the applicant's co-pending application published under publication number GB-A-2262992.
Claims (8)
1. A method for determining a similarity between predetermined wave data and discrete portions of a continuous stream of wave data, the method comprising the steps of:
(a) inputting as a first input predetermined spectral wave data, the predetermined spectral wave data including a plurality of time strips, with each time strip including at least a first partial and a second partial;
(b) generating a first signal that includes first distortion information for each time strip of the predetermined spectral wave data;
(c) storing the first distortion information for each time strip of the predetermined spectral wave data;
(d) inputting as a second input a continuous stream of wave data;
(e) generating from the second input spectral wave data that includes a plurality of time strips, with each time strip further including at least a third partial and a fourth partial;
(f) generating a second signal that includes second distortion information for each time strip of the spectral wave data; and
(g) comparing the second distortion information with the stored first distortion information to obtain a similarity score for each time strip.
2. The method as recited in claim 1, wherein the first distortion information includes amplitude-independent, frequency-distribution widths for the first and second partials, and the second distortion information includes amplitude-independent, frequency-distribution widths for the third and fourth partials.
3. The method as recited in claim 1 or 2, wherein the first distortion information includes amplitude-independent, frequency-distribution widths settle-times for the first and second partials, and the second distortion information includes amplitude-independent, frequency-distribution widths settle-times for the third and fourth partials.
4. The method as recited in claim 1, 2 or 3, wherein the first distortion information includes amplitude-independent, frequency-distribution width forms for the first and second partials, and the second distortion information includes amplitude-independent, frequency-distribution width forms for the third and fourth partials.
5. The method as recited in any preceding claim, wherein step (c) further comprises the step of storing a first amplitude-independent slope for the first and second partials, step (f) further comprises the step of generating a second amplitude-independent slope for the third and fourth partials, and step (g) further comprises the step of comparing the stored first amplitude-independent slope with the second amplitude-independent slope.
6. The method as recited in any one of claims 1 to 4, wherein step (c) further comprises the step of storing a first amplitude-independent slope for the first and second partials if the first partial is coterminous with the second partial, step (f) further comprises the step of generating a second amplitude-independent slope for the third and fourth partials if the third partial is coterminous with the fourth partial, and step (g) further comprises the step of comparing the stored first amplitude-independent slope with the second amplitude-independent slope.
7. The method as recited in any preceding claim, wherein the continuous stream of wave data includes a continuous stream of complex wave data.
8. The method as recited in any preceding claim, wherein the continuous stream of wave data includes a continuous stream of digital wave data.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US54224190A | 1990-06-21 | 1990-06-21 | |
GB9226636A GB2262992B (en) | 1990-06-21 | 1991-06-18 | Method and apparatus for wave analysis and event recognition |
Publications (3)
Publication Number | Publication Date |
---|---|
GB9418236D0 GB9418236D0 (en) | 1994-10-26 |
GB2282456A true GB2282456A (en) | 1995-04-05 |
GB2282456B GB2282456B (en) | 1995-07-05 |
Family
ID=26302198
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB9418236A Expired - Fee Related GB2282456B (en) | 1990-06-21 | 1991-06-18 | Method and apparatus for wave analysis and event recognition |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2282456B (en) |
-
1991
- 1991-06-18 GB GB9418236A patent/GB2282456B/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
GB9418236D0 (en) | 1994-10-26 |
GB2282456B (en) | 1995-07-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
732E | Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977) | ||
PCNP | Patent ceased through non-payment of renewal fee |
Effective date: 20000618 |