US20070276668A1 - Method and apparatus for accessing an audio file from a collection of audio files using tonal matching - Google Patents
- Publication number
- US20070276668A1 (US 11/439,760)
- Authority
- US
- United States
- Prior art keywords
- audio file
- vocal
- collection
- discrete portions
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/632—Query formulation
- G06F16/634—Query by example, e.g. query by humming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Definitions
- This invention relates to a method and apparatus for accessing an audio file from a collection of audio files, and particularly relates to the accessing of files using tonal matching.
- While the audio files may be stored and categorisable according to their song titles, artistes, genre or the like, there may be instances where a user may forget the title or artiste of a song, rendering a search for the pertinent audio file akin to searching for a needle in a haystack. In many instances, the user may only be able to remember a portion of the song or its tune. At the present moment, this does not aid in the search for the pertinent audio file in any way. This is a problem when attempting to access audio files in a large collection of audio files where certain information like title or artiste of a song is unknown. This problem also arises when the visually impaired attempts to access audio files in a collection of audio files where they are unable to select the audio files through the use of sight.
- a method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device.
- the method includes generating one index comprising of information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection information being linked to at least one information entry; receiving a vocal input during a voice reception mode; converting the vocal input into a digital signal using a digital-analog converter; analysing the digital signal using frequency spectrum analysis into discrete portions; and comparing the discrete portions with the entries in the index.
- the audio file is accessed when the discrete portions substantially coincide with at least one of the information entries in the index.
- the discrete portions are either musical notes or waveforms.
- the at least one information entry may also be musical notes or waveforms.
- the vocal input may preferably be speaker independent and may be in the form of singing, humming, or whistling.
- the form of vocal input may preferably be manually or automatically selectable.
- the audio file is accessible from the electronic device itself, a device functionally connected to the electronic device or a connected computer network.
- the information entry may also preferably be received from the audio file, a pre-recorded vocal entry linked to the audio file, or a connected computer network.
- the electronic device is selected from the group comprising: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
- the method further includes selecting a facility to access the audio files by depressing a pre-determined button at least once, and filtering the vocal input.
- an apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus includes an indexer for generating an index comprising of information entries obtained from each of the more than one audio files in the collection, with each audio file in the collection information being linked to at least one information entry; a vocal reception means for receiving a vocal input during a vocal reception mode; converting the vocal input into a digital signal using a digital-analog converter; and a processor to analyse the digital signal using frequency spectrum analysis into discrete portions, the processor also being able to compare the discrete portions with the entries in the index.
- the audio file is accessed when the discrete portions substantially coincide with at least one of the information entries in the index.
- the apparatus may include a display and the vocal input may be filtered.
- the vocal reception mode may be activated by depressing at least one button at least once. It is preferable that the discrete portions are musical notes or waveforms.
- the apparatus is selected from the group comprising: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
- the vocal input is either manually or automatically selected from the group comprising: singing, humming, and whistling.
- the vocal input is speaker independent.
- the at least one information entry may be selected from either musical notes or waveforms.
- the at least one information entry is received from the audio file, a pre-recorded vocal entry linked to the audio file, or a connected computer network.
- the audio file may be accessible from the electronic device itself, any device functionally connected to the electronic device or a connected computer network.
- FIG. 1 shows a flow chart of a method of a preferred embodiment of the present invention.
- FIG. 2 shows a schematic diagram of an apparatus of a preferred embodiment of the present invention.
- program modules include routines, programs, characters, components, data structures, that perform particular tasks or implement particular abstract data types.
- the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like.
- the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
- program modules may be located in both local and remote memory storage devices.
- the electronic device may be, for example, a vehicle audio system, a desktop computer, a notebook computer, a PDA, a portable media player or a mobile phone and the like.
- the method may include an enablement of a vocal reception mode ( 20 ) in the electronic device in a manner like, for example, depressing a pre-determined button on the electronic device at least once.
- the vocal reception mode may be enabled or disabled as it may prevent a power source in the electronic device from being continually drained by continual enablement of the vocal reception mode.
- the vocal reception mode may be for vocal input such as, for example, singing, humming, or whistling.
- the enablement of the vocal reception mode in the electronic device may initialise an indexing system ( 24 ). Once the indexing system is initiated, the system then determines whether the composition of audio files in the collection has changed ( 26 ).
- the composition of audio files may include the number of audio files and the audio filenames.
- the index may comprise information entries obtained from each of the more than one audio file in the collection of audio files stored in the electronic device, any device functionally connected to the electronic device or a connected computer network. Connection to the computer network may be via wired or wireless means.
- Each audio file in the collection may be linked to at least one information entry in the index.
- the at least one information entry may be musical notes or waveforms determined using semantic segmentation corresponding to a portion or the whole content stored in the audio files.
- the information entry may also be a MIDI component that is linked/attached to an audio file like file metadata.
- the information entry may also be obtainable from a pre-recorded vocal entry linked/attached to the audio file, or a connected computer network.
- There may be an online database on the connected computer network where information entries of musical notes or waveforms are downloadable for each audio file.
- a search is conducted on the collection of audio files stored in the electronic device, any device functionally connected to the electronic device or a connected computer network ( 28 ). This step is to determine whether audio files have been added to or removed from the collection. Subsequent to the search, information entries obtained from each audio file directly ( 25 ), information entries downloaded from the connected computer network for each audio file ( 29 ), or pre-recorded vocal entries linked to each audio file ( 23 ) may be combined into an index ( 30 ). The index is then loaded for use ( 32 ) in the electronic device.
- the last used index is then loaded for use ( 32 ) in the electronic device.
- the vocal input may be singing, humming, or whistling. In a particular instance, the vocal input need not be a song in its entirety. A portion of a song may be sufficient as a viable form of the vocal input.
- the vocal input may be filtered.
- a user may be able to manually select a specific vocal input ( 22 ) for the vocal reception mode.
- Vocal reception by the electronic device may be speaker independent.
- the vocal reception mode may have automatic volume correction for the vocal input if the vocal input is either too loud (such that distortion of input occurs) or too soft (such that input is inaudible).
- the electronic device may also be able to overcome the problem of an off tune vocal input during the vocal input mode by providing a selection of audio files that most closely approximates to the off tune vocal input based on the entries of the audio files in the index.
- the user may set the device to show the closest approximations up to a pre-determined number, such as, for example, the ten closest approximations.
- the vocal input in analog form is converted into digital signals by a digital-analog converter ( 36 ).
- the converter may be an analog-MIDI converter.
- a processor in the electronic device may analyse the digital signals into discrete portions, where the discrete portions may be either musical notes or waveforms. Processing of the digital signals may be done using frequency spectrum analysis.
- the processor may then compare the discrete portions with entries in the index ( 40 ). Exact or substantial similarity between the discrete portions and entries in the index enables the generation of a listing of audio files in order of extent of similarity ( 42 ).
- the listing may show a number of audio files, a number that may be pre-determined by the user and may be shown on a display on the electronic device.
- the extent of similarity may be based on relative closeness in terms of either musical notes or waveforms.
- an apparatus 50 for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus 50 may be for example, a vehicle audio system, desktop computer, notebook computer, PDA, portable media player or mobile phone.
- the components described in the following sections may be incorporated in the aforementioned different forms of the apparatus 50 in addition to components used for their primary functionalities.
- the apparatus 50 may include a digital storage device 58 for the storage of the audio files that make up the collection of files.
- the digital storage device 58 may be non-volatile memory in the form of a hard disk drive or flash memory.
- the digital storage device 58 may have capacities of at least a few megabytes.
- the apparatus 50 may also include an indexer 56 for generating an index comprising of information entries obtained from each of the more than one audio files in the collection.
- the index may comprise information entries obtained from each of the more than one audio file in a collection of audio files stored in the digital storage device 58 of the apparatus 50 , any device functionally connected to the apparatus 50 or a connected computer network.
- Each audio file in the collection may be linked to at least one information entry in the index.
- the at least one information entry may be musical notes or waveforms determined using semantic segmentation corresponding to a portion or the whole content stored in the audio files.
- the information entry may also be a MIDI component that is linked/attached to an audio file like file metadata.
- the information entry may also be obtainable from a pre-recorded vocal entry linked/attached to the audio file, or a connected computer network.
- a vocal reception means 64 for receiving a vocal input during a vocal reception mode may also be included in the apparatus 50 .
- the vocal reception means 64 may be a microphone.
- the vocal input may be singing, humming, or whistling. In a particular instance, the vocal input need not be a song in its entirety. A portion of a song may be sufficient as a viable form of the vocal input.
- the vocal input may also be filtered. There may be a selector to choose the type of vocal input, or detection of vocal input may be automatic.
- the vocal reception mode may be activated by pressing an activating button 63 incorporated with the apparatus 50 at least once. Vocal input into the vocal reception means 64 may be speaker independent.
- the vocal reception mode may have automatic volume correction for the vocal input if the vocal input is either too loud (such that distortion of input occurs) or too soft (such that input is inaudible).
- the electronic device may also be able to overcome the problem of an off tune vocal input during the vocal input mode by providing a selection of audio files that most closely approximates to the off tune vocal input based on the entries of the audio files in the index.
- the user may set the device to show the closest approximations up to a pre-determined number, such as, for example, the ten closest approximations.
- the vocal reception means 64 may be coupled to a digital-analog converter 62 which converts the vocal input through the vocal reception means 64 into digital signals.
- the converter 62 may be an analog-MIDI converter.
- the converted digital signals are then passed into a processor 60 for analysis of the digital signals into discrete portions, where the discrete portions may be either musical notes or waveforms. Processing of the digital signals by the processor 60 may be done using frequency spectrum analysis.
- the processor 60 may then be able to compare the discrete portions of the signals with the entries in the index generated by the indexer 56 . Audio files may thereby be accessible when the discrete portions substantially coincide with at least one of the information entries in the index.
- Exact or substantial similarity between the discrete portions and entries in the index enables the generation of a listing of audio files in order of extent of similarity.
- the listing may show a number of audio files, a number that may be pre-determined by the user.
- a display 54 in the apparatus 50 allows for the listing of files to be shown clearly for selection by the user.
- the extent of similarity may be based on relative closeness in terms of either musical notes or waveforms.
- the visually impaired may be able to use apparatus 50 to access files stored within or accessible with the apparatus 50 using tonal matching. While they are unable to select the files shown on the display 54 , they may access the audio file which has been extracted from the collection at their convenience just from using vocal input.
- An alternative application of the present invention makes use of the vocal reception mode of the electronic device to ascertain and improve vocal abilities of users. For example, if a user repeatedly fails to find a desired audio file through the use of vocal input into the electronic device, it is highly probable that the user's vocal input (prowess) is flawed. Thus the user is then inclined to continually practice vocal input into the electronic device until improvement is attained in terms of a higher incidence of finding a desired audio file. Thus, a device to conveniently ascertain a level of quality for vocal input is also disclosed.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Library & Information Science (AREA)
- Mathematical Physics (AREA)
- Auxiliary Devices For Music (AREA)
- Electrophonic Musical Instruments (AREA)
- Telephone Function (AREA)
Abstract
A method and apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device. The method includes generating one index comprising information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection information being linked to at least one information entry; receiving a vocal input during a voice reception mode; converting the vocal input into a digital signal using a digital-analog converter; analysing the digital signal using frequency spectrum analysis into discrete portions; and comparing the discrete portions with the entries in the index. It is advantageous that the audio file is accessed when the discrete portions substantially match at least one of the information entries in the index. It is preferable that the discrete portions are either musical notes or waveforms.
Description
- This invention relates to a method and apparatus for accessing an audio file from a collection of audio files, and particularly relates to the accessing of files using tonal matching.
- The advent of the age of affordable digital entertainment has given rise to a sharp increase in the adoption of personal digital entertainment devices by consumers. Such personal digital entertainment devices are usually equipped with storage capacities of a range of sizes. Given the falling prices of storage devices like hard drives and flash memory, an increasing number of personal digital entertainment devices come with storage capacities exceeding 1 GB. Storage capacities of such sizes in personal digital entertainment devices used for audio files enable the storage of hundreds and even thousands of files.
- While the audio files may be stored and categorisable according to their song titles, artistes, genre or the like, there may be instances where a user may forget the title or artiste of a song, rendering a search for the pertinent audio file akin to searching for a needle in a haystack. In many instances, the user may only be able to remember a portion of the song or its tune. At the present moment, this does not aid in the search for the pertinent audio file in any way. This is a problem when attempting to access audio files in a large collection of audio files where certain information like title or artiste of a song is unknown. This problem also arises when the visually impaired attempts to access audio files in a collection of audio files where they are unable to select the audio files through the use of sight.
- It is also rather difficult to improve one's vocal prowess without engaging expensive vocal coaches. It is currently difficult to improve one's vocal prowess independently besides using karaoke machines with “scoring” functionalities incorporated in them. There are currently few devices available which are able to determine the quality of one's vocal prowess easily and conveniently.
- In a preferred aspect of the present invention, there is provided a method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device. The method includes generating one index comprising of information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection information being linked to at least one information entry; receiving a vocal input during a voice reception mode; converting the vocal input into a digital signal using a digital-analog converter; analysing the digital signal using frequency spectrum analysis into discrete portions; and comparing the discrete portions with the entries in the index. It is advantageous that the audio file is accessed when the discrete portions substantially coincide with at least one of the information entries in the index. It is preferable that the discrete portions are either musical notes or waveforms. The at least one information entry may also be musical notes or waveforms.
- The vocal input may preferably be speaker independent and may be in the form of singing, humming, or whistling. The form of vocal input may preferably be manually or automatically selectable.
- It is preferable that the audio file is accessible from the electronic device itself, a device functionally connected to the electronic device or a connected computer network. The information entry may also preferably be received from the audio file, a pre-recorded vocal entry linked to the audio file, or a connected computer network. It is preferable that the electronic device is selected from the group comprising: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
- It is also preferable that the method further includes selecting a facility to access the audio files by depressing a pre-determined button at least once, and filtering the vocal input.
- There is also provided an apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus. It is preferable that the apparatus includes an indexer for generating an index comprising of information entries obtained from each of the more than one audio files in the collection, with each audio file in the collection information being linked to at least one information entry; a vocal reception means for receiving a vocal input during a vocal reception mode; converting the vocal input into a digital signal using a digital-analog converter; and a processor to analyse the digital signal using frequency spectrum analysis into discrete portions, the processor also being able to compare the discrete portions with the entries in the index. Advantageously, the audio file is accessed when the discrete portions substantially coincide with at least one of the information entries in the index. The apparatus may include a display and the vocal input may be filtered. The vocal reception mode may be activated by depressing at least one button at least once. It is preferable that the discrete portions are musical notes or waveforms.
- It is preferable that the apparatus is selected from the group comprising: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
- It is preferable that the vocal input is either manually or automatically selected from the group comprising: singing, humming, and whistling. Advantageously, the vocal input is speaker independent. The at least one information entry may be selected from either musical notes or waveforms. Preferably, the at least one information entry is received from the audio file, a pre-recorded vocal entry linked to the audio file, or a connected computer network. The audio file may be accessible from the electronic device itself, any device functionally connected to the electronic device or a connected computer network.
- There is also provided a method of determining a level of quality for vocal input using the aforementioned apparatus.
- In order that the present invention may be fully understood and readily put into practical effect, there shall now be described by way of non-limitative example only preferred embodiments of the present invention, the description being with reference to the accompanying illustrative drawings.
- FIG. 1 shows a flow chart of a method of a preferred embodiment of the present invention.
- FIG. 2 shows a schematic diagram of an apparatus of a preferred embodiment of the present invention.
- The following discussion is intended to provide a brief, general description of a suitable computing environment in which the present invention may be implemented. Although not required, the invention will be described in the general context of computer-executable instructions, such as program modules, being executed by a personal computer. Generally, program modules include routines, programs, characters, components, data structures, that perform particular tasks or implement particular abstract data types. As those skilled in the art will appreciate, the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
- Referring to FIG. 1, there is provided a flow chart of a method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device. The electronic device may be, for example, a vehicle audio system, a desktop computer, a notebook computer, a PDA, a portable media player or a mobile phone and the like. The method may include an enablement of a vocal reception mode (20) in the electronic device in a manner like, for example, depressing a pre-determined button on the electronic device at least once. The vocal reception mode may be enabled or disabled as it may prevent a power source in the electronic device from being continually drained by continual enablement of the vocal reception mode. The vocal reception mode may be for vocal input such as, for example, singing, humming, or whistling.
- The enablement of the vocal reception mode in the electronic device may initialise an indexing system (24). Once the indexing system is initiated, the system then determines whether the composition of audio files in the collection has changed (26). The composition of audio files may include the number of audio files and the audio filenames. The index may comprise information entries obtained from each of the more than one audio file in the collection of audio files stored in the electronic device, any device functionally connected to the electronic device or a connected computer network. Connection to the computer network may be via wired or wireless means. Each audio file in the collection may be linked to at least one information entry in the index. The at least one information entry may be musical notes or waveforms determined using semantic segmentation corresponding to a portion or the whole content stored in the audio files. The information entry may also be a MIDI component that is linked/attached to an audio file like file metadata. The information entry may also be obtainable from a pre-recorded vocal entry linked/attached to the audio file, or a connected computer network. There may be an online database on the connected computer network where information entries of musical notes or waveforms are downloadable for each audio file.
- If the composition of audio files is found to be different, a search is conducted on the collection of audio files stored in the electronic device, any device functionally connected to the electronic device or a connected computer network (28). This step is to determine whether audio files have been added to or removed from the collection. Subsequent to the search, information entries obtained from each audio file directly (25), information entries downloaded from the connected computer network for each audio file (29), or pre-recorded vocal entries linked to each audio file (23) may be combined into an index (30). The index is then loaded for use (32) in the electronic device.
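The change check (26) and the rebuild-or-reuse decision (28 to 32) can be sketched as follows. This is a minimal illustration in Python, not the implementation disclosed here: `IndexCache`, `composition_of` and the `extract_entry` callback are hypothetical names, and each information entry is assumed to be a list of MIDI note numbers.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: the "composition" is the number of files plus their filenames.
Composition = Tuple[int, Tuple[str, ...]]
# Assumption: an information entry is a sequence of MIDI note numbers.
Index = Dict[str, List[int]]


def composition_of(filenames: List[str]) -> Composition:
    """Summarise a collection by its file count and sorted filenames."""
    names = tuple(sorted(filenames))
    return (len(names), names)


class IndexCache:
    """Rebuilds the index only when the composition of the collection changes."""

    def __init__(self, extract_entry: Callable[[str], List[int]]) -> None:
        # extract_entry stands in for any entry source: the audio file itself,
        # a pre-recorded vocal entry, or a download from a network database.
        self._extract_entry = extract_entry
        self._last_composition: Optional[Composition] = None
        self._index: Index = {}
        self.rebuild_count = 0

    def load(self, filenames: List[str]) -> Index:
        current = composition_of(filenames)
        if current != self._last_composition:
            # Files were added or removed: combine fresh entries into an index.
            self._index = {name: self._extract_entry(name) for name in current[1]}
            self._last_composition = current
            self.rebuild_count += 1
        # Otherwise the last used index is returned unchanged.
        return self._index
```

Comparing only counts and filenames mirrors the composition check in the text; a file replaced by different content under the same name would not trigger a rebuild in this sketch.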
- If the composition of audio files is found to be unchanged, the last used index is then loaded for use (32) in the electronic device. With the enablement of the vocal reception mode, there may be vocal input into the device (34). The vocal input may be singing, humming, or whistling. In a particular instance, the vocal input need not be a song in its entirety. A portion of a song may be sufficient as a viable form of the vocal input. The vocal input may be filtered. A user may be able to manually select a specific vocal input (22) for the vocal reception mode. There may also be automatic detection of vocal input (22). Vocal reception by the electronic device may be speaker independent. The vocal reception mode may have automatic volume correction for the vocal input if the vocal input is either too loud (such that distortion of input occurs) or too soft (such that input is inaudible). The electronic device may also be able to overcome the problem of an off tune vocal input during the vocal input mode by providing a selection of audio files that most closely approximates to the off tune vocal input based on the entries of the audio files in the index. The user may set the device to show the closest approximations up to a pre-determined number, such as, for example, the ten closest approximations.
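One way to tolerate an off tune input and still return the closest approximations is sketched below. The technique is an assumption of this example and is not stated in the text: notes are reduced to semitone intervals so that singing in the wrong key does not matter, and an approximate substring edit distance lets a short, slightly wrong portion match anywhere inside a longer entry. The names `intervals`, `substring_distance` and `closest_matches` are hypothetical.

```python
from typing import Dict, List, Tuple


def intervals(notes: List[int]) -> List[int]:
    """Successive pitch steps in semitones; unchanged when a tune is transposed."""
    return [b - a for a, b in zip(notes, notes[1:])]


def substring_distance(query: List[int], entry: List[int]) -> int:
    """Smallest edit distance between the query and any contiguous part of entry."""
    prev = [0] * (len(entry) + 1)  # a match may start anywhere in the entry
    for i, q in enumerate(query, start=1):
        curr = [i]
        for j, e in enumerate(entry, start=1):
            cost = 0 if q == e else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return min(prev)  # ... and may end anywhere in the entry


def closest_matches(
    query_notes: List[int], index: Dict[str, List[int]], limit: int = 10
) -> List[Tuple[str, int]]:
    """Return up to `limit` (filename, distance) pairs, closest first."""
    q = intervals(query_notes)
    scored = [
        (name, substring_distance(q, intervals(entry)))
        for name, entry in index.items()
    ]
    scored.sort(key=lambda item: (item[1], item[0]))
    return scored[:limit]
```

The default `limit` of ten follows the example in the text; a user-set value would simply be passed in its place.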
- Subsequently, the vocal input in analog form is converted into digital signals by a digital-analog converter (36). The converter may be an analog-MIDI converter. Thereafter, a processor in the electronic device may analyse the digital signals into discrete portions, where the discrete portions may be either musical notes or waveforms. Processing of the digital signals may be done using frequency spectrum analysis. The processor may then compare the discrete portions with entries in the index (40). Exact or substantial similarity between the discrete portions and entries in the index enables the generation of a listing of audio files in order of extent of similarity (42). The listing may show a number of audio files, a number that may be pre-determined by the user and may be shown on a display on the electronic device. The extent of similarity may be based on relative closeness in terms of either musical notes or waveforms.
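As an illustration of how frequency spectrum analysis might turn digital samples into discrete musical notes, the following sketch finds the strongest spectral peak in each fixed-size frame and maps it to the nearest MIDI note number. This is an assumed approach, not the method of the specification: the frame size, the 80 to 1000 Hz search range, the Hann window, and the function names are all hypothetical, and the sketch does no silence detection or harmonic handling.

```python
import cmath
import math
from typing import List, Sequence


def dominant_frequency(frame: Sequence[float], rate: int,
                       fmin: float = 80.0, fmax: float = 1000.0) -> float:
    """Frequency (Hz) of the strongest DFT bin between fmin and fmax."""
    n = len(frame)
    # Hann window to reduce spectral leakage at the frame edges.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    lo = max(1, int(fmin * n / rate))
    hi = min(n // 2, int(fmax * n / rate))
    best_bin, best_mag = lo, -1.0
    for k in range(lo, hi + 1):
        step = -2j * math.pi * k / n
        mag = abs(sum(s * cmath.exp(step * i) for i, s in enumerate(windowed)))
        if mag > best_mag:
            best_bin, best_mag = k, mag
    return best_bin * rate / n


def frequency_to_midi(freq: float) -> int:
    """Nearest MIDI note number, with A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(freq / 440.0))


def notes_from_signal(samples: Sequence[float], rate: int,
                      frame_size: int = 1024) -> List[int]:
    """Split the signal into frames and merge repeated notes into one."""
    notes: List[int] = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        note = frequency_to_midi(dominant_frequency(frame, rate))
        if not notes or notes[-1] != note:
            notes.append(note)
    return notes
```

The resulting note sequence is the kind of "discrete portions" that could then be compared against index entries. The plain DFT loop is used only to keep the sketch dependency-free; a practical implementation would more likely use an FFT.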
- Referring to FIG. 2, there is provided an apparatus 50 for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus 50. The apparatus 50 may be, for example, a vehicle audio system, desktop computer, notebook computer, PDA, portable media player or mobile phone. The components described in the following sections may be incorporated in the aforementioned different forms of the apparatus 50 in addition to components used for their primary functionalities.
- The apparatus 50 may include a digital storage device 58 for the storage of the audio files that make up the collection of files. The digital storage device 58 may be non-volatile memory in the form of a hard disk drive or flash memory. The digital storage device 58 may have capacities of at least a few megabytes.
- In addition, the apparatus 50 may also include an indexer 56 for generating an index comprising information entries obtained from each of the more than one audio files in the collection. The index may comprise information entries obtained from each of the more than one audio file in a collection of audio files stored in the digital storage device 58 of the apparatus 50, any device functionally connected to the apparatus 50 or a connected computer network. Each audio file in the collection may be linked to at least one information entry in the index. The at least one information entry may be musical notes or waveforms determined using semantic segmentation corresponding to a portion or the whole content stored in the audio files. The information entry may also be a MIDI component that is linked/attached to an audio file like file metadata. The information entry may also be obtainable from a pre-recorded vocal entry linked/attached to the audio file, or a connected computer network. There may be an online database on the connected computer network where information entries of musical notes or waveforms are downloadable for each audio file.
- A vocal reception means 64 for receiving a vocal input during a vocal reception mode may also be included in the apparatus 50. The vocal reception means 64 may be a microphone. The vocal input may be singing, humming, or whistling. In a particular instance, the vocal input need not be a song in its entirety. A portion of a song may be sufficient as a viable form of the vocal input. The vocal input may also be filtered. There may be a selector to choose the type of vocal input, or detection of vocal input may be automatic. The vocal reception mode may be activated by pressing an activating button 63 incorporated with the apparatus 50 at least once. Vocal input into the vocal reception means 64 may be speaker independent. The vocal reception mode may have automatic volume correction for the vocal input if the vocal input is either too loud (such that distortion of input occurs) or too soft (such that input is inaudible). The electronic device may also be able to overcome the problem of an off-tune vocal input during the vocal input mode by providing a selection of audio files that most closely approximates to the off-tune vocal input based on the entries of the audio files in the index. The user may set the device to show the closest approximations up to a pre-determined number, such as, for example, the ten closest approximations.
- The vocal reception means 64 may be coupled to a digital-analog converter 62 which converts the vocal input through the vocal reception means 64 into digital signals. The converter 62 may be an analog-MIDI converter. The converted digital signals are then passed into a processor 60 for analysis of the digital signals into discrete portions, where the discrete portions may be either musical notes or waveforms. Processing of the digital signals by the processor 60 may be done using frequency spectrum analysis. The processor 60 may then be able to compare the discrete portions of the signals with the entries in the index generated by the indexer 56. Audio files may thereby be accessible when the discrete portions substantially coincide with at least one of the information entries in the index. Exact or substantial similarity between the discrete portions and entries in the index enables the generation of a listing of audio files in order of extent of similarity. The listing may show a number of audio files, a number that may be pre-determined by the user. A display 54 in the apparatus 50 allows for the listing of files to be shown clearly for selection by the user. The extent of similarity may be based on relative closeness in terms of either musical notes or waveforms.
- The visually impaired may be able to use the apparatus 50 to access files stored within or accessible with the apparatus 50 using tonal matching. While they are unable to select the files shown on the display 54, they may access the audio file which has been extracted from the collection at their convenience just from using vocal input.
- An alternative application of the present invention makes use of the vocal reception mode of the electronic device to ascertain and improve vocal abilities of users. For example, if a user repeatedly fails to find a desired audio file through the use of vocal input into the electronic device, it is highly probable that the user's vocal input (prowess) is flawed. Thus the user is then inclined to continually practice vocal input into the electronic device until improvement is attained in terms of a higher incidence of finding a desired audio file. Thus, a device to conveniently ascertain a level of quality for vocal input is also disclosed.
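The idea of gauging vocal quality from how often a search succeeds could be sketched as a rolling success rate. This is purely illustrative: the specification does not define a quality metric, and the class name `VocalQualityTracker`, the window size, and the use of a simple hit rate are assumptions.

```python
from collections import deque
from typing import Deque, Optional


class VocalQualityTracker:
    """Tracks the share of recent vocal searches that found the desired file."""

    def __init__(self, window: int = 20) -> None:
        # Only the most recent `window` attempts count, so practice shows up quickly.
        self.outcomes: Deque[bool] = deque(maxlen=window)

    def record(self, found: bool) -> None:
        """Record whether the latest vocal search found the desired file."""
        self.outcomes.append(bool(found))

    def hit_rate(self) -> Optional[float]:
        """Fraction of recent searches that succeeded, or None with no data."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)
```

A rising hit rate over successive attempts would correspond to the "higher incidence of finding a desired audio file" that the text treats as a sign of improvement.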
- Whilst there has been described in the foregoing description preferred embodiments of the present invention, it will be understood by those skilled in the technology concerned that many variations or modifications in details of design or construction may be made without departing from the present invention.
Claims (27)
1. A method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device, including:
generating one index comprising of information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection being linked to at least one information entry;
receiving a vocal input during a voice reception mode;
converting the vocal input into a digital signal using a digital-analog converter;
analysing the digital signal using frequency spectrum analysis into discrete portions; and
comparing the discrete portions with the information entries in the index,
wherein the at least one audio file is accessed when the discrete portions substantially match at least one information entry in the index.
2. The method of claim 1, wherein the discrete portions are selected from the group consisting of: musical notes and waveforms.
3. The method of claim 1, wherein the vocal input is selected from the group consisting of: singing, humming, and whistling.
4. The method of claim 1, wherein the at least one information entry is selected from the group consisting of: musical notes and waveforms.
5. The method of claim 1, wherein the audio file accessible from a source selected from the group consisting of: the electronic device, any device functionally connected to the electronic device and a connected computer network.
6. The method of claim 3, wherein the vocal input is set by means selected from the group consisting of: manual selection and automatic selection.
7. The method of claim 1, wherein the vocal input is speaker independent.
8. The method of claim 1, wherein the at least one information entry is received from a source selected from the group consisting of: the audio file, a pre-recorded vocal entry linked to the audio file, and a connected computer network.
9. The method of claim 1, wherein the electronic device is selected from the group consisting of: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
10. The method of claim 1, further including selecting a facility to access the audio files by depressing a pre-determined button at least once.
11. The method of claim 1, further including filtering the vocal input.
12. An apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus, including:
an indexer configured to generate an index comprising information entries obtained from each of the more than one audio files in the collection, with each audio file in the collection being linked to at least one information entry;
a vocal receiver configured to receive a vocal input during a vocal reception mode;
a digital signal using a digital-analog converter configured to convert the vocal input into a digital signal; and
a processor configured to analyse the digital signal using frequency spectrum analysis into discrete portions and to compare the discrete portions with the information entries in the index,
wherein the at least one audio file is accessed when the discrete portions substantially match at least one information entry in the index.
13. The apparatus of claim 12, wherein the apparatus is selected from the group consisting of: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
14. The apparatus of claim 12, wherein the vocal input is selected from the group consisting of: singing, humming, and whistling.
15. The apparatus of claim 14, wherein the vocal input is set by means selected from the group consisting of: manual selection and automatic selection.
16. The apparatus of claim 12, wherein the at least one information entry is selected from the group consisting of: musical notes and waveforms.
17. The apparatus of claim 12, wherein the vocal input is speaker independent.
18. The apparatus of claim 12, wherein the at least one information entry is received from a source selected from the group consisting of: the audio file, a pre-recorded vocal entry linked to the audio file, and a connected computer network.
19. The apparatus of claim 12, wherein the vocal reception mode is activated by depressing at least one button at least once.
20. The apparatus of claim 12, further including a display.
21. The apparatus of claim 12, wherein the vocal input is filtered.
22. The apparatus of claim 12, wherein the discrete portions are selected from the group consisting of: musical notes and waveforms.
23. The apparatus of claim 12, wherein the audio file is accessible from a source selected from the group consisting of: the electronic device, any device functionally connected to the electronic device and a connected computer network.
24. A method of determining a level of quality for vocal input using the apparatus of claim 12.
25. A method for accessing at least one audio file from a collection of audio files stored within or accessible with an electronic device, the method comprising:
generating an index comprising information entries obtained from audio files in the collection, each audio file in the collection having at least one corresponding information entry in the index;
analysing a digital signal into discrete portions, the digital signal being obtained from a converted vocal input received during a voice reception mode; and
comparing the discrete portions with the information entries in the index,
wherein the at least one audio file is accessed when the discrete portions substantially match at least one information entry in the index.
26. The method according to claim 25, wherein the digital signal is analysed into discrete portions using frequency spectrum analysis.
27. The method according to claim 25, wherein the vocal input is converted into the digital signal using a digital analog converter.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/439,760 US20070276668A1 (en) | 2006-05-23 | 2006-05-23 | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
PCT/SG2007/000140 WO2007136349A1 (en) | 2006-05-23 | 2007-05-22 | A method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
US12/301,878 US8892565B2 (en) | 2006-05-23 | 2007-05-22 | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
CN2007800190803A CN101454778B (en) | 2006-05-23 | 2007-05-22 | A method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
TW096118334A TWI454942B (en) | 2006-05-23 | 2007-05-23 | A method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/439,760 US20070276668A1 (en) | 2006-05-23 | 2006-05-23 | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/301,878 Continuation US8892565B2 (en) | 2006-05-23 | 2007-05-22 | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070276668A1 (en) | 2007-11-29 |
Family ID: 38723575
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/439,760 Abandoned US20070276668A1 (en) | 2006-05-23 | 2006-05-23 | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
US12/301,878 Active 2030-07-13 US8892565B2 (en) | 2006-05-23 | 2007-05-22 | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching |
Country Status (4)
Country | Link |
---|---|
US (2) | US20070276668A1 (en) |
CN (1) | CN101454778B (en) |
TW (1) | TWI454942B (en) |
WO (1) | WO2007136349A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8478719B2 (en) | 2011-03-17 | 2013-07-02 | Remote Media LLC | System and method for media file synchronization |
US8688631B2 (en) | 2011-03-17 | 2014-04-01 | Alexander Savenok | System and method for media file synchronization |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090024388A1 (en) * | 2007-06-11 | 2009-01-22 | Pandiscio Jill A | Method and apparatus for searching a music database |
TWI383693B (en) * | 2008-10-31 | 2013-01-21 | Hon Hai Prec Ind Co Ltd | Testing device capable of testing audio formats supported by an audio player device and method thereof |
US8584198B2 (en) * | 2010-11-12 | 2013-11-12 | Google Inc. | Syndication including melody recognition and opt out |
US9183849B2 (en) | 2012-12-21 | 2015-11-10 | The Nielsen Company (Us), Llc | Audio matching with semantic audio recognition and report generation |
US9195649B2 (en) | 2012-12-21 | 2015-11-24 | The Nielsen Company (Us), Llc | Audio processing techniques for semantic audio recognition and report generation |
US9158760B2 (en) | 2012-12-21 | 2015-10-13 | The Nielsen Company (Us), Llc | Audio decoding with supplemental semantic audio recognition and report generation |
KR102161237B1 (en) * | 2013-11-25 | 2020-09-29 | 삼성전자주식회사 | Method for outputting sound and apparatus for the same |
TWI579716B (en) * | 2015-12-01 | 2017-04-21 | Chunghwa Telecom Co Ltd | Two - level phrase search system and method |
CN106098058B (en) * | 2016-06-23 | 2018-09-07 | 腾讯科技(深圳)有限公司 | Tone line generation method and device |
US9922631B2 (en) * | 2016-06-24 | 2018-03-20 | Panasonic Automotive Systems Company of America, a division of Panasonic Corporation of North America | Car karaoke |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4915001A (en) * | 1988-08-01 | 1990-04-10 | Homer Dillard | Voice to music converter |
US6057502A (en) * | 1999-03-30 | 2000-05-02 | Yamaha Corporation | Apparatus and method for recognizing musical chords |
US6510410B1 (en) * | 2000-07-28 | 2003-01-21 | International Business Machines Corporation | Method and apparatus for recognizing tone languages using pitch information |
US20040060424A1 (en) * | 2001-04-10 | 2004-04-01 | Frank Klefenz | Method for converting a music signal into a note-based description and for referencing a music signal in a data bank |
US6938209B2 (en) * | 2001-01-23 | 2005-08-30 | Matsushita Electric Industrial Co., Ltd. | Audio information provision system |
US20080236364A1 (en) * | 2007-01-09 | 2008-10-02 | Yamaha Corporation | Tone processing apparatus and method |
US7488886B2 (en) * | 2005-11-09 | 2009-02-10 | Sony Deutschland Gmbh | Music information retrieval using a 3D search algorithm |
US20090064851A1 (en) * | 2007-09-07 | 2009-03-12 | Microsoft Corporation | Automatic Accompaniment for Vocal Melodies |
US7544881B2 (en) * | 2005-10-28 | 2009-06-09 | Victor Company Of Japan, Ltd. | Music-piece classifying apparatus and method, and related computer program |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001069575A1 (en) * | 2000-03-13 | 2001-09-20 | Perception Digital Technology (Bvi) Limited | Melody retrieval system |
US6735563B1 (en) * | 2000-07-13 | 2004-05-11 | Qualcomm, Inc. | Method and apparatus for constructing voice templates for a speaker-independent voice recognition system |
US7031980B2 (en) * | 2000-11-02 | 2006-04-18 | Hewlett-Packard Development Company, L.P. | Music similarity function based on signal analysis |
CA2563478A1 (en) * | 2004-04-16 | 2005-10-27 | James A. Aman | Automatic event videoing, tracking and content generation system |
US20070195963A1 (en) * | 2006-02-21 | 2007-08-23 | Nokia Corporation | Measuring ear biometrics for sound optimization |
US8750484B2 (en) * | 2007-03-19 | 2014-06-10 | Avaya Inc. | User-programmable call progress tone detection |
- 2006-05-23: US11/439,760 (US), published as US20070276668A1; status: not active (Abandoned)
- 2007-05-22: PCT/SG2007/000140 (WO), published as WO2007136349A1; status: active (Application Filing)
- 2007-05-22: CN2007800190803A (CN), published as CN101454778B; status: active
- 2007-05-22: US12/301,878 (US), published as US8892565B2; status: active
- 2007-05-23: TW096118334A (TW), published as TWI454942B; status: not active (IP Right Cessation)
Also Published As
Publication number | Publication date |
---|---|
TWI454942B (en) | 2014-10-01 |
US8892565B2 (en) | 2014-11-18 |
CN101454778B (en) | 2011-12-07 |
US20110238666A1 (en) | 2011-09-29 |
TW200813759A (en) | 2008-03-16 |
WO2007136349A1 (en) | 2007-11-29 |
CN101454778A (en) | 2009-06-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20070276668A1 (en) | Method and apparatus for accessing an audio file from a collection of audio files using tonal matching | |
US6476306B2 (en) | Method and a system for recognizing a melody | |
US8352268B2 (en) | Systems and methods for selective rate of speech and speech preferences for text to speech synthesis | |
US8712776B2 (en) | Systems and methods for selective text to speech synthesis | |
US8396714B2 (en) | Systems and methods for concatenation of words in text to speech synthesis | |
US8352272B2 (en) | Systems and methods for text to speech synthesis | |
US7908338B2 (en) | Content retrieval method and apparatus, communication system and communication method | |
US8380507B2 (en) | Systems and methods for determining the language to use for speech generated by a text to speech engine | |
US20100082327A1 (en) | Systems and methods for mapping phonemes for text to speech synthesis | |
US20100082329A1 (en) | Systems and methods of detecting language and natural language strings for text to speech synthesis | |
US20100082328A1 (en) | Systems and methods for speech preprocessing in text to speech synthesis | |
EP1934828A2 (en) | Method and system to control operation of a playback device | |
US20070288517A1 (en) | Information processing system, terminal device, information processing method, and program | |
RU2381548C2 (en) | Method and system for providing music-related information by using audio dna | |
US20140129235A1 (en) | Audio tracker apparatus | |
WO2008089647A1 (en) | Music search method based on querying musical piece information | |
JP2012103832A (en) | Information processor, method, information processing system and program | |
US20090247096A1 (en) | Method And System For Integrated FM Recording | |
KR20080083290A (en) | Method and apparatus for accessing digital files in a collection of digital files | |
CN107679196A (en) | A kind of multimedia recognition methods, electronic equipment and storage medium | |
KR101576683B1 (en) | Method and apparatus for playing audio file comprising history storage | |
KR20080014188A (en) | Music file retrieval system and its method in mobile terminal | |
KR20090062548A (en) | Content retrieval method and mobile communication terminal using same | |
WO2006095847A1 (en) | Contents acquiring device, method used in such contents acquiring device, program used in such contents acquiring device, and recording medium with such program recorded therein | |
KR20140092028A (en) | Song recommendation system and terminal and song recommendation method using the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: CREATIVE TECHNOLOGY LTD., SINGAPORE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: XU, JUN; ZHANG, HUAYUN; REEL/FRAME: 018022/0931. Effective date: 20060718 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |