CN101271457A - A melody-based music retrieval method and device - Google Patents
- Publication number
- CN101271457A (application CN200710064607A, CNA2007100646076A)
- Authority
- CN
- China
- Prior art keywords
- music
- melody
- client
- search
- server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Reverberation, Karaoke And Other Acoustics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Electrophonic Musical Instruments (AREA)
Abstract
The invention discloses a digital music retrieval method and device that use a musical melody as the search key and can find music containing a specified melody. The invention gives users two ways to input a melody: playing and humming. For humming input, a series of signal processing methods analyze the hummed audio signal and extract the melody information from it. The music library is indexed with an inverted-index algorithm to improve search efficiency. The device is divided into a server side and a client side. The server maintains the music database and its index and responds to client query requests; the client collects the user's melody input and receives and displays the server's query results. Searching for music by melody makes up for the shortcomings of traditional text-based search, letting users find the music they want without knowing any text information about it; users can search with common devices such as computers and mobile phones.
Description
Technical field
The invention belongs to the field of computer technology applications. Specifically, it relates to a retrieval method for digital music that uses a melody as the search key, and to the computer hardware and communication equipment that allow the method to run.
Background art
As the amount of information on the Internet grows geometrically, finding the information we need quickly and accurately in a massive information base has become a major bottleneck in using the Internet. Content-based multimedia retrieval is an emerging research field that offers a new way of searching: using multimedia itself to search for multimedia information. Multimedia information takes many forms, including audio, video, images and animation, and audio accounts for a considerable share of it. Within audio, music is the most common form. Current music retrieval relies mainly on text keywords such as the title, composer, singer, album, genre and lyrics. Music itself, however, is fundamentally different from text keywords. Keyword search presupposes that the user already knows something about the target music and is familiar with the text information associated with it. If the user is interested only in the melody and knows nothing about the title, lyrics or other text information, existing music search methods are of no help.
Summary of the invention
Existing keyword-based music retrieval is of no help when the text keywords of the target music are unknown. To solve this problem of the prior art, the purpose of the invention is to provide a melody-based digital music retrieval method and device.
To achieve this purpose, a first aspect of the invention provides a melody-based music retrieval method with the following steps:
Step S1: designate a passage of melody from the music being sought as the melody key for the search;
Step S2: input the designated melody key into the query client device, which processes it into a digitized melody signal;
Step S3: index the music in the music library so that the index reflects the melodic features of the music, forming an indexed music database;
Step S4: a search engine compares the digitized melody signal with the melodies in the indexed music database and selects from the database a group of music containing the designated melody key;
Step S5: sort the selected music in descending order of similarity to the melody key.
The melody input methods include playing input and humming input.
The index is built on the melodic features of melody segments.
For the humming input method, the digitized melody signal is obtained by the following steps:
Step S21: collect the user's humming input with an audio capture device;
Step S22: pre-filter the audio signal input by the user, including DC removal, gain normalization and low-pass filtering, to obtain an audio frame sequence;
Step S23: analyze the audio frame sequence in the time domain or the frequency domain and extract the fundamental frequency sequence;
Step S24: process the fundamental frequency sequence further, including linearization and differencing, to obtain the digitized melody signal.
To achieve this purpose, a second aspect of the invention provides a melody-based music retrieval device, comprising:
at least one server that provides an online melody retrieval service;
and at least one client terminal device that sends online melody retrieval requests and receives the results of the server's melody query.
The client comprises:
an input module, used to input the melody information to be searched for and send it to the server; and a search result display module, through which the client obtains the search results from the server over a network or another transmission method and presents them to the user.
The input module comprises:
an audio capture unit, used to capture the user's hummed audio signal; a note capture unit, used to capture the note melody signal played by the user; and an audio signal processing unit, which converts the audio signal captured by the audio capture unit into a melody signal.
The server comprises:
a music data source interface unit, which provides interfaces for accessing various data sources to obtain raw music data; a data acquisition and analysis unit, which collects raw music data and analyzes it to extract melody information; an indexing unit, which indexes the raw music data obtained by the data acquisition and analysis unit according to its melodic features; and a search unit, which receives query requests from the client input module, searches the index generated by the indexing unit for music containing a melody identical or similar to the melody key supplied by the client input module, sorts the result list in descending order of similarity, and returns it to the client's search result display module.
The music data source interface unit provides interfaces for one or more of the following data acquisition methods:
Web: a web crawler roams the Internet automatically and fetches music files and the information related to them; File: music files stored in a local or network file system are fetched and analyzed; Database: music files recorded in a database are extracted and analyzed.
The client is one or more of the following devices:
a personal computer; a smart mobile device, including mobile phones, personal digital assistants and in-vehicle smart terminals; a telephone; an audio-video entertainment device with a media-on-demand function, including karaoke song-selection equipment.
When the client is a personal computer, it downloads and installs a dedicated Web browser plug-in from the server. When the user visits the music retrieval Web site provided by the server, the plug-in provides the user interface for audio capture and note capture, collects the user's query input, and sends it to the server over the Internet.
When the client is a smart mobile device, dedicated software is installed on it. The software provides the user interface for audio capture and note capture, collects the user's query input, and sends it to the server over a wireless network.
When the client is a telephone, the server provides a dedicated telephone voice service. The client dials the number of that service and uses the telephone's numeric keypad as the note capture input device or the telephone handset as the audio capture input device. The server and the client exchange information over the public switched telephone network.
When the client is an audio-video entertainment device with a media-on-demand function, it is equipped with a digital piano keyboard, or has virtual piano keyboard software installed, to capture the user's keyboard note input, and uses the karaoke microphone to capture the user's humming input. The server is a dedicated local server, and the search covers the karaoke system's local music database.
The server sorts the list of music selected by the search in descending order of similarity to the query melody and sends it back to the client for display.
The invention gives users a new way to search: searching for music by melody. It makes up for the shortcomings of traditional text-based search, letting users find the music they want without knowing any text information about it. The invention also implements this search method on concrete hardware platforms, so that users can search for music with common devices such as computers and mobile phones.
Brief description of the drawings
Figure 1 is a schematic diagram of the structure of the invention.
Detailed description of the embodiments
The invention and its advantages are described in detail below with reference to the accompanying drawing. Note that the described embodiments are intended only to aid understanding of the invention and do not limit it in any way.
The invention is concerned with content-based music retrieval and provides a way to search for music using music itself. Specifically, a short passage of melody serves as the search key, and the search engine returns a group of music containing that melody. Unlike a text keyword, a melody cannot be typed directly on a keyboard, so a special method for inputting melodies is required. The most natural method is humming: the user simply hums the melody to be found into an audio capture device such as a microphone. The user can also play the melody on a virtual piano keyboard.
The embodiment of the invention provides a complete computer application platform whose function is to offer a melody-based music search service. The platform implements raw music data acquisition, raw music data analysis, music database indexing, online querying, audio signal processing and result feedback. It supports humming input and piano keyboard input of melodies on terminal devices such as ordinary personal computers, smart mobile devices, telephones and karaoke song-selection equipment, and it can display or play back the search results to the user on those same devices.
The invention combines several functional modules, each of which performs a specific function. The complete structure of the system is shown in Figure 1. The melody-based music retrieval device of the invention includes at least one computer acting as server 2, which provides the online music retrieval service, and at least one client 1 terminal device, which sends online music retrieval requests and receives the query results from server 2. Server 2 obtains music from several kinds of data sources, stores a melody database containing a large number of melodic features, and builds an index over that database. When it receives a query request from client 1, server 2 compares the melody segment input by the user with the melodies in the database, filters out the music unrelated to the query segment, ranks the remaining candidates by their similarity to the query segment, and returns the sorted music list to the client. Client 1 offers the user two input interfaces, receives the user's melody input, and converts it into a digitized melody signal that can be used for the query.
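In the structure diagram of Figure 1, the components in the dashed box on the left are the modules of the client 1 terminal device: input module 11 collects the user's input and sends it to server 2, and search result display module 12 presents the query results returned by server 2 to the user.
Input module 11 comprises audio capture unit 111 and note capture unit 113, which capture the user's humming input and playing input respectively, and audio signal processing unit 112, which converts the audio signal captured by audio capture unit 111 into melody information.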
Audio capture unit 111 captures the user's humming input. It consists of an audio capture device and a recording program. On personal computers and karaoke terminals the audio capture device is usually a microphone; on communication terminals such as mobile phones it is usually the handset microphone. Driven by the recording software, it digitizes the analog audio waveform at the sampling frequency specified by the software and stores the resulting sample sequence in the memory of client 1. The fundamental frequency (first harmonic) of the human voice is usually below 2000 Hz. By the Nyquist sampling theorem, the sampling frequency should be greater than twice the highest frequency of interest so that the digitized signal is free of aliasing. Because the invention needs to analyze the harmonics of the voice, a sampling frequency of 8000 Hz or 11025 Hz is used. Each capture by audio capture unit 111 lasts 10 seconds by default, and this length can be adjusted as needed.
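The sampling-rate reasoning above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the function names are hypothetical, and the 2000 Hz ceiling and 10-second capture length are the figures quoted in the text.

```python
def min_sampling_rate(highest_frequency_hz):
    """Nyquist bound: the sampling rate must exceed twice this value."""
    return 2.0 * highest_frequency_hz


def is_alias_free(sampling_rate_hz, highest_frequency_hz):
    """True when the sampling rate is strictly above the Nyquist bound."""
    return sampling_rate_hz > min_sampling_rate(highest_frequency_hz)


def samples_per_capture(sampling_rate_hz, seconds=10):
    """Number of samples stored for one capture of the given length."""
    return int(sampling_rate_hz * seconds)


for rate in (8000, 11025):
    print(rate, is_alias_free(rate, 2000.0), samples_per_capture(rate))
```

Both quoted rates clear the bound for a 2000 Hz fundamental with room to spare, which leaves headroom for analyzing harmonics.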
Audio signal processing unit 112 converts the audio signal captured by audio capture unit 111 into melody information. It processes the audio signal as follows:
Step 1): The audio signal collected by audio capture unit 111 usually contains a DC component, which shifts the signal's equilibrium level and introduces errors into the analysis of the low-frequency spectrum, so the DC component must be removed. Because a DC signal is time-invariant, it can be removed by subtracting the global equilibrium level of the sampled signal from the value of every sample. To remove errors caused by differences in signal strength, audio signal processing unit 112 also normalizes the signal strength: the maximum energy of a captured signal is set to 1, and all other points are scaled up or down proportionally relative to that point, so that every capture has the same maximum energy. In addition, passing the sampled signal through a low-pass filter suppresses high-frequency noise and improves the signal-to-noise ratio.
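A minimal sketch of the three pre-filtering operations in step 1) follows. It assumes that the "global equilibrium level" is the mean of the samples, that normalization scales the peak amplitude to 1, and that a simple one-pole filter stands in for the low-pass filter, which the text does not specify. All names and the `alpha` parameter are illustrative.

```python
def remove_dc(samples):
    """Subtract the mean so the signal is centred on zero."""
    if not samples:
        return []
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def normalize_gain(samples):
    """Scale so the largest absolute sample equals 1 (silence is left as is)."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    return [s / peak for s in samples]


def low_pass(samples, alpha=0.5):
    """One-pole low-pass filter; an assumed stand-in for the unspecified filter."""
    out = []
    previous = 0.0
    for s in samples:
        previous = previous + alpha * (s - previous)
        out.append(previous)
    return out


def prefilter(samples, alpha=0.5):
    """DC removal, then gain normalization, then low-pass filtering."""
    return low_pass(normalize_gain(remove_dc(samples)), alpha)
```

A smaller `alpha` smooths more aggressively; a real system would pick the cutoff from the sampling rate and the voice band.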
Step 2): The signal processed in step 1) is divided into frames, with some overlap between adjacent frames. In speech signal processing, each frame is usually no longer than 200 milliseconds, so that each frame can be treated approximately as a stationary signal. Each frame is then windowed with a Hanning window.
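The framing and windowing of step 2) might be sketched as follows. The window uses the standard symmetric Hann definition w(n) = 0.5 (1 - cos(2πn / (N - 1))) for a frame of N samples, which is an assumption here because the formula itself does not appear in this text. The function names, frame length and hop length are likewise illustrative.

```python
import math


def hann_window(length):
    """Standard symmetric Hann window of the given length."""
    if length == 1:
        return [1.0]
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (length - 1)))
            for n in range(length)]


def split_into_frames(samples, frame_length, hop_length):
    """Cut the signal into frames; hop_length < frame_length gives overlap."""
    frames = []
    start = 0
    while start + frame_length <= len(samples):
        frames.append(samples[start:start + frame_length])
        start += hop_length
    return frames


def windowed_frames(samples, frame_length, hop_length):
    """Apply the Hann window to every frame."""
    window = hann_window(frame_length)
    return [[s * w for s, w in zip(frame, window)]
            for frame in split_into_frames(samples, frame_length, hop_length)]
```

With a hop of half the frame length, adjacent frames overlap by 50 percent, which is a common choice but not one the text prescribes.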
Step 3): The Fourier transform converts a time-domain signal into a frequency-domain signal. In the frequency domain, the distribution of the signal's energy across frequency components can be seen clearly and directly. In this step, the fast Fourier transform (FFT) algorithm transforms each frame processed in step 2) into the complex frequency domain, giving a complex value for each frequency component. Each complex value has a real part and an imaginary part; the sum of their squares is the energy, which indicates how strong the frame is at that frequency. The fast Fourier transform requires the number of input samples to be 2^N. If a frame from step 2) has fewer than 2^N samples, the missing points are filled with zeros.
Step 4): In the frequency-domain distribution of each frame from step 3), if an energy peak can be found in the voice band and it clearly exceeds the energy of the background noise, the frequency of the first peak that meets this condition is taken as the fundamental frequency of the voice. The fundamental frequencies of adjacent frames are then compared: if the change is small, the frames are treated as the same note; if the change is large, it is treated as a transition between notes. Silent frames can also serve as note boundaries.
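One possible reading of step 4) is sketched below. The text gives no thresholds, so the voice band limits, the use of the median bin energy as the noise estimate, the `noise_factor` of 10 and the 3 percent pitch tolerance are all assumptions, as are the function names.

```python
def first_peak_frequency(power, sampling_rate, fft_size,
                         band=(80.0, 2000.0), noise_factor=10.0):
    """Frequency of the first local peak in the band that stands well above
    the noise estimate, or None if the frame looks silent."""
    bin_hz = sampling_rate / fft_size
    low = max(1, int(band[0] / bin_hz))
    high = min(len(power) - 2, int(band[1] / bin_hz))
    noise_floor = sorted(power)[len(power) // 2]  # median energy as noise estimate
    for k in range(low, high + 1):
        is_local_peak = power[k] > power[k - 1] and power[k] >= power[k + 1]
        if is_local_peak and power[k] > noise_factor * noise_floor:
            return k * bin_hz
    return None


def segment_notes(frame_pitches, tolerance=0.03):
    """Group per-frame pitches (Hz, None for silence) into notes.
    A new note starts after silence or when the pitch changes by more
    than `tolerance` relative to the previous frame."""
    notes = []
    current = []
    for pitch in frame_pitches:
        if pitch is None:
            if current:
                notes.append(sum(current) / len(current))
                current = []
            continue
        if current and abs(pitch - current[-1]) / current[-1] > tolerance:
            notes.append(sum(current) / len(current))
            current = []
        current.append(pitch)
    if current:
        notes.append(sum(current) / len(current))
    return notes
```

Each returned note is the average pitch of the frames grouped into it; note durations are ignored in this sketch.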
Step 5): For each pair of adjacent notes, the difference of the logarithms of their frequencies is computed, giving the differential feature sequence of the melody. Taking the logarithm linearizes the frequency values, which grow exponentially with pitch, so that the interval between two notes is proportional to the difference of their log frequencies. Using the log-frequency difference of the notes as the melodic feature removes the differences that arise when different users hum in different keys.
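The key-independence claimed in step 5) can be illustrated with a short sketch. Expressing the log-frequency difference in semitones, 12 × log2(f2 / f1), is an assumption made here for readability; the text only says that a logarithmic difference is used. The function names are hypothetical.

```python
import math


def semitone_intervals(note_frequencies):
    """Log-frequency difference between adjacent notes, in semitones."""
    return [12.0 * math.log2(b / a)
            for a, b in zip(note_frequencies, note_frequencies[1:])]


def rounded_intervals(note_frequencies):
    """Intervals rounded to whole semitones, tolerating slightly off-pitch humming."""
    return [round(x) for x in semitone_intervals(note_frequencies)]


melody = [261.63, 293.66, 329.63]        # roughly C4, D4, E4
transposed = [f * 1.5 for f in melody]   # the same tune hummed a fifth higher
print(rounded_intervals(melody), rounded_intervals(transposed))
```

Because only frequency ratios enter the calculation, multiplying every frequency by the same factor leaves the interval sequence unchanged.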
After these five steps, the hummed audio has been converted into melodic feature information, which can be sent to the server as the search key. The fundamental frequency extraction above can equally be done with a time-domain method, such as the autocorrelation method.
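Note capture unit 113 provides an interface for playing a melody on a piano keyboard. It displays a piano keyboard on the client 1 terminal device, and the user enters the melody by clicking the keys with a mouse or another pointing device such as a touch screen or stylus. Note capture unit 113 numbers every key of the piano in order of pitch and uses the number as the key's ID. The difference between the IDs of two consecutively clicked keys is a note interval with the same meaning as the output of audio signal processing unit 112, and it is sent to server 2 as the melodic feature. Note information captured from the piano keyboard needs no signal processing, so keyboard input is error-free and fast.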
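The key-ID differencing described above is a one-line computation. In this sketch the IDs are assumed to number every key, black and white, in pitch order, so that a difference of 1 is one semitone; the function name is illustrative.

```python
def key_id_intervals(key_ids):
    """Difference between the IDs of consecutively pressed keys."""
    return [b - a for a, b in zip(key_ids, key_ids[1:])]


pressed = [40, 42, 44, 40]               # hypothetical key IDs
shifted = [k + 5 for k in pressed]       # same tune started five keys higher
print(key_id_intervals(pressed), key_id_intervals(shifted))
```

As with hummed input, starting the tune on a different key does not change the interval sequence.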
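Because an ordinary telephone has no data processing capability, when the terminal is a telephone, audio signal processing unit 112 runs on server 2 and the client 1 telephone only collects the user's input. For humming input, the user uses the telephone handset as audio capture unit 111, and the audio signal is transmitted to the server over the public switched telephone network (PSTN). For keyboard input, the user enters the melody in numbered musical notation on the telephone's numeric keypad. Server 2 and client 1 exchange information over the PSTN: after server 2 receives the key-press signals, it converts them into the corresponding musical notes and plays them back to the user so that the user can correct them.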
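How the server might turn keypad digits into the same interval features is sketched below. The text does not describe the mapping, so this assumes the usual numbered-notation convention in which digits 1 to 7 stand for the seven degrees of a major scale, and it ignores octave marks and rests. The table and function name are hypothetical.

```python
# Semitone offset of each scale degree above the tonic (major scale).
MAJOR_SCALE_SEMITONES = {"1": 0, "2": 2, "3": 4, "4": 5, "5": 7, "6": 9, "7": 11}


def keypad_to_intervals(digits):
    """Convert a string of keypad digits into semitone intervals.
    Digits outside 1-7 are skipped in this simplified sketch."""
    pitches = [MAJOR_SCALE_SEMITONES[d] for d in digits if d in MAJOR_SCALE_SEMITONES]
    return [b - a for a, b in zip(pitches, pitches[1:])]


print(keypad_to_intervals("1155665"))
```

The result has the same form as the intervals produced from hummed or keyboard input, so the same search can serve all three input paths.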
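Through search result display module 12, client 1 obtains the results of the melody search from server 2 over a network or another transmission method. The results are presented as a list. Each item in the list is the name (title) of a piece of music together with information such as its composer and singer. The music in the list is sorted in descending order of similarity.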
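In the structure diagram of Figure 1, the dashed box on the right is server 2. It comprises music data source interface unit 21, data acquisition and analysis unit 22, indexing unit 23 and search unit 24, which collect data, analyze data and build the index in the background, and carry out the search computation online.
Data acquisition and analysis unit 22 collects raw music data files and analyzes them to extract melody information. The music file format directly supported by the invention is MIDI, so data acquisition and analysis unit 22 mainly analyzes MIDI music files. The MIDI file format stores the elements of music, such as pitch, duration, timbre and rhythm, as digital instructions. By parsing the sequence of musical instructions in a MIDI file, the parameters of the music can be extracted easily and precisely. A MIDI music file can be viewed as a layered structure. There are two common MIDI file formats: the single-track format (Type 0) and the multi-track format (Type 1). In the single-track format, each file contains one track, each track has 16 channels, and each channel can hold one instrument. During playback the 16 channels play simultaneously. The single-track format plays at most 16 instruments at once, which is enough for ordinary digital music. In the multi-track format, each file contains several tracks. Each track also has 16 channels, but only one channel per track is active and the others are empty. The tracks likewise play simultaneously. The multi-track format can play more than 16 instruments at once, so it is often used for more expressive digital music. Data acquisition and analysis unit 22 unifies the two formats into a four-level hierarchy of MIDI file, track, channel and note, in which each upper-level element is a collection of lower-level elements. Every non-empty channel contains a sequence of notes. Data acquisition and analysis unit 22 converts each music file into an object with this layered structure and also stores the music's fingerprint information, title, composer and other related information.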
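The four-level hierarchy described above could be modelled with plain data classes, as in the sketch below. The class and field names are illustrative and are not taken from the patent; parsing real MIDI files is outside the scope of this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    pitch: int        # MIDI note number, 0-127
    start: float      # onset time in beats
    duration: float   # length in beats


@dataclass
class Channel:
    number: int
    notes: List[Note] = field(default_factory=list)


@dataclass
class Track:
    channels: List[Channel] = field(default_factory=list)


@dataclass
class MidiSong:
    title: str
    author: str
    fingerprint: str
    tracks: List[Track] = field(default_factory=list)

    def melodies(self):
        """Pitch sequence of every non-empty channel, in track order."""
        return [[note.pitch for note in channel.notes]
                for track in self.tracks
                for channel in track.channels
                if channel.notes]


song = MidiSong(
    title="Example", author="Unknown", fingerprint="abc123",
    tracks=[Track(channels=[
        Channel(0, [Note(60, 0.0, 1.0), Note(62, 1.0, 1.0), Note(64, 2.0, 1.0)]),
        Channel(1),  # empty channel, ignored by melodies()
    ])],
)
print(song.melodies())
```

A Type 0 file maps to one `Track` with up to 16 channels, and a Type 1 file maps to several tracks with one active channel each, so both fit the same structure.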
The job of any search engine is to return, within an acceptable time, a list of information that matches the user's query. Three concepts here deserve attention:
1) Acceptable time. This refers to the response time. For software that serves a large number of users over the Internet, the response time cannot be long; it is usually on the order of seconds. This is a basic measure of a search engine's usability and one of the ways it differs from a traditional information retrieval system. Moreover, the response time requirement must be met not just for a single user's query but for all users under the system's design load. In other words, the system should guarantee second-level response times at its rated throughput.
2) Match. For Web pages, a match means that the page contains the user's query keywords in some form, or contains content very close to them. In a melody-based music search engine, a match means that the main melody of a piece of music contains the melody key input by the user. The user's input usually deviates somewhat from the target melody, so matching must support not only exact matches but also a degree of fault tolerance.
3) List. The results a search engine returns to the user are usually a list of several items, each of which is similar or related to the user's keywords to some degree. Most users, however, look only at the items on the first page of the results, so the items in the result list must be sorted by similarity or relevance. This sorting is called ranking. Different search engines currently use different ranking algorithms. Google, for example, uses the PageRank algorithm, which ranks the pages in the results by importance, while Baidu uses paid ranking.
In a search engine, the quality of the indexing algorithm has a decisive effect on all three of these measures. Most current melody-based music search engines use linear matching algorithms, which treat the user's input melody and the melody in a music file as two strings and compare their similarity. Methods commonly used in content-based music search include Suffix Tree, Suffix Array and Linear Alignment. Linear search has a common defect, however: during the search, every element in the database must be scanned to determine whether it matches. This is acceptable when the database is small, but as the database grows, the search time grows at least linearly even in the best case; that is, the time complexity of the search is at least O(n). For the Suffix Array algorithm, for example, the time complexity is O(n log n). The data volume of today's large search engines is typically on the order of 10^8 to 10^9 items, and a linear scan of a database that large would take longer than users can accept. Large search engines therefore generally use an inverted index.
Among the many search algorithms, the inverted index has quickly come into wide use because it is flexible, efficient and general. It is a word-based indexing algorithm that can use the keywords entered by the user to filter out irrelevant content in the database directly, rank the relevant content by relevance, and, thanks to its good fault tolerance, recognize approximate matches.
In text written in most languages, words are separated by natural delimiters such as spaces and punctuation marks. For languages without natural word boundaries, such as Chinese, mature word segmentation techniques exist. An inverted index groups together the occurrences of the same word across different articles: the word is the key of the index, and the articles containing that word form its element list. When a query contains certain words, the system looks up the articles listed under those words directly, and articles unrelated to the query are filtered out automatically. This automatic filtering does not consume CPU resources scanning irrelevant articles, so it is very efficient. This efficient, automatic filtering of irrelevant information is the advantage of the inverted index as a data structure.
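A minimal text inverted index along the lines described above might look like this. The sketch splits on whitespace only and treats a query as a union of its words; both choices, and the function names, are simplifying assumptions.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def lookup(index, query):
    """Return the IDs of documents containing at least one query word.
    Documents that share no word with the query are never visited."""
    result = set()
    for word in query.lower().split():
        result |= index.get(word, set())
    return result


documents = {1: "blue moon rising", 2: "red moon", 3: "green field"}
index = build_inverted_index(documents)
print(sorted(lookup(index, "moon")))
```

The lookup cost depends on the number of query words and the length of their posting lists, not on the total number of documents.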
In a music search engine system, the object of the search is a musical melody rather than text. The text-based inverted index model therefore needs some modification to suit the indexing of melodies.
A musical melody consists of a continuous sequence of notes. Music does have bars that divide a piece into short sections, but the MIDI music format carries no explicit bar separators. Rests resemble the spaces in text, but across different styles of music they occur quite randomly and follow no clearly recognizable pattern. Natural musical separators such as bars and rests are therefore not suitable for dividing a melody.
Because no good word segmentation mechanism has yet been found for musical melody itself, the present invention uses a melody fragment segmentation method. A continuous melody is cut into small fragments of 3 to 4 notes each, with some overlap between adjacent fragments. The invention treats these melody fragments as the words of a melody and builds the index with the inverted index algorithm. When a new music track needs to be added to the index, the track only has to be split into melody fragments and added to the element set of each of those fragments.
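The fragment segmentation and indexing steps above can be sketched as follows. This is a minimal sketch under stated assumptions: notes are represented as MIDI pitch numbers, fragments are 3 notes long, and the window advances by 1 note (so adjacent fragments overlap by 2). The patent only says 3 to 4 notes with some overlap; the names `split_melody` and `add_track` and the sample tracks are hypothetical.

```python
from collections import defaultdict


def split_melody(notes, size=3, step=1):
    """Cut a note sequence into overlapping fragments of `size` notes.

    `size` and `step` are assumed values; a melody shorter than `size`
    yields no fragments.
    """
    return [
        tuple(notes[i:i + size])
        for i in range(0, len(notes) - size + 1, step)
    ]


def add_track(index, track_id, notes, size=3, step=1):
    """Add a track to the element set of every fragment it contains."""
    for fragment in split_melody(notes, size, step):
        index[fragment].add(track_id)


# Hypothetical tracks, written as MIDI pitch numbers.
index = defaultdict(set)
add_track(index, "song_a", [60, 62, 64, 65, 67])
add_track(index, "song_b", [62, 64, 65, 69])

fragments_a = split_melody([60, 62, 64, 65, 67])
shared = index[(62, 64, 65)]
```

Here the fragment (62, 64, 65) plays the role of a word that occurs in both tracks, so both appear in its element set. Adding a further track touches only the sets of that track's own fragments.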
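The index compilation unit 23 builds the index by the above method, treating melody fragments as the words of a musical melody, and indexes the music data provided by the data acquisition and analysis unit 22.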
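The search unit 24 receives query requests from the client input module 11 and searches online, in the index generated by the index compilation unit 23, for music whose melody is identical or similar to the melody information submitted through the audio collection unit 111 or the note collection unit 113 of client 1. It sorts the result list in descending order of similarity and returns it to the search result display module 12 of the client.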
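As mentioned above, ranking search results by similarity is an important function of a search engine. The search unit 24 computes similarity from the number of identical notes in the client's query string and a melody string in the music library: the more notes they share, the more similar the two are.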
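One possible reading of this note-counting similarity is sketched below. The patent does not say how the two strings are aligned before identical notes are counted, so this sketch assumes the query is slid along the melody and the best position-wise match count is taken. The names `shared_notes` and `rank` and the sample library are hypothetical.

```python
def shared_notes(query, melody):
    """Largest number of position-wise equal notes over all alignments.

    The sliding alignment is an assumption; the source only states that
    similarity grows with the number of identical notes.
    """
    best = 0
    for offset in range(-len(query) + 1, len(melody)):
        count = sum(
            1
            for i, note in enumerate(query)
            if 0 <= offset + i < len(melody) and melody[offset + i] == note
        )
        best = max(best, count)
    return best


def rank(query, library):
    """Return (track id, score) pairs, most similar first."""
    scored = [
        (shared_notes(query, notes), track_id)
        for track_id, notes in library.items()
    ]
    # Descending by score; ties broken by track id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(track_id, score) for score, track_id in scored]


# Hypothetical library, written as MIDI pitch numbers.
library = {
    "song_a": [60, 62, 64, 65, 67],
    "song_b": [62, 64, 65, 69],
    "song_c": [70, 72, 74],
}
ranking = rank([62, 64, 65, 67], library)
```

In a full system the candidates passed to `rank` would presumably be only the tracks returned by the index lookup, not the whole library, which is what keeps the scoring step cheap.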
The search unit 24 uses different interaction modes depending on the type of client 1 device.
When client 1 is a personal computer, the client downloads and installs a dedicated Web browser plug-in from the server. The plug-in integrates the recording program of the audio collection module 111 and the virtual piano keyboard program of the note collection module 113. When the user visits the music retrieval Web site provided by the server, the plug-in presents a user interface for audio input and note input, collects the user's query, and sends it to the server over the Internet.
When client 1 is a smart mobile device, client 1 installs dedicated software developed for the operating system platform of the user's device (such as Windows Mobile, Linux, Nokia S60, or Java). The software presents a user interface for audio input and note input, collects the user's query, and sends it to the server over the wireless network.
When client 1 is a telephone, server 2 provides a dedicated telephone voice service line. Client 1 dials the number of that line and enters the query with the telephone's numeric keypad, or uses the telephone handset as the audio input device. Server 2 and client 1 exchange information over the public switched telephone network (PSTN).
When client 1 is an audio-video entertainment device with media-on-demand functions, client 1 is equipped with a hardware digital piano keyboard, or has virtual piano keyboard software installed, to collect the user's keyboard note input, and uses the karaoke microphone to collect the user's humming input. Server 2 is a dedicated local server, and the search covers the local karaoke music library.
For computers and smart mobile devices, the search results are presented to the user as a list, and the user may download or play the music, provided that the intellectual property rights in the musical works are not infringed. For a telephone client 1, server 2 reads the result list aloud as voice prompts, and the user selects an entry with the telephone keys. For a jukebox client 1, the user can reserve or request a song after selecting it.
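The music data source interface unit 21 provides several different data source access interfaces, so that the server can obtain raw music data from different data sources and extend the music database according to specific uses and needs, for example: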
1. Web crawling: roaming the Internet automatically and fetching music files together with the information related to them; or
2. Fetching and analyzing files stored in a local or network file system; or
3. Extracting and analyzing the music records in a database.
The present invention is not limited to the three data sources above. It provides an application program interface (API) open to secondary development, through which further data sources can be added.
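An extensible data source interface of this kind might be shaped as in the sketch below. The patent does not specify the API, so every name here (`MusicDataSource`, `fetch`, `FileSystemSource`, `InMemorySource`, `collect`) is a hypothetical illustration; `InMemorySource` merely stands in for a database-backed source, and a Web crawling source is omitted.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, Iterable, Iterator, Tuple


class MusicDataSource(ABC):
    """Hypothetical plug-in interface: yields (track name, raw bytes) pairs."""

    @abstractmethod
    def fetch(self) -> Iterator[Tuple[str, bytes]]:
        ...


class FileSystemSource(MusicDataSource):
    """Reads music files below a directory on a local or network file system."""

    def __init__(self, root: str, pattern: str = "*.mid") -> None:
        self.root = Path(root)
        self.pattern = pattern

    def fetch(self) -> Iterator[Tuple[str, bytes]]:
        for path in sorted(self.root.rglob(self.pattern)):
            yield path.stem, path.read_bytes()


class InMemorySource(MusicDataSource):
    """Stand-in for a database source; wraps records already held in memory."""

    def __init__(self, records: Dict[str, bytes]) -> None:
        self.records = records

    def fetch(self) -> Iterator[Tuple[str, bytes]]:
        yield from sorted(self.records.items())


def collect(sources: Iterable[MusicDataSource]) -> Dict[str, bytes]:
    """Merge the tracks of every registered source into one dictionary."""
    library: Dict[str, bytes] = {}
    for source in sources:
        for name, data in source.fetch():
            library[name] = data
    return library
```

Under this design a new kind of source is added by subclassing `MusicDataSource` and implementing `fetch`; the code that consumes `collect` does not change.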
The above describes embodiments for implementing the present invention. Those skilled in the art should understand that any modification or partial replacement that does not depart from the scope of the present invention falls within the scope defined by the claims of the present invention.
Claims (15)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2007100646076A CN101271457B (en) | 2007-03-21 | 2007-03-21 | A melody-based music retrieval method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101271457A (en) | 2008-09-24 |
CN101271457B (en) | 2010-09-29 |
Family
ID=40005434
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2007100646076A Expired - Fee Related CN101271457B (en) | 2007-03-21 | 2007-03-21 | A melody-based music retrieval method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101271457B (en) |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1703734A (en) * | 2002-10-11 | 2005-11-30 | 松下电器产业株式会社 | Method and apparatus for determining musical notes from sounds |
Cited By (57)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101552000B (en) * | 2009-02-25 | 2012-07-04 | 北京派瑞根科技开发有限公司 | Music similarity processing method |
CN101552001B (en) * | 2009-02-25 | 2012-07-04 | 北京派瑞根科技开发有限公司 | Network searching system and information searching method |
CN101552003B (en) * | 2009-02-25 | 2012-07-04 | 北京派瑞根科技开发有限公司 | Media information processing method |
CN102549575A (en) * | 2009-09-30 | 2012-07-04 | 索尼爱立信移动通讯有限公司 | Method for identifying and playing back an audio recording |
CN101916250B (en) * | 2010-04-12 | 2011-10-19 | 电子科技大学 | Humming-based music retrieving method |
CN101916250A (en) * | 2010-04-12 | 2010-12-15 | 电子科技大学 | A Music Retrieval Method Based on Humming |
CN102375834B (en) * | 2010-08-17 | 2016-01-20 | 腾讯科技(深圳)有限公司 | Audio file search method, system and audio file type recognition methods, system |
CN102375834A (en) * | 2010-08-17 | 2012-03-14 | 腾讯科技(深圳)有限公司 | Audio file retrieving method and system as well as audio file type identification method and system |
CN102411578A (en) * | 2010-09-25 | 2012-04-11 | 盛乐信息技术(上海)有限公司 | Multimedia playing system and method |
CN101980197A (en) * | 2010-10-29 | 2011-02-23 | 北京邮电大学 | A multi-layer filter audio retrieval method and device based on long-term structural voiceprint |
CN101980197B (en) * | 2010-10-29 | 2012-10-31 | 北京邮电大学 | A multi-layer filter audio retrieval method and device based on long-term structural voiceprint |
EP2638520A4 (en) * | 2010-11-12 | 2014-05-21 | Google Inc | Media rights management using melody identification |
CN102332262A (en) * | 2011-09-23 | 2012-01-25 | 哈尔滨工业大学深圳研究生院 | Intelligent Song Recognition Method Based on Audio Features |
CN102522083A (en) * | 2011-11-29 | 2012-06-27 | 北京百纳威尔科技有限公司 | Method for searching hummed song by using mobile terminal and mobile terminal thereof |
CN102497400A (en) * | 2011-11-30 | 2012-06-13 | 上海博泰悦臻电子设备制造有限公司 | Music media information obtaining method of vehicle-mounted radio equipment and obtaining system thereof |
CN102420910A (en) * | 2011-12-16 | 2012-04-18 | 广东步步高电子工业有限公司 | Mobile handheld terminal for playing music and synchronously displaying music score and implementation method thereof |
CN103812917A (en) * | 2012-11-15 | 2014-05-21 | 佛山市顺德区顺达电脑厂有限公司 | Information collecting system and method thereof |
CN103970793B (en) * | 2013-02-04 | 2020-03-03 | 腾讯科技(深圳)有限公司 | Information query method, client and server |
US9348906B2 (en) | 2013-02-04 | 2016-05-24 | Tencent Technology (Shenzhen) Company Limited | Method and system for performing an audio information collection and query |
CN103970793A (en) * | 2013-02-04 | 2014-08-06 | 腾讯科技(深圳)有限公司 | Information inquiry method, client side and server |
CN103108229A (en) * | 2013-02-06 | 2013-05-15 | 上海云联广告有限公司 | Method for identifying video contents in cross-screen mode through audio frequency |
CN103218454A (en) * | 2013-05-06 | 2013-07-24 | 百度在线网络技术(北京)有限公司 | Voice-data-based file searching method, voice-data-based file device and voice-data-based file system |
CN104143339A (en) * | 2013-05-09 | 2014-11-12 | 索尼公司 | Music signal processing apparatus and method, and program |
CN104143339B (en) * | 2013-05-09 | 2019-10-11 | 索尼公司 | Acoustic musical signals processing device and method |
CN103258033A (en) * | 2013-05-15 | 2013-08-21 | 江苏奇异点网络有限公司 | Song automatic searching system |
CN103559312B (en) * | 2013-11-19 | 2017-01-18 | 北京航空航天大学 | GPU (graphics processing unit) based melody matching parallelization method |
CN103559312A (en) * | 2013-11-19 | 2014-02-05 | 北京航空航天大学 | GPU (graphics processing unit) based melody matching parallelization method |
CN104679778B (en) * | 2013-11-29 | 2019-03-26 | 腾讯科技(深圳)有限公司 | A kind of generation method and device of search result |
US10452691B2 (en) | 2013-11-29 | 2019-10-22 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for generating search results using inverted index |
CN104679778A (en) * | 2013-11-29 | 2015-06-03 | 腾讯科技(深圳)有限公司 | Search result generating method and device |
WO2017028115A1 (en) * | 2015-08-16 | 2017-02-23 | 胡丹丽 | Intelligent desktop speaker and method for controlling intelligent desktop speaker |
CN105069146B (en) * | 2015-08-20 | 2019-04-02 | 百度在线网络技术(北京)有限公司 | Sound searching method and device |
CN105069146A (en) * | 2015-08-20 | 2015-11-18 | 百度在线网络技术(北京)有限公司 | Sound searching method and device |
CN105244021A (en) * | 2015-11-04 | 2016-01-13 | 厦门大学 | Method for converting singing melody to MIDI (Musical Instrument Digital Interface) melody |
CN105244021B (en) * | 2015-11-04 | 2019-02-12 | 厦门大学 | The conversion method of humming melody to MIDI melody |
CN105895079B (en) * | 2015-12-14 | 2022-07-29 | 天津智融创新科技发展有限公司 | Voice data processing method and device |
CN105895079A (en) * | 2015-12-14 | 2016-08-24 | 乐视网信息技术(北京)股份有限公司 | Voice data processing method and device |
CN107146631A (en) * | 2016-02-29 | 2017-09-08 | 北京搜狗科技发展有限公司 | Music recognition methods, note identification model method for building up, device and electronic equipment |
CN107146631B (en) * | 2016-02-29 | 2020-11-10 | 北京搜狗科技发展有限公司 | Music identification method, note identification model establishment method, device and electronic equipment |
WO2018018283A1 (en) * | 2016-07-24 | 2018-02-01 | 张鹏华 | Counting method for usage condition of song information recognition technique and recognition system |
CN106776977A (en) * | 2016-12-06 | 2017-05-31 | 深圳前海勇艺达机器人有限公司 | Search for the method and device of music |
CN108268530A (en) * | 2016-12-30 | 2018-07-10 | 阿里巴巴集团控股有限公司 | Dub in background music generation method and the relevant apparatus of a kind of lyrics |
CN108268530B (en) * | 2016-12-30 | 2022-04-29 | 阿里巴巴集团控股有限公司 | Lyric score generation method and related device |
CN108574771A (en) * | 2017-03-10 | 2018-09-25 | 峰范(北京)科技有限公司 | Collecting and processing of information system and its voice playing device, processing method |
CN107205043A (en) * | 2017-07-03 | 2017-09-26 | 武汉理工大学 | A kind of violin class network virtual musical instrument |
CN107436953B (en) * | 2017-08-15 | 2020-07-10 | 中国联合网络通信集团有限公司 | Music searching method and system |
CN107436953A (en) * | 2017-08-15 | 2017-12-05 | 中国联合网络通信集团有限公司 | A kind of method for searching music and system |
CN108665903B (en) * | 2018-05-11 | 2021-04-30 | 复旦大学 | Automatic detection method and system for audio signal similarity |
CN108665903A (en) * | 2018-05-11 | 2018-10-16 | 复旦大学 | An automatic detection method and system for audio signal similarity |
CN108806392A (en) * | 2018-07-03 | 2018-11-13 | 东北石油大学 | A kind of vocal music pronunciation training apparatus and system |
CN109346043A (en) * | 2018-10-26 | 2019-02-15 | 平安科技(深圳)有限公司 | A kind of music generating method and device based on generation confrontation network |
CN109346043B (en) * | 2018-10-26 | 2023-09-19 | 平安科技(深圳)有限公司 | Music generation method and device based on generation countermeasure network |
CN110472094A (en) * | 2019-08-06 | 2019-11-19 | 沈阳大学 | A kind of traditional music input method |
CN110853457A (en) * | 2019-10-31 | 2020-02-28 | 中国科学院自动化研究所南京人工智能芯片创新研究院 | Interactive music teaching guidance method |
CN111627410A (en) * | 2020-05-12 | 2020-09-04 | 浙江大学 | MIDI multi-track sequence representation method and application |
CN111627410B (en) * | 2020-05-12 | 2022-08-09 | 浙江大学 | MIDI multi-track sequence representation method and application |
CN112015942A (en) * | 2020-08-28 | 2020-12-01 | 上海掌门科技有限公司 | Audio processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN101271457B (en) | 2010-09-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20100929 Termination date: 20180321 |