CN103294776B - Smartphone address book fuzzy search method - Google Patents

Info

Publication number
CN103294776B
Authority
CN
China
Prior art keywords
pinyin
matching
match
keyword
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310173227.1A
Other languages
Chinese (zh)
Other versions
CN103294776A (en)
Inventor
尹建伟
姚陶钧
李莹
邓水光
吴健
吴朝晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SUZHOU LONGTANG INFORMATION TECHNOLOGY Co Ltd
Zhejiang University ZJU
Original Assignee
SUZHOU LONGTANG INFORMATION TECHNOLOGY Co Ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SUZHOU LONGTANG INFORMATION TECHNOLOGY Co Ltd, Zhejiang University ZJU
Priority to CN201310173227.1A
Publication of CN103294776A
Application granted
Publication of CN103294776B
Status: Active
Anticipated expiration

Landscapes

  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a fuzzy search method for a smartphone address book. The contact data in the address book is first preprocessed: the pinyin of each contact name and the corresponding thumb-keyboard (numeric keypad) digit sequence are obtained by looking them up in a pinyin coding table. Key information such as the pinyin, the digit sequence, and the mobile phone number is then written into memory and backed up to a purpose-designed cache. A given search keyword is classified into one of three categories according to whether it contains Chinese characters, letters, or digits, and is matched against the corresponding fields of the contacts held in memory; a modified string subsequence matching algorithm provides fuzzy matching between the keyword and the name pinyin. The invention can search the whole address book for any keyword, supports thumb-keyboard digit search, and can fuzzily match a contact's name, full pinyin, simplified pinyin (initials), partial pinyin, mobile phone number, email address, and so on. Search results are output in weighted order according to the specific reason for each match, giving users a more convenient and efficient range of ways to search.

Description

A method for fuzzy search of a smartphone address book

Technical field

The invention relates to the technical field of information processing on smartphone terminals, and in particular to intelligent search of address book contacts.

Background

With the development of smart mobile device systems, users expect more and more from smartphones, including more attractive interfaces, simpler and faster operation, and more capable application software. The address book is the most basic application on a phone and plays a major role in daily life: making calls, sending and receiving text messages, and sending email all rely on the contact list. However, the address book function built into current phone operating systems is relatively weak, and the user experience of contact search in particular lags far behind, which makes it inconvenient to locate a contact when sending a message or making a call.

Database full-text search builds an index over large bodies of text and then searches that index for the words being looked up, locating the text records that contain them. The whole job of full-text search is building the index and searching within it. Word segmentation is usually done by bigram segmentation, maximum matching, or statistical methods, and the index is usually an inverted index. Full-text search is mainly used for large text collections and is not suited to a scenario such as an address book, where the structure is simple and the data volume is small. It also requires considerable storage space, and the index must be maintained dynamically as the address book is updated, so the overall cost is too high and the approach is not flexible enough.

Traditional address book search performs simple string matching between the input keyword and contact names and numbers, and does not handle special searches such as the full pinyin, simplified pinyin (initials), or partial pinyin of a name. A subsequence of a string is a new sequence formed from the original sequence by removing some elements without changing the relative order of the remaining ones, and it is a common criterion for describing similarity matching between two strings. Formally, given a sequence X = <x1, x2, x3, …, xm>, another sequence Z = <z1, z2, z3, …, zk> is a subsequence of X if there exists a strictly increasing sequence of indices <i1, i2, …, ik> of X such that x_{i_j} = z_j for all j = 1, 2, …, k. The traditional subsequence matching algorithm takes no account of pinyin initials and finals and does not dynamically segment the pinyin during matching, so it needs to be modified.
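
The classical subsequence test defined above can be sketched in a few lines of Python. This is only an illustration of the textbook definition; the function name `is_subsequence` and the sample strings are assumptions and do not come from the patent.

```python
def is_subsequence(z: str, x: str) -> bool:
    """Return True if z can be formed from x by deleting zero or more characters."""
    remaining = iter(x)
    # `ch in remaining` consumes the iterator up to and including the first
    # occurrence of ch, so later letters of z can only match later letters of x.
    return all(ch in remaining for ch in z)


print(is_subsequence("zsn", "zhangsan"))  # True
print(is_subsequence("nz", "zhangsan"))   # False
print(is_subsequence("hgn", "zhangsan"))  # True
```

The last call shows the weakness the text points out: "hgn" is a valid subsequence of "zhangsan" even though it respects no syllable boundary of "zhang san", so a plain subsequence test accepts keywords that no user would consider an abbreviation of the name.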

The thumb keyboard is simple and easy to use and attracts a large number of smartphone users, so intelligent search from the thumb keyboard is an essential requirement. How to provide a more convenient and efficient address book search method and a better user experience is therefore a problem that urgently needs solving.

Summary of the invention

In view of the above problems, the invention provides a method for fuzzy search of a smartphone address book. Its purpose is a convenient and efficient address book search that supports fuzzy matching and locating on key attributes such as the contact name (full pinyin, simplified pinyin, partial pinyin, thumb-keyboard digit sequence), mobile phone number, and email address, giving users a faster and better experience.

To achieve the above purpose, the invention adopts the following technical solution:

A method for fuzzy search of a smartphone address book, comprising the following steps:

1) Read the contact data of the local address book through an open interface, discard redundant data, and keep only the key information, which includes the contact name, mobile phone number, and email address;

2) Preprocess all contact information with an information preprocessor: for Chinese characters in a name, extract the pinyin by looking it up in a Chinese character coding table, with a polyphonic character producing multiple records; at the same time transcode the pinyin into the corresponding thumb-keyboard digit sequence; write all of these records into memory and into the purpose-designed cache;

3) For an input search keyword key, first check whether the cache already holds a corresponding record; if it hits, locate the result directly, otherwise search the whole address book, classifying the keyword key into one of three types: a) contains Chinese characters; b) contains no Chinese characters but contains English letters and digits; c) contains digits only;

4) Match the given keyword key with a search matcher: if the keyword belongs to category b, the matching target is the contact name pinyin; if it belongs to category c, the matching target is the thumb-keyboard digit sequence. The matching rule of the search matcher is: convert the name pinyin into a pinyin array and traverse each letter of the keyword, matching the letters against the pinyin words in turn; if the current letter matches, continue by matching the key against the next letter of the current word or the first letter of the next word; if every letter of the keyword finds a corresponding match, the match succeeds, otherwise it fails;

5) Having obtained all contacts that match the keyword key through steps 3) and 4), sort them by matching priority with a weighted sorter and output them.
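
The three-way keyword classification of step 3) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `classify_keyword` is hypothetical, and "Chinese character" is taken to mean a code point in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which the patent does not specify.

```python
def classify_keyword(key: str) -> str:
    """Return 'a', 'b' or 'c' following the three categories of step 3)."""
    # Assumption: a Chinese character is any code point in U+4E00..U+9FFF.
    if any("\u4e00" <= ch <= "\u9fff" for ch in key):
        return "a"          # contains Chinese characters
    if key.isascii() and key.isdigit():
        return "c"          # ASCII digits only
    return "b"              # no Chinese characters, not purely numeric


print(classify_keyword("张s"))    # a
print(classify_keyword("zs1"))    # b
print(classify_keyword("13800"))  # c
```

In this sketch anything that is neither Chinese nor purely numeric, including an empty string or a symbol such as "@", falls into category b; a real implementation may want to treat those cases separately.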

Further, in step 3), the matching target for category a is the contact name; for category b it is the contact name pinyin and the email address; for category c it is the contact phone number and the thumb-keyboard digit sequence corresponding to the name. For categories b and c, the pinyin fuzzy matching algorithm of step 4) is used to fuzzily match the pinyin encoding and the thumb-keyboard digit sequence. When searching, the cache is queried first; if the keyword hits, or a prefix of the current keyword is cached, a secondary search is performed within the earlier result.

Further, step 4) uses a modified string subsequence matching algorithm, whose idea is as follows: for a given keyword key and name pinyin pinyin, first split pinyin on spaces into an array pinyin[N]; define s as the index in the pinyin array at which matching starts, k as the index of the key letter currently to be matched, i as the index of the pinyin word currently being matched, and j as the letter index within the current word;

a. Initialize s=0, k=0, i=s, j=0, i.e. start by matching the first letter of the keyword against the first letter of the first pinyin word;

b. Test k==key.length; if true, the match succeeds and the algorithm terminates; if false, go to c;

c. Test key[k]==pinyin[i][j]; if true go to d, if false go to f;

d. k++, j++, i.e. try to match the next letter of the keyword against the next letter of the current pinyin word;

e. Test j<pinyin[i].length, i.e. check in advance whether the current word has been fully matched; if true go to b and continue matching the next letter, if false go to g;

f. Test whether the i-th pinyin word has any match; if true go to g; if false, s++, k=0, i=s, j=0 and go to b, restarting the match from the s-th pinyin word;

g. i++, j=0, i.e. try to match the keyword against the first letter of the next word;

h. Test i<pinyin.length; if true go to b; if false, the match fails and the algorithm terminates;

The algorithm above determines precisely whether key and pinyin[N] match. During the algorithm, a match[N] array records the details of how the keyword matched each pinyin word, and after the algorithm finishes the match array is used to describe the reason for the match accurately. The thumb-keyboard digit sequence is only temporary data used within the algorithm; what is shown to the user should be the specific fuzzy match on pinyin, recovered by a reverse lookup from the match array.
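
One possible reading of steps a to h is sketched below in Python. It is not the patent's reference implementation. The names `match_from` and `fuzzy_match` are hypothetical, "the i-th pinyin word has a match" in step f is interpreted as j > 0, and two safeguards are assumptions added here: the success test of step b is evaluated before the end-of-array test of step h, so that a keyword ending exactly at the end of the last word still succeeds, and restarts stop once s runs past the last word.

```python
from typing import List, Optional


def match_from(key: str, words: List[str], s: int) -> Optional[List[int]]:
    """Match key starting at word s; return letters matched per word, or None."""
    match = [0] * len(words)
    k, i, j = 0, s, 0                      # step a, for a given start word s
    while k < len(key):                    # step b: stop once every key letter is matched
        if i >= len(words):                # step h: ran out of pinyin words
            return None
        if key[k] == words[i][j]:          # step c
            match[i] += 1
            k, j = k + 1, j + 1            # step d
            if j == len(words[i]):         # step e: current word fully matched
                i, j = i + 1, 0            # step g
        elif j > 0:                        # step f: word i already has a match
            i, j = i + 1, 0                # step g: try the next word
        else:
            return None                    # step f: nothing matched in word i
    return match


def fuzzy_match(key: str, pinyin: str) -> Optional[List[int]]:
    """Return the match array for the first start word that works, else None."""
    words = pinyin.split()
    for s in range(len(words)):            # s++ on every restart
        match = match_from(key, words, s)
        if match is not None:
            return match
    return None


print(fuzzy_match("zs", "zhang san"))        # [1, 1]
print(fuzzy_match("zhangsan", "zhang san"))  # [5, 3]
print(fuzzy_match("zhs", "zhang san"))       # [2, 1]
print(fuzzy_match("san", "zhang san"))       # [0, 3]
print(fuzzy_match("sz", "zhang san"))        # None
print(fuzzy_match("94", "94264 726"))        # [2, 0]
```

The returned list plays the role of the match[N] array. Under this reading the procedure is greedy within a word and never backs up: `fuzzy_match("shai", "shan hai")` returns None because "sha" is consumed from "shan" before "i" fails against "hai", even though "s" followed by "hai" would be a plausible partial match.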

Further, in step 5), the matching priority can be customized.

Further, in step 5), the matching priority is: complete match > partial match; full pinyin > simplified pinyin > partial pinyin; name match > number match > email match.
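
The priority rules above can be turned into a sort key, as in the sketch below. The patent does not say how the three orderings combine, so comparing them lexicographically in the order listed is an assumption, as are the record layout (`contact`, `field`, `kind`, `complete`) and the names `FIELD_RANK`, `KIND_RANK`, and `sort_results`.

```python
from typing import Dict, List

FIELD_RANK = {"name": 0, "phone": 1, "email": 2}        # name > number > email
KIND_RANK = {"full": 0, "simplified": 1, "partial": 2}  # full > simplified > partial


def sort_results(results: List[Dict]) -> List[Dict]:
    """Sort match records so that higher-priority matches come first."""
    def rank(r: Dict):
        return (
            0 if r["complete"] else 1,                    # complete before partial
            KIND_RANK.get(r.get("kind"), len(KIND_RANK)), # non-pinyin matches last
            FIELD_RANK[r["field"]],
        )
    return sorted(results, key=rank)


results = [
    {"contact": "A", "field": "email", "kind": None, "complete": False},
    {"contact": "B", "field": "name", "kind": "simplified", "complete": False},
    {"contact": "C", "field": "name", "kind": "full", "complete": True},
    {"contact": "D", "field": "phone", "kind": None, "complete": True},
]
print([r["contact"] for r in sort_results(results)])  # ['C', 'D', 'B', 'A']
```

Because step 5) allows the priority to be customized, the two rank tables are the natural place to plug in different weights.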

Further, the search result for a keyword key that has already been searched is stored in the cache. The next time a new keyword is searched, the cache is queried first; if the keyword hits, or a prefix of the keyword is cached, a secondary search is performed directly on the cached result records.

Compared with the prior art, the invention has the following beneficial effects:

The method provided by the invention is simple and practical, convenient to operate, and efficient. It supports searching contacts by Chinese characters, phone number, and email address, and supports fuzzy search by name initials, full pinyin, and partial pinyin.

Further, it supports fuzzy search of contacts from digits typed on the thumb keyboard, which are resolved back to pinyin directly.

Further, the modified string subsequence matching algorithm can give the specific reason for each match in the contact search results.

Further, search results are recorded in a cache, so secondary searches are efficient.

Description of the drawings

Fig. 1 is a schematic structural diagram of the address book fuzzy search.

Fig. 2 is a flow diagram of fuzzy search of address book contacts by keyword.

Fig. 3 is a flow diagram of the modified string subsequence matching algorithm.

Detailed description

The invention is described in detail below with reference to an embodiment and the drawings.

The invention provides a method for fuzzy search of contacts in a smartphone address book. The search structure, shown in Fig. 1, comprises:

Information preprocessor: obtains and preprocesses the contact information, filters out redundant data, encodes names as pinyin, and so on.

Cache: caches the preprocessed contact information and also caches keyword-to-search-result records.

Search matcher: for a given search keyword, matches all key information of the contacts according to the keyword's category; it calls the modified subsequence matching algorithm.

Weighted sorter: sorts the search results globally by weight according to custom priority rules and outputs them.

As the flow diagram of keyword fuzzy search in Fig. 2 shows, the search method in one embodiment of the invention mainly comprises the following flow:

1) Through the interface provided by the phone operating system, read the contact data of the local address book, discard redundant data, and keep only key information such as the contact name, mobile phone number, and email address.

2) Preprocess the personal information of all address book contacts. In particular, extract the pinyin of each contact name by looking it up in a Chinese character pinyin coding table, separating the pinyin of each character with a space; if the name contains a polyphonic character, transcode it into multiple records. At the same time translate the pinyin into the corresponding thumb-keyboard digit sequence. The processed contact information is kept in memory and backed up to the purpose-designed cache, so that the next search can query the records in the cache directly.

Table 1 is the conversion table from pinyin letters to thumb-keyboard digits.

Pinyin letters    Thumb-keyboard digit
a, b, c           2
d, e, f           3
g, h, i           4
j, k, l           5
m, n, o           6
p, q, r, s        7
t, u, v           8
x, y, z           9
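
The preprocessing of a name into pinyin records and digit sequences can be sketched as below. Everything named here is illustrative: `PINYIN_TABLE` is a hypothetical three-entry stand-in for a real Chinese character coding table, `to_keypad` and `preprocess_name` are assumed names, and the letter "w", which Table 1 does not list, is assumed to share key 9 as on a standard phone keypad.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical toy coding table; a real table covers all common characters.
PINYIN_TABLE: Dict[str, List[str]] = {
    "单": ["dan", "shan", "chan"],   # polyphonic character: several readings
    "雄": ["xiong"],
    "信": ["xin"],
}

# Letter groups from Table 1; "w" is an assumption (not listed in the table).
KEY_GROUPS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
              "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEY_GROUPS.items() for ch in letters}


def to_keypad(pinyin: str) -> str:
    """Convert space-separated pinyin words to digit words, keeping the spaces."""
    return " ".join(
        "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in word)
        for word in pinyin.lower().split()
    )


def preprocess_name(name: str) -> List[Tuple[str, str]]:
    """Return one (pinyin, digit sequence) record per combination of readings."""
    # Characters missing from the table are kept as-is, lower-cased.
    readings = [PINYIN_TABLE.get(ch, [ch.lower()]) for ch in name]
    return [(p, to_keypad(p)) for p in (" ".join(c) for c in product(*readings))]


print(to_keypad("zhang san"))   # 94264 726
for record in preprocess_name("单雄信"):
    print(record)
# ('dan xiong xin', '326 94664 946')
# ('shan xiong xin', '7426 94664 946')
# ('chan xiong xin', '2426 94664 946')
```

Keeping the spaces in the digit sequence preserves the syllable boundaries, so the same word-by-word matching can run on digits as on letters.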

4) When searching address book contacts, the input search keyword is classified into one of three categories: a) contains Chinese characters; b) contains no Chinese characters but contains English letters; c) contains digits only. For a), the keyword contains Chinese characters, and only the unprocessed original name field of a contact can contain Chinese, so it suffices to string-match the keyword against the contact name. For b), the keyword contains English letters, so the fields that may match are the pinyin of the contact name or the email address; for pinyin, the keyword is treated as the full pinyin, simplified pinyin, or partial pinyin of the contact name and is handled with the matching algorithm between keyword and full name pinyin described in the next step, whereas for the email address simple string matching suffices. For c), the keyword contains digits only, and two cases are considered, the mobile phone number and the thumb-keyboard digit sequence: the former needs only substring matching between the keyword and the contact number, while the latter applies the matching algorithm of the next step to the keyword and the thumb-keyboard digit sequence.

5) For a given keyword key and name pinyin pinyin (matching a purely numeric keyword against a thumb-keyboard digit sequence is analogous), first split the name pinyin on spaces into an array pinyin[N], then match key against pinyin[N]. The flow of the modified pinyin matching algorithm, i.e. the modified string subsequence matching algorithm, is shown in Fig. 3 and is as follows: define s as the index in the pinyin array at which matching starts, k as the index of the key letter currently to be matched, i as the index of the pinyin word currently being matched, and j as the letter index within the current word.

a. Initialize s=0, k=0, i=s, j=0, i.e. start by matching the first letter of the keyword against the first letter of the first pinyin word.

b. Test k==key.length; if true, the match succeeds and the algorithm terminates; if false, go to c.

c. Test key[k]==pinyin[i][j]; if true go to d, if false go to f.

d. k++, j++, i.e. try to match the next letter of the keyword against the next letter of the current pinyin word.

e. Test j<pinyin[i].length, i.e. check in advance whether the current word has been fully matched; if true go to b and continue matching the next letter, if false go to g.

f. Test whether the i-th pinyin word has any match; if true go to g; if false, s++, k=0, i=s, j=0 and go to b, restarting the match from the s-th pinyin word.

g. i++, j=0, i.e. try to match the keyword against the first letter of the next word.

h. Test i<pinyin.length; if true go to b; if false, the match fails and the algorithm terminates.

The algorithm above determines precisely whether key and pinyin[N] match. During the algorithm, a match[N] array can record the details of how the keyword matched each pinyin word, and after the algorithm finishes, match can be used to describe the reason for the match accurately. The thumb-keyboard digit sequence is only temporary data used within the algorithm; what is shown to the user should be the specific fuzzy match on pinyin, recovered by a reverse lookup from the match array.
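
How a match array might be turned into a match reason, and into the matched pinyin for a digit keyword, is sketched below. The patent does not define the array's contents; here `match[i]` is assumed to hold the number of leading letters matched in the i-th pinyin word, and the definitions of "full", "simplified", and "partial" as well as the names `describe_match` and `matched_text` are assumptions.

```python
from typing import List


def describe_match(match: List[int], words: List[str]) -> str:
    """Classify a match array as 'full', 'simplified' or 'partial' pinyin."""
    if all(m == len(w) for m, w in zip(match, words)):
        return "full"          # every word matched completely
    if all(m == 1 for m in match):
        return "simplified"    # exactly the initial of every word
    return "partial"


def matched_text(match: List[int], words: List[str]) -> str:
    """Recover the matched pinyin letters, e.g. to highlight them in the UI."""
    return " ".join(w[:m] for m, w in zip(match, words) if m)


words = "zhang san".split()
print(describe_match([5, 3], words))  # full
print(describe_match([1, 1], words))  # simplified
print(describe_match([2, 1], words))  # partial
print(matched_text([2, 0], words))    # zh
```

The last call illustrates the reverse lookup: a digit keyword such as "94" matched against the digit sequence "94264 726" gives the array [2, 0], and applying that array to the pinyin words yields "zh", which is what the user would be shown.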

6) Once the search and matching steps above have produced all contacts that match the keyword key, the results are sorted by match reason (match). The priority can be customized as needed; in a preferred embodiment of the invention it is defined as: complete match > partial match; full pinyin > simplified pinyin > partial pinyin; name match > number match > email match.

To improve search efficiency, the search result records for each keyword key are stored in the cache. The next time a new keyword is searched, the cache can be queried first; if the keyword hits, or a cached keyword is a prefix of it, a secondary search can be performed on the cached result records, which greatly improves search efficiency.
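
The prefix-based secondary search can be sketched as a small cache class. This is an illustration under assumptions: the class name `SearchCache` and its interface are hypothetical, contacts are plain strings, and the matcher is assumed to be monotonic, meaning that any contact matching a keyword also matches every prefix of that keyword (true for the substring matcher used here).

```python
from typing import Callable, Dict, List


class SearchCache:
    """Caches keyword -> results and narrows from a cached prefix when possible."""

    def __init__(self, contacts: List[str], matches: Callable[[str, str], bool]):
        self.contacts = list(contacts)
        self.matches = matches
        self.results: Dict[str, List[str]] = {}

    def search(self, key: str) -> List[str]:
        if key in self.results:                  # exact cache hit
            return self.results[key]
        candidates = self.contacts               # default: global search
        for n in range(len(key) - 1, 0, -1):     # longest cached prefix first
            if key[:n] in self.results:
                candidates = self.results[key[:n]]
                break
        found = [c for c in candidates if self.matches(key, c)]
        self.results[key] = found
        return found


calls = []

def contains(key: str, contact: str) -> bool:
    calls.append((key, contact))                 # count matcher invocations
    return key in contact

cache = SearchCache(["zhangsan", "zhaosi", "lisi"], contains)
print(cache.search("zh"))    # ['zhangsan', 'zhaosi']
print(cache.search("zhan"))  # ['zhangsan']
print(len(calls))            # 5
```

The second search only examines the two contacts cached for "zh", which is why the matcher runs five times rather than six. The sketch does not invalidate cached results when the address book changes, which a real implementation would need to handle.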

Although the invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Any person skilled in the art may, without departing from the spirit and scope of the invention, use the methods and technical content disclosed above to make possible changes and modifications to the technical solution of the invention. Therefore, any simple modification, equivalent change, or refinement made to the above embodiment according to the technical essence of the invention, without departing from the content of its technical solution, falls within the scope of protection of the technical solution of the invention.

Claims (6)

1. A method for fuzzy search of a smartphone address book, characterized by comprising the following steps:
1) reading the contact data of the local address book through an open interface, discarding redundant data, and keeping only the key information, the key information including the contact name, mobile phone number, and email address;
2) preprocessing all contact information with an information preprocessor: for Chinese characters in a name, extracting the pinyin by looking it up in a Chinese character coding table, with a polyphonic character producing multiple records, while transcoding the pinyin into the corresponding thumb-keyboard digit sequence, and writing all of these records into memory and into the purpose-designed cache;
3) for an input search keyword key, first checking whether the cache already holds a corresponding record; if it hits, locating the result directly, otherwise searching the whole address book, the keyword key being classified into one of three types: a) contains Chinese characters; b) contains no Chinese characters but contains English letters and digits; c) contains digits only;
4) matching the given keyword key with a search matcher: if the keyword belongs to category b, the matching target is the contact name pinyin; if it belongs to category c, the matching target is the thumb-keyboard digit sequence; the matching rule of the search matcher being: convert the name pinyin into a pinyin array and traverse each letter of the keyword, matching the letters against the pinyin words in turn; if the current letter matches, continue by matching the key against the next letter of the current word or the first letter of the next word; if every letter of the keyword finds a corresponding match, the match succeeds, otherwise it fails;
5) having obtained all contacts that match the keyword key through steps 3) and 4), sorting them by matching priority with a weighted sorter and outputting them.
2. The address book fuzzy search method according to claim 1, characterized in that in step 3), the matching target for category a is the contact name; for category b it is the contact name pinyin and the email address; for category c it is the contact phone number and the thumb-keyboard digit sequence corresponding to the name; for categories b and c, the fuzzy matching algorithm of step 4) is used to fuzzily match the pinyin encoding and the thumb-keyboard digit sequence; when searching, the cache is queried first, and if the keyword hits or a prefix of the current keyword is cached, a secondary search is performed within the earlier result.
3. The address book fuzzy search method according to claim 1, characterized in that step 4) uses a modified string subsequence matching algorithm, whose idea is as follows: for a given keyword key and name pinyin pinyin, first split pinyin on spaces into an array pinyin[N]; define s as the index in the pinyin array at which matching starts, k as the index of the key letter currently to be matched, i as the index of the pinyin word currently being matched, and j as the letter index within the current word;
a. initialize s=0, k=0, i=s, j=0, i.e. start by matching the first letter of the keyword against the first letter of the first pinyin word;
b. test k==key.length; if true, the match succeeds and the algorithm terminates; if false, go to c;
c. test key[k]==pinyin[i][j]; if true go to d, if false go to f;
d. k++, j++, i.e. try to match the next letter of the keyword against the next letter of the current pinyin word;
e. test j<pinyin[i].length, i.e. check in advance whether the current word has been fully matched; if true go to b and continue matching the next letter, if false go to g;
f. test whether the i-th pinyin word has any match; if true go to g; if false, s++, k=0, i=s, j=0 and go to b, restarting the match from the s-th pinyin word;
g. i++, j=0, i.e. try to match the keyword against the first letter of the next word;
h. test i<pinyin.length; if true go to b; if false, the match fails and the algorithm terminates;
the algorithm above determines precisely whether key and pinyin[N] match; during the algorithm, a match[N] array records the details of how the keyword matched each pinyin word, and after the algorithm finishes the match array is used to describe the reason for the match accurately; the thumb-keyboard digit sequence is only temporary data used within the algorithm, and what is shown to the user should be the specific fuzzy match on pinyin, recovered by a reverse lookup from the match array.
4. The address book fuzzy search method according to claim 1, characterized in that in step 5), the matching priority can be customized.
5. The address book fuzzy search method according to claim 4, characterized in that in step 5), the matching priority is: complete match > partial match; full pinyin > simplified pinyin > partial pinyin; name match > number match > email match.
6. The address book fuzzy search method according to any one of claims 1 to 5, characterized in that the search result for a keyword key that has already been searched is stored in the cache; the next time a new keyword is searched, the cache is queried first, and if the keyword hits or a prefix of the keyword is cached, a secondary search is performed directly on the cached result records.
CN201310173227.1A 2013-05-13 2013-05-13 Smartphone address book fuzzy search method Active CN103294776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310173227.1A CN103294776B (en) 2013-05-13 2013-05-13 Smartphone address book fuzzy search method

Publications (2)

Publication Number Publication Date
CN103294776A CN103294776A (en) 2013-09-11
CN103294776B true CN103294776B (en) 2017-04-12

Family

ID=49095638

Country Status (1)

Country Link
CN (1) CN103294776B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103617204B (en) * 2013-11-15 2017-01-25 福建星网锐捷通讯股份有限公司 Contact fast searching method based on android system
CN104077418A (en) * 2014-07-18 2014-10-01 广州市久邦数码科技有限公司 Mobile terminal application program searching method and system
CN105718487A (en) * 2014-12-04 2016-06-29 东莞宇龙通信科技有限公司 Search method and search system for special contacts
CN104699775A (en) * 2015-03-10 2015-06-10 广州市久邦数码科技有限公司 Contact person dialing searching method and system thereof for mobile terminal
CN104994208B (en) * 2015-07-08 2018-04-06 苏州思必驰信息科技有限公司 Contact person information of mobile terminal extracting method and system
CN105915685B (en) * 2016-06-02 2019-01-18 重庆神指奇动网络有限公司 A kind of smart phone Dialing Method and its system
CN106446062A (en) * 2016-09-05 2017-02-22 惠州市德赛西威汽车电子股份有限公司 Retrieval system and method for continuous characters and fuzzy characters
CN106782517A (en) * 2016-12-15 2017-05-31 咪咕数字传媒有限公司 A kind of speech audio keyword filter method and device
CN108572998A (en) * 2017-03-14 2018-09-25 北京橙鑫数据科技有限公司 A kind of data search method and device for electronic card data
CN107341177B (en) * 2017-05-24 2019-12-06 福建网龙计算机网络信息技术有限公司 fuzzy search method and device for contact
CN109116997A (en) * 2017-06-23 2019-01-01 北京国双科技有限公司 A kind of searching method and device based on phonetic
CN107679122B (en) * 2017-09-20 2021-04-30 福建网龙计算机网络信息技术有限公司 Fuzzy search method and terminal
CN110019649A (en) * 2017-12-25 2019-07-16 北京新媒传信科技有限公司 A kind of method and device established, search for index tree
CN111314540B (en) * 2018-11-26 2021-07-27 卓望数码技术(深圳)有限公司 Address book searching method, device, equipment and readable storage medium
CN111382322B (en) * 2018-12-27 2023-06-13 北京猎户星空科技有限公司 Method and device for determining similarity of character strings
CN110475028A (en) * 2019-08-26 2019-11-19 广州讯鸿网络技术有限公司 A kind of T9 searching method, electronic equipment and the storage medium of millions contact person
CN110781209B (en) * 2019-09-29 2022-04-22 苏州浪潮智能科技有限公司 Method and device for quickly querying data
CN112153206B (en) * 2020-09-23 2022-08-09 阿波罗智联(北京)科技有限公司 Contact person matching method and device, electronic equipment and storage medium
CN112163007B (en) * 2020-09-28 2023-11-17 惠州市德赛西威智能交通技术研究院有限公司 Method and system for quickly matching and searching contacts
CN112527819B (en) * 2020-12-08 2024-06-04 北京百度网讯科技有限公司 Address book information retrieval method and device, electronic equipment and storage medium
CN112764851A (en) * 2021-01-14 2021-05-07 青岛海信传媒网络技术有限公司 Display method and display equipment for legal statement content
CN113094470B (en) * 2021-04-08 2022-05-24 蔡堃 Text searching method and system
CN113641731B (en) * 2021-08-17 2023-05-02 成都知道创宇信息技术有限公司 Fuzzy search optimization method, device, electronic equipment and readable storage medium
CN113703983A (en) * 2021-09-01 2021-11-26 杭州新渡桥科技有限公司 Configurable multi-element sorting method and system and electronic equipment
CN116010562B (en) * 2023-03-28 2023-07-07 之江实验室 A name matching method, device, equipment and medium based on multiple data sources

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101426053A (en) * 2008-10-24 2009-05-06 深圳市金立通信设备有限公司 System and method for fast searching phone book and call record when standby
US7865842B2 (en) * 2005-07-14 2011-01-04 International Business Machines Corporation Instant messaging real-time buddy list lookup
CN102156757A (en) * 2011-05-19 2011-08-17 重庆国虹科技发展有限公司 Android system-based method for intelligently retrieving mobile phone contact
CN102542000A (en) * 2011-12-07 2012-07-04 北京风灵创景科技有限公司 Method and equipment for retrieving contacts

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7865842B2 (en) * 2005-07-14 2011-01-04 International Business Machines Corporation Instant messaging real-time buddy list lookup
CN101426053A (en) * 2008-10-24 2009-05-06 深圳市金立通信设备有限公司 System and method for fast searching phone book and call record when standby
CN102156757A (en) * 2011-05-19 2011-08-17 重庆国虹科技发展有限公司 Android system-based method for intelligently retrieving mobile phone contact
CN102542000A (en) * 2011-12-07 2012-07-04 北京风灵创景科技有限公司 Method and equipment for retrieving contacts

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
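Efficiency Analysis of Several Pattern Matching Algorithms (几种模式匹配算法的效率分析); Wu Xihong (巫喜红); Journal of Daqing Normal University (《大庆师范学院学报》); 2007-04-30; Vol. 27, No. 2; full text *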

Also Published As

Publication number Publication date
CN103294776A (en) 2013-09-11

Similar Documents

Publication Publication Date Title
CN103294776B (en) Smartphone address book fuzzy search method
CN109284352B (en) Query method for evaluating indefinite-length words and sentences of class documents based on inverted index
CN100595760C (en) Method for gaining oral vocabulary entry, device and input method system thereof
WO2019153612A1 (en) Question and answer data processing method, electronic device and storage medium
CN100578539C (en) Automatic question-answering method and system
CN111178053B (en) Text generation method for generating abstract extraction by combining semantics and text structure
CN103425777B (en) A kind of based on the short message intelligent classification and the searching method that improve Bayes's classification
CN107967250B (en) Information processing method and device
CN110263325A (en) Chinese automatic word-cut
CN107947921A (en) Based on recurrent neural network and the password of probability context-free grammar generation system
CN112395421B (en) Course label generation method and device, computer equipment and medium
CN112395404B (en) Voice key information extraction method applied to power dispatching
CN101751430A (en) Electronic dictionary fuzzy searching method
CN109473103A (en) A kind of meeting summary generation method
CN103425668A (en) Information search method and electronic equipment
CN101287026A (en) System and method for executing quick dialing by hand-write recognition function
CN106649410B (en) Method and device for obtaining chat reply content
CN206639220U (en) A kind of portable simultaneous interpretation equipment
CN106294460A (en) A kind of Chinese speech keyword retrieval method based on word and word Hybrid language model
CN101794304B (en) Industry information service system and method
Tsai et al. Mencius: A Chinese named entity recognizer using the maximum entropy-based hybrid model
CN111444720A (en) Named entity recognition method for English text
CN101470701A (en) Text analyzer supporting semantic rule based on finite state machine and method thereof
CN113935308B (en) Method and system for automatic generation of text summaries in the field of earth sciences
CN101539433A (en) Searching method with first letter of pinyin and intonation in navigation system and device thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant