CN1252876A - Information retrieval utilizing semantic representation of text - Google Patents
Information retrieval utilizing semantic representation of text
- Publication number
- CN1252876A
- Authority
- CN
- China
- Prior art keywords
- speech
- logical form
- document
- mark
- super
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99932—Access augmentation or optimizing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99935—Query augmenting and refining, e.g. inexact access
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
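The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with a hypernym identified for that selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms. The tokenizer is preferably used to generate tokens both for constructing an index representing target documents and for processing a query against that index.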
Description
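The present invention relates to the field of information retrieval and, more specifically, to the field of information retrieval tokenization.
Information retrieval refers to the process of identifying occurrences in a target document of words in a query or query document. Information retrieval can be gainfully applied in several situations, including processing explicit user search queries, identifying documents relating to a particular document, judging the similarity of two documents, extracting the features of a document, and summarizing a document.
Information retrieval typically involves a two-stage process: (1) In an indexing stage, a document is initially indexed by (a) converting each word in the document into a series of characters intelligible to and differentiable by an information retrieval engine, called a "token" (known as "tokenizing" the document), and (b) creating an index mapping from each token to the location in the document where the token occurs. (2) In a query stage, a query (or query document) is similarly tokenized and compared to the index to identify locations in the document at which tokens in the tokenized query occur.
Figure 1 is an overview data flow diagram depicting the information retrieval process. In the indexing stage, a target document 111 is submitted to a tokenizer 120. The target document is composed of a number of strings, such as sentences, each occurring at a particular location in the target document. The strings in the target document and their word locations are passed to the tokenizer 120, which converts the words in each string into a series of tokens that are intelligible to and distinguishable by an information retrieval engine 130. An index construction portion 131 of the information retrieval engine 130 adds the tokens and their locations to an index 140. The index maps each unique token to the locations at which it occurs in the target document. This process may be repeated to add a number of different target documents to the index, if desired. If the index 140 thus represents the text of a number of target documents, the location information preferably includes an indication of the document to which each location corresponds.
In the query stage, a textual query 112 is submitted to the tokenizer 120. The query may be a single string or sentence, or may be an entire document composed of a number of strings. The tokenizer 120 converts the words in the text of the query 112 into tokens in the same manner that it converted the words in the target document into tokens. The tokenizer 120 passes these tokens to an index retrieval portion 132 of the information retrieval engine 130. The index retrieval portion of the information retrieval engine searches the index 140 for occurrences of the tokens in the target document. For each of the tokens, the index retrieval portion of the information retrieval engine identifies the locations at which the token occurs in the target document. This list of locations is returned as the query result 113.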
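Conventional tokenizers typically involve superficial transformations of the input text, such as changing each upper-case character to lower-case, identifying the individual words in the input text, and removing suffixes from the words. For example, a conventional tokenizer might convert the input text string
The father is holding the baby.
into the following tokens:
the
father
is
hold
the
baby
This approach to tokenization tends to make searches based on it overinclusive of occurrences in which the senses of words differ from the sense intended in the query text. For example, the sample input text string uses the verb "hold" in the sense "to support or grasp". However, the token "hold" could match uses of the word "hold" meaning "the cargo area of a ship". This approach also tends to be overinclusive of occurrences in which the relationships between words differ from the relationships between the words of the query text. For example, in the sample input text string above, "father" is the subject of the word "hold" and "baby" is its object, yet the string might match the sentence "The father and the baby held the toy", in which "baby" is a subject rather than an object. The approach is further underinclusive of occurrences that use a different but semantically related word in place of a word of the query text. For example, the input text string above would not match the text string "The parent is holding the baby". Given these shortcomings of conventional tokenization, a tokenizer that encodes the semantic relationships implicit in the tokenized text would have significant utility.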
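The conventional tokenization described above can be sketched roughly as follows. This is a minimal illustration, not the tokenizer of any particular retrieval engine: the function name `conventional_tokenize` and the naive rule that strips only a trailing "ing" are assumptions made for the example.

```python
import re


def conventional_tokenize(text: str) -> list[str]:
    """Lowercase the text, split it into words, and strip a trailing "ing".

    The suffix rule is deliberately naive (an assumption for this sketch);
    a real stemmer handles many more suffixes.
    """
    tokens = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word.endswith("ing") and len(word) > 5:
            word = word[:-3]
        tokens.append(word)
    return tokens


query_tokens = conventional_tokenize("The father is holding the baby.")
other_tokens = conventional_tokenize("The cargo is in the hold.")
print(query_tokens)

# Both token lists contain "hold", although the two sentences use
# different senses of the word: the overinclusion problem described above.
print("hold" in query_tokens and "hold" in other_tokens)
```

Because the token carries neither the sense of the word nor its grammatical role, the two sentences are indistinguishable to an index keyed on these tokens.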
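The invention is directed to performing information retrieval using an improved tokenizer that parses input text to identify logical forms, then expands the logical forms using hypernyms. When used in conjunction with conventional information retrieval index construction and querying, the invention reduces the number of identified occurrences in which a different sense was intended or in which the words bear different relationships to each other, and increases the number of identified occurrences in which different but semantically related terms are used.
The invention overcomes the problems associated with conventional tokenization by parsing both indexed text and query text to perform lexical, syntactic, and semantic analysis of this input text. This parsing process produces one or more logical forms, which identify the words that perform primary roles in the query text and their intended senses, and which further identify the relationships between those words. The parser preferably produces logical forms that relate the deep subject, verb, and deep object of the input text. For example, for the input text "The father is holding the baby", the parser might produce the following logical form:
Deep subject    Verb     Deep object
father          hold     baby
The parser further ascribes to these words the particular senses in which they are used in the input text.
Using a digital dictionary or thesaurus (also known as a "linguistic knowledge base") that identifies, for a particular sense of a word, senses of other words that are generic terms for that sense of the word ("hypernyms"), the invention changes the words within the logical forms produced by the parser to their hypernyms in order to create additional logical forms whose overall meaning is close to the meaning of the original logical forms. For example, based on indications from the dictionary that a sense of "parent" is a hypernym of the ascribed sense of "father", that a sense of "touch" is a hypernym of the ascribed sense of "hold", and that a sense of "child" and a sense of "person" are hypernyms of the ascribed sense of "baby", the invention might create the following additional logical forms:
Deep subject    Verb     Deep object
parent          hold     baby
father          touch    baby
parent          touch    baby
father          hold     child
parent          hold     child
father          touch    child
parent          touch    child
father          hold     person
parent          hold     person
father          touch    person
parent          touch    person
The invention then transforms all of the generated logical forms into tokens intelligible to the information retrieval system that compares the tokenized query to the index, and submits them to that information retrieval system.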
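Figure 1 is an overview data flow diagram depicting the information retrieval process.
Figure 2 is a high-level block diagram of the general-purpose computer system upon which the facility preferably operates.
Figure 3 is an overview flow diagram showing the steps preferably performed by the facility in order to construct and access an index semantically representing the target documents.
Figure 4 is a flow diagram showing the tokenize routine used by the facility to generate tokens for an input sentence.
Figure 5 is a logical form diagram showing a sample logical form.
Figure 6 is an input text diagram showing an input text fragment for which the facility would construct the logical form shown in Figure 5.
Figure 7A is a linguistic knowledge base diagram showing sample hypernym relationships identified by a linguistic knowledge base.
Figure 7B is a linguistic knowledge base diagram showing the selection of hypernyms of the deep subject of the primary logical form, man (sense 2).
Figure 8 is a linguistic knowledge base diagram showing the selection of hypernyms of the verb of the primary logical form, kiss (sense 1).
Figures 9 and 10 are linguistic knowledge base diagrams showing the selection of hypernyms of the deep object of the primary logical form, pig (sense 2).
Figure 11 is a logical form diagram showing the expanded logical form.
Figure 12 shows the derived logical forms created by permuting the expanded primary logical form.
Figure 13 is an index diagram showing sample contents of the index.
Figure 14 is a logical form diagram showing the logical form preferably constructed by the facility for the query "man kissing horse".
Figure 15 shows the expansion of the primary logical form using hypernyms.
Figure 16 is a linguistic knowledge base diagram showing the selection of hypernyms of the deep object of the query logical form, horse (sense 1).
Figure 17 is a partial logical form diagram showing the partial logical form corresponding to a partial query containing only a deep subject and a verb.
Figure 18 is a partial logical form diagram showing the partial logical form corresponding to a partial query containing only a verb and a deep object.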
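The present invention is directed to performing information retrieval utilizing semantic representation of text. When used in conjunction with conventional information retrieval index construction and querying, the invention reduces the number of identified occurrences in which a different sense was intended or in which the words bear different relationships to each other, and increases the number of identified occurrences in which different but semantically related terms are used.
In a preferred embodiment, the conventional tokenizer shown in Figure 1 is replaced with an improved information retrieval tokenization facility ("the facility") that parses input text to identify logical forms, then expands the logical forms using hypernyms. The invention overcomes the problems associated with conventional tokenization by parsing both indexed text and query text to perform lexical, syntactic, and semantic analysis of this input text. This parsing process produces one or more logical forms, which identify the words that perform primary roles in the query text and their intended senses, and which further identify the relationships between those words. The parser preferably produces logical forms that relate the deep subject, verb, and deep object of the input text. For example, for the input text "The father is holding the baby", the parser might produce a logical form indicating that the deep subject is "father", the verb is "hold", and the deep object is "baby". Because transforming input text into a logical form distills the input text to its fundamental meaning by eliminating modifiers and ignoring differences in tense and voice, transforming input text segments into logical forms tends to unify the many different ways in which a natural language may express the same idea. The parser further identifies the particular senses in which these words are used in the input text.
Using a digital dictionary or thesaurus (also known as a "linguistic knowledge base") that identifies, for a particular sense of a word, senses of other words that are generic terms for that sense of the word ("hypernyms"), the invention changes the words within the logical forms produced by the parser to their hypernyms in order to create additional logical forms whose overall meaning is close to the meaning of the original logical forms. The invention then transforms all of the generated logical forms into tokens intelligible to the information retrieval system that compares the tokenized query to the index, and submits them to that information retrieval system.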
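Figure 2 is a high-level block diagram of the general-purpose computer system upon which the facility preferably operates. The computer system 200 contains a central processing unit (CPU) 210, input/output devices 220, and a computer memory (memory) 230. Among the input/output devices is a storage device 221, such as a hard disk drive. The input/output devices also include a computer-readable media drive 222, which can be used to install software products, including the facility, that are provided on a computer-readable medium such as a CD-ROM. The input/output devices further include an Internet connection 223 that enables the computer system 200 to communicate with other computer systems via the Internet. The computer programs that preferably comprise the facility 240 reside in the memory 230 and execute on the CPU 210. The facility 240 includes a rule-based parser 241 for parsing the input text segments to be tokenized in order to produce logical forms. The facility 240 further includes a linguistic knowledge base 242 used by the parser to ascribe sense numbers to the words in the logical form. The facility also uses the linguistic knowledge base to identify hypernyms of the words in the generated logical forms. The memory 230 preferably also contains an index 250 for mapping from tokens generated from the target documents to locations in the target documents. The memory 230 also contains an information retrieval engine ("IR engine") 260 for storing tokens generated from the target documents in the index 250, and for identifying in the index tokens that match tokens generated from queries. While the facility is preferably implemented on a computer system configured as described above, those skilled in the art will recognize that it may also be implemented on computer systems having different configurations.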
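Figure 3 is an overview flow diagram showing the steps preferably performed by the facility in order to construct and access an index semantically representing the target documents. Briefly, the facility first semantically indexes the target documents by converting each sentence or sentence fragment of the target documents into a number of tokens representing an expanded logical form that portrays the relationship between the important words in the sentence and includes hypernyms having similar meanings. The facility stores these "semantic tokens" in the index, along with the location in the target documents where the sentence occurs. After all of the target documents have been indexed, the facility is able to process information retrieval queries against the index. For each such query received, the facility tokenizes the text of the query in the same way it tokenized sentences from the target documents, that is, by converting the sentence into semantic tokens that together represent an expanded logical form for the query text. The facility then compares these semantic tokens to the semantic tokens stored in the index to identify the locations in the target documents for which these semantic tokens have been stored, and ranks the target documents containing these semantic tokens in the order of their relevance to the query. The facility may preferably update the index at any time to include semantic tokens for new target documents.
Referring to Figure 3, in steps 301-304, the facility loops through each sentence in the target documents. In step 302, the facility invokes a routine, shown in Figure 4, to tokenize the sentence.
Figure 4 is a flow diagram showing the tokenize routine used by the facility to generate tokens for an input sentence or other input text segment. In step 401, the facility constructs a primary logical form from the input text segment. As discussed above, a logical form represents the fundamental meaning of a sentence or sentence fragment. The logical forms are produced by applying the parser 241 (Figure 2) to subject the input text segment to syntactic and semantic parsing. For a detailed discussion of the construction of logical forms representing an input text string, refer to U.S. Patent Application No. 08/674,610, which is hereby incorporated by reference.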
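The logical form used by the facility preferably isolates the principal verb of the sentence, the noun that is the real subject of the verb ("deep subject"), and the noun that is the real object of the verb ("deep object"). Figure 5 is a logical form diagram showing a sample primary logical form. The logical form has three elements: a deep subject element 510, a verb element 520, and a deep object element 530. It can be seen that the deep subject of the logical form is sense 2 of the word "man". The sense number indicates, for words having more than one sense, the particular sense ascribed to the word by the parser, as defined by the linguistic knowledge base used by the parser. For example, the word "man" could have a first sense meaning person and a second sense meaning adult male. The verb of the logical form is a first sense of the word "kiss". Finally, the deep object is a second sense of the word "pig". An abbreviated version of this logical form is an ordered triple 550 having as its first element the deep subject, as its second element the verb, and as its third element the deep object:
(man,kiss,pig)
The logical form shown in Figure 5 characterizes a number of different sentences and sentence fragments. For example, Figure 6 is an input text diagram showing an input text fragment for which the facility would construct the logical form shown in Figure 5. Figure 6 shows the input text sentence fragment "man kissing a pig". It can be seen that this phrase occurs at word number 150 of document 5, occupying word positions 150, 151, 152, and 153. When the facility tokenizes this input text fragment, it generates the logical form shown in Figure 5. The facility would also generate the logical form shown in Figure 5 for the following input text segments:
The pig was kissed by an unusual man.
The man will kiss the largest pig.
Many pigs have been kissed by that man.
As discussed above, because transforming input text into a logical form distills the input text to its fundamental meaning by eliminating modifiers and ignoring differences in tense and voice, transforming input text segments into logical forms tends to unify the many different ways in which a natural language may express the same idea.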
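Returning to Figure 4, after the facility has constructed the primary logical form from the input text, such as the logical form shown in Figure 5, the facility continues in step 402 to expand this primary logical form using hypernyms. After step 402, the tokenize routine returns.
As mentioned above, a hypernym is a genus term that has an "is a" relationship with a particular word. For instance, the word "vehicle" is a hypernym of the word "automobile". The facility preferably uses a linguistic knowledge base to identify hypernyms of the words in the primary logical form. Such a linguistic knowledge base typically contains semantic links identifying the hypernyms of a word.
Figure 7A is a linguistic knowledge base diagram showing sample hypernym relationships identified by the linguistic knowledge base. It should be noted that Figure 7A, like the linguistic knowledge base diagrams that follow, has been simplified to facilitate this discussion, and omits information commonly found in linguistic knowledge bases that is not directly relevant to the present discussion. Each ascending arrow in Figure 7A connects a word to its hypernym. For example, there is an arrow connecting the word man (sense 2) 711 to the word person (sense 1) 714, indicating that person (sense 1) is a hypernym of man (sense 2). Conversely, man (sense 2) is said to be a "hyponym" of person (sense 1).
In identifying hypernyms with which to expand the primary logical form, the facility selects one or more hypernyms for each word of the primary logical form based upon the coherency of the hypernym's hyponyms. By selecting hypernyms in this manner, the facility generalizes the meaning of the logical form beyond the meaning of the input text segment, but by a controlled amount. For a particular word of a primary logical form, the facility first selects the immediate hypernym of that word. For example, with reference to Figure 7A, starting with man (sense 2) 711, which occurs in the primary logical form, the facility selects its hypernym, person (sense 1) 714. The facility next determines whether to also select animal (sense 3) 715, the hypernym of person (sense 1) 714, based on whether person (sense 1) 714 has a coherent hyponym set with respect to the starting word man (sense 2) 711. Person (sense 1) 714 has a coherent hyponym set with respect to man (sense 2) 711 if a large proportion of the hyponyms of all senses of the word person, other than the starting word man (sense 2) 711, bear at least a threshold level of similarity to the starting word man (sense 2) 711.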
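In order to determine the level of similarity between the hyponyms of the different senses of the hypernym, the facility preferably consults the linguistic knowledge base to obtain similarity weights indicating the degree of similarity between these word senses. Figure 7B is a linguistic knowledge base diagram showing the similarity weights between man (sense 2) and the other hyponyms of person (sense 1) and person (sense 5). The diagram shows that the similarity weight between man (sense 2) and woman (sense 1) is ".0075"; between man (sense 2) and child (sense 1), ".0029"; between man (sense 2) and villain (sense 1), ".0003"; and between man (sense 2) and lead (sense 7), ".0002". These similarity weights are preferably calculated by the linguistic knowledge base based on the network of semantic relations that it maintains between word sense pairs. For a detailed discussion of calculating similarity weights between word sense pairs using a linguistic knowledge base, refer to U.S. Patent Application No. (patent attorney docket no. 661005.524), entitled "Determining Similarity Between Words", which is hereby incorporated by reference.
In order to determine whether the set of hyponyms is coherent based on these similarity weights, the facility determines whether a threshold percentage of the similarity weights meet or exceed a threshold similarity weight. While the preferred threshold percentage is 90%, the threshold percentage may preferably be adjusted in order to optimize the performance of the facility. The threshold similarity weight may likewise be configured to optimize the performance of the facility, and is preferably coordinated with the overall distribution of similarity weights provided by the linguistic knowledge base. Here, the use of a threshold of ".0015" is shown. The facility therefore determines whether at least 90% of the similarity weights between the starting word and the other hyponyms of all of the senses of the hypernym are at or above the ".0015" threshold similarity weight. It can be seen from Figure 7B that this condition is not satisfied by the hyponyms of person with respect to man (sense 2): while the similarity weights between man (sense 2) and woman (sense 1) and between man (sense 2) and child (sense 1) are greater than ".0015", the similarity weights between man (sense 2) and villain (sense 1) and between man (sense 2) and lead (sense 7) are less than ".0015". The facility therefore does not select the further hypernym animal (sense 3) 715, or any hypernym of animal (sense 3). As a result, only the hypernym person (sense 1) 714 is selected to expand the primary logical form.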
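The coherency test described above can be sketched as follows. The four similarity weights are the ones quoted for Figure 7B, and the 90% and .0015 thresholds come from the text; the function name `is_coherent`, the "word#sense" labels, and the decision to treat an empty hyponym set as not coherent are assumptions made for this illustration.

```python
def is_coherent(weights, weight_threshold=0.0015, fraction_threshold=0.90):
    """Return True if at least `fraction_threshold` of the similarity
    weights are at or above `weight_threshold`.

    An empty set of weights is treated as not coherent (an assumption;
    the text does not discuss that case).
    """
    weights = list(weights)
    if not weights:
        return False
    passing = sum(1 for weight in weights if weight >= weight_threshold)
    return passing / len(weights) >= fraction_threshold


# Similarity weights between man (sense 2) and the other hyponyms of
# person, as quoted for Figure 7B.
man_vs_person_hyponyms = {
    "woman#1": 0.0075,
    "child#1": 0.0029,
    "villain#1": 0.0003,
    "lead#7": 0.0002,
}

# Only 2 of the 4 weights reach .0015, which is below the 90% threshold,
# so the hyponym set of person is not coherent with respect to man (sense 2).
print(is_coherent(man_vs_person_hyponyms.values()))
```

Both thresholds are exposed as parameters because the text describes them as tunable.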
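To expand a primary logical form, the facility also selects hypernyms of the verb and deep object of the primary logical form. Figure 8 is a linguistic knowledge base diagram showing the selection of hypernyms of the verb of the primary logical form, kiss (sense 1). It can be seen from the diagram that touch (sense 2) is the hypernym of kiss (sense 1). The diagram also shows the similarity weights between kiss (sense 1) and the other hyponyms of all of the senses of touch. The facility first selects touch (sense 2), the immediate hypernym of the verb of the primary logical form, kiss (sense 1). To determine whether to select interact (sense 9), the hypernym of touch (sense 2), the facility determines how many of the similarity weights between kiss (sense 1) and the other hyponyms of all of the senses of touch are at least as large as the threshold similarity weight. Because only two of these four similarity weights are at least as large as the ".0015" threshold similarity weight, the facility does not select interact (sense 9), the hypernym of touch (sense 2).
Figures 9 and 10 are linguistic knowledge base diagrams showing the selection of hypernyms of the deep object of the primary logical form, pig (sense 2). It can be seen from Figure 9 that the facility selects swine (sense 1), the hypernym of pig (sense 2), and animal (sense 3), the hypernym of swine (sense 1), to expand the primary logical form, because more than 90% (in fact, 100%) of the hyponyms of the only sense of swine have similarity weights at or above the ".0015" threshold similarity weight. It can be seen from Figure 10 that the facility does not go on to select organism (sense 1), the hypernym of animal (sense 3), because fewer than 90% (in fact, 25%) of the hyponyms of the senses of animal have similarity weights at or above the ".0015" threshold similarity weight.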
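The walk up the hypernym chain described above might be sketched as follows. The hypernym links and the passing fractions (50%, 50%, 100%, 25%) are taken from the description of Figures 7A through 10; the dictionary layout, the "word#sense" notation, and the function name `select_hypernyms` are assumptions made for this illustration rather than the facility's actual data structures.

```python
# Immediate hypernym of each word sense, as described for Figures 7A-10.
HYPERNYM = {
    "man#2": "person#1", "person#1": "animal#3",
    "kiss#1": "touch#2", "touch#2": "interact#9",
    "pig#2": "swine#1", "swine#1": "animal#3", "animal#3": "organism#1",
}

# Fraction of a hypernym's other hyponyms whose similarity weight to the
# starting word is at or above .0015, keyed by (starting word, hypernym).
PASSING_FRACTION = {
    ("man#2", "person#1"): 0.50,   # 2 of 4 weights (Figure 7B)
    ("kiss#1", "touch#2"): 0.50,   # 2 of 4 weights (Figure 8)
    ("pig#2", "swine#1"): 1.00,    # Figure 9
    ("pig#2", "animal#3"): 0.25,   # Figure 10
}


def select_hypernyms(start, fraction_threshold=0.90):
    """Select the immediate hypernym, then keep climbing only while the
    hypernym just selected has a coherent hyponym set for `start`."""
    selected = []
    current = HYPERNYM.get(start)
    while current is not None:
        selected.append(current)
        if PASSING_FRACTION.get((start, current), 0.0) < fraction_threshold:
            break
        current = HYPERNYM.get(current)
    return selected


for word in ("man#2", "kiss#1", "pig#2"):
    print(word, select_hypernyms(word))
```

Under these inputs the sketch selects person for man, touch for kiss, and swine and animal for pig, matching the selections described in the text.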
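Figure 11 is a logical form diagram showing the expanded logical form. It can be seen from Figure 11 that the deep subject element 1110 of the expanded logical form contains the hypernym person (sense 1) in addition to the word man (sense 2) 1111. It can be seen that the verb element 1120 contains the hypernym touch (sense 2) 1112 as well as the word kiss (sense 1) 1121. It can further be seen that the deep object element of the expanded logical form contains the hypernyms swine (sense 1) and animal (sense 3) 1132 in addition to the word pig (sense 2) 1131.
By substituting hypernyms for the original words in each element of the expanded logical form, the facility can create a reasonably large number of derived logical forms that are reasonably close in meaning to the primary logical form. Figure 12 shows the derived logical forms created by permuting the expanded primary logical form. It can be seen from Figure 12 that this permutation creates eleven derived logical forms, each of which characterizes the meaning of the input text in a reasonably accurate way. For example, the derived logical form shown in Figure 12
(person,touch,pig)
is very close in meaning to the sentence fragment
man kissing a pig
The expanded logical form shown in Figure 11 represents the primary logical form plus these eleven derived logical forms, which are expressed more compactly as expanded logical form 1200: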
((man OR person),(kiss OR touch),(pig OR swine OR animal))
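The relationship between the compact expanded logical form and the eleven derived logical forms can be illustrated with a Cartesian product. This is only a sketch; representing the expanded logical form as a tuple of tuples, with the original word listed first in each element, is an assumption made for the example.

```python
from itertools import product

# Expanded logical form: each element lists the original word first,
# followed by the hypernyms selected for it.
expanded = (
    ("man", "person"),          # deep subject
    ("kiss", "touch"),          # verb
    ("pig", "swine", "animal"), # deep object
)

primary = tuple(alternatives[0] for alternatives in expanded)

# Every combination of one word per element; dropping the primary
# logical form leaves the derived logical forms.
all_forms = list(product(*expanded))
derived = [form for form in all_forms if form != primary]

print(primary)
print(len(derived))
for form in derived:
    print(form)
```

With two alternatives for the deep subject, two for the verb, and three for the deep object there are twelve combinations, one of which is the primary logical form itself.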
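The facility generates logical tokens from this expanded logical form in a manner that allows them to be processed by a conventional information retrieval engine. First, the facility appends to each word in the expanded logical form a reserved character that identifies whether the word occurred in the input text segment as a deep subject, verb, or deep object. This ensures that, when the word "man" occurs as a deep subject in the expanded logical form for a query input text segment, it will not match the word "man" stored in the index as part of an expanded logical form in which it occurred as the verb. A sample mapping of reserved characters to logical form elements is as follows:
Logical form element    Identifying character
deep subject            _
verb                    ^
deep object             #
Using this sample mapping of reserved characters, the tokens generated for the logical form "(man,kiss,pig)" would include "man_", "kiss^", and "pig#".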
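A minimal sketch of this token generation, using the sample mapping of reserved characters given above. The role names, the dictionary `ROLE_SUFFIX`, and the function `tokens_for` are hypothetical names introduced for the illustration.

```python
# Sample mapping of logical form elements to reserved characters.
ROLE_SUFFIX = {"deep_subject": "_", "verb": "^", "deep_object": "#"}
ROLES = ("deep_subject", "verb", "deep_object")


def tokens_for(logical_form):
    """Append the reserved character for each role to the word filling it.

    `logical_form` is an ordered triple (deep subject, verb, deep object).
    """
    return [word + ROLE_SUFFIX[role] for role, word in zip(ROLES, logical_form)]


print(tokens_for(("man", "kiss", "pig")))  # ['man_', 'kiss^', 'pig#']
```

Because the role is part of the token itself, a conventional engine that only compares token strings will never confuse "man" as a deep subject with "man" as a verb.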
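Indices generated by conventional information retrieval engines commonly map each token to the particular locations in the target documents at which the token occurs. Conventional information retrieval engines may, for example, represent such target document locations using a document number, identifying the target document containing the occurrence of the token, and a word number, identifying the position of the occurrence of the token in that target document. Such target document locations allow a conventional information retrieval engine to identify words that occur together in a target document in response to a query using a "PHRASE" operator, which requires the words that it joins to be adjacent in the target document. For example, the query "red PHRASE bicycle" would match occurrences of "red" at document 5, word 611 and "bicycle" at document 5, word 612, but would not match occurrences of "red" at document 7, word 762 and "bicycle" at document 7, word 202. Storing target document locations in an index further allows conventional information retrieval engines to identify, in response to a query, the points at which queried tokens occur in the target documents.
For expanded logical forms from a target document input text segment, the facility preferably similarly assigns artificial target document locations to each token, even though the tokens of the expanded logical form do not actually occur in the target document at these locations. Assigning these target document locations both (A) enables conventional search engines to identify combinations of semantic tokens corresponding to a single primary or derived logical form using the PHRASE operator, and (B) enables the facility to relate the assigned locations to the actual location of the input text fragment in the target document. The facility therefore assigns locations to semantic tokens as follows:
Logical form element    Location
deep subject            (location of first word of input text segment)
verb                    (location of first word of input text segment) + 1
deep object             (location of first word of input text segment) + 2
The facility would therefore assign target document locations as follows to the tokens of the expanded logical form for "(man,kiss,pig)", derived from a sentence beginning at document 5, word 150: "man_" and "person_": document 5, word 150; "kiss^" and "touch^": document 5, word 151; and "pig#", "swine#", and "animal#": document 5, word 152.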
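The position assignment described above might look like the following sketch. The generator name `postings`, the tuple layout of the expanded logical form, and the use of a `defaultdict` as a stand-in index are assumptions made for the illustration; only the offsets 0, +1, +2 and the document 5, word 150 example come from the text.

```python
from collections import defaultdict

SUFFIXES = ("_", "^", "#")  # deep subject, verb, deep object

expanded = (
    ("man", "person"),
    ("kiss", "touch"),
    ("pig", "swine", "animal"),
)


def postings(expanded_form, doc, first_word):
    """Yield (token, (doc, word)) pairs, placing the deep subject at the
    first word of the segment, the verb one word later, and the deep
    object two words later."""
    for offset, (alternatives, suffix) in enumerate(zip(expanded_form, SUFFIXES)):
        for word in alternatives:
            yield word + suffix, (doc, first_word + offset)


# Toy index mapping each token to the locations assigned to it.
index = defaultdict(list)
for token, location in postings(expanded, doc=5, first_word=150):
    index[token].append(location)

for token, locations in index.items():
    print(token, locations)
```

All alternatives for one element share a location, so any single derived logical form appears to the engine as three adjacent tokens.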
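Returning to Figure 3, in step 303, the facility stores the tokens created by the tokenize routine in the index together with the locations at which they occur. Figure 13 is an index diagram showing sample contents of the index. The index maps from each token to the identity of the document and the location in the document at which the token occurs. It should be noted that, while the index is shown as a table in order to show its mappings more clearly, the index is actually preferably stored in one of a number of other forms that support more efficient location of a token in the index, such as a tree form. Further, the contents of the index are preferably compressed, using a technique such as prefix compression, to minimize the size of the index.
It can be seen that, in accordance with step 303, the facility has stored mappings in the index 1300 for each of the words in the expanded logical form. Mappings have been stored in the index from the deep subject words "man" and "person" to the target document location at document number 5, word number 150. Word number 150 is the word position at which the input text segment shown in Figure 6 begins. It can be seen that the facility has appended the reserved character "_" to the tokens corresponding to the deep subject words. By appending this reserved character, the facility is able, when later searching the index, to retrieve instances of these words that occur as the deep subject of a logical form without retrieving occurrences of these words as the verb or deep object of a logical form. Similarly, the index contains tokens for the verb words "kiss" and "touch". The entries for these verb words map them to the target document location at document number 5, word number 151, one word after the target document location of the deep subject words. It can further be seen that the reserved character "^" has been appended to the tokens for these verb words, so that occurrences of these words do not later appear to be occurrences of deep subject or deep object elements. Likewise, the index contains tokens for the deep object words "animal", "pig", and "swine", mapping them to the target document location at document number 5, word number 152, two words after the target document location at which the phrase begins. The reserved character "#" is appended to the tokens for the deep object words to identify them as deep objects in the index. With the index in the condition shown, the input text fragment shown in Figure 6 can be found by searching the index for any of the derived logical forms shown in Figure 12.
In a preferred embodiment in which the facility stores, in the same index, both a mapping of the words literally occurring in the target documents to their actual locations in the target documents and the semantic representation of the target documents, the word number values of the semantic tokens of the semantic representation are preferably incremented by a constant larger than the number of words in any document, in order to distinguish the semantic tokens of the semantic representation from literal tokens when the index is accessed. To simplify Figure 13, the addition of this constant is not shown.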
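The offset scheme described above can be sketched in a few lines. The value of `SEMANTIC_OFFSET` and the helper names are hypothetical; the text only requires that the constant exceed the number of words in any document.

```python
# Assumed to be larger than the word count of any target document.
SEMANTIC_OFFSET = 1_000_000


def semantic_word_number(word_number):
    """Word number under which a semantic token is stored in the shared index."""
    return word_number + SEMANTIC_OFFSET


def is_semantic(stored_word_number):
    """True if a stored word number belongs to a semantic token."""
    return stored_word_number >= SEMANTIC_OFFSET


def actual_word_number(stored_word_number):
    """Recover the position in the document from a stored word number."""
    if is_semantic(stored_word_number):
        return stored_word_number - SEMANTIC_OFFSET
    return stored_word_number


stored = semantic_word_number(150)
print(stored, is_semantic(stored), actual_word_number(stored))
print(is_semantic(150), actual_word_number(150))
```

Adjacent semantic tokens remain adjacent after the shift, so a PHRASE query over semantic tokens behaves the same as before while never matching literal tokens.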
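In the example, the facility adds a token for each of the words in the expanded logical form to the index to form the semantic representation of the target documents. In a preferred embodiment, however, the facility limits the set of expanded logical form tokens that it adds to the index to those that are likely to be effective at distinguishing between the documents among the target documents. To so limit the set of expanded logical form tokens added to the index, the facility preferably determines the inverse document frequency of each token, the formula for which is shown below as equation (1). In this embodiment, the facility adds to the index only those tokens whose inverse document frequency exceeds a minimum threshold.
Returning to Figure 3, after storing the tokens in the index for the current sentence of the target document, in step 304, the facility loops back to step 301 to process the next sentence in the target documents. When all of the sentences of the target documents have been processed, the facility continues in step 305. In step 305, the facility receives the text of a query. In steps 306-308, the facility processes the received query. In step 306, the facility invokes the tokenize routine to tokenize the query text. Figure 14 is a logical form diagram showing the logical form preferably constructed by the facility for the query "man kissing horse" in accordance with step 401 (Figure 4). It can be seen from the logical form diagram that the deep subject is man (sense 2), the verb is kiss (sense 1), and the deep object is horse (sense 1). This primary logical form is represented more succinctly as primary logical form 1450: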
(man,kiss,horse)
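Figure 15 shows the expansion of the primary logical form using hypernyms in accordance with step 402 (Figure 4). It can be seen from Figure 15 that, as with the sample input text from the target document, the deep subject man (sense 2) has been expanded with the hypernym person (sense 1), and the verb kiss (sense 1) has been expanded with the hypernym touch (sense 2). It can further be seen that the deep object horse (sense 1) has been expanded with the hypernym animal (sense 3).
Figure 16 is a linguistic knowledge base diagram showing the selection of hypernyms of the deep object of the query logical form, horse (sense 1). It can be seen from Figure 16 that the facility does not select organism (sense 1), the hypernym of animal (sense 3), because fewer than 90% of the hyponyms of animal (sense 3) have similarity weights at or above the ".0015" threshold similarity weight. The facility therefore uses only the hypernym animal (sense 3) to expand the logical form.
Returning to Figure 3, in step 307, the facility uses the expanded logical form 1550 (Figure 15), constructed using hypernyms of the word senses in the primary logical form, to retrieve from the index the locations in the target documents at which matching tokens occur. The facility preferably does so by issuing the following query against the index:
(man_ OR person_) PHRASE (kiss^ OR touch^) PHRASE (horse# OR animal#)
The PHRASE operator matches occurrences in which the word position of the operand following it is 1 greater than the word position of the operand preceding it. The query therefore matches where the deep subject man_ or person_ precedes the verb kiss^ or touch^, which in turn precedes the deep object horse# or animal#. It can be seen from the index in Figure 13 that this query is satisfied at document number 5, word number 150.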
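How such a query could be evaluated against the sample index can be sketched as follows. The dictionary below mirrors the contents described for Figure 13; the functions `locations` and `phrase` are hypothetical stand-ins for the OR and PHRASE operators of a conventional engine, not the interface of any real one.

```python
# Sample index contents as described for Figure 13: token -> [(doc, word)].
index = {
    "man_": [(5, 150)], "person_": [(5, 150)],
    "kiss^": [(5, 151)], "touch^": [(5, 151)],
    "pig#": [(5, 152)], "swine#": [(5, 152)], "animal#": [(5, 152)],
}


def locations(alternatives):
    """OR: the union of the locations of all alternative tokens."""
    found = set()
    for token in alternatives:
        found.update(index.get(token, []))
    return found


def phrase(*groups):
    """PHRASE: each group must occur exactly one word after the group
    before it. Returns the locations at which the first group occurs."""
    starts = locations(groups[0])
    for offset, group in enumerate(groups[1:], start=1):
        later = locations(group)
        starts = {(doc, word) for doc, word in starts
                  if (doc, word + offset) in later}
    return sorted(starts)


full_query = phrase(("man_", "person_"), ("kiss^", "touch^"), ("horse#", "animal#"))
print(full_query)  # [(5, 150)]
```

The query for "man kissing horse" matches the indexed fragment "man kissing a pig" because both expansions contain the token "animal#" at the deep object position.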
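If this query were not satisfied in the index, the facility would continue by submitting the query in two different partial forms. The first partial form contains only the deep subject and the verb, and not the deep object:
(man_ OR person_) PHRASE (kiss^ OR touch^)
Figure 17 is a partial logical form diagram showing the partial logical form corresponding to this first partial query. The second partial form of the query contains the verb and the deep object, but not the deep subject:
(kiss^ OR touch^) PHRASE (horse# OR animal#)
Figure 18 is a partial logical form diagram showing the partial logical form corresponding to this second partial query. These partial queries would match logical forms in the index having a different deep subject or deep object, and would match partial logical forms lacking a deep subject or deep object. These partial queries allow for differences between the query input text segment and the target document input text segments, including the use of pronouns and implied deep subjects and deep objects.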
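Returning to Figure 3, after identifying matches of tokens in the index, the facility continues in step 308 to rank, in the order of their relevance to the query, the target documents in which particular combinations of matching tokens corresponding to a primary or derived logical form occur. In various embodiments of the invention, the facility employs one or more of a number of well-known approaches to ranking documents by relevance, which include Jaccard weighting and binary term independence weighting. The facility preferably uses a combination of inverse document frequency weighting and term frequency weighting to rank the matching target documents.
The inverse document frequency weight characterizes a token combination's ability to distinguish between documents, giving greater weight to a token combination that occurs in fewer of the target documents. For example, for a group of target documents directed to the subject of photography, the logical form
(photographer,frame,subject)
could occur in every document of the group, and thus would not be a very good basis for distinguishing between documents. Because the above logical form occurs in every target document, it has a relatively small inverse document frequency. The formula for the inverse document frequency of a token combination is as follows:
The term frequency weight of a token combination in a document measures the extent to which the document is dedicated to the token combination, and assumes that a document in which a particular query token combination occurs many times is more relevant than a document in which the query token combination occurs fewer times. The formula for the term frequency weight of a token combination in a document is as follows:
Term Frequency (token combination, document) = number of times the token combination occurs in the document   (2)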
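The facility uses a score for each matching document to rank the documents. The facility first calculates a score for each matching token combination in each document, using the following formula:
Score (token combination, document)
= Inverse Document Frequency (token combination) × Term Frequency (token combination, document)   (3)
The facility then calculates the score for each matching document by choosing the highest score of any matching token combination in that document, in accordance with the following formula:
Score (document) = maximum, over the token combinations matching in the document, of Score (token combination, document)   (4)
Once the facility has calculated the score for each document, the facility may augment these scores to reflect terms of the query other than those directed to semantic matching. After augmenting the score for each document, if desired, the facility calculates a normalized score for each document by taking the size of the document into account in accordance with the following formula:
The Size (document) term may be any reasonable measure of the size of a document, for example the number of characters, words, sentences, or sentence fragments in the document. The document score may alternatively be normalized using a number of other normalization techniques, including cosine measure normalization, sum of term weights normalization, and maximum term weights normalization.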
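The scoring steps above can be sketched as follows. Formulas (2) and (3) are taken from the text, and the document score follows the stated rule of taking the highest token combination score. Two parts are assumptions, since the corresponding formulas are not reproduced in this text: the logarithmic form of the inverse document frequency (a common convention) and normalization by dividing the score by the document size. All function names are hypothetical.

```python
import math


def inverse_document_frequency(total_documents, documents_containing):
    """ASSUMPTION: the common logarithmic form log(N / n). It is small for
    combinations occurring in most documents and zero when the combination
    occurs in every document."""
    return math.log(total_documents / documents_containing)


def combination_score(idf, term_frequency):
    """Formula (3): inverse document frequency times term frequency."""
    return idf * term_frequency


def document_score(term_frequencies, idf_by_combination, size=None):
    """Highest score of any matching token combination in the document.

    `term_frequencies` maps each matching token combination to the number
    of times it occurs in the document (formula (2)). If `size` is given,
    the score is divided by it (ASSUMED form of the normalization).
    """
    best = max(
        combination_score(idf_by_combination[combination], frequency)
        for combination, frequency in term_frequencies.items()
    )
    return best / size if size else best


# 100 target documents: combination A occurs in all of them, B in 10.
idf = {
    "A": inverse_document_frequency(100, 100),
    "B": inverse_document_frequency(100, 10),
}
doc = {"A": 3, "B": 2}

print(document_score(doc, idf))
print(document_score(doc, idf, size=50))
```

In this toy data the combination that occurs in every document contributes nothing to the score, which reflects the photography example given earlier.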
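After calculating a normalized score for each matching document, the facility ranks the matching documents in the order of their normalized scores. A user may preferably select one of the matching documents from the ranked list to obtain the location of the matching tokens in that document, or to display the matching portion of that document.
Returning to Figure 3, after ranking the matching target documents in step 308, the facility preferably continues at step 305 to receive the text of the next query against the index.
The above discusses ranking by relevance the documents containing matching tokens. Additional preferred embodiments of the invention similarly rank by relevance the document groups and the document sections, respectively, that contain matches. For target documents that are organized into document groups each containing one or more documents, the facility preferably ranks by relevance the document groups in which matches occur, in order to identify the most relevant document groups for further querying. Further, the facility is preferably configurable to divide each target document into sections and to rank the relevance of the document sections in which matches occur. These document sections may be identified contiguously within a target document either by selecting a certain number of bytes, words, or sentences, or by using structural, formatting, or linguistic cues occurring in the target document. The facility may preferably also identify non-contiguous document sections dealing with particular themes.
While the present invention has been shown and described with reference to preferred embodiments, it will be understood by those skilled in the art that various changes or modifications in form and detail may be made without departing from the scope of the invention. For example, the tokenizer may be straightforwardly adapted to produce and store in the index tokens that each correspond to a complete logical form construction, instead of tokens that each correspond to one word of a logical form construction. Also, various well-known techniques may be applied to incorporate other types of searching in a query having a semantic matching component. Further, a query may contain a number of semantic matching components. In addition, semantic relationships identified between words, other than hypernyms, may be used to expand the primary logical form. The facility may also use precompiled lists of substitutable words for each word in a primary logical form to expand the primary logical form, rather than generating lists of hypernyms from a linguistic knowledge base at runtime as described above. Further, for additional matching precision, the tokenizer may encode in the token for a word the sense number identified for that word. In this case, the test for coherency of the hyponym set is reduced, in that similarity need not be tested for all senses of the selected hypernym. In the example, only the hyponyms of sense 1 of the word person need to bear a threshold level of similarity to the starting sense of the word man (sense 2). Because the possible matching terms in the index are less ambiguous, the set of terms that might produce false hits can be constrained. For this reason, it is only necessary to test those senses that have a hypernym relation to the word in the logical form.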
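Claims (17)
1. A method in a computer system for generating information retrieval tokens from an input string, the method comprising the steps of:
creating from the input string a primary logical form characterizing a semantic relationship between selected words in the input string;
identifying hypernyms of the selected words in the input string;
constructing from the primary logical form one or more alternative logical forms, each alternative logical form being constructed by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with a hypernym identified for that selected word; and
generating tokens representing both the primary logical form and the alternative logical forms, the generated tokens being distinguishable by an information retrieval engine.
2. The method of claim 1, wherein the constructing step includes the step of parsing the input string to discern its syntactic and semantic structure.
3. The method of claim 1, wherein the identifying step includes the steps of:
for each selected word in the input string:
retrieving from a linguistic knowledge base one or more hypernyms of the selected word, each hypernym having a similarity value characterizing the similarity in meaning of the hypernym to the selected word; and
identifying all hypernyms whose similarity value exceeds a preestablished threshold.
4. The method of claim 1, further comprising the steps of:
before the constructing step, selecting the input string from a search query; and
submitting the generated tokens to a query engine for comparison to a representation of one or more target documents.
5. The method of claim 1, further comprising the steps of:
before the constructing step, selecting the input string from a body of text to be indexed; and
submitting the generated tokens to an indexing subsystem for storage in an index representing the body of text.
6. The method of claim 5, further comprising the step of determining the inverse document frequency of each word occurring in the alternative logical forms, and wherein the submitting step does not submit to the indexing subsystem tokens representing alternative logical forms that contain a word whose inverse document frequency is smaller than a predetermined minimum inverse document frequency.
7. The method of claim 5, further comprising the steps of:
after the submitting step, determining the inverse document frequency of each word occurring in the alternative logical forms; and
removing from the index tokens representing alternative logical forms that contain a word whose inverse document frequency is smaller than a predetermined minimum inverse document frequency.
8. The method of claim 1, wherein the identifying step identifies hypernyms of the selected words that have a coherent hyponym set with respect to the selected word.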
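9. A computer-readable medium whose contents cause a computer system to generate information retrieval tokens from an input string by performing the steps of:
creating from the input string a primary logical form characterizing a semantic relationship between selected words in the input string;
identifying hypernyms of the selected words in the input string;
constructing from the primary logical form one or more alternative logical forms, each alternative logical form being constructed by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with a hypernym identified for that selected word; and
generating tokens representing both the primary logical form and the alternative logical forms, the generated tokens being distinguishable by an information retrieval engine.
10. The computer-readable medium of claim 9, wherein the constructing step includes the step of parsing the input string to discern its syntactic and semantic structure.
11. The computer-readable medium of claim 9, wherein the identifying step includes the steps of:
for each selected word in the input string:
retrieving from a linguistic knowledge base one or more hypernyms of the selected word, each hypernym having a similarity value characterizing the similarity in meaning of the hypernym to the selected word; and
identifying all hypernyms whose similarity value exceeds a preestablished threshold.
12. The computer-readable medium of claim 9, wherein the contents of the computer-readable medium further cause the computer system to perform the steps of:
before the constructing step, selecting the input string from a search query; and
submitting the generated tokens to a query engine for comparison to a representation of one or more target documents.
13. The computer-readable medium of claim 9, wherein the contents of the computer-readable medium further cause the computer system to perform the steps of:
before the constructing step, selecting the input string from a body of text to be indexed; and
submitting the generated tokens to an indexing subsystem for storage in an index representing the body of text.
14. A computer memory containing a document indexing data structure characterizing the contents of one or more target documents, the document indexing data structure mapping words to locations in the target documents, the document indexing data structure mapping, for each of a plurality of passages of words occurring in the target documents, from each of the words contained in a logical form generated from the passage to a location corresponding to the passage, and from hypernyms of the words contained in the logical form generated from the passage to a location corresponding to the passage, such that the document indexing data structure may be used, in response to receiving a query, to identify the locations of passages in the target documents that are semantically similar to a passage of the query.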
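15. The computer memory of claim 14, wherein the document indexing data structure maps at least one word that does not occur in any of the target documents to a location in a target document.
16. A computer system for responding to queries, each query containing a passage of words, against one or more target documents each containing one or more passages of words, each target document passage having a location within the target documents, the computer system comprising:
a target document receiver for receiving the target documents;
a query receiver for receiving queries against the target documents;
a tokenizer for generating tokens from the passages of the target documents received by the target document receiver and from the queries received by the query receiver, the tokenizer including a logical form synthesizer for synthesizing from each passage a logical form characterizing the semantic structure of the passage, the tokenizer generating tokens representing the logical forms synthesized from the passages;
an index memory for storing a relation that maps from each token generated from a target document passage to the location in the target documents of the target document passage from which the token was generated; and
a query processing subsystem for identifying in the index memory, for each query, a token matching a token generated from the query, and for returning an indication of the location mapped to from the identified token.
17. The computer system of claim 16, wherein the logical forms synthesized by the logical form synthesizer contain words, and wherein the tokenizer further includes:
a hypernym expansion subsystem for creating, from a logical form generated by the logical form synthesizer, one or more additional logical forms in which one or more words of the logical form are replaced with hypernyms, the tokenizer further generating tokens representing the additional logical forms created by the hypernym expansion subsystem.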
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/886,814 | 1997-03-07 | ||
US08/886,814 US6076051A (en) | 1997-03-07 | 1997-03-07 | Information retrieval utilizing semantic representation of text |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1252876A (zh) | 2000-05-10 |
Family
ID=25389830
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN98804175A Pending CN1252876A (zh) | 1997-03-07 | 1998-02-11 | Information retrieval utilizing semantic representation of text |
Country Status (5)
Country | Link |
---|---|
US (5) | US6076051A (zh) |
EP (1) | EP0965089B1 (zh) |
JP (1) | JP4282769B2 (zh) |
CN (1) | CN1252876A (zh) |
WO (1) | WO1998039714A1 (zh) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1333362C (zh) * | 2001-03-26 | 2007-08-22 | 美国网上搜索公司 | 用于智能数据同化的方法和装置 |
US7630879B2 (en) | 2002-09-13 | 2009-12-08 | Fuji Xerox Co., Ltd. | Text sentence comparing apparatus |
US8065307B2 (en) | 2006-12-20 | 2011-11-22 | Microsoft Corporation | Parsing, analysis and scoring of document content |
CN101508188B (zh) * | 2009-03-24 | 2012-09-26 | 北京市城南橡塑技术研究所 | 抗冲击复合衬板 |
CN105512291A (zh) * | 2006-02-28 | 2016-04-20 | 贝宝公司 | 用于扩展数据库搜索查询的方法和系统 |
CN106598722A (zh) * | 2015-10-19 | 2017-04-26 | 上海引跑信息科技有限公司 | 一种在文本信息检索服务中支持分布式事务管理的方法 |
CN110088754A (zh) * | 2016-10-26 | 2019-08-02 | 联邦科学和工业研究组织 | 立法到逻辑的自动编码器 |
CN114969262A (zh) * | 2022-05-31 | 2022-08-30 | 云知声智能科技股份有限公司 | 文本处理方法、装置、存储介质及电子装置 |
Families Citing this family (588)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7089218B1 (en) | 2004-01-06 | 2006-08-08 | Neuric Technologies, Llc | Method for inclusion of psychological temperament in an electronic emulation of the human brain |
US8725493B2 (en) * | 2004-01-06 | 2014-05-13 | Neuric Llc | Natural language parsing method to provide conceptual flow |
US6076051A (en) * | 1997-03-07 | 2000-06-13 | Microsoft Corporation | Information retrieval utilizing semantic representation of text |
US5933822A (en) * | 1997-07-22 | 1999-08-03 | Microsoft Corporation | Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision |
US6243670B1 (en) * | 1998-09-02 | 2001-06-05 | Nippon Telegraph And Telephone Corporation | Method, apparatus, and computer readable medium for performing semantic analysis and generating a semantic structure having linked frames |
US6167370A (en) * | 1998-09-09 | 2000-12-26 | Invention Machine Corporation | Document semantic analysis/selection with knowledge creativity capability utilizing subject-action-object (SAO) structures |
GB9821969D0 (en) * | 1998-10-08 | 1998-12-02 | Canon Kk | Apparatus and method for processing natural language |
US6964011B1 (en) * | 1998-11-26 | 2005-11-08 | Canon Kabushiki Kaisha | Document type definition generating method and apparatus, and storage medium for storing program |
US6233547B1 (en) * | 1998-12-08 | 2001-05-15 | Eastman Kodak Company | Computer program product for retrieving multi-media objects using a natural language having a pronoun |
WO2000034845A2 (en) * | 1998-12-08 | 2000-06-15 | Mediadna, Inc. | A system and method of obfuscating data |
US6993580B2 (en) * | 1999-01-25 | 2006-01-31 | Airclic Inc. | Method and system for sharing end user information on network |
GB9904662D0 (en) * | 1999-03-01 | 1999-04-21 | Canon Kk | Natural language search method and apparatus |
CA2272739C (en) * | 1999-05-25 | 2003-10-07 | Suhayya Abu-Hakima | Apparatus and method for interpreting and intelligently managing electronic messages |
US6901402B1 (en) * | 1999-06-18 | 2005-05-31 | Microsoft Corporation | System for improving the performance of information retrieval-type tasks by identifying the relations of constituents |
US20060116865A1 (en) | 1999-09-17 | 2006-06-01 | Www.Uniscape.Com | E-services translation utilizing machine translation and translation memory |
US6816857B1 (en) | 1999-11-01 | 2004-11-09 | Applied Semantics, Inc. | Meaning-based advertising and document relevance determination |
US9076448B2 (en) | 1999-11-12 | 2015-07-07 | Nuance Communications, Inc. | Distributed real time speech recognition system |
US7725307B2 (en) | 1999-11-12 | 2010-05-25 | Phoenix Solutions, Inc. | Query engine for processing voice based queries including semantic decoding |
US7392185B2 (en) | 1999-11-12 | 2008-06-24 | Phoenix Solutions, Inc. | Speech based learning/training system using semantic decoding |
US7050977B1 (en) | 1999-11-12 | 2006-05-23 | Phoenix Solutions, Inc. | Speech-enabled server for internet website and method |
US8793160B2 (en) | 1999-12-07 | 2014-07-29 | Steve Sorem | System and method for processing transactions |
US6823492B1 (en) * | 2000-01-06 | 2004-11-23 | Sun Microsystems, Inc. | Method and apparatus for creating an index for a structured document based on a stylesheet |
US6751621B1 (en) | 2000-01-27 | 2004-06-15 | Manning & Napier Information Services, Llc. | Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors |
GB0006159D0 (en) * | 2000-03-14 | 2000-05-03 | Ncr Int Inc | Predicting future behaviour of an individual |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
AU4869601A (en) * | 2000-03-20 | 2001-10-03 | Robert J. Freeman | Natural-language processing system using a large corpus |
US7428500B1 (en) | 2000-03-30 | 2008-09-23 | Amazon. Com, Inc. | Automatically identifying similar purchasing opportunities |
US7120574B2 (en) * | 2000-04-03 | 2006-10-10 | Invention Machine Corporation | Synonym extension of search queries with validation |
US20010039490A1 (en) * | 2000-04-03 | 2001-11-08 | Mikhail Verbitsky | System and method of analyzing and comparing entity documents |
US20020010574A1 (en) * | 2000-04-20 | 2002-01-24 | Valery Tsourikov | Natural language processing and query driven information retrieval |
US7962326B2 (en) * | 2000-04-20 | 2011-06-14 | Invention Machine Corporation | Semantic answering system and method |
US7912868B2 (en) * | 2000-05-02 | 2011-03-22 | Textwise Llc | Advertisement placement method and system using semantic analysis |
AU2001271397A1 (en) * | 2000-06-23 | 2002-01-08 | Decis E-Direct, Inc. | Component models |
US6675159B1 (en) * | 2000-07-27 | 2004-01-06 | Science Applic Int Corp | Concept-based search and retrieval system |
US8200485B1 (en) * | 2000-08-29 | 2012-06-12 | A9.Com, Inc. | Voice interface and methods for improving recognition accuracy of voice search queries |
US7328211B2 (en) * | 2000-09-21 | 2008-02-05 | Jpmorgan Chase Bank, N.A. | System and methods for improved linguistic pattern matching |
US7085708B2 (en) | 2000-09-23 | 2006-08-01 | Ravenflow, Inc. | Computer system with natural language to machine language translator |
US20020143524A1 (en) * | 2000-09-29 | 2002-10-03 | Lingomotors, Inc. | Method and resulting system for integrating a query reformation module onto an information retrieval system |
AU2000276396A1 (en) * | 2000-09-30 | 2002-04-15 | Intel Corporation (A Corporation Of Delaware) | Method and system for building a domain specific statistical language model fromrule-based grammar specifications |
US7027974B1 (en) | 2000-10-27 | 2006-04-11 | Science Applications International Corporation | Ontology-based parser for natural language processing |
US7146349B2 (en) * | 2000-11-06 | 2006-12-05 | International Business Machines Corporation | Network for describing multimedia information |
US6978419B1 (en) * | 2000-11-15 | 2005-12-20 | Justsystem Corporation | Method and apparatus for efficient identification of duplicate and near-duplicate documents and text spans using high-discriminability text fragments |
US20020091671A1 (en) * | 2000-11-23 | 2002-07-11 | Andreas Prokoph | Method and system for data retrieval in large collections of data |
US7013308B1 (en) | 2000-11-28 | 2006-03-14 | Semscript Ltd. | Knowledge storage and retrieval system and method |
US20030028564A1 (en) * | 2000-12-19 | 2003-02-06 | Lingomotors, Inc. | Natural language method and system for matching and ranking documents in terms of semantic relatedness |
WO2002054279A1 (en) * | 2001-01-04 | 2002-07-11 | Agency For Science, Technology And Research | Improved method of text similarity measurement |
US7904595B2 (en) | 2001-01-18 | 2011-03-08 | Sdl International America Incorporated | Globalization management system and method therefor |
US6766316B2 (en) | 2001-01-18 | 2004-07-20 | Science Applications International Corporation | Method and system of ranking and clustering for document indexing and retrieval |
US20020133392A1 (en) * | 2001-02-22 | 2002-09-19 | Angel Mark A. | Distributed customer relationship management systems and methods |
US6697793B2 (en) | 2001-03-02 | 2004-02-24 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | System, method and apparatus for generating phrases from a database |
US6741981B2 (en) | 2001-03-02 | 2004-05-25 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration (Nasa) | System, method and apparatus for conducting a phrase search |
US6721728B2 (en) | 2001-03-02 | 2004-04-13 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | System, method and apparatus for discovering phrases in a database |
US6823333B2 (en) | 2001-03-02 | 2004-11-23 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | System, method and apparatus for conducting a keyterm search |
US6813616B2 (en) * | 2001-03-07 | 2004-11-02 | International Business Machines Corporation | System and method for building a semantic network capable of identifying word patterns in text |
US7426505B2 (en) * | 2001-03-07 | 2008-09-16 | International Business Machines Corporation | Method for identifying word patterns in text |
US7194454B2 (en) * | 2001-03-12 | 2007-03-20 | Lucent Technologies | Method for organizing records of database search activity by topical relevance |
US7860706B2 (en) | 2001-03-16 | 2010-12-28 | Eli Abir | Knowledge system method and apparatus |
US8874431B2 (en) * | 2001-03-16 | 2014-10-28 | Meaningful Machines Llc | Knowledge system method and apparatus |
US8744835B2 (en) * | 2001-03-16 | 2014-06-03 | Meaningful Machines Llc | Content conversion method and apparatus |
US7146308B2 (en) * | 2001-04-05 | 2006-12-05 | Dekang Lin | Discovery of inference rules from text |
US6904428B2 (en) | 2001-04-18 | 2005-06-07 | Illinois Institute Of Technology | Intranet mediator |
GB2375859B (en) * | 2001-04-27 | 2003-04-16 | Premier Systems Technology Ltd | Search Engine Systems |
US6829605B2 (en) * | 2001-05-24 | 2004-12-07 | Microsoft Corporation | Method and apparatus for deriving logical relations from linguistic relations with multiple relevance ranking strategies for information retrieval |
SG103289A1 (en) * | 2001-05-25 | 2004-04-29 | Meng Soon Cheo | System for indexing textual and non-textual files |
US7050964B2 (en) * | 2001-06-01 | 2006-05-23 | Microsoft Corporation | Scaleable machine translation system |
US7734459B2 (en) | 2001-06-01 | 2010-06-08 | Microsoft Corporation | Automatic extraction of transfer mappings from bilingual corpora |
US7003444B2 (en) * | 2001-07-12 | 2006-02-21 | Microsoft Corporation | Method and apparatus for improved grammar checking using a stochastic parser |
US9009590B2 (en) * | 2001-07-31 | 2015-04-14 | Invention Machine Corporation | Semantic processor for recognition of cause-effect relations in natural language documents |
US7251781B2 (en) * | 2001-07-31 | 2007-07-31 | Invention Machine Corporation | Computer based summarization of natural language documents |
US8799776B2 (en) * | 2001-07-31 | 2014-08-05 | Invention Machine Corporation | Semantic processor for recognition of whole-part relations in natural language documents |
US7284191B2 (en) * | 2001-08-13 | 2007-10-16 | Xerox Corporation | Meta-document management system with document identifiers |
US8020754B2 (en) | 2001-08-13 | 2011-09-20 | Jpmorgan Chase Bank, N.A. | System and method for funding a collective account by use of an electronic tag |
US7133862B2 (en) | 2001-08-13 | 2006-11-07 | Xerox Corporation | System with user directed enrichment and import/export control |
US6609124B2 (en) | 2001-08-13 | 2003-08-19 | International Business Machines Corporation | Hub for strategic intelligence |
US7526425B2 (en) | 2001-08-14 | 2009-04-28 | Evri Inc. | Method and system for extending keyword searching to syntactically and semantically annotated data |
US7024351B2 (en) * | 2001-08-21 | 2006-04-04 | Microsoft Corporation | Method and apparatus for robust efficient parsing |
US7047183B2 (en) * | 2001-08-21 | 2006-05-16 | Microsoft Corporation | Method and apparatus for using wildcards in semantic parsing |
US7403938B2 (en) * | 2001-09-24 | 2008-07-22 | Iac Search & Media, Inc. | Natural language query processing |
JP4065936B2 (ja) * | 2001-10-09 | 2008-03-26 | National Institute of Information and Communications Technology | Language analysis processing system using machine learning method and language ellipsis analysis processing system using machine learning method |
ITFI20010199A1 (it) | 2001-10-22 | 2003-04-22 | Riccardo Vieri | System and method for converting text communications into voice and sending them over an internet connection to any telephone apparatus |
US7194464B2 (en) | 2001-12-07 | 2007-03-20 | Websense, Inc. | System and method for adapting an internet filter |
US7231343B1 (en) * | 2001-12-20 | 2007-06-12 | Ianywhere Solutions, Inc. | Synonyms mechanism for natural language systems |
US20030172368A1 (en) * | 2001-12-26 | 2003-09-11 | Elizabeth Alumbaugh | System and method for autonomously generating heterogeneous data source interoperability bridges based on semantic modeling derived from self adapting ontology |
US7137062B2 (en) * | 2001-12-28 | 2006-11-14 | International Business Machines Corporation | System and method for hierarchical segmentation with latent semantic indexing in scale space |
US7177799B2 (en) * | 2002-01-14 | 2007-02-13 | Microsoft Corporation | Semantic analysis system for interpreting linguistic structures output by a natural language linguistic analysis system |
US7295966B2 (en) * | 2002-01-14 | 2007-11-13 | Microsoft Corporation | System for normalizing a discourse representation structure and normalized data structure |
US7225183B2 (en) * | 2002-01-28 | 2007-05-29 | Ipxl, Inc. | Ontology-based information management system and method |
FR2835334A1 (fr) * | 2002-01-31 | 2003-08-01 | France Telecom | System and methods for indexing and searching with query expansion, indexing and search engines |
US7031969B2 (en) * | 2002-02-20 | 2006-04-18 | Lawrence Technologies, Llc | System and method for identifying relationships between database records |
US8380491B2 (en) * | 2002-04-19 | 2013-02-19 | Educational Testing Service | System for rating constructed responses based on concepts and a model answer |
US20040039562A1 (en) * | 2002-06-17 | 2004-02-26 | Kenneth Haase | Para-linguistic expansion |
WO2003107223A1 (en) * | 2002-06-17 | 2003-12-24 | Beingmeta, Inc. | Systems and methods for processing queries |
US7493253B1 (en) | 2002-07-12 | 2009-02-17 | Language And Computing, Inc. | Conceptual world representation natural language understanding system and method |
US20040034541A1 (en) * | 2002-08-16 | 2004-02-19 | Alipio Caban | Client devices, processor-usable media, data signals embodied in a transmission medium and processor implemented methods |
JP2004139553A (ja) * | 2002-08-19 | 2004-05-13 | Matsushita Electric Ind Co Ltd | Document retrieval system and question answering system |
US7136807B2 (en) * | 2002-08-26 | 2006-11-14 | International Business Machines Corporation | Inferencing using disambiguated natural language rules |
JP4038717B2 (ja) * | 2002-09-13 | 2008-01-30 | Fuji Xerox Co., Ltd. | Text sentence comparison device |
US7567902B2 (en) * | 2002-09-18 | 2009-07-28 | Nuance Communications, Inc. | Generating speech recognition grammars from a large corpus of data |
US7194455B2 (en) * | 2002-09-19 | 2007-03-20 | Microsoft Corporation | Method and system for retrieving confirming sentences |
US7171351B2 (en) * | 2002-09-19 | 2007-01-30 | Microsoft Corporation | Method and system for retrieving hint sentences using expanded queries |
US7293015B2 (en) * | 2002-09-19 | 2007-11-06 | Microsoft Corporation | Method and system for detecting user intentions in retrieval of hint sentences |
US20040122736A1 (en) | 2002-10-11 | 2004-06-24 | Bank One, Delaware, N.A. | System and method for granting promotional rewards to credit account holders |
WO2004044888A1 (de) * | 2002-11-13 | 2004-05-27 | Schoenebeck Bernd | Language-processing system and method for assigning acoustic and/or written character strings to words or lexical entries |
US20040098250A1 (en) * | 2002-11-19 | 2004-05-20 | Gur Kimchi | Semantic search system and method |
JP2006508448A (ja) * | 2002-11-28 | 2006-03-09 | Koninklijke Philips Electronics N.V. | Method for assigning word class information |
US8155946B2 (en) * | 2002-12-23 | 2012-04-10 | Definiens Ag | Computerized method and system for searching for text passages in text documents |
WO2004077217A2 (en) * | 2003-01-30 | 2004-09-10 | Vaman Technologies (R & D) Limited | System and method of object query analysis, optimization and execution irrespective of server functionality |
US7343280B2 (en) * | 2003-07-01 | 2008-03-11 | Microsoft Corporation | Processing noisy data and determining word similarity |
US20050060140A1 (en) * | 2003-09-15 | 2005-03-17 | Maddox Paul Christopher | Using semantic feature structures for document comparisons |
US7593845B2 (en) * | 2003-10-06 | 2009-09-22 | Microsoft Corporation | Method and apparatus for identifying semantic structures from text |
CA2542438A1 (en) * | 2003-10-21 | 2005-04-28 | Intellectual Property Bank Corp. | Document characteristic analysis device for document to be surveyed |
US7584092B2 (en) * | 2004-11-15 | 2009-09-01 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7412385B2 (en) * | 2003-11-12 | 2008-08-12 | Microsoft Corporation | System for identifying paraphrases using machine translation |
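CN1629833A (zh) * | 2003-12-17 | 2005-06-22 | International Business Machines Corporation | Method and apparatus for implementing question-and-answer functionality and computer-aided writing |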
US7359851B2 (en) * | 2004-01-14 | 2008-04-15 | Clairvoyance Corporation | Method of identifying the language of a textual passage using short word and/or n-gram comparisons |
JP2005267607A (ja) * | 2004-02-20 | 2005-09-29 | Fuji Photo Film Co Ltd | Digital pictorial book system, pictorial book search method, and pictorial book search program |
US7983896B2 (en) | 2004-03-05 | 2011-07-19 | SDL Language Technology | In-context exact (ICE) matching |
GB0407389D0 (en) * | 2004-03-31 | 2004-05-05 | British Telecomm | Information retrieval |
US20050256700A1 (en) * | 2004-05-11 | 2005-11-17 | Moldovan Dan I | Natural language question answering system and method utilizing a logic prover |
US7424485B2 (en) * | 2004-06-03 | 2008-09-09 | Microsoft Corporation | Method and apparatus for generating user interfaces based upon automation with full flexibility |
US7363578B2 (en) * | 2004-06-03 | 2008-04-22 | Microsoft Corporation | Method and apparatus for mapping a data model to a user interface model |
US7665014B2 (en) * | 2004-06-03 | 2010-02-16 | Microsoft Corporation | Method and apparatus for generating forms using form types |
US20060009966A1 (en) * | 2004-07-12 | 2006-01-12 | International Business Machines Corporation | Method and system for extracting information from unstructured text using symbolic machine learning |
US20060026522A1 (en) * | 2004-07-27 | 2006-02-02 | Microsoft Corporation | Method and apparatus for revising data models and maps by example |
US7685118B2 (en) * | 2004-08-12 | 2010-03-23 | Iwint International Holdings Inc. | Method using ontology and user query processing to solve inventor problems and user problems |
US8407239B2 (en) | 2004-08-13 | 2013-03-26 | Google Inc. | Multi-stage query processing system and method for use with tokenspace repository |
US7917480B2 (en) | 2004-08-13 | 2011-03-29 | Google Inc. | Document compression system and method for use with tokenspace repository |
US20060047691A1 (en) * | 2004-08-31 | 2006-03-02 | Microsoft Corporation | Creating a document index from a flex- and Yacc-generated named entity recognizer |
US20060047500A1 (en) * | 2004-08-31 | 2006-03-02 | Microsoft Corporation | Named entity recognition using compiler methods |
US20060047690A1 (en) * | 2004-08-31 | 2006-03-02 | Microsoft Corporation | Integration of Flex and Yacc into a linguistic services platform for named entity recognition |
CN100361126C (zh) * | 2004-09-24 | 2008-01-09 | Beijing IWINT Technology Co., Ltd. | Method for solving problems using ontology and user query processing techniques |
US7657519B2 (en) * | 2004-09-30 | 2010-02-02 | Microsoft Corporation | Forming intent-based clusters and employing same by search |
US7996208B2 (en) | 2004-09-30 | 2011-08-09 | Google Inc. | Methods and systems for selecting a language for text segmentation |
US7680648B2 (en) * | 2004-09-30 | 2010-03-16 | Google Inc. | Methods and systems for improving text segmentation |
US8051096B1 (en) | 2004-09-30 | 2011-11-01 | Google Inc. | Methods and systems for augmenting a token lexicon |
US20060074632A1 (en) * | 2004-09-30 | 2006-04-06 | Nanavati Amit A | Ontology-based term disambiguation |
US7546235B2 (en) * | 2004-11-15 | 2009-06-09 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7552046B2 (en) * | 2004-11-15 | 2009-06-23 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US20060122834A1 (en) * | 2004-12-03 | 2006-06-08 | Bennett Ian M | Emotion detection device & method for use in distributed systems |
US8843536B1 (en) | 2004-12-31 | 2014-09-23 | Google Inc. | Methods and systems for providing relevant advertisements or other content for inactive uniform resource locators using search queries |
US8473449B2 (en) * | 2005-01-06 | 2013-06-25 | Neuric Technologies, Llc | Process of dialogue and discussion |
US7869989B1 (en) * | 2005-01-28 | 2011-01-11 | Artificial Cognition Inc. | Methods and apparatus for understanding machine vocabulary |
EP1851616A2 (en) * | 2005-01-31 | 2007-11-07 | Musgrove Technology Enterprises, LLC | System and method for generating an interlinked taxonomy structure |
EP1846815A2 (en) * | 2005-01-31 | 2007-10-24 | Textdigger, Inc. | Method and system for semantic search and retrieval of electronic documents |
US20060200464A1 (en) * | 2005-03-03 | 2006-09-07 | Microsoft Corporation | Method and system for generating a document summary |
US20060200337A1 (en) * | 2005-03-04 | 2006-09-07 | Microsoft Corporation | System and method for template authoring and a template data structure |
US20060200338A1 (en) * | 2005-03-04 | 2006-09-07 | Microsoft Corporation | Method and system for creating a lexicon |
US20060200336A1 (en) * | 2005-03-04 | 2006-09-07 | Microsoft Corporation | Creating a lexicon using automatic template matching |
US7937396B1 (en) | 2005-03-23 | 2011-05-03 | Google Inc. | Methods and systems for identifying paraphrases from an index of information items and associated sentence fragments |
US9400838B2 (en) * | 2005-04-11 | 2016-07-26 | Textdigger, Inc. | System and method for searching for a query |
US8032823B2 (en) * | 2005-04-15 | 2011-10-04 | Carnegie Mellon University | Intent-based information processing and updates |
US7672908B2 (en) * | 2005-04-15 | 2010-03-02 | Carnegie Mellon University | Intent-based information processing and updates in association with a service agent |
FR2885712B1 (fr) * | 2005-05-12 | 2007-07-13 | Kabire Fidaali | Device and method for semantic analysis of documents by constructing n-ary and semantic trees |
CN101366024B (zh) | 2005-05-16 | 2014-07-30 | eBay Inc. | Method and system for processing data search requests |
US7401731B1 (en) | 2005-05-27 | 2008-07-22 | Jpmorgan Chase Bank, Na | Method and system for implementing a card product with multiple customized relationships |
GB0512744D0 (en) * | 2005-06-22 | 2005-07-27 | Blackspider Technologies | Method and system for filtering electronic messages |
US7689411B2 (en) | 2005-07-01 | 2010-03-30 | Xerox Corporation | Concept matching |
US7809551B2 (en) * | 2005-07-01 | 2010-10-05 | Xerox Corporation | Concept matching system |
CA2545237A1 (en) * | 2005-07-29 | 2007-01-29 | Cognos Incorporated | Method and system for managing exemplar terms database for business-oriented metadata content |
CA2545232A1 (en) * | 2005-07-29 | 2007-01-29 | Cognos Incorporated | Method and system for creating a taxonomy from business-oriented metadata content |
US8666928B2 (en) | 2005-08-01 | 2014-03-04 | Evi Technologies Limited | Knowledge repository |
JP4639124B2 (ja) * | 2005-08-23 | 2011-02-23 | Canon Inc. | Character input assistance method and information processing apparatus |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US20070073533A1 (en) * | 2005-09-23 | 2007-03-29 | Fuji Xerox Co., Ltd. | Systems and methods for structural indexing of natural language text |
US7475072B1 (en) | 2005-09-26 | 2009-01-06 | Quintura, Inc. | Context-based search visualization and context management using neural networks |
US7937265B1 (en) * | 2005-09-27 | 2011-05-03 | Google Inc. | Paraphrase acquisition |
WO2007038713A2 (en) * | 2005-09-28 | 2007-04-05 | Epacris Inc. | Search engine determining results based on probabilistic scoring of relevance |
US7908132B2 (en) * | 2005-09-29 | 2011-03-15 | Microsoft Corporation | Writing assistance using machine translation techniques |
US7949444B2 (en) * | 2005-10-07 | 2011-05-24 | Honeywell International Inc. | Aviation field service report natural language processing |
US9886478B2 (en) | 2005-10-07 | 2018-02-06 | Honeywell International Inc. | Aviation field service report natural language processing |
US8036876B2 (en) * | 2005-11-04 | 2011-10-11 | Battelle Memorial Institute | Methods of defining ontologies, word disambiguation methods, computer systems, and articles of manufacture |
US10319252B2 (en) | 2005-11-09 | 2019-06-11 | Sdl Inc. | Language capability assessment and training apparatus and techniques |
EP1949273A1 (en) * | 2005-11-16 | 2008-07-30 | Evri Inc. | Extending keyword searching to syntactically and semantically annotated data |
US7765212B2 (en) * | 2005-12-29 | 2010-07-27 | Microsoft Corporation | Automatic organization of documents through email clustering |
US8694530B2 (en) | 2006-01-03 | 2014-04-08 | Textdigger, Inc. | Search system with query refinement and search method |
US20070162481A1 (en) * | 2006-01-10 | 2007-07-12 | Millett Ronald P | Pattern index |
FR2896603B1 (fr) * | 2006-01-20 | 2008-05-02 | Thales Sa | Method and device for extracting information from a text document and transforming it into qualitative data |
US7599861B2 (en) | 2006-03-02 | 2009-10-06 | Convergys Customer Management Group, Inc. | System and method for closed loop decisionmaking in an automated care system |
US8266152B2 (en) * | 2006-03-03 | 2012-09-11 | Perfect Search Corporation | Hashed indexing |
EP1999565A4 (en) * | 2006-03-03 | 2012-01-11 | Perfect Search Corp | Hyper space index |
US8862573B2 (en) | 2006-04-04 | 2014-10-14 | Textdigger, Inc. | Search system and method with text function tagging |
US7991608B2 (en) * | 2006-04-19 | 2011-08-02 | Raytheon Company | Multilingual data querying |
AU2007248585A1 (en) * | 2006-05-04 | 2007-11-15 | Jpmorgan Chase Bank, N.A. | System and method for restricted party screening and resolution services |
US7809663B1 (en) | 2006-05-22 | 2010-10-05 | Convergys Cmg Utah, Inc. | System and method for supporting the utilization of machine language |
US8379830B1 (en) | 2006-05-22 | 2013-02-19 | Convergys Customer Management Delaware Llc | System and method for automated customer service with contingent live interaction |
US7493293B2 (en) * | 2006-05-31 | 2009-02-17 | International Business Machines Corporation | System and method for extracting entities of interest from text using n-gram models |
US20070288248A1 (en) * | 2006-06-12 | 2007-12-13 | Rami Rauch | System and method for online service of web wide datasets forming, joining and mining |
US8140267B2 (en) * | 2006-06-30 | 2012-03-20 | International Business Machines Corporation | System and method for identifying similar molecules |
US8615800B2 (en) | 2006-07-10 | 2013-12-24 | Websense, Inc. | System and method for analyzing web content |
US8020206B2 (en) | 2006-07-10 | 2011-09-13 | Websense, Inc. | System and method of analyzing web content |
US20080027971A1 (en) * | 2006-07-28 | 2008-01-31 | Craig Statchuk | Method and system for populating an index corpus to a search engine |
US8589869B2 (en) | 2006-09-07 | 2013-11-19 | Wolfram Alpha Llc | Methods and systems for determining a formula |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
JP5076417B2 (ja) * | 2006-09-15 | 2012-11-21 | Fuji Xerox Co., Ltd. | Concept network generation system, concept network generation method, and concept network generation program |
US7557167B2 (en) * | 2006-09-28 | 2009-07-07 | Gore Enterprise Holdings, Inc. | Polyester compositions, methods of manufacturing said compositions, and articles made therefrom |
US8146051B2 (en) * | 2006-10-02 | 2012-03-27 | International Business Machines Corporation | Method and computer program product for providing a representation of software modeled by a model |
US9098489B2 (en) * | 2006-10-10 | 2015-08-04 | Abbyy Infopoisk Llc | Method and system for semantic searching |
US9069750B2 (en) * | 2006-10-10 | 2015-06-30 | Abbyy Infopoisk Llc | Method and system for semantic searching of natural language texts |
US8892423B1 (en) | 2006-10-10 | 2014-11-18 | Abbyy Infopoisk Llc | Method and system to automatically create content for dictionaries |
US8145473B2 (en) | 2006-10-10 | 2012-03-27 | Abbyy Software Ltd. | Deep model statistics method for machine translation |
US9053090B2 (en) | 2006-10-10 | 2015-06-09 | Abbyy Infopoisk Llc | Translating texts between languages |
US9235573B2 (en) | 2006-10-10 | 2016-01-12 | Abbyy Infopoisk Llc | Universal difference measure |
US9892111B2 (en) | 2006-10-10 | 2018-02-13 | Abbyy Production Llc | Method and device to estimate similarity between documents having multiple segments |
US9633005B2 (en) | 2006-10-10 | 2017-04-25 | Abbyy Infopoisk Llc | Exhaustive automatic processing of textual information |
US9984071B2 (en) | 2006-10-10 | 2018-05-29 | Abbyy Production Llc | Language ambiguity detection of text |
US8195447B2 (en) | 2006-10-10 | 2012-06-05 | Abbyy Software Ltd. | Translating sentences between languages using language-independent semantic structures and ratings of syntactic constructions |
US9495358B2 (en) | 2006-10-10 | 2016-11-15 | Abbyy Infopoisk Llc | Cross-language text clustering |
US8214199B2 (en) * | 2006-10-10 | 2012-07-03 | Abbyy Software, Ltd. | Systems for translating sentences between languages using language-independent semantic structures and ratings of syntactic constructions |
US9075864B2 (en) * | 2006-10-10 | 2015-07-07 | Abbyy Infopoisk Llc | Method and system for semantic searching using syntactic and semantic analysis |
US8548795B2 (en) * | 2006-10-10 | 2013-10-01 | Abbyy Software Ltd. | Method for translating documents from one language into another using a database of translations, a terminology dictionary, a translation dictionary, and a machine translation system |
US9047275B2 (en) | 2006-10-10 | 2015-06-02 | Abbyy Infopoisk Llc | Methods and systems for alignment of parallel text corpora |
US9588958B2 (en) | 2006-10-10 | 2017-03-07 | Abbyy Infopoisk Llc | Cross-language text classification |
US20080086298A1 (en) * | 2006-10-10 | 2008-04-10 | Anisimovich Konstantin | Method and system for translating sentences between languages |
US9471562B2 (en) | 2006-10-10 | 2016-10-18 | Abbyy Infopoisk Llc | Method and system for analyzing and translating various languages with use of semantic hierarchy |
US9645993B2 (en) | 2006-10-10 | 2017-05-09 | Abbyy Infopoisk Llc | Method and system for semantic searching |
US9110975B1 (en) * | 2006-11-02 | 2015-08-18 | Google Inc. | Search result inputs using variant generalized queries |
US8661029B1 (en) | 2006-11-02 | 2014-02-25 | Google Inc. | Modifying search result ranking based on implicit user feedback |
US9208174B1 (en) * | 2006-11-20 | 2015-12-08 | Disney Enterprises, Inc. | Non-language-based object search |
US9654495B2 (en) * | 2006-12-01 | 2017-05-16 | Websense, Llc | System and method of analyzing web addresses |
GB2458094A (en) | 2007-01-09 | 2009-09-09 | Surfcontrol On Demand Ltd | URL interception and categorization in firewalls |
US7437370B1 (en) * | 2007-02-19 | 2008-10-14 | Quintura, Inc. | Search engine graphical interface using maps and images |
EP2135231A4 (en) * | 2007-03-01 | 2014-10-15 | Adapx Inc | System and method for dynamic learning |
US8180633B2 (en) * | 2007-03-08 | 2012-05-15 | Nec Laboratories America, Inc. | Fast semantic extraction using a neural network architecture |
WO2008113045A1 (en) | 2007-03-14 | 2008-09-18 | Evri Inc. | Query templates and labeled search tip system, methods, and techniques |
US8959011B2 (en) | 2007-03-22 | 2015-02-17 | Abbyy Infopoisk Llc | Indicating and correcting errors in machine translation systems |
US9031947B2 (en) * | 2007-03-27 | 2015-05-12 | Invention Machine Corporation | System and method for model element identification |
US7873640B2 (en) * | 2007-03-27 | 2011-01-18 | Adobe Systems Incorporated | Semantic analysis documents to rank terms |
US7720783B2 (en) * | 2007-03-28 | 2010-05-18 | Palo Alto Research Center Incorporated | Method and system for detecting undesired inferences from documents |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US9092510B1 (en) | 2007-04-30 | 2015-07-28 | Google Inc. | Modifying search result ranking based on a temporal element of user feedback |
GB0709527D0 (en) | 2007-05-18 | 2007-06-27 | Surfcontrol Plc | Electronic messaging system, message processing apparatus and message processing method |
US7792826B2 (en) * | 2007-05-29 | 2010-09-07 | International Business Machines Corporation | Method and system for providing ranked search results |
US8812296B2 (en) | 2007-06-27 | 2014-08-19 | Abbyy Infopoisk Llc | Method and system for natural language dictionary generation |
US8037086B1 (en) * | 2007-07-10 | 2011-10-11 | Google Inc. | Identifying common co-occurring elements in lists |
US8260619B1 (en) | 2008-08-22 | 2012-09-04 | Convergys Cmg Utah, Inc. | Method and system for creating natural language understanding grammars |
US7912840B2 (en) * | 2007-08-30 | 2011-03-22 | Perfect Search Corporation | Indexing and filtering using composite data stores |
US8280721B2 (en) | 2007-08-31 | 2012-10-02 | Microsoft Corporation | Efficiently representing word sense probabilities |
US8868562B2 (en) | 2007-08-31 | 2014-10-21 | Microsoft Corporation | Identification of semantic relationships within reported speech |
EP2183686A4 (en) * | 2007-08-31 | 2018-03-28 | Zhigu Holdings Limited | Identification of semantic relationships within reported speech |
CN101796510A (zh) * | 2007-08-31 | 2010-08-04 | Microsoft Corporation | Indexing role hierarchies for words in a search index |
US8316036B2 (en) * | 2007-08-31 | 2012-11-20 | Microsoft Corporation | Checkpointing iterators during search |
US8712758B2 (en) | 2007-08-31 | 2014-04-29 | Microsoft Corporation | Coreference resolution in an ambiguity-sensitive natural language processing system |
US8209321B2 (en) * | 2007-08-31 | 2012-06-26 | Microsoft Corporation | Emphasizing search results according to conceptual meaning |
US20090070322A1 (en) * | 2007-08-31 | 2009-03-12 | Powerset, Inc. | Browsing knowledge on the basis of semantic relations |
US8229970B2 (en) * | 2007-08-31 | 2012-07-24 | Microsoft Corporation | Efficient storage and retrieval of posting lists |
US8463593B2 (en) * | 2007-08-31 | 2013-06-11 | Microsoft Corporation | Natural language hypernym weighting for word sense disambiguation |
US8229730B2 (en) * | 2007-08-31 | 2012-07-24 | Microsoft Corporation | Indexing role hierarchies for words in a search index |
US8346756B2 (en) * | 2007-08-31 | 2013-01-01 | Microsoft Corporation | Calculating valence of expressions within documents for searching a document index |
US9053089B2 (en) | 2007-10-02 | 2015-06-09 | Apple Inc. | Part-of-speech tagging using latent analogy |
US8165886B1 (en) | 2007-10-04 | 2012-04-24 | Great Northern Research LLC | Speech interface system and method for control and interaction with applications on a computing system |
US8838659B2 (en) * | 2007-10-04 | 2014-09-16 | Amazon Technologies, Inc. | Enhanced knowledge repository |
US8595642B1 (en) | 2007-10-04 | 2013-11-26 | Great Northern Research, LLC | Multiple shell multi faceted graphical user interface |
US8909655B1 (en) | 2007-10-11 | 2014-12-09 | Google Inc. | Time based ranking |
US8594996B2 (en) | 2007-10-17 | 2013-11-26 | Evri Inc. | NLP-based entity recognition and disambiguation |
EP2212772A4 (en) * | 2007-10-17 | 2017-04-05 | VCVC III LLC | NLP-based content recommender |
WO2009059297A1 (en) * | 2007-11-01 | 2009-05-07 | Textdigger, Inc. | Method and apparatus for automated tag generation for digital content |
US20090119090A1 (en) * | 2007-11-01 | 2009-05-07 | Microsoft Corporation | Principled Approach to Paraphrasing |
US8725756B1 (en) | 2007-11-12 | 2014-05-13 | Google Inc. | Session-based query suggestions |
US7860885B2 (en) * | 2007-12-05 | 2010-12-28 | Palo Alto Research Center Incorporated | Inbound content filtering via automated inference detection |
US10002189B2 (en) * | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8504361B2 (en) * | 2008-02-07 | 2013-08-06 | Nec Laboratories America, Inc. | Deep neural networks and methods for using same |
US8392436B2 (en) * | 2008-02-07 | 2013-03-05 | Nec Laboratories America, Inc. | Semantic search via role labeling |
US10269024B2 (en) * | 2008-02-08 | 2019-04-23 | Outbrain Inc. | Systems and methods for identifying and measuring trends in consumer content demand within vertically associated websites and related content |
US8065143B2 (en) | 2008-02-22 | 2011-11-22 | Apple Inc. | Providing text input using speech data and non-speech data |
US8180754B1 (en) * | 2008-04-01 | 2012-05-15 | Dranias Development Llc | Semantic neural network for aggregating query searches |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US8061142B2 (en) * | 2008-04-11 | 2011-11-22 | General Electric Company | Mixer for a combustor |
US8706477B1 (en) | 2008-04-25 | 2014-04-22 | Softwin Srl Romania | Systems and methods for lexical correspondence linguistic knowledge base creation comprising dependency trees with procedural nodes denoting execute code |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US8682660B1 (en) * | 2008-05-21 | 2014-03-25 | Resolvity, Inc. | Method and system for post-processing speech recognition results |
US8464150B2 (en) | 2008-06-07 | 2013-06-11 | Apple Inc. | Automatic language identification for dynamic text processing |
US8219397B2 (en) * | 2008-06-10 | 2012-07-10 | Nuance Communications, Inc. | Data processing system for autonomously building speech identification and tagging data |
US8032495B2 (en) * | 2008-06-20 | 2011-10-04 | Perfect Search Corporation | Index compression |
AU2009267107A1 (en) | 2008-06-30 | 2010-01-07 | Websense, Inc. | System and method for dynamic and real-time categorization of webpages |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US9262409B2 (en) | 2008-08-06 | 2016-02-16 | Abbyy Infopoisk Llc | Translation of a selected text fragment of a screen |
US9317589B2 (en) * | 2008-08-07 | 2016-04-19 | International Business Machines Corporation | Semantic search by means of word sense disambiguation using a lexicon |
US8112269B2 (en) * | 2008-08-25 | 2012-02-07 | Microsoft Corporation | Determining utility of a question |
US8364663B2 (en) * | 2008-09-05 | 2013-01-29 | Microsoft Corporation | Tokenized javascript indexing system |
US8768702B2 (en) | 2008-09-05 | 2014-07-01 | Apple Inc. | Multi-tiered voice feedback in an electronic device |
JP2010066365A (ja) * | 2008-09-09 | 2010-03-25 | Toshiba Corp | Speech recognition apparatus, method, and program |
US8898568B2 (en) | 2008-09-09 | 2014-11-25 | Apple Inc. | Audio user interface |
US8712776B2 (en) | 2008-09-29 | 2014-04-29 | Apple Inc. | Systems and methods for selective text to speech synthesis |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US8185509B2 (en) * | 2008-10-15 | 2012-05-22 | Sap France | Association of semantic objects with linguistic entity categories |
EP2361465B1 (en) * | 2008-10-15 | 2012-08-29 | Hewlett-Packard Development Company, L.P. | Retrieving configuration records from a configuration management database |
WO2010077714A2 (en) * | 2008-12-09 | 2010-07-08 | University Of Houston System | Word sense disambiguation |
WO2010067118A1 (en) | 2008-12-11 | 2010-06-17 | Novauris Technologies Limited | Speech recognition involving a mobile device |
US8862252B2 (en) | 2009-01-30 | 2014-10-14 | Apple Inc. | Audio user interface for displayless electronic device |
US9805089B2 (en) * | 2009-02-10 | 2017-10-31 | Amazon Technologies, Inc. | Local business and product search system and method |
US8380507B2 (en) | 2009-03-09 | 2013-02-19 | Apple Inc. | Systems and methods for determining the language to use for speech generated by a text to speech engine |
WO2010104970A1 (en) * | 2009-03-10 | 2010-09-16 | Ebrary, Inc. | Method and apparatus for real time text analysis and text navigation |
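CN102439590A (zh) * | 2009-03-13 | 2012-05-02 | Invention Machine Corporation | System and method for automatic semantic labeling of natural language texts |
KR20110136843A (ko) * | 2009-03-13 | 2011-12-21 | Invention Machine Corporation | System and method for knowledge search |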
US20110301941A1 (en) * | 2009-03-20 | 2011-12-08 | Syl Research Limited | Natural language processing method and system |
US20100250522A1 (en) * | 2009-03-30 | 2010-09-30 | Gm Global Technology Operations, Inc. | Using ontology to order records by relevance |
US20100268600A1 (en) * | 2009-04-16 | 2010-10-21 | Evri Inc. | Enhanced advertisement targeting |
US8601015B1 (en) | 2009-05-15 | 2013-12-03 | Wolfram Alpha Llc | Dynamic example generation for queries |
US8788524B1 (en) * | 2009-05-15 | 2014-07-22 | Wolfram Alpha Llc | Method and system for responding to queries in an imprecise syntax |
US20100299132A1 (en) * | 2009-05-22 | 2010-11-25 | Microsoft Corporation | Mining phrase pairs from an unstructured resource |
WO2010138466A1 (en) | 2009-05-26 | 2010-12-02 | Websense, Inc. | Systems and methods for efficient detection of fingerprinted data and information |
US20100306214A1 (en) * | 2009-05-28 | 2010-12-02 | Microsoft Corporation | Identifying modifiers in web queries over structured data |
US10540976B2 (en) | 2009-06-05 | 2020-01-21 | Apple Inc. | Contextual voice commands |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10255566B2 (en) | 2011-06-03 | 2019-04-09 | Apple Inc. | Generating and processing task items that represent tasks to perform |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US20130219333A1 (en) * | 2009-06-12 | 2013-08-22 | Adobe Systems Incorporated | Extensible Framework for Facilitating Interaction with Devices |
US8762131B1 (en) | 2009-06-17 | 2014-06-24 | Softwin Srl Romania | Systems and methods for managing a complex lexicon comprising multiword expressions and multiword inflection templates |
US8762130B1 (en) | 2009-06-17 | 2014-06-24 | Softwin Srl Romania | Systems and methods for natural language processing including morphological analysis, lemmatizing, spell checking and grammar checking |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US20110015921A1 (en) * | 2009-07-17 | 2011-01-20 | Minerva Advisory Services, Llc | System and method for using lingual hierarchy, connotation and weight of authority |
US20110040604A1 (en) * | 2009-08-13 | 2011-02-17 | Vertical Acuity, Inc. | Systems and Methods for Providing Targeted Content |
US9396485B2 (en) * | 2009-12-24 | 2016-07-19 | Outbrain Inc. | Systems and methods for presenting content |
US20110044447A1 (en) * | 2009-08-21 | 2011-02-24 | Nexidia Inc. | Trend discovery in audio signals |
US10169599B2 (en) * | 2009-08-26 | 2019-01-01 | International Business Machines Corporation | Data access control with flexible data disclosure |
US8498974B1 (en) | 2009-08-31 | 2013-07-30 | Google Inc. | Refining search results |
US8560300B2 (en) * | 2009-09-09 | 2013-10-15 | International Business Machines Corporation | Error correction using fact repositories |
GB2487023A (en) * | 2009-09-14 | 2012-07-04 | Arun Jain | Zolog intelligent human language interface for business software applications |
US9224007B2 (en) * | 2009-09-15 | 2015-12-29 | International Business Machines Corporation | Search engine with privacy protection |
US8972391B1 (en) | 2009-10-02 | 2015-03-03 | Google Inc. | Recent interest based relevance scoring |
US8645372B2 (en) * | 2009-10-30 | 2014-02-04 | Evri, Inc. | Keyword-based search engine results using enhanced query strategies |
US8682649B2 (en) | 2009-11-12 | 2014-03-25 | Apple Inc. | Sentiment prediction from textual data |
US20110131033A1 (en) * | 2009-12-02 | 2011-06-02 | Tatu Ylonen Oy Ltd | Weight-Ordered Enumeration of Referents and Cutting Off Lengthy Enumerations |
US10713666B2 (en) | 2009-12-24 | 2020-07-14 | Outbrain Inc. | Systems and methods for curating content |
US10607235B2 (en) * | 2009-12-24 | 2020-03-31 | Outbrain Inc. | Systems and methods for curating content |
US20110161091A1 (en) * | 2009-12-24 | 2011-06-30 | Vertical Acuity, Inc. | Systems and Methods for Connecting Entities Through Content |
US20110197137A1 (en) * | 2009-12-24 | 2011-08-11 | Vertical Acuity, Inc. | Systems and Methods for Rating Content |
US9600134B2 (en) | 2009-12-29 | 2017-03-21 | International Business Machines Corporation | Selecting portions of computer-accessible documents for post-selection processing |
US8311838B2 (en) | 2010-01-13 | 2012-11-13 | Apple Inc. | Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts |
US8381107B2 (en) | 2010-01-13 | 2013-02-19 | Apple Inc. | Adaptive audio feedback system and method |
US9201905B1 (en) * | 2010-01-14 | 2015-12-01 | The Boeing Company | Semantically mediated access to knowledge |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US8977584B2 (en) | 2010-01-25 | 2015-03-10 | Newvaluexchange Global Ai Llp | Apparatuses, methods and systems for a digital conversation management platform |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
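JP5398007B2 (ja) * | 2010-02-26 | 2014-01-29 | National Institute of Information and Communications Technology | Relationship information expansion device, relationship information expansion method, and program |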
US9710556B2 (en) | 2010-03-01 | 2017-07-18 | Vcvc Iii Llc | Content recommendation based on collections of entities |
US10417646B2 (en) | 2010-03-09 | 2019-09-17 | Sdl Inc. | Predicting the cost associated with translating textual content |
US8676565B2 (en) | 2010-03-26 | 2014-03-18 | Virtuoz Sa | Semantic clustering and conversational agents |
US9378202B2 (en) * | 2010-03-26 | 2016-06-28 | Virtuoz Sa | Semantic clustering |
US8694304B2 (en) | 2010-03-26 | 2014-04-08 | Virtuoz Sa | Semantic clustering and user interfaces |
US8645125B2 (en) | 2010-03-30 | 2014-02-04 | Evri, Inc. | NLP-based systems and methods for providing quotations |
US9110882B2 (en) | 2010-05-14 | 2015-08-18 | Amazon Technologies, Inc. | Extracting structured knowledge from unstructured text |
US8484015B1 (en) | 2010-05-14 | 2013-07-09 | Wolfram Alpha Llc | Entity pages |
US9672204B2 (en) * | 2010-05-28 | 2017-06-06 | Palo Alto Research Center Incorporated | System and method to acquire paraphrases |
US9836460B2 (en) * | 2010-06-11 | 2017-12-05 | Lexisnexis, A Division Of Reed Elsevier Inc. | Systems and methods for analyzing patent-related documents |
WO2011160140A1 (en) | 2010-06-18 | 2011-12-22 | Susan Bennett | System and method of semantic based searching |
US9623119B1 (en) | 2010-06-29 | 2017-04-18 | Google Inc. | Accentuating search results |
US8713021B2 (en) | 2010-07-07 | 2014-04-29 | Apple Inc. | Unsupervised document clustering using latent semantic density analysis |
US8812298B1 (en) | 2010-07-28 | 2014-08-19 | Wolfram Alpha Llc | Macro replacement of natural language input |
US8838633B2 (en) | 2010-08-11 | 2014-09-16 | Vcvc Iii Llc | NLP-based sentiment analysis |
US8719006B2 (en) | 2010-08-27 | 2014-05-06 | Apple Inc. | Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis |
JP5012981B2 (ja) * | 2010-09-09 | 2012-08-29 | Casio Computer Co., Ltd. | Electronic dictionary device and program |
US9405848B2 (en) | 2010-09-15 | 2016-08-02 | Vcvc Iii Llc | Recommending mobile device activities |
US8719014B2 (en) | 2010-09-27 | 2014-05-06 | Apple Inc. | Electronic device with text error correction based on voice recognition data |
US9524291B2 (en) | 2010-10-06 | 2016-12-20 | Virtuoz Sa | Visual display of semantic information |
US8725739B2 (en) | 2010-11-01 | 2014-05-13 | Evri, Inc. | Category-based content recommendation |
US9424351B2 (en) * | 2010-11-22 | 2016-08-23 | Microsoft Technology Licensing, Llc | Hybrid-distribution model for search engine indexes |
US9824091B2 (en) | 2010-12-03 | 2017-11-21 | Microsoft Technology Licensing, Llc | File system backup using change journal |
US8620894B2 (en) * | 2010-12-21 | 2013-12-31 | Microsoft Corporation | Searching files |
US10515147B2 (en) | 2010-12-22 | 2019-12-24 | Apple Inc. | Using statistical language models for contextual lookup |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
US10235360B2 (en) | 2010-12-23 | 2019-03-19 | Koninklijke Philips N.V. | Generation of pictorial reporting diagrams of lesions in anatomical structures |
JP5237400B2 (ja) * | 2011-01-21 | 2013-07-17 | The Bank of Tokyo-Mitsubishi UFJ, Ltd. | Search device |
US10657540B2 (en) | 2011-01-29 | 2020-05-19 | Sdl Netherlands B.V. | Systems, methods, and media for web content management |
US9547626B2 (en) | 2011-01-29 | 2017-01-17 | Sdl Plc | Systems, methods, and media for managing ambient adaptability of web applications and web services |
US8781836B2 (en) | 2011-02-22 | 2014-07-15 | Apple Inc. | Hearing assistance system for providing consistent human speech |
US10580015B2 (en) | 2011-02-25 | 2020-03-03 | Sdl Netherlands B.V. | Systems, methods, and media for executing and optimizing online marketing initiatives |
US10140320B2 (en) | 2011-02-28 | 2018-11-27 | Sdl Inc. | Systems, methods, and media for generating analytical data |
US8543577B1 (en) | 2011-03-02 | 2013-09-24 | Google Inc. | Cross-channel clusters of information |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
JP5696555B2 (ja) * | 2011-03-28 | 2015-04-08 | Fuji Xerox Co., Ltd. | Program and information processing device |
US9116995B2 (en) | 2011-03-30 | 2015-08-25 | Vcvc Iii Llc | Cluster-based identification of news stories |
US20120265784A1 (en) * | 2011-04-15 | 2012-10-18 | Microsoft Corporation | Ordering semantic query formulation suggestions |
US20120310642A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Automatically creating a mapping between text data and audio data |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8812294B2 (en) | 2011-06-21 | 2014-08-19 | Apple Inc. | Translating phrases from one language into another using an order-based set of declarative rules |
US10198506B2 (en) * | 2011-07-11 | 2019-02-05 | Lexxe Pty Ltd. | System and method of sentiment data generation |
US9069814B2 (en) | 2011-07-27 | 2015-06-30 | Wolfram Alpha Llc | Method and system for using natural language to generate widgets |
US8706472B2 (en) | 2011-08-11 | 2014-04-22 | Apple Inc. | Method for disambiguating multiple readings in language conversion |
US9984054B2 (en) | 2011-08-24 | 2018-05-29 | Sdl Inc. | Web interface including the review and manipulation of a web document and utilizing permission based control |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US9734252B2 (en) | 2011-09-08 | 2017-08-15 | Wolfram Alpha Llc | Method and system for analyzing data using a query answering system |
US8914277B1 (en) * | 2011-09-20 | 2014-12-16 | Nuance Communications, Inc. | Speech and language translation of an utterance |
US8762156B2 (en) | 2011-09-28 | 2014-06-24 | Apple Inc. | Speech recognition repair using contextual information |
US10169339B2 (en) | 2011-10-31 | 2019-01-01 | Elwha Llc | Context-sensitive query enrichment |
US20130124194A1 (en) * | 2011-11-10 | 2013-05-16 | Inventive, Inc. | Systems and methods for manipulating data using natural language commands |
US9851950B2 (en) | 2011-11-15 | 2017-12-26 | Wolfram Alpha Llc | Programming in a precise syntax using natural language |
US8965750B2 (en) | 2011-11-17 | 2015-02-24 | Abbyy Infopoisk Llc | Acquiring accurate machine translation |
US9195853B2 (en) | 2012-01-15 | 2015-11-24 | International Business Machines Corporation | Automated document redaction |
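JP5567749B2 (ja) * | 2012-02-15 | 2014-08-06 | Rakuten, Inc. | Dictionary generation device, dictionary generation method, dictionary generation program, and computer-readable recording medium storing the program |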
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9064009B2 (en) * | 2012-03-28 | 2015-06-23 | Hewlett-Packard Development Company, L.P. | Attribute cloud |
US8989485B2 (en) | 2012-04-27 | 2015-03-24 | Abbyy Development Llc | Detecting a junction in a text line of CJK characters |
US8971630B2 (en) | 2012-04-27 | 2015-03-03 | Abbyy Development Llc | Fast CJK character recognition |
US9773270B2 (en) | 2012-05-11 | 2017-09-26 | Fredhopper B.V. | Method and system for recommending products based on a ranking cocktail |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US9460082B2 (en) * | 2012-05-14 | 2016-10-04 | International Business Machines Corporation | Management of language usage to facilitate effective communication |
US8775442B2 (en) | 2012-05-15 | 2014-07-08 | Apple Inc. | Semantic search using a single-source semantic model |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US10261994B2 (en) | 2012-05-25 | 2019-04-16 | Sdl Inc. | Method and system for automatic management of reputation of translators |
WO2013185109A2 (en) | 2012-06-08 | 2013-12-12 | Apple Inc. | Systems and methods for recognizing textual identifiers within a plurality of words |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9195647B1 (en) * | 2012-08-11 | 2015-11-24 | Guangsheng Zhang | System, methods, and data structure for machine-learning of contextualized symbolic associations |
US9405424B2 (en) | 2012-08-29 | 2016-08-02 | Wolfram Alpha, Llc | Method and system for distributing and displaying graphical items |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US11386186B2 (en) | 2012-09-14 | 2022-07-12 | Sdl Netherlands B.V. | External content library connector systems and methods |
US10452740B2 (en) | 2012-09-14 | 2019-10-22 | Sdl Netherlands B.V. | External content libraries |
US11308528B2 (en) | 2012-09-14 | 2022-04-19 | Sdl Netherlands B.V. | Blueprinting of multimedia assets |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
US8935167B2 (en) | 2012-09-25 | 2015-01-13 | Apple Inc. | Exemplar-based latent perceptual modeling for automatic speech recognition |
US9916306B2 (en) | 2012-10-19 | 2018-03-13 | Sdl Inc. | Statistical linguistic analysis of source content |
US9892278B2 (en) | 2012-11-14 | 2018-02-13 | International Business Machines Corporation | Focused personal identifying information redaction |
US10095692B2 (en) * | 2012-11-29 | 2018-10-09 | Thomson Reuters Global Resources Unlimited Company | Template bootstrapping for domain-adaptable natural language generation |
US20150317386A1 (en) * | 2012-12-27 | 2015-11-05 | Abbyy Development Llc | Finding an appropriate meaning of an entry in a text |
KR102516577B1 (ko) | 2013-02-07 | 2023-04-03 | Apple Inc. | Voice trigger for a digital assistant |
US9135240B2 (en) * | 2013-02-12 | 2015-09-15 | International Business Machines Corporation | Latent semantic analysis for application in a question answer system |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
US9977779B2 (en) | 2013-03-14 | 2018-05-22 | Apple Inc. | Automatic supplementation of word correction dictionaries |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10642574B2 (en) | 2013-03-14 | 2020-05-05 | Apple Inc. | Device, method, and graphical user interface for outputting captions |
US9311297B2 (en) * | 2013-03-14 | 2016-04-12 | Prateek Bhatnagar | Method and system for outputting information |
US9733821B2 (en) | 2013-03-14 | 2017-08-15 | Apple Inc. | Voice control to diagnose inadvertent activation of accessibility features |
US10572476B2 (en) | 2013-03-14 | 2020-02-25 | Apple Inc. | Refining a search based on schedule items |
WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
US11151899B2 (en) | 2013-03-15 | 2021-10-19 | Apple Inc. | User training by intelligent digital assistant |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
AU2014233517B2 (en) | 2013-03-15 | 2017-05-25 | Apple Inc. | Training an at least partial voice command system |
US10078487B2 (en) | 2013-03-15 | 2018-09-18 | Apple Inc. | Context-sensitive handling of interruptions |
JP6152711B2 (ja) * | 2013-06-04 | 2017-06-28 | Fujitsu Limited | Information retrieval device and information retrieval method |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
KR101959188B1 (ko) | 2013-06-09 | 2019-07-02 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
KR101809808B1 (ko) | 2013-06-13 | 2017-12-15 | Apple Inc. | System and method for emergency calls initiated by voice command |
CN105453026A (zh) | 2013-08-06 | 2016-03-30 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US9311300B2 (en) * | 2013-09-13 | 2016-04-12 | International Business Machines Corporation | Using natural language processing (NLP) to create subject matter synonyms from definitions |
US20160224637A1 (en) * | 2013-11-25 | 2016-08-04 | Ut Battelle, Llc | Processing associations in knowledge graphs |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
RU2592395C2 (ru) | 2013-12-19 | 2016-07-20 | ABBYY InfoPoisk LLC | Resolving semantic ambiguity by means of statistical analysis |
US20150178390A1 (en) * | 2013-12-20 | 2015-06-25 | Jordi Torras | Natural language search engine using lexical functions and meaning-text criteria |
RU2613847C2 (ru) | 2013-12-20 | 2017-03-21 | ABBYY Development LLC | Detection of Chinese, Japanese, and Korean scripts |
RU2586577C2 (ru) | 2014-01-15 | 2016-06-10 | ABBYY InfoPoisk LLC | Filtering arcs in a syntactic graph |
RU2665239C2 (ru) | 2014-01-15 | 2018-08-28 | ABBYY Production LLC | Automatic extraction of named entities from text |
JP6260294B2 (ja) * | 2014-01-21 | 2018-01-17 | Fujitsu Limited | Information retrieval device, information retrieval method, and information retrieval program |
RU2640322C2 (ru) | 2014-01-30 | 2017-12-27 | ABBYY Development LLC | Methods and systems for efficient automatic character recognition |
RU2648638C2 (ru) | 2014-01-30 | 2018-03-26 | ABBYY Development LLC | Methods and systems for efficient automatic character recognition using multiple clusters of character reference patterns |
RU2556425C1 (ru) * | 2014-02-14 | 2015-07-10 | Eventos CJSC | Method for automatic iterative clustering of electronic documents by semantic similarity, method for searching a collection of documents clustered by semantic similarity, and machine-readable media |
US10839110B2 (en) * | 2014-05-09 | 2020-11-17 | Autodesk, Inc. | Techniques for using controlled natural language to capture design intent for computer-aided design |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
TWI566107B (zh) | 2014-05-30 | 2017-01-11 | Apple Inc. | Method for processing multi-part voice commands, non-transitory computer-readable storage medium, and electronic device |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
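KR101661198B1 (ko) * | 2014-07-10 | 2016-10-04 | Naver Corporation | Method and system for searching and providing information for natural language queries with simple or complex sentence structure |
CN104199803B (zh) * | 2014-07-21 | 2017-10-13 | Anhui Huazhen Information Technology Co., Ltd. | Text information processing system and method based on combination theory |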
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
RU2596600C2 (ru) | 2014-09-02 | 2016-09-10 | ABBYY Development LLC | Methods and systems for processing images of mathematical expressions |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US9588961B2 (en) | 2014-10-06 | 2017-03-07 | International Business Machines Corporation | Natural language processing utilizing propagation of knowledge through logical parse tree structures |
US9715488B2 (en) * | 2014-10-06 | 2017-07-25 | International Business Machines Corporation | Natural language processing utilizing transaction based knowledge representation |
US9665564B2 (en) | 2014-10-06 | 2017-05-30 | International Business Machines Corporation | Natural language processing utilizing logical tree structures |
US9710547B2 (en) | 2014-11-21 | 2017-07-18 | Inbenta | Natural language semantic search system and method using weighted global semantic representations |
US9626358B2 (en) | 2014-11-26 | 2017-04-18 | Abbyy Infopoisk Llc | Creating ontologies by analyzing natural language texts |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9589185B2 (en) | 2014-12-10 | 2017-03-07 | Abbyy Development Llc | Symbol recognition using decision forests |
JP6447161B2 (ja) * | 2015-01-20 | 2019-01-09 | Fujitsu Limited | Semantic structure search program, semantic structure search device, and semantic structure search method |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9632999B2 (en) * | 2015-04-03 | 2017-04-25 | Klangoo, Sal. | Techniques for understanding the aboutness of text based on semantic analysis |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US9778929B2 (en) | 2015-05-29 | 2017-10-03 | Microsoft Technology Licensing, Llc | Automated efficient translation context delivery |
US10762521B2 (en) | 2015-06-01 | 2020-09-01 | Jpmorgan Chase Bank, N.A. | System and method for loyalty integration for merchant specific digital wallets |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10628413B2 (en) * | 2015-08-03 | 2020-04-21 | International Business Machines Corporation | Mapping questions to complex database lookups using synthetic events |
US10628521B2 (en) * | 2015-08-03 | 2020-04-21 | International Business Machines Corporation | Scoring automatically generated language patterns for questions using synthetic events |
US10134389B2 (en) * | 2015-09-04 | 2018-11-20 | Microsoft Technology Licensing, Llc | Clustering user utterance intents with semantic parsing |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
EP3163467A1 (en) * | 2015-10-30 | 2017-05-03 | BIGFLO s.r.l. | Method and tool for the automatic reformulation of search keyword strings in document search systems |
US10614167B2 (en) | 2015-10-30 | 2020-04-07 | Sdl Plc | Translation review workflow systems and methods |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10592603B2 (en) * | 2016-02-03 | 2020-03-17 | International Business Machines Corporation | Identifying logic problems in text using a statistical approach and natural language processing |
US11042702B2 (en) | 2016-02-04 | 2021-06-22 | International Business Machines Corporation | Solving textual logic problems using a statistical approach and natural language processing |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
EP3394798A1 (en) * | 2016-03-18 | 2018-10-31 | Google LLC | Generating dependency parses of text segments using neural networks |
US11200217B2 (en) | 2016-05-26 | 2021-12-14 | Perfect Search Corporation | Structured document indexing and searching |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US10289680B2 (en) * | 2016-05-31 | 2019-05-14 | Oath Inc. | Real time parsing and suggestions from pre-generated corpus with hypernyms |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple Inc. | Intelligent automated assistant for media exploration |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
US11049190B2 (en) | 2016-07-15 | 2021-06-29 | Intuit Inc. | System and method for automatically generating calculations for fields in compliance forms |
US10579721B2 (en) | 2016-07-15 | 2020-03-03 | Intuit Inc. | Lean parsing: a natural language processing system and method for parsing domain-specific languages |
US11222266B2 (en) | 2016-07-15 | 2022-01-11 | Intuit Inc. | System and method for automatic learning of functions |
US10120861B2 (en) | 2016-08-17 | 2018-11-06 | Oath Inc. | Hybrid classifier for assigning natural language processing (NLP) inputs to domains in real-time |
US9984063B2 (en) | 2016-09-15 | 2018-05-29 | International Business Machines Corporation | System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning |
US9953027B2 (en) * | 2016-09-15 | 2018-04-24 | International Business Machines Corporation | System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10437833B1 (en) * | 2016-10-05 | 2019-10-08 | Ontocord, LLC | Scalable natural language processing for large and dynamic text environments |
KR102589638B1 (ko) * | 2016-10-31 | 2023-10-16 | 삼성전자주식회사 | Apparatus and method for generating sentences |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
JP6805927B2 (ja) * | 2017-03-28 | 2020-12-23 | 富士通株式会社 | Index generation program, data search program, index generation device, data search device, index generation method, and data search method |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | User-specific acoustic models |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10275452B2 (en) | 2017-05-12 | 2019-04-30 | International Business Machines Corporation | Automatic, unsupervised paraphrase detection |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | Far-field extension for digital assistant services |
US10152571B1 (en) * | 2017-05-25 | 2018-12-11 | Enlitic, Inc. | Chest x-ray differential diagnosis system |
CA3076418C (en) | 2017-09-22 | 2023-02-21 | Intuit Inc. | Lean parsing: a natural language processing system and method for parsing domain-specific languages |
US10635863B2 (en) | 2017-10-30 | 2020-04-28 | Sdl Inc. | Fragment recall and adaptive automated translation |
US11087097B2 (en) * | 2017-11-27 | 2021-08-10 | Act, Inc. | Automatic item generation for passage-based assessment |
US11410130B2 (en) * | 2017-12-27 | 2022-08-09 | International Business Machines Corporation | Creating and using triplet representations to assess similarity between job description documents |
US10817676B2 (en) | 2017-12-27 | 2020-10-27 | Sdl Inc. | Intelligent routing services and systems |
MY201295A (en) | 2017-12-28 | 2024-02-15 | Mimos Berhad | A computer-implemented method for self-learning text relevance and determining text relevancy |
US11573990B2 (en) * | 2017-12-29 | 2023-02-07 | Entefy Inc. | Search-based natural language intent determination |
IL258689A (en) * | 2018-04-12 | 2018-05-31 | Browarnik Abel | A system and method for computerized semantic indexing and searching |
JP7135399B2 (ja) * | 2018-04-12 | 2022-09-13 | 富士通株式会社 | Identification program, identification method, and information processing device |
US11016985B2 (en) * | 2018-05-22 | 2021-05-25 | International Business Machines Corporation | Providing relevant evidence or mentions for a query |
US11042712B2 (en) * | 2018-06-05 | 2021-06-22 | Koninklijke Philips N.V. | Simplifying and/or paraphrasing complex textual content by jointly learning semantic alignment and simplicity |
US11256867B2 (en) | 2018-10-09 | 2022-02-22 | Sdl Inc. | Systems and methods of machine learning for digital assets and message creation |
US11163956B1 (en) | 2019-05-23 | 2021-11-02 | Intuit Inc. | System and method for recognizing domain specific named entities using domain specific word embeddings |
US11477140B2 (en) | 2019-05-30 | 2022-10-18 | Microsoft Technology Licensing, Llc | Contextual feedback to a natural understanding system in a chat bot |
US10868778B1 (en) | 2019-05-30 | 2020-12-15 | Microsoft Technology Licensing, Llc | Contextual feedback, with expiration indicator, to a natural understanding system in a chat bot |
JP2022547750A (ja) | 2019-09-16 | 2022-11-15 | ドキュガミ インコーポレイテッド | Cross-document intelligent authoring and processing assistant |
US11068665B2 (en) | 2019-09-18 | 2021-07-20 | International Business Machines Corporation | Hypernym detection using strict partial order networks |
CN111090668B (zh) * | 2019-12-09 | 2023-09-26 | 京东科技信息技术有限公司 | Data retrieval method and apparatus, electronic device, and computer-readable storage medium |
US11783128B2 (en) | 2020-02-19 | 2023-10-10 | Intuit Inc. | Financial document text conversion to computer readable operations |
US11651156B2 (en) * | 2020-05-07 | 2023-05-16 | Optum Technology, Inc. | Contextual document summarization with semantic intelligence |
US11954448B2 (en) * | 2020-07-21 | 2024-04-09 | Microsoft Technology Licensing, Llc | Determining position values for transformer models |
US20230343333A1 (en) * | 2020-08-24 | 2023-10-26 | Unlikely Artificial Intelligence Limited | A computer implemented method for the automated analysis or use of data |
US12210824B1 (en) * | 2021-04-30 | 2025-01-28 | Now Insurance Services, Inc. | Automated information extraction from electronic documents using machine learning |
US11966699B2 (en) * | 2021-06-17 | 2024-04-23 | International Business Machines Corporation | Intent classification using non-correlated features |
US11989507B2 (en) | 2021-08-24 | 2024-05-21 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model |
US12136484B2 (en) | 2021-11-05 | 2024-11-05 | Altis Labs, Inc. | Method and apparatus utilizing image-based modeling in healthcare |
Family Cites Families (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4823306A (en) * | 1987-08-14 | 1989-04-18 | International Business Machines Corporation | Text search system |
US4839853A (en) * | 1988-09-15 | 1989-06-13 | Bell Communications Research, Inc. | Computer information retrieval using latent semantic structure |
SE466029B (sv) * | 1989-03-06 | 1991-12-02 | Ibm Svenska Ab | Device and method for analysis of natural language in a computer-based information processing system |
NL8900587A (nl) * | 1989-03-10 | 1990-10-01 | Bso Buro Voor Systeemontwikkel | Method for determining the semantic relatedness of lexical components in a text |
US5146406A (en) * | 1989-08-16 | 1992-09-08 | International Business Machines Corporation | Computer method for identifying predicate-argument structures in natural language text |
JP3266246B2 (ja) | 1990-06-15 | 2002-03-18 | インターナシヨナル・ビジネス・マシーンズ・コーポレーシヨン | Natural language analysis apparatus and method, and method for constructing a knowledge base for natural language analysis |
US5617578A (en) * | 1990-06-26 | 1997-04-01 | Spss Corp. | Computer-based workstation for generation of logic diagrams from natural language text structured by the insertion of script symbols |
US5325298A (en) * | 1990-11-07 | 1994-06-28 | Hnc, Inc. | Methods for generating or revising context vectors for a plurality of word stems |
US5278980A (en) * | 1991-08-16 | 1994-01-11 | Xerox Corporation | Iterative technique for phrase query formation and an information retrieval system employing same |
US5488719A (en) * | 1991-12-30 | 1996-01-30 | Xerox Corporation | System for categorizing character strings using acceptability and category information contained in ending substrings |
US5591661A (en) | 1992-04-07 | 1997-01-07 | Shiota; Philip | Method for fabricating devices for electrostatic discharge protection and voltage references, and the resulting structures |
US5377103A (en) | 1992-05-15 | 1994-12-27 | International Business Machines Corporation | Constrained natural language interface for a computer that employs a browse function |
US5592661A (en) * | 1992-07-16 | 1997-01-07 | International Business Machines Corporation | Detection of independent changes via change identifiers in a versioned database management system |
US5630121A (en) * | 1993-02-02 | 1997-05-13 | International Business Machines Corporation | Archiving and retrieving multimedia objects using structured indexes |
US5454106A (en) * | 1993-05-17 | 1995-09-26 | International Business Machines Corporation | Database retrieval system using natural language for presenting understood components of an ambiguous query on a user interface |
US5619709A (en) * | 1993-09-20 | 1997-04-08 | Hnc, Inc. | System and method of context vector generation and retrieval |
GB9320404D0 (en) * | 1993-10-04 | 1993-11-24 | Dixon Robert | Method & apparatus for data storage & retrieval |
US5873056A (en) * | 1993-10-12 | 1999-02-16 | The Syracuse University | Natural language processing system for semantic vector representation which accounts for lexical ambiguity |
US5724594A (en) | 1994-02-10 | 1998-03-03 | Microsoft Corporation | Method and system for automatically identifying morphological information from a machine-readable dictionary |
US5675819A (en) * | 1994-06-16 | 1997-10-07 | Xerox Corporation | Document information retrieval using global word co-occurrence patterns |
US5794050A (en) * | 1995-01-04 | 1998-08-11 | Intelligent Text Processing, Inc. | Natural language understanding system |
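JP2923552B2 (ja) * | 1995-02-13 | 1999-07-26 | 富士通株式会社 | Method for constructing an organizational activity database, method for inputting analysis sheets used therein, and organizational activity management system |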
US5963940A (en) * | 1995-08-16 | 1999-10-05 | Syracuse University | Natural language information retrieval system and method |
US6006221A (en) | 1995-08-16 | 1999-12-21 | Syracuse University | Multilingual document retrieval system and method using semantic vector matching |
JP3083742B2 (ja) * | 1995-10-03 | 2000-09-04 | インターナショナル・ビジネス・マシーンズ・コーポレ−ション | Spreadsheet calculation method |
US5995922A (en) | 1996-05-02 | 1999-11-30 | Microsoft Corporation | Identifying information related to an input word in an electronic dictionary |
US5966686A (en) * | 1996-06-28 | 1999-10-12 | Microsoft Corporation | Method and system for computing semantic logical forms from syntax trees |
US5893104A (en) * | 1996-07-09 | 1999-04-06 | Oracle Corporation | Method and system for processing queries in a database system using index structures that are not native to the database system |
US6038561A (en) * | 1996-10-15 | 2000-03-14 | Manning & Napier Information Services | Management and analysis of document information text |
US5970490A (en) * | 1996-11-05 | 1999-10-19 | Xerox Corporation | Integration platform for heterogeneous databases |
US6076051A (en) * | 1997-03-07 | 2000-06-13 | Microsoft Corporation | Information retrieval utilizing semantic representation of text |
US5895464A (en) * | 1997-04-30 | 1999-04-20 | Eastman Kodak Company | Computer program product and a method for using natural language for the description, search and retrieval of multi-media objects |
US5933822A (en) * | 1997-07-22 | 1999-08-03 | Microsoft Corporation | Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision |
US6070134A (en) * | 1997-07-31 | 2000-05-30 | Microsoft Corporation | Identifying salient semantic relation paths between two words |
US5991713A (en) * | 1997-11-26 | 1999-11-23 | International Business Machines Corp. | Efficient method for compressing, storing, searching and transmitting natural language text |
US6675159B1 (en) * | 2000-07-27 | 2004-01-06 | Science Applic Int Corp | Concept-based search and retrieval system |
US6664964B1 (en) * | 2000-11-10 | 2003-12-16 | Emc Corporation | Correlation criteria for logical volumes |
US7050964B2 (en) | 2001-06-01 | 2006-05-23 | Microsoft Corporation | Scaleable machine translation system |
US7734459B2 (en) | 2001-06-01 | 2010-06-08 | Microsoft Corporation | Automatic extraction of transfer mappings from bilingual corpora |
- 1997
  - 1997-03-07 US US08/886,814 patent/US6076051A/en not_active Expired - Lifetime
- 1998
  - 1998-02-11 CN CN98804175A patent/CN1252876A/zh active Pending
  - 1998-02-11 JP JP53853998A patent/JP4282769B2/ja not_active Expired - Lifetime
  - 1998-02-11 EP EP98906476.1A patent/EP0965089B1/en not_active Expired - Lifetime
  - 1998-02-11 WO PCT/US1998/003005 patent/WO1998039714A1/en active Application Filing
- 1999
  - 1999-08-03 US US09/366,499 patent/US6161084A/en not_active Expired - Lifetime
  - 1999-08-03 US US09/368,071 patent/US6246977B1/en not_active Expired - Lifetime
- 2000
  - 2000-05-17 US US09/572,765 patent/US6871174B1/en not_active Expired - Lifetime
- 2004
  - 2004-10-29 US US10/977,910 patent/US7013264B2/en not_active Expired - Fee Related
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1333362C (zh) * | 2001-03-26 | 2007-08-22 | 美国网上搜索公司 | Method and apparatus for intelligent data assimilation |
US7630879B2 (en) | 2002-09-13 | 2009-12-08 | Fuji Xerox Co., Ltd. | Text sentence comparing apparatus |
CN105512291A (zh) * | 2006-02-28 | 2016-04-20 | 贝宝公司 | Method and system for expanding database search queries |
CN105512291B (zh) * | 2006-02-28 | 2020-05-15 | 贝宝公司 | Method and system for expanding database search queries |
US8065307B2 (en) | 2006-12-20 | 2011-11-22 | Microsoft Corporation | Parsing, analysis and scoring of document content |
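CN101508188B (zh) * | 2009-03-24 | 2012-09-26 | 北京市城南橡塑技术研究所 | Impact-resistant composite liner plate |
CN106598722A (zh) * | 2015-10-19 | 2017-04-26 | 上海引跑信息科技有限公司 | Method for supporting distributed transaction management in a text information retrieval service |
CN110088754A (zh) * | 2016-10-26 | 2019-08-02 | 联邦科学和工业研究组织 | Automatic encoder of legislation to logic |
CN110088754B (zh) * | 2016-10-26 | 2023-04-28 | 联邦科学和工业研究组织 | Automatic encoder of legislation to logic |
CN114969262A (zh) * | 2022-05-31 | 2022-08-30 | 云知声智能科技股份有限公司 | Text processing method and apparatus, storage medium, and electronic device |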
Also Published As
Publication number | Publication date |
---|---|
EP0965089B1 (en) | 2015-03-25 |
WO1998039714A1 (en) | 1998-09-11 |
US6161084A (en) | 2000-12-12 |
JP2001513243A (ja) | 2001-08-28 |
US6076051A (en) | 2000-06-13 |
EP0965089A1 (en) | 1999-12-22 |
US20050065777A1 (en) | 2005-03-24 |
US6246977B1 (en) | 2001-06-12 |
JP4282769B2 (ja) | 2009-06-24 |
US6871174B1 (en) | 2005-03-22 |
US7013264B2 (en) | 2006-03-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
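CN1252876A (zh) | Information retrieval utilizing semantic representation of text | |
CN107993724B (zh) | Method and apparatus for processing medical intelligent question-answering data | |
KR101157693B1 (ko) | Multi-stage query processing system and method for use with a tokenspace repository | |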
US20220261427A1 (en) | Methods and system for semantic search in large databases | |
KR101661198B1 (ko) | Method and system for searching and providing information for natural language queries with simple or complex sentence structure | |
US6131082A (en) | Machine assisted translation tools utilizing an inverted index and list of letter n-grams | |
CN105045875B (zh) | Personalized information retrieval method and apparatus | |
US20030078915A1 (en) | Generalized keyword matching for keyword based searching over relational databases | |
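CN101051311A (zh) | Method for extracting head words from entries applied to a head-word extraction system | |
JPH1145241A (ja) | Kana-kanji conversion system and computer-readable recording medium storing a program for causing a computer to function as each means of the system | |
CN108763348B (zh) | Improved classification method for expanding word feature vectors of short texts | |
JP2002520712A (ja) | Data retrieval system and method and its use in a search engine | |
CN101042692A (zh) | Method and device for obtaining translations based on semantic prediction | |
WO2015062340A1 (zh) | Natural language search method and system compatible with keyword search | |
CN105335487A (zh) | Agricultural expert information retrieval system and method based on an agricultural technology information ontology library | |
CN102662936A (zh) | Chinese-English out-of-vocabulary word translation method combining Web mining, multiple features, and supervised learning | |
JP2011118689A (ja) | Search method and system | |
CN106649605A (zh) | Method and apparatus for triggering promotional keywords | |
CN102314464B (zh) | Lyrics search method and search engine | |
JP2007334388A (ja) | Clustering method, apparatus, program, and computer-readable recording medium | |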
Liu | The application of RAG technology in traditional Chinese medicine | |
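JP5298834B2 (ja) | Example-sentence matching translation device, program, and phrase translation device comprising the translation device | |
JP2003108595A (ja) | Information retrieval device, information retrieval method, and information retrieval program | |
CN118820407B (zh) | Large-language-model-based hybrid retrieval method and apparatus for life cycle flow data | |
CN112163065A (zh) | Information retrieval method, system, and medium |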
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |