
CN1252876A - Information retrieval utilizing semantic representation of text - Google Patents

Info

Publication number
CN1252876A
CN1252876A CN98804175A CN98804175A CN1252876A CN 1252876 A CN1252876 A CN 1252876A CN 98804175 A CN98804175 A CN 98804175A CN 98804175 A CN98804175 A CN 98804175A CN 1252876 A CN1252876 A CN 1252876A
Authority
CN
China
Prior art keywords
speech
logical form
document
mark
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN98804175A
Other languages
English (en)
Inventor
John J. Messerly
George E. Heidorn
Stephen D. Richardson
William B. Dolan
Karen Jensen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Publication of CN1252876A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99932 Access augmentation or optimizing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

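The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationships expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. Next, the tokenizer constructs from the primary logical form one or more alternative logical forms. It constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with a hypernym identified for that word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms. The tokenizer is preferably used to generate tokens both for constructing an index representing target documents and for processing a query against that index.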

Description

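Information retrieval utilizing semantic representation of text
The present invention relates to the field of information retrieval and, more specifically, to the field of information retrieval tokenization.
Information retrieval refers to the process of identifying occurrences in a target document of the words in a query or query document. Information retrieval can be gainfully applied in several situations, including processing explicit user search queries, identifying documents relating to a particular document, judging the similarity of two documents, extracting the features of a document, and summarizing a document.
Information retrieval typically involves a two-stage process: (1) In an indexing stage, a document is initially indexed by (a) converting each word in the document into a series of characters intelligible to and distinguishable by an information retrieval engine, called a "token" (that is, "tokenizing" the document), and (b) creating an index mapping from each token to the locations in the document where the token occurs. (2) In a query stage, a query (or query document) is similarly tokenized and compared to the index to identify the locations in the document at which tokens of the tokenized query occur.
FIG. 1 is an overview data flow diagram depicting the information retrieval process. In the indexing stage, a target document 111 is submitted to a tokenizer 120. The target document is made up of a number of strings, such as sentences, each occurring at a particular location in the target document. The strings in the target document and their word locations are passed to the tokenizer 120, which converts the words in each string into a series of tokens that are intelligible to and distinguishable by an information retrieval engine 130. An index construction portion 131 of the information retrieval engine 130 adds the tokens and their locations to an index 140. The index maps each unique token to the locations at which it occurs in the target document. This process may be repeated to add a number of different target documents to the index, if desired. If the index 140 represents the text of a number of target documents, the location information preferably includes an indication of the document to which each location corresponds.
In the query stage, a textual query 112 is submitted to the tokenizer 120. The query may be a single string or sentence, or may be an entire document made up of a number of strings. The tokenizer 120 converts the words in the text of the query 112 into tokens in the same manner that it converted the words in the target document into tokens. The tokenizer 120 passes these tokens to an index retrieval portion 132 of the information retrieval engine 130. The index retrieval portion searches the index 140 for occurrences of these tokens in the target document. For each token, the index retrieval portion identifies the locations at which the token occurs in the target document. This list of locations is returned as the query result 113.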
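Conventional tokenizers typically apply superficial transformations to the input text, such as changing each upper-case character to lower case, identifying the individual words in the input text, and removing suffixes from the words. For example, a conventional tokenizer might convert the input text string
The father is holding the baby.
into the following tokens:
the
father
is
hold
the
baby
This approach to tokenization tends to make searches based on it overinclusive of occurrences in which a word is used in a sense different from the sense intended in the query text. For example, the sample input text string uses the verb "hold" in the sense "to support or grasp." However, the token "hold" could match uses of the word "hold" that mean "the cargo area of a ship." This approach also tends to be overinclusive of occurrences in which the words relate to each other differently than do the words in the query text. For example, in the sample input text string above, "father" is the subject of the word "hold" and "baby" is the object, yet the string might match the sentence "The father and the baby held the toy," in which "baby" is a subject rather than an object. This approach is further underinclusive of occurrences that use a different, but semantically related, word in place of a word of the query text. For example, the input text string above would not match the text string "The parent is holding the baby." Given these disadvantages of conventional tokenization, a tokenizer that encodes the semantic relationships implicit in the tokenized text would have significant utility.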
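The conventional tokenization described above can be sketched as follows. This is a minimal illustration, not the tokenizer of any particular engine: the function name `conventional_tokenize` and the single hard-coded suffix rule are assumptions made for this example.

```python
import re

def conventional_tokenize(text):
    """Lowercase the text, split it into words, and strip a common suffix."""
    tokens = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Crude suffix stripping (assumption): only "-ing" is handled here.
        if word.endswith("ing") and len(word) > 5:
            word = word[:-3]
        tokens.append(word)
    return tokens

print(conventional_tokenize("The father is holding the baby."))
# ['the', 'father', 'is', 'hold', 'the', 'baby']
```

Because the token "hold" carries neither a sense nor a grammatical role, the same token would also be produced for a sentence about the hold of a ship, which is the overinclusion the text describes.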
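The present invention is directed to performing information retrieval using an improved tokenizer that parses input text to identify logical forms, then expands the logical forms using hypernyms. When used in conjunction with conventional information retrieval index construction and querying, the invention reduces the number of identified occurrences in which a different sense was intended or in which the words bear different relationships to each other, and increases the number of identified occurrences in which different but semantically related terms are used.
The invention overcomes the problems associated with conventional tokenization by parsing both the indexed text and the query text to perform lexical, syntactic, and semantic analysis of this input text. The parsing process produces one or more logical forms, which identify the words that perform primary roles in the query text and their intended senses, and which further identify the relationships between those words. The parser preferably produces logical forms that relate the deep subject, verb, and deep object of the input text. For example, for the input text "The father is holding the baby," the parser might produce the following logical form:
deep subject    verb    deep object
father          hold    baby
The parser further ascribes to these words the particular senses in which they are used in the input text.
Using a digital dictionary or thesaurus (also called a "lexical knowledge base") that identifies, for a particular sense of a word, senses of other words that are generic terms for that sense ("hypernyms"), the invention changes the words within the logical forms produced by the parser to their hypernyms, creating additional logical forms whose overall meaning is close to that of the original logical forms. For example, based on indications from the lexical knowledge base that a sense of "parent" is a hypernym of the ascribed sense of "father," that a sense of "touch" is a hypernym of the ascribed sense of "hold," and that a sense of "child" and a sense of "person" are hypernyms of the ascribed sense of "baby," the invention might create the following additional logical forms:
deep subject    verb     deep object
parent          hold     baby
father          touch    baby
parent          touch    baby
father          hold     child
parent          hold     child
father          touch    child
parent          touch    child
father          hold     person
parent          hold     person
father          touch    person
parent          touch    person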
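The eleven additional logical forms listed above are every combination of original word or hypernym in each position, minus the original triple. A minimal sketch, assuming a hypothetical `HYPERNYMS` table that stands in for the lexical knowledge base:

```python
from itertools import product

# Hypothetical hypernym table holding only the example relations from the text.
HYPERNYMS = {
    "father": ["parent"],
    "hold": ["touch"],
    "baby": ["child", "person"],
}

def additional_logical_forms(primary):
    """Return every form that replaces one or more words with a hypernym."""
    choices = [[word] + HYPERNYMS.get(word, []) for word in primary]
    return [form for form in product(*choices) if form != primary]

forms = additional_logical_forms(("father", "hold", "baby"))
print(len(forms))  # 11
```

The count follows from 2 x 2 x 3 = 12 combinations, one of which is the original logical form itself.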
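The invention then transforms all of the generated logical forms into tokens intelligible to the information retrieval system, which compares the tokenized query to the index, and submits them to that information retrieval system.
FIG. 1 is an overview data flow diagram depicting the information retrieval process.
FIG. 2 is a high-level block diagram of the general-purpose computer system upon which the facility preferably operates.
FIG. 3 is an overview flow diagram showing the steps preferably performed by the facility in order to construct and access an index semantically representing the target documents.
FIG. 4 is a flow diagram showing the tokenize routine used by the facility to generate tokens for an input sentence.
FIG. 5 is a logical form diagram showing a sample logical form.
FIG. 6 is an input text diagram showing an input text segment for which the facility would construct the logical form shown in FIG. 5.
FIG. 7A is a lexical knowledge base diagram showing sample hypernym relationships identified by a lexical knowledge base.
FIG. 7B is a lexical knowledge base diagram showing the selection of hypernyms of the deep subject of the primary logical form, man (sense 2).
FIG. 8 is a lexical knowledge base diagram showing the selection of hypernyms of the verb of the primary logical form, kiss (sense 1).
FIGS. 9 and 10 are lexical knowledge base diagrams showing the selection of hypernyms of the deep object of the primary logical form, pig (sense 2).
FIG. 11 is a logical form diagram showing the expanded logical form.
FIG. 12 shows the derivative logical forms created by permuting the expanded primary logical form.
FIG. 13 is an index diagram showing sample contents of the index.
FIG. 14 is a logical form diagram showing the logical form preferably constructed by the facility for the query "man kissing horse."
FIG. 15 shows the expansion of the primary logical form using hypernyms.
FIG. 16 is a lexical knowledge base diagram showing the selection of hypernyms of the deep object of the query logical form, horse (sense 1).
FIG. 17 is a partial logical form diagram showing the partial logical form corresponding to a partial query containing only a deep subject and a verb.
FIG. 18 is a partial logical form diagram showing the partial logical form corresponding to a partial query containing only a verb and a deep object.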
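The present invention is directed to performing information retrieval utilizing semantic representation of text. When used in conjunction with conventional information retrieval index construction and querying, the invention reduces the number of identified occurrences in which a different sense was intended or in which the words bear different relationships to each other, and increases the number of identified occurrences in which different but semantically related terms are used.
In a preferred embodiment, the conventional tokenizer shown in FIG. 1 is replaced with an improved information retrieval tokenization facility ("the facility") that parses input text to identify logical forms, then expands the logical forms using hypernyms. The invention overcomes the problems associated with conventional tokenization by parsing both the indexed text and the query text to perform lexical, syntactic, and semantic analysis of this input text. The parsing process produces one or more logical forms, which identify the words that perform primary roles in the query text and their intended senses, and which further identify the relationships between those words. The parser preferably produces logical forms that relate the deep subject, verb, and deep object of the input text. For example, for the input text "The father is holding the baby," the parser might produce a logical form indicating that the deep subject is "father," the verb is "hold," and the deep object is "baby." Because transforming input text into a logical form distills the input text to its fundamental meaning by eliminating modifiers and ignoring differences in tense and voice, transforming input text segments into logical forms tends to unify the many different ways in which a natural language may express the same idea. The parser further identifies the particular senses in which these words are used in the input text.
Using a digital dictionary or thesaurus (also called a "lexical knowledge base") that identifies, for a particular sense of a word, senses of other words that are generic terms for that sense ("hypernyms"), the invention changes the words within the logical forms produced by the parser to their hypernyms, creating additional logical forms whose overall meaning is close to that of the original logical forms. The invention then transforms all of the generated logical forms into tokens intelligible to the information retrieval system, which compares the tokenized query to the index, and submits them to that information retrieval system.
FIG. 2 is a high-level block diagram of the general-purpose computer system upon which the facility preferably operates. The computer system 200 contains a central processing unit (CPU) 210, input/output devices 220, and a computer memory (memory) 230. Among the input/output devices is a storage device 221, such as a hard disk drive. The input/output devices also include a computer-readable media drive 222, which can be used to install software products, including the facility, that are provided on a computer-readable medium such as a CD-ROM. The input/output devices further include an Internet connection 223 enabling the computer system 200 to communicate with other computer systems via the Internet. The computer programs that preferably comprise the facility 240 reside in the memory 230 and execute on the CPU 210. The facility 240 includes a rule-based parser 241 for parsing input text segments to be tokenized in order to produce logical forms. The facility 240 further includes a lexical knowledge base 242 used by the parser to ascribe sense numbers to the words in the logical form. The facility also uses the lexical knowledge base to identify hypernyms of the words in the generated logical forms. The memory 230 preferably also contains an index 250 for mapping from tokens generated from the target documents to locations in the target documents. The memory 230 also contains an information retrieval engine ("IR engine") 260 for storing tokens generated from the target documents in the index 250 and for identifying tokens in the index that match tokens generated from queries. While the facility is preferably implemented on a computer system configured as described above, those skilled in the art will recognize that it may also be implemented on computer systems having different configurations.
FIG. 3 is an overview flow diagram showing the steps preferably performed by the facility in order to construct and access an index semantically representing the target documents. Briefly, the facility first semantically indexes the target documents by converting each sentence or sentence fragment of the target documents into a number of tokens. These tokens represent an expanded logical form that describes the relationship between the important words in the sentence and that includes hypernyms having similar meanings. The facility stores these "semantic tokens" in the index, together with the location in the target documents where the sentence occurs. After all of the target documents have been indexed, the facility is able to process information retrieval queries against the index. For each such query received, the facility tokenizes the text of the query in the same way it tokenized sentences from the target documents, that is, by converting the sentence into semantic tokens that together represent an expanded logical form for the query text. The facility then compares these semantic tokens to the semantic tokens stored in the index to identify the locations in the target documents for which these semantic tokens have been stored, and ranks the target documents containing these semantic tokens in order of their relevance to the query. The facility can preferably update the index to include the semantic tokens of new target documents at any time.
Referring to FIG. 3, in steps 301-304, the facility loops through each sentence in the target documents. In step 302, the facility invokes a routine, shown in FIG. 4, to tokenize the sentence.
FIG. 4 is a flow diagram showing the tokenize routine used by the facility to generate tokens for an input sentence or other input text segment. In step 401, the facility constructs a primary logical form from the input text segment. As discussed above, a logical form represents the fundamental meaning of a sentence or sentence fragment. The logical forms are produced by applying the parser 241 (FIG. 2) to subject the input text segment to syntactic and semantic parsing. For a detailed discussion of the construction of logical forms representing an input text string, refer to U.S. patent application No. 08/674,610, which is hereby incorporated by reference.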
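The logical form used by the facility preferably isolates the principal verb of the sentence, the noun that is the real subject of the verb ("deep subject"), and the noun that is the real object of the verb ("deep object"). FIG. 5 is a logical form diagram showing a sample primary logical form. The logical form has three elements: a deep subject element 510, a verb element 520, and a deep object element 530. It can be seen that the deep subject of the logical form is sense 2 of the word "man." For words having more than one sense, the sense number indicates the particular sense ascribed to the word by the parser, as defined by the lexical knowledge base used by the parser. For example, the word "man" could have a first sense meaning human being and a second sense meaning adult male. The verb of the logical form is the first sense of the word "kiss." Finally, the deep object is the second sense of the word "pig." An abbreviated version of this logical form is an ordered triple 550 whose first element is the deep subject, whose second element is the verb, and whose third element is the deep object:
(man,kiss,pig)
The logical form shown in FIG. 5 characterizes a number of different sentences and sentence fragments. For example, FIG. 6 is an input text diagram showing an input text segment for which the facility would construct the logical form shown in FIG. 5. FIG. 6 shows the input text sentence fragment "man kissing a pig." It can be seen that this phrase occurs at word number 150 of document 5, occupying word positions 150, 151, 152, and 153. When the facility tokenizes this input text segment, it generates the logical form shown in FIG. 5. The facility would also generate the logical form shown in FIG. 5 for the following input text segments:
The pig was kissed by an unusual man.
The man will kiss the largest pig.
Many pigs have been kissed by that man.
As discussed above, because transforming input text into a logical form distills the input text to its fundamental meaning by eliminating modifiers and ignoring differences in tense and voice, transforming input text segments into logical forms tends to unify the many different ways in which a natural language may express the same idea.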
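Returning to FIG. 4, after the facility has constructed the primary logical form from the input text, such as the logical form shown in FIG. 5, the facility continues in step 402 to expand this primary logical form using hypernyms. After step 402, the tokenize routine returns.
As mentioned above, a hypernym is a genus term that has an "is a" relationship with a particular word. For example, the word "vehicle" is a hypernym of the word "automobile." The facility preferably uses a lexical knowledge base to identify hypernyms of the words in the primary logical form. Such lexical knowledge bases typically contain semantic links identifying the hypernyms of a word.
FIG. 7A is a lexical knowledge base diagram showing sample hypernym relationships identified by a lexical knowledge base. Note that FIG. 7A, like the lexical knowledge base diagrams that follow, has been simplified to facilitate this discussion, and omits information commonly found in lexical knowledge bases that is not directly relevant to it. Each ascending arrow in FIG. 7A connects a word to its hypernym. For example, an arrow connects the word man (sense 2) 711 to the word person (sense 1) 714, indicating that person (sense 1) is a hypernym of man (sense 2). Conversely, man (sense 2) is said to be a "hyponym" of person (sense 1).
In identifying hypernyms with which to expand the primary logical form, the facility selects one or more hypernyms for each word of the primary logical form based upon the coherency of the hypernym's hyponyms. By selecting hypernyms in this manner, the facility generalizes the meaning of the logical form beyond the meaning of the input text segment, but by a controlled amount. For a particular word of a primary logical form, the facility first selects the immediate hypernym of that word. For example, with reference to FIG. 7A, starting with man (sense 2) 711, which occurs in the primary logical form, the facility selects its hypernym, person (sense 1) 714. The facility next decides whether to also select the hypernym of person (sense 1) 714, animal (sense 3) 715, by determining whether person (sense 1) 714 has a coherent hyponym set with respect to the starting word man (sense 2) 711. Person (sense 1) 714 has a coherent hyponym set with respect to man (sense 2) 711 if a large proportion of the hyponyms of all senses of the word person, other than the starting word man (sense 2) 711, bear at least a threshold level of similarity to the starting word man (sense 2) 711.
In order to determine the degree of similarity between the hyponyms of the different senses of the hypernym, the facility preferably consults the lexical knowledge base to obtain similarity weights indicating the degree of similarity between these word senses. FIG. 7B is a lexical knowledge base diagram showing the similarity weights between man (sense 2) and the other hyponyms of person (sense 1) and person (sense 5). The diagram shows that the similarity weight between man (sense 2) and woman (sense 1) is ".0075"; between man (sense 2) and child (sense 1), ".0029"; between man (sense 2) and villain (sense 1), ".0003"; and between man (sense 2) and lead (sense 7), ".0002". These similarity weights are preferably calculated by the lexical knowledge base based on a network of semantic relations that it maintains between word sense pairs. For a detailed discussion of calculating similarity weights between word sense pairs using a lexical knowledge base, refer to the U.S. patent application entitled "Determining Similarity Between Words," U.S. patent application No. (Patent Attorney Docket No. 661005.524), which is hereby incorporated by reference.
In order to decide from these similarity weights whether the hyponym set is coherent, the facility determines whether a threshold percentage of the similarity weights meet or exceed a similarity weight threshold. While the preferred threshold percentage is 90%, the threshold percentage is preferably adjusted to optimize the performance of the facility. The similarity weight threshold may also be configured to optimize the performance of the facility, and is preferably coordinated with the overall distribution of similarity weights provided by the lexical knowledge base. Here, a threshold of ".0015" is used. The facility therefore determines whether at least 90% of the similarity weights between the starting word and the other hyponyms of all senses of the hypernym are at or above the ".0015" similarity weight threshold. It can be seen from FIG. 7B that the hyponyms of person do not satisfy this condition with respect to man (sense 2): although the similarity weights between man (sense 2) and woman (sense 1) and between man (sense 2) and child (sense 1) are greater than ".0015", the similarity weights between man (sense 2) and villain (sense 1) and between man (sense 2) and lead (sense 7) are less than ".0015". The facility therefore does not select the further hypernym animal (sense 3) 715, or any hypernyms of animal (sense 3). As a result, only the hypernym person (sense 1) 714 is selected to expand the primary logical form.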
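The coherency test can be sketched as below, using the FIG. 7B weights quoted in the text. The function name `is_coherent`, its parameter names, and the treatment of an empty hyponym set are assumptions made for this illustration; the text does not say how an empty set is handled.

```python
def is_coherent(similarity_weights, weight_threshold=0.0015, percent_threshold=0.90):
    """True if enough of the weights meet the similarity weight threshold."""
    weights = list(similarity_weights)
    if not weights:
        return False  # assumption: an empty hyponym set is treated as not coherent
    passing = sum(1 for w in weights if w >= weight_threshold)
    return passing / len(weights) >= percent_threshold

# Similarity of man (sense 2) to the other hyponyms of "person" (FIG. 7B).
person_hyponyms = {"woman": 0.0075, "child": 0.0029, "villain": 0.0003, "lead": 0.0002}

# Only 2 of 4 weights reach .0015, so the climb stops at person (sense 1).
print(is_coherent(person_hyponyms.values()))  # False
```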
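To expand the primary logical form, the facility also selects hypernyms of its verb and deep object. FIG. 8 is a lexical knowledge base diagram showing the selection of hypernyms of the verb of the primary logical form, kiss (sense 1). It can be seen from the diagram that touch (sense 2) is the hypernym of kiss (sense 1). The diagram also shows the similarity weights between kiss (sense 1) and the other hyponyms of all senses of touch. The facility first selects the immediate hypernym of the verb kiss (sense 1), touch (sense 2). To decide whether to select the hypernym of touch (sense 2), interact (sense 9), the facility determines how many of the similarity weights between kiss (sense 1) and the other hyponyms of all senses of touch are at least as large as the similarity weight threshold. Because only two of these four similarity weights are at least as large as the ".0015" similarity weight threshold, the facility does not select the hypernym of touch (sense 2), interact (sense 9).
FIGS. 9 and 10 are lexical knowledge base diagrams showing the selection of hypernyms of the deep object of the primary logical form, pig (sense 2). It can be seen from FIG. 9 that the facility selects both the hypernym of pig (sense 2), swine (sense 1), and the hypernym of swine (sense 1), animal (sense 3), to expand the primary logical form, because more than 90% (in fact, 100%) of the hyponyms of the only sense of swine have similarity weights at or above the ".0015" similarity weight threshold. It can be seen from FIG. 10 that the facility does not go on to select the hypernym of animal (sense 3), organism (sense 1), because fewer than 90% (in fact, 25%) of the hyponyms of the senses of animal have similarity weights at or above the ".0015" similarity weight threshold.
FIG. 11 is a logical form diagram showing the expanded logical form. It can be seen from FIG. 11 that the deep subject element 1110 of the expanded logical form contains the hypernym person (sense 1) 1112 in addition to the word man (sense 2) 1111. It can be seen that the verb element 1120 contains the hypernym touch (sense 2) 1122 as well as the word kiss (sense 1) 1121. It can further be seen that the deep object of the expanded logical form contains the hypernyms swine (sense 1) and animal (sense 3) 1132 in addition to the word pig (sense 2) 1131.
By substituting hypernyms for the original words in each element of the expanded logical form, the facility can create a reasonably large number of derivative logical forms that are reasonably close in meaning to the primary logical form. FIG. 12 shows the derivative logical forms created by permuting the expanded primary logical form. It can be seen from FIG. 12 that this permutation creates eleven derivative logical forms, each of which characterizes the meaning of the input text in a reasonably accurate way. For example, the derivative logical form
(person,touch,pig)
shown in FIG. 12 is very close in meaning to the sentence fragment
man kissing a pig
The expanded logical form shown in FIG. 11 represents the primary logical form plus these eleven derivative logical forms, which are expressed more compactly as the expanded logical form 1200: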
((man OR person),(kiss OR touch),(pig OR swine OR animal))
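The facility generates logical tokens from this expanded logical form in a manner that allows them to be processed by a conventional information retrieval engine. First, the facility appends a reserved character to each word in the expanded logical form to identify whether the word occurred in the input text segment as a deep subject, verb, or deep object. This ensures that, when the word "man" occurs in the expanded logical form of a query input text as a deep subject, it will not match the word "man" stored in the index as part of an expanded logical form in which it was the verb. A sample mapping of reserved characters to logical form elements is as follows:
logical form element    identifying character
deep subject            _
verb                    ^
deep object             #
Using this sample mapping of reserved characters, the tokens generated for the logical form "(man,kiss,pig)" would include "man_", "kiss^", and "pig#".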
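The reserved-character mapping above can be applied with a few lines of code. This is only an illustrative sketch; the names `ROLE_CHARACTERS` and `role_tokens` are assumptions.

```python
# Sample mapping from the text: one reserved character per logical form element.
ROLE_CHARACTERS = ("_", "^", "#")  # deep subject, verb, deep object

def role_tokens(logical_form):
    """Append the role's reserved character to each word of a (subject, verb, object) triple."""
    return [word + mark for word, mark in zip(logical_form, ROLE_CHARACTERS)]

print(role_tokens(("man", "kiss", "pig")))  # ['man_', 'kiss^', 'pig#']
```

With this encoding, "man_" (man as deep subject) and "man^" (man as verb) are distinct tokens to the information retrieval engine.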
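The indices generated by conventional information retrieval engines commonly map each token to the particular locations in the target documents at which the token occurs. A conventional information retrieval engine may represent such a target document location using a document number, identifying the target document containing the occurrence of the token, and a word number, identifying the position of the occurrence of the token in that target document. Such target document locations allow a conventional information retrieval engine to identify words that occur together in a target document in response to a query using a "PHRASE" operator, which requires the words that it joins to be adjacent in the target document. For example, the query "red PHRASE bicycle" would match "red" at document 5, word 611 and "bicycle" at document 5, word 612, but would not match "red" at document 7, word 762 and "bicycle" at document 7, word 202. Storing target document locations in the index further allows a conventional information retrieval engine to identify, in response to a query, the points at which the queried tokens occur in the target documents.
For expanded logical forms from a target document input text segment, the facility preferably similarly assigns artificial target document locations to each token, even though the tokens of the expanded logical form do not actually occur in the target document at these locations. Assigning these target document locations both (A) enables conventional search engines to identify combinations of semantic tokens corresponding to a single primary or derivative logical form using the PHRASE operator, and (B) enables the facility to relate the assigned locations to the actual location of the input text segment in the target document. The facility therefore assigns locations to semantic tokens as follows:
logical form element    location
deep subject            (location of 1st word of input text segment)
verb                    (location of 1st word of input text segment) + 1
deep object             (location of 1st word of input text segment) + 2
The facility therefore would assign target document locations as follows to the tokens of the expanded logical form for "(man,kiss,pig)", derived from a sentence beginning at document 5, word 150: "man_" and "person_": document 5, word 150; "kiss^" and "touch^": document 5, word 151; and "pig#", "swine#", and "animal#": document 5, word 152.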
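The location assignment above can be sketched as follows. The function name `positioned_tokens` and the tuple layout `(token, document number, word number)` are assumptions made for this example, not a format prescribed by the text.

```python
ROLE_CHARACTERS = ("_", "^", "#")  # deep subject, verb, deep object

def positioned_tokens(expanded_form, doc_number, first_word):
    """Give every word of an expanded logical form an artificial location.

    All alternatives for one element share a word number: the subject sits at
    the first word of the segment, the verb at +1, and the object at +2.
    """
    postings = []
    for offset, (alternatives, mark) in enumerate(zip(expanded_form, ROLE_CHARACTERS)):
        for word in alternatives:
            postings.append((word + mark, doc_number, first_word + offset))
    return postings

expanded = (("man", "person"), ("kiss", "touch"), ("pig", "swine", "animal"))
for posting in positioned_tokens(expanded, 5, 150):
    print(posting)
```

Placing the three elements at consecutive word numbers is what lets an unmodified PHRASE operator recombine them into a single logical form at query time.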
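Returning to FIG. 3, in step 303, the facility stores the tokens created by the tokenize routine in the index, together with the locations at which they occur. FIG. 13 is an index diagram showing sample contents of the index. The index maps each token to the identity of the document and the location within the document at which the token occurs. Note that, although the index is shown as a table in order to show the mappings more clearly, the index is in practice preferably stored in one of a number of other formats that support locating a token in the index more efficiently, such as a tree format. Further, the contents of the index are preferably compressed, using techniques such as prefix compression, to minimize the size of the index.
It can be seen that, in accordance with step 303, the facility has stored mappings in the index 1300 for each of the words in the expanded logical form. Mappings have been stored in the index from the deep subject words "man" and "person" to the target document location at document number 5, word number 150. Word number 150 is the word position at which the input text segment shown in FIG. 6 begins. It can be seen that the facility has appended the reserved character "_" to the tokens corresponding to the deep subject words. By appending this reserved character, the facility is able, when later searching the index, to retrieve occurrences of these words as the deep subject of a logical form without retrieving their occurrences as verbs or deep objects. Similarly, the index contains tokens for the verb words "kiss" and "touch." The entries for these verb words map them to the target document location at document number 5, word number 151, one word after the target document location of the deep subject words. It can further be seen that the reserved character "^" has been appended to the tokens for these verb words, so that these occurrences are not later retrieved as deep subject or deep object elements. Likewise, the index contains tokens for the deep object words "animal," "pig," and "swine," mapping them to the target document location at document number 5, word number 152, two words after the target document location at which the phrase begins. The reserved character "#" is appended to the tokens for the deep object words to identify them as deep objects in the index. With the index in the state shown, the input text segment shown in FIG. 6 can be found by searching the index for any of the derivative logical forms shown in FIG. 12.
In a preferred embodiment in which the facility stores, in the same index, both a mapping of the words literally occurring in the target documents to their actual locations in the target documents and the semantic representation of the target documents, the word number values of the semantic tokens of the semantic representation are preferably incremented by a constant larger than the number of words in any document, in order to distinguish semantic tokens from literal tokens when the index is accessed. To simplify FIG. 13, the addition of this constant is not shown.
In this example, the facility adds a token for every word of the expanded logical form to the index to form the semantic representation of the target documents. In one preferred embodiment, however, the facility limits the set of expanded logical form tokens added to the index to those logical form tokens that are likely to be effective in distinguishing between documents among the target documents. To so limit the set of expanded logical form tokens added to the index, the facility preferably determines the inverse document frequency of each token, the formula for which is given by equation (1) below. In this embodiment, the facility adds to the index only tokens whose inverse document frequency exceeds a minimum threshold.
Returning to FIG. 3, after storing the tokens for the current sentence of the target document in the index, in step 304 the facility loops back to step 301 to process the next sentence in the target documents. When all of the sentences of the target documents have been processed, the facility continues at step 305. In step 305, the facility receives the text of a query. In steps 306-308, the facility processes the received query. In step 306, the facility invokes the tokenize routine to tokenize the query text. FIG. 14 is a logical form diagram showing the logical form preferably constructed by the facility for the query "man kissing horse" in accordance with step 401 (FIG. 4). It can be seen from the logical form diagram that the deep subject is man (sense 2), the verb is kiss (sense 1), and the deep object is horse (sense 1). This primary logical form is expressed more succinctly as the primary logical form 1450:
(man,kiss,horse)
FIG. 15 shows the expansion of the primary logical form using hypernyms in accordance with step 402 (FIG. 4). It can be seen from FIG. 15 that, as with the sample input text from the target document, the deep subject man (sense 2) has been expanded with the hypernym person (sense 1), and the verb kiss (sense 1) has been expanded with the hypernym touch (sense 2). It can further be seen that the deep object horse (sense 1) has been expanded with the hypernym animal (sense 3).
FIG. 16 is a lexical knowledge base diagram showing the selection of hypernyms of the deep object of the query logical form, horse (sense 1). It can be seen from FIG. 16 that the facility does not select the hypernym of animal (sense 3), organism (sense 1), because fewer than 90% of the hyponyms of animal (sense 3) have similarity weights at or above the ".0015" similarity weight threshold. The facility therefore uses only the hypernym animal (sense 3) to expand the logical form.
Returning to FIG. 3, in step 307, the facility uses the expanded logical form 1550 (FIG. 15), constructed using hypernyms of the word senses in the primary logical form, to retrieve from the index the locations in the target documents at which matching tokens occur. The facility preferably does so by issuing the following query against the index:
(man_ OR person_) PHRASE (kiss^ OR touch^) PHRASE (horse# OR animal#)
The PHRASE operator matches occurrences in which the word position of the operand following it is one greater than the word position of the operand preceding it. The query therefore matches where the deep subject man_ or person_ precedes the verb kiss^ or touch^, which in turn precedes the deep object horse# or animal#. It can be seen from the index in FIG. 13 that this query is satisfied at document number 5, word number 150.
If this query is not satisfied in the index, the facility continues by submitting the query in two different partial forms. The first partial form contains only the deep subject and the verb, and not the deep object:
(man_ OR person_) PHRASE (kiss^ OR touch^)
FIG. 17 is a partial logical form diagram showing the partial logical form corresponding to this first partial query. The second partial form of the query contains the verb and the deep object, but not the deep subject:
(kiss^ OR touch^) PHRASE (horse# OR animal#)
FIG. 18 is a partial logical form diagram showing the partial logical form corresponding to this second partial query. These partial queries match logical forms in the index having a different deep subject or deep object, and match partial logical forms that lack a deep subject or deep object. The partial queries take into account differences between the query input text segment and the target document input text segments, including the use of pronouns and implied deep subjects and deep objects.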
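The full query and its two partial fallbacks can be sketched against a toy copy of the FIG. 13 index. This is an illustrative reading of the text, not the IR engine's implementation: the names `INDEX`, `phrase_query`, and `semantic_query` are assumptions, as is the choice to return the union of the two partial results.

```python
# Toy index after FIG. 13: token -> set of (document number, word number).
INDEX = {
    "man_": {(5, 150)}, "person_": {(5, 150)},
    "kiss^": {(5, 151)}, "touch^": {(5, 151)},
    "pig#": {(5, 152)}, "swine#": {(5, 152)}, "animal#": {(5, 152)},
}

def phrase_query(index, *groups):
    """Each group is a list of OR-ed tokens; PHRASE requires consecutive word numbers."""
    def locations(group):
        found = set()
        for token in group:
            found |= index.get(token, set())
        return found

    # Each candidate is (document, first word number, last word number).
    candidates = {(doc, word, word) for doc, word in locations(groups[0])}
    for group in groups[1:]:
        here = locations(group)
        candidates = {(doc, first, last + 1)
                      for doc, first, last in candidates
                      if (doc, last + 1) in here}
    return sorted((doc, first) for doc, first, _ in candidates)

def semantic_query(index, subjects, verbs, objects):
    """Try the full query first, then the subject-verb and verb-object partial forms."""
    full = phrase_query(index, subjects, verbs, objects)
    if full:
        return full
    partial = set(phrase_query(index, subjects, verbs))
    partial |= set(phrase_query(index, verbs, objects))
    return sorted(partial)

hits = semantic_query(INDEX, ["man_", "person_"], ["kiss^", "touch^"], ["horse#", "animal#"])
print(hits)  # [(5, 150)]
```

The query "man kissing horse" matches the indexed fragment "man kissing a pig" only through the shared hypernym token "animal#", which is the broadened recall the expansion is meant to provide.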
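Returning to FIG. 3, after identifying matches of tokens in the index, the facility continues in step 308 to rank the target documents in which particular combinations of matching tokens, corresponding to a primary or derivative logical form, occur, in the order of their relevance to the query. In various embodiments of the invention, the facility employs one or more of a number of well-known approaches to ranking documents by relevance, including Jaccard weighting and binary term independence weighting. The facility preferably uses a combination of inverse document frequency and term frequency weighting to rank the matching target documents.
The inverse document frequency weighting characterizes the ability of a token combination to distinguish between documents, giving greater weight to token combinations that occur in fewer of the target documents. For example, for a group of target documents on the subject of photography, the logical form
(photographer,frame,subject)
could occur in every document of the group, and thus would not be a very good basis for distinguishing between the documents. Because this logical form occurs in every target document, it has a relatively small inverse document frequency. The formula for the inverse document frequency of a token combination is as follows:
[Equation (1): provided in the source only as image A9880417500201]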
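The term frequency weighting of a token combination in a document measures the extent to which the document is dedicated to the token combination, and assumes that a document in which a particular query token occurs a large number of times is more relevant than a document in which the query token occurs fewer times. The formula for the term frequency weighting of a token combination in a document is as follows:
term frequency(token combination, document) = number of times the token combination occurs in the document    (2)
The facility uses a score for each matching document to rank the documents. The facility first calculates a score for each matching token combination in each document, using the following formula:
score(token combination, document)
= inverse document frequency(token combination) × term frequency(token combination, document)    (3)
The facility then calculates the score for each matching document by choosing the highest score of any matching token combination in that document, in accordance with the following formula:
score(document) = maximum, over all matching token combinations in the document, of score(token combination, document)    (4)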
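The scoring steps above can be sketched as follows. Equation (1) is not reproduced in this text, so the logarithmic form used for `inverse_document_frequency` is an assumption (the common log of total documents over documents containing the combination), chosen only because it gives greater weight to rarer combinations as the text requires. The document layout and all function names are likewise hypothetical.

```python
import math

def inverse_document_frequency(combination, documents):
    """Assumed form: log(total documents / documents containing the combination)."""
    containing = sum(1 for combos in documents.values() if combination in combos)
    return math.log(len(documents) / containing) if containing else 0.0

def document_score(doc_id, documents, matching):
    """Equations (2)-(4): best IDF x term frequency among the matching combinations."""
    combos = documents[doc_id]
    scores = [inverse_document_frequency(c, documents) * combos.count(c)
              for c in matching if c in combos]
    return max(scores, default=0.0)

# Hypothetical collection: each document lists the logical forms found in it.
PHOTO = ("photographer", "frame", "subject")
KISS = ("man", "kiss", "pig")
documents = {1: [PHOTO, KISS, KISS], 2: [PHOTO], 3: [PHOTO]}

for doc_id in documents:
    print(doc_id, document_score(doc_id, documents, [PHOTO, KISS]))
```

Under this assumed formula, the logical form that occurs in every document contributes a score of zero, so only document 1, which also contains the rarer combination, receives a positive score.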
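Once the facility has calculated a score for each document, it may augment these scores to reflect terms of the query other than those directed to semantic matching. After augmenting the score for each document, if desired, the facility calculates a normalized score for each document by taking into account the size of the document [normalization formula not reproduced in the source text]. The size(document) term may be any reasonable measure of the size of a document, such as the number of characters, words, sentences, or sentence fragments in the document. The document score may alternatively be normalized using a number of other normalization techniques, including cosine measure normalization, sum of term weights normalization, and maximum term weights normalization.
After calculating a normalized score for each matching document, the facility ranks the matching documents in order of their normalized scores. A user may preferably select one of the matching documents from the ranked list to obtain the locations of the matching tokens in that document, or to display the matching portion of that document.
Returning to FIG. 3, after ranking the matching target documents in step 308, the facility preferably continues at step 305 to receive the text of the next query against the index.
The above discusses ranking by relevance the documents containing matching tokens. Additional preferred embodiments of the invention similarly rank by relevance document groups and document sections, respectively, that contain matches. For target documents that are organized into document groups each containing one or more documents, the facility preferably ranks the document groups in which matches occur by relevance in order to identify the most relevant document groups for further querying. Further, the facility is preferably configurable to divide each target document into sections and to rank the relevance of the document sections in which matches occur. These document sections may be identified contiguously within a target document either by selecting a certain number of bytes, words, or sentences, or by using structural, formatting, or linguistic cues occurring in the target document. The facility may preferably also identify non-contiguous document sections dealing with particular themes.
While the present invention has been shown and described with reference to preferred embodiments, those skilled in the art will understand that various changes or modifications in form and detail may be made without departing from the scope of the invention. For example, the tokenizer may directly adopt or generate tokens corresponding to a complete logical form structure, instead of tokens corresponding to a single word within a logical form structure, and store such tokens in the index. Also, various well-known techniques may be applied to include other types of searching in a query that has a semantic matching component. Further, a query may contain a number of semantic matching components. In addition, semantic relationships identified between words other than hypernyms may be used to expand the primary logical form. The facility may also expand the primary logical form using precompiled lists of substitutable words for each word of the primary logical form, rather than generating lists of hypernyms from a lexical knowledge base at runtime as described above. Moreover, to improve matching precision, the tokenizer may encode in the token for a word the sense number identified for that word. In this case, the test for coherency of the hyponym set is reduced, since similarity need not be tested for all senses of the selected hypernym. In the example, only the hyponyms of sense 1 of the word person need bear a threshold level of similarity to the starting sense of the word man (sense 2). Because the possible matching terms in the index are less ambiguous, the set of terms that might produce false hits can be constrained. For this reason, it is only necessary to test those senses that have a hypernym relationship to the word in the logical form.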

Claims (17)

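1. A method in a computer system for generating information retrieval tokens from an input string, the method comprising the steps of:
creating from the input string a primary logical form characterizing a semantic relationship between selected words in the input string;
identifying hypernyms of the selected words in the input string;
constructing from the primary logical form one or more alternative logical forms, each alternative logical form being constructed by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with a hypernym identified for that selected word; and
generating tokens representing both the primary logical form and the alternative logical forms, the generated tokens being distinguishable by an information retrieval engine.
2. The method of claim 1 wherein the constructing step includes the step of parsing the input string to discern its syntactic and semantic structure.
3. The method of claim 1 wherein the identifying step includes the steps of:
for each selected word in the input string:
retrieving from a lexical knowledge base one or more hypernyms of the selected word, each hypernym having a similarity value characterizing its similarity in meaning to the selected word; and
identifying all hypernyms whose similarity values exceed a preestablished threshold.
4. The method of claim 1, further comprising the steps of:
before the constructing step, selecting the input string from a search query; and
submitting the generated tokens to a query engine for comparison with a representation of one or more target documents.
5. The method of claim 1, further comprising the steps of:
before the constructing step, selecting the input string from a body of text to be indexed; and
submitting the generated tokens to an indexing subsystem for storage in an index representing the body of text.
6. The method of claim 5, further comprising the step of determining the inverse document frequency of each word occurring in the alternative logical forms, wherein the submitting step does not submit to the indexing subsystem tokens representing alternative logical forms that contain a word whose inverse document frequency is smaller than a predetermined minimum inverse document frequency.
7. The method of claim 5, further comprising the steps of:
after the submitting step, determining the inverse document frequency of each word occurring in the alternative logical forms; and
removing from the index tokens representing alternative logical forms that contain a word whose inverse document frequency is smaller than a predetermined minimum inverse document frequency.
8. The method of claim 1 wherein the identifying step identifies hypernyms of the selected words that have a coherent set of hyponyms with respect to the selected words.
9. A computer-readable medium whose contents cause a computer system to generate information retrieval tokens from an input string by performing the steps of:
creating from the input string a primary logical form characterizing a semantic relationship between selected words in the input string;
identifying hypernyms of the selected words in the input string;
constructing from the primary logical form one or more alternative logical forms, each alternative logical form being constructed by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with a hypernym identified for that selected word; and
generating tokens representing both the primary logical form and the alternative logical forms, the generated tokens being distinguishable by an information retrieval engine.
10. The computer-readable medium of claim 9 wherein the constructing step includes the step of parsing the input string to discern its syntactic and semantic structure.
11. The computer-readable medium of claim 9 wherein the identifying step includes the steps of:
for each selected word in the input string:
retrieving from a lexical knowledge base one or more hypernyms of the selected word, each hypernym having a similarity value characterizing its similarity in meaning to the selected word; and
identifying all hypernyms whose similarity values exceed a preestablished threshold.
12. The computer-readable medium of claim 9 wherein the contents of the computer-readable medium further cause the computer system to perform the steps of:
before the constructing step, selecting the input string from a search query; and
submitting the generated tokens to a query engine for comparison with a representation of one or more target documents.
13. The computer-readable medium of claim 9 wherein the contents of the computer-readable medium further cause the computer system to perform the steps of:
before the constructing step, selecting the input string from a body of text to be indexed; and
submitting the generated tokens to an indexing subsystem for storage in an index representing the body of text.
14. A computer memory containing a document index data structure characterizing the contents of one or more target documents, the document index data structure mapping words to locations in the target documents, the document index data structure, for each of a plurality of passages of words occurring in the target documents, mapping each word contained in a logical form generated from the passage to a location corresponding to the passage, and mapping hypernyms of each word contained in the logical form generated from the passage to the location corresponding to the passage, such that the document index data structure may be used, in response to receiving a query, to identify the locations of passages in the target documents that are semantically similar to a passage of the query.
15. The computer memory of claim 14 wherein the document index data structure maps at least one word that does not occur in any of the target documents to a location in the target documents.
16. A computer system for responding to queries containing a passage of words against one or more target documents, each target document containing one or more passages of words, each target document passage having a location within the target documents, the computer system comprising:
a target document receiver for receiving the target documents;
a query receiver for receiving queries against the target documents;
a tokenizer for generating tokens from the passages of the target documents received by the target document receiver and from the queries received by the query receiver, the tokenizer including a logical form synthesizer for synthesizing from each passage a logical form characterizing the semantic structure of the passage, the tokenizer generating tokens representing the logical forms synthesized from the passages;
an index memory for storing relations that map each token generated from a target document passage to the location in the target documents of the target document passage from which the token was generated; and
a query processing subsystem for, for each query, identifying in the index memory a token matching a token generated from the query, and for returning an indication of the location to which the identified token is mapped.
17. The computer system of claim 16 wherein the logical forms synthesized by the logical form synthesizer each contain a number of words, and wherein the tokenizer further comprises:
a hypernym expansion subsystem for creating, from the logical forms produced by the logical form synthesizer, one or more additional logical forms in which one or more words of the logical form are replaced with hypernyms, the tokenizer also generating tokens representing the additional logical forms created by the hypernym expansion subsystem.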
CN98804175A 1997-03-07 1998-02-11 Information retrieval utilizing semantic representation of text Pending CN1252876A (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/886,814 1997-03-07
US08/886,814 US6076051A (en) 1997-03-07 1997-03-07 Information retrieval utilizing semantic representation of text

Publications (1)

Publication Number Publication Date
CN1252876A (zh) 2000-05-10

Family

ID=25389830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN98804175A Pending CN1252876A (zh) 1997-03-07 1998-02-11 Information retrieval utilizing semantic representation of text

Country Status (5)

Country Link
US (5) US6076051A (zh)
EP (1) EP0965089B1 (zh)
JP (1) JP4282769B2 (zh)
CN (1) CN1252876A (zh)
WO (1) WO1998039714A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1333362C (zh) * 2001-03-26 2007-08-22 美国网上搜索公司 用于智能数据同化的方法和装置
US7630879B2 (en) 2002-09-13 2009-12-08 Fuji Xerox Co., Ltd. Text sentence comparing apparatus
US8065307B2 (en) 2006-12-20 2011-11-22 Microsoft Corporation Parsing, analysis and scoring of document content
CN101508188B (zh) * 2009-03-24 2012-09-26 北京市城南橡塑技术研究所 抗冲击复合衬板
CN105512291A (zh) * 2006-02-28 2016-04-20 贝宝公司 用于扩展数据库搜索查询的方法和系统
CN106598722A (zh) * 2015-10-19 2017-04-26 上海引跑信息科技有限公司 一种在文本信息检索服务中支持分布式事务管理的方法
CN110088754A (zh) * 2016-10-26 2019-08-02 联邦科学和工业研究组织 立法到逻辑的自动编码器
CN114969262A (zh) * 2022-05-31 2022-08-30 云知声智能科技股份有限公司 文本处理方法、装置、存储介质及电子装置

Families Citing this family (588)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7089218B1 (en) 2004-01-06 2006-08-08 Neuric Technologies, Llc Method for inclusion of psychological temperament in an electronic emulation of the human brain
US8725493B2 (en) * 2004-01-06 2014-05-13 Neuric Llc Natural language parsing method to provide conceptual flow
US6076051A (en) * 1997-03-07 2000-06-13 Microsoft Corporation Information retrieval utilizing semantic representation of text
US5933822A (en) * 1997-07-22 1999-08-03 Microsoft Corporation Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision
US6243670B1 (en) * 1998-09-02 2001-06-05 Nippon Telegraph And Telephone Corporation Method, apparatus, and computer readable medium for performing semantic analysis and generating a semantic structure having linked frames
US6167370A (en) * 1998-09-09 2000-12-26 Invention Machine Corporation Document semantic analysis/selection with knowledge creativity capability utilizing subject-action-object (SAO) structures
GB9821969D0 (en) * 1998-10-08 1998-12-02 Canon Kk Apparatus and method for processing natural language
US6964011B1 (en) * 1998-11-26 2005-11-08 Canon Kabushiki Kaisha Document type definition generating method and apparatus, and storage medium for storing program
US6233547B1 (en) * 1998-12-08 2001-05-15 Eastman Kodak Company Computer program product for retrieving multi-media objects using a natural language having a pronoun
WO2000034845A2 (en) * 1998-12-08 2000-06-15 Mediadna, Inc. A system and method of obfuscating data
US6993580B2 (en) * 1999-01-25 2006-01-31 Airclic Inc. Method and system for sharing end user information on network
GB9904662D0 (en) * 1999-03-01 1999-04-21 Canon Kk Natural language search method and apparatus
CA2272739C (en) * 1999-05-25 2003-10-07 Suhayya Abu-Hakima Apparatus and method for interpreting and intelligently managing electronic messages
US6901402B1 (en) * 1999-06-18 2005-05-31 Microsoft Corporation System for improving the performance of information retrieval-type tasks by identifying the relations of constituents
US20060116865A1 (en) 1999-09-17 2006-06-01 Www.Uniscape.Com E-services translation utilizing machine translation and translation memory
US6816857B1 (en) 1999-11-01 2004-11-09 Applied Semantics, Inc. Meaning-based advertising and document relevance determination
US9076448B2 (en) 1999-11-12 2015-07-07 Nuance Communications, Inc. Distributed real time speech recognition system
US7725307B2 (en) 1999-11-12 2010-05-25 Phoenix Solutions, Inc. Query engine for processing voice based queries including semantic decoding
US7392185B2 (en) 1999-11-12 2008-06-24 Phoenix Solutions, Inc. Speech based learning/training system using semantic decoding
US7050977B1 (en) 1999-11-12 2006-05-23 Phoenix Solutions, Inc. Speech-enabled server for internet website and method
US8793160B2 (en) 1999-12-07 2014-07-29 Steve Sorem System and method for processing transactions
US6823492B1 (en) * 2000-01-06 2004-11-23 Sun Microsystems, Inc. Method and apparatus for creating an index for a structured document based on a stylesheet
US6751621B1 (en) 2000-01-27 2004-06-15 Manning & Napier Information Services, Llc. Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors
GB0006159D0 (en) * 2000-03-14 2000-05-03 Ncr Int Inc Predicting future behaviour of an individual
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
AU4869601A (en) * 2000-03-20 2001-10-03 Robert J. Freeman Natural-language processing system using a large corpus
US7428500B1 (en) 2000-03-30 2008-09-23 Amazon. Com, Inc. Automatically identifying similar purchasing opportunities
US7120574B2 (en) * 2000-04-03 2006-10-10 Invention Machine Corporation Synonym extension of search queries with validation
US20010039490A1 (en) * 2000-04-03 2001-11-08 Mikhail Verbitsky System and method of analyzing and comparing entity documents
US20020010574A1 (en) * 2000-04-20 2002-01-24 Valery Tsourikov Natural language processing and query driven information retrieval
US7962326B2 (en) * 2000-04-20 2011-06-14 Invention Machine Corporation Semantic answering system and method
US7912868B2 (en) * 2000-05-02 2011-03-22 Textwise Llc Advertisement placement method and system using semantic analysis
AU2001271397A1 (en) * 2000-06-23 2002-01-08 Decis E-Direct, Inc. Component models
US6675159B1 (en) * 2000-07-27 2004-01-06 Science Applic Int Corp Concept-based search and retrieval system
US8200485B1 (en) * 2000-08-29 2012-06-12 A9.Com, Inc. Voice interface and methods for improving recognition accuracy of voice search queries
US7328211B2 (en) * 2000-09-21 2008-02-05 Jpmorgan Chase Bank, N.A. System and methods for improved linguistic pattern matching
US7085708B2 (en) 2000-09-23 2006-08-01 Ravenflow, Inc. Computer system with natural language to machine language translator
US20020143524A1 (en) * 2000-09-29 2002-10-03 Lingomotors, Inc. Method and resulting system for integrating a query reformation module onto an information retrieval system
AU2000276396A1 (en) * 2000-09-30 2002-04-15 Intel Corporation (A Corporation Of Delaware) Method and system for building a domain specific statistical language model fromrule-based grammar specifications
US7027974B1 (en) 2000-10-27 2006-04-11 Science Applications International Corporation Ontology-based parser for natural language processing
US7146349B2 (en) * 2000-11-06 2006-12-05 International Business Machines Corporation Network for describing multimedia information
US6978419B1 (en) * 2000-11-15 2005-12-20 Justsystem Corporation Method and apparatus for efficient identification of duplicate and near-duplicate documents and text spans using high-discriminability text fragments
US20020091671A1 (en) * 2000-11-23 2002-07-11 Andreas Prokoph Method and system for data retrieval in large collections of data
US7013308B1 (en) 2000-11-28 2006-03-14 Semscript Ltd. Knowledge storage and retrieval system and method
US20030028564A1 (en) * 2000-12-19 2003-02-06 Lingomotors, Inc. Natural language method and system for matching and ranking documents in terms of semantic relatedness
WO2002054279A1 (en) * 2001-01-04 2002-07-11 Agency For Science, Technology And Research Improved method of text similarity measurement
US7904595B2 (en) 2001-01-18 2011-03-08 Sdl International America Incorporated Globalization management system and method therefor
US6766316B2 (en) 2001-01-18 2004-07-20 Science Applications International Corporation Method and system of ranking and clustering for document indexing and retrieval
US20020133392A1 (en) * 2001-02-22 2002-09-19 Angel Mark A. Distributed customer relationship management systems and methods
US6697793B2 (en) 2001-03-02 2004-02-24 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration System, method and apparatus for generating phrases from a database
US6741981B2 (en) 2001-03-02 2004-05-25 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration (Nasa) System, method and apparatus for conducting a phrase search
US6721728B2 (en) 2001-03-02 2004-04-13 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration System, method and apparatus for discovering phrases in a database
US6823333B2 (en) 2001-03-02 2004-11-23 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration System, method and apparatus for conducting a keyterm search
US6813616B2 (en) * 2001-03-07 2004-11-02 International Business Machines Corporation System and method for building a semantic network capable of identifying word patterns in text
US7426505B2 (en) * 2001-03-07 2008-09-16 International Business Machines Corporation Method for identifying word patterns in text
US7194454B2 (en) * 2001-03-12 2007-03-20 Lucent Technologies Method for organizing records of database search activity by topical relevance
US7860706B2 (en) 2001-03-16 2010-12-28 Eli Abir Knowledge system method and appparatus
US8874431B2 (en) * 2001-03-16 2014-10-28 Meaningful Machines Llc Knowledge system method and apparatus
US8744835B2 (en) * 2001-03-16 2014-06-03 Meaningful Machines Llc Content conversion method and apparatus
US7146308B2 (en) * 2001-04-05 2006-12-05 Dekang Lin Discovery of inference rules from text
US6904428B2 (en) 2001-04-18 2005-06-07 Illinois Institute Of Technology Intranet mediator
GB2375859B (en) * 2001-04-27 2003-04-16 Premier Systems Technology Ltd Search Engine Systems
US6829605B2 (en) * 2001-05-24 2004-12-07 Microsoft Corporation Method and apparatus for deriving logical relations from linguistic relations with multiple relevance ranking strategies for information retrieval
SG103289A1 (en) * 2001-05-25 2004-04-29 Meng Soon Cheo System for indexing textual and non-textual files
US7050964B2 (en) * 2001-06-01 2006-05-23 Microsoft Corporation Scaleable machine translation system
US7734459B2 (en) 2001-06-01 2010-06-08 Microsoft Corporation Automatic extraction of transfer mappings from bilingual corpora
US7003444B2 (en) * 2001-07-12 2006-02-21 Microsoft Corporation Method and apparatus for improved grammar checking using a stochastic parser
US9009590B2 (en) * 2001-07-31 2015-04-14 Invention Machines Corporation Semantic processor for recognition of cause-effect relations in natural language documents
US7251781B2 (en) * 2001-07-31 2007-07-31 Invention Machine Corporation Computer based summarization of natural language documents
US8799776B2 (en) * 2001-07-31 2014-08-05 Invention Machine Corporation Semantic processor for recognition of whole-part relations in natural language documents
US7284191B2 (en) * 2001-08-13 2007-10-16 Xerox Corporation Meta-document management system with document identifiers
US8020754B2 (en) 2001-08-13 2011-09-20 Jpmorgan Chase Bank, N.A. System and method for funding a collective account by use of an electronic tag
US7133862B2 (en) 2001-08-13 2006-11-07 Xerox Corporation System with user directed enrichment and import/export control
US6609124B2 (en) 2001-08-13 2003-08-19 International Business Machines Corporation Hub for strategic intelligence
US7526425B2 (en) 2001-08-14 2009-04-28 Evri Inc. Method and system for extending keyword searching to syntactically and semantically annotated data
US7024351B2 (en) * 2001-08-21 2006-04-04 Microsoft Corporation Method and apparatus for robust efficient parsing
US7047183B2 (en) * 2001-08-21 2006-05-16 Microsoft Corporation Method and apparatus for using wildcards in semantic parsing
US7403938B2 (en) * 2001-09-24 2008-07-22 Iac Search & Media, Inc. Natural language query processing
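JP4065936B2 (ja) * 2001-10-09 2008-03-26 独立行政法人情報通信研究機構 Language analysis processing system using a machine learning method, and language omission analysis processing system using a machine learning method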
ITFI20010199A1 (it) 2001-10-22 2003-04-22 Riccardo Vieri System and method for converting text communications into voice and sending them over an internet connection to any telephone apparatus
US7194464B2 (en) 2001-12-07 2007-03-20 Websense, Inc. System and method for adapting an internet filter
US7231343B1 (en) * 2001-12-20 2007-06-12 Ianywhere Solutions, Inc. Synonyms mechanism for natural language systems
US20030172368A1 (en) * 2001-12-26 2003-09-11 Elizabeth Alumbaugh System and method for autonomously generating heterogeneous data source interoperability bridges based on semantic modeling derived from self adapting ontology
US7137062B2 (en) * 2001-12-28 2006-11-14 International Business Machines Corporation System and method for hierarchical segmentation with latent semantic indexing in scale space
US7177799B2 (en) * 2002-01-14 2007-02-13 Microsoft Corporation Semantic analysis system for interpreting linguistic structures output by a natural language linguistic analysis system
US7295966B2 (en) * 2002-01-14 2007-11-13 Microsoft Corporation System for normalizing a discourse representation structure and normalized data structure
US7225183B2 (en) * 2002-01-28 2007-05-29 Ipxl, Inc. Ontology-based information management system and method
FR2835334A1 (fr) * 2002-01-31 2003-08-01 France Telecom System and methods for indexing and searching with query expansion, indexing and search engines
US7031969B2 (en) * 2002-02-20 2006-04-18 Lawrence Technologies, Llc System and method for identifying relationships between database records
US8380491B2 (en) * 2002-04-19 2013-02-19 Educational Testing Service System for rating constructed responses based on concepts and a model answer
US20040039562A1 (en) * 2002-06-17 2004-02-26 Kenneth Haase Para-linguistic expansion
WO2003107223A1 (en) * 2002-06-17 2003-12-24 Beingmeta, Inc. Systems and methods for processing queries
US7493253B1 (en) 2002-07-12 2009-02-17 Language And Computing, Inc. Conceptual world representation natural language understanding system and method
US20040034541A1 (en) * 2002-08-16 2004-02-19 Alipio Caban Client devices, processor-usable media, data signals embodied in a transmission medium and processor implemented methods
JP2004139553A (ja) * 2002-08-19 2004-05-13 Matsushita Electric Ind Co Ltd Document retrieval system and question answering system
US7136807B2 (en) * 2002-08-26 2006-11-14 International Business Machines Corporation Inferencing using disambiguated natural language rules
JP4038717B2 (ja) * 2002-09-13 2008-01-30 富士ゼロックス株式会社 Text sentence comparison device
US7567902B2 (en) * 2002-09-18 2009-07-28 Nuance Communications, Inc. Generating speech recognition grammars from a large corpus of data
US7194455B2 (en) * 2002-09-19 2007-03-20 Microsoft Corporation Method and system for retrieving confirming sentences
US7171351B2 (en) * 2002-09-19 2007-01-30 Microsoft Corporation Method and system for retrieving hint sentences using expanded queries
US7293015B2 (en) * 2002-09-19 2007-11-06 Microsoft Corporation Method and system for detecting user intentions in retrieval of hint sentences
US20040122736A1 (en) 2002-10-11 2004-06-24 Bank One, Delaware, N.A. System and method for granting promotional rewards to credit account holders
WO2004044888A1 (de) * 2002-11-13 2004-05-27 Schoenebeck Bernd Language-processing system, method for assigning acoustic and/or written character strings to words or lexical entries
US20040098250A1 (en) * 2002-11-19 2004-05-20 Gur Kimchi Semantic search system and method
JP2006508448A (ja) * 2002-11-28 2006-03-09 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Method for assigning word class information
US8155946B2 (en) * 2002-12-23 2012-04-10 Definiens Ag Computerized method and system for searching for text passages in text documents
WO2004077217A2 (en) * 2003-01-30 2004-09-10 Vaman Technologies (R & D) Limited System and method of object query analysis, optimization and execution irrespective of server functionality
US7343280B2 (en) * 2003-07-01 2008-03-11 Microsoft Corporation Processing noisy data and determining word similarity
US20050060140A1 (en) * 2003-09-15 2005-03-17 Maddox Paul Christopher Using semantic feature structures for document comparisons
US7593845B2 (en) * 2003-10-06 2009-09-22 Microsoft Corporation Method and apparatus for identifying semantic structures from text
CA2542438A1 (en) * 2003-10-21 2005-04-28 Intellectual Property Bank Corp. Document characteristic analysis device for document to be surveyed
US7584092B2 (en) * 2004-11-15 2009-09-01 Microsoft Corporation Unsupervised learning of paraphrase/translation alternations and selective application thereof
US7412385B2 (en) * 2003-11-12 2008-08-12 Microsoft Corporation System for identifying paraphrases using machine translation
CN1629833A (zh) * 2003-12-17 2005-06-22 国际商业机器公司 Method and apparatus for implementing question-and-answer functionality and computer-aided writing
US7359851B2 (en) * 2004-01-14 2008-04-15 Clairvoyance Corporation Method of identifying the language of a textual passage using short word and/or n-gram comparisons
JP2005267607A (ja) * 2004-02-20 2005-09-29 Fuji Photo Film Co Ltd Digital picture book system, picture book search method, and picture book search program
US7983896B2 (en) 2004-03-05 2011-07-19 SDL Language Technology In-context exact (ICE) matching
GB0407389D0 (en) * 2004-03-31 2004-05-05 British Telecomm Information retrieval
US20050256700A1 (en) * 2004-05-11 2005-11-17 Moldovan Dan I Natural language question answering system and method utilizing a logic prover
US7424485B2 (en) * 2004-06-03 2008-09-09 Microsoft Corporation Method and apparatus for generating user interfaces based upon automation with full flexibility
US7363578B2 (en) * 2004-06-03 2008-04-22 Microsoft Corporation Method and apparatus for mapping a data model to a user interface model
US7665014B2 (en) * 2004-06-03 2010-02-16 Microsoft Corporation Method and apparatus for generating forms using form types
US20060009966A1 (en) * 2004-07-12 2006-01-12 International Business Machines Corporation Method and system for extracting information from unstructured text using symbolic machine learning
US20060026522A1 (en) * 2004-07-27 2006-02-02 Microsoft Corporation Method and apparatus for revising data models and maps by example
US7685118B2 (en) * 2004-08-12 2010-03-23 Iwint International Holdings Inc. Method using ontology and user query processing to solve inventor problems and user problems
US8407239B2 (en) 2004-08-13 2013-03-26 Google Inc. Multi-stage query processing system and method for use with tokenspace repository
US7917480B2 (en) 2004-08-13 2011-03-29 Google Inc. Document compression system and method for use with tokenspace repository
US20060047691A1 (en) * 2004-08-31 2006-03-02 Microsoft Corporation Creating a document index from a flex- and Yacc-generated named entity recognizer
US20060047500A1 (en) * 2004-08-31 2006-03-02 Microsoft Corporation Named entity recognition using compiler methods
US20060047690A1 (en) * 2004-08-31 2006-03-02 Microsoft Corporation Integration of Flex and Yacc into a linguistic services platform for named entity recognition
CN100361126C (zh) * 2004-09-24 2008-01-09 北京亿维讯科技有限公司 Method for solving problems using ontology and user query processing techniques
US7657519B2 (en) * 2004-09-30 2010-02-02 Microsoft Corporation Forming intent-based clusters and employing same by search
US7996208B2 (en) 2004-09-30 2011-08-09 Google Inc. Methods and systems for selecting a language for text segmentation
US7680648B2 (en) * 2004-09-30 2010-03-16 Google Inc. Methods and systems for improving text segmentation
US8051096B1 (en) 2004-09-30 2011-11-01 Google Inc. Methods and systems for augmenting a token lexicon
US20060074632A1 (en) * 2004-09-30 2006-04-06 Nanavati Amit A Ontology-based term disambiguation
US7546235B2 (en) * 2004-11-15 2009-06-09 Microsoft Corporation Unsupervised learning of paraphrase/translation alternations and selective application thereof
US7552046B2 (en) * 2004-11-15 2009-06-23 Microsoft Corporation Unsupervised learning of paraphrase/translation alternations and selective application thereof
US20060122834A1 (en) * 2004-12-03 2006-06-08 Bennett Ian M Emotion detection device & method for use in distributed systems
US8843536B1 (en) 2004-12-31 2014-09-23 Google Inc. Methods and systems for providing relevant advertisements or other content for inactive uniform resource locators using search queries
US8473449B2 (en) * 2005-01-06 2013-06-25 Neuric Technologies, Llc Process of dialogue and discussion
US7869989B1 (en) * 2005-01-28 2011-01-11 Artificial Cognition Inc. Methods and apparatus for understanding machine vocabulary
EP1851616A2 (en) * 2005-01-31 2007-11-07 Musgrove Technology Enterprises, LLC System and method for generating an interlinked taxonomy structure
EP1846815A2 (en) * 2005-01-31 2007-10-24 Textdigger, Inc. Method and system for semantic search and retrieval of electronic documents
US20060200464A1 (en) * 2005-03-03 2006-09-07 Microsoft Corporation Method and system for generating a document summary
US20060200337A1 (en) * 2005-03-04 2006-09-07 Microsoft Corporation System and method for template authoring and a template data structure
US20060200338A1 (en) * 2005-03-04 2006-09-07 Microsoft Corporation Method and system for creating a lexicon
US20060200336A1 (en) * 2005-03-04 2006-09-07 Microsoft Corporation Creating a lexicon using automatic template matching
US7937396B1 (en) 2005-03-23 2011-05-03 Google Inc. Methods and systems for identifying paraphrases from an index of information items and associated sentence fragments
US9400838B2 (en) * 2005-04-11 2016-07-26 Textdigger, Inc. System and method for searching for a query
US8032823B2 (en) * 2005-04-15 2011-10-04 Carnegie Mellon University Intent-based information processing and updates
US7672908B2 (en) * 2005-04-15 2010-03-02 Carnegie Mellon University Intent-based information processing and updates in association with a service agent
FR2885712B1 (fr) * 2005-05-12 2007-07-13 Kabire Fidaali Device and method for semantic analysis of documents by construction of n-ary and semantic trees
CN101366024B (zh) 2005-05-16 2014-07-30 电子湾有限公司 Method and system for processing data search requests
US7401731B1 (en) 2005-05-27 2008-07-22 Jpmorgan Chase Bank, Na Method and system for implementing a card product with multiple customized relationships
GB0512744D0 (en) * 2005-06-22 2005-07-27 Blackspider Technologies Method and system for filtering electronic messages
US7689411B2 (en) 2005-07-01 2010-03-30 Xerox Corporation Concept matching
US7809551B2 (en) * 2005-07-01 2010-10-05 Xerox Corporation Concept matching system
CA2545237A1 (en) * 2005-07-29 2007-01-29 Cognos Incorporated Method and system for managing exemplar terms database for business-oriented metadata content
CA2545232A1 (en) * 2005-07-29 2007-01-29 Cognos Incorporated Method and system for creating a taxonomy from business-oriented metadata content
US8666928B2 (en) 2005-08-01 2014-03-04 Evi Technologies Limited Knowledge repository
JP4639124B2 (ja) * 2005-08-23 2011-02-23 キヤノン株式会社 Character input assistance method and information processing apparatus
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US20070073533A1 (en) * 2005-09-23 2007-03-29 Fuji Xerox Co., Ltd. Systems and methods for structural indexing of natural language text
US7475072B1 (en) 2005-09-26 2009-01-06 Quintura, Inc. Context-based search visualization and context management using neural networks
US7937265B1 (en) * 2005-09-27 2011-05-03 Google Inc. Paraphrase acquisition
WO2007038713A2 (en) * 2005-09-28 2007-04-05 Epacris Inc. Search engine determining results based on probabilistic scoring of relevance
US7908132B2 (en) * 2005-09-29 2011-03-15 Microsoft Corporation Writing assistance using machine translation techniques
US7949444B2 (en) * 2005-10-07 2011-05-24 Honeywell International Inc. Aviation field service report natural language processing
US9886478B2 (en) 2005-10-07 2018-02-06 Honeywell International Inc. Aviation field service report natural language processing
US8036876B2 (en) * 2005-11-04 2011-10-11 Battelle Memorial Institute Methods of defining ontologies, word disambiguation methods, computer systems, and articles of manufacture
US10319252B2 (en) 2005-11-09 2019-06-11 Sdl Inc. Language capability assessment and training apparatus and techniques
EP1949273A1 (en) * 2005-11-16 2008-07-30 Evri Inc. Extending keyword searching to syntactically and semantically annotated data
US7765212B2 (en) * 2005-12-29 2010-07-27 Microsoft Corporation Automatic organization of documents through email clustering
US8694530B2 (en) 2006-01-03 2014-04-08 Textdigger, Inc. Search system with query refinement and search method
US20070162481A1 (en) * 2006-01-10 2007-07-12 Millett Ronald P Pattern index
FR2896603B1 (fr) * 2006-01-20 2008-05-02 Thales Sa Method and device for extracting information from a text document and transforming it into qualitative data
US7599861B2 (en) 2006-03-02 2009-10-06 Convergys Customer Management Group, Inc. System and method for closed loop decisionmaking in an automated care system
US8266152B2 (en) * 2006-03-03 2012-09-11 Perfect Search Corporation Hashed indexing
EP1999565A4 (en) * 2006-03-03 2012-01-11 Perfect Search Corp Hyper space index
US8862573B2 (en) 2006-04-04 2014-10-14 Textdigger, Inc. Search system and method with text function tagging
US7991608B2 (en) * 2006-04-19 2011-08-02 Raytheon Company Multilingual data querying
AU2007248585A1 (en) * 2006-05-04 2007-11-15 Jpmorgan Chase Bank, N.A. System and method for restricted party screening and resolution services
US7809663B1 (en) 2006-05-22 2010-10-05 Convergys Cmg Utah, Inc. System and method for supporting the utilization of machine language
US8379830B1 (en) 2006-05-22 2013-02-19 Convergys Customer Management Delaware Llc System and method for automated customer service with contingent live interaction
US7493293B2 (en) * 2006-05-31 2009-02-17 International Business Machines Corporation System and method for extracting entities of interest from text using n-gram models
US20070288248A1 (en) * 2006-06-12 2007-12-13 Rami Rauch System and method for online service of web wide datasets forming, joining and mining
US8140267B2 (en) * 2006-06-30 2012-03-20 International Business Machines Corporation System and method for identifying similar molecules
US8615800B2 (en) 2006-07-10 2013-12-24 Websense, Inc. System and method for analyzing web content
US8020206B2 (en) 2006-07-10 2011-09-13 Websense, Inc. System and method of analyzing web content
US20080027971A1 (en) * 2006-07-28 2008-01-31 Craig Statchuk Method and system for populating an index corpus to a search engine
US8589869B2 (en) 2006-09-07 2013-11-19 Wolfram Alpha Llc Methods and systems for determining a formula
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
JP5076417B2 (ja) * 2006-09-15 2012-11-21 富士ゼロックス株式会社 Concept network generation system, concept network generation method, and concept network generation program
US7557167B2 (en) * 2006-09-28 2009-07-07 Gore Enterprise Holdings, Inc. Polyester compositions, methods of manufacturing said compositions, and articles made therefrom
US8146051B2 (en) * 2006-10-02 2012-03-27 International Business Machines Corporation Method and computer program product for providing a representation of software modeled by a model
US9098489B2 (en) * 2006-10-10 2015-08-04 Abbyy Infopoisk Llc Method and system for semantic searching
US9069750B2 (en) * 2006-10-10 2015-06-30 Abbyy Infopoisk Llc Method and system for semantic searching of natural language texts
US8892423B1 (en) 2006-10-10 2014-11-18 Abbyy Infopoisk Llc Method and system to automatically create content for dictionaries
US8145473B2 (en) 2006-10-10 2012-03-27 Abbyy Software Ltd. Deep model statistics method for machine translation
US9053090B2 (en) 2006-10-10 2015-06-09 Abbyy Infopoisk Llc Translating texts between languages
US9235573B2 (en) 2006-10-10 2016-01-12 Abbyy Infopoisk Llc Universal difference measure
US9892111B2 (en) 2006-10-10 2018-02-13 Abbyy Production Llc Method and device to estimate similarity between documents having multiple segments
US9633005B2 (en) 2006-10-10 2017-04-25 Abbyy Infopoisk Llc Exhaustive automatic processing of textual information
US9984071B2 (en) 2006-10-10 2018-05-29 Abbyy Production Llc Language ambiguity detection of text
US8195447B2 (en) 2006-10-10 2012-06-05 Abbyy Software Ltd. Translating sentences between languages using language-independent semantic structures and ratings of syntactic constructions
US9495358B2 (en) 2006-10-10 2016-11-15 Abbyy Infopoisk Llc Cross-language text clustering
US8214199B2 (en) * 2006-10-10 2012-07-03 Abbyy Software, Ltd. Systems for translating sentences between languages using language-independent semantic structures and ratings of syntactic constructions
US9075864B2 (en) * 2006-10-10 2015-07-07 Abbyy Infopoisk Llc Method and system for semantic searching using syntactic and semantic analysis
US8548795B2 (en) * 2006-10-10 2013-10-01 Abbyy Software Ltd. Method for translating documents from one language into another using a database of translations, a terminology dictionary, a translation dictionary, and a machine translation system
US9047275B2 (en) 2006-10-10 2015-06-02 Abbyy Infopoisk Llc Methods and systems for alignment of parallel text corpora
US9588958B2 (en) 2006-10-10 2017-03-07 Abbyy Infopoisk Llc Cross-language text classification
US20080086298A1 (en) * 2006-10-10 2008-04-10 Anisimovich Konstantin Method and system for translating sentences between languages
US9471562B2 (en) 2006-10-10 2016-10-18 Abbyy Infopoisk Llc Method and system for analyzing and translating various languages with use of semantic hierarchy
US9645993B2 (en) 2006-10-10 2017-05-09 Abbyy Infopoisk Llc Method and system for semantic searching
US9110975B1 (en) * 2006-11-02 2015-08-18 Google Inc. Search result inputs using variant generalized queries
US8661029B1 (en) 2006-11-02 2014-02-25 Google Inc. Modifying search result ranking based on implicit user feedback
US9208174B1 (en) * 2006-11-20 2015-12-08 Disney Enterprises, Inc. Non-language-based object search
US9654495B2 (en) * 2006-12-01 2017-05-16 Websense, Llc System and method of analyzing web addresses
GB2458094A (en) 2007-01-09 2009-09-09 Surfcontrol On Demand Ltd URL interception and categorization in firewalls
US7437370B1 (en) * 2007-02-19 2008-10-14 Quintura, Inc. Search engine graphical interface using maps and images
EP2135231A4 (en) * 2007-03-01 2014-10-15 Adapx Inc System and method for dynamic learning
US8180633B2 (en) * 2007-03-08 2012-05-15 Nec Laboratories America, Inc. Fast semantic extraction using a neural network architecture
WO2008113045A1 (en) 2007-03-14 2008-09-18 Evri Inc. Query templates and labeled search tip system, methods, and techniques
US8959011B2 (en) 2007-03-22 2015-02-17 Abbyy Infopoisk Llc Indicating and correcting errors in machine translation systems
US9031947B2 (en) * 2007-03-27 2015-05-12 Invention Machine Corporation System and method for model element identification
US7873640B2 (en) * 2007-03-27 2011-01-18 Adobe Systems Incorporated Semantic analysis documents to rank terms
US7720783B2 (en) * 2007-03-28 2010-05-18 Palo Alto Research Center Incorporated Method and system for detecting undesired inferences from documents
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US9092510B1 (en) 2007-04-30 2015-07-28 Google Inc. Modifying search result ranking based on a temporal element of user feedback
GB0709527D0 (en) 2007-05-18 2007-06-27 Surfcontrol Plc Electronic messaging system, message processing apparatus and message processing method
US7792826B2 (en) * 2007-05-29 2010-09-07 International Business Machines Corporation Method and system for providing ranked search results
US8812296B2 (en) 2007-06-27 2014-08-19 Abbyy Infopoisk Llc Method and system for natural language dictionary generation
US8037086B1 (en) * 2007-07-10 2011-10-11 Google Inc. Identifying common co-occurring elements in lists
US8260619B1 (en) 2008-08-22 2012-09-04 Convergys Cmg Utah, Inc. Method and system for creating natural language understanding grammars
US7912840B2 (en) * 2007-08-30 2011-03-22 Perfect Search Corporation Indexing and filtering using composite data stores
US8280721B2 (en) 2007-08-31 2012-10-02 Microsoft Corporation Efficiently representing word sense probabilities
US8868562B2 (en) 2007-08-31 2014-10-21 Microsoft Corporation Identification of semantic relationships within reported speech
EP2183686A4 (en) * 2007-08-31 2018-03-28 Zhigu Holdings Limited Identification of semantic relationships within reported speech
CN101796510A (zh) * 2007-08-31 2010-08-04 微软公司 Indexing role hierarchies for words in a search index
US8316036B2 (en) * 2007-08-31 2012-11-20 Microsoft Corporation Checkpointing iterators during search
US8712758B2 (en) 2007-08-31 2014-04-29 Microsoft Corporation Coreference resolution in an ambiguity-sensitive natural language processing system
US8209321B2 (en) * 2007-08-31 2012-06-26 Microsoft Corporation Emphasizing search results according to conceptual meaning
US20090070322A1 (en) * 2007-08-31 2009-03-12 Powerset, Inc. Browsing knowledge on the basis of semantic relations
US8229970B2 (en) * 2007-08-31 2012-07-24 Microsoft Corporation Efficient storage and retrieval of posting lists
US8463593B2 (en) * 2007-08-31 2013-06-11 Microsoft Corporation Natural language hypernym weighting for word sense disambiguation
US8229730B2 (en) * 2007-08-31 2012-07-24 Microsoft Corporation Indexing role hierarchies for words in a search index
US8346756B2 (en) * 2007-08-31 2013-01-01 Microsoft Corporation Calculating valence of expressions within documents for searching a document index
US9053089B2 (en) 2007-10-02 2015-06-09 Apple Inc. Part-of-speech tagging using latent analogy
US8165886B1 (en) 2007-10-04 2012-04-24 Great Northern Research LLC Speech interface system and method for control and interaction with applications on a computing system
US8838659B2 (en) * 2007-10-04 2014-09-16 Amazon Technologies, Inc. Enhanced knowledge repository
US8595642B1 (en) 2007-10-04 2013-11-26 Great Northern Research, LLC Multiple shell multi faceted graphical user interface
US8909655B1 (en) 2007-10-11 2014-12-09 Google Inc. Time based ranking
US8594996B2 (en) 2007-10-17 2013-11-26 Evri Inc. NLP-based entity recognition and disambiguation
EP2212772A4 (en) * 2007-10-17 2017-04-05 VCVC III LLC NLP-based content recommender
WO2009059297A1 (en) * 2007-11-01 2009-05-07 Textdigger, Inc. Method and apparatus for automated tag generation for digital content
US20090119090A1 (en) * 2007-11-01 2009-05-07 Microsoft Corporation Principled Approach to Paraphrasing
US8725756B1 (en) 2007-11-12 2014-05-13 Google Inc. Session-based query suggestions
US7860885B2 (en) * 2007-12-05 2010-12-28 Palo Alto Research Center Incorporated Inbound content filtering via automated inference detection
US10002189B2 (en) * 2007-12-20 2018-06-19 Apple Inc. Method and apparatus for searching using an active ontology
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US8504361B2 (en) * 2008-02-07 2013-08-06 Nec Laboratories America, Inc. Deep neural networks and methods for using same
US8392436B2 (en) * 2008-02-07 2013-03-05 Nec Laboratories America, Inc. Semantic search via role labeling
US10269024B2 (en) * 2008-02-08 2019-04-23 Outbrain Inc. Systems and methods for identifying and measuring trends in consumer content demand within vertically associated websites and related content
US8065143B2 (en) 2008-02-22 2011-11-22 Apple Inc. Providing text input using speech data and non-speech data
US8180754B1 (en) * 2008-04-01 2012-05-15 Dranias Development Llc Semantic neural network for aggregating query searches
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
US8061142B2 (en) * 2008-04-11 2011-11-22 General Electric Company Mixer for a combustor
US8706477B1 (en) 2008-04-25 2014-04-22 Softwin Srl Romania Systems and methods for lexical correspondence linguistic knowledge base creation comprising dependency trees with procedural nodes denoting execute code
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US8682660B1 (en) * 2008-05-21 2014-03-25 Resolvity, Inc. Method and system for post-processing speech recognition results
US8464150B2 (en) 2008-06-07 2013-06-11 Apple Inc. Automatic language identification for dynamic text processing
US8219397B2 (en) * 2008-06-10 2012-07-10 Nuance Communications, Inc. Data processing system for autonomously building speech identification and tagging data
US8032495B2 (en) * 2008-06-20 2011-10-04 Perfect Search Corporation Index compression
AU2009267107A1 (en) 2008-06-30 2010-01-07 Websense, Inc. System and method for dynamic and real-time categorization of webpages
US20100030549A1 (en) 2008-07-31 2010-02-04 Lee Michael M Mobile device having human language translation capability with positional feedback
US9262409B2 (en) 2008-08-06 2016-02-16 Abbyy Infopoisk Llc Translation of a selected text fragment of a screen
US9317589B2 (en) * 2008-08-07 2016-04-19 International Business Machines Corporation Semantic search by means of word sense disambiguation using a lexicon
US8112269B2 (en) * 2008-08-25 2012-02-07 Microsoft Corporation Determining utility of a question
US8364663B2 (en) * 2008-09-05 2013-01-29 Microsoft Corporation Tokenized javascript indexing system
US8768702B2 (en) 2008-09-05 2014-07-01 Apple Inc. Multi-tiered voice feedback in an electronic device
JP2010066365A (ja) * 2008-09-09 2010-03-25 Toshiba Corp Speech recognition apparatus, method, and program
US8898568B2 (en) 2008-09-09 2014-11-25 Apple Inc. Audio user interface
US8712776B2 (en) 2008-09-29 2014-04-29 Apple Inc. Systems and methods for selective text to speech synthesis
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US8185509B2 (en) * 2008-10-15 2012-05-22 Sap France Association of semantic objects with linguistic entity categories
EP2361465B1 (en) * 2008-10-15 2012-08-29 Hewlett-Packard Development Company, L.P. Retrieving configuration records from a configuration management database
WO2010077714A2 (en) * 2008-12-09 2010-07-08 University Of Houston System Word sense disambiguation
WO2010067118A1 (en) 2008-12-11 2010-06-17 Novauris Technologies Limited Speech recognition involving a mobile device
US8862252B2 (en) 2009-01-30 2014-10-14 Apple Inc. Audio user interface for displayless electronic device
US9805089B2 (en) * 2009-02-10 2017-10-31 Amazon Technologies, Inc. Local business and product search system and method
US8380507B2 (en) 2009-03-09 2013-02-19 Apple Inc. Systems and methods for determining the language to use for speech generated by a text to speech engine
WO2010104970A1 (en) * 2009-03-10 2010-09-16 Ebrary, Inc. Method and apparatus for real time text analysis and text navigation
CN102439590A (zh) * 2009-03-13 2012-05-02 发明机器公司 System and method for automatic semantic labeling of natural language text
KR20110136843A (ko) * 2009-03-13 2011-12-21 인벤션 머신 코포레이션 System and method for knowledge search
US20110301941A1 (en) * 2009-03-20 2011-12-08 Syl Research Limited Natural language processing method and system
US20100250522A1 (en) * 2009-03-30 2010-09-30 Gm Global Technology Operations, Inc. Using ontology to order records by relevance
US20100268600A1 (en) * 2009-04-16 2010-10-21 Evri Inc. Enhanced advertisement targeting
US8601015B1 (en) 2009-05-15 2013-12-03 Wolfram Alpha Llc Dynamic example generation for queries
US8788524B1 (en) * 2009-05-15 2014-07-22 Wolfram Alpha Llc Method and system for responding to queries in an imprecise syntax
US20100299132A1 (en) * 2009-05-22 2010-11-25 Microsoft Corporation Mining phrase pairs from an unstructured resource
WO2010138466A1 (en) 2009-05-26 2010-12-02 Websense, Inc. Systems and methods for efficient detection of fingerprinted data and information
US20100306214A1 (en) * 2009-05-28 2010-12-02 Microsoft Corporation Identifying modifiers in web queries over structured data
US10540976B2 (en) 2009-06-05 2020-01-21 Apple Inc. Contextual voice commands
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10255566B2 (en) 2011-06-03 2019-04-09 Apple Inc. Generating and processing task items that represent tasks to perform
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US20130219333A1 (en) * 2009-06-12 2013-08-22 Adobe Systems Incorporated Extensible Framework for Facilitating Interaction with Devices
US8762131B1 (en) 2009-06-17 2014-06-24 Softwin Srl Romania Systems and methods for managing a complex lexicon comprising multiword expressions and multiword inflection templates
US8762130B1 (en) 2009-06-17 2014-06-24 Softwin Srl Romania Systems and methods for natural language processing including morphological analysis, lemmatizing, spell checking and grammar checking
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US20110015921A1 (en) * 2009-07-17 2011-01-20 Minerva Advisory Services, Llc System and method for using lingual hierarchy, connotation and weight of authority
US20110040604A1 (en) * 2009-08-13 2011-02-17 Vertical Acuity, Inc. Systems and Methods for Providing Targeted Content
US9396485B2 (en) * 2009-12-24 2016-07-19 Outbrain Inc. Systems and methods for presenting content
US20110044447A1 (en) * 2009-08-21 2011-02-24 Nexidia Inc. Trend discovery in audio signals
US10169599B2 (en) * 2009-08-26 2019-01-01 International Business Machines Corporation Data access control with flexible data disclosure
US8498974B1 (en) 2009-08-31 2013-07-30 Google Inc. Refining search results
US8560300B2 (en) * 2009-09-09 2013-10-15 International Business Machines Corporation Error correction using fact repositories
GB2487023A (en) * 2009-09-14 2012-07-04 Arun Jain Zolog intelligent human language interface for business software applications
US9224007B2 (en) * 2009-09-15 2015-12-29 International Business Machines Corporation Search engine with privacy protection
US8972391B1 (en) 2009-10-02 2015-03-03 Google Inc. Recent interest based relevance scoring
US8645372B2 (en) * 2009-10-30 2014-02-04 Evri, Inc. Keyword-based search engine results using enhanced query strategies
US8682649B2 (en) 2009-11-12 2014-03-25 Apple Inc. Sentiment prediction from textual data
US20110131033A1 (en) * 2009-12-02 2011-06-02 Tatu Ylonen Oy Ltd Weight-Ordered Enumeration of Referents and Cutting Off Lengthy Enumerations
US10713666B2 (en) 2009-12-24 2020-07-14 Outbrain Inc. Systems and methods for curating content
US10607235B2 (en) * 2009-12-24 2020-03-31 Outbrain Inc. Systems and methods for curating content
US20110161091A1 (en) * 2009-12-24 2011-06-30 Vertical Acuity, Inc. Systems and Methods for Connecting Entities Through Content
US20110197137A1 (en) * 2009-12-24 2011-08-11 Vertical Acuity, Inc. Systems and Methods for Rating Content
US9600134B2 (en) 2009-12-29 2017-03-21 International Business Machines Corporation Selecting portions of computer-accessible documents for post-selection processing
US8311838B2 (en) 2010-01-13 2012-11-13 Apple Inc. Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts
US8381107B2 (en) 2010-01-13 2013-02-19 Apple Inc. Adaptive audio feedback system and method
US9201905B1 (en) * 2010-01-14 2015-12-01 The Boeing Company Semantically mediated access to knowledge
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US8977584B2 (en) 2010-01-25 2015-03-10 Newvaluexchange Global Ai Llp Apparatuses, methods and systems for a digital conversation management platform
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
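JP5398007B2 (ja) * 2010-02-26 2014-01-29 National Institute of Information and Communications Technology Relation information expansion device, relation information expansion method, and program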
US9710556B2 (en) 2010-03-01 2017-07-18 Vcvc Iii Llc Content recommendation based on collections of entities
US10417646B2 (en) 2010-03-09 2019-09-17 Sdl Inc. Predicting the cost associated with translating textual content
US8676565B2 (en) 2010-03-26 2014-03-18 Virtuoz Sa Semantic clustering and conversational agents
US9378202B2 (en) * 2010-03-26 2016-06-28 Virtuoz Sa Semantic clustering
US8694304B2 (en) 2010-03-26 2014-04-08 Virtuoz Sa Semantic clustering and user interfaces
US8645125B2 (en) 2010-03-30 2014-02-04 Evri, Inc. NLP-based systems and methods for providing quotations
US9110882B2 (en) 2010-05-14 2015-08-18 Amazon Technologies, Inc. Extracting structured knowledge from unstructured text
US8484015B1 (en) 2010-05-14 2013-07-09 Wolfram Alpha Llc Entity pages
US9672204B2 (en) * 2010-05-28 2017-06-06 Palo Alto Research Center Incorporated System and method to acquire paraphrases
US9836460B2 (en) * 2010-06-11 2017-12-05 Lexisnexis, A Division Of Reed Elsevier Inc. Systems and methods for analyzing patent-related documents
WO2011160140A1 (en) 2010-06-18 2011-12-22 Susan Bennett System and method of semantic based searching
US9623119B1 (en) 2010-06-29 2017-04-18 Google Inc. Accentuating search results
US8713021B2 (en) 2010-07-07 2014-04-29 Apple Inc. Unsupervised document clustering using latent semantic density analysis
US8812298B1 (en) 2010-07-28 2014-08-19 Wolfram Alpha Llc Macro replacement of natural language input
US8838633B2 (en) 2010-08-11 2014-09-16 Vcvc Iii Llc NLP-based sentiment analysis
US8719006B2 (en) 2010-08-27 2014-05-06 Apple Inc. Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
JP5012981B2 (ja) * 2010-09-09 2012-08-29 Casio Computer Co., Ltd. Electronic dictionary device and program
US9405848B2 (en) 2010-09-15 2016-08-02 Vcvc Iii Llc Recommending mobile device activities
US8719014B2 (en) 2010-09-27 2014-05-06 Apple Inc. Electronic device with text error correction based on voice recognition data
US9524291B2 (en) 2010-10-06 2016-12-20 Virtuoz Sa Visual display of semantic information
US8725739B2 (en) 2010-11-01 2014-05-13 Evri, Inc. Category-based content recommendation
US9424351B2 (en) * 2010-11-22 2016-08-23 Microsoft Technology Licensing, Llc Hybrid-distribution model for search engine indexes
US9824091B2 (en) 2010-12-03 2017-11-21 Microsoft Technology Licensing, Llc File system backup using change journal
US8620894B2 (en) * 2010-12-21 2013-12-31 Microsoft Corporation Searching files
US10515147B2 (en) 2010-12-22 2019-12-24 Apple Inc. Using statistical language models for contextual lookup
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10235360B2 (en) 2010-12-23 2019-03-19 Koninklijke Philips N.V. Generation of pictorial reporting diagrams of lesions in anatomical structures
JP5237400B2 (ja) * 2011-01-21 2013-07-17 The Bank of Tokyo-Mitsubishi UFJ, Ltd. Search device
US10657540B2 (en) 2011-01-29 2020-05-19 Sdl Netherlands B.V. Systems, methods, and media for web content management
US9547626B2 (en) 2011-01-29 2017-01-17 Sdl Plc Systems, methods, and media for managing ambient adaptability of web applications and web services
US8781836B2 (en) 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US10580015B2 (en) 2011-02-25 2020-03-03 Sdl Netherlands B.V. Systems, methods, and media for executing and optimizing online marketing initiatives
US10140320B2 (en) 2011-02-28 2018-11-27 Sdl Inc. Systems, methods, and media for generating analytical data
US8543577B1 (en) 2011-03-02 2013-09-24 Google Inc. Cross-channel clusters of information
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
JP5696555B2 (ja) * 2011-03-28 2015-04-08 Fuji Xerox Co., Ltd. Program and information processing device
US9116995B2 (en) 2011-03-30 2015-08-25 Vcvc Iii Llc Cluster-based identification of news stories
US20120265784A1 (en) * 2011-04-15 2012-10-18 Microsoft Corporation Ordering semantic query formulation suggestions
US20120310642A1 (en) 2011-06-03 2012-12-06 Apple Inc. Automatically creating a mapping between text data and audio data
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US8812294B2 (en) 2011-06-21 2014-08-19 Apple Inc. Translating phrases from one language into another using an order-based set of declarative rules
US10198506B2 (en) * 2011-07-11 2019-02-05 Lexxe Pty Ltd. System and method of sentiment data generation
US9069814B2 (en) 2011-07-27 2015-06-30 Wolfram Alpha Llc Method and system for using natural language to generate widgets
US8706472B2 (en) 2011-08-11 2014-04-22 Apple Inc. Method for disambiguating multiple readings in language conversion
US9984054B2 (en) 2011-08-24 2018-05-29 Sdl Inc. Web interface including the review and manipulation of a web document and utilizing permission based control
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
US9734252B2 (en) 2011-09-08 2017-08-15 Wolfram Alpha Llc Method and system for analyzing data using a query answering system
US8914277B1 (en) * 2011-09-20 2014-12-16 Nuance Communications, Inc. Speech and language translation of an utterance
US8762156B2 (en) 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US10169339B2 (en) 2011-10-31 2019-01-01 Elwha Llc Context-sensitive query enrichment
US20130124194A1 (en) * 2011-11-10 2013-05-16 Inventive, Inc. Systems and methods for manipulating data using natural language commands
US9851950B2 (en) 2011-11-15 2017-12-26 Wolfram Alpha Llc Programming in a precise syntax using natural language
US8965750B2 (en) 2011-11-17 2015-02-24 Abbyy Infopoisk Llc Acquiring accurate machine translation
US9195853B2 (en) 2012-01-15 2015-11-24 International Business Machines Corporation Automated document redaction
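JP5567749B2 (ja) * 2012-02-15 2014-08-06 Rakuten, Inc. Dictionary generation device, dictionary generation method, dictionary generation program, and computer-readable recording medium storing the program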
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9064009B2 (en) * 2012-03-28 2015-06-23 Hewlett-Packard Development Company, L.P. Attribute cloud
US8989485B2 (en) 2012-04-27 2015-03-24 Abbyy Development Llc Detecting a junction in a text line of CJK characters
US8971630B2 (en) 2012-04-27 2015-03-03 Abbyy Development Llc Fast CJK character recognition
US9773270B2 (en) 2012-05-11 2017-09-26 Fredhopper B.V. Method and system for recommending products based on a ranking cocktail
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US9460082B2 (en) * 2012-05-14 2016-10-04 International Business Machines Corporation Management of language usage to facilitate effective communication
US8775442B2 (en) 2012-05-15 2014-07-08 Apple Inc. Semantic search using a single-source semantic model
US10417037B2 (en) 2012-05-15 2019-09-17 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US10261994B2 (en) 2012-05-25 2019-04-16 Sdl Inc. Method and system for automatic management of reputation of translators
WO2013185109A2 (en) 2012-06-08 2013-12-12 Apple Inc. Systems and methods for recognizing textual identifiers within a plurality of words
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9195647B1 (en) * 2012-08-11 2015-11-24 Guangsheng Zhang System, methods, and data structure for machine-learning of contextualized symbolic associations
US9405424B2 (en) 2012-08-29 2016-08-02 Wolfram Alpha, Llc Method and system for distributing and displaying graphical items
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US11386186B2 (en) 2012-09-14 2022-07-12 Sdl Netherlands B.V. External content library connector systems and methods
US10452740B2 (en) 2012-09-14 2019-10-22 Sdl Netherlands B.V. External content libraries
US11308528B2 (en) 2012-09-14 2022-04-19 Sdl Netherlands B.V. Blueprinting of multimedia assets
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US8935167B2 (en) 2012-09-25 2015-01-13 Apple Inc. Exemplar-based latent perceptual modeling for automatic speech recognition
US9916306B2 (en) 2012-10-19 2018-03-13 Sdl Inc. Statistical linguistic analysis of source content
US9892278B2 (en) 2012-11-14 2018-02-13 International Business Machines Corporation Focused personal identifying information redaction
US10095692B2 (en) * 2012-11-29 2018-10-09 Thomson Reuters Global Resources Unlimited Company Template bootstrapping for domain-adaptable natural language generation
US20150317386A1 (en) * 2012-12-27 2015-11-05 Abbyy Development Llc Finding an appropriate meaning of an entry in a text
KR102516577B1 (ko) 2013-02-07 2023-04-03 Apple Inc. Voice trigger for a digital assistant
US9135240B2 (en) * 2013-02-12 2015-09-15 International Business Machines Corporation Latent semantic analysis for application in a question answer system
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9977779B2 (en) 2013-03-14 2018-05-22 Apple Inc. Automatic supplementation of word correction dictionaries
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
US10642574B2 (en) 2013-03-14 2020-05-05 Apple Inc. Device, method, and graphical user interface for outputting captions
US9311297B2 (en) * 2013-03-14 2016-04-12 Prateek Bhatnagar Method and system for outputting information
US9733821B2 (en) 2013-03-14 2017-08-15 Apple Inc. Voice control to diagnose inadvertent activation of accessibility features
US10572476B2 (en) 2013-03-14 2020-02-25 Apple Inc. Refining a search based on schedule items
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
US11151899B2 (en) 2013-03-15 2021-10-19 Apple Inc. User training by intelligent digital assistant
US10748529B1 (en) 2013-03-15 2020-08-18 Apple Inc. Voice activated device for use with a voice-based digital assistant
AU2014233517B2 (en) 2013-03-15 2017-05-25 Apple Inc. Training an at least partial voice command system
US10078487B2 (en) 2013-03-15 2018-09-18 Apple Inc. Context-sensitive handling of interruptions
JP6152711B2 (ja) * 2013-06-04 2017-06-28 Fujitsu Ltd. Information retrieval device and information retrieval method
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
WO2014197334A2 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197335A1 (en) 2013-06-08 2014-12-11 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
KR101959188B1 (ko) 2013-06-09 2019-07-02 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
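KR101809808B1 (ko) 2013-06-13 2017-12-15 Apple Inc. System and method for placing emergency calls initiated by voice command
CN105453026A (zh) 2013-08-06 2016-03-30 Apple Inc. Automatically activating smart responses based on activity from remote devices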
US9311300B2 (en) * 2013-09-13 2016-04-12 International Business Machines Corporation Using natural language processing (NLP) to create subject matter synonyms from definitions
US20160224637A1 (en) * 2013-11-25 2016-08-04 Ut Battelle, Llc Processing associations in knowledge graphs
US10296160B2 (en) 2013-12-06 2019-05-21 Apple Inc. Method for extracting salient dialog usage from live data
RU2592395C2 (ru) 2013-12-19 2016-07-20 ABBYY InfoPoisk LLC Resolving semantic ambiguity using statistical analysis
US20150178390A1 (en) * 2013-12-20 2015-06-25 Jordi Torras Natural language search engine using lexical functions and meaning-text criteria
RU2613847C2 (ru) 2013-12-20 2017-03-21 ABBYY Development LLC Identification of Chinese, Japanese, and Korean script
RU2586577C2 (ru) 2014-01-15 2016-06-10 ABBYY InfoPoisk LLC Filtering arcs in a syntactic graph
RU2665239C2 (ru) 2014-01-15 2018-08-28 ABBYY Production LLC Automatic extraction of named entities from text
JP6260294B2 (ja) * 2014-01-21 2018-01-17 Fujitsu Ltd. Information retrieval device, information retrieval method, and information retrieval program
RU2640322C2 (ru) 2014-01-30 2017-12-27 ABBYY Development LLC Methods and systems for efficient automatic character recognition
RU2648638C2 (ru) 2014-01-30 2018-03-26 ABBYY Development LLC Methods and systems for efficient automatic character recognition using multiple clusters of character reference patterns
RU2556425C1 (ru) * 2014-02-14 2015-07-10 Eventos CJSC (ZAO "Eventos") Method for automatic iterative clustering of electronic documents by semantic similarity, method for searching a collection of documents clustered by semantic similarity, and machine-readable media
US10839110B2 (en) * 2014-05-09 2020-11-17 Autodesk, Inc. Techniques for using controlled natural language to capture design intent for computer-aided design
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
TWI566107B (zh) 2014-05-30 2017-01-11 Apple Inc. Method for processing multi-part voice commands, non-transitory computer-readable storage medium, and electronic device
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
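KR101661198B1 (ko) * 2014-07-10 2016-10-04 Naver Corp. Method and system for searching and providing information for natural language queries with simple or complex sentence structure
CN104199803B (zh) * 2014-07-21 2017-10-13 Anhui Huazhen Information Technology Co., Ltd. Text information processing system and method based on combination theory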
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
RU2596600C2 (ru) 2014-09-02 2016-09-10 ABBYY Development LLC Methods and systems for processing images of mathematical expressions
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9588961B2 (en) 2014-10-06 2017-03-07 International Business Machines Corporation Natural language processing utilizing propagation of knowledge through logical parse tree structures
US9715488B2 (en) * 2014-10-06 2017-07-25 International Business Machines Corporation Natural language processing utilizing transaction based knowledge representation
US9665564B2 (en) 2014-10-06 2017-05-30 International Business Machines Corporation Natural language processing utilizing logical tree structures
US9710547B2 (en) 2014-11-21 2017-07-18 Inbenta Natural language semantic search system and method using weighted global semantic representations
US9626358B2 (en) 2014-11-26 2017-04-18 Abbyy Infopoisk Llc Creating ontologies by analyzing natural language texts
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9589185B2 (en) 2014-12-10 2017-03-07 Abbyy Development Llc Symbol recognition using decision forests
JP6447161B2 (ja) * 2015-01-20 2019-01-09 Fujitsu Ltd. Semantic structure search program, semantic structure search device, and semantic structure search method
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9632999B2 (en) * 2015-04-03 2017-04-25 Klangoo, Sal. Techniques for understanding the aboutness of text based on semantic analysis
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US9778929B2 (en) 2015-05-29 2017-10-03 Microsoft Technology Licensing, Llc Automated efficient translation context delivery
US10762521B2 (en) 2015-06-01 2020-09-01 Jpmorgan Chase Bank, N.A. System and method for loyalty integration for merchant specific digital wallets
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US9578173B2 (en) 2015-06-05 2017-02-21 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10628413B2 (en) * 2015-08-03 2020-04-21 International Business Machines Corporation Mapping questions to complex database lookups using synthetic events
US10628521B2 (en) * 2015-08-03 2020-04-21 International Business Machines Corporation Scoring automatically generated language patterns for questions using synthetic events
US10134389B2 (en) * 2015-09-04 2018-11-20 Microsoft Technology Licensing, Llc Clustering user utterance intents with semantic parsing
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
EP3163467A1 (en) * 2015-10-30 2017-05-03 BIGFLO s.r.l. Method and tool for the automatic reformulation of search keyword strings in document search systems
US10614167B2 (en) 2015-10-30 2020-04-07 Sdl Plc Translation review workflow systems and methods
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10592603B2 (en) * 2016-02-03 2020-03-17 International Business Machines Corporation Identifying logic problems in text using a statistical approach and natural language processing
US11042702B2 (en) 2016-02-04 2021-06-22 International Business Machines Corporation Solving textual logic problems using a statistical approach and natural language processing
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
EP3394798A1 (en) * 2016-03-18 2018-10-31 Google LLC Generating dependency parses of text segments using neural networks
US11200217B2 (en) 2016-05-26 2021-12-14 Perfect Search Corporation Structured document indexing and searching
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US10289680B2 (en) * 2016-05-31 2019-05-14 Oath Inc. Real time parsing and suggestions from pre-generated corpus with hypernyms
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
DK179309B1 (en) 2016-06-09 2018-04-23 Apple Inc Intelligent automated assistant in a home environment
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
DK179049B1 (en) 2016-06-11 2017-09-18 Apple Inc Data driven natural language event detection and classification
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
DK179343B1 (en) 2016-06-11 2018-05-14 Apple Inc Intelligent task discovery
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
US11049190B2 (en) 2016-07-15 2021-06-29 Intuit Inc. System and method for automatically generating calculations for fields in compliance forms
US10579721B2 (en) 2016-07-15 2020-03-03 Intuit Inc. Lean parsing: a natural language processing system and method for parsing domain-specific languages
US11222266B2 (en) 2016-07-15 2022-01-11 Intuit Inc. System and method for automatic learning of functions
US10120861B2 (en) 2016-08-17 2018-11-06 Oath Inc. Hybrid classifier for assigning natural language processing (NLP) inputs to domains in real-time
US9984063B2 (en) 2016-09-15 2018-05-29 International Business Machines Corporation System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning
US9953027B2 (en) * 2016-09-15 2018-04-24 International Business Machines Corporation System and method for automatic, unsupervised paraphrase generation using a novel framework that learns syntactic construct while retaining semantic meaning
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10437833B1 (en) * 2016-10-05 2019-10-08 Ontocord, LLC Scalable natural language processing for large and dynamic text environments
KR102589638B1 (ko) * 2016-10-31 2023-10-16 Samsung Electronics Co., Ltd. Apparatus and method for generating a sentence
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
JP6805927B2 (ja) * 2017-03-28 2020-12-23 Fujitsu Ltd. Index generation program, data search program, index generation device, data search device, index generation method, and data search method
DK201770439A1 (en) 2017-05-11 2018-12-13 Apple Inc. Offline personal assistant
DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. USER-SPECIFIC Acoustic Models
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT
US10275452B2 (en) 2017-05-12 2019-04-30 International Business Machines Corporation Automatic, unsupervised paraphrase detection
DK201770432A1 (en) 2017-05-15 2018-12-21 Apple Inc. Hierarchical belief states for digital assistants
DK201770431A1 (en) 2017-05-15 2018-12-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
DK179560B1 (en) 2017-05-16 2019-02-18 Apple Inc. FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES
US10152571B1 (en) * 2017-05-25 2018-12-11 Enlitic, Inc. Chest x-ray differential diagnosis system
CA3076418C (en) 2017-09-22 2023-02-21 Intuit Inc. Lean parsing: a natural language processing system and method for parsing domain-specific languages
US10635863B2 (en) 2017-10-30 2020-04-28 Sdl Inc. Fragment recall and adaptive automated translation
US11087097B2 (en) * 2017-11-27 2021-08-10 Act, Inc. Automatic item generation for passage-based assessment
US11410130B2 (en) * 2017-12-27 2022-08-09 International Business Machines Corporation Creating and using triplet representations to assess similarity between job description documents
US10817676B2 (en) 2017-12-27 2020-10-27 Sdl Inc. Intelligent routing services and systems
MY201295A (en) 2017-12-28 2024-02-15 Mimos Berhad A computer-implemented method for self-learning text relevance and determining text relevancy
US11573990B2 (en) * 2017-12-29 2023-02-07 Entefy Inc. Search-based natural language intent determination
IL258689A (en) * 2018-04-12 2018-05-31 Browarnik Abel A system and method for computerized semantic indexing and searching
JP7135399B2 (ja) * 2018-04-12 2022-09-13 Fujitsu Ltd. Identification program, identification method, and information processing device
US11016985B2 (en) * 2018-05-22 2021-05-25 International Business Machines Corporation Providing relevant evidence or mentions for a query
US11042712B2 (en) * 2018-06-05 2021-06-22 Koninklijke Philips N.V. Simplifying and/or paraphrasing complex textual content by jointly learning semantic alignment and simplicity
US11256867B2 (en) 2018-10-09 2022-02-22 Sdl Inc. Systems and methods of machine learning for digital assets and message creation
US11163956B1 (en) 2019-05-23 2021-11-02 Intuit Inc. System and method for recognizing domain specific named entities using domain specific word embeddings
US11477140B2 (en) 2019-05-30 2022-10-18 Microsoft Technology Licensing, Llc Contextual feedback to a natural understanding system in a chat bot
US10868778B1 (en) 2019-05-30 2020-12-15 Microsoft Technology Licensing, Llc Contextual feedback, with expiration indicator, to a natural understanding system in a chat bot
JP2022547750A (ja) 2019-09-16 2022-11-15 Docugami, Inc. Cross-document intelligent authoring and processing assistant
US11068665B2 (en) 2019-09-18 2021-07-20 International Business Machines Corporation Hypernym detection using strict partial order networks
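CN111090668B (zh) * 2019-12-09 2023-09-26 Jingdong Technology Information Technology Co., Ltd. Data retrieval method and apparatus, electronic device, and computer-readable storage medium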
US11783128B2 (en) 2020-02-19 2023-10-10 Intuit Inc. Financial document text conversion to computer readable operations
US11651156B2 (en) * 2020-05-07 2023-05-16 Optum Technology, Inc. Contextual document summarization with semantic intelligence
US11954448B2 (en) * 2020-07-21 2024-04-09 Microsoft Technology Licensing, Llc Determining position values for transformer models
US20230343333A1 (en) * 2020-08-24 2023-10-26 Unlikely Artificial Intelligence Limited A computer implemented method for the automated analysis or use of data
US12210824B1 (en) * 2021-04-30 2025-01-28 Now Insurance Services, Inc. Automated information extraction from electronic documents using machine learning
US11966699B2 (en) * 2021-06-17 2024-04-23 International Business Machines Corporation Intent classification using non-correlated features
US11989507B2 (en) 2021-08-24 2024-05-21 Unlikely Artificial Intelligence Limited Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12136484B2 (en) 2021-11-05 2024-11-05 Altis Labs, Inc. Method and apparatus utilizing image-based modeling in healthcare

Family Cites Families (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4823306A (en) * 1987-08-14 1989-04-18 International Business Machines Corporation Text search system
US4839853A (en) * 1988-09-15 1989-06-13 Bell Communications Research, Inc. Computer information retrieval using latent semantic structure
SE466029B (sv) * 1989-03-06 1991-12-02 Ibm Svenska Ab Device and method for analysis of natural language in a computer-based information processing system
NL8900587A (nl) * 1989-03-10 1990-10-01 Bso Buro Voor Systeemontwikkel Method for determining the semantic relatedness of lexical components in a text
US5146406A (en) * 1989-08-16 1992-09-08 International Business Machines Corporation Computer method for identifying predicate-argument structures in natural language text
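JP3266246B2 (ja) 1990-06-15 2002-03-18 International Business Machines Corporation Natural language analysis apparatus and method, and method for constructing a knowledge base for natural language analysis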
US5617578A (en) * 1990-06-26 1997-04-01 Spss Corp. Computer-based workstation for generation of logic diagrams from natural language text structured by the insertion of script symbols
US5325298A (en) * 1990-11-07 1994-06-28 Hnc, Inc. Methods for generating or revising context vectors for a plurality of word stems
US5278980A (en) * 1991-08-16 1994-01-11 Xerox Corporation Iterative technique for phrase query formation and an information retrieval system employing same
US5488719A (en) * 1991-12-30 1996-01-30 Xerox Corporation System for categorizing character strings using acceptability and category information contained in ending substrings
US5591661A (en) 1992-04-07 1997-01-07 Shiota; Philip Method for fabricating devices for electrostatic discharge protection and voltage references, and the resulting structures
US5377103A (en) 1992-05-15 1994-12-27 International Business Machines Corporation Constrained natural language interface for a computer that employs a browse function
US5592661A (en) * 1992-07-16 1997-01-07 International Business Machines Corporation Detection of independent changes via change identifiers in a versioned database management system
US5630121A (en) * 1993-02-02 1997-05-13 International Business Machines Corporation Archiving and retrieving multimedia objects using structured indexes
US5454106A (en) * 1993-05-17 1995-09-26 International Business Machines Corporation Database retrieval system using natural language for presenting understood components of an ambiguous query on a user interface
US5619709A (en) * 1993-09-20 1997-04-08 Hnc, Inc. System and method of context vector generation and retrieval
GB9320404D0 (en) * 1993-10-04 1993-11-24 Dixon Robert Method & apparatus for data storage & retrieval
US5873056A (en) * 1993-10-12 1999-02-16 The Syracuse University Natural language processing system for semantic vector representation which accounts for lexical ambiguity
US5724594A (en) 1994-02-10 1998-03-03 Microsoft Corporation Method and system for automatically identifying morphological information from a machine-readable dictionary
US5675819A (en) * 1994-06-16 1997-10-07 Xerox Corporation Document information retrieval using global word co-occurrence patterns
US5794050A (en) * 1995-01-04 1998-08-11 Intelligent Text Processing, Inc. Natural language understanding system
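JP2923552B2 (ja) * 1995-02-13 1999-07-26 富士通株式会社 Method for constructing an organizational activity database, method for inputting analysis sheets used therein, and organizational activity management system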
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US6006221A (en) 1995-08-16 1999-12-21 Syracuse University Multilingual document retrieval system and method using semantic vector matching
JP3083742B2 (ja) * 1995-10-03 2000-09-04 インターナショナル・ビジネス・マシーンズ・コーポレ−ション Spreadsheet calculation method
US5995922A (en) 1996-05-02 1999-11-30 Microsoft Corporation Identifying information related to an input word in an electronic dictionary
US5966686A (en) * 1996-06-28 1999-10-12 Microsoft Corporation Method and system for computing semantic logical forms from syntax trees
US5893104A (en) * 1996-07-09 1999-04-06 Oracle Corporation Method and system for processing queries in a database system using index structures that are not native to the database system
US6038561A (en) * 1996-10-15 2000-03-14 Manning & Napier Information Services Management and analysis of document information text
US5970490A (en) * 1996-11-05 1999-10-19 Xerox Corporation Integration platform for heterogeneous databases
US6076051A (en) * 1997-03-07 2000-06-13 Microsoft Corporation Information retrieval utilizing semantic representation of text
US5895464A (en) * 1997-04-30 1999-04-20 Eastman Kodak Company Computer program product and a method for using natural language for the description, search and retrieval of multi-media objects
US5933822A (en) * 1997-07-22 1999-08-03 Microsoft Corporation Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision
US6070134A (en) * 1997-07-31 2000-05-30 Microsoft Corporation Identifying salient semantic relation paths between two words
US5991713A (en) * 1997-11-26 1999-11-23 International Business Machines Corp. Efficient method for compressing, storing, searching and transmitting natural language text
US6675159B1 (en) * 2000-07-27 2004-01-06 Science Applic Int Corp Concept-based search and retrieval system
US6664964B1 (en) * 2000-11-10 2003-12-16 Emc Corporation Correlation criteria for logical volumes
US7050964B2 (en) 2001-06-01 2006-05-23 Microsoft Corporation Scaleable machine translation system
US7734459B2 (en) 2001-06-01 2010-06-08 Microsoft Corporation Automatic extraction of transfer mappings from bilingual corpora

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1333362C (zh) * 2001-03-26 2007-08-22 美国网上搜索公司 Method and apparatus for intelligent data assimilation
US7630879B2 (en) 2002-09-13 2009-12-08 Fuji Xerox Co., Ltd. Text sentence comparing apparatus
CN105512291A (zh) * 2006-02-28 2016-04-20 贝宝公司 Method and system for expanding database search queries
CN105512291B (zh) * 2006-02-28 2020-05-15 贝宝公司 Method and system for expanding database search queries
US8065307B2 (en) 2006-12-20 2011-11-22 Microsoft Corporation Parsing, analysis and scoring of document content
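CN101508188B (zh) * 2009-03-24 2012-09-26 北京市城南橡塑技术研究所 Impact-resistant composite liner plate
CN106598722A (zh) * 2015-10-19 2017-04-26 上海引跑信息科技有限公司 Method for supporting distributed transaction management in a text information retrieval service
CN110088754A (zh) * 2016-10-26 2019-08-02 联邦科学和工业研究组织 Automatic encoder of legislation to logic
CN110088754B (zh) * 2016-10-26 2023-04-28 联邦科学和工业研究组织 Automatic encoder of legislation to logic
CN114969262A (zh) * 2022-05-31 2022-08-30 云知声智能科技股份有限公司 Text processing method and apparatus, storage medium, and electronic apparatus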

Also Published As

Publication number Publication date
EP0965089B1 (en) 2015-03-25
WO1998039714A1 (en) 1998-09-11
US6161084A (en) 2000-12-12
JP2001513243A (ja) 2001-08-28
US6076051A (en) 2000-06-13
EP0965089A1 (en) 1999-12-22
US20050065777A1 (en) 2005-03-24
US6246977B1 (en) 2001-06-12
JP4282769B2 (ja) 2009-06-24
US6871174B1 (en) 2005-03-22
US7013264B2 (en) 2006-03-14

Similar Documents

Publication Publication Date Title
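CN1252876A (zh) Information retrieval utilizing semantic representation of text
CN107993724B (zh) Method and apparatus for processing medical intelligent question-answering data
KR101157693B1 (ko) Multi-stage query processing system and method for use with a tokenspace repository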
US20220261427A1 (en) Methods and system for semantic search in large databases
KR101661198B1 (ko) Method and system for searching and providing information for natural language queries with simple or complex sentence structure
US6131082A (en) Machine assisted translation tools utilizing an inverted index and list of letter n-grams
CN105045875B (zh) Personalized information retrieval method and apparatus
US20030078915A1 (en) Generalized keyword matching for keyword based searching over relational databases
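CN101051311A (zh) Method for extracting central words from entries applied to a central-word extraction system
JPH1145241A (ja) Kana-kanji conversion system and computer-readable recording medium recording a program for causing a computer to function as each means of the system
CN108763348B (zh) Improved classification method that extends the word feature vectors of short texts
JP2002520712A (ja) Data retrieval system and method, and use thereof in a search engine
CN101042692A (zh) Method and device for obtaining translations based on semantic prediction
WO2015062340A1 (zh) Natural language search method and system compatible with keyword search
CN105335487A (zh) Agricultural expert information retrieval system and method based on an agricultural technology information ontology library
CN102662936A (zh) Chinese-English out-of-vocabulary word translation method combining Web mining, multiple features, and supervised learning
JP2011118689A (ja) Search method and system
CN106649605A (zh) Method and apparatus for triggering promotional keywords
CN102314464B (zh) Lyrics search method and search engine
JP2007334388A (ja) Clustering method, apparatus, program, and computer-readable recording medium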
Liu The application of RAG technology in traditional Chinese medicine
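JP5298834B2 (ja) Example-sentence matching translation apparatus, program, and phrase translation apparatus comprising the translation apparatus
JP2003108595A (ja) Information retrieval apparatus, information retrieval method, and information retrieval program
CN118820407B (zh) Hybrid retrieval method and apparatus for life-cycle flow data based on a large language model
CN112163065A (zh) Information retrieval method, system, and medium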

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication