JP2001503236A

JP2001503236A - Personal voice message processor and method

Info

Publication number: JP2001503236A
Application number: JP10544101A
Authority: JP
Inventors: スターン，ジョフリー; ウエクスラー，ギル
Original assignee: イー・エヌ・ティーエコーテクノロジーズリミテッド
Priority date: 1997-04-11
Filing date: 1998-04-11
Publication date: 2001-03-06
Also published as: CA2286043A1; EP1060616A2; WO1998047252A3; WO1998047252A2; AU6897698A; IL132306A0; CN1260924A

Abstract

(57) [Abstract] A portable device with which a user can record, edit, play back and review voice messages and other audio material that are received from, and subsequently transmitted to, a remote voice processing or interactive voice response (IVR) host computer over a communication link. The device preferably has its own power supply, integrated circuits and control buttons, and uses a built-in speaker, a microphone and a removable memory card to record, edit, store and play back audio signals locally. The device also includes a standard RJ-11 telephone jack, a modem chipset and a DTMF tone decoder, so that audio signals can be sent to and received from the host computer and the exchange can be controlled. The device includes circuitry that allows audio signals to be transmitted and received at a speed substantially higher than the speed at which they were originally recorded.

Description

【発明の詳細な説明】パーソナル音声メッセージプロセッサ及び方法発明の分野本発明は、一般に、ディクテーション装置および音声通信装置に関する。詳しくは、ボイスメールおよび音声コンテンツの記録および編集、ならびに共通の電気通信媒体またはデータリンクを用いたインターネットなどの私設または公衆ネットワークを通したこれらの送受信を含む、音声通信のための方法および携帯型装置に関する。発明の背景ボイスメール以外のすべての電子メッセージシステムは中間装置または格納媒体を有し、これによりデータは標準の通信リンクを通して、好ましくは高い通信レートで伝送され、後での予定されたユーザによるオフラインアクセス、再吟味および編集のために、格納媒体または不参加の装置に格納される。ファクシミリ伝送の場合は、画像が送信機によってスキャンされ、次に送信され、最終的に予定された受信者によるオフライン利用のために遠隔地で印刷される。電子メールの場合は、データはコンピュータで生成され、次に送信されて、予定されたユーザの不参加のコンピュータに直接格納されるか、またはコンピュータネットワークにリンクされた中央ホストコンピュータに格納され、後で予定されたユーザによって検索される。最も一般的なネットワークは、ローカルエリアネットワーク（ＬＡＮ）、広域ネットワーク（ＷＡＮ）、およびインターネットなどの公衆ネットワーク、または私設ネットワークである。予定されたユーザが自分のコンピュータにアクセスすると、イー・メールが既に入っているか、またはメールが到着していることおよびこれをどのようにして検索できるかを示すグラフィックエディタで表示されたメッセージを見つける。イー・メールが一旦検索されると、さらに、予定されたユーザのコンピュータにおいて、オフラインでユーザによって読まれ、再吟味され、操作され得る。もしくは、イー・メールはプリンタに出力されてユーザの都合の良いときに再吟味することができるようにハードコピーが提供され得る。ファクシミリ機が利用できないときは、ファクシミリはコンピュータ、または Reflection Technology，Inc．のFaxViewパーソナルファックスリーダーなどの、受取人がオフラインでの独立した再吟味のためのハンドヘルドでペーパレスのファックス機に伝送され得る。ファクシミリおよびイー・メールのメッセージ両方のためのユーティリティが存在し、これにより、メッセージは、許可されたユーザによって、続くユーザのイー・メールアドレスまたは不参加のファクシミリ機への伝送のためにホストから選択され得る。例えば、Duehrenらの米国特許第４，９１８，７２２号を参照せよ。最近、インターネットの利用の広がりおよび増大と共に、そしてより詳しく言えば、ＨＴＭＬ（ＨｙｐｅｒＴｅｘｔＭａｒｋｕｐＬａｎｇｕａｔｅ）文書の形態の刊行マテリアルを提供するＷＥＢサイトの人気の増大と共に、これらのファイルを選択し、後でオフラインでアクセスし、ファックスにより独立して再吟味することを可能にするユーティリティが登場した。例えば、Ibex Technol ogies，Inc．によるウェブ用のFactsLineを参照せよ。このようなユーティリティにより、インターネットを通じて提供される大量の情報およびグラフィックスが、インターネットに接続されたコンピュータへのアクセスを持たないユーザ、またはオンラインに費やす時間量を制限したいと考えているユーザに利用可能となる。潜在的ユーザの大部分がインターネットへのアクセスを持たないか、または持っている場合でも、旅行中であるとか、自分のコンピュータにアクセスできないとか、または興味ある文書またはイー・メールへとナビゲートし、文書を番号で特定し、そして選択された文書をテキスト合成音声を用いて電話を通してリアルタイムに読んでもらい、ファックスで送ってもらうかまたはイー・メールに添付して送ってもらうために、コンピュータを立ち上げてウェブサイトのグラフィックスが表示されるのを待って（音声プロンプトに応答してユーザがインターネットにアクセスすることが可能な、Netphonic Communications，Inc.のWeb-On Cal l Voice 
Browserなどのユーティリティが導入されている）、時間を費やすのを望まないかもしれない。同様に、インターネットの利用が広まったこと、および特に人気のあるウェブサイトへのまたは特定のピーク使用時間中のアクセスが混み合うことにより、インターネットユーザが特定のウェブサイトを「講読(subscribe)」するのを可能にするオフラインブラウザと呼ばれるユーティリティに対する需要が生じた。そしてこの特定のウェブサイトから、ユーザのコンピュータはオフピーク時間中にマテリアルを自動的に、検索し、新しい情報また更新された情報をカテゴリー化および組織化し、ユーザは自分の選択したブラウザを用いてオフラインでこれを再吟味することができる（例えば、FreeLoader，Inc.のFreeLoader）。同様に、ボイスメールをイー・メールアドレスに送ることができ、またウェブサイト上に提供される音声コンテンツを対話型音声応答（ＩＶＲ）システムへの標準電話コールによって更新することができる講読サービスが導入されている（例えば、Telet Communicatio nsの「Amail」および「Dialweb」）。最近では、音声プロセッサシステムの製造業者たちが、インターネットメールのための音声プロフィール用の共通操作性標準（Interoperability standard fo r a Voice Profile for Internet Mail：ＶＰＩＭ）を開発するために、世界のボイスメールシステム市場の６０％以上から成るワークグループを確立した。主にインターネットに世界的にアクセス可能なコンタクトポイントが存在するため、および一般に認められた伝送プロトコル、具体的にはＶＰＩＭのコアとしてのシンプル・メッセージ・転送プロトコル（ＳＭＴＰ）および多目的インターネット・メッセージング・エクステンション（ＭＩＭＥ）を使用しているため、ＴＣＰ／ＩＰ（伝送制御プロトコル／インターネットプロトコル）が伝送手段として選択されている（Business Wire、１９９６年４月２９日号を参照）。これが実現されると、ＶＰＩＭなどの共通操作性標準により、ボイスメールのユーザが自分たちの音声メッセージを、現在電話を通じて行うのと同じように容易にインターネットまたはイントラネットを通して送受信することが可能となるだろう。インターネットを通しての音声でのメッセージングおよび音声イー・メールに加えて、専有のクライアントサーバソフトウェアシステムが最近導入されたことにより、従来のマルチメディアパーソナルコンピュータおよび音声グレードの電話線を有するユーザは、リアルタイムストリーム（ＲＥ）で音声または音声ベースのマルチメディアコンテンツを閲覧、選択および再生し、またはオンデマンドでダウンロード（ＲＥＭ）することができる。興味を持ったユーザは、このような音声コンテンツにアクセスするために、コンテンツプロバイダのウェブサイトからソフトウェアをダウンロードをするだけでよい（例えば、Progressive Netw orkのRealAudio Player and Server）。このようなシステムこそ真の大躍進である。というのも、それまでは、従来のオンライン方法による音声の伝達は、情報を獲得するのに実際のプログラムの５倍の長さを要するほどの低いレートで音声をダウンロードした。この場合は、５分間の音声を聴くのに聴視者は２５分間待つ必要があった。インターネットを通して音声を流すことが利用可能となったために、多数の会社が、専有ソフトウェアでプログラムされたマルチメディアコンピュータを持つユーザがインターネットを通してリアルタイムで話すことができるインターネット電話製品を導入した(Voclatec参照）。このようなシステムは、ユーザがインターネットのローカル・アクセスポイントまたはポイント・オブ・プレゼンスにアクセスして長距離コールをローカルコールにすることができるときは、長距離に有用である。同様に、インターネットを通して音声を流す結果として、コンテンツプロバイダはウェブサイトからの生の音声をブロードキャストすることができる（例えば、Cameron Audio NetworksによるAudionet）。最近では、インターネットを通じた通信のための標準ベースでの実装が導入され、インテルおよびマイクロソフトによってサポートされている。これは、DSP GroupのTrueSpeech 
G.723圧縮技術を利用している。これは高度なアルゴリズムを使用し、この結果、高い圧縮率にもかかわらず優れた音声品質を得ており、圧縮率２０：１および２４：１でそれぞれ６．３キロビット／秒（ｋｂｐｓ）および５．３ｋｂｐｓで動作する。また、２８．８ｋｂｐｓモデム速度で実効速度を３．７ｋｂｐｓより遅くすることができるサイレンス圧縮も含む。これにより音声を１：７．７８の割合で、または１０分間の音声を１．３分で伝送することが可能となる。２８．８ｋｂｐｓで作動するＶ．３４モデムを使用するTexas InstrumentのC8 0 DSPチップを用いると、１０：１の割合の音声伝送レート（１０分間の音声を１分間で伝送）を電話グレードの音品質で実現することができる。上記から明らかなように、データ、グラフィックス、音声メッセージングおよび音声コンテンツをネットワークを通して転送することがより広まり便利になった一方で、この進展により、音声でのメッセージングおよび音声コンテンツの転送および入出力に関連する従来からのいくつかの欠点が浮き彫りにされた。音声でのメッセージングおよび音声コンテンツがより利用可能になるに従って、このような音声のための中間装置または格納媒体がないことにより生じる欠点がより明瞭となる。イー・メールおよびファクシミリにとって、電話リンクの使用はデータの伝送およびこのデータのための制御コードの伝送に制限される。ネットワークコンピューティングの利用が増大し広まると共に、イー・メールおよびファクシミリのための電話リンク（例えば、RADLinxのPASSaFAX）は、ネットワークにアクセスするためのローカルのポイント・オブ・プレゼンスへの接続へとさらに制限される。イー・メールおよびファクシミリが含むコンテンツは、予定されたユーザによってプリンタに出力され得るものであり、これによりユーザはオフィスから離れるときまたは旅行に出かけるときマテリアルのハードコピーを携帯し、都合の良いときに再吟味することが可能となる。これに対して、音声メッセージおよび音声テキストは、現時点では送り手によって録音され、主としてリアルタイムおよびオンラインで予定された受け取り手によって検索される。せいぜいのところ、ユーザは自分のマルチメディアノートブック型コンピュータを用いて、録音し、格納された音声ファイルまたは流れている音声ファイルにアクセスすることができるだけである。音声へのオフラインアクセスは音声ファイルをマルチメディアコンピュータにダウンロードして、サウンドカードを装備したコンピュータで音声を再生することに制限される。しかし、マルチメディアコンピュータは、画面、キーボードおよび多目的処理能力を有しているため、かろうじて、従来のディクテーション装置またはボイスレコーダの大きさである。このように音声を生成するため、および音声にアクセスするために電話受話器またはマルチメディアコンピュータに依存することは、ファクシミリの受け取り人がファクシミリ機またはファックス可能コンピュータの近くにいるときしかファクシミリを読み、編集し、作成することができないのと類似する。電話の受話器からリアルタイムに、またはマルチメディアコンピュータからオフラインで行う以外にはネットワークベースのボイスメールを作成、再吟味およびアクセスすることができないことにより、音声でのメッセージングおよび音声コンテンツをネットワークベースのメッセージングへと統合するという要望が著しく制限される。ネットワークベースの音声でのメッセージングを格納する専用の携帯型装置は存在しない。同様に、後で装置に高速伝送してユーザが後でオフラインで再吟味できるように、ネットワークに接続されたホストからパーソナル・ボイスメッセージまたは公共の告知をスキャンおよび選択する方法またはユーティリティは存在しない。ユーザが自分の音声メッセージをオフラインで再吟味するのを可能にする唯一の専用装置は、従来のテープベースの留守番録音装置の標準的な機能に置き換わる、主としてデジタル録音技術を用いる住居用またはスモールオフィス・ホームオフィス（ＳＯＨＯ）用の装置である留守番電話装置（Telephone Answering De 
vice、ＴＡＤ）である。ＴＡＤは電気コンセントおよび電話ジャックの両方に差し込まれ、携帯用ではないため、ユーザはＴＡＤのスピーカの聞こえる範囲内にいなければならないか、またはオンラインおよびリアルタイムで自分のメッセージを検索するために電話を使って呼び込んでもよい。従来、ＴＡＤは非常に制約された外に向けてのメッセージング能力しか提供しない一方で、提供される外へ向けてのメッセージは、所有者がＴＡＤのマイクロホンの範囲内からまたはリアルタイムの電話コールから出力メッセージ（例えば、一般的な挨拶または呼び出し人特定／メールボックス特定のメッセージ）を録音することが必要であった。ネットワークベースであってもＴＡＤベースであっても、オンラインおよびリアルタイム伝送に制限され、物理的に電話機、ＴＡＤまたはマルチメディアコンピュータにアクセスすることが必要な音声でのメッセージングは、特に音声通信は本質的に人が音声を生成またはこれにアクセスするためには口および耳以外の外部ハードウェアまたは器具を必要としないため、不運である。音声は通信の最も自然で自給できる形態である。音声は、入力または検索を行うユーザ側で、筆記用具、キーボード、画面、専用の映像または手から目への(hand-to-eye)調整を必要としない手入らずである。にもかかわらず、ボイスメールが非常に広範囲に用いられるのは、現在の技術が十分であると認められているからではなく、音声の独自の特性の機能のためである。同様に、可聴音および音声を公衆および私設のネットワークを通して利用可能にする非常に多くの革新的なユーティリティが導入されているのは、コンテンツ、メッセージングおよびコマンド発行において可聴音および音声が持つ強力な性質についての評価であり、可聴音および音声をもっと容易に利用可能にする必要性が強調されるだけである。音声でのメッセージングおよび音声コンテンツがもっとアクセス可能になるまでは、上述のネットワークベースの音声ユーティリティの多くがテクノロジー愛好者にとっての新規なものであり続けるであろう。コンピュータ・テレフォン統合（ＣＴＩ）およびユニバーサル・メールボックスについては多くのことが述べられている。これらでは、ネットワークベースのメッセージおよびコンテンツは任意の媒体でおよび任意の選り抜きの入力装置によって生成され、同様に任意の媒体でまたは任意の選り抜きの出力装置によって検索され得る。ファックスはコンピュータ画面上のデータとしてアクセスされ、データはファックスまたはテキスト−音声の音声テキストとしてアクセスされ得る。また、自動音声録音ユーティリティがより有能になれば、音声はイー・メールまたはファックスでの印刷テキストのようにアクセスされるであろう。しかし、音声が電話の受話器または画面／キーボードベースのマルチメディアコンピュータ以外の選り抜きの入出力装置を持たない限り、選り抜きの媒体としての音声に対する要望も同様に大きく制限されるであろう。音声はユーザの声の直接の記録であるため、その熱心な主張、意味および感情の内容は失われることはない。同様に、非常に多くのデータが先ず声で発せられた後でテキストまたはデータに変換されるだけであるため、会合、スピーチおよびラジオ放送についてのタイムリーなデータのためには情報テキスト(info-text )は好適な媒体であるべきだろう。理想的には、ボイスメールは旅行中、標準時間帯を通って通信するとき、および話し言葉で発せられるタイムリーな情報（例えば、会合または講義の議事録）にアクセスするときの好適な通信モードであるべきだろう。音声テキスト（すなわち、コンピュータによって発声されるか、または人によって予め録音されたデータまたはテキスト）は、運転中、装置の操作中、またはレジャー活動を行っているときなどに、モータ技能および映像の使用が不都合であるかまたは故障した場合に、アクセスされるべき情報をメッセージングするために好適なフォーマットであるべきだろう。音声メッセージに直接アクセスするための現在の電話機の使用は、音声でのメッセージングの潜在的な利用を著しく制限してきた。音声メッセージおよび情報 
−テキストのリアルタイム伝送は、特に遠距離からのボイスメールの記録および検索を非常に高コストにする。このように高コストで不便であるということは、音声メールおよび情報−テキストをコスト効率の良い方法で且つユーザ自身のペースで作成および再吟味することができないことを意味する。人は電話でのアクセスが可能な場所および状態、また無線通信リンクの場合には無線伝送が可能であり且つ望ましい場所に制限される。音声メールを作成および再吟味するためにマルチメディアコンピュータを適用することは、キーボード、ポインティング・デバイスおよび画面の使用がほとんど手入らずではなく、またマルチメディアコンピュータのサイズおよび費用からその普及および運搬可能性は見込まれないため、音声でのメッセージングをより便利にすることにはほとんど効果がない。現状では、ボイスメールは、別のときにもっと現実のやり方で連絡をとるつもりの個人間での短いメッセージに限定される（電話タグ）。電話を通してまたはマルチメディアコンピュータで、長くて内容の詰まった「メール」を聴くのは送り手および受け取り手の両方にとってコストがかかり不便であるため、ボイス「メール」は音声での「メッセージング」に限定され始めている。さらに、音声信号をユーザの音声プロセッサまたはＴＡＤへの直接の通信リンクを通し、そしてユーザが電話へのアクセスを持つときのみ（オフピーク時に不参加で記録するのとは反対に）リアルタイムに伝送する場合のコストは、情報テキスト（「テープ」上に記録された命令、記録された旅行談、スピーチ記録、記事または本など）のもっと商業的な使用、および他の革新的な広告者／講読者支援の音声テキストの使用を実行可能性がないものにする。最近では、Charles Lamer等に発行され、International Business Machines C orp．に譲渡された米国特許第５，４４４，７６８号、およびShmuel Goldberg等に発行され、Espro Engineeringに譲渡された米国特許第５，３５９，６９８号が、１つ以上の遠隔中央メッセージ設備に格納された音声メッセージを可聴処理する携帯型コンピュータ装置をいずれも開示している。Lamer等のシステムでは、ユーザは音声メッセージを中央メッセージ設備から通信リンクを通して携帯型装置に記録および再生、送信（アップロード）および受信（ダウンロード）することができる。しかし、Lamer等のシステムでは、携帯型装置と１つ以上の遠隔中央メッセージ設備との間に直接の電話リンクを確立することが必要である。La mer等およびGoldberg等のシステムでは、携帯型装置が、直接の通信リンクを通して、個別に従来の閉鎖型で高コストの専有音声処理システムにアクセスすることができる。Lamer等およびGoldberg等のシステムは、中央メッセージ設備への長距離コールによる以外にはボイスメールにアクセスする商業的に実行可能な解決法を提供していない。このような長距離通話料金に関連する費用が、Lamer等のシステムの長期にわたる使用に手が出せないようにするだろう。さらに、Lame 
r等のシステムは、ユーザが１つ以上の遠隔中央メッセージ設備にコンタクトして、選択された音声ファイルを検索し、伝送することが必要である。このようなポーリング手順に関連する不便性によって、システムの提供する便利性が帳消しにされる。同様に、Lamer等のシステムは、ユーザが携帯型コンピュータ装置によって、利用可能な音声コンテンツを閲覧し得る方法も、後での検索のためにメニューから音声ファイルを選択する方法も提供しない。同様に、Lamer等のシステムは、携帯型コンピュータ装置を直接コンピュータに接続するか、または標準的なタッチトーン電話装置によってローカルに生成されるＤＴＭＦトーンを検出および記録することのいずれかにより、専用「トレーニング」モードを開始することによる以外に、ユーザがサーバのネットワークにリンクされた中央サーバに遠隔アクセスして、制御コードをダウンロードし、パーソナル・ユーザグルーブまたは公衆のデータベースをサーチしてアドレスを探すためのユーティリティを提供していない。典型的なユーザのメールボックス・ユーティリティはユーザのネットワーク・イー・メールサーバで処理され、イー・メールを送受信する過程で順序正しく修正されるため、携帯型コンピュータ装置のためのこのような専用トレーニング・セッションは非実用的である。同様に、新規の音声サーバ・プラットフォーム、ユーティリティおよび圧縮機構が定期的に導入されているため、専用トレーニング・セッションを必要とせずに制御コードおよびアドレスブックの両方を更新するダイナミックでトランスペアレントな方法が必要である。本発明の目的は、広義には、ユーザがオフラインで、どの場所からでも、いかなる活動を行っているときでも、ゆっくりとしたペースで、電話通話料金を負担することなく、また通信リンクが現在アクセス可能であるかどうかに関係なく、ボイスメールを作成および再吟味することができるインターネット接続が可能なディクテーション装置および音声メッセージ記録／レビュー装置および方法を提供することである。好ましくは、ローカルネットワーク・アクセスポイントへの電話リンクを、予め記録されたマテリアルおよび制御コードを高速伝送するための通信リンクとして主として使用してこの伝送を容易にし、これにより電話またはマルチメディアコンピュータおよび音声でのメッセージング用の電話線の、記録または再生装置としての使用を制限することもまた本発明の目的である。ディクテーションおよび音声メッセージ記録／レビュー装置とネットワークサーバとの間にプレメッセージハンドシェーキングを生じさせて、デジタル化音声信号を標準音声圧縮プロトコルおよびＴＣＰ／ＩＰプロトコルスタックの１つに適合させ、ネットワークを通しての音声メッセージの高速伝送を容易にするプロトコルを提供することもまた本発明の目的である。本発明の別の目的は、ユーザが、公衆または私設ネットワークを通して送信および／または受信され得る音声ファイルを記録、編集および再生することを可能にする、携帯型専用音声可能ネットワーク（インターネット）アクセス装置を提供することである。特殊モデム構成の留守番電話装置（Ｍ−ＴＡＤ）の所有者が、圧縮音声メッセージファイルにアクセスして、これをＴＡＤへの直接ケーブル接続または電話リンクのいずれかによって、ＴＡＤのデジタルメモリから携帯型音声メッセージ記録／再生装置に直接ダウンロードするのを可能にする携帯型アクセス装置および方法を提供することもまた本発明の目的である。このような携帯型アクセス装置および方法を提供することにより、ＴＡＤ所有者は送信者にもっとがっしりしたデータ豊富な音声メッセージを自分のＴＡＤに残すようにすすめることができ、またＴＡＤ所有者は、自分のＴＡＤに圧縮デジタル形式で定期的に送ってもらい、本発明の装置にダウンロードして都合のよい時および場所で再生および再吟味することができる音声コンテンツを講読する。またこれにより、ＴＡＤ所有者は、家またはオフィスにいないとき、自分の携帯型ディクテーションおよび音声メッセージ記録／レビュー装置に、自分のＴＡＤとの電話リンクを確立し、すべての格納メッセージを経済的および自動的に検索してすべての発信メッセージ（例えば、一般的および差出人特定の挨拶状）を更新することができる。そして、すべての格納メッセージおよび市外向けの挨拶状はデジタル化圧縮フォーマットで伝送される。本発明は、ユーザが、公衆交換電話システムなどの通信リンクを通して公衆または私設ネットワーク上に位置する遠隔ホストコンピュータから受信され続いて遠隔ホストコンピュータに送信され得る、音声−テキスト、テキスト−音声およびその他の音声マテリアルを含む音声メッセージを
記録、編集、再生および再吟味することができる、低コストで携帯型の記録および再生ディクテーションおよび音声メッセージ記録／レビュー装置を提供する。好適な装置は、それ自体の再充電可能な電源、集積回路および制御ボタンを含み、内蔵のスピーカー、マイクロフォンまたはプラグイン・ヘッドセットと、足踏みペダルと、取り外し可能なメモリカードとを介して音声信号をローカルに記録、編集、格納、再生および録音することを可能にする。装置はまた、標準ＲＪ −１１電話ジャック、モデム・チップセット（またはソフトウェア）、または標準あるいは無線のモデムカードを接続し得る、取り外し可能なＰＣＭＣＩＡコネクタを含み、音声信号を公衆または私設ネットワークに接続されたホストコンピュータへ、およびホストコンピュータから伝送し、制御することを可能にするＤＴＭＦトーンデコーダを含む。装置は、最初に記録されたときより実質的に速い速度で音声信号を送受信することができる回路を含む。好適な装置はまた、ネットワークユーザが、ＴＣＰ／ＩＰ一式のＳＭＴＰ（Si mple Mail Transfer Protocol）、ポストオフィス・プロトコル（ＰＯＰ３）およびＭＩＭＥ（Multipurpose Internet Mail Extension）などの標準プロトコルを用いて、インターネット・サービス・プロバイダ（ＩＳＰ）のアクセスポイントおよびシェル・アカウントなどのローカル・アクセスポイントから直接ネットワークにアクセスして、ユーザのイー・メールアドレスに送られてきた音声ファイル（または同様に音声に翻訳され得るデータ／テキストファイル）の再吟味、選択および検索を行い、またこのようなファイルをダウンロードおよび伝送するのを可能にするために必要な端末エミュレーションを有するプロセッサを含む。好適な装置はまた、標準またはタッチスクリーン方式のディスプレイと、ユーザが自分のイー・メールにアクセスするとき、自分のコンピュータ画面に表示されたイー・メールメッセージを作成し、また読むための同様のグラフィック・エディタを表示して、これによりユーザが自分のイー・メール・メッセージをスクロールし、ダウンロードしたい音声ファイルを選択し、またネットワークサーバによって、または装置において音声フォーマット（テキスト−音声）に変換させたいと思うテキストメッセージを選択するのを可能にするソフトウェアとを含む。好適な装置はまた、装置のバッテリを再充電するために電源に接続し得るポートを有する、装置を置くための受台と、通信リンクの確立を可能にする電話ジャックと、ファイルを直接コンピュータにダウンロードおよびアップロードするため、および「再指定(redirected)」ファイルを受信するための、コンピュータのシリアルまたはパラレルポートとを備える。好適な装置はまた、音声を認識し、音声に対して音声で答えることができる言語ユーザインタフェースを含む。このようなインタフェースは、話者に依存しない機能を含むが、パーソナル装置をユーザの声または発音の特徴に合わせて調整し、これにより精度を向上させる話者適応が可能である。この話者適応は、装置を最初に使用する前にひとまとまりのセンテンスを繰り返すことによってシステムをユーザの声に適応させ得るプロトコルを通して実現される（Lernout & Hauspie Speech 
Product（ＬＨＳＰ）のasr1000製品ラインを参照）。言語インタフェースは、ユーザが、特殊用語および固有名詞を含む語彙を音声認識アプリケーションに拡張することができる語彙ビルダー（ＬＨＳＰのLextool（商標）を参照）、装置が、例えば、「ホーム」をイー・メールアドレスに関連付けることができるようにするといったようなユーザ定義のコマンドと関連付ける語をユーザが作成することができるユーザ・テンプレート（ＬＨＳＰのasr200製品ライン）、自動車、航空機または公的な場所においても、またユーザがヘッドセットを装着していない場合でさえも言語ユーザインタフェースの精度を向上させる、イー・メールアドレスをつづるため、さらに背景ノイズ許容および遠隔音声ソフトウェアのためのアルファベット認識を含む。（ＬＨＳＰ参照）好適な装置はまた、メッセージデータを暗号化および解読することによって、また安全なデジタル署名または音声署名を用いて送信者のアイデンティティを認証することによって、機密に属する情報の信頼性のある安全な伝送を確実にするように設計された公開鍵暗号化技術を含む。好適な装置はまた、ユーザがネットワークサーバによってまだ音声変換されていないデータをダウンロードして、装置でこれを行うことができるテキスト−音声ユーティリティを含む。好適な装置はまた、ユーザが、ニュース記事、地図、利用可能な音声ファイルのメニューなどの印刷物に関連する印刷されたバーコード、または印刷物に関連する音声ファイルを、インターネットのようなネットワークから自動的に検索するために装置が必要とするネットワーク・サーバ、ファイル・ロケーション及びファイルＩＤを含むすべての情報を装置に与えるトラベル・ガイドの中の印刷されたバーコードをスキャンすることができるバーコードリーダーを含む。好適な装置はまた、ユーザが、ニュース記事、地図、利用可能な音声ファイルのメニューなどの印刷物に関連する印刷されたバーコード、または既に検査薄みの音声ファイルのグループ（Goldberg等に記載されているような）からファイルを再生するために装置が必要とするすべての情報を装置に与えるトラベル・ガイドの中の印刷されたバーコードをスキャンすることができるバーコードリーダーを含む。好適な装置はまた、装置、公衆電話、キオスクまたはユーザのコンピュータとの間の音声ファイルおよび制御コードの高速ローカル無線伝送（例えば、１．２Ｍｂｐｓおよび４Ｍｂｐｓ）のための赤外線データ連合（Infrared Data Associ 
ation、ＩｒＤＡ）などの基準を用いた赤外線インタフェースを含む。好適な装置はまた、オフライン・ブラウザと呼ばれるソフトウェア・ユーティリティを含む。このユーティリティは、ユーザが講読しているネットワークからオフピーク時間中に、または利用可能な新しい音声マテリアルを有する選択されたウェブサイトから、もしくはユーザがオフラインブラウザが検索するようにプログラムしているイー・メールアドレスから、音声ファイルを自動的に取り出すように装置をプログラムする。好適な装置はまた、グラフィック画面ベースのインタフェースによって、または音声プロンプトによって、ユーザが音声ファイルを受信し、また送信するアドレスおよび／またはサイトを求めてインターネット上にあるデータベースなどのネットワークデータベースを閲覧するのを可能にするソフトウェア・ユーティリティを含む。好適な装置はまた、ユーザが自分が音声ファイルを作成したいと思い、また送りたいと思う個人およびグループのイー・メールアドレスを含む自分のイー・メール・アドレスブックへのアクセス、これの更新および／またはダウンロードを行うための、グラフィック・インタフェースおよびメモリを作成するソフトウェア・ユーティリティを含む。このようなユーティリティは、ディクテーションおよび音声メッセージ記録／レビュー装置内のデータをユーザのイー・メールサーバアカウントに含まれるデータに自動的に同期させる。好適な装置はまた、単純な可聴音アラームまたはプログラムされた音声メッセージアラーム（例えば、「家に電話せよ」）間で選択するクロックおよびアラーム機能をオプションとして含み、自分の電話番号、イー・メールアドレス、カレンダー、覚え書きおよびアポイントメントを体系化するためのグラフィック・インタフェースおよびメモリをユーザが作成するソフトウェアユーティリティを含む。好適な装置はまた、ユーザが、インターネットなどの公衆または私設ネットワークを通して利用可能な低ビットレート音声圧縮のための、専有クライアントサーバ・ソフトウェア・システムおよびアップグレードならびに新しく導入された基準をダウンロードして、装置が最新の音声圧縮ソフトウェアを確実に使用し得るようにするソフトウェア・ユーティリティを含む。好適な装置はまた、装置をインターネットへのローカル接続を用いてリアルタイムの二方向、全二重音声会話を行う携帯型インターネット電話装置として使用することができる、アプリケーション・プログラム・インタフェース（ＡＰＩ）を含むがこれに限定されない、音声コンテンツを含む高圧縮および／またはストリーム音声ファイルを装置が受信するのを可能にする、専有クライアントサーバ・ソフトウェア・システムおよびアップグレードならびに新しく導入された基準をユーザがダウンロードするのを可能にするソフトウェア・ユーティリティを含む。好適な装置はまた、ウェブブラウザから作動されるウェブプログラムの機能性を拡張して、音声データなどのデータがユーザのＰＣ内に入るとデータに操作を行って、ユーザが音声ファイルを通信ポートによって、受台内に載置されシリアルまたはパラレルポートに接続された装置に直接向けることができるようにするソフトウェアユーティリティを含む。もしくは、これは、印刷などの指定キーを押すことによってユーザによって始動されると、音声ファイルを装置専用の特別な「プリンタ」ドライバに直接向ける、ＯＬＥ（オブジェクトリンクおよび埋め込み）可能なウェブソフトウェアを介して実現され得る。このユーティリティにより、自分のコンピュータでウェブをブラウズしているユーザは、音声ファイルを直接自分のパーソナル音声サーバにダウンロードして、自分のハードディスクから転送する必要はなく後でアクセスすることが可能になる。好適な装置はまた、ユーザがイー・メールメッセージを選択し、ネットワークに利用可能な適切なテキスト−音声変換アプリケーションによってメッセージをテキストから音声に変換し、その後でデジタル化してデジタル化圧縮音声ファイルとして伝送されるように要求することができるソフトウェア・ユーティリティを含む。本発明はまた、通信リンクに一旦接続されると、ユーザが電話で話している間に、ユーザが他の関連するまたは関連しないデータを処理および／またはデータをネットワークへ受信およびネットワークから送信するのと同時にまたは交互に、音声ファイルを直接ディクテーションおよび音声メッセージ記録装置に転送および受信することができるようにする、ＤＳＶＤ（Digital Simultaneous Voice/Data）および／またはVoiceViewプロトコル（Radish Communications 
Systems，Inc．）を用いる方法およびソフトウェア・ユーティリティに関する。これらのボイス／データプロトコルを使用することにより、ディクテーションおよび音声メッセージ記録／レビュー装置のユーザは、デジタル化ストリームまたはアナログ音声で話される音声プロンプトに応答して音声ファイルを要求し、声による応答、キーパッド入力またはＤＴＭＦトーンにより応答し、そして同じ電話の接続中にこれらのファイルを高速データモードで転送することが可能になる。本発明はまた、ネットワークサーバの要件またはユーザの好みに適合するためにデジタル化音声ファイルのスケーラビリティを可能にする方法およびソフトウェアユーティリティに関連する。これにより、要求された音声ファイルにより高い信憑性を与えるために、もっと低い圧縮率またはもっと遅い伝送レートをサーバが命令するか、またはユーザが要求すること、およびこの逆が可能となる。本発明の特徴は、記録装置が通信リンクに接続されたままにされ、また電話料金が最も低く、入力ラインに過剰な能力が利用できるオフピーク時間にローカル・ネットワーク・アクセスポイントにダイアルして接続するようにプログラムされ得ることである。記録装置は、ユーザが講読している音声ファイル、ユーザによって装置が注意するようにプログラムされているウェブサイトから利用可能な新しい音声ファイル、および選択されたイー・メールアドレスからユーザに送られる音声メールを求めてネットワークをサーチするようにプログラムされる。本発明の特徴は、記録装置が電話機、コンピュータ、携帯電話またはパーソナル・デジタル・アシスタントと通信リンクとの間に接続され、これらの装置のいずれかを使用中にユーザが音声ファイルを選択し、検索することができるように、標準ＲＪ−１１電話ジャックなどのインタフェースポートが提供されることである。ディクテーションおよび音声メッセージ記録／レビュー装置のメモリに記録されたアナログボイス信号にデジタル変換および圧縮を行って、デジタル化ボイスの高密度格納および高速伝送を可能にするための回路が提供されることもまた、本発明の特徴である。同様に、既に格納または受信されたデジタル化音声のアナログ変換および自然音再生のための回路が提供される。例えば現金自動預け払い機に類似した方法で、ユーザが自分の記録／レビュー装置を接続し、記録／レビュー装置によって直接検索され、また伝送される音声メッセージおよび音声テキストを選択することができる、空港及び観光地などの場所に設置される公衆端末を提供することもまた、本発明の特徴である。図面の簡単な説明上記の記載、および本発明の他の目的、特徴および利点は、添付の図面を参照して以下の好適な実施形態の詳細な説明により、もっと完全に理解され得る。図面において、図１は、本発明を具体化する好適なパーソナル音声メッセージプロセッサの概略ブロック図、図２〜図７（図２は図２ａおよび図２ｂを含む）は、所与の処理が図１の装置でどのように行われるかを示すフローチャートである。詳細な説明図１は、本発明を具体化する、現時点で好適なパーソナルボイスサーバ（ＰＶＳ）システム１０の概略ブロック図である。ＰＶＳシステム１０は、大まかに５つの主要な部分；高度に集積されたＤＳＰ／ＲＩＳＣ集積チップ１１（ＤＳＰはDigital Signal Processorの略であり、ＲＩＳＣはRe duced Instruction Set Computerの略である）；テレコム／音声コーデック１７；ＤＳＰチップに結合されたＳＤＲＡＭ１２および／またはフラッシュメモリ１３；マイクロホン２６、スピーカ１８、タッチスクリーン／ディスプレイＬＣＤ１９、赤外線Ｉ／Ｏ２１、およびバーコードリーダ１５などの周辺機器を含む。ＤＳＰがＶ３２ｂｉｓ、Ｖ３４などのモデムルーチン、ボイス認識、エコーキャンセレーションおよび音声合成を扱うようにするためのオペレーティングシステムソフトウェアもまた提供される。ソフトウェアはまた、チップ１１のＲＩＳＣ部分を介してシステムを制御する。この実施形態の装置１０はボイスサーバと呼ばれるが、これは音楽を含む他のタイプの音声のためにも等しく有用であることは明瞭であろう。ＤＳＰチップは好ましくはPhilips 
SemiconductorのＰＲ３１１０チップであり、これは４Ｋｂｙｔｅｓの命令キャッシュおよび１Ｋｂｙｔｅのデータキャッシュを有するＭＩＰＳＲ３０００ＲＩＳＣＣＰＵコア、ならびに多数のシステム構成部品および外部Ｉ／Ｏモジュールにインタフェースするための様々な統合された機能を含む。チップはまた、外部モデムチップセットを必要としないソフトウェア・ファックス／モデムなどのＤＳＰ機能を行うハードウェア乗算／積算ユニットを有する。しかし、チップはまた、ＵＡＲＴ（ユニバーサル非同期型レシーブ・トランスミット：Universal Asynchronous Receive Transmit）インタフェース２２（離して図示）を有し、これにより装置は従来のＲＳ２３２シリアル・コネクタ２３を介して外部モデムまたは他の装置（モデム付き留守番電話機など）に接続され得る。ＰＲ３１１００はまた、外部システムメモリ、キャッシュメモリ、ＣＰＵコアおよび外部Ｉ／Ｏモジュール間でデータを転送する効率的な手段を提供する多重ＤＭＡ（direct memory access）チャネル、および高性能でフレキシブルなバス・インタフェース・ユニット（ＢＩＵ）を含む。ＰＲ３１１００はまた、システム・インタフェース・モジュール（ＳＩＭ）を含み、これは、液晶ディスプレイ（ＬＣＤ）１９、赤外線Ｉ／Ｏモジュール２１およびコーデック１７などの様々な外部Ｉ／Ｏモジュールにインタフェースするための統合された機能を提供する。コーデック１７は好ましくはPhilipsのＵＣＢ１１００シングルチップ集積混合信号音声およびテレコムコーデックであり、これは、音声およびテレコミュニケーション・コーデック（アナログ／デジタル符号化および復号化）機能およびタッチスクリーンのアナログ−デジタル変換、ＩＳＤＮ／高速シリアル・赤外線および無線周辺装置を含むシステムのアナログ機能のほとんどを扱う。図１には離して示されているが、高速シリアル・インタフェース１４は実際にはＵＣＢ１１００の一部である。チップは、マイクロホンおよびスピーカを直接接続するように設計されたシングルチャネル・オーディオコーデックを有する（すなわち、構成部品１６および２８は実際にはＵＣＢ１１００の一部である）。この内蔵テレコミュニケーション・コーデックは、電話線に接続するために従来のＲＪ−１１ジャック２０に直接接続され得る。図１の実施形態をより完全に理解するために、ＰＲ３１１００およびＵＣＢ１１００のデータシートを添付し、これらは本明細書において参照として援用される。ＰＲ３１１００のオペレーティングシステム・ソフトウェアは、好ましくは、イギリスのCheshireのEden Group 
Limitedにより市販されているＥｄｅｎＯＳバージョン２．０である。このオペレーティングシステムはＰＲ３１１００（ＤＩＮＯとも呼ばれる）およびＵＣＢ１１００（ＢＥＴＴＹとも呼ばれる）をサポートするように特別に設計されている。ＥｄｅｎＯＳのデータシートを添付する。これはこのオペレーティングシステムによって提供されるソフトウェア・サポートおよびドライバについて記載している。このデータシートは本明細書において参照として援用されている。メモリ１２、１３はメッセージを格納するため、およびテンポラリ・データを保持するために使用される。フラッシュメモリは、オペレーティングシステム（Ｏ／Ｓ）およびアプリケーション・ソフトウェアを含む必要な永久プログラムの量に応じて、および記録されたメッセージの一部を格納するように構成される。典型的には、ＰＲ３１１００で提供される音声圧縮は、１／２Ｋｂｙｔｅ／秒より小さいデータ帯域幅となる（すなわち、１Ｍｂｙｔｅのメモリは１時間の音声を提供する）。マイクロホン２６およびスピーカ１８は品質およびサイズに基づいて選択される。図２〜図７にフロー図を示し、インターネットを通してメッセージを検索し、これらをＰＶＳにまたＰＶＳから伝送する動作と、回線をつなぎ、データをインターネット内の所定のサーバアドレスから受け取り、格納し、ふるいにかけ、検索し、そしてメッセージをＰＶＳに伝送しＰＶＳから再生するための様々な動作の選択とを記載している。これらの動作には、デジタル形態の圧縮メッセージおよびアナログ形態の音声信号を、スピーカ／マイクロホンおよび電話接続から二方向方式で受け取ることが含まれる。図２ａおよび図２ｂは、ＰＶＳがトランスポート・プロトコルによってインターネット上のロケーションにどのようにしてつながるか、またＰＶＳがどのようにしてそのウェブ／イー・メール・サイトに関連するすべてのデータ（例えば、ＨＴＭＬ言語の表示情報）を得て、専有標準または事実上の標準（例えば、２．５ｋｂｐｓの高圧縮音声）のいずれかを用いて送られたメッセージ（音声、データなど）を受信／格納するかを示すフローチャートである。図２ａおよび図２ｂに示される動作は、ＤＳＰ／ＲＩＳＣのリアルタイム・カーネル（図３を参照してさらに後述する）によって並行して行われる。これにより多重タスクが並行して走り、実行され得る。メイン・タスクの動作はブロック２００で始まる。サイトへのアクセスおよび格納メッセージの格納または受信は他のタスクと並行して行われる。これらのタスクは、ＰＶＳ、またはバーコードリーダ、音声合成器、音声認識の動作などの他のタスクを動作させるために、またはＰＰＰによって他のウェブサイトに同時にアクセスするためにローカルで行うこともできる。ブロック２０２で、所望の動作が市外呼び出しを介してのネットワーク・アクセス・プロバイダへの接続（ブロック２１０）であるかどうかを判定するテストが行われる。ＮＯの場合は、モデムは呼び出し音がなると呼び出しに答え、そのハンドシェーキング手順を完了し、情報の受信を始める（ブロック２０４）。ブロック２２０で、モデムからのデータビットはＤＳＰチップ１１によって受信される。ブロック２３０でＤＳＰチップは入力データを復号する。ブロック２４０で、所望の動作がＨＴＭＬサイトを復号することであるかどうかを判定するテストが行われる。ＮＯの場合は、制御はブロック３４０に移る。さもなくば、ブロック２５０で動作は続き、サイトページの表示が始まる。ブロック２６０で、動作モードが対話型であるか自動であるかを判定するためのテストが行われる。対話型モードでは、ＰＶＳのユーザは、閲覧して、完了すべき所望の動作を選択しなければならない。自動モードでは、音声または他のメッセージを検索するための（１以上の）キーワードがサーチされ、圧縮データを得るために自動的にアクティブにされる。ブロック２６０でのテストで対話型モードであると感知する場合、制御は図２ｂのブロック１１０に移る。ＮＯの場合は、ブロック２７０で始まる自動ブラウジングが行われ、反転されたキーワード記号がサーチされる。ブロック２８０で、キーワードが既にデジタル化されたメッセージに対する要求であるかどうかを判定するテストが行われ、そうである場合は、ブロック２９０で、ＦＴＰプロトコルによって圧縮されたデータがＰＶＳによって受信される。ブロック２８０でのテストの結果が“ｎｏ”である場合、制御はブロック３１０に移行する。ブロック３１０で、メッセージがもう存在しないかどうかを判定するテストが行われ、存在しない場合、制御はブロック１００に戻る。そうでない場合は、ブロック３２０で、キーワードがウェブサーバでのローカル・メッセージ
を格納する場所の要求であるかどうかを判定するテストか行われる。そうである場合は、このデータ、例えば圧縮音声メッセージ、がＰＶＳからウェブサイトに伝送される（ブロック３３０）。そうでない場合、制御は開始（ブロック１００）に戻る。このウェブサイトにＰＶＳ所有者のための格納メッセージがなくなるまでプロセスが続けられる。ブロック３４０で、このサイトがＦＴＰプロトコル言語を利用しているかどうかを判定するテストが行われる。利用している場合、メッセージはＦＴＰを利用して検索され（ブロック３６０）、ブロック３８０で格納され、制御は図２ｂのブロック１２０に移る。ブロック３４０でＦＴＰプロトコルが使用されていないと判定されると、ブロック３４０で、認可されているアクセス言語が受信されているかどうかを判定するテストが行われる。そうである場合は、ブロック３６０で、その認可されているアクセス言語を利用してメッセージが検索され、次にブロック３８０で格納される。次に、制御は図２ｂのブロック１２０に移る。ブロック３５０で、認可されているアクセス言語が見つけれらない場合は、ブロック３７０でユーザに通知され、制御はブロック１００に戻る。ブロック２６０でモードが対話型であると判定された場合、制御は図２ｂのブロック１１０に移る。ブロック１１２で、ウェブページのキーワードが選択され、ブロック１１４で、プール内のメッセージを捜し出すためにＨＴＭＬ解釈がアクティブにされる。ブロック１１６で、次に、メッセージは送信および／または受信され、制御は図２ａのブロック１００に戻る。データを、好ましくは圧縮形態で格納したブロック３８０に続いて、制御は図２ｂのブロック１２０に移る。格納されたすべてのデータは、フラットデータベース内にデータを生成させ（ブロック１２０）、これが後でデータを捜し出すためにサーチされる。メッセージが音声メッセージである場合は、ＦＴＰプロトコルによって伝送されると同時に圧縮および再生される。ブロック１２２でのテストは、現在のメッセージにこのような行為が必要であるかどうかを判定し、必要である場合は、伸長および音声合成器がアクティブにされ（ブロック１２４）、メッセージが合成の準備ができていることを反映するためにデータベースが更新され、制御はブロック１００に戻る。メッセージが伸長および再生されない場合は、制御はブロック１２２からブロック１２８に移る。ここで、メッセージがウェブサーバに送られるかどうかを判定するテストが行われ、ＮＯの場合、制御はブロック１００に戻る。メッセージがウェブサーバに送られる場合、ブロック１３０でＦＴＰによって送られ、ユーザに転送の完了が通知され（ブロック１３２）、その後制御はブロック１００に戻る。図３は、本アプリケーションにおいてＤＳＰ１１のＲＩＳＣコアＣＰＵで作動されるＥｄｅｎＯＳのカーネルの全体的な動作を示す。カーネルは、それぞれがそれ独自の優先度を持ち、他の（子）タスクを始動させることができる多重プログラムまたはタスクを同時に走らせることができるという点でマルチタスキングである。ブロック４００〜４２０を介してカーネルが初期化されると、ブロック４８０でアイドルモードで動作が開始され、ここでＰＶＳはイベントが生じるのを待ち、イベントが生じるとブロック４３０で扱われる。あらゆるプログラムはこのようにして、ブロック４３０でそのタスクに注意を払ってもらうことによってオペレイティングシステムと相互作用する。発生するイベントのタイプは同期または非同期である。ブロック４４０で、同期イベントが検出されると、この同期イベントの処理は連結部５を介して始動される。そうでない場合は、ブロック４５０で非同期イベントを検出するためのテストが行われる。この場合、非同期イベントの処理が連結部６を介して始動される。それぞれの場合において、処理が開始された後、オペレーティングシステムはアイドルモードに戻って他のイベントの処理を行う。発生する他の特殊なイベントはブロック４６０のエラー処理である。ブロック４５０で非同期イベントが検出されない場合、ブロック４６０で故障イベントを検出するためのテストが行われ、検出されない場合、プログラムはアイドルモードに戻る。ハードウェアの故障、通信の故障またはソフトウェアの故障の場合、ブロック４６０でエラーイベントが検出され、ランタイム・ハンドラーが発動され（ブロック４７０）、そのイベントを扱う。次に、制御はアイドルモードに戻る。図３に示した同期および非同期イベントは単に例示的なものであり、各タイプについて他のものがあり得ることは考えられる。図４は、アナログ音声メッセージを記録するときＤＳＰ／ＲＩＳＣチップ１１のコントローラによって行われるルーチンを
示すブロック図である。ブロック７１０で、入力メッセージが内蔵マイクロホンからであるかどうかを判定するテストが行われる。ＮＯの場合、制御は図５のルーチンに移る。そうである場合、音声メッセージはデジタル化および圧縮され（ブロック７２０）、データのワーキングプール内に入れられる（ブロック７３０）。ブロック７４０で、メッセージ全体を格納する前にメモリが一杯であるかどうかを判定するテストが行われる。ＮＯの場合、ルーチンは終了し、制御はアイドルモードに戻る。一杯の場合、記録は不能となり（ブロック７５０）、警告ライトなどによって、メモリが一杯であることがオペレータに通知される。制御はアイドルモードに戻る。図５は、電話線からアナログ音声を記録するために行われるルーチンを示すブロック図である。ブロック８００で、受信されている音声メッセージが通信リンク（電話線）からであるかどうかを判定するテストが行われる。ＮＯの場合、制御は図６のルーチンに移る。そうである場合、メッセージは音声としてテレコム／音声コーデック１７を通って渡され（ブロック８１０）、ブロック８２０で、ＤＳＰ／ＲＩＳＣチップによって圧縮が行われるかどうかを判定するテストが行われる。圧縮が行われる場合、メッセージはローカルメモリに格納され（ブロック８３０）、記録は停止され、制御はアイドルモードに戻される。ＤＳＰ／ＲＩＳＣチップによって圧縮が行われない場合、メッセージはテレコム／音声コーデックに送られ、ここで標準（ＡＤＰＣＭ）アルゴリズムによって圧縮が行われる（ブロック８４０）。次に、メッセージはＤＳＰ／ＲＩＳＣ１１にそのＵＡＲＴを介して戻され（ブロック８５０）、ＤＳＰ／ＲＩＳＣチップの制御によりメッセージはフラッシュメモリ１３に格納される（ブロック８６０）。次に制御はアイドルモードに戻される。図６は、内蔵スピーカを通して格納された音声を再生するために音声／テレコム・コーデックのコントローラによって行われるルーチンのブロック図である。ブロック９００で、オペレータは装置に格納されているメッセージプールからメッセージを選択する。ブロック９１０で、読み出される格納メッセージが元々音声／テレコム/コーデックによって圧縮されたものであるかどうかを判定するテストが行われる。ＮＯの場合、制御はブロック９２０に移る。そうである場合、メッセージは読み出され、音声／テレコム・コーデックを用いて伸長され（ブロック９３０）、伸長されたメッセージは音声／テレコム・コーデックのデジタル 
−アナログ変換器（ＤＡＣ）に与えられる（ブロック９４０）。メッセージは次に、Ｄ／Ａ変換器および増幅器２８を通り、内蔵スピーカ１８を介して再生され（ブロック９５０）、制御はアイドルモードに戻される。格納メッセージが、元々音声／テレコム・コーデックによって圧縮されたものでない場合、ブロック９２０で、格納メッセージが元々音声／テレコム・コーデックによって圧縮されたものであるかどうかを判定するテストが行われる。ＮＯの場合はユーザに通知され（ブロック９６０）、制御はアイドルモードに戻される。そうである場合、メッセージはコントローラによって読み出され（ブロック９７０）、次に伸長のためにモデムに送られ、さらにモデムから音声／テレコム・コーデック１７のＵＡＲＴポートを通ってメモリ１３に戻される。制御はブロック９４０に移り、元々音声／テレコム・コーデックによって圧縮されたメッセージと同じ方法で再生が処理される。図７は、受台に接続されたＰＶＳがどのようにしてＰＣ（マルチメディアの場合もそうでない場合も）または内蔵モデムを有する特別構成のＴＡＤに接続され、ＰＣまたはＴＡＤのユーザ（Ａ）が、ＰＶＳのテレコム／音声コーデック以外のモデムを通してＰＶＳに音声ファイルを送信またはＰＶＳから音声ファイルを受信し得るかを示す概略図である。これによりＰＣユーザは、ＰＶＳに存在する音声ファイルをＰＣモデムを通して送信または添付することが可能となり、同様に、ＰＣユーザがＰＣモデムを通して受信した音声ファイルを直接ＰＶＳにダウンロードすることが可能となる。同じ構成により、非マルチメディアＰＣのユーザ（Ｂ）が非マルチメディアＰＣのモデムを通して受信された音声ファイルを、ＰＶＳのマルチメディア能力を利用して音声ファイルを再生することによって再生することが可能となる。この構成は同様に、ＰＣユーザ（Ｃ）がＰＶＳの内蔵マイクロホンを通して音声を記録し、これをＰＣのモデムを通してファイルとしてまたはストリーム音声として伝送することを可能にする。このような構成はまた、ＰＣのユーザ（Ｄ）が標準のウェブブラウザ・プログラムを使用しながら、音声ファイルを直接ＰＶＳに向け直すことを可能にする。最後に、モデム構成のＴＡＤと同様の構成により、ＴＡＤユーザが音声メッセージをＴＡＤに、およびＴＡＤからＰＶＳにダウンロードすることが可能となる。ＰＣからＰＶＳへの二方向通信は、ＰＣおよびＰＶＳのシリアルＲＳ２３２ポートで通信ケーブル（例えば、９ピンコネクタ）によって処理され、ＵＡＲＴ通信インタフェースからの入出力を制御する非同期イベント・ソフトウェアによって制御される。ＰＣのソフトウェアはＰＣへ、およびＰＣからＰＶＳへデータを送受信するドライバを扱う。データの送信については、ＰＣがデータをファックスまたはプリンタへ送信するのと同様であり、データの受信については、ＰＣがスキャナからデータを受信するのと同様である。このドライバは、動作タイプ、アクノレッジメントおよび「送信終了」の長さおよびウエイトなどのＰＶＳのために必要なパラメータすべてを設定する。ＰＣはまた、ＰＶＳのスピーカが動作するように、マルチメディア音声メッセージを受信するための付加機器（周辺機器）としてＰＶＳを使用するソフトウェアを扱う。ＰＣはまた、ＰＶＳのマイクロホン入力を管理するソフトウェア、およびソフトウェアと完全に融合され、それに応じてコマンドをＰＶＳに発動するために標準のウェブ・ブラウザ（例えばネットスケープ・ナビゲータ）と融合するソフトウェアを処理する。ＰＶＳ内のソフトウェアは、ＰＶＳの非同期イベントソフトウェアの下で制御される手続きコールの遠隔始動（Remote activation of Procedural Calls：ＲＰＣ）を扱うマルチタスク・オペレーティング機能の一部である。本発明の好適な実施形態を例示のために開示したが、当業者であれば、添付の請求の範囲で定義される本発明の範囲および精神から離れることなく多くの追加、改変および置換が可能であることは理解され得る。DETAILED DESCRIPTION OF THE INVENTION Personal Voice Message Processor and Method Field of the invention The present invention generally relates to dictation devices and voice communication devices. 
In particular, it relates to a method and a portable device for voice communication, including the recording and editing of voice mail and audio content and their transmission and reception over private or public networks such as the Internet, using common telecommunications media or data links.

Background of the Invention

All electronic messaging systems other than voice mail have an intermediate device or storage medium: data is transmitted over a standard communication link, preferably at a high rate, and stored on a storage medium or unattended device for later offline access, review and editing by the intended user. In facsimile transmission, an image is scanned by the sending machine, transmitted, and finally printed at the remote site for offline use by the intended recipient. In e-mail, data is generated on a computer, transmitted, and either stored directly on the intended user's unattended computer or stored on a central host computer linked to a computer network, from which the intended user retrieves it later. The most common networks are local area networks (LANs), wide area networks (WANs), and public networks such as the Internet, or private networks. When the intended user accesses his or her computer, the e-mail is either already there, or a message displayed in a graphic editor indicates that mail has arrived and how it can be retrieved. Once retrieved, the e-mail can be read, reviewed and manipulated offline on the intended user's computer. Alternatively, the e-mail can be sent to a printer to provide a hard copy that the user can review at a convenient time. When no facsimile machine is available, a facsimile can be transmitted to a computer, or to a handheld paperless fax device such as the FaxView personal fax reader from Reflection Technology, Inc., for independent offline review by the recipient.
Utilities exist for both facsimile and e-mail messages that allow an authorized user to select messages on the host for subsequent transmission to the user's e-mail address or to an unattended facsimile machine. See, for example, U.S. Patent No. 4,918,722 to Duehren et al. Recently, with the spread and growth of Internet use, and more particularly with the growing popularity of Web sites that publish material as HTML (HyperText Markup Language) documents, utilities have appeared that let a user select such files for later offline access and independent review by fax. See, for example, FactsLine for the Web by Ibex Technologies, Inc. Such utilities make the large amount of information and graphics provided over the Internet available to users who do not have access to an Internet-connected computer, or who want to limit the time they spend online. A large share of potential users either have no Internet access or, even if they do, may be traveling, may be unable to reach their own computer, or may not want to spend time starting a computer and waiting for a Web site's graphics to appear just to navigate to a document or e-mail of interest. (Utilities such as the Web-On Call Voice Browser from Netphonic Communications, Inc. have been introduced that let a user access the Internet in response to voice prompts, identify a document by number, and have the selected document read aloud in real time over the telephone by text-to-speech, sent by fax, or sent as an e-mail attachment.) Similarly, the widespread use of the Internet, and congestion at particularly popular Web sites or during peak usage hours, has created demand for utilities called offline browsers, which let Internet users "subscribe" to particular Web sites.
From such a subscribed Web site, the user's computer automatically retrieves material during off-peak hours and categorizes and organizes new or updated information, which the user can then review offline with the browser of his or her choice (for example, FreeLoader from FreeLoader, Inc.). Similarly, subscription services have been introduced that can send voice mail to an e-mail address and that let audio content offered on a Web site be updated by a standard telephone call to an interactive voice response (IVR) system (for example, "Amail" and "Dialweb" from Telet Communications). More recently, manufacturers of voice processing systems, together representing more than 60% of the world voice mail system market, established a workgroup to develop an interoperability standard for a Voice Profile for Internet Mail (VPIM). TCP/IP (Transmission Control Protocol/Internet Protocol) was chosen as the transport, mainly because the Internet offers globally accessible points of contact and because VPIM builds on generally accepted transmission protocols, specifically the Simple Mail Transfer Protocol (SMTP) and Multipurpose Internet Mail Extensions (MIME) at its core (see Business Wire, April 29, 1996). Once implemented, an interoperability standard such as VPIM will let voice mail users send and receive voice messages over the Internet or an intranet as easily as they do over the telephone today. In addition to voice messaging and voice e-mail over the Internet, the recent introduction of proprietary client-server software systems allows users with a conventional multimedia personal computer and a voice-grade telephone line to browse, select and play audio or audio-based multimedia content as a real-time stream (RE), or to download it on demand (REM). An interested user only needs to download software from the content provider's Web site to access such audio content (for example, RealAudio Player and Server from Progressive Networks). Such systems are a true breakthrough.
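The SMTP/MIME transport described above for VPIM can be illustrated with a minimal sketch: a voice message is simply an e-mail whose body carries an audio part. The sketch below uses Python's standard `email` package. The addresses, the file name, the helper name `build_voice_mail` and the `audio/32kadpcm` subtype are assumptions chosen for illustration; this is not the VPIM profile itself, and actual delivery over SMTP is left out.

```python
from email.message import EmailMessage


def build_voice_mail(sender: str, recipient: str, audio_bytes: bytes,
                     subtype: str = "32kadpcm") -> EmailMessage:
    """Wrap compressed audio in a MIME e-mail (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voice message"
    # Short text part for mail readers that cannot play audio.
    msg.set_content("A voice message is attached.")
    # Audio part; binary payloads are base64-encoded for transport.
    msg.add_attachment(audio_bytes, maintype="audio", subtype=subtype,
                       filename="message.adpcm")
    return msg


# Placeholder bytes stand in for real compressed speech.
msg = build_voice_mail("alice@example.com", "bob@example.com",
                       b"\x00\x01\x02\x03")
wire = msg.as_bytes()  # what an SMTP client would hand to the server
```

Because the result is an ordinary MIME message, any mail server that speaks SMTP can relay it unchanged, which is the interoperability argument the passage makes for building on existing mail protocols.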
Until then, conventional online delivery meant downloading the audio, which could take as much as five times the playing length of the program itself; a listener had to wait 25 minutes to hear 5 minutes of audio. With the availability of audio streaming over the Internet, many companies have introduced Internet telephony products that allow users with multimedia computers running proprietary software to speak in real time over the Internet (see VocalTec). Such a system is useful for long-distance calling when the user can reach a local access point, or point of presence, on the Internet, because the long-distance call then becomes a local call. Similarly, as a result of streaming audio over the Internet, content providers can broadcast live audio from websites (e.g., AudioNet from Cameron Audio Networks). More recently, a standards-based implementation for communicating over the Internet has been introduced and is supported by Intel and Microsoft. It uses DSP Group's TrueSpeech G.723 compression technology, whose advanced algorithms give excellent audio quality despite high compression ratios: it operates at 6.3 kilobits per second (kbps) and 5.3 kbps, with compression ratios of 20:1 and 24:1 respectively. It also includes silence compression, which can bring the effective rate below 3.7 kbps. At a modem speed of 28.8 kbps, audio can therefore be transmitted at a ratio of 1:7.78, that is, 10 minutes of audio in 1.3 minutes. Using Texas Instruments' C80 DSP chip with a V.34 modem operating at 28.8 kbps, a 10:1 audio transmission ratio (10 minutes of audio transmitted in 1 minute) can be achieved with telephone-grade audio quality. As is evident from the above, the transfer of data, graphics, voice messaging and voice content over networks has become more widespread and convenient, but this progress has also drawn attention to the way voice messaging and voice content are transmitted, input and output.
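The transmission ratios quoted in this passage follow from dividing the modem rate by the effective encoded audio rate. The short sketch below reproduces that arithmetic, assuming a 28.8 kbps link and a 3.7 kbps effective audio rate as read from the passage. The function names are illustrative only, and protocol overhead is ignored.

```python
def speedup(link_kbps: float, audio_kbps: float) -> float:
    """Seconds of audio carried per second of link time (overhead ignored)."""
    if link_kbps <= 0 or audio_kbps <= 0:
        raise ValueError("rates must be positive")
    return link_kbps / audio_kbps


def transfer_minutes(audio_minutes: float, link_kbps: float, audio_kbps: float) -> float:
    """Link time needed to move a recording of the given playing length."""
    return audio_minutes / speedup(link_kbps, audio_kbps)


if __name__ == "__main__":
    # Assumed figures: 28.8 kbps modem, 3.7 kbps effective rate after silence compression.
    print(f"ratio 1:{speedup(28.8, 3.7):.2f}")
    print(f"10 minutes of audio in {transfer_minutes(10, 28.8, 3.7):.1f} minutes")
```

By the same calculation, the 10:1 ratio cited for the C80 arrangement corresponds to an effective audio rate of about 2.88 kbps on a 28.8 kbps link.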
Several related drawbacks have been highlighted. As audio messaging and audio content become more widely available, the disadvantages arising from the absence of an intermediate device or storage medium for such audio become more apparent. For e-mail and facsimile, the telephone link is used only to transmit the data and the control codes for that data. As network computing grows and spreads, the telephone link for e-mail and fax (e.g., RADLinx's PASSaFAX) is further limited to a connection to a local point of presence for access to the network. The content of an e-mail or facsimile can be printed by the intended user, who can then carry a hard copy when leaving the office or travelling and review it when convenient. In contrast, voice messages and spoken text are at present recorded by the sender and retrieved by the intended recipient mainly in real time and online. At best, a user can access recorded and stored audio files or streaming audio files from a multimedia notebook computer. Offline access to audio is limited to downloading audio files to a multimedia computer and playing them on a computer equipped with a sound card. However, because they have screens, keyboards and general-purpose processing capabilities, multimedia computers can hardly be reduced to the size of a conventional dictation device or voice recorder. Having to rely on a telephone handset or multimedia computer to create and access voice is like being able to read and edit a facsimile only when near a facsimile machine or fax-enabled computer. The inability to create, review and access network-based voicemail except in real time from a telephone handset, or offline from a multimedia computer, significantly limits the desire to integrate voice messaging and voice content into network-based messaging.
There is no dedicated portable device for storing network-based voice messaging. Nor is there a method or utility for scanning and selecting personal voice messages or public announcements on a networked host so that they can later be transmitted to the device at high speed for offline review. The only dedicated device that lets users review their voice messages offline is the telephone answering device (TAD) for the home or the small office/home office (SOHO), which now mainly uses digital recording technology in place of the standard functions of traditional tape-based answering machines. Because the TAD is plugged into both an electrical outlet and a telephone jack and is not portable, the user must either be within earshot of the TAD's speaker or telephone it to retrieve messages online and in real time. Traditionally, TADs have provided only very limited outgoing-message capabilities, and the owner has had to record the outgoing messages (e.g., a general greeting or a caller-specific or mailbox-specific message) through the TAD's built-in microphone or over a real-time telephone call. It is unfortunate that voice-based messaging, whether network-based or TAD-based, is limited to online, real-time transmission and requires physical access to a telephone, TAD or multimedia computer, particularly since humans need no external hardware or equipment other than the mouth and ears to produce or hear speech. Voice is the most natural and self-sufficient form of communication. It makes no demands on the user who is entering or retrieving it: no writing instruments, keyboard or screen, and no dedicated visual attention or hand-eye coordination, are needed. That voicemail is nevertheless so widely used is due not to any perception that current technology is adequate but to these unique characteristics of speech.
Similarly, the vast number of innovative utilities that make audible sounds and sounds available through public and private networks have been introduced due to the powerful nature of audible sounds and sounds in content, messaging and command issuance. And only emphasizes the need to make audible sounds and sounds more readily available. Until voice messaging and voice content become more accessible, many of the network-based voice utilities described above will continue to be new to technology enthusiasts. Much has been said about computer telephone integration (CTI) and universal mailboxes. In these, network-based messages and content may be generated on any medium and by any selected input device, and may also be retrieved on any medium or by any selected output device. The fax is accessed as data on a computer screen, and the data may be accessed as a fax or text-to-speech text. Also, as automated voice recording utilities become more competent, voice will be accessed like printed text in email or fax. However, the desire for audio as the medium of choice will be similarly severely limited, unless the audio has a selected input / output device other than a telephone handset or screen / keyboard based multimedia computer. Since speech is a direct recording of the user's voice, its enthusiastic assertions, meanings and emotions are not lost. Similarly, informational text (info-) is required for timely data about meetings, speeches and radio broadcasts, since so much data is only spoken first and then converted to text or data. text) should be the preferred medium. Ideally, voicemail is the preferred mode of communication when traveling, communicating through standard time zones, and accessing timely information spoken (eg, meeting or lecture minutes). There should be. 
Speech text (ie, data or text uttered by a computer or pre-recorded by a person) can be used for motor skills and video use, such as when driving, operating equipment, or performing leisure activities. Should be in a format suitable for messaging the information to be accessed if it is inconvenient or fails. The use of current telephones for direct access to voice messages has severely limited the potential use of voice messaging. The real-time transmission of voice messages and information-text makes the recording and retrieval of voicemail, especially from long distances, very expensive. This high cost and inconvenience means that voice mail and information-text cannot be created and reviewed in a cost-effective manner and at the user's own pace. People are restricted to places and conditions that are accessible by telephone and, in the case of wireless communication links, places where wireless transmission is possible and desirable. Applying a multimedia computer to compose and review voice mail is almost inexpensive to use with keyboards, pointing devices and screens, and because of the size and cost of multimedia computers it can be spread and transported It is unlikely to make voice messaging more convenient, since it is unlikely to be. At present, voicemail is limited to short messages between individuals who want to contact each other in a more realistic way at another time (phone tags). Voice "mail" is limited to voice "messaging" because listening to a long and full "mail" over the phone or on a multimedia computer is costly and inconvenient for both the sender and recipient Is starting to be. Further, the audio signal may be transmitted through a direct communication link to the user's audio processor or TAD and transmitted in real time only when the user has access to the telephone (as opposed to recording off-peak during off-peak hours). 
These costs make more commercial uses of info-text (e.g., instructions recorded on "tape", recorded travelogues, records of speeches, articles or books), and other innovative advertiser- or subscriber-supported uses of spoken text, infeasible. Recently, U.S. Pat. No. 5,444,768, issued to Charles Lamer et al. and assigned to International Business Machines Corp., and a U.S. patent issued to Shmuel Goldberg et al. and assigned to Espro Engineering, Inc., have each disclosed portable computer devices that audibly process voice messages held at one or more remote central messaging facilities. In the system of Lamer et al., a user can record and play voice messages, and can send (upload) them to and receive (download) them from a central messaging facility over a communication link to the portable device. However, the system of Lamer et al. requires that a direct telephone link be established between the portable device and the one or more remote central messaging facilities. Lamer et al. and Goldberg et al. allow portable devices to access conventional, closed and costly proprietary voice processing systems individually through direct communication links. They do not provide a commercially viable way of accessing voicemail other than by long-distance calls to a central messaging facility. The cost of these long-distance charges will keep long-term use of systems such as that of Lamer et al. out of reach. In addition, such systems require the user to contact one or more remote central messaging facilities in order to retrieve and transmit selected audio files. The inconvenience of such a polling procedure negates the convenience the system provides. Similarly, systems such as that of Lamer et al. provide no way for a user to view the available audio content on the portable computing device, or to select an audio file from a menu for later retrieval.
Similarly, systems such as Lamer are dedicated to either connecting a portable computing device directly to the computer or detecting and recording DT MF tones generated locally by a standard touch-tone telephone device. Other than by initiating a "training" mode, the user remotely accesses a central server linked to the server's network, downloads control codes, and searches the personal user groove or public database for addresses. Does not provide a utility for Because a typical user's mailbox utility is processed on the user's network e-mail server and modified in the process of sending and receiving e-mail, such dedicated training tools for portable computing devices Sessions are impractical. Similarly, new voice server platforms, utilities and compression mechanisms are regularly introduced, requiring a dynamic and transparent way to update both control codes and address books without the need for dedicated training sessions It is. SUMMARY OF THE INVENTION Broadly, the object of the present invention is to allow a user to be offline, from any location, performing any activity, at a slow pace, without incurring telephone charges, and that the communication link is currently accessible. It is an object of the present invention to provide an Internet-enabled dictation device and voice message recording / review device and method capable of creating and reviewing voicemail, whether or not possible. Preferably, the telephone link to the local network access point is primarily used as a communication link for high speed transmission of pre-recorded materials and control codes to facilitate this transmission, thereby providing a telephone or multimedia computer and voice It is also an object of the present invention to limit the use of a telephone line for messaging in a telephone as a recording or playback device. 
Pre-message handshaking occurs between the dictation and voice message recording / review device and the network server to adapt the digitized voice signal to one of the standard voice compression protocols and one of the TCP / IP protocol stacks, and It is also an object of the present invention to provide a protocol that facilitates high speed transmission of voice messages. Another object of the present invention is a portable dedicated voice enabled network (Internet) access device that allows a user to record, edit and play audio files that can be transmitted and / or received over a public or private network. To provide. An owner of a telephone answering machine (M-TAD) with a special modem configuration accesses the compressed voice message file and transfers it from the digital memory of the TAD, either by a direct cable connection to the TAD or by a telephone link. It is also an object of the present invention to provide a portable access device and method that allows for direct download to a message recording / playback device. By providing such a portable access device and method, the TAD owner can encourage the sender to leave a richer, data-rich voice message on his TAD, and the TAD owner can TAD periodically sends it in compressed digital format and subscribes to audio content that can be downloaded to the device of the present invention and played and reviewed at a convenient time and place. This also allows TAD owners to establish a telephone link with their TAD on their portable dictation and voice message recording / reviewing device when not at home or office, making all stored messages economical and automatic. A search can be performed to update all outgoing messages (eg, general and sender-specific greetings). All stored messages and out-of-town greetings are then transmitted in a digitized compressed format. 
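The pre-message handshaking mentioned at the start of this passage, in which the device and the network server settle on one of the standard voice compression protocols, can be pictured as a simple capability match. The sketch below is only an illustration under stated assumptions. The codec labels, the `negotiate_codec` function and the idea of a device-side preference order are hypothetical and are not taken from the specification.

```python
from typing import Iterable, Optional, Set


def negotiate_codec(device_preferences: Iterable[str],
                    server_supported: Set[str]) -> Optional[str]:
    """Return the first codec, in the device's order of preference,
    that the server also supports, or None if there is no overlap."""
    for codec in device_preferences:
        if codec in server_supported:
            return codec
    return None


if __name__ == "__main__":
    # Hypothetical labels: the device prefers the most compressed format first.
    device = ["g723.1-5.3k", "g723.1-6.3k", "pcm-64k"]
    server = {"pcm-64k", "g723.1-6.3k"}
    print(negotiate_codec(device, server))
```

Placing the most compact format first in the device's list reflects the stated aim of high-speed transmission. A real exchange would also have to agree on the transport, which this sketch leaves out.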
The present invention is directed to voice-to-text, text-to-speech and other communications that allow a user to receive from a remote host computer located on a public or private network and subsequently transmit to a remote host computer over a communication link, such as a public switched telephone system. Provide a low cost portable recording and playback dictation and voice message recording / reviewing device that can record, edit, play and review voice messages including voice material. A preferred device includes its own rechargeable power supply, integrated circuit and control buttons, and audio signals via a built-in speaker, microphone or plug-in headset, foot pedal, and removable memory card. Recording, editing, storing, playing and recording locally. The device also includes a detachable PCMCIA connector to which a standard RJ-11 telephone jack, modem chipset (or software), or standard or wireless modem card can be connected, and which connects audio signals to a public or private network. DTMF tone decoder that allows transmission and control to and from the host computer. The apparatus includes circuitry capable of transmitting and receiving audio signals at a substantially faster rate than when initially recorded. Preferred devices also allow network users to use standard services such as the TCP / IP suite of Simple Mail Transfer Protocol (SMTP), Post Office Protocol (POP3) and Multipurpose Internet Mail Extension (MIME) to provide Internet services. Accessing the network directly from a provider (ISP) access point and local access point, such as a shell account, to an audio file (or data that could also be translated into speech) sent to the user's email address It includes a processor with the necessary terminal emulation to review, select and search text files) and to enable downloading and transmission of such files. 
Preferred devices also include a standard or touch screen display and similar graphics for creating and reading e-mail messages displayed on their computer screens when users access their e-mails. Displaying an editor, where the user wants to scroll through his email message, select the audio file he wants to download, and have it converted to an audio format (text-to-speech) by a network server or at the device. Software that allows you to select the text message you want. The preferred device also has a cradle for placing the device, a telephone jack to enable the establishment of a communication link, and a file directly to the computer, having a port that can be connected to a power source to recharge the battery of the device. A serial or parallel port on the computer for downloading and uploading and for receiving "redirected" files. Preferred devices also include a language user interface capable of recognizing speech and answering the speech verbally. Such interfaces include speaker independent functions, but allow speaker adaptation to adjust the personal device to the user's voice or pronunciation characteristics, thereby improving accuracy. This speaker adaptation is achieved through a protocol that allows the system to adapt to the user's voice by repeating a batch of sentences before the first use of the device (the asr1000 product line of the Lernout & Hauspie Speech Product (LH SP)). See). The language interface is a vocabulary builder (see Lextool (TM) of LHSP) that allows users to extend vocabulary, including special terms and proper nouns, to speech recognition applications. 
A user template (LHSP's asr200 product line) where the user can create words to associate with user-defined commands, such as being able to be associated with an address, in a car, airplane or public place; Includes e-mail address spelling, further background noise tolerance and alphabet recognition for remote voice software, improving the accuracy of the language user interface even when the user is not wearing a headset. (See LHSP) The preferred device also provides for authenticity of sensitive information by encrypting and decrypting message data and by authenticating the identity of the sender using a secure digital or voice signature. Includes public key encryption techniques designed to ensure some secure transmission. Preferred devices also include a text-to-speech utility that allows a user to download data that has not yet been transcribed by the network server and do so on the device. The preferred device also allows the user to automatically print printed barcodes related to prints, such as news articles, maps, menus of available audio files, or audio files related to prints, from a network such as the Internet. A barcode reader that can scan printed barcodes in a travel guide that gives the device all information including the network server, file location and file ID needed by the device to retrieve it Including. The preferred device also allows the user to print a bar code associated with a print, such as a news article, a map, a menu of available audio files, or a group of already inspected audio files (such as those described in Goldberg et al.). A barcode reader that can scan printed barcodes in a travel guide that gives the device all the information the device needs to play the file from the device. Suitable devices also include high-speed local wireless transmission of audio files and control codes between the device, a payphone, a kiosk or a user's computer (eg, 1. 
Includes an infrared interface using standards such as Infrared Data Association (IrDA) for 2 Mbps and 4 Mbps. Preferred devices also include a software utility called an offline browser. This utility can be used during off-peak hours from the network to which the user is subscribing, or from selected websites with new audio material available, or emails that the user is programming to search through an offline browser Program the device to automatically retrieve the audio file from the address. The preferred device also allows a user to browse a network database, such as a database on the Internet, for an address and / or site to receive and send audio files by a graphical screen based interface or by voice prompts. Includes software utilities that enable The preferred device also provides access to, updates to, and / or updates to, an e-mail address book that contains e-mail addresses of individuals and groups that the user wishes to create and send audio files to. Or a software utility that creates a graphic interface and memory for downloading. Such utilities automatically synchronize the data in the dictation and voice message recording / review device with the data contained in the user's email server account. The preferred device also optionally includes a clock and alarm function to select between a simple audible alarm or a programmed voice message alarm (eg, "call home"), your phone number, email address , A graphical interface for organizing calendars, notes and appointments, and software utilities that allow the user to create memory. Preferred devices also allow users to download proprietary client-server software systems and upgrades and newly introduced standards for low bit rate audio compression available through public or private networks, such as the Internet, Includes software utilities to ensure that you can use the latest audio compression software. 
Preferred devices also include an application program interface (API) that allows the device to be used as a portable Internet telephone device for real-time two-way, full-duplex voice conversations using a local connection to the Internet. To allow users to download proprietary client-server software systems and upgrades and newly introduced standards that allow devices to receive highly compressed and / or streamed audio files, including, but not limited to audio content. Includes software utilities that enable The preferred device also extends the functionality of a web program run from a web browser to operate on the data as it enters the user's PC, such that the user can transfer the audio file through the communication port. , A software utility that allows it to be directed directly to a device mounted in the cradle and connected to a serial or parallel port. Alternatively, this may be via OLE (object linking and embedding) web software that, when initiated by the user by pressing a designated key such as printing, directs the audio file directly to a special "printer" driver dedicated to the device. Can be realized. The utility allows users browsing the web on their computers to download audio files directly to their personal audio server and access them later without having to transfer them from their hard disk. The preferred device also allows the user to select an e-mail message, convert the message from text to speech by a suitable text-to-speech conversion application available on the network, and then digitize and transmit as a digitized compressed audio file Includes software utilities that can be requested to be The present invention also provides that once connected to a communication link, the user can process other related or unrelated data and / or receive and transmit data to and from the network while the user is talking on the phone. 
Simultaneously or alternately, the Digital Simultaneous Voice / Data (DSVD) and / or VoiceView protocol (Radish Communications Systems, Inc.) allows audio files to be transferred and received directly to dictation and voice message recording devices. ) And software utilities. By using these voice / data protocols, a user of the dictation and voice message recording / review device can request a voice file in response to a voice prompt spoken in a digitized stream or analog voice, a voice response, It responds with keypad input or DTMF tones and allows these files to be transferred in high-speed data mode during the same phone connection. The present invention also relates to methods and software utilities that enable scalability of digitized audio files to meet network server requirements or user preferences. This allows the server to command a lower compression rate or a slower transmission rate, or request the user, and vice versa, to give the requested audio file higher credibility. A feature of the present invention is that the recording device remains connected to the communication link and dials and connects to the local network access point during off-peak hours when telephone charges are lowest and excess capacity is available on the input line. It can be programmed as The recording device can be a voice file to which the user is subscribed, a new voice file available from a website programmed by the user to alert the device, and voice mail sent to the user from the selected email address. It is programmed to search the network for A feature of the present invention is that a recording device is connected between a telephone, computer, cell phone or personal digital assistant and a communication link, and the user can select an audio file and retrieve while using any of these devices. An interface port such as a standard RJ-11 telephone jack is provided. 
It is also a feature of the present invention that circuitry is provided to digitize and compress the analog voice signals recorded in the memory of the dictation and voice message recording/review device, enabling high-density storage and high-speed transmission of digitized voice. Circuitry is likewise provided for converting digitized audio that has been stored or received back to analog form and reproducing it as natural sound. It is a further feature of the present invention to provide public terminals, installed in places such as airports and tourist sites in a manner analogous to automatic teller machines, to which users can connect their own recording/review devices and select voice messages and text to be retrieved and transmitted directly by the recording/review device.

BRIEF DESCRIPTION OF THE FIGURES

The above description, as well as other objects, features and advantages of the present invention, will be more fully understood from the following detailed description of the preferred embodiments when taken in conjunction with the accompanying drawings. In the drawings: FIG. 1 is a schematic block diagram of a preferred personal voice message processor embodying the present invention; FIGS. 2 through 7 (FIG. 2 comprising FIGS. 2a and 2b) are flowcharts showing how operations are performed by the device of FIG. 1.

DETAILED DESCRIPTION

FIG. 1 is a schematic block diagram of a presently preferred personal voice server (PVS) system 10 embodying the present invention. The PVS system 10 comprises roughly five main parts: a highly integrated DSP/RISC chip 11 (DSP stands for Digital Signal Processor, RISC for Reduced Instruction Set Computer); an audio codec 17; SDRAM 12 and/or flash memory 13 coupled to the DSP chip; and peripherals such as microphone 26, speaker 18, touch screen/display LCD 19, infrared I/O 21 and bar code reader 15.
Operating system software is also provided, allowing the DSP to handle modem routines such as V.32bis and V.34, voice recognition, echo cancellation and speech synthesis. The software also controls the system via the RISC part of chip 11. Although the device 10 of this embodiment is referred to as a voice server, it will be clear that it is equally useful for other types of audio, including music. The DSP chip is preferably a Philips Semiconductors PR31100 chip, which has a MIPS R3000 RISC CPU core with 4 Kbytes of instruction cache and 1 Kbyte of data cache, and includes various integrated functions for interfacing with numerous system components and external I/O modules. The chip also has a hardware multiply/accumulate unit for DSP functions, such as a software fax/modem that requires no external modem chipset. The chip nevertheless also has a UART (Universal Asynchronous Receiver/Transmitter) interface 22 (shown separately), which allows the device to connect to an external modem or other device (such as an answering machine with a modem) via a conventional RS-232 serial connector 23. The PR31100 also includes multiple direct memory access (DMA) channels and a high-performance, flexible bus interface unit (BIU), which provide an efficient means of transferring data between external system memory, cache memory, the CPU core and external I/O modules. The PR31100 also includes a system interface module (SIM), which provides integrated functions for interfacing to various external I/O modules such as the liquid crystal display (LCD) 19, infrared I/O module 21 and codec 17. Codec 17 is preferably a Philips UCB1100 single-chip integrated mixed-signal voice and telecom codec, which includes voice and telecommunication codec (analog/digital encoding and decoding) functions, touch-screen analog-to-digital conversion, and ISDN.
It handles most of the analog functions of the system, including the high-speed serial, infrared and wireless peripheral interfaces. Although shown separately in FIG. 1, high-speed serial interface 14 is actually part of the UCB1100. The chip has a single-channel audio codec designed for direct connection of a microphone and speaker (i.e., components 16 and 28 are actually part of the UCB1100). Its built-in telecommunication codec can be connected directly to a conventional RJ-11 jack 20 for connection to a telephone line. For a more complete understanding of the embodiment of FIG. 1, the data sheets for the PR31100 and UCB1100 are attached and are incorporated herein by reference. The operating system software for the PR31100 is preferably Eden OS version 2.0, marketed by Eden Group Limited of Cheshire, UK. This operating system is specially designed to support the PR31100 (also called DINO) and the UCB1100 (also called BETTY). The Eden OS data sheet, which describes the software support and drivers provided by this operating system, is attached and incorporated herein by reference. Memories 12 and 13 are used to store messages and to hold temporary data. The flash memory is sized according to the amount of permanent program code required, including the operating system (O/S) and application software, and also stores some of the recorded messages. Typically, the audio compression provided by the PR31100 results in a data bandwidth of less than 1/2 Kbyte/sec (i.e., 1 Mbyte of memory provides 1 hour of audio). Microphone 26 and speaker 18 are selected on the basis of quality and size. FIGS. 2 to 7 are flowcharts showing the operations for retrieving messages via the Internet and transferring them to and from the PVS: connecting the line, receiving data from a predetermined server address on the Internet, and storing, screening, searching and selecting, together with the various actions for transmitting and replaying messages to and from the PVS.
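The storage figure given in the memory discussion above (a data bandwidth below 1/2 Kbyte/sec, with 1 Mbyte holding 1 hour of audio) can be checked with a short calculation. The sketch below assumes a binary megabyte of 1,048,576 bytes. The function names are illustrative and do not come from the specification.

```python
MEGABYTE = 1024 * 1024  # bytes; assumes the binary definition of a megabyte


def playing_seconds(memory_bytes: float, bytes_per_second: float) -> float:
    """Playing time that fits in memory at a constant encoded data rate."""
    if bytes_per_second <= 0:
        raise ValueError("data rate must be positive")
    return memory_bytes / bytes_per_second


def rate_for_duration(memory_bytes: float, seconds: float) -> float:
    """Encoded data rate (bytes/s) at which the recording exactly fills memory."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return memory_bytes / seconds


if __name__ == "__main__":
    at_half_kbyte = playing_seconds(MEGABYTE, 512) / 60
    hour_rate = rate_for_duration(MEGABYTE, 3600)
    print(f"1 Mbyte at 512 bytes/s: {at_half_kbyte:.1f} minutes")
    print(f"1 hour in 1 Mbyte needs {hour_rate:.0f} bytes/s "
          f"({hour_rate * 8 / 1000:.2f} kbps)")
```

At exactly 512 bytes per second, 1 Mbyte holds about 34 minutes. An hour per megabyte implies roughly 291 bytes per second, or about 2.3 kbps, which is consistent with "less than 1/2 Kbyte/sec".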
These operations include receiving compressed messages in digital form and audio signals in analog form, in both directions, over the speaker/microphone and telephone connections.

FIGS. 2a and 2b are flowcharts showing how the PVS is connected to a location on the Internet by a transport protocol, and how the PVS handles all data related to its web/e-mail site (e.g., HTML display information) in order to receive and store messages (voice, data, etc.) sent using either proprietary or de facto standards (e.g., 2.5 kbps high-compression voice). The operations shown in FIGS. 2a and 2b are performed in parallel by the DSP/RISC real-time kernel (described further below with reference to FIG. 3), which allows multiple tasks to run in parallel. The operation of the main task begins at block 200. Accessing the site and storing or receiving stored messages is performed in parallel with other tasks. Those other tasks may be local, such as operating the PVS itself, a barcode reader, the speech synthesizer or speech recognition, or may access other websites simultaneously over PPP.

At block 202, a test is performed to determine whether the desired action is to connect to a network access provider via a toll call, in which case the call is placed at block 210. If not, the modem answers the call when it rings, completes its handshaking procedure, and begins receiving information (block 204). At block 220, data bits from the modem are received by DSP chip 11. At block 230, the DSP chip decodes the input data. At block 240, a test is performed to determine whether the desired action is to decode the HTML site. If not, control transfers to block 340. Otherwise, operation continues at block 250, where display of the site page begins. At block 260, a test is performed to determine whether the mode of operation is interactive or automatic. In the interactive mode, the PVS user must browse and select the desired operation to be completed.
In the automatic mode, the keyword(s) identifying voice or other messages are searched for and activated automatically in order to obtain the compressed data. If the test at block 260 detects the interactive mode, control passes to block 110 of FIG. 2b. If not, automatic browsing begins at block 270 to search for inverted keyword symbols. At block 280, a test is performed to determine whether the keyword is a request for an already digitized message; if so, at block 290, the compressed data is received by the PVS via the FTP protocol. If the result of the test at block 280 is "no", control transfers to block 310, where a test is performed to determine whether any messages remain; if none remain, control returns to block 100. Otherwise, at block 320, a test is performed to determine whether the keyword is a request for a location at the web server in which to store a local message. If so, the data, for example a compressed voice message, is transmitted from the PVS to the website (block 330). Otherwise, control returns to the start (block 100). The process continues until the website holds no more stored messages for the PVS owner.

At block 340, a test is performed to determine whether the site uses the FTP protocol language. If so, the message is retrieved using FTP (block 360) and stored at block 380, and control passes to block 120 of FIG. 2b. If block 340 determines that FTP is not being used, a test is performed at block 350 to determine whether an authorized access language has been received. If so, the message is retrieved at block 360 using the authorized access language and then stored at block 380, and control transfers to block 120 of FIG. 2b. If the authorized access language is not found at block 350, the user is notified at block 370 and control returns to block 100.

If the mode is determined to be interactive at block 260, control transfers to block 110 of FIG. 2b.
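The retrieval branch of FIG. 2a described above (blocks 340 to 380) can be sketched as a small decision function. This is an illustrative sketch only: the function name, the parameter names and the use of callables for the transfer, storage and notification steps are assumptions and do not appear in the disclosure. The return value is the number of the block to which control passes next.

```python
from typing import Callable, List


def retrieve_message(
    site_uses_ftp: bool,
    authorized_access_received: bool,
    fetch_ftp: Callable[[], bytes],
    fetch_authorized: Callable[[], bytes],
    store: Callable[[bytes], None],
    notify_user: Callable[[str], None],
) -> int:
    """Follow blocks 340 to 380 and return the next block number."""
    if site_uses_ftp:                       # block 340: FTP in use?
        message = fetch_ftp()               # block 360: retrieve by FTP
    elif authorized_access_received:        # block 350: authorized access language?
        message = fetch_authorized()        # block 360: retrieve with that language
    else:
        notify_user("no supported access method")  # block 370
        return 100                          # back to the start
    store(message)                          # block 380: store the message
    return 120                              # continue in FIG. 2b


# Hypothetical usage with stand-in callables.
stored: List[bytes] = []
next_block = retrieve_message(
    site_uses_ftp=True,
    authorized_access_received=False,
    fetch_ftp=lambda: b"compressed voice",
    fetch_authorized=lambda: b"",
    store=stored.append,
    notify_user=print,
)
print(next_block, len(stored))
```

Passing the transfer steps in as callables keeps the decision logic separate from any particular network library, which the description does not specify.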
At block 112, a keyword on the web page is selected, and at block 114, HTML interpretation is activated to locate the message in the pool. At block 116, the message is then sent and/or received, and control returns to block 100 of FIG. 2a.

Following block 380, where the data is stored, preferably in compressed form, control transfers to block 120 of FIG. 2b. Each item of stored data causes an entry to be generated in a flat database (block 120), which can later be searched to locate the data. A voice message may be decompressed and played back as it is transferred by the FTP protocol. The test at block 122 determines whether such action is required for the current message. If so, the decompression and speech synthesizer are activated (block 124) so that the message is ready for synthesis, the database is updated to reflect that this has been done, and control returns to block 100. If the message is not to be decompressed and played, control transfers from block 122 to block 128, where a test is performed to determine whether the message is to be sent to the web server. If not, control returns to block 100. If so, the message is sent by FTP at block 130, the user is notified that the transfer is complete (block 132), and control then returns to block 100.

FIG. 3 shows the overall operation of the Eden OS kernel, which in this application is run by the RISC core CPU of the DSP 11. The kernel is multitasking: several programs or tasks can run at the same time, each with its own priority and each able to start other (child) tasks. Once the kernel has been initialized in blocks 400 to 420, operation begins in idle mode at block 480, where the PVS waits for an event to occur; the event is handled at block 430. Any program thus interacts with the operating system by drawing attention to its task at block 430. An event may be synchronous or asynchronous.
If a synchronous event is detected at block 440, processing of the synchronous event is triggered via connector 5. Otherwise, a test is performed at block 450 to detect an asynchronous event, in which case processing of the asynchronous event is started via connector 6. In each case, once processing has started, the operating system returns to idle mode to handle further events. Errors are a further, special kind of event: if no asynchronous event is detected at block 450, a test is performed at block 460 to detect a fault event, and if none is found, the program returns to idle mode. In the case of a hardware failure, communication failure or software failure, an error event is detected at block 460 and a runtime handler is invoked (block 470) to handle it, after which control returns to idle mode. The synchronous and asynchronous events shown in FIG. 3 are merely exemplary, and other events of each type are contemplated.

FIG. 4 is a block diagram showing the routine performed by the controller of the DSP/RISC chip 11 when recording an analog voice message. At block 710, a test is performed to determine whether the input message comes from the built-in microphone. If not, control proceeds to the routine of another figure. If so, the voice message is digitized and compressed (block 720) and placed into a working pool of data (block 730). At block 740, a test is performed to determine whether the memory became full before the entire message was stored. If not, the routine ends and control returns to idle mode. If so, recording is disabled (block 750), the operator is notified that the memory is full, for example by a warning light, and control returns to idle mode.

FIG. 5 is a block diagram showing the routine performed to record analog audio from a telephone line. At block 800, a test is performed to determine whether the voice message being received comes from the communication link (telephone line).
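The event dispatch of FIG. 3 described above (blocks 430 to 480) can be illustrated with a short sketch. It models only the classification of events into synchronous, asynchronous and error types and the return to idle mode. It does not represent the Eden OS API, and all names used here (EventKind, Event, run_kernel and the handler table) are hypothetical. A real kernel would block while idle; this sketch simply stops when the queue is empty so that it can be run as written.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Deque, Dict


class EventKind(Enum):
    SYNCHRONOUS = auto()
    ASYNCHRONOUS = auto()
    ERROR = auto()


@dataclass
class Event:
    kind: EventKind
    payload: str


Handler = Callable[[Event], None]


def run_kernel(events: Deque[Event], handlers: Dict[EventKind, Handler]) -> int:
    """Dispatch queued events in the order of FIG. 3; return how many were handled."""
    handled = 0
    while events:                                    # idle mode (block 480)
        event = events.popleft()                     # an event occurs (block 430)
        if event.kind is EventKind.SYNCHRONOUS:      # test at block 440
            handlers[EventKind.SYNCHRONOUS](event)
        elif event.kind is EventKind.ASYNCHRONOUS:   # test at block 450
            handlers[EventKind.ASYNCHRONOUS](event)
        elif event.kind is EventKind.ERROR:          # test at block 460
            handlers[EventKind.ERROR](event)         # runtime handler (block 470)
        else:
            continue                                 # nothing detected: back to idle
        handled += 1
    return handled


# Hypothetical usage: every handler just prints what it received.
queue: Deque[Event] = deque(
    [Event(EventKind.ASYNCHRONOUS, "UART byte"), Event(EventKind.ERROR, "line drop")]
)
table: Dict[EventKind, Handler] = {
    kind: (lambda e, kind=kind: print(kind.name, e.payload)) for kind in EventKind
}
run_kernel(queue, table)
```

The explicit if/elif chain mirrors the order of the tests in the figure; a dictionary lookup alone would give the same result but would hide that ordering.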
If not, control proceeds to the routine of another figure. If so, the message is passed as voice through the telecom/voice codec 17 (block 810), and a test is performed at block 820 to determine whether compression is performed by the DSP/RISC chip. If it is, the message is stored in local memory (block 830), recording is stopped, and control returns to idle mode. If compression is not performed by the DSP/RISC chip, the message is sent to the telecom/voice codec, where it is compressed by a standard (ADPCM) algorithm (block 840). The message is then returned to the DSP/RISC 11 via its UART (block 850) and stored in the flash memory 13 under the control of the DSP/RISC chip (block 860). Control then returns to idle mode.

FIG. 6 is a block diagram of the routine performed by the voice/telecom codec controller to play a stored voice message through the built-in speaker. At block 900, the operator selects a message from the message pool stored on the device. At block 910, a test is performed to determine whether the stored message to be retrieved was originally compressed by the voice/telecom codec. If not, control transfers to block 920. If so, the message is read and decompressed using the voice/telecom codec (block 930), and the decompressed message is provided to the digital-to-analog converter (DAC) of the voice/telecom codec (block 940). The message then passes through the D/A converter and amplifier 28 and is played back through the internal speaker 18 (block 950), and control returns to idle mode. If the stored message was not originally compressed by the voice/telecom codec, a test is performed at block 920 to determine whether it was originally compressed by the modem. If not, the user is notified (block 960) and control returns to idle mode.
If so, the message is read by the controller (block 970), sent to the modem for decompression, and returned from the modem to memory 13 through the UART port of voice/telecom codec 17. Control then passes to block 940, where playback proceeds in the same manner as for a message originally compressed by the voice/telecom codec.

FIG. 7 is a schematic diagram showing how the PVS, seated in its cradle, is connected to a PC (multimedia or not) or to a specially configured telephone answering device (TAD) with a built-in modem, so that the PC or TAD user can (A) send audio files to, or receive them from, the PVS through a modem other than the PVS telecom/voice codec. As a result, the PC user can transmit or attach an audio file held in the PVS through the PC modem and, likewise, can download an audio file received through the PC modem directly to the PVS. With the same configuration, the user of a non-multimedia PC can (B) play an audio file received through the modem of that PC by using the multimedia capability of the PVS. This arrangement also allows the PC user to (C) record audio through the built-in microphone of the PVS and transmit it as a file or as streaming audio through the PC modem. It further allows the PC user to (D) redirect an audio file directly to the PVS using a standard web browser program. Finally, a similar configuration with a modem-equipped TAD allows a TAD user to download voice messages to the TAD and from the TAD to the PVS.

Two-way communication between the PC and the PVS is carried by a communication cable (e.g., with a 9-pin connector) between the serial RS232 ports of the PC and the PVS, and is controlled by the asynchronous event software that governs input and output at the UART communication interface. The PC software includes drivers that send and receive data between the PC and the PVS.
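The playback routine of FIG. 6 described above amounts to choosing a decompression path according to how the message was originally compressed. The sketch below illustrates that choice under stated assumptions: the Compressor enum, the StoredMessage record and the two decompressor callables are hypothetical, and block 920 is taken here to test for modem compression, which is an interpretation based on the statement that the message is sent to the modem for decompression. The labels "codec" and "modem" follow the wording of the description and not any real driver interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Compressor(Enum):
    CODEC = "codec"      # compressed by the voice/telecom codec
    MODEM = "modem"      # compressed on the modem side
    UNKNOWN = "unknown"  # neither test succeeds


@dataclass
class StoredMessage:
    data: bytes
    compressed_by: Compressor


def prepare_playback(
    msg: StoredMessage,
    codec_decompress: Callable[[bytes], bytes],
    modem_decompress: Callable[[bytes], bytes],
) -> Optional[bytes]:
    """Return samples ready for the DAC (block 940), or None if the user must be notified."""
    if msg.compressed_by is Compressor.CODEC:    # test at block 910
        return codec_decompress(msg.data)        # block 930
    if msg.compressed_by is Compressor.MODEM:    # test at block 920
        return modem_decompress(msg.data)        # block 970
    return None                                  # block 960: notify the user


# Hypothetical usage with trivial stand-in decompressors.
samples = prepare_playback(
    StoredMessage(b"\x01\x02", Compressor.CODEC),
    codec_decompress=lambda d: d * 2,
    modem_decompress=lambda d: d[::-1],
)
print(samples is not None)
```

Both successful paths converge on the same output, matching the description that playback from block 940 onward is identical regardless of which component performed the decompression.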
The transmission of data is similar to a PC transmitting data to a fax or a printer, and the reception of data is similar to a PC receiving data from a scanner. The driver sets all parameters required by the PVS, such as operation type, acknowledgment, and "end of transmission" length and weight. The PC also runs software that uses the PVS as an additional (peripheral) device for receiving multimedia voice messages, so that they are played through the PVS speaker. The PC further runs software that manages the microphone input of the PVS, and software that integrates with a standard web browser (e.g., Netscape Navigator) to issue the corresponding commands to the PVS. The software in the PVS is part of a multitasking operating function that handles remote procedure calls (RPC) under the control of the PVS asynchronous event software.

While preferred embodiments of the present invention have been disclosed for purposes of illustration, those skilled in the art will appreciate that numerous additions, modifications and substitutions may be made without departing from the scope and spirit of the invention as defined in the appended claims.

[Written Amendment] Patent Act, Article 184-8, Paragraph 1
[Submission date] March 10, 1999 (1999.3.10)
[Content of amendment]
Claims
1. A portable device for communicating audio signals in analog and digital form and storing said signals, comprising: digital storage means; a communication connection to a communication channel; a telecommunications interface having a communication input/output coupled to said communication connection and a digital input/output; an analog-to-digital converter having an output coupled to said storage means; and a controller coupled to said storage means and to the digital input/output of said telecommunications interface, the controller comprising: means for detecting whether a signal on said communication connection is an analog audio signal or a digital audio signal; and routing means controlled by said detecting means and coupled to said telecommunications interface, said storage means and said analog-to-digital converter, the routing means coupling the digital output of said telecommunications interface to said storage means when said detecting means detects a digital signal and, when said detecting means detects an analog signal, causing the signal on said connection to bypass said telecommunications interface and coupling the signal to said analog-to-digital converter for subsequent storage in said storage means.
2. The device of claim 1, wherein the coupling to said storage means is made via a device that compresses the signal before storage.
3. The device of claim 1, wherein said controller further comprises: means for assembling a digital message stored in said storage means into a packetized data stream including data and control bits; and means for coupling the packetized data stream to the digital input of said telecommunications interface for transmission over said communication channel.
4. The device of claim 3, wherein said controller causes said telecommunications interface to transmit the packetized data stream at a rate substantially higher than the transmission rate of digitized voice.
5. The device of claim 1, further comprising a connection to a digital communication channel and an interface between these and with said controller.
6. The device of claim 1, wherein said digital communication channel and the interface corresponding to it are designed to handle infrared communication.
7. The device of claim 1, further comprising a bar code reader coupled to said controller.
8. The device of claim 1, further comprising an LCD touch screen coupled to said controller.
9. A portable device for communicating audio signals in analog and digital form and storing said signals, comprising: digital storage means; a connection to a communication channel; a telecommunications interface having an analog input/output coupled to said connection and a digital input/output; and a controller coupled to said storage means and to said telecommunications interface, the controller comprising: means for assembling a digital message stored in said storage means into a packetized data stream including data and control bits; and means for coupling the packetized data stream to the digital input of said telecommunications interface for transmission over said communication channel.
10. The device of claim 9, wherein said controller causes said telecommunications interface to transmit the packetized data stream at a rate substantially higher than the transmission rate of digitized voice.
11. The device of claim 9, wherein said controller includes a module that detects receipt of an HTML-language message on said communication channel and enables two-way communication in that language.
12. The device of claim 9, wherein said controller includes a module that detects receipt of an FTP-language message on said communication channel and enables two-way communication in that language.
13. The device of claim 9, wherein said controller further comprises a speech synthesizer that, in response to receipt of text information over said communication channel, generates an audible message imitating the text information as spoken by a human voice.
14. The device of claim 9, wherein said controller further comprises a database management module for receiving information about stored data and enabling selective retrieval of that information.
15. A method of communicating audio signals in analog and digital form over a communication channel and storing said signals, comprising the steps of: detecting whether a signal on the channel is an analog audio signal or a digital audio signal; upon detecting a digital signal on the channel, storing in digital storage means the output of a telecommunications interface of the type having an input coupled to the channel and a digital output; and upon detecting an analog signal on the channel, converting the signal from analog to digital form and storing the converted signal in the digital storage means.
16. The method of claim 15, wherein the signal is compressed prior to either of said storing steps.
17. The method of claim 15, performed using a telecommunications interface of the type having an analog input/output coupled to said channel and a digital input/output, further comprising the steps of: assembling a digital message stored in said storage means into a packetized data stream including data and control bits; and coupling the packetized data stream to the digital input of said modem for transmission over said communication channel at a rate substantially higher than the transmission rate of digitized voice.
18. A method of communicating audio signals in analog and digital form over a communication channel and storing said signals, performed using a telecommunications interface of the type having an analog input/output coupled to the channel and a digital input/output, the method further comprising the steps of: assembling a digital message stored in storage means into a packetized data stream including data and control bits; and coupling the packetized data stream to the digital input of said modem for transmission over the communication channel at a rate substantially higher than the transmission rate of digitized voice.
19. A portable device enabling a user to record, edit, play back and review voice messages and other audio material that may be received from a remote device over a communication link and subsequently transmitted to the remote device, comprising: a receptacle for a power source; an integrated circuit, powered from the receptacle, for locally recording, editing, storing and playing back audio signals; non-volatile storage means, access to which is controlled by the integrated circuit; a built-in speaker and a built-in microphone, each coupled to the integrated circuit, for audible playback and local input of audio; a telecommunications interface chipset coupled to the integrated circuit; and a modular telephone jack coupled to said modem chipset; wherein the integrated circuit operates the device so as to transmit and receive audio signals at a rate substantially higher than that at which they were originally recorded.
20. The device of claim 19, wherein the integrated circuit includes a module operative to distinguish between analog signals and digital signals received on the communication link, the analog signals being passed to the integrated circuit without being processed by the telecommunications interface chip.
21. The device of claim 19, wherein the integrated circuit includes a module that enables communication, via the communication link, over the Internet using at least one available protocol.
22. The device of claim 19, wherein the integrated circuit includes a module that recognizes a signal received over the communication link as text and converts it into a signal imitating the voice of a person speaking the text.

Continued from the front page
(51) Int.Cl.⁷, identification symbol, FI, theme code (reference): H04M 11/10; H04L 11/20 102A
(81) Designated states: EP (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OA (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, TG), AP (GH, GM, KE, LS, MW, SD, SZ, UG, ZW), EA (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE, GW, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU, ZW

Claims

[Claims]
1. A portable device for communicating audio signals in analog and digital form and storing said signals, comprising: digital storage means; a communication connection to a communication channel; a telecommunications interface having a communication input/output coupled to said communication connection and a digital input/output; an analog-to-digital converter having an output coupled to said storage means; and a controller coupled to said storage means and to the digital input/output of said telecommunications interface, the controller comprising: means for detecting whether a signal on said communication connection is an analog audio signal or a digital audio signal; and routing means controlled by said detecting means and coupled to said telecommunications interface, said storage means and said analog-to-digital converter, the routing means coupling the digital output of said telecommunications interface to said storage means when said detecting means detects a digital signal and, when said detecting means detects an analog signal, causing the signal on said connection to bypass said telecommunications interface and coupling the signal to said analog-to-digital converter for subsequent storage in said storage means.
2. The device of claim 1, wherein the coupling to said storage means is made via a device that compresses the signal before storage.
3. The device of claim 1, wherein said controller further comprises: means for assembling a digital message stored in said storage means into a packetized data stream including data and control bits; and means for coupling the packetized data stream to the digital input of said telecommunications interface for transmission over said communication channel.
4. The device of claim 3, wherein said controller causes said telecommunications interface to transmit the packetized data stream at a rate substantially higher than the transmission rate of digitized voice.
5. The device of claim 1, further comprising a connection to a digital communication channel and an interface between these and with said controller.
6. The device of claim 1, wherein said digital communication channel and the interface corresponding to it are designed to handle infrared communication.
7. The device of claim 1, further comprising a bar code reader coupled to said controller.
8. The device of claim 1, further comprising an LCD touch screen coupled to said controller.
9. A portable device for communicating audio signals in analog and digital form and storing said signals, comprising: digital storage means; a connection to a communication channel; a telecommunications interface having an analog input/output coupled to said connection and a digital input/output; and a controller coupled to said storage means and to said telecommunications interface, the controller comprising: means for assembling a digital message stored in said storage means into a packetized data stream including data and control bits; and means for coupling the packetized data stream to the digital input of said telecommunications interface for transmission over said communication channel.
10. The device of claim 9, wherein said controller causes said telecommunications interface to transmit the packetized data stream at a rate substantially higher than the transmission rate of digitized voice.
11. The device of claim 9, wherein said controller includes a module that detects receipt of an HTML-language message on said communication channel and enables two-way communication in that language.
12. The device of claim 9, wherein said controller includes a module that detects receipt of an FTP-language message on said communication channel and enables two-way communication in that language.
13. The device of claim 9, wherein said controller further comprises a speech synthesizer that, in response to receipt of text information over said communication channel, generates an audible message imitating the text information as spoken by a human voice.
14. The device of claim 9, wherein said controller further comprises a database management module for receiving information about stored data and enabling selective retrieval of that information.
15. A method of communicating audio signals in analog and digital form over a communication channel and storing said signals, comprising the steps of: detecting whether a signal on the channel is an analog audio signal or a digital audio signal; upon detecting a digital signal on the channel, storing in digital storage means the output of a telecommunications interface of the type having an input coupled to the channel and a digital output; and upon detecting an analog signal on the channel, converting the signal from analog to digital form and storing the converted signal in the digital storage means.
16. The method of claim 15, wherein the signal is compressed prior to either of said storing steps.
17. The method of claim 15, performed using a telecommunications interface of the type having an analog input/output coupled to said channel and a digital input/output, further comprising the steps of: assembling a digital message stored in said storage means into a packetized data stream including data and control bits; and coupling the packetized data stream to the digital input of said modem for transmission over said communication channel at a rate substantially higher than the transmission rate of digitized voice.
18. A method of communicating audio signals in analog and digital form over a communication channel and storing said signals, performed using a telecommunications interface of the type having an analog input/output coupled to the channel and a digital input/output, the method further comprising the steps of: assembling a digital message stored in storage means into a packetized data stream including data and control bits; and coupling the packetized data stream to the digital input of said modem for transmission over the communication channel at a rate substantially higher than the transmission rate of digitized voice.
19. A portable device enabling a user to record, edit, play back and review voice messages and other audio material that may be received from a remote device over a communication link and subsequently transmitted to the remote device, comprising: a receptacle for a power source; an integrated circuit, powered from the receptacle, for locally recording, editing, storing and playing back audio signals; non-volatile storage means, access to which is controlled by the integrated circuit; a built-in speaker and a built-in microphone, each coupled to the integrated circuit, for audible playback and local input of audio; a telecommunications interface chipset coupled to the integrated circuit; and a modular telephone jack coupled to said modem chipset; wherein the integrated circuit operates the device so as to transmit and receive audio signals at a rate substantially higher than that at which they were originally recorded.
20. The device of claim 19, wherein the integrated circuit includes a module operative to distinguish between analog signals and digital signals received on the communication link, the analog signals being passed to the integrated circuit without being processed by the telecommunications interface chip.
21. The device of claim 19, wherein the integrated circuit includes a module that enables communication, via the communication link, over the Internet using at least one available protocol.
22. The device of claim 19, wherein the integrated circuit includes a module that recognizes a signal received over the communication link as text and converts it into a signal imitating the voice of a person speaking the text.