GB2368441A - Voice to voice data handling system - Google Patents

Publication number
GB2368441A
Authority
GB
United Kingdom
Prior art keywords
voice
sub
facility
demands
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0026158A
Other versions
GB0026158D0 (en)
Inventor
Jonathan Paul Richings
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to GB0026158A
Publication of GB0026158D0
Publication of GB2368441A
Withdrawn

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/26Devices for calling a subscriber
    • H04M1/27Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/253Telephone sets using digital voice transmission
    • H04M1/2535Telephone sets using digital voice transmission adapted for voice communication over an Internet Protocol [IP] network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/60Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6033Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
    • H04M1/6041Portable telephones adapted for handsfree use
    • H04M1/6075Portable telephones adapted for handsfree use adapted for handsfree use in a vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Navigation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A voice to voice data handling system comprises a multiplicity of mobile, e.g. automobile borne, sub-systems linked to a remote Internet Server by way of individual GSM and GPRS facilities 21, 23. Each sub-system has a hands-free facility 11 with a microphone 13 and a speaker 15. Each sub-system has a 'ThinSR' facility 19 capable of recognizing a limited range of simple pre-programmed voice commands and of otherwise transmitting the command to the Server. A 'ThickSR' facility of the Server, with greater power of command interpretation, responds, if successful in its recognition of the command, by causing the required information to be transmitted, through the Internet, to the relevant mobile sub-system.

Description

Voice Responsive Data Handling Systems
This invention relates to voice responsive data handling systems.
According to the invention, a voice responsive data handling system is constituted as a system as set out in the claims or any of them of the claims schedule hereof and the substance of said claims and their interdependencies are, notionally, set out at this place, also.
The system is intended, primarily, to provide, in a mobile (typically automotive) environment, means for enabling such matters as voice control of navigation, personal information management, and internet/intercom access.
It is particularly, though by no means exclusively, concerned to provide voice to voice communication. By "voice to voice communication" is meant the response to spoken commands with spoken responses.
Typically, the system consists of a local mobile processing sub-system with GSM communications and GPS and a remote host server.
The system is capable of providing, amongst other things:
  • voice to voice navigation instructions, including traffic information and location based services;
  • voice to voice personal information management data such as, for example, diary and/or address and telephone data;
  • voice to voice email, and speech activated telecom dialling;
  • voice over IP.
The system advantageously relies on GPRS technology.
Although voice responsive data processing systems are becoming available in various applications, they are costly and, in mobile environments, they suffer as a result of space limitations and power requirements. This limits functionality of such systems to simple voice dialling.
Systems in accordance with the present invention enable mobile local data processing systems of limited functionality to access remote speech recognition servers. It also allows speech control of server functions such as navigation, email, web browsing and voice mail.
In a mobile environment, a car environment in particular, voice control is advantageous. Visual displays of the head-down type, e.g. dashboard panel displays, are, with the vehicle in motion, often impractical, hazardous even. There are many circumstances which call for a facility for communication by the person in control of a vehicle, even when the vehicle is in motion, some of which are mentioned above.
Such applications have two requirements in common: first, access to Internet-based systems and, second, a sophisticated user interface. The user interface requirement of mobile applications can best be fulfilled by a wholly speech orientated interface. The capabilities of practical mobile devices are, for the foreseeable future, at least, far from adequate for enabling voice control of such functions.
By distributing speech recognition between local mobile and remote server data processing systems, as now proposed, simple commands may be handled locally, more complex demands being passed to the more powerful server for processing and communication back to the calling location.
The distributed processing is mirrored in the applications to be controlled by the voiced demands: simple functions, such as dialling a mobile phone, are handled by the mobile local data processor, whereas more complex functions, such as navigation and route planning, are dealt with remotely by the server, and the solution of such complex functions is communicated to the caller in speech.
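The split between local and remote recognition described above can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the command vocabulary, the function names (`thin_sr`, `route_command`) and the stand-in `fake_server` are all assumptions made for the example, and recognition is reduced to matching text against a small fixed set rather than processing audio.

```python
from typing import Callable, Optional

# Hypothetical local vocabulary: the description names DIAL as an example of a
# simple command; the other entries are assumptions for illustration.
LOCAL_COMMANDS = {"dial", "answer", "hang up"}


def thin_sr(utterance: str) -> Optional[str]:
    """Stand-in for the local 'Thin SR': recognise only pre-programmed prompts."""
    normalised = utterance.strip().lower()
    return normalised if normalised in LOCAL_COMMANDS else None


def route_command(utterance: str, send_to_server: Callable[[str], str]) -> str:
    """Handle a command locally when possible, otherwise pass it to the server."""
    command = thin_sr(utterance)
    if command is not None:
        return f"local:{command}"
    # Not within local competence: forward to the remote recogniser.
    return f"remote:{send_to_server(utterance)}"


def fake_server(utterance: str) -> str:
    """Hypothetical stand-in for the remote recogniser."""
    return f"handled '{utterance}'"
```

Calling `route_command("DIAL", fake_server)` stays on the local path, while a longer request such as a route query falls through to the server stand-in.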
The accompanying figure is a block schematic diagram showing a sub-system of a system in accordance with the invention, being one of a large number of essentially similar sub-systems linked for communication with a common server by way of the Internet and being, in consequence, subject to Internet protocols.
Expressions, and abbreviations therefor, employed in the ensuing description for various next subordinate level component parts of the sub-system are next set out, together with a short note regarding their several purposes and/or functions in relation to one another.
"Hands-Free Unit" comprises electronic means, commonly found in in-car and office telephone systems, enabling telephone conversations to be held without the need for handset or headset. The speaker of the Unit is to be capable of being heard when several inches away from the user, and, likewise, the microphone is to be sufficiently sensitive to pick up the sound of the user's voice from a similar distance. The Unit often incorporates electronics to filter echoes and background noise that may be present.
"Codec" (Coder/Decoder) comprises means adapted to translate data in either sense between analogue audio and the digital equivalent.
"Voice-Over-Internet-Protocol" (VOIP) comprises a secondary level of voice encoding that provides compatibility with communication via the Internet, with the advantage of free access to long distance telephone calls.
"Thin Speech Recognition" ("Thin SR") comprises a logic sub-section adapted to recognize pre-programmed voice prompts. The term 'thin' implies that the range of voice prompts is confined to a small number of short sounds. This limited voice recognition facility is a feature shared with various present day mobile phones. The process of recognizing a voice sample may be governed by various algorithms with which the system is programmed, the specific algorithms employed varying according to the specific speech recognition concepts implemented in the system. All speech recognition techniques rely on some form of mechanism to break the sample down into discrete analysable parts, and it is this pre-processed data that is to be sent elsewhere in the system, to a 'FatSR', in the event that the 'Thin SR' is unable to respond positively to the voice input. It is of secondary importance which algorithm is employed; what is important is that the pre-processing reduces to the minimum the amount of data that needs to be passed for processing at the 'FatSR', hereinafter referred to, and, so, reduces the air-time data traffic.
"Fat Speech Recognition" ('FatSR') comprises, as implied above, a sub-system performing processes similar to those employed in 'Thin SR', though with very much greater processing power and memory, this in order to enable it to recognize and to respond to a very much greater range of commands addressed to it. 'FatSR' has the potential, also, to access huge independent data resources, databases especially, in order to respond appropriately. It is, in practice, an Internet connected server and, so, is capable of accessing many other resources such as, for example, routing servers and on-line phone books.
"Global System for Mobile Communication" ('GSM') is a communication network the role of which is given in the name of this network.
"General Packet Radio Service" ('GPRS') section of the system is a packet switched overlay subsystem of GSM. It offers both additional functionality of GSM hardware and additional capability of GSM network.
Its importance resides in enabling reliable non-voice data communication over the GSM.
"Text-to-Speech sub-system" refers to a state of the art technology sub-system capable of taking a text string (such as may be written into a word processor) and converting this into synthesized voice audio, the benefit of the 'Text-to-Speech' facility being that, as a text string, the data occupies much less data storage (and, therefore, 'airtime') than its audio, digital or analogue, equivalent.
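The saving in transmitted data can be illustrated with a rough calculation. The figures below are assumptions chosen for the example, not values from the specification: the audio is taken to be uncompressed mono at 8,000 samples per second with one byte per sample, and the instruction is assumed to take two seconds to speak.

```python
def text_bytes(message: str) -> int:
    """Size of the message when sent as UTF-8 text."""
    return len(message.encode("utf-8"))


def audio_bytes(duration_s: float, sample_rate_hz: int = 8000,
                bytes_per_sample: int = 1) -> int:
    """Size of uncompressed mono audio of the given duration."""
    return int(duration_s * sample_rate_hz * bytes_per_sample)


instruction = "Turn left in two hundred metres"
spoken_duration_s = 2.0  # assumed time taken to speak the instruction

print(text_bytes(instruction))         # 31
print(audio_bytes(spoken_duration_s))  # 16000
```

Under these assumptions the text form is roughly 500 times smaller than the audio form. A compressing voice codec would narrow the gap, but the text string would still be expected to be much the smaller of the two.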
The diagram shows the sub-system in its several next-subordinate component parts. These comprise, as indicated, a hands-free unit 11 linked with a microphone 13 and a speaker 15; a Codec 17; a 'ThinSR' facility 19; a GSM facility 21, which includes, as a part thereof, a GPRS facility 23; a VOIP facility 25; and a Text-to-speech facility 27, the several said component parts of the sub-system being linked with one or more other such parts by way of data transmission paths as shown, and the GSM facility 21 being in communication with the remote System Server (not shown), the latter being, as noted previously, provided with the 'ThickSR' facility.
All of the aforestated facilities may be state of the art.
So, for example, the hands-free unit 11 may be of a sort which is commonplace, particularly in in-car telephone systems. It comprises the electronics necessary to permit conversations to be conducted without the need for any sort of handset or headset. Typically, a hands-free unit, as 11, incorporates electronics (not shown) adapted to filter echoes and background noise, being an unwanted side effect arising from spacing from the user of a suitably sensitive microphone 13 and suitably powerful speaker 15.
The hands free unit 11 communicates with the Codec 17, where the analogue audio speech of the hands free unit is digitized. In the example, the Codec 17 serves only in the conversion: analogue to digital. The term 'Codec' is commonly employed even where, as here, only one of its functions is exercised.
Digitized speech data is passed from the Codec 17 to data discriminator means 19, the 'Thin SR'. It is the role of the 'Thin SR' facility 19 to recognize pre-programmed voice prompts. The appellation 'Thin' implies, in this application, that the ability of the discriminator means 19 is purposely limited to a small number of simple, particularly short, sounds. This feature is characteristic of voice recognition capabilities in certain presently available mobile phones. 'Voice Recognition' embraces a range of different processes respectively characteristic of different recognition techniques known to the art.
Speech recognition methods rely on characteristic algorithms in accordance with which speech samples are broken down into discrete analysable components, and central to the present invention is the recognition at the 'Thin SR' facility 19 as to whether the sample is capable of being processed locally as 'Thin SR' or, as will be made clear hereinafter, is to be passed, by the GSM 21, for processing at a remote server, as 'Thick SR'.
For the purposes of the invention, it is not to the point which particular recognition technique may be employed. What is to be borne in mind at all times in implementing the system of the present invention is the current hardware costs involved in local processing of speech samples in the mobile environment, e.g. in an automobile, as compared with the costs incurred in 'air time' involved in remote processing, at a 'Thick SR' Server, of such data samples.
Simple commands, such, for example, as DIAL, are recognizable by the discriminator means, the 'Thin SR' facility 19, locally processed outputs from the 'Thin SR' facility 19 being passed by way of the GSM 21 and/or the GPRS facility 23 and the Internet, to the hands-free unit 11 of another, remote, sub-system (not shown) of the system, the digitized voice command being converted to its analogue equivalent by means of a text to speech means, as 27. More complex commands are passed to the VOIP facility 25 and then from the GPRS 23 and GSM 21 to the 'Fat SR' facility of the remote system Server. Commands interpreted at the Server produce an appropriate response, this being transmitted from the aerial of transmitter means of the Server to the aerial of receiver means of an addressed sub-system, and thence to the Hands-free facility 11 of the calling sub-system, the speaker 15 thereof being appropriately activated, this by way of the GSM and GPRS facilities 21 and 23 and the Text to Speech facility 27 of the calling sub-system.
The system may, as previously indicated, be constituted as a remote processing, voice controlled, navigation system, involving the combination of local with remote speech recognition. Local processing involves high local hardware costs and may be otherwise unacceptable for space and weight reasons. Split processing allows for low local hardware costs and reduced space and weight penalties. For example, processing the echo and noise reduction permits increased compression and the reduction of transmitted data. Possible reduction of command speech to phoneme level would reduce transmission traffic yet further.
From another viewpoint, the invention encompasses the combination of remote and local navigation processing. Following appropriate spoken commands, a route (being a list of topographical details) may be calculated remotely at the server, the route list being passed to a local data processing means. The local processor compares GPS data with the route list and activates the loudspeaker of the hands-free unit of the sub-system, thereby to provide the driver with driving instructions as to the route to be taken. Various events prompt the route list to be recalculated. These events may represent deviations, local or remote, from the list previously presented.
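The local comparison of GPS data with a route list might be sketched as below. The route format (`Waypoint`), the two distance thresholds and the deviation test are assumptions made for illustration; the specification does not say how a route list is encoded or how a deviation is detected. The distance calculation is the standard haversine formula.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Waypoint:
    lat: float
    lon: float
    instruction: str  # text to be spoken when the vehicle nears this point


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance in metres."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def check_position(route: List[Waypoint], next_index: int, lat: float, lon: float,
                   announce_m: float = 100.0,
                   off_route_m: float = 500.0) -> Tuple[Optional[str], bool]:
    """Return (instruction to speak or None, whether to request a recalculation)."""
    target = route[next_index]
    if distance_m(lat, lon, target.lat, target.lon) <= announce_m:
        return target.instruction, False
    # Crude deviation test (assumption): far from every remaining waypoint.
    nearest = min(distance_m(lat, lon, w.lat, w.lon) for w in route[next_index:])
    return None, nearest > off_route_m
```

A returned instruction would be handed to the text-to-speech facility, and a `True` flag would prompt a request to the server for a fresh route list. A practical system would measure distance to the road geometry between waypoints rather than to the waypoints alone.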
Yet again, the invention extends, in its range of application, to the combination of remote and local personal information management planning. An example might be an address book seamless in both locally and remotely derived information. Seamless in that the information may incorporate such matters as a local address book, an office address book, together with information derived from such sources as Directory Enquiries.
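One way to picture such a seamless address book is as an ordered chain of lookups, tried in turn until one succeeds. The sketch below is illustrative only: the source labels, the names and the telephone numbers are made-up placeholders, and in the described system the office and directory sources would be reached through the server rather than held in local dictionaries.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each source maps a name to a number, or returns None if it has no entry.
Lookup = Callable[[str], Optional[str]]


def make_book(entries: Dict[str, str]) -> Lookup:
    """Wrap a dictionary as a case-insensitive lookup function."""
    lowered = {name.lower(): number for name, number in entries.items()}
    return lambda name: lowered.get(name.lower())


def find_number(name: str, sources: List[Tuple[str, Lookup]]) -> Optional[Tuple[str, str]]:
    """Try each source in order; return (source label, number) for the first hit."""
    for label, lookup in sources:
        number = lookup(name)
        if number is not None:
            return label, number
    return None


# Hypothetical data: local book first, then sources reached via the server.
sources = [
    ("local", make_book({"Alice": "0100 000001"})),
    ("office", make_book({"Bob": "0100 000002"})),
    ("directory", make_book({"Carol": "0100 000003"})),
]
```

The caller sees a single `find_number` call regardless of where the entry was found, which is the sense in which the book is seamless.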
Other functionality (not necessarily voice-in/voice-out in character) provided by the distributed system to any of its many sub-systems might include control of in-vehicle systems, remote vehicle diagnostics and service scheduling.
The functioning of the several above sub-systems in the operation of the system is further explained by discussion of a few examples:
A. Local Voice Recognition Operation (Mobile Originated)
A simple operation may be to command the system to dial the on-board phone in preparation for a speech call. In this case, the user's voice command passes from the microphone 13 of the hands free unit 11 to the Codec 17 where it is converted to a digital equivalent. The digitized voice command passes to the 'ThinSR' facility 19 where an attempt is made to 'recognize' the command. If successful, the 'ThinSR' facility 19 controls the GSM facility 21 to dial the number entered on the on-board phone.
B. Remote Voice Operation (Mobile Operated)
If the 'ThinSR' facility 19 fails to recognize the voice command then the voice data (or partially analysed equivalent) is passed by way of the GPRS facility 23 of the GSM facility 21, being transmitted therefrom to the remote Server, incorporating the 'FatSR' facility, which, as with the 'ThinSR' facility 19, attempts to recognize the voice command. If this proves to be successful, then, depending upon the command, the appropriate response is made, the on-board Text-to-Speech facility 27 causing activation of the on-board speaker 15, accordingly.
C. Remote Voice Response Operation (Server Operated)
In this mode, if at the remote Server the 'FatSR' facility thereof recognizes a command to provide, as an example, relatively complex navigation data for the attention of the user, such recognized command is passed to a mapping server facility (not represented) and this last mentioned server responds by computing appropriate vehicle routing information, passing such computed information back to the 'FatSR' facility in the form of a multiplicity of text strings constituting routing instructions. From the 'Fat SR' facility, the routing instructions are transmitted by way of the GPRS facility 23 of the GSM facility 21 and then transmitted for translation, as before mentioned, at the text-to-speech sub-system 27 for relaying to the user by means of the on-board speaker arrangement 15.
Alternatively, if the 'ThinSR' facility 19 is not able to interpret a request to call a certain telephone or like address, in that that address is not stored locally, the greater resources of the 'FatSR' facility at the remote Server can be deployed, using its own information or information retrieved from an extra-system database for such information, the data so retrieved being communicated, as before, by way of the GPRS facility 23 of the GSM facility 21, ultimately for translation at the text-to-speech facility 27 and voice synthesized reproduction, as before, at the speaker 15. Thereafter, the user, being provided with the requisite information, may effect connection with the desired phonic address by recourse to the GSM network.
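The navigation case of example C can be sketched as a server-side handler that turns a recognised command into text strings, which the sub-system then speaks. Everything named here is hypothetical: the "navigate to" command phrasing, the `MappingServer` interface and the stand-in `demo_mapping_server` are assumptions for illustration, since the specification does not define a command grammar or a mapping server interface.

```python
from typing import Callable, List

# Hypothetical mapping-server interface: destination in, text instructions out.
MappingServer = Callable[[str], List[str]]


def fat_sr_handle(recognised_command: str, mapping_server: MappingServer) -> List[str]:
    """Turn a recognised command into text strings for the sub-system's text-to-speech."""
    prefix = "navigate to "
    if recognised_command.lower().startswith(prefix):
        destination = recognised_command[len(prefix):]
        return mapping_server(destination)
    return ["Sorry, that command was not understood."]


def speak_all(strings: List[str], speak: Callable[[str], None]) -> None:
    """On the sub-system: pass each received string to the text-to-speech facility."""
    for s in strings:
        speak(s)


def demo_mapping_server(destination: str) -> List[str]:
    """Stand-in mapping server returning made-up instructions."""
    return ["Head north", f"Continue to {destination}"]
```

Only the short text strings cross the air interface in this arrangement; the conversion to audio happens on board, consistent with the role given to the text-to-speech facility 27.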
D. VOIP Voice Communication
As an alternative to normal GSM voice communication, voice data might be routed to the VOIP 25, then transmitted, via the GPRS facility 23 of the GSM facility 21, to the Internet and, finally, to a VOIP of another telecoms receiving system. Incoming VOIP encoded voice data is, similarly, routed back through the GPRS and VOIP and out to the speaker via the hands free sub-system.

Claims (4)

  1. A voice responsive data handling system which comprises: a multiplicity of sub-systems each comprising data processing means (hereinafter referred to each as "local data processor") being data processing means competent to respond to data derived from any voice input demand within a limited range of such demands; common to all of said sub-systems, second data processing means (hereinafter referred to as "the server"), being data processing means competent to respond to data derived from voice input demands being demands within a range of demands not within the competence of local data processors, or any of them, for processing; a multiplicity of transmitter means respectively associated with said multiplicity of sub-systems; a multiplicity of receiver means respectively associated with said multiplicity of sub-systems; and a multiplicity of logic means respectively associated with the several said sub-systems; and in which each said logic means is operative to cause the associated local data processor to process voice input demands, or components thereof, whenever said demands, or components thereof, as the case may be, are within the competence of the local processor for processing and, otherwise, to cause said associated transmitter means to transmit said voice-derived demands or components thereof to receiver means of said server for processing thereby and the transmission of the processed result from the server to the receiver means of the sub-system from which a voice demand to said remote data processor emanated.
  2. A voice responsive data processing system as claimed in claim 1 in which the logic means of each sub-system incorporates individual means to discriminate between voice demands, or components thereof, within the processing competence of the local data processor associated with the sub-system and those that are not, and to cause said demands or demand components to be routed to said local processor or to the server, as the case may be.
  3. A voice responsive data processing system as claimed in claim 1 or 2 in which: said multiplicity of sub-systems are transportable; communication between the sub-systems or any of them and the server is by wireless transmission; and each of the several said sub-systems comprises: a hands free facility; a CODEC facility; a GSM facility incorporating a GPRS facility; a VOIP facility; a 'Thin SR' facility; and a Text to speech facility, all of the said facilities being as described and/or as defined hereinbefore; and being linked each with one or more of the other said facilities by way of communication paths in an arrangement substantially as hereinbefore described with reference to the accompanying drawing.
  4. A voice responsive data processing system substantially as hereinbefore described with reference to the accompanying drawing.
GB0026158A 2000-10-26 2000-10-26 Voice to voice data handling system Withdrawn GB2368441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0026158A GB2368441A (en) 2000-10-26 2000-10-26 Voice to voice data handling system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0026158A GB2368441A (en) 2000-10-26 2000-10-26 Voice to voice data handling system

Publications (2)

Publication Number Publication Date
GB0026158D0 GB0026158D0 (en) 2000-12-13
GB2368441A true GB2368441A (en) 2002-05-01

Family

ID=9901973

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0026158A Withdrawn GB2368441A (en) 2000-10-26 2000-10-26 Voice to voice data handling system

Country Status (1)

Country Link
GB (1) GB2368441A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2379786A (en) * 2001-09-18 2003-03-19 20 20 Speech Ltd Speech processing apparatus
GB2379785A (en) * 2001-09-18 2003-03-19 20 20 Speech Ltd Speech recognition
DE102009017177A1 (en) 2008-04-23 2009-10-29 Volkswagen Ag Speech recognition arrangement for the acoustic operation of a function of a motor vehicle
US20140122088A1 (en) * 2012-10-26 2014-05-01 Samsung Electronics Co., Ltd. Image processing apparatus and control method thereof and image processing system
DE102014200570A1 (en) * 2014-01-15 2015-07-16 Bayerische Motoren Werke Aktiengesellschaft Method and system for generating a control command
CN106992009A (en) * 2017-05-03 2017-07-28 深圳车盒子科技有限公司 Vehicle-mounted voice exchange method, system and computer-readable recording medium
US9953643B2 (en) 2010-12-23 2018-04-24 Lenovo (Singapore) Pte. Ltd. Selective transmission of voice data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0872827A2 (en) * 1997-04-14 1998-10-21 AT&T Corp. System and method for providing remote automatic speech recognition services via a packet network

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2379786A (en) * 2001-09-18 2003-03-19 20 20 Speech Ltd Speech processing apparatus
GB2379785A (en) * 2001-09-18 2003-03-19 20 20 Speech Ltd Speech recognition
DE102009017177A1 (en) 2008-04-23 2009-10-29 Volkswagen Ag Speech recognition arrangement for the acoustic operation of a function of a motor vehicle
DE102009017177B4 (en) 2008-04-23 2022-05-05 Volkswagen Ag Speech recognition arrangement and method for acoustically operating a function of a motor vehicle
US9953643B2 (en) 2010-12-23 2018-04-24 Lenovo (Singapore) Pte. Ltd. Selective transmission of voice data
DE102011054197B4 (en) * 2010-12-23 2019-06-06 Lenovo (Singapore) Pte. Ltd. Selective transmission of voice data
US20140122088A1 (en) * 2012-10-26 2014-05-01 Samsung Electronics Co., Ltd. Image processing apparatus and control method thereof and image processing system
DE102014200570A1 (en) * 2014-01-15 2015-07-16 Bayerische Motoren Werke Aktiengesellschaft Method and system for generating a control command
CN106992009A (en) * 2017-05-03 2017-07-28 深圳车盒子科技有限公司 Vehicle-mounted voice exchange method, system and computer-readable recording medium

Also Published As

Publication number Publication date
GB0026158D0 (en) 2000-12-13

Similar Documents

Publication Publication Date Title
CN100433840C (en) Speech Recognition Technology Based on Local Interrupt Detection
CN100530355C (en) Method and apparatus for information signal provision based on speech recognition
US7047197B1 (en) Changing characteristics of a voice user interface
CN1188834C (en) Method and device for processing an input speech signal during presentation of an output audio signal
CN103067443B (en) Speech-based interface service identification and enablement for connecting mobile devices
EP1661122B1 (en) System and method of operating a speech recognition system in a vehicle
CN103095325B (en) There is the mobile voice platform architecture of remote service interface
US20030120493A1 (en) Method and system for updating and customizing recognition vocabulary
US8521235B2 (en) Address book sharing system and method for non-verbally adding address book contents using the same
JP2009530666A (en) How to provide automatic speech recognition, dictation, recording and playback for external users
CN103152702A (en) Speech-based user interface for a mobile device
US7212969B1 (en) Dynamic generation of voice interface structure and voice content based upon either or both user-specific contextual information and environmental information
CN103123621A (en) Mobile voice platform architecture
US20200211560A1 (en) Data Processing Device and Method for Performing Speech-Based Human Machine Interaction
CN1790483B (en) Method and system for managing multilingual name tags with embedded speech recognition
US20050114139A1 (en) Method of operating a speech dialog system
GB2368441A (en) Voice to voice data handling system
CN1893487B (en) Method and system for phonebook delivery
CN117373439A (en) Method and system for providing vehicle-mounted voice service
JP2022036388A (en) Electronic apparatus, service system and notification method
KR20050102743A (en) Short message transmission method using speech recognition of mobile phone

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)