WO2006018867A1

WO2006018867A1 - Communication method, communication device, communication system, and program

Info

Publication number: WO2006018867A1
Application number: PCT/JP2004/011820
Authority: WO
Inventors: Shoichi Sano
Original assignee: Fujitsu Limited
Priority date: 2004-08-18
Filing date: 2004-08-18
Publication date: 2006-02-23

Abstract

There is provided a communication method for transmitting and receiving video information and audio information between a first communication device and a second communication device via a communication transmission path. The first communication device transmits audio information addressed to the second communication device to the communication transmission path. Data indicating the utterance period of a speaker at the first communication device is generated from the audio information. In parallel with the transmission of the audio information, the data, addressed to the second communication device, is transmitted from the first communication device to the communication transmission path.

Description

Specification

COMMUNICATION METHOD, COMMUNICATION DEVICE, COMMUNICATION SYSTEM, AND PROGRAM

Technical field

The present invention relates to a communication method, a communication device, a communication system, and a program, and more particularly to a communication method, a communication device, a communication system, and a program suitable for an IP phone, a TV phone, a video conference system, and the like.

Background art

In an IP phone, a TV phone, a TV conference system, and the like, audio information and image information are transmitted and received between a communication device on the transmission side and a communication device on the reception side via a communication transmission path. The audio information and the image information are encoded and transmitted by the communication device on the transmission side, and are combined and reproduced by the communication device on the reception side.

[0003] Because of processing such as encoding, decoding, synchronization with the image information, and buffering, as well as the delay of the communication transmission path, audio information is uttered (output) from the speaker of the communication device on the reception side a predetermined delay time after the actual utterance (input) at the communication device on the transmission side. Synchronization of the audio information with the image information is the process of matching the image information displayed on the communication device on the reception side with the reproduced audio information, that is, matching the displayed movement of the speaker's mouth with the reproduced audio.

Disclosure of the invention

Problems to be solved by the invention

[0004] As described above, the audio information is reproduced at the communication device on the reception side a predetermined delay time after the actual utterance at the communication device on the transmission side. Also, since the image information displayed at the communication device on the reception side is synchronized with the audio information reproduced there, the image is likewise displayed a predetermined time after the actual image input at the communication device on the transmission side. The predetermined time is, for example, about 0.4 to 0.5 seconds. For this reason, an utterance at the communication device on the transmission side cannot be recognized at the communication device on the reception side until the predetermined time has elapsed since the actual utterance. As a result, the speaker on the reception side may start talking without knowing that an utterance has actually been made at the communication device on the transmission side, that is, that the speaker on the transmission side has started speaking, and there has been a problem that the transmission side and the reception side cannot carry on a smooth conversation.

[0005] As a conventional technique for identifying a speaker, for example, the method proposed in Japanese Patent Laid-Open No. 6-83391 is known, and as conventional techniques for preventing a collision of conversation, for example, the methods proposed in Japanese Laid-Open Patent Publication Nos. 2002-232475 and 2002-158984 are known.

Means for solving the problem

[0006] In view of the above, a general object of the present invention is to provide a new and useful communication method, communication apparatus, communication system, and program that solve the above problems.

[0007] A more specific object of the present invention is to provide a communication method, a communication apparatus, a communication system, and a program that enable smooth conversation between the transmission side and the reception side by informing the reception side of the utterance on the transmission side in substantially real time.

[0008] Another object of the present invention is to provide a communication method for transmitting and receiving image information and audio information between a first communication device and a second communication device via a communication transmission path, the method comprising: a first transmission step of transmitting audio information addressed to the second communication device from the first communication device to the communication transmission path; a generation step of generating, from the audio information, data indicating the utterance period of a speaker at the first communication device; and a second transmission step of transmitting the data, addressed to the second communication device, from the first communication device to the communication transmission path in parallel with the first transmission step. According to the communication method of the present invention, smooth conversation between the transmission side and the reception side is possible because the reception side is notified of the utterance on the transmission side in substantially real time.

[0009] Still another object of the present invention is to provide a communication device that transmits and receives image information and audio information via a communication transmission path, the device comprising: first transmission means for transmitting the audio information to the communication transmission path; generation means for generating, from the audio information, data indicating the utterance period of a speaker at the communication device; and second transmission means for transmitting the data to the communication transmission path in parallel with the first transmission means. According to the communication device of the present invention, smooth conversation between the transmission side and the reception side is possible because the reception side is notified of the utterance on the transmission side in substantially real time.

[0010] Another object of the present invention is to provide a communication device that transmits and receives image information and audio information via a communication transmission path, the device comprising: first receiving means for receiving and reproducing the audio information via the communication transmission path; second receiving means for receiving, in parallel with the first receiving means, data indicating the utterance period of the speaker on the transmission side via the communication transmission path; and display means for displaying the data. According to the communication device of the present invention, smooth conversation between the transmission side and the reception side is possible because the reception side is notified of the utterance on the transmission side in substantially real time.

[0011] Still another object of the present invention is to provide a communication system for transmitting and receiving image information and audio information between a first communication device and a second communication device via a communication transmission path, the system comprising: first transmission means for transmitting audio information addressed to the second communication device from the first communication device to the communication transmission path; generation means for generating, from the audio information, data indicating the utterance period of a speaker at the first communication device; second transmission means for transmitting the data, addressed to the second communication device, from the first communication device to the communication transmission path in parallel with the first transmission means; and display means for receiving the data from the first communication device via the communication transmission path and displaying the data on a display unit of the second communication device. According to the communication system of the present invention, smooth conversation between the transmission side and the reception side is possible because the reception side is notified of the utterance on the transmission side in substantially real time.

Another object of the present invention is to provide a program that causes a computer to function as a communication device that transmits and receives image information and audio information via a communication transmission path, the program comprising: a first transmission procedure for causing the computer to transmit audio information to the communication transmission path; a generation procedure for causing the computer to generate, from the audio information, data indicating the utterance period of a speaker at the communication device; and a second transmission procedure for causing the computer to transmit the data to the communication transmission path in parallel with the first transmission procedure. According to the program of the present invention, smooth conversation between the transmission side and the reception side is possible because the reception side is notified of the utterance on the transmission side in substantially real time.

Still another object of the present invention is to provide a program that causes a computer to function as a communication device that transmits and receives image information and audio information via a communication transmission path, the program comprising: a first reception procedure for causing the computer to receive and reproduce the audio information via the communication transmission path; a second reception procedure for causing the computer to receive, in parallel with the first reception procedure, data indicating the utterance period of the speaker on the transmission side via the communication transmission path; and a display procedure for causing the computer to display the data. According to the program of the present invention, smooth conversation between the transmission side and the reception side is possible because the reception side is notified of the utterance on the transmission side in substantially real time.

[0014] Still other objects and features of the present invention will become apparent from the following description taken in conjunction with the drawings.

Brief Description of Drawings

FIG. 1 is a block diagram showing a part of a first embodiment of a communication system according to the present invention.

FIG. 2 is a time chart for explaining the operation of the voice detection signal processing system.

FIG. 3 is a block diagram showing a part of a second embodiment of a communication system according to the present invention.

FIG. 4 is a block diagram showing a part of a third embodiment of a communication system according to the present invention.

FIG. 5 is a block diagram showing a part of a fourth embodiment of the communication system according to the present invention.

FIG. 6 is a block diagram showing a part of a fifth embodiment of a communication system according to the present invention.

FIG. 7 is a diagram for explaining the operation of the multipoint control device.

FIG. 8 is a block diagram showing a part of a sixth embodiment of the communication system according to the present invention.

FIG. 9 is a diagram showing an example of a display screen when performing communication between multiple points.

FIG. 10 is a diagram showing another example of a display screen when performing communication between multiple points.

FIG. 11 is a diagram showing an example of a display screen when communication between two points is performed.

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, embodiments of a communication method, a communication apparatus, a communication system, and a program according to the present invention will be described with reference to the drawings.

Example 1

FIG. 1 is a block diagram showing a part of a first embodiment of a communication system according to the present invention. The first embodiment of the communication system employs the first embodiment of the communication method according to the present invention, the first embodiment of the communication apparatus according to the present invention, and the first embodiment of the program according to the present invention. In this embodiment, the present invention is applied to an IP phone. In FIG. 1, a communication device 1 on the transmission side includes an analog/digital converter (ADC) 3, an encoding unit 4, an interface unit 5, a level detection unit 6, an encoding unit 7, and an interface unit 8. The microphone 2 may be part of the communication device 1 or may be externally connected. The communication device 11 on the reception side is connected to the communication device 1 on the transmission side via a communication transmission path 10 composed of a network such as the Internet. The communication device 11 on the reception side includes an interface unit 12, a decoding unit 13, a digital/analog converter (DAC) 14, an interface unit 16, a decoding unit 17, and a DAC 18. The speaker 15 and/or the display unit 19 may be part of the communication device 11 or may be externally connected.

The display unit 19 includes a display such as a CRT, LCD, or PDP and/or an indicator such as a lamp. In the following description, lighting of the display unit 19 or lighting of the lamp means lighting of the lamp and/or a corresponding indication displayed on the display screen of the display.

In FIG. 1, for convenience of explanation, only a part of the transmission system is shown for the communication device 1 and only a part of the reception system is shown for the communication device 11. In practice, since bidirectional communication is performed, the communication device 1 has a reception system similar to that of the communication device 11, and the communication device 11 has a transmission system similar to that of the communication device 1. In addition, since known techniques can be used for processing and displaying image information, illustration and description of the image information processing system are omitted. Likewise, since known techniques can be used for synchronizing the audio information to be reproduced with the image information to be displayed in the communication device 11 on the reception side, illustration and description of the synchronization processing unit are also omitted. Known techniques can also be used for transmitting and receiving (detecting) the destination of the audio signal transmitted from the communication device 1 on the transmission side, that is, the address of the communication device 11 on the reception side, so their illustration and description are omitted as well.

In the communication device 1 on the transmission side, the utterance of the speaker is detected by the microphone 2 and input to the ADC 3 and the level detection unit 6 as an analog audio signal. The digital audio signal output from the ADC 3 is subjected to arbitrary encoding processing (compression processing) by the encoding unit 4 and input to the interface unit 5. The interface unit 5 converts the encoded audio data into audio data of a protocol suitable for the communication transmission path 10. In this embodiment, the interface unit 5 outputs VoIP (Voice over Internet Protocol) encoded audio data to the communication transmission path 10. The microphone 2, the ADC 3, the encoding unit 4, and the interface unit 5 constitute the transmission system of the IP telephone unit on the transmission side, and these can be realized with known elements and circuits.

On the other hand, in the communication device 11 on the reception side, the VoIP encoded audio data input from the communication transmission path 10 is subjected by the interface unit 12 to processing complementary to that of the interface unit 5. The encoded audio data from the interface unit 12 is subjected to decoding processing (decompression processing) by the decoding unit 13. The DAC 14 converts the decoded digital audio signal into the original analog audio signal, and the speaker 15 reproduces the utterance of the speaker on the transmission side. The interface unit 12, the decoding unit 13, the DAC 14, and the speaker 15 constitute the reception system of the IP telephone unit on the reception side, and these can be realized with known elements and circuits.

Next, the voice detection signal processing system will be described with reference to FIG. 2. FIG. 2 is a time chart for explaining the operation of the voice detection signal processing system. In FIG. 2, (A) to (E) explain the processing of the voice detection signal in the communication device 1 on the transmission side, and (F) and (G) explain the processing of the voice detection signal in the communication device 11 on the reception side. In FIG. 2, (A) shows the audio signal input to the level detection unit 6, (B) shows the voice detection signal output from the level detection unit 6, (C) shows the voice start detection signal generated by the encoding unit 7, (D) shows the voice sample signal sampled by the encoding unit 7, and (E) shows the lamp lighting data output from the encoding unit 7. Further, (F) shows the lamp lighting pulse signal output from the decoding unit 17, and (G) shows the lighting time of the display unit 19, which is lit by the output of the DAC 18.

In the communication device 1 on the transmission side, the analog audio signal shown in FIG. 2(A), output from the microphone 2, is input to the level detection unit 6. The level detection unit 6 compares the analog audio signal with the threshold value L1 and, when the threshold value L1 is exceeded, outputs the voice detection signal shown in FIG. 2(B), which indicates an utterance by the speaker on the transmission side. The threshold value L1 is set to a value that can distinguish, for example, the speaker's utterance from noise other than the utterance. The voice detection signal is subjected to arbitrary encoding processing (compression processing) by the encoding unit 7 and input to the interface unit 8. Specifically, the encoding unit 7 generates from the voice detection signal the voice start detection signal shown in FIG. 2(C), which indicates the start of utterance (that is, the rising edge of the voice detection signal), samples the voice detection signal at an arbitrary sampling interval T1, generates the voice sample signal shown in FIG. 2(D) whenever the voice detection signal is on (high level) at the time of sampling, and outputs the lamp lighting data shown in FIG. 2(E) based on the OR of the voice start detection signal and the voice sample signal. Sampling of the voice detection signal can be started a time T1 after the rise of the voice start detection signal. Each time lamp lighting data is generated, the interface unit 8 converts it into lamp lighting data of a protocol suitable for the communication transmission path 10 and outputs it in units of packets. In the present embodiment, the interface unit 8 outputs UDP (User Datagram Protocol) lamp lighting data to the communication transmission path 10 in units of packets.
Because the lamp lighting data is transmitted in this way based on the voice detection signal, a short delay time matters more for the lamp lighting data than avoiding packet loss. For this reason UDP is used, which has a shorter delay time than, for example, TCP/IP (Transmission Control Protocol/Internet Protocol). The microphone 2, the level detection unit 6, the encoding unit 7, and the interface unit 8 constitute the voice detection signal processing system on the transmission side.
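The transmission-side processing of FIG. 2(A) to (E) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the circuit of the embodiment: time is modeled as discrete steps, the function names, the `levels` list, and the 4-byte packet payload are hypothetical, and T1 is expressed as a whole number of steps.

```python
import socket


def lamp_lighting_events(levels, threshold_l1, t1_steps):
    """Return the step indices at which lamp lighting data (E) is emitted.

    levels       -- input signal levels, one per time step (hypothetical units)
    threshold_l1 -- threshold L1 used by the level detection
    t1_steps     -- sampling interval T1, expressed in time steps (>= 1)
    """
    events = []
    prev_detect = False
    next_sample = None  # step at which the detection signal is sampled next
    for step, level in enumerate(levels):
        detect = abs(level) > threshold_l1      # (B) voice detection signal
        start = detect and not prev_detect      # (C) rising edge = start of utterance
        if start:
            next_sample = step + t1_steps       # sampling begins T1 after the rise
        sample = False
        if step == next_sample:
            sample = detect                     # (D) on only while voice is detected
            next_sample = step + t1_steps if detect else None
        if start or sample:                     # (E) = (C) OR (D)
            events.append(step)
        prev_detect = detect
    return events


def send_lamp_packet(sock, address, step):
    """Send one event as a UDP datagram; the 4-byte payload is an assumption."""
    sock.sendto(step.to_bytes(4, "big"), address)
```

With the levels `[0, 0, 5, 5, 5, 5, 5, 0, 0, 0]`, a threshold of 1, and T1 equal to 2 steps, the function returns `[2, 4, 6]`: one event at the start of the utterance and one at each sampling instant at which the voice is still detected. Each event would then be passed to `send_lamp_packet` with a socket created as `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)`.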

On the other hand, in the communication device 11 on the reception side, the UDP lamp lighting data input in units of packets from the communication transmission path 10 is subjected by the interface unit 16 to processing complementary to that of the interface unit 8. The lamp lighting data from the interface unit 16 is subjected to decoding processing (decompression processing) by the decoding unit 17. Specifically, the decoding unit 17 generates the lamp lighting pulse signal shown in FIG. 2(F) by generating a pulse having a pulse width (time) T2 each time a lamp lighting data packet is received from the interface unit 16. The DAC 18 converts the lamp lighting pulse signal into an analog lamp lighting signal. In FIG. 2(F), hatching indicates the portions where pulses overlap, and the larger the number of overlapping pulses, the narrower the hatching interval. Based on the analog lamp lighting signal, the display unit 19 is lit for the lighting time shown in FIG. 2(G), that is, for a time corresponding to the time during which the speaker on the transmission side is speaking, substantially in real time with that speaker's utterance. Therefore, at the communication device 11 on the reception side, if the display unit 19 is lit, it can be visually recognized that the speaker at the communication device 1 on the transmission side is speaking. A collision of conversation can be reliably prevented if the speaker at the communication device 11 on the reception side refrains from speaking while the display unit 19 is lit. The interface unit 16, the decoding unit 17, the DAC 18, and the display unit 19 constitute the voice detection signal processing system on the reception side.
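The relationship between FIG. 2(F) and FIG. 2(G) can be illustrated with a short sketch: each received packet starts a pulse of width T2, and overlapping pulses merge into one continuous lighting period. The function name and the use of integer milliseconds are assumptions made for this illustration only.

```python
def lighting_intervals(arrival_times_ms, t2_ms):
    """Merge one pulse of width T2 per received packet into lighting periods.

    arrival_times_ms -- packet arrival times in milliseconds (hypothetical input)
    t2_ms            -- pulse width T2 in milliseconds
    Returns a list of (on_ms, off_ms) tuples during which the lamp is lit.
    """
    intervals = []
    for t in sorted(arrival_times_ms):
        if intervals and t <= intervals[-1][1]:
            # The new pulse overlaps the current lighting period: extend it.
            intervals[-1][1] = max(intervals[-1][1], t + t2_ms)
        else:
            # No overlap: the lamp turns on again with a new pulse.
            intervals.append([t, t + t2_ms])
    return [tuple(interval) for interval in intervals]
```

For packets arriving at 0, 200, 400, and 1500 ms with T2 = 300 ms, the lamp is lit from 0 to 700 ms and again from 1500 to 1800 ms.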

[0026] The number of packets transmitted can be increased or decreased by appropriately adjusting the sampling interval (time) T1 used in the encoding unit 7 of the communication device 1 on the transmission side and the pulse width (time) T2 used in the decoding unit 17 of the communication device 11 on the reception side. If the times T1 and T2 are set small, the number of packets transmitted increases, but the lamp lighting time approaches the actual utterance period (utterance time) of the speaker at the communication device 1 on the transmission side, and the end of the utterance can be accurately notified to the communication device 11 on the reception side. However, in order to prevent a collision of conversation, it is more important to accurately notify the communication device 11 on the reception side of the start of an utterance than of its end, so the accuracy with which the end of an utterance is notified does not necessarily need to be improved significantly. Therefore, depending on the condition of the communication transmission path 10, it is desirable to set the times T1 and T2 large, to the extent that the indicated end of utterance does not become unnatural, and thereby reduce the number of packets transmitted.
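The trade-off between packet count and the accuracy of the indicated end of utterance can be made concrete with a small calculation. The sketch below is an idealized model, not part of the embodiment: it ignores network delay and packet loss, assumes one packet at the start of the utterance plus one per sampling instant while the voice is still detected, and assumes T2 is at least T1 so that the lamp stays lit between packets. The function name and the millisecond units are hypothetical.

```python
def packets_and_overshoot(duration_ms, t1_ms, t2_ms):
    """Estimate packets sent and how long the lamp stays lit after the utterance ends.

    duration_ms -- length of one continuous utterance in milliseconds (>= 1)
    t1_ms       -- sampling interval T1
    t2_ms       -- pulse width T2 (assumed >= T1 in this idealized model)
    """
    # Sampling instants T1, 2*T1, ... that fall strictly inside the utterance.
    samples = (duration_ms - 1) // t1_ms
    packets = 1 + samples                      # start-of-utterance packet + samples
    last_packet_ms = samples * t1_ms           # time of the last packet sent
    overshoot_ms = last_packet_ms + t2_ms - duration_ms
    return packets, overshoot_ms
```

For a 1000 ms utterance, T1 = 100 ms and T2 = 150 ms give 10 packets and a lamp that stays lit about 50 ms past the end of the utterance, while T1 = 400 ms and T2 = 500 ms give only 3 packets but about 300 ms of overshoot.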

[0027] The time T1 may be adjusted at the communication device 1 on the transmission side and the time T2 at the communication device 11 on the reception side. However, since the communication device 1 on the transmission side has a reception system similar to that of the communication device 11 on the reception side shown in FIG. 1, and the communication device 11 on the reception side has a transmission system similar to that of the communication device 1 on the transmission side, the times T1 and T2 may be adjusted in each of the communication devices 1 and 11. For example, each of the communication devices 1 and 11 can optimize the threshold value L1 and the times T1 and T2 for each communication partner and store them in a corresponding file in an internal memory (not shown). In this case, each of the communication devices 1 and 11 recognizes the communication device of the communication partner from the IP address, unique identification number, or the like at the time of connection, and can automatically set the threshold value L1 and the times T1 and T2 using the corresponding file.
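A per-partner settings lookup of this kind might look like the following sketch. The record layout, the default values, and the use of an in-memory dictionary in place of files in the internal memory are all assumptions for illustration; the addresses are taken from the documentation range 192.0.2.0/24.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartnerSettings:
    threshold_l1: float  # threshold L1 for the level detection
    t1_ms: int           # sampling interval T1
    t2_ms: int           # pulse width T2


# Hypothetical defaults used when no settings are stored for a partner.
DEFAULT_SETTINGS = PartnerSettings(threshold_l1=0.1, t1_ms=200, t2_ms=300)


def settings_for(partner_id, stored):
    """Look up the settings stored for a partner (IP address or unique ID)."""
    return stored.get(partner_id, DEFAULT_SETTINGS)


stored = {"192.0.2.10": PartnerSettings(threshold_l1=0.2, t1_ms=400, t2_ms=500)}
known = settings_for("192.0.2.10", stored)
unknown = settings_for("192.0.2.99", stored)
```

A known partner gets its stored values back, and an unknown partner falls back to the defaults.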

Example 2

FIG. 3 is a block diagram showing a part of a second embodiment of the communication system according to the present invention. The second embodiment of the communication system employs the second embodiment of the communication method according to the present invention, the second embodiment of the communication apparatus according to the present invention, and the second embodiment of the program according to the present invention. In this embodiment, the present invention is applied to an IP phone. In FIG. 3, parts that are the same as those already described are denoted by the same reference numerals, and their description is omitted.

In the present embodiment, the communication device 1 on the transmission side is further provided with a lamp lighting pulse generation unit 21, a display unit 22, and an L1 setting unit 23. The lamp lighting pulse generation unit 21 outputs a signal similar to the lamp lighting pulse signal shown in FIG. 2(F) based on the lamp lighting data shown in FIG. 2(E) output from the encoding unit 7. Based on the lamp lighting pulse signal from the lamp lighting pulse generation unit 21, the display unit 22 lights for a lighting time similar to that shown in FIG. 2(G). The L1 setting unit 23 is composed of dials, buttons, a graphical user interface (GUI) on the display screen of the display unit 22, or the like, and is used for inputting the threshold value L1 to the level detection unit 6.

[0030] Therefore, at the communication device 1 on the transmission side, the speaker can speak, check the lighting state of the display unit 22, and operate the L1 setting unit 23 to adjust and set the threshold value L1 appropriately. In addition, by checking the lighting state of the display unit 22, the speaker can also confirm that the voice detection signal processing system (the level detection unit 6 and the encoding unit 7) on the transmission side is operating normally.

Example 3

FIG. 4 is a block diagram showing a part of a third embodiment of the communication system according to the present invention. The third embodiment of the communication system employs the third embodiment of the communication method according to the present invention, the third embodiment of the communication apparatus according to the present invention, and the third embodiment of the program according to the present invention. In this embodiment, the present invention is applied to an IP phone. In FIG. 4, parts that are the same as those already described are denoted by the same reference numerals, and their description is omitted.

In the present embodiment, the communication device 1 on the transmission side is further provided with a sample button 31, a sampling unit 32, sample storage units 33 and 34, and an average value calculation unit 35. The digital audio signal output from the ADC 3 is sampled by the sampling unit 32 at two timings determined based on the operation of the sample button 31, and the respective samples are stored in the sample storage units 33 and 34. The average value calculation unit 35 inputs the average value of the two samples stored in the sample storage units 33 and 34 to the level detection unit 6 as the threshold value L1.

That is, at the communication device 1 on the transmission side, the speaker operates the sample button 31 once during an utterance and then, after stopping the utterance, operates the sample button 31 again. As a result, audio data at the time of utterance and audio data at the time of silence (such as ambient noise) are sampled and stored in the corresponding sample storage units 33 and 34. The average value calculation unit 35 calculates the average of the audio data at the time of utterance and the audio data at the time of silence stored in the sample storage units 33 and 34, and this average value can be set as the threshold value L1.
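As a minimal sketch of this calculation, assuming each stored sample is reduced to a single level value (the function name and the numbers are illustrative only):

```python
def threshold_from_samples(speech_level, silence_level):
    """Return the threshold L1 as the average of the two stored samples.

    speech_level  -- level sampled while the speaker is talking
    silence_level -- level sampled while the speaker is silent (ambient noise)
    """
    return (speech_level + silence_level) / 2


# Hypothetical levels: the threshold lands midway between noise and speech.
l1 = threshold_from_samples(800, 40)
```

With these example levels the threshold L1 is 420.0, so ambient noise around 40 stays below it and speech around 800 exceeds it.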

Example 4

FIG. 5 is a block diagram showing a part of the fourth embodiment of the communication system according to the present invention. The fourth embodiment of the communication system employs the fourth embodiment of the communication method according to the present invention, the fourth embodiment of the communication apparatus according to the present invention, and the fourth embodiment of the program according to the present invention. In this embodiment, the present invention is applied to an IP phone. In FIG. 5, parts that are the same as those already described are denoted by the same reference numerals, and their description is omitted. FIG. 5 shows a part of the reception system of the communication device 1 on the transmission side and a part of the transmission system of the communication device 11 on the reception side.

In this embodiment, the communication device 1 on the transmission side is further provided with a display unit 42 and an L1 setting unit 43, and the communication device 11 on the reception side is further provided with an L1 setting unit 53. The L1 setting units 43 and 53 are each composed of dials, buttons, a graphical user interface (GUI) on the display screen of the corresponding display unit 42 or 19, or the like, and are used for inputting the threshold value L1 to the corresponding interface unit 8. Further, in the communication device 1 on the transmission side, the output of the interface unit 16 is input to the level detection unit 6. Similarly, in the communication device 11 on the reception side, the output of the interface unit 16 is input to the level detection unit 6. When the threshold value L1 is input to the interface unit 8 in the communication device 1 on the transmission side, the threshold value L1 is input to the level detection unit 6 on the reception side via the interface unit 16 of the communication device 11 on the reception side. Consequently, the lighting state of the display unit 19 on the reception side, for which the threshold value L1 has been set, is sent to the transmission side, and the display unit 42 lights in the same manner as the display unit 19 on the reception side.

Therefore, at the communication device 1 on the transmission side, the speaker can speak and operate the L1 setting unit 43 while checking the lighting state of the display unit 42, and thereby adjust and set the threshold value L1 for the communication device 11 on the reception side appropriately. Similarly, at the communication device 11 on the reception side, the speaker can speak and operate the L1 setting unit 53 while checking the lighting state of the display unit 19, and thereby adjust and set the threshold value L1 for the communication device 1 on the transmission side appropriately.

Example 5

FIG. 6 is a block diagram showing a part of a fifth embodiment of the communication system according to the present invention. The fifth embodiment of the communication system employs the fifth embodiment of the communication method according to the present invention, the fifth embodiment of the communication apparatus according to the present invention, and the fifth embodiment of the program according to the present invention. In this embodiment, the present invention is applied to an IP phone, and it is assumed that communication devices 101A to 101D, each having the same configuration as the communication device 1 (or 11) shown in FIGS. 1 and 3 to 5 and installed at a plurality of points A to D, are connected via a multipoint control device 51 on the communication transmission path 10. The multipoint control device 51 is composed of, for example, a general-purpose computer, and has a function of performing a logical sum (OR) operation on the lamp lighting data from the other communication devices and transmitting the result to each one of the communication devices 101A to 101D. In FIG. 6, the solid arrows indicate the output of lamp lighting data produced by voice detection at each point, and the broken arrows indicate the input to each point of lamp lighting data from the other parties.

FIG. 7 is a diagram for explaining the operation of the multipoint control device 51. It shows the operation when the lamp lighting data from the communication devices 101B-101D is transmitted to the communication device 101A. The multipoint control device 51 transmits the logical OR of the lamp lighting data from the communication devices 101B-101D to the communication device 101A as lamp lighting data. Accordingly, the display unit of the communication device 101A is lit whenever any speaker at the communication devices 101B-101D is speaking. The speaker at the communication device 101A can therefore visually recognize that a speaker at one of the communication devices 101B-101D is speaking and, by refraining from speaking while the display unit of the communication device 101A is lit, can reliably prevent utterances from colliding.
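The OR operation of the multipoint control device can be sketched as follows. This is only an illustration under stated assumptions: lamp lighting data is modeled as one boolean per point, and the function names `or_for_destination` and `distribute` are hypothetical rather than taken from the specification.

```python
from typing import Dict


def or_for_destination(lamp_data: Dict[str, bool], destination: str) -> bool:
    """OR the lamp lighting data of every point except the destination itself."""
    return any(lit for point, lit in lamp_data.items() if point != destination)


def distribute(lamp_data: Dict[str, bool]) -> Dict[str, bool]:
    """Compute the lamp lighting data sent to each connected point."""
    return {point: or_for_destination(lamp_data, point) for point in lamp_data}


# Only the speaker at point C is talking.
current = {"A": False, "B": False, "C": True, "D": False}
print(distribute(current))
# prints {'A': True, 'B': True, 'C': False, 'D': True}
```

Excluding the destination's own data matches the description that each device receives the OR of the data from the other devices, so a speaker's own voice does not light the lamp that signals the other parties.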

Example 6

FIG. 8 is a block diagram showing a part of a sixth embodiment of the communication system according to the present invention. The sixth embodiment of the communication system employs the sixth embodiment of the communication method according to the present invention, the sixth embodiment of the communication device according to the present invention, and the sixth embodiment of the program according to the present invention. In this embodiment, it is assumed that the present invention is applied to an IP phone, and that communication devices installed at three or more points are connected via the communication transmission path 10. For convenience of explanation, it is assumed that four communication devices are installed at four points, and the two communication devices installed at two points have the configuration shown in FIG. In FIG. 8, the same parts as those in FIG. 1 are denoted by the same reference numerals, and their description is omitted.

In FIG. 8, it is assumed that the communication device 1 on the transmission side is installed at point B and the communication device 11 on the reception side is installed at point A. Illustration of the communication devices installed at points C and D is omitted. The three lamp lighting data transmitted from the communication devices installed at points B-D are input to the interface unit 16-1 of the communication device 11 installed at point A via the communication transmission path 10. The three lamp lighting data are input to the corresponding decoding units 17-B to 17-D based on, for example, the sender IP addresses of the communication devices at points B-D. The lamp lighting pulse signals from the decoding units 17-B to 17-D are supplied to the corresponding display units 19-B to 19-D via the corresponding DACs 18-B to 18-D. Therefore, among the display units 19-B to 19-D, the display unit corresponding to the point where a speaker is speaking is lit.
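Routing lamp lighting data to the display unit for each point by sender IP address can be sketched as follows. The addresses, the packet layout `(sender_ip, lit)`, and the function name `update_lamps` are assumptions made for illustration; the specification names the sender IP address only as one example of a routing key.

```python
from typing import Dict, Tuple

# Hypothetical sender IP addresses for the devices at points B-D
# (taken from the documentation address range 192.0.2.0/24).
POINT_BY_SENDER_IP: Dict[str, str] = {
    "192.0.2.2": "B",
    "192.0.2.3": "C",
    "192.0.2.4": "D",
}


def update_lamps(lamps: Dict[str, bool], packet: Tuple[str, bool]) -> Dict[str, bool]:
    """Apply one (sender_ip, lit) packet to the lamp of the matching point."""
    sender_ip, lit = packet
    point = POINT_BY_SENDER_IP.get(sender_ip)
    if point is None:
        return lamps  # Unknown sender: leave every lamp unchanged.
    updated = dict(lamps)
    updated[point] = lit
    return updated


lamps = {"B": False, "C": False, "D": False}
for packet in [("192.0.2.3", True), ("192.0.2.2", True), ("192.0.2.3", False)]:
    lamps = update_lamps(lamps, packet)
print(lamps)
# prints {'B': True, 'C': False, 'D': False}
```

Unlike the OR operation of the fifth embodiment, this keeps one lamp per point, so the listener at point A can tell which point is speaking.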

[0041] FIG. 9 shows an example of the display screen in the case where the display unit 19 (display units 19-B to 19-D) of the communication device 11 installed at point A consists of a single display, and the display units 19-B to 19-D corresponding to the three lamp lighting data from the communication devices at points B-D are displayed on the display screen as lamp displays 19B-19D together with the speaker image 200B. When the speaker wants to recognize his or her own utterance at the communication device 11, a display unit corresponding to the display unit 22 may also be displayed on the display screen, as shown in FIG. 9.

[0042] Note that FIG. 9 shows a case where, for example, only the image 200B of the speaker who spoke first is displayed on the display screen. Needless to say, the images 200B-200D of all the speakers may be displayed simultaneously. FIG. 10 shows another example of the display screen in the case where the display unit 19 (display units 19-B to 19-D) of the communication device 11 installed at point A consists of a single display, and the display units 19-B to 19-D corresponding to the three lamp lighting data from the communication devices at points B-D are displayed on the display screen as lamp displays 19B-19D together with all the images 200B-200D. When the speaker wants to recognize his or her own utterance at the communication device 11, a display unit corresponding to the display unit 22 may also be displayed on the display screen, as shown in FIG. 10. In this case, the images 200A-200D of the speakers at the respective points A-D are displayed in association with the utterance status of each speaker, so that the speaker who is currently speaking can be easily recognized. Also, as shown in FIG. 10, by giving the lamp display 19A for one's own point (point A) a different size, shape, or color from the lamp displays 19B-19D for the other points B-D, one's own utterance status can be easily distinguished from that of the other speakers. The display of the images themselves is not directly related to the gist of the present invention, and its description is omitted.

[0043] FIG. 11 is a diagram showing an example of the display screen displayed on the display constituting the display unit of the communication device when communication is performed between two points, as in the first to fourth embodiments. FIG. 11 shows an example of the display screen in the case where the display unit 19 of the communication device 11 installed at point A consists of a single display, and the lamp display 19B corresponding to the lamp lighting data from the communication device 1 installed at point B is displayed on the display screen together with the image 200B of the speaker at point B. When the speaker wants to recognize his or her own utterance at the communication device 11, a display unit corresponding to the display unit 22 may also be displayed on the display screen, as shown in FIG. 11. As shown in FIG. 11, by giving the lamp display 19A for one's own point (point A) a different size, shape, or color from the lamp display 19B for the other point B, one's own utterance status can be easily distinguished from that of the other speaker.

[0044] A program according to the present invention causes a computer constituting the communication device on the transmission side and/or the communication device on the reception side in each of the above embodiments to function as that communication device. In this case, the computer constituting the communication device may be a general-purpose computer having a well-known configuration including a memory for storing the program and a processor such as a CPU for executing the program. The program may be stored in a computer-readable storage medium.

[0045] In each of the above embodiments, the present invention is applied to a communication system capable of simultaneous two-way communication. However, the present invention is not limited to this and is similarly applicable to, for example, a half-duplex communication system. For example, in a half-duplex communication system that performs echo cancellation, the utterance of the receiving-side speaker does not reach the transmission side while the transmitting-side speaker is speaking, and the utterance of the transmitting-side speaker does not reach the reception side while the receiving-side speaker is speaking, so the conversation may not proceed smoothly. By applying the present invention, however, information indicating that the other speaker is speaking is displayed on the receiving side when the other speaker speaks, so that the conversation can proceed smoothly.

[0046] While the present invention has been described with reference to the embodiments, it goes without saying that the present invention is not limited to the above embodiments, and that various modifications and improvements can be made within the scope of the present invention.

Claims

The scope of the claims

[1] A communication method for transmitting and receiving image information and audio information between first and second communication devices via a communication transmission path, comprising:

a first transmission step of transmitting audio information addressed to the second communication device from the first communication device to the communication transmission path;

a generating step of generating, from the audio information, data indicating an utterance period of a speaker at the first communication device; and

a second transmission step of transmitting the data addressed to the second communication device from the first communication device to the communication transmission path in parallel with the first transmission step.

[2] The communication method according to claim 1, wherein in the first and second transmission steps, the transmission of the audio information and the transmission of the data are performed using different protocols.

[3] The communication method according to claim 1, wherein the first transmission step transmits the audio information through a first signal processing system that encodes the audio information, and the second transmission step transmits the data through a second signal processing system that, unlike the first signal processing system, generates the data.

[4] The communication method according to any one of claims 1 to 3, wherein the generating step generates the data based on a threshold value.

[5] The communication method according to claim 4, further comprising a setting step of variably setting the threshold value from the first communication device or the second communication device.

[6] The communication method according to claim 1, further comprising a display step of displaying the data on a display unit of the second communication device.

[7] The communication method according to claim 6, wherein the display step displays the data on the display unit together with the image information from the first communication device received via the communication transmission path.

[8] A communication method for transmitting and receiving image information and audio information between a first communication device and a second communication device via a communication transmission path, comprising:

a first reception step of receiving and reproducing, at the second communication device, the audio information from the first communication device via the communication transmission path;

a second reception step of receiving, at the second communication device via the communication transmission path and in parallel with the first reception step, data from the first communication device indicating the utterance period of the speaker at the first communication device; and

a display step of displaying the data on a display unit of the second communication device.

[9] The communication method according to claim 6, wherein the display step displays the data on the display unit together with the image information from the first communication device received via the communication transmission path.

[10] A communication device that transmits and receives image information and audio information via a communication transmission path, comprising:

first transmission means for transmitting audio information to the communication transmission path;

generating means for generating, from the audio information, data indicating the utterance period of the speaker at the communication device; and

second transmission means for transmitting the data to the communication transmission path in parallel with the first transmission means.

[11] The communication device according to claim 10, wherein the first and second transmission means transmit the audio information and the data using different protocols.

[12] The communication device according to claim 10, wherein the first transmission means includes a first signal processing system that encodes the audio information, and the second transmission means includes a second signal processing system that, unlike the first signal processing system, generates the data.

[13] The communication device according to any one of claims 10 to 12, wherein the generating means generates the data based on a threshold value.

[14] The communication device according to claim 13, further comprising setting means for variably setting the threshold value from the first communication device or the second communication device.

[15] The communication device according to any one of claims 10 to 14, further comprising display means for displaying data received via the communication transmission path.

[16] The communication device according to claim 15, wherein the display means displays the data together with the image information received via the communication transmission path.

[17] A communication device for transmitting and receiving image information and audio information via a communication transmission path, comprising:

first receiving means for receiving and reproducing the audio information via the communication transmission path;

second receiving means for receiving, in parallel with the first receiving means, data indicating the utterance period of the speaker on the transmission side via the communication transmission path; and

display means for displaying the data.

[18] The communication device according to claim 15, wherein the display means displays the data together with the image information received via the communication transmission path.

[19] A communication system for transmitting and receiving image information and audio information between first and second communication devices via a communication transmission path, comprising:

first transmission means for transmitting audio information addressed to the second communication device from the first communication device to the communication transmission path;

generating means for generating, from the audio information, data indicating a speaker's utterance period at the first communication device;

second transmission means for transmitting the data addressed to the second communication device from the first communication device to the communication transmission path in parallel with the first transmission means; and

display means for receiving the data from the first communication device via the communication transmission path and displaying the data on a display unit of the second communication device.

[20] The communication system according to claim 19, wherein the first and second transmission means transmit the audio information and the data using different protocols.

[21] A program that causes a computer to function as a communication device that transmits and receives image information and audio information via a communication transmission path, the program comprising:

a first transmission procedure for causing the computer to transmit audio information to the communication transmission path; and

a second transmission procedure for causing the computer to transmit the data to the communication transmission path in parallel with the first transmission procedure.

[22] A program that causes a computer to function as a communication device that transmits and receives image information and audio information via a communication transmission path, the program comprising:

a first reception procedure for causing the computer to receive and reproduce the audio information via the communication transmission path;

a second reception procedure for causing the computer to receive, in parallel with the first reception procedure, data indicating the utterance period of the speaker on the transmission side via the communication transmission path; and

a display procedure for causing the computer to display the data.