TW202325008A - Method for automatically switching on/off sound receiving channel of video conference and electronic device - Google Patents
Method for automatically switching on/off sound receiving channel of video conference and electronic device
- Publication number
- TW202325008A
- Authority
- TW
- Taiwan
- Prior art keywords
- participant
- video conference
- electronic device
- video
- video image
- Prior art date
Landscapes
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Description
The present invention relates to an automatic switching method and an electronic device, and in particular to a method and an electronic device for automatically switching the sound receiving channel of a video conference.
The demand for remote work has driven up the use of online meeting software. Company employees can hold work discussions in online meetings, which reduces face-to-face contact.
Because a participant's surroundings affect meeting quality, participants often have to switch the conference microphone between mute mode and sound receiving mode themselves. It often happens, however, that a participant who wants to speak forgets to switch the microphone to sound receiving mode, so the other participants cannot hear them; or a participant who is not speaking forgets to switch the microphone to mute mode, so the ambient sound is carried through the conference to the other participants.
The present invention relates to a method and an electronic device for automatically switching the sound receiving channel of a video conference. It determines from a participant's video image whether the participant is currently speaking, and turns the sound receiving channel of the video conference on or off accordingly. In addition, the invention uses a two-stage determination to help confirm whether the participant is the one currently speaking.
According to one aspect of the present invention, a method for automatically switching the sound receiving channel of a video conference is provided. The method includes the following steps. First, volume information of an electronic device is detected. If the volume information is greater than a threshold condition, a video image of a participant in the video conference on the electronic device is received. Next, whether the participant is in a speaking state is determined from the video image. If the participant is not in a speaking state, the sound receiving channel of the video conference is turned off.
According to another aspect of the present invention, an electronic device is provided. The electronic device includes a sound receiving unit, a processing unit, and a camera unit. The sound receiving unit detects volume information. The processing unit runs a video conference and receives the volume information. The camera unit captures a video image of a participant in the video conference. If the processing unit determines that the volume information is greater than a threshold condition, it receives the video image and determines from the video image whether the participant is in a speaking state. If the processing unit determines that the participant is not in a speaking state, it turns off the sound receiving channel of the video conference.
For a better understanding of the above and other aspects of the present invention, embodiments are described in detail below with reference to the accompanying drawings:
Embodiments of the present invention are described in detail below, with the drawings serving as illustrations. Beyond these detailed descriptions, the invention can also be practiced widely in other embodiments; any straightforward substitution, modification, or equivalent variation of the described embodiments falls within the scope of the invention, which is defined by the claims that follow. Many specific details and implementation examples are provided in the description to give the reader a more complete understanding of the invention; these specific details and examples should not, however, be regarded as limiting the invention. In addition, well-known steps or elements are not described in detail, to avoid unnecessarily limiting the invention.
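Please refer to FIG. 1, which is a block diagram of an electronic device 100 according to an embodiment of the present invention. The electronic device 100 is, for example, a notebook computer, a desktop computer, a tablet computer, or a smartphone. The electronic device 100 includes a sound receiving unit 110, a camera unit 120, a display unit 130, and a processing unit 140. The sound receiving unit 110 is a sound receiving element mounted on the electronic device 100, for example a microphone, which receives sounds in order to detect volume information. The camera unit 120 captures images and is, for example, a camera. The display unit 130 displays information and is, for example, a display panel. The processing unit 140 executes various processing procedures and is, for example, a circuit board, a chip, a circuit, a computer program product, or a computer-readable recording medium. The processing unit 140 includes a determination module 141, an image processing module 142, a recognition module 143, and a sound receiving channel switching module 144. The determination module 141 receives the volume information from the sound receiving unit 110 and performs a determination procedure on it. The image processing module 142 performs image processing on the images from the camera unit 120. The recognition module 143 uses an artificial intelligence algorithm to perform image recognition on the processed images. The sound receiving channel switching module 144 turns the sound receiving channel of the video conference on or off according to the recognition result of the recognition module 143.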
FIG. 2 is a flowchart of a method for automatically switching the sound receiving channel of a video conference according to an embodiment of the present invention. Referring to FIG. 1 and FIG. 2, in step S110 the electronic device 100 runs a video conference. The user can, for example, click a program on the electronic device 100 so that the processing unit 140 runs the video conference and displays the video conference screen on the display unit 130.
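In step S120, the sound receiving unit 110 detects the volume information of the electronic device 100. Then, in step S130, the determination module 141 determines whether the volume information is greater than the threshold condition. If the determination module 141 determines that the volume information is greater than the threshold condition, the flow proceeds to step S140; if it determines that the volume information is not greater than the threshold condition, the flow returns to step S120. Here, the determination module 141 uses the volume information from the sound receiving unit 110 to detect whether there is an input signal. For example, when the input signal exceeds a certain volume level, such as a threshold condition of 5% volume, it can be determined that someone is currently speaking (although it is not certain whether the speaker is the participant or another participant), or that the ambient sound is too loud.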
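The threshold check in steps S120 and S130 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not say how the volume information is measured, so the RMS level over normalized samples, the function names `rms_level` and `exceeds_threshold`, and the reading of "5% volume" as 0.05 of full scale are all assumptions.

```python
import math

# Assumption: "5% volume" is read as 5% of full scale on normalized samples.
VOLUME_THRESHOLD = 0.05


def rms_level(samples):
    """Return the RMS level of samples normalized to [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def exceeds_threshold(samples, threshold=VOLUME_THRESHOLD):
    """Step S130 (hypothetical form): True means proceed to step S140."""
    return rms_level(samples) > threshold
```

A result of `True` only says that some sound is present. As the text notes, it does not yet say whether the participant is the source, which is why the video check follows.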
In step S140, the processing unit 140 receives the video image of the participant in the video conference. Next, in step S150, the processing unit 140 determines from the video image whether the participant is in a speaking state. In one embodiment, the processing unit 140 can determine whether the participant is in a speaking state by detecting the participant's mouth shape changes or gestures in the video image. Specifically, the processing unit 140 can use an artificial intelligence algorithm to perform image recognition on the participant's mouth shape changes or gestures in the video image. For example, in step S140 the image processing module 142 receives the video image; then, in step S150, it first performs image processing on the video image, such as background removal, to extract a person-only picture of the participant. The recognition module 143 then uses the artificial intelligence algorithm to recognize mouth shape changes or gestures in this person-only picture and thereby determines whether the participant is in a speaking state.
The way the recognition module 143 is trained with an artificial intelligence algorithm is described further below.
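FIG. 3 is a flowchart of training with an artificial intelligence algorithm according to an embodiment of the present invention. Referring to FIG. 1 and FIG. 3, in step S210 the camera unit 120 first captures an image containing only the background. Next, in step S220, the camera unit 120 captures an image containing both the background and a person. Then, in step S230, the image processing module 142 extracts a person-only picture, for example through background removal. Steps S210 to S230 can then be repeated with different backgrounds and/or different person conditions to obtain multiple person-only pictures, which serve as the image training set for the recognition module 143. In step S240, the recognition module 143 is trained with an artificial intelligence algorithm, such as a convolutional neural network (CNN), a neural network structure (such as a Softmax function), and/or another suitable artificial intelligence algorithm.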
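Two pieces of this training flow lend themselves to a small sketch. The text does not specify how the background is removed, so the per-pixel differencing below, which compares the background-only image from step S210 with the image from step S220, is an assumed technique; `extract_person`, `tolerance`, and the grayscale nested-list image format are hypothetical. The `softmax` function is the standard definition, shown only to illustrate how class scores such as "speaking" and "not speaking" become probabilities.

```python
import math


def extract_person(background, frame, tolerance=10):
    """Keep pixels of `frame` that differ from `background`; zero the rest.

    Both images are equal-sized lists of rows of grayscale values (0-255).
    Assumption: simple differencing stands in for the unspecified
    background-removal processing.
    """
    return [
        [pix if abs(pix - bg) > tolerance else 0
         for bg, pix in zip(bg_row, frame_row)]
        for bg_row, frame_row in zip(background, frame)
    ]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

In a real pipeline the person-only pictures would feed a trained model such as a CNN; the sketch stops short of that because the text gives no network architecture.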
The different person conditions mentioned above may include the same person or different persons speaking with the mouth open, keeping the mouth closed without speaking, making an "OK" gesture while wearing a mask (indicating that the person is currently speaking), or making a "cross" gesture while wearing a mask (indicating that the person is not currently speaking). For example, FIGS. 4A to 4G show embodiments of different captured images IMGa~IMGg, in which FIGS. 4A to 4D and FIGS. 4E to 4G are images of the same person (such as participant H) captured against two different backgrounds. FIGS. 4A and 4E show the person speaking with the mouth open; after the image processing module 142 extracts the person-only picture from images IMGa and IMGe, the user labels them "start speaking" and feeds them to the recognition module 143 for training. FIGS. 4B and 4F show the person with the mouth closed and not speaking; after the image processing module 142 extracts the person-only picture from images IMGb and IMGf, the user labels them "stop speaking" and feeds them to the recognition module 143 for training. FIG. 4C shows the person making an "OK" gesture while wearing a mask; after the image processing module 142 extracts the person-only picture from image IMGc, the user labels it "start speaking" and feeds it to the recognition module 143 for training. FIGS. 4D and 4G show the person making a "cross" gesture while wearing a mask; after the image processing module 142 extracts the person-only picture from images IMGd and IMGg, the user labels them "stop speaking" and feeds them to the recognition module 143 for training. In this way, the trained recognition module 143 can determine whether a person (such as participant H) is speaking from the mouth shape change M (as in FIGS. 4A, 4B, 4E, and 4F) or the gesture G (as in FIGS. 4C, 4D, and 4G).
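The labeling described for FIGS. 4A to 4G can be written down as a small table. The sketch below only restates the assignments given in the text; the `Label` enum, the `TRAINING_SET` name, and the tuple layout are hypothetical, and the image identifiers stand in for the person-only pictures produced by the image processing module 142.

```python
from enum import Enum


class Label(Enum):
    START_SPEAKING = "start speaking"
    STOP_SPEAKING = "stop speaking"


# (image id, condition shown, label assigned by the user), as stated in the text.
TRAINING_SET = [
    ("IMGa", "mouth open", Label.START_SPEAKING),
    ("IMGb", "mouth closed", Label.STOP_SPEAKING),
    ("IMGc", "mask with OK gesture", Label.START_SPEAKING),
    ("IMGd", "mask with cross gesture", Label.STOP_SPEAKING),
    ("IMGe", "mouth open", Label.START_SPEAKING),
    ("IMGf", "mouth closed", Label.STOP_SPEAKING),
    ("IMGg", "mask with cross gesture", Label.STOP_SPEAKING),
]


def images_with(label):
    """Return the ids of all training images carrying `label`."""
    return [image_id for image_id, _, assigned in TRAINING_SET if assigned is label]
```

Mouth and gesture images share the same two labels, so a single classifier can cover both the unmasked and the masked case.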
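Referring to FIG. 1 and FIG. 2, when it is determined in step S150 that the participant is not in a speaking state, the flow proceeds to step S160, where the sound receiving channel switching module 144 automatically turns off the sound receiving channel of the video conference so that ambient sound is not carried through the video conference to the other participants. When it is determined that the participant is in a speaking state, the flow proceeds to step S170, where the sound receiving channel switching module 144 automatically turns on the sound receiving channel of the video conference, without the participant having to adjust it.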
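One pass through steps S130 to S170 of FIG. 2 can be sketched as a single function. This is an illustrative reading of the flowchart under stated assumptions: `next_channel_state` and its parameters are hypothetical names, and `is_speaking` stands in for the image processing and recognition of steps S140 and S150.

```python
def next_channel_state(volume, threshold, is_speaking, current_open):
    """Return whether the sound receiving channel should be open after one pass.

    volume       -- detected volume information (step S120)
    threshold    -- the threshold condition (step S130)
    is_speaking  -- callable returning True if the video image shows the
                    participant speaking (steps S140 and S150)
    current_open -- current state of the sound receiving channel
    """
    if volume <= threshold:
        # S130 "no": return to S120 and leave the channel unchanged.
        return current_open
    # S130 "yes": consult the video image, then S170 (open) or S160 (close).
    return bool(is_speaking())
```

The video check runs only when the volume exceeds the threshold, which matches the order of the flowchart and avoids image recognition while the room is quiet.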
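FIG. 5 shows an embodiment of a video conference VC. Referring to FIG. 1, FIG. 2, and FIG. 5, when the determination module 141 determines that the volume information is greater than the threshold condition, the camera unit 120 sends the video image of participant H to the image processing module 142 for image processing. The recognition module 143 performs image recognition on the processed image and determines from the mouth shape change M of participant H that participant H is not currently in a speaking state, so the volume information may come from sound in the participant's surroundings. The sound receiving channel switching module 144 then automatically turns off the sound receiving channel MIC of the video conference VC, muting it so that it does not disturb the video conference VC.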
In another embodiment, if participant H is wearing a mask and also forgets to make a gesture, so that the recognition module 143 can detect neither the mouth shape change M nor the gesture G of participant H, a prompt message can be displayed on the display unit 130 to remind participant H to make an appropriate gesture G.
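The fallback in this embodiment adds a third outcome to the open/close decision. The sketch below is one possible reading: `Action`, `decide`, and the string values for `gesture` are hypothetical, and giving a detected gesture priority over the mouth is an assumption the text does not address.

```python
from enum import Enum


class Action(Enum):
    OPEN_CHANNEL = "open"
    CLOSE_CHANNEL = "close"
    SHOW_PROMPT = "prompt"


def decide(mouth_visible, mouth_moving, gesture):
    """Map recognition results to an action.

    gesture is "ok", "cross", or None when no gesture was detected.
    Assumption: a detected gesture takes priority over the mouth.
    """
    if gesture == "ok":
        return Action.OPEN_CHANNEL
    if gesture == "cross":
        return Action.CLOSE_CHANNEL
    if mouth_visible:
        return Action.OPEN_CHANNEL if mouth_moving else Action.CLOSE_CHANNEL
    # Neither mouth shape change nor gesture could be detected (for example,
    # a mask with no gesture): remind the participant via the display unit.
    return Action.SHOW_PROMPT
```

For instance, `decide(False, False, None)` yields `Action.SHOW_PROMPT`, the case in which the display unit 130 would show the reminder.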
The method and electronic device proposed by the present invention first detect the volume information of the electronic device and, only when the volume information is greater than a preset threshold condition, go on to determine from the participant's video image whether the participant is the source of the volume information. If so, the sound receiving channel of the video conference is automatically turned on; if not, it is automatically turned off, which improves the user's video conference experience.
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the invention. A person having ordinary skill in the art to which the invention pertains may make various changes and refinements without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the appended claims.
- 100: electronic device
- 110: sound receiving unit
- 120: camera unit
- 130: display unit
- 140: processing unit
- 141: determination module
- 142: image processing module
- 143: recognition module
- 144: sound receiving channel switching module
- VC: video conference
- G: gesture
- H: participant
- IMGa~IMGg: images
- M: mouth shape change
- MIC: sound receiving channel
- S110~S170, S210~S240: steps
- FIG. 1 is a block diagram of an electronic device according to an embodiment of the present invention;
- FIG. 2 is a flowchart of a method for automatically switching the sound receiving channel of a video conference according to an embodiment of the present invention;
- FIG. 3 is a flowchart of training with an artificial intelligence algorithm according to an embodiment of the present invention;
- FIGS. 4A to 4G show embodiments of different captured images; and
- FIG. 5 shows an embodiment of a video conference.
S110~S170: steps
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW110146730A TW202325008A (en) | 2021-12-14 | 2021-12-14 | Method for automatically switching on/off sound receiving channel of video conference and electronic device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW110146730A TW202325008A (en) | 2021-12-14 | 2021-12-14 | Method for automatically switching on/off sound receiving channel of video conference and electronic device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| TW202325008A true TW202325008A (en) | 2023-06-16 |
Family
ID=87803677
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW110146730A TW202325008A (en) | 2021-12-14 | 2021-12-14 | Method for automatically switching on/off sound receiving channel of video conference and electronic device |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TW202325008A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TWI866802B (en) * | 2024-03-04 | 2024-12-11 | 弘真科技股份有限公司 | Intelligent assisting system for video equipment |
-
2021
- 2021-12-14 TW TW110146730A patent/TW202325008A/en unknown
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US7564476B1 (en) | Prevent video calls based on appearance | |
| US7907165B2 (en) | Speaker predicting apparatus, speaker predicting method, and program product for predicting speaker | |
| CN113676693B (en) | Picture presentation method, video conference system, and readable storage medium | |
| US10586131B2 (en) | Multimedia conferencing system for determining participant engagement | |
| US11405584B1 (en) | Smart audio muting in a videoconferencing system | |
| EP2993860B1 (en) | Method, apparatus, and system for presenting communication information in video communication | |
| WO2016177262A1 (en) | Collaboration method for intelligent conference and conference terminal | |
| JP2012054897A (en) | Conference system, information processing apparatus, and information processing method | |
| CN113473061B (en) | Method and electronic device for video calling | |
| CN113473066A (en) | Video conference picture adjusting method | |
| TW202018649A (en) | Asymmetric video conferencing system and method thereof | |
| TW202325008A (en) | Method for automatically switching on/off sound receiving channel of video conference and electronic device | |
| US10469800B2 (en) | Always-on telepresence device | |
| CN113301291B (en) | Anti-interference method, system, equipment and storage medium in network video conference | |
| TWI860917B (en) | Multi-camera people matching and selection method | |
| US12386581B2 (en) | Videoconference automatic mute control system | |
| CN106454494A (en) | Multimedia information processing method and system, multimedia device and terminal device | |
| CN106303714A (en) | The control method of multimedia equipment, device and terminal unit | |
| CN114040145B (en) | Video conference portrait display method, system, terminal and storage medium | |
| CN112995565A (en) | Camera adjusting method of display device, display device and storage medium | |
| CN111182256A (en) | An information processing method and server | |
| CN109472225A (en) | Conference control method and device | |
| CN104539873B (en) | Tele-conferencing system and the method for carrying out teleconference | |
| TW202332278A (en) | Video processing method and associated system on chip | |
| TWI857326B (en) | Video processing method for performing partial highlighting with aid of auxiliary information detection, and associated system on chip |