WO2018000953A1

WO2018000953A1 - Audio and video processing method, apparatus and microphone

Info

Publication number: WO2018000953A1
Application number: PCT/CN2017/083816
Authority: WO
Inventors: 丁鹏; 李靖
Original assignee: 中兴通讯股份有限公司
Priority date: 2016-06-29
Filing date: 2017-05-10
Publication date: 2018-01-04
Also published as: CN107547824A

Abstract

The present invention provides an audio and video processing method, an apparatus, and a microphone. The method comprises: a microphone receiving one or more channels of audio and video; the microphone synthesizing the one or more channels of video into one channel of video, and encoding the one channel of audio or an audio channel selected from the multiple channels of audio; and the microphone sending the synthesized video and the encoded audio to a video and audio device. The present invention solves the problem in the related art that a traditional video access device cannot meet requirements because its input source interfaces are limited, thereby making collaborative interaction more convenient.

Description

Audio and video processing method, device and microphone

Technical field

The present invention relates to the field of communications, and in particular to an audio and video processing method, apparatus, and microphone.

Background

With the development of video and audio technology and the spread of its applications, more and more video and audio products have appeared, such as wireless speakers, wireless microphones, high definition multimedia interface (HDMI) screen splitters, and conference television terminals. These products make communication easier. In the field of multimedia communication, however, these devices are used independently: mobile devices, wireless microphones, screen splitters, terminals, and so on are not used together, and the overall integration capability is relatively weak. For example, a screen splitter is a simple input device that outputs a single input channel to a display device. In multimedia communication fields such as conference television, there is more than one input source, multiple channels need to be synthesized, and the channels also need to be selectable. A traditional screen splitter accepts neither mobile devices nor data files as sources. A conference television terminal, by contrast, has many input devices, including audio sources in addition to video sources, and no product on the market combines audio, video, and multi-channel video synthesis. Video sources can be mobile devices, computers, data sources, and so on. With the development of multimedia communication, and of cloud conferencing in particular, convenient collaborative interaction will greatly enhance the competitiveness of products.

In the conference television field, most current microphones are pure audio collection devices, and video input is mostly handled by other dedicated audio and video equipment, such as traditional conference television terminals and set-top boxes. The input interfaces of these professional devices are limited and cannot meet the needs of scenarios that call for convenient collaborative interaction. Traditional video access processing is centered on conference television terminals, set-top boxes, and similar devices, which has the following drawbacks:

1) As video and audio access capabilities grow, the variety of access interfaces, such as serial digital interface (SDI), HDMI, Video Graphics Array (VGA), and DVI, leaves conference television equipment with numerous peripheral interfaces and complicated wiring.

2) As video and audio transmission access methods multiply, such as AirPlay, Digital Living Network Alliance (DLNA), Miracast, NFC, and Bluetooth, traditional conference television equipment requires hardware and software development, version production, and a long production cycle, so it is difficult to adopt the latest video and audio access methods.

3) The auxiliary stream of a traditional conference television can carry only one input channel. When many people take part in a discussion and several of them need access, switching the auxiliary stream is very troublesome.

No effective solution has yet been proposed for the problem that a conventional video access device in the related art cannot meet requirements because its input source interfaces are limited.

Summary of the invention

Embodiments of the invention provide an audio and video processing method, an apparatus, and a microphone, so as to at least solve the problem that a conventional video access device in the related art cannot meet requirements because its input source interfaces are limited.

According to an embodiment of the present invention, an audio and video processing method is provided, including: a microphone receiving one or more channels of audio and video; the microphone synthesizing the one or more channels of video into one channel of video, and encoding the one channel of audio or an audio channel selected from the multiple channels of audio; and the microphone sending the synthesized channel of video and the encoded audio to a video and audio device.

Preferably, before the microphone receives the one or more channels of audio and video, the method further includes: the microphone broadcasting its audio and video access capability externally by using a universal protocol, where the universal protocol includes DLNA, AirPlay wireless transmission, and Wi-Fi Display wireless display.

Preferably, the microphone receiving the one or more channels of audio and video comprises: receiving the one or more channels of audio and video through a physical port, a wireless local area network (WLAN), Bluetooth, or near field communication (NFC).

Preferably, before the microphone synthesizes the one or more channels of video into one channel of video, the method further includes: the microphone decoding the received one or more channels of video; and encoding the decoded one or more channels of video according to an encoding format negotiated in advance with the video and audio device, where the encoding format includes H263, H264, H265, Moving Picture Experts Group (MPEG), MP4, VP8, and VP9.

Preferably, the microphone synthesizing the one or more channels of video into one channel of video includes: the microphone receiving information on the selected input sources and the synthesis mode sent by the video and audio device; and selecting the corresponding one or more channels of video according to the information, and synthesizing the selected one or more channels of video into one channel of video using the corresponding synthesis mode.

Preferably, the synthesis mode includes one of the following: a "品"-shaped layout (one pane above two panes) and a left-right symmetric layout.

Preferably, before the microphone synthesizes the one or more channels of video into one channel of video, the method further includes: the microphone selecting, under the control of the video and audio device, the video to be played from the one or more channels of video.

Another aspect of the present invention provides an audio and video processing apparatus, including: a receiving module configured to receive one or more channels of audio and video; a synthesizing module configured to synthesize the one or more channels of video into one channel of video, and to encode the one channel of audio or an audio channel selected from the multiple channels of audio; and a sending module configured to send the synthesized video and the encoded audio to a video and audio device.

Preferably, the apparatus further includes: a broadcast module configured to broadcast the audio and video access capability externally by using a universal protocol, where the universal protocol includes Digital Living Network Alliance (DLNA), AirPlay wireless transmission, and Wi-Fi Display wireless display.

Preferably, the receiving module comprises: a receiving unit configured to receive the one or more channels of audio and video through a physical port, a wireless local area network (Wi-Fi), Bluetooth, or near field communication (NFC).

Preferably, the apparatus further includes: a decoding module configured to decode the received one or more channels of video; and an encoding module configured to encode the decoded one or more channels of video according to an encoding format negotiated in advance with the video and audio device, where the encoding format includes H263, H264, H265, MPEG, MP4, VP8, and VP9.

In an embodiment of the invention, a microphone is also provided, including the above device.

An embodiment of the present invention further provides a computer storage medium, which may store execution instructions for performing the audio and video processing method of the foregoing embodiment.

Through the embodiments of the present invention, the microphone receives one or more channels of audio and video; the microphone synthesizes the one or more channels of video into one channel of video, and encodes the one channel of audio or an audio channel selected from the multiple channels of audio; and the microphone sends the synthesized video and the encoded audio to the video and audio device. This solves the problem in the related art that a traditional video access device cannot meet requirements because its input source interfaces are limited, and makes collaborative interaction more convenient.

Brief description of the drawings

The drawings described herein are intended to provide a further understanding of the invention and form a part of the present application. In the drawings:

FIG. 1 is a flowchart of an audio and video processing method according to an embodiment of the present invention;

FIG. 2 is a block diagram of an audio and video processing apparatus according to an embodiment of the present invention;

FIG. 3 is a first block diagram of an audio and video processing apparatus according to a preferred embodiment of the present invention;

FIG. 4 is a second block diagram of an audio and video processing apparatus according to a preferred embodiment of the present invention;

FIG. 5 is a structural block diagram of a new type of microphone according to a preferred embodiment of the present invention;

FIG. 6 is a first schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 7 is a second schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 8 is a third schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 9 is a fourth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 10 is a fifth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 11 is a sixth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 12 is a seventh schematic diagram of audio and video access processing according to a preferred embodiment of the present invention;

FIG. 13 is an eighth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention.

Detailed description

The invention will be described in detail below with reference to the drawings in conjunction with the embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.

It is to be understood that the terms "first", "second", and the like in the specification and claims of the present invention are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence.

An audio and video processing method is provided in this embodiment. FIG. 1 is a flowchart of an audio and video processing method according to an embodiment of the present invention. As shown in FIG. 1, the process includes the following steps:

Step S102, the microphone receives one or more channels of audio and video;

Step S104, the microphone synthesizes the one or more channels of video into one channel of video, and encodes the one channel of audio or an audio channel selected from the multiple channels of audio;

Step S106, the microphone sends the synthesized video and the encoded audio to the video and audio device.

Through the above steps, the microphone receives one or more channels of audio and video; the microphone synthesizes the one or more channels of video into one channel of video, and encodes the one channel of audio or an audio channel selected from the multiple channels of audio; and the microphone sends the synthesized video and the encoded audio to the video and audio device. This solves the problem in the related art that a traditional video access device cannot meet requirements because its input source interfaces are limited, and makes collaborative interaction more convenient.
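The flow of steps S102 to S106 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the embodiment: the `Channel` structure, the function names, and the use of strings as stand-ins for video frames and audio samples are all hypothetical, and falling back to the first audio channel when none is selected is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Channel:
    """One incoming audio/video channel (hypothetical structure)."""
    name: str
    video: Optional[str] = None  # stand-in for decoded video frames
    audio: Optional[str] = None  # stand-in for raw audio samples


def synthesize_video(channels: List[Channel]) -> str:
    """Step S104, video part: merge every video input into one stream."""
    videos = [c.video for c in channels if c.video is not None]
    return "+".join(videos)


def encode_audio(channels: List[Channel], selected: Optional[str] = None) -> str:
    """Step S104, audio part: encode the only audio channel, or the selected one."""
    audios = {c.name: c.audio for c in channels if c.audio is not None}
    if not audios:
        return ""
    if selected is None:
        # Assumption: with no explicit choice, use the first audio channel received.
        selected = next(iter(audios))
    return f"enc({audios[selected]})"


def process(channels: List[Channel], selected_audio: Optional[str] = None) -> Tuple[str, str]:
    """Steps S102 to S106: take received channels, return what would be sent."""
    return synthesize_video(channels), encode_audio(channels, selected_audio)
```

For two notebooks that each supply video and audio, `process(channels, "B")` would combine both videos and encode only the audio of the channel named "B".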

So that other devices can discover that the microphone accepts connections, the microphone broadcasts its audio and video access capability through a universal protocol before receiving the one or more channels of audio and video. The universal protocol includes Digital Living Network Alliance (DLNA), AirPlay wireless transmission, and Wi-Fi Display wireless display; it should be noted that the universal protocol is not limited to these.

In an optional embodiment, the microphone receiving the one or more channels of audio and video may include: the microphone receiving the one or more channels of audio and video through a physical port, a wireless local area network (Wi-Fi), Bluetooth, or near field communication (NFC).

Preferably, before synthesizing the one or more channels of video into one channel of video, the microphone decodes the received one or more channels of video, and then encodes the decoded one or more channels of video according to the encoding format negotiated in advance with the video and audio device, where the encoding format includes H263, H264, H265, MPEG, MP4, VP8, VP9, and the like.
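The text does not say how the encoding format is negotiated. One simple possibility, shown here purely as an assumed sketch, is to pick the first format in the microphone's preference list that the video and audio device also supports. The function name `negotiate_format` and the preference order in `MICROPHONE_FORMATS` are hypothetical; only the format names come from the text.

```python
from typing import Optional, Sequence

# Format names taken from the text; the preference order is an assumption.
MICROPHONE_FORMATS = ["H265", "H264", "VP9", "VP8", "H263"]


def negotiate_format(local: Sequence[str], remote: Sequence[str]) -> Optional[str]:
    """Return the first locally preferred format the remote device also supports."""
    remote_set = {fmt.upper() for fmt in remote}
    for fmt in local:
        if fmt.upper() in remote_set:
            return fmt
    # No common format: the caller has to handle the failed negotiation.
    return None
```

With this rule, a terminal that offers only H264 and VP8 would lead to H264 being chosen, because H264 ranks higher in the assumed preference list.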

Preferably, synthesizing the one or more channels of video into one channel of video may include: the microphone receiving information on the selected input sources and the synthesis mode sent by the video and audio device; selecting the corresponding one or more channels of video according to the information; and synthesizing the selected one or more channels of video into one channel of video using the corresponding synthesis mode.

The above synthesis mode includes one of the following: a "品"-shaped layout (one pane above two panes) and a left-right symmetric layout. It should be noted that the synthesis mode is not limited to these two implementations.
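The two layouts named above can be illustrated by computing the rectangle each source would occupy in the output frame. The sketch below reads the first layout as a three-pane arrangement with one pane centered above two panes; that reading is an assumption, since the text does not define the geometry. The function names and the `(x, y, width, height)` convention are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in output pixels


def left_right_layout(width: int, height: int) -> List[Rect]:
    """Two panes side by side, each taking half of the output width."""
    half = width // 2
    return [(0, 0, half, height), (half, 0, width - half, height)]


def one_over_two_layout(width: int, height: int) -> List[Rect]:
    """Three panes: one centered in the top half, two side by side below."""
    half_w, half_h = width // 2, height // 2
    top = ((width - half_w) // 2, 0, half_w, half_h)
    bottom_left = (0, half_h, half_w, height - half_h)
    bottom_right = (half_w, half_h, width - half_w, height - half_h)
    return [top, bottom_left, bottom_right]
```

Each decoded source would then be scaled into its rectangle before the combined frame is encoded.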

To make video selection easier, the microphone can, under the control of the video and audio device, select the video to be played from the one or more channels of video, synthesize the selected video, and send it to the video and audio device for playback.

An embodiment of the present invention further provides an audio and video processing apparatus. FIG. 2 is a block diagram of an audio and video processing apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes:

The receiving module 22 is configured to receive one or more channels of audio and video;

The synthesizing module 24 is configured to synthesize the one or more channels of video into one channel of video, and to encode the one channel of audio or an audio channel selected from the multiple channels of audio;

The sending module 26 is configured to send the synthesized video and the encoded audio to the video and audio device.

FIG. 3 is a first block diagram of an audio and video processing apparatus according to a preferred embodiment of the present invention. As shown in FIG. 3, the apparatus further includes:

The broadcast module 32 is configured to broadcast the audio and video access capability externally through a universal protocol, where the universal protocol includes Digital Living Network Alliance (DLNA), AirPlay wireless transmission, and Wi-Fi Display wireless display.

FIG. 4 is a second block diagram of an audio and video processing apparatus according to a preferred embodiment of the present invention. As shown in FIG. 4, the apparatus further includes:

The decoding module 42 is configured to decode the received one or more channels of video;

The encoding module 44 is configured to encode the decoded one or more channels of video according to an encoding format negotiated in advance with the video and audio device, where the encoding format includes H263, H264, H265, MPEG, MP4, VP8, VP9, and the like.

Embodiments of the present invention also provide a microphone including the above device.

Embodiments of the present invention also provide a storage medium. Optionally, in the embodiment, the storage medium may be configured to store program code set to perform the following steps:

Step S1, the microphone receives one or more channels of audio and video;

Step S2, the microphone synthesizes the one or more channels of video into one channel of video, and encodes the one channel of audio or an audio channel selected from the multiple channels of audio;

Step S3, the microphone sends the synthesized video and the encoded audio to the video and audio device.

Optionally, in this embodiment, the foregoing storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.

Optionally, in the embodiment, the processor performs the above steps S1, S2 and S3 according to the stored program code in the storage medium.

Optionally, for specific examples of this embodiment, refer to the examples described in the foregoing embodiments and optional embodiments; details are not repeated here.

In the field of conference television, the user is usually closest to the sound collection device. The embodiments of the invention are therefore built around the audio collection device, namely the microphone, and add video access to the microphone to overcome the drawbacks of the current video and audio field. The technical solution is as follows:

An embodiment of the present invention provides a new type of microphone that supports video input and output. FIG. 5 is a structural block diagram of a new type of microphone according to a preferred embodiment of the present invention. As shown in FIG. 5, the microphone mainly includes the following modules:

The capability notification module 52 is configured to announce externally the microphone's ability to be accessed, so that external sources can connect to the device.

The video capture module 54 is configured to collect video data from sources connected through a physical interface, such as the common VGA, HDMI, or DVI interfaces.

The audio collection module 56 is the module through which the microphone captures sound.

The data receiving module 58 is configured to receive video and audio data through non-physical interfaces, in addition to the audio input through the physical interface. The data it handles includes data received over wireless interconnection protocols such as Wi-Fi, Miracast, Wi-Fi Display, AirPlay, and DLNA, as well as other data, such as audio, received over NFC, Bluetooth, and the like.

The media negotiation module 510 is responsible for negotiating with the remote device the media capabilities to be used between the two parties.

The media processing module 512 is configured to process the collected and received video and audio data, including superimposing or synthesizing the video data from multiple access channels and compressing and encoding it into data of the corresponding format.

The media sending module 514 is configured to send the superimposed or synthesized data to an external video and audio device, such as a conference television terminal, as needed. The superimposed or synthesized data may come from a single channel connected to the system, from all connected channels, or from several of them, as required.

The input source control module 516 is configured to receive control signaling sent by the video and audio device. The signaling selects among the video and audio sources collected by the new microphone: according to it, the microphone determines which video and audio sources to use and which synthesis mode to apply to the data sent to the video and audio device.

The method of applying the new type of microphone supporting video input and output according to the present invention includes the following.

The new type of microphone announces its own video and audio access capability through the capability notification module 52 by using a universal protocol. Universal protocols include, but are not limited to, DLNA, AirPlay, and Wi-Fi Display. Communication carriers include, but are not limited to, Wi-Fi, Bluetooth, and NFC.

If the external video source is a physical video signal, the external video source is connected directly to the new microphone and handled by the video capture module 54.

If the external source is a wireless video input source, such as a mobile phone or a tablet (PAD), the external video source searches for the new microphone through a universal protocol, and the new microphone accepts the wireless video source through the data receiving module 58. Universal protocols include, but are not limited to, DLNA, AirPlay, and Wi-Fi Display. Wireless methods include, but are not limited to, Wi-Fi, Bluetooth, and NFC.

If the external source is a wireless audio input source, such as music on a mobile phone, the external audio source searches for the new microphone through a universal protocol, and the new microphone accepts the wireless audio source through the data receiving module 58. Universal protocols include, but are not limited to, DLNA, AirPlay, and Wi-Fi Display. Wireless methods include, but are not limited to, Wi-Fi, Bluetooth, and NFC.

The media negotiation module 510 negotiates the video and audio codec format with the video and audio processing device to which the new microphone is to be connected, such as a conference television terminal.

The media processing module 512 processes the video and audio data collected and received by the system as follows. The collected physical video signal is decoded and then encoded according to the capability negotiated by the media negotiation module; the formats include, but are not limited to, H264, Moving Picture Experts Group (MPEG), and MP4. Received data that is neither video nor audio, such as file data, is decoded in the form of a folder or file and then encoded according to the negotiated capability; the encoding formats include, but are not limited to, H264, MPEG, and MP4. The video collected by physical and non-physical means is superimposed or synthesized into one channel of video; synthesis modes include, but are not limited to, layouts such as the "品"-shaped layout and the left-right symmetric layout. The physically captured audio, and the audio arriving over NFC or Bluetooth, is encoded as needed. The superimposed or synthesized video and the encoded audio data are transmitted to the external video and audio device through the media sending module 514.
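The routing described above, in which physically connected sources are handled by the video capture module 54 and wireless sources by the data receiving module 58, can be sketched as a small lookup. This is an illustrative assumption rather than the embodiment's implementation: the function name `route_source`, the returned labels, and the contents of the two sets (taken from the interfaces the text lists as examples) are hypothetical.

```python
# Interfaces listed as examples in the text; both sets are open-ended there.
PHYSICAL_INTERFACES = {"HDMI", "VGA", "DVI"}
WIRELESS_CARRIERS = {"WIFI", "BLUETOOTH", "NFC"}


def route_source(interface: str) -> str:
    """Return a label for the module that would handle a source on this interface."""
    key = interface.upper()
    if key in PHYSICAL_INTERFACES:
        # Physical video signals are collected directly by the capture module.
        return "video_capture_module_54"
    if key in WIRELESS_CARRIERS:
        # Wireless video or audio sources arrive through the receiving module.
        return "data_receiving_module_58"
    raise ValueError(f"unsupported interface: {interface}")
```

A notebook plugged in over HDMI would therefore be routed to module 54, while a phone streaming over Wi-Fi would be routed to module 58.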

The new microphone communicates with the video and audio device through the input source control module 516 and receives from it information on the selected input sources and the synthesis mode. According to this information, the new microphone selects the corresponding input sources, applies the corresponding synthesis mode, and sends the video and audio data to the video and audio device through the media sending module 514.
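How the microphone might act on the selection information can be sketched as follows. The message format is not specified in the text, so the `ControlMessage` structure, its field names, and the `apply_control` function are hypothetical. Treating a single selected source as needing no layout follows the examples, where one selected source is simply shown as is.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ControlMessage:
    """Hypothetical control signaling sent by the video and audio device."""
    sources: List[str]  # names of the input sources to show
    layout: str         # name of the requested synthesis mode


def apply_control(msg: ControlMessage, connected: Dict[str, str]) -> Dict[str, object]:
    """Pick the requested sources from those connected and record the layout."""
    missing = [s for s in msg.sources if s not in connected]
    if missing:
        raise KeyError(f"sources not connected: {missing}")
    # With a single source there is nothing to combine, so the layout is ignored.
    layout = msg.layout if len(msg.sources) > 1 else "single"
    return {"streams": [connected[s] for s in msg.sources], "layout": layout}
```

The returned streams and layout would then be handed to the media processing step for synthesis and encoding.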

Example one

Notebook A is connected to the conference television through the new microphone. FIG. 6 is a first schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 6, the process includes:

Step 1: The new microphone broadcasts its own video and audio access capability through the capability notification module 52.

Step 2: Notebook A connects to the new microphone in either of the following ways. With physical access, the connection can use HDMI, VGA, or a similar signal, and the new microphone collects the notebook's media signal through the video capture module 54. With wireless access, notebook A searches for the new microphone through a protocol such as Wi-Fi Display, DLNA, or AirPlay, communicates with the data receiving module 58 of the new microphone, and transmits the media data to the new microphone to complete the access.

Step 3: The new microphone negotiates the video and audio encoding format with the conference television terminal through the media negotiation module 510.

Step 4: The input source control module 516 obtains from the external conference television terminal which video source is to be selected and which synthesis mode is to be used. Since there is only one video source, notebook A is selected.

Step 5: The media processing module 512 performs encoding according to the synthesis mode, the selected video source, and the negotiated encoding format.

Step 6: The media sending module 514 sends the encoded media data to the conference television terminal.

Step 7: The user can view the processed video of the notebook through the output of the video and audio processing device.

Step 8: When the video selected by the user changes, the video source and synthesis mode selected through the input source control module 516 are applied, and the resulting video is sent to the conference television terminal.

Example two

Notebook A and notebook B are connected to the conference television through the new microphone. The process includes:

Step 1: The new microphone broadcasts its own video and audio access capability through the capability notification module 52.

Step 2: Notebook A connects to the new microphone in either of the following ways. With physical access, the connection can use HDMI, VGA, or a similar signal, and the new microphone collects the notebook's media signal through the video capture module 54. With wireless access, notebook A searches for the new microphone through a protocol such as Wi-Fi Display, DLNA, or AirPlay, communicates with the data receiving module 58 of the new microphone, and transmits the media data to the new microphone to complete the access.

Step 3: Notebook B connects to the new microphone in either of the following ways. With physical access, the connection can use HDMI, VGA, or a similar signal, and the new microphone collects the notebook's media signal through the video capture module 54. With wireless access, notebook B searches for the new microphone through a protocol such as Wi-Fi Display, DLNA, or AirPlay, communicates with the data receiving module 58 of the new microphone, and transmits the media data to the new microphone to complete the access.

Step 4: The new microphone negotiates the video and audio encoding format with the conference television terminal through the media negotiation module 510.

Step 5: The input source control module 516 obtains from the external conference television terminal which video sources are to be selected and which synthesis mode is to be used.

FIG. 7 is a second schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 7, notebook A and notebook B are selected at the same time. In the synthesized picture, notebook A and notebook B may be placed side by side or arranged one above the other; the picture is not limited to a specific layout.

FIG. 8 is a third schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 8, notebook A alone is selected. Since only one video source is selected, the synthesized picture is simply the content of notebook A.

FIG. 9 is a fourth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 9, notebook B alone is selected. Since only one video source is selected, the synthesized picture is simply the content of notebook B.

Step 7: The user can view the processed video through the output of the video and audio processing device.

Example three

Notebook A, notebook B, and notebook C are connected to the conference television through the new microphone. The process includes:

Step 3: Notebook C connects to the new microphone in either of the following ways. With physical access, the connection can use HDMI, VGA, or a similar signal, and the new microphone collects the notebook's media signal through the video capture module 54. With wireless access, notebook C searches for the new microphone through a protocol such as Wi-Fi Display, DLNA, or AirPlay, communicates with the data receiving module 58 of the new microphone, and transmits the media data to the new microphone to complete the access.

Step 4: The new microphone negotiates the video and audio encoding format with the conference television terminal through the media negotiation module 510.

FIG. 10 is a fifth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 10, notebook A, notebook B, and notebook C are selected at the same time. In the synthesized picture, notebook A, notebook B, and notebook C may each occupy one third; the picture is not limited to a specific layout.

FIG. 11 is a sixth schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 11, notebook A and notebook B are selected. In the synthesized picture, the contents of notebook A and notebook B may each occupy half; the picture is not limited to a specific layout.

FIG. 12 is a seventh schematic diagram of audio and video access processing according to a preferred embodiment of the present invention. As shown in FIG. 12, notebook C alone is selected. Since only one video source is selected, the synthesized picture is simply the content of notebook C. Any of the input sources can be selected in this way.

Step 6: The media processing module 512 performs encoding according to the synthesis mode, the selected video sources, and the negotiated encoding format.

Step 7: The media sending module 514 sends the encoded media data to the conference television terminal.

Step 8: The user can view the processed video through the output of the video and audio processing device.

Step 9: When the video selected by the user changes, the video sources and synthesis mode selected through the input source control module 516 are applied, and the resulting video is sent to the conference television terminal.
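The equal-share pictures described for FIG. 10 to FIG. 12, where each selected notebook takes one third, one half, or the whole frame, can be illustrated by splitting the output frame into equal columns. The column arrangement is an assumption, since the text states only the share of each source and not its position; the function name `equal_columns` is hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in output pixels


def equal_columns(n_sources: int, width: int, height: int) -> List[Rect]:
    """Split the output frame into n equal-width columns, left to right."""
    if n_sources < 1:
        raise ValueError("at least one source must be selected")
    # Integer column edges; rounding absorbs widths not divisible by n.
    edges = [i * width // n_sources for i in range(n_sources + 1)]
    return [(edges[i], 0, edges[i + 1] - edges[i], height) for i in range(n_sources)]
```

Calling it with three, two, or one source gives thirds, halves, or the full frame respectively.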

Example four

The notebook A and the notebook B, the notebook C, and the NFC/Bluetooth device are connected to the conference television through a new type of microphone. FIG. 13 is a schematic diagram of the audio video access processing according to a preferred embodiment of the present invention. As shown in FIG. 13, the method includes:

Step 1: Notebook A, notebook B, and notebook C access the new microphone in the same way as in the first, second, and third steps of the previous example.

Step 2: The media processing module 512 processes and encodes the video signals of notebook A, notebook B, and notebook C; the NFC/Bluetooth device transmits a file to the new microphone over NFC/Bluetooth, and the new microphone displays the contents of the folder.

Step 3: The new microphone and the conference television terminal negotiate the coding capability and the synthesis mode.

Step 4: The video content to be displayed from notebooks A, B, and C and the received file content are superimposed or synthesized, and then encoded, according to the result negotiated in Step 3.

Step 5: The new microphone negotiates the video and audio formats to be encoded with the conference television terminal through the media negotiation module 510.

Step 6: The media sending module 514 sends all the processed data to the conference television terminal.

Step 7: According to need, the user can use the input source control module 516 of the new microphone to view the content of notebook A, notebook B, notebook C, or the NFC/Bluetooth information individually, or to view the video content of notebook A, notebook B, and notebook C together with the content displayed from the NFC/Bluetooth information at the same time. It should be noted that the devices whose output is routed through the new microphone to the conference television device, including the NFC/Bluetooth device, are not limited to three, and the output is not limited to the conference television device; any video and audio device capable of output can be used.
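The negotiation of an encoding format between the new microphone and the conference television terminal can be sketched as follows. This is only an illustration under stated assumptions: the function `negotiate_format` is hypothetical, and the rule used here (pick the first format in the microphone's preference order that the terminal also supports) is an assumed policy, not one specified in this application. The format names are those listed in the claims.

```python
from typing import Iterable, Optional, Sequence


def negotiate_format(
    local_preference: Sequence[str],
    remote_supported: Iterable[str],
) -> Optional[str]:
    """Return the first locally preferred format that the remote side supports.

    Assumption: names are compared case-insensitively, and None is returned
    when the two sides share no format.
    """
    remote = {name.upper() for name in remote_supported}
    for name in local_preference:
        if name.upper() in remote:
            return name
    return None


# Hypothetical capability lists for the microphone and the terminal.
microphone_preference = ["H265", "H264", "VP9", "VP8", "H263"]
terminal_supported = ["h264", "VP8"]
chosen = negotiate_format(microphone_preference, terminal_supported)
```

With the hypothetical lists above, `chosen` is `"H264"`, since H265 is not supported by the terminal and H264 is the next preferred format.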

Compared with the prior art, the above embodiments make it easier to deliver video and audio content in video and audio communication and simplify line access, greatly enhancing the communication effect.

It will be apparent to those skilled in the art that the various modules or steps of the present invention described above can be implemented by a general-purpose computing device, and can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by the computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein, or they may be separately fabricated into individual integrated circuit modules, or a plurality of the modules or steps may be fabricated into a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.

The above description is only the preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and changes can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and scope of the present invention are intended to be included within the scope of protection of the present invention.

Industrial applicability

Claims

1. An audio and video processing method, comprising:

a microphone receiving one or more channels of audio and video;

the microphone synthesizing the one or more channels of video into one channel of video, and encoding the one channel of audio or an audio channel selected from the plurality of channels of audio;

the microphone sending the synthesized video and the encoded audio to an audio and video device.

2. The method of claim 1, wherein before the microphone receives the one or more channels of audio and video, the method further comprises:

the microphone broadcasting audio and video access capabilities through a universal protocol, wherein the universal protocol includes Digital Living Network Alliance (DLNA), AirPlay wireless transmission, and WIFI Display wireless display.

3. The method of claim 2, wherein the microphone receiving the one or more channels of audio and video comprises:

the microphone receiving the one or more channels of audio and video through a physical port, a wireless local area network (WIFI), Bluetooth, or near field communication (NFC).

4. The method of claim 1, wherein before the microphone synthesizes the one or more channels of video into one channel of video, the method further comprises:

the microphone decoding the received one or more channels of video;

the microphone encoding the decoded one or more channels of video according to an encoding format negotiated in advance with the video and audio device, wherein the encoding format includes H263, H264, H265, Moving Picture Experts Group (MPEG), MP4, VP8, and VP9.

5. The method of claim 4, wherein the microphone synthesizing the one or more channels of video into one channel of video comprises:

the microphone receiving input source selection and synthesis mode information sent by the video and audio device;

the microphone selecting one or more channels of video according to the information, and synthesizing the selected one or more channels of video into one channel of video according to the corresponding synthesis mode.

6. The method of claim 5, wherein the synthesis mode comprises one of the following: a font-shaped layout, and a left-right symmetric layout.

7. The method of any one of claims 1 to 6, wherein before the one or more channels of video are synthesized into one channel of video, the method further comprises:

the microphone selecting a video to be played from the one or more channels of video through the video and audio device.

8. An audio and video processing apparatus applied to a microphone, comprising:

a receiving module configured to receive one or more channels of audio and video;

a synthesis module configured to synthesize the one or more channels of video into one channel of video, and to encode the one channel of audio or an audio channel selected from the plurality of channels of audio;

a sending module configured to send the synthesized video and the encoded audio to an audio and video device.

9. The apparatus of claim 8, wherein the apparatus further comprises:

a broadcast module configured to broadcast audio and video access capabilities through a universal protocol, wherein the universal protocol includes Digital Living Network Alliance (DLNA), AirPlay wireless transmission, and WIFI Display wireless display.

10. The apparatus of claim 8, wherein the apparatus further comprises:

a decoding module configured to decode the received one or more channels of video;

an encoding module configured to encode the decoded one or more channels of video according to an encoding format negotiated in advance with the video and audio device, wherein the encoding format includes H263, H264, H265, Moving Picture Experts Group (MPEG), MP4, VP8, and VP9.

11. A microphone comprising the apparatus of any one of claims 8 to 10.