CN114095769B - Live broadcast low-delay processing method of application-level player and display device
- Publication number: CN114095769B (application number CN202010857879.7A)
- Authority: CN (China)
- Prior art keywords: audio data, decoded, audio, decoding, data
- Prior art date
- Legal status: Active (the legal status is an assumption and not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
- H04N21/4331—Caching operations, e.g. of an advertisement for later insertion during playback
- H04N21/4335—Housekeeping operations, e.g. prioritizing content for deletion because of storage space restrictions
- H04N21/4341—Demultiplexing of audio and video streams
- H04N21/4398—Processing of audio elementary streams involving reformatting operations of audio signals
- H04N21/440218—Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
Abstract
The application discloses a live broadcast low-delay processing method for an application-level player, and a display device. When the amount of pre-decoding audio data stored in the audio data cache queue exceeds a threshold and a discard identifier is recognized, the application-level player discards the pre-decoding audio data stored in the queue; the decoded audio data and decoded video data are then synchronized to obtain new live broadcast data for frame-loss playback. When the discard identifier is not recognized, the decoded audio data is resampled to reset the audio play timestamp, and the decoded video data is synchronized with the decoded audio data carrying the reset audio play timestamp to realize double-speed playback. In this way, the application-level player monitors the audio buffer size after demultiplexing and, when it exceeds the threshold, modifies the audio-video synchronization mechanism so that playback jumps to the current latest frame data through frame loss or catches up through double-speed playback, reducing the delay produced during live broadcast.
Description
Technical Field
The application relates to the technical field of live video broadcasting, and in particular to a live broadcast low-delay processing method for an application-level player, and a display device.
Background
Display devices include smart televisions, dual-screen laser televisions, smart set-top boxes, smart boxes, products with smart display screens, and the like. With the rapid development of display devices, their functions are becoming increasingly rich and their performance increasingly powerful. For example, a display device can implement functions that require real-time interaction with a user, such as classroom live broadcasting, mobile phone screen casting, video conferencing, and small-class teaching. These functions are typically implemented using an application-level player configured on the display device, such as ijkplayer, GStreamer, NuPlayer, ExoPlayer, or VLC player.
When live broadcasting is implemented using the display device, the configured application-level player is prone to long start-up times in the playback stage owing to the network playing protocol, and if the network fluctuates, the live stream also tends to suffer frequent stalling and screen artifacts. These factors cause delay in live broadcast, where the delay manifests as a large time difference between the network live stream source and the picture the user is watching.
Typically, to ensure smooth playback, a certain amount of live stream delay is tolerated. However, in scenarios with strict low-latency requirements, a high-latency application-level player seriously degrades the user experience.
Disclosure of Invention
The application provides a live broadcast low-delay processing method for an application-level player, and a display device, to solve the problems of long start-up times during live broadcast and large delay caused by network fluctuation.
In a first aspect, the present application provides a display apparatus comprising:
A controller within which is configured an application level player for live broadcast, the application level player configured to:
Acquiring live broadcast data generated during live broadcast, and performing decapsulation processing on the live broadcast data to obtain audio data before decoding and video data before decoding;
Storing the pre-decoding audio data to an audio data buffer queue, and storing the pre-decoding video data to a video data buffer queue;
If the storage amount of the pre-decoding audio data stored in the audio data cache queue exceeds a threshold, identifying whether a discard identifier exists, where the discard identifier characterizes whether the audio data needs to be discarded;
If the discard identifier exists, discarding the pre-decoding audio data stored in the audio data cache queue and emptying the queue;
Acquiring the pre-decoding audio data newly stored in the emptied audio data cache queue, and decoding the newly stored pre-decoding audio data together with the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
And synchronously processing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
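To make the flow above concrete, the following is a minimal C++ sketch of the threshold check and the two branches. All names (Packet, PacketQueue, handleAudioBacklog) and the threshold value are illustrative assumptions, not the patent's actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Compressed (pre-decoding) packet: payload plus presentation timestamp.
struct Packet {
    std::vector<uint8_t> data;
    double pts;
};

// Thread-safe queue tracking how many undecoded bytes are buffered.
struct PacketQueue {
    std::deque<Packet> packets;
    std::mutex mtx;
    std::size_t bytes = 0;

    void push(Packet p) {
        std::lock_guard<std::mutex> lock(mtx);
        bytes += p.data.size();
        packets.push_back(std::move(p));
    }
    void clear() {
        std::lock_guard<std::mutex> lock(mtx);
        packets.clear();
        bytes = 0;
    }
};

// Assumed threshold: this much buffered, undecoded audio indicates latency.
constexpr std::size_t kAudioBytesThreshold = 512 * 1024;

// Branch taken after demultiplexing has stored packets into the audio queue.
void handleAudioBacklog(PacketQueue& audioQ, bool discardFlag) {
    if (audioQ.bytes <= kAudioBytesThreshold) {
        return;  // normal path: decode and synchronize as usual
    }
    if (discardFlag) {
        // Frame-loss path: drop the buffered pre-decoding audio and empty the
        // queue, so decoding resumes from the newest packets that arrive.
        audioQ.clear();
    } else {
        // Double-speed path: keep the data; the decoded audio is resampled
        // and its play timestamps are reset so playback catches up.
    }
}
```

Detecting backlog on the audio queue rather than the video queue reflects that audio is usually the synchronization master: once the audio jumps forward, the video is brought along by the synchronization step described next.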
In some embodiments of the application, the application level player is further configured to:
if the storage amount of the pre-decoding audio data stored in the audio data cache queue does not exceed the threshold value, decoding the pre-decoding audio data stored in the audio data cache queue and the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
And synchronously processing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
In some embodiments of the present application, the application level player, when performing the synchronization processing of the decoded audio data and the decoded video data, is further configured to:
Acquiring the audio timestamp of the first frame of pre-decoding audio data newly stored in the audio data cache queue;
determining, based on that audio timestamp, a video discard timestamp for discarding video data;
and acquiring the specified decoded video data corresponding to the video discard timestamp in the video data cache queue and discarding it, so as to realize synchronous playing of the decoded audio data and the decoded video data.
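A minimal sketch of this frame-loss synchronization step, under the same illustrative assumptions as above; the video discard timestamp is taken to be the audio timestamp of the first newly stored audio frame.

```cpp
#include <deque>

// Decoded video frame with its presentation timestamp (other fields omitted).
struct VideoFrame {
    double pts;
};

// Drop decoded video frames older than the audio timestamp of the first
// frame of pre-decoding audio data stored after the flush; the remaining
// video then lines up with the newest audio.
void dropVideoBefore(std::deque<VideoFrame>& frames, double videoDiscardPts) {
    while (!frames.empty() && frames.front().pts < videoDiscardPts) {
        frames.pop_front();
    }
}
```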
In some embodiments of the application, the application level player is further configured to:
If the discard identifier does not exist, decoding the pre-decoding audio data stored in the audio data cache queue and the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
Acquiring, from the decoded audio data, the specified decoded audio data corresponding to the portion of the storage amount that exceeds the threshold;
Resetting the audio play timestamp corresponding to the specified decoded audio data;
And synchronizing the specified decoded audio data, played according to the reset audio play timestamp, with the decoded video data to obtain new live broadcast data for playing.
In some embodiments of the present application, when resetting the audio play timestamp corresponding to the specified decoded audio data, the application level player is further configured to:
Acquiring the audio play timestamp and corresponding play duration of the specified decoded audio data;
adjusting the play duration according to a preset rule;
and adjusting the audio play timestamp based on the adjusted play duration.
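As a sketch of this timestamp reset, assuming a 2x catch-up factor as the "preset rule" (the patent does not fix a value); the types and names are illustrative only, and the resampling of the decoded samples themselves is omitted:

```cpp
#include <vector>

// Assumed catch-up factor standing in for the patent's "preset rule".
constexpr double kSpeed = 2.0;

struct AudioFrame {
    double pts;       // audio play timestamp
    double duration;  // play duration of this frame
};

// Shrink each frame's play duration and re-accumulate timestamps: the same
// samples occupy less wall-clock time, so playback catches up to live.
void resetAudioTimestamps(std::vector<AudioFrame>& frames, double startPts) {
    double t = startPts;
    for (AudioFrame& f : frames) {
        f.duration /= kSpeed;  // adjust play duration by the preset rule
        f.pts = t;             // adjust the play timestamp accordingly
        t += f.duration;
    }
}
```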
In some embodiments of the present application, after synchronizing the specified decoded audio data, played according to the reset audio play timestamp, with the decoded video data to obtain new live broadcast data for playing, the application level player is further configured to:
Acquiring the video play timestamp of the decoded video data stored in the video data cache queue;
adjusting the video play timestamp based on the reset rule of the audio play timestamp, where the adjusted video play timestamp is the same as the reset audio play timestamp;
Playing the corresponding specified decoded audio data according to the reset audio play timestamp, and playing the corresponding decoded video data according to the adjusted video play timestamp, so as to realize synchronous playing of the decoded audio data and the decoded video data.
In some embodiments of the application, when adjusting the video play timestamp based on the reset rule of the audio play timestamp, the application level player is further configured to:
acquiring the audio play timestamp and the corresponding play duration;
determining the corresponding video play timestamp based on the audio play timestamp;
Determining the adjusted play duration of the decoded video data based on the play duration adjusted according to the preset rule;
And adjusting the video play timestamp based on the adjusted video play duration.
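Under the same assumed 2x rule, a matching remap for the video play timestamp might look like the helper below; remapVideoPts is an illustrative name, not part of any player API.

```cpp
// Remap a video play timestamp using the same rule applied to the audio:
// elapsed time since the catch-up start is compressed by the speed factor,
// so the adjusted video timestamp equals the reset audio timestamp of the
// same original instant.
double remapVideoPts(double pts, double startPts, double speed = 2.0) {
    return startPts + (pts - startPts) / speed;
}
```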
In some embodiments of the application, the application level player is further configured to:
and determining, at a preset time interval, whether the storage amount of the pre-decoding audio data stored in the audio data cache queue exceeds the threshold; a sketch of such a periodic check follows.
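The sketch below reuses the PacketQueue and handleAudioBacklog helpers from the earlier sketch; the 500 ms interval is an assumption, since the patent does not specify the preset time interval.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Poll the audio queue at a preset interval and run the threshold check.
void watchAudioBacklog(PacketQueue& audioQ, bool discardFlag,
                       std::atomic<bool>& running) {
    using namespace std::chrono_literals;
    while (running.load()) {
        std::this_thread::sleep_for(500ms);       // assumed preset interval
        handleAudioBacklog(audioQ, discardFlag);  // threshold + identifier check
    }
}
```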
In some embodiments of the application, the application level player is further configured to:
Invoking the player's upper-layer interface to set a low-delay identifier characterizing whether the low-delay function is turned on, and a discard identifier characterizing whether the audio data needs to be discarded.
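These two identifiers might be surfaced to the application as a small options structure such as the one below; the struct and field names are illustrative assumptions, not the patent's or any real player's interface.

```cpp
// Options an application might pass through the player's upper-layer interface.
struct LiveLowDelayOptions {
    bool lowDelay = true;   // low-delay identifier: enables the mechanism
    bool dropAudio = true;  // discard identifier: true  -> frame-loss playback
                            //                     false -> double-speed playback
};
```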
In a second aspect, the present application further provides a method for processing live broadcast low latency of an application level player, where the method includes:
Acquiring live broadcast data generated during live broadcast, and performing decapsulation processing on the live broadcast data to obtain audio data before decoding and video data before decoding;
Storing the pre-decoding audio data to an audio data buffer queue, and storing the pre-decoding video data to a video data buffer queue;
If the storage amount of the pre-decoding audio data stored in the audio data cache queue exceeds a threshold, identifying whether a discard identifier exists, where the discard identifier characterizes whether the audio data needs to be discarded;
If the discard identifier exists, discarding the pre-decoding audio data stored in the audio data cache queue and emptying the queue;
Acquiring the pre-decoding audio data newly stored in the emptied audio data cache queue, and decoding the newly stored pre-decoding audio data together with the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
And synchronously processing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
In a third aspect, the present application further provides a storage medium storing a program which, when executed, can implement some or all of the steps in the embodiments of the live broadcast low-latency processing method for an application-level player provided by the present application.
As can be seen from the above technical solution, in the live broadcast low-delay processing method and display device for an application-level player provided by the embodiments of the present invention, the configured application-level player stores the pre-decoding audio data obtained by decapsulating the live broadcast data in an audio data cache queue, and stores the pre-decoding video data in a video data cache queue. When the storage amount of the pre-decoding audio data stored in the audio data cache queue exceeds a threshold and a discard identifier is recognized, the pre-decoding audio data stored in the audio data cache queue is discarded and the queue is emptied; the newly stored pre-decoding audio data and the pre-decoding video data stored in the video data cache queue are then decoded, and the resulting decoded audio data and decoded video data are synchronized to obtain new live broadcast data for playing. When the discard identifier is not recognized, the decoded audio data is resampled to reset the audio play timestamp, and the decoded video data is synchronized with the decoded audio data carrying the reset audio play timestamp to realize double-speed playback. Thus, in the method and display device provided by the embodiments of the present invention, the application-level player monitors the audio buffer size after demultiplexing and, when it exceeds the threshold, modifies the audio-video synchronization mechanism so that playback jumps directly to the current latest frame data (frame-loss playback) or catches up through double-speed playback, reducing the start-up time and the large delay caused by network fluctuation during live broadcast.
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
A schematic diagram of an operational scenario between a display device and a control apparatus according to some embodiments is schematically shown in fig. 1;
a hardware configuration block diagram of a display device 200 according to some embodiments is exemplarily shown in fig. 2;
A hardware configuration block diagram of the control device 100 according to some embodiments is exemplarily shown in fig. 3;
a schematic diagram of the software configuration in a display device 200 according to some embodiments is exemplarily shown in fig. 4;
An icon control interface display schematic of an application in a display device 200 according to some embodiments is illustrated in fig. 5;
A flowchart of a live low latency processing method for an application level player according to some embodiments is illustrated in fig. 6;
A data flow diagram of a live low latency processing method for an application level player according to some embodiments is illustrated in fig. 7;
A method flow diagram of a synchronization process according to some embodiments is illustrated in fig. 8;
a method flow diagram for using double-speed playback according to some embodiments is illustrated in fig. 9;
A method flow diagram for resetting an audio playback time stamp according to some embodiments is illustrated in fig. 10.
Detailed Description
For the purposes of making the objects, embodiments and advantages of the present application more apparent, an exemplary embodiment of the present application will be described more fully hereinafter with reference to the accompanying drawings in which exemplary embodiments of the application are shown, it being understood that the exemplary embodiments described are merely some, but not all, of the examples of the application.
Based on the exemplary embodiments described herein, all other embodiments that may be obtained by one of ordinary skill in the art without making any inventive effort are within the scope of the appended claims. Furthermore, while the present disclosure has been described in terms of an exemplary embodiment or embodiments, it should be understood that each aspect of the disclosure can be practiced separately from the other aspects.
It should be noted that the brief description of the terminology in the present application is for the purpose of facilitating understanding of the embodiments described below only and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.
The terms "first", "second", "third" and the like in the description, in the claims, and in the above figures are used to distinguish between similar objects or entities and do not necessarily describe a particular sequence or chronological order, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments of the application are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" as used in this disclosure refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
The term "remote control" as used herein refers to a component of an electronic device (such as the display device disclosed herein) that can control the electronic device wirelessly, typically over a relatively short distance. The remote control typically connects to the electronic device using infrared and/or radio frequency (RF) signals and/or Bluetooth, and may also include functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors. For example, a hand-held touch remote control replaces most of the physical built-in hard keys of a general remote control device with a touch-screen user interface.
The term "gesture" as used herein refers to a user behavior by which a user expresses an intended idea, action, purpose, and/or result through a change in hand shape or movement of a hand, etc.
A schematic diagram of an operational scenario between a display device and a control apparatus according to some embodiments is schematically shown in fig. 1. As shown in fig. 1, a user may operate the display apparatus 200 through the mobile terminal 300 and the control device 100.
In some embodiments, the control apparatus 100 may be a remote controller, and the communication between the remote controller and the display device includes infrared protocol communication or bluetooth protocol communication, and other short-range communication modes, etc., and the display device 200 is controlled by a wireless or other wired mode. The user may control the display device 200 by inputting user instructions through keys on a remote control, voice input, control panel input, etc. Such as: the user can input corresponding control instructions through volume up-down keys, channel control keys, up/down/left/right movement keys, voice input keys, menu keys, on-off keys, etc. on the remote controller to realize the functions of the control display device 200.
In some embodiments, mobile terminals, tablet computers, notebook computers, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device. The application program, by configuration, can provide various controls to the user in an intuitive User Interface (UI) on a screen associated with the smart device.
In some embodiments, the mobile terminal 300 may install a software application with the display device 200, implement connection communication through a network communication protocol, and achieve the purpose of one-to-one control operation and data communication. Such as: it is possible to implement a control command protocol established between the mobile terminal 300 and the display device 200, synchronize a remote control keyboard to the mobile terminal 300, and implement a function of controlling the display device 200 by controlling a user interface on the mobile terminal 300. The audio/video content displayed on the mobile terminal 300 can also be transmitted to the display device 200, so as to realize the synchronous display function.
As also shown in fig. 1, the display device 200 is also in data communication with the server 400 via a variety of communication means. The display device 200 may be permitted to make communication connections via a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. By way of example, display device 200 receives software program updates, or accesses a remotely stored digital media library by sending and receiving information, as well as Electronic Program Guide (EPG) interactions. The server 400 may be a cluster, or may be multiple clusters, and may include one or more types of servers. Other web service content such as video on demand and advertising services are provided through the server 400.
The display device 200 may be a liquid crystal display, an OLED display, a projection display device. The particular display device type, size, resolution, etc. are not limited, and those skilled in the art will appreciate that the display device 200 may be modified in performance and configuration as desired.
The display apparatus 200 may additionally provide a smart network television function of a computer support function, including, but not limited to, a network television, a smart television, an Internet Protocol Television (IPTV), etc., in addition to the broadcast receiving television function.
A hardware configuration block diagram of a display device 200 according to some embodiments is illustrated in fig. 2.
In some embodiments, at least one of the controller 250, the modem 210, the communicator 220, the detector 230, the input/output interface 255, the display 275, the audio output interface 285, the memory 260, the power supply 290, the user interface 265, and the external device interface 240 is included in the display apparatus 200.
In some embodiments, the display 275 is configured to receive image signals from the first processor output, and to display video content and images and components of the menu manipulation interface.
In some embodiments, display 275 includes a display screen assembly for presenting pictures, and a drive assembly for driving the display of images.
In some embodiments, the displayed video content may come from broadcast television content, or from various broadcast signals received via wired or wireless communication protocols; alternatively, various image content received from a network server via network communication protocols may be displayed.
In some embodiments, the display 275 is used to present a user-manipulated UI interface generated in the display device 200 and used to control the display device 200.
In some embodiments, depending on the type of display 275, a drive assembly for driving the display is also included.
In some embodiments, display 275 is a projection display and may further include a projection device and a projection screen.
In some embodiments, communicator 220 is a component for communicating with external devices or external servers according to various communication protocol types. For example: the communicator 220 may include at least one of a Wifi module 221, a bluetooth module 222, a wired ethernet module 223, or other network communication protocol module or a near field communication protocol module, and an infrared receiver.
In some embodiments, the display device 200 may establish control signal and data signal transmission and reception between the communicator 220 and the external control device 100 or the content providing device.
In some embodiments, the user interface 265 may be used to receive infrared control signals from the control device 100 (e.g., an infrared remote control, etc.).
In some embodiments, the detector 230 is a component the display device 200 uses to capture signals from the external environment or to interact with it.
In some embodiments, the detector 230 includes an optical receiver, a sensor for capturing the intensity of ambient light, so that display parameters can be changed adaptively according to the ambient light, and so on.
In some embodiments, the detector 230 may further include an image collector 232, such as a camera, a video camera, etc., which may be used to collect external environmental scenes, collect attributes of a user or interact with a user, adaptively change display parameters, and recognize a user gesture to implement a function of interaction with the user.
In some embodiments, the detector 230 may also include a temperature sensor or the like, such as by sensing ambient temperature.
In some embodiments, the display device 200 may adaptively adjust the display color temperature of the image. For example, when the ambient temperature is high, the display device 200 can be adjusted to display the image in a colder color temperature; when the temperature is low, it can be adjusted to display the image in a warmer color tone.
In some embodiments, the detector 230 also includes a sound collector 231, such as a microphone, that may be used to receive the user's voice. Illustratively, it receives a voice signal containing a control instruction by which the user controls the display device 200, or collects environmental sound to recognize the environmental scene type so that the display device 200 can adapt to the environmental noise.
In some embodiments, as shown in fig. 2, the input/output interface 255 is configured to enable data transfer between the controller 250 and external other devices or other controllers 250. Such as receiving video signal data and audio signal data of an external device, command instruction data, or the like.
In some embodiments, external device interface 240 may include, but is not limited to, the following: any one or more interfaces of a high definition multimedia interface HDMI interface, an analog or data high definition component input interface, a composite video input interface, a USB input interface, an RGB port, and the like can be used. The plurality of interfaces may form a composite input/output interface.
In some embodiments, as shown in fig. 2, the modem 210 is configured to receive the broadcast television signal by a wired or wireless receiving manner, and may perform modulation and demodulation processes such as amplification, mixing, and resonance, and demodulate the audio/video signal from a plurality of wireless or wired broadcast television signals, where the audio/video signal may include a television audio/video signal carried in a television channel frequency selected by a user, and an EPG data signal.
In some embodiments, the frequency point demodulated by the modem 210 is controlled by the controller 250, and the controller 250 may send a control signal according to the user selection, so that the modem responds to the television signal frequency selected by the user and modulates and demodulates the television signal carried by the frequency.
In some embodiments, the broadcast television signal may be classified into a terrestrial broadcast signal, a cable broadcast signal, a satellite broadcast signal, an internet broadcast signal, or the like according to a broadcasting system of the television signal. Or may be differentiated into digital modulation signals, analog modulation signals, etc., depending on the type of modulation. Or it may be classified into digital signals, analog signals, etc. according to the kind of signals.
In some embodiments, the controller 250 and the modem 210 may be located in separate devices, i.e., the modem 210 may also be located in an external device to the main device in which the controller 250 is located, such as an external set-top box or the like. In this way, the set-top box outputs the television audio and video signals modulated and demodulated by the received broadcast television signals to the main body equipment, and the main body equipment receives the audio and video signals through the first input/output interface.
In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored on the memory. The controller 250 may control the overall operation of the display apparatus 200. For example: in response to receiving a user command to select to display a UI object on the display 275, the controller 250 may perform an operation related to the object selected by the user command.
In some embodiments, the object may be any one of selectable objects, such as a hyperlink or an icon. Operations related to the selected object, such as: displaying an operation of connecting to a hyperlink page, a document, an image, or the like, or executing an operation of a program corresponding to the icon. The user command for selecting the UI object may be an input command through various input means (e.g., mouse, keyboard, touch pad, etc.) connected to the display device 200 or a voice command corresponding to a voice uttered by the user.
As shown in fig. 2, the controller 250 includes at least one of a random access memory 251 (RAM), a read-only memory 252 (ROM), a video processor 270, an audio processor 280, other processors 253 (e.g., a graphics processing unit (GPU)), a central processing unit 254 (CPU), a communication interface, and a communication bus 256 that connects the components.
In some embodiments, RAM 251 is used to store temporary data for the operating system or other running programs, and ROM 252 is used to store various system boot instructions.
In some embodiments, ROM 252 is used to store a basic input output system, referred to as a basic input output system (Basic Input Output System, BIOS). The system comprises a drive program and a boot operating system, wherein the drive program is used for completing power-on self-checking of the system, initialization of each functional module in the system and basic input/output of the system.
In some embodiments, upon receipt of a power-on signal, the display device 200 begins to boot, and the processor 254 executes the system boot instructions in ROM 252, copying the temporary data of the operating system stored in memory into RAM 251 so as to start or run the operating system. After the operating system is started, the processor 254 copies the temporary data of the various applications in memory to RAM 251, and then starts or runs the various applications.
In some embodiments, processor 254 is used to execute operating system and application program instructions stored in memory. And executing various application programs, data and contents according to various interactive instructions received from the outside, so as to finally display and play various audio and video contents.
In some example embodiments, the processor 254 may include a plurality of processors, comprising one main processor and one or more sub-processors: a main processor for performing some operations of the display device 200 in the pre-power-up mode and/or for displaying pictures in the normal mode, and one or more sub-processors for operations in standby mode and the like.
In some embodiments, the graphics processor 253 is configured to generate various graphical objects, such as icons, operation menus, and graphics displayed in response to user input instructions. It includes an arithmetic unit, which performs operations by receiving the various interactive instructions input by the user and displays various objects according to their display attributes, and a renderer, which renders the various objects obtained by the arithmetic unit for display on the display.
In some embodiments, video processor 270 is configured to receive an external video signal and perform video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to the standard codec protocol of the input signal, to obtain a signal that can be directly displayed or played on the display device 200.
In some embodiments, video processor 270 includes a demultiplexing module, a video decoding module, an image compositing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module is used to demultiplex an input audio/video data stream, such as an input MPEG-2 stream, into video signals, audio signals, and the like.
And the video decoding module is used for processing the demultiplexed video signals, including decoding, scaling and the like.
And an image synthesis module, such as an image synthesizer, which superimposes and mixes the GUI signal, input by the user or generated by the graphics generator, with the scaled video image to generate an image signal for display.
The frame rate conversion module is configured to convert the input video frame rate, for example converting a 60 Hz frame rate into a 120 Hz or 240 Hz frame rate, commonly by means of frame interpolation.
The display format module is used to convert the received frame-rate-converted video into a video output signal conforming to the display format, such as an RGB data signal.
In some embodiments, the graphics processor 253 may be integrated with the video processor or configured separately. When integrated, it can process graphics signals output to the display; when configured separately, the two perform different functions, for example in a GPU + FRC (Frame Rate Conversion) architecture.
In some embodiments, the audio processor 280 is configured to receive an external audio signal, decompress and decode the audio signal according to a standard codec protocol of an input signal, and perform noise reduction, digital-to-analog conversion, and amplification processing, so as to obtain a sound signal that can be played in a speaker.
In some embodiments, video processor 270 may include one or more chips. The audio processor may also comprise one or more chips.
In some embodiments, video processor 270 and audio processor 280 may be separate chips or may be integrated together with the controller in one or more chips.
In some embodiments, under the control of the controller 250, the audio output receives the sound signal output by the audio processor 280, for example via the speaker 286, or via an external sound output terminal that outputs to the sound-producing device of an external device other than the speaker carried by the display device 200 itself, such as an external sound interface or an earphone interface. The communication interface may also include a near-field communication module, for example a Bluetooth module for outputting sound to a Bluetooth speaker.
The power supply 290 supplies power input from an external power source to the display device 200 under the control of the controller 250. The power supply 290 may include a built-in power circuit installed inside the display device 200, or may be an external power supply, with a power interface provided in the display device 200 for the external power supply.
The user interface 265 is used to receive an input signal from a user and then transmit the received user input signal to the controller 250. The user input signal may be a remote control signal received through an infrared receiver, and various user control signals may be received through a network communication module.
In some embodiments, a user inputs a user command through the control apparatus 100 or the mobile terminal 300, the user input interface is then responsive to the user input through the controller 250, and the display device 200 is then responsive to the user input.
In some embodiments, a user may input a user command through a Graphical User Interface (GUI) displayed on the display 275, and the user input interface receives the user input command through the Graphical User Interface (GUI). Or the user may input the user command by inputting a specific sound or gesture, the user input interface recognizes the sound or gesture through the sensor, and receives the user input command.
In some embodiments, a "user interface" is a media interface for interaction and exchange of information between an application or operating system and a user that enables conversion between an internal form of information and a form acceptable to the user. A commonly used presentation form of a user interface is a graphical user interface (Graphic User Interface, GUI), which refers to a graphically displayed user interface that is related to computer operations. It may be an interface element such as an icon, a window, a control, etc. displayed in a display screen of the electronic device, where the control may include a visual interface element such as an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, a Widget, etc.
The memory 260 includes memory storing various software modules for driving the display device 200. Such as: various software modules stored in the first memory, including: at least one of a base module, a detection module, a communication module, a display control module, a browser module, various service modules, and the like.
The base module is a bottom software module for signal communication between the various hardware in the display device 200 and for sending processing and control signals to the upper modules. The detection module is used for collecting various information from various sensors or user input interfaces and carrying out digital-to-analog conversion and analysis management.
For example, the voice recognition module includes a voice analysis module and a voice instruction database module. The display control module is used for controlling the display to display the image content, and can be used for playing the multimedia image content, the UI interface and other information. And the communication module is used for carrying out control and data communication with external equipment. And the browser module is used for executing data communication between the browsing servers. And the service module is used for providing various services and various application programs. Meanwhile, the memory 260 also stores received external data and user data, images of various items in various user interfaces, visual effect maps of focus objects, and the like.
Fig. 3 illustrates a block diagram of a configuration of the control device 100 according to some embodiments. As shown in fig. 3, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface, a memory, and a power supply.
The control device 100 is configured to control the display device 200: it may receive a user's input operation instruction and convert the operation instruction into an instruction the display device 200 can recognize and respond to, acting as an intermediary between the user and the display device 200. For example, the user operates the channel up/down keys on the control device 100, and the display device 200 responds to the channel up/down operation.
In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications for controlling the display apparatus 200 according to user's needs.
In some embodiments, as shown in fig. 1, a mobile terminal 300 or other intelligent electronic device may function similarly to the control device 100 after installing an application that manipulates the display device 200. For example, the user may implement the functions of the physical keys of the control device 100 through function keys or virtual buttons of a graphical user interface installed on the mobile terminal 300 or other intelligent electronic device.
The controller 110 includes a processor 112 and RAM 113 and ROM 114, a communication interface 130, and a communication bus. The controller is used to control the operation and operation of the control device 100, as well as the communication collaboration among the internal components and the external and internal data processing functions.
The communication interface 130 enables communication of control signals and data signals with the display device 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display device 200. The communication interface 130 may include at least one of a WiFi chip 131, a bluetooth module 132, an NFC module 133, and other near field communication modules.
A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touchpad 142, a sensor 143, keys 144, and other input interfaces. Such as: the user can implement a user instruction input function through actions such as voice, touch, gesture, press, and the like, and the input interface converts a received analog signal into a digital signal and converts the digital signal into a corresponding instruction signal, and sends the corresponding instruction signal to the display device 200.
The output interface includes an interface that transmits the received user instruction to the display device 200. In some embodiments, it may be an infrared interface or a radio frequency interface. For example, when the infrared signal interface is used, the user input instruction is converted into an infrared control signal according to the infrared control protocol and sent to the display device 200 through the infrared sending module. As another example, when the radio frequency signal interface is used, the user input instruction is converted into a digital signal, modulated according to the radio frequency control signal modulation protocol, and then transmitted to the display device 200 through the radio frequency transmitting terminal.
In some embodiments, the control device 100 includes at least one of a communication interface 130 and an input-output interface 140. The control device 100 is provided with a communication interface 130 such as: the WiFi, bluetooth, NFC, etc. modules may send the user input instruction to the display device 200 through a WiFi protocol, or a bluetooth protocol, or an NFC protocol code.
A memory 190 for storing various operating programs, data, and applications for driving and controlling the control device 100 under the control of the controller. The memory 190 may store various control signal instructions input by the user.
A power supply 180 for providing operating power support for the various elements of the control device 100 under the control of the controller. May be a battery and associated control circuitry.
In some embodiments, the system may include a kernel (Kernel), a command parser (shell), a file system, and applications. The kernel, shell, and file system together form the basic operating system architecture that allows users to manage files, run programs, and use the system. After power-up, the kernel is started, the kernel space is activated, hardware is abstracted, hardware parameters are initialized, and virtual memory, the scheduler, signals, and inter-process communication (IPC) are operated and maintained. After the kernel starts, the shell and user application programs are then loaded. An application program is compiled into machine code after being started, forming a process.
A schematic diagram of the software configuration in the display device 200 according to some embodiments is schematically shown in fig. 4. Referring to fig. 4, in some embodiments, the system is divided into four layers, from top to bottom: an application layer (referred to as the "application layer"), an application framework layer (Application Framework, referred to as the "framework layer"), an Android runtime and system library layer (referred to as the "system runtime layer"), and a kernel layer.
In some embodiments, at least one application program is running in the application program layer, and these application programs may be a Window (Window) program of an operating system, a system setting program, a clock program, a camera application, and the like; and may be an application program developed by a third party developer, such as a hi-see program, a K-song program, a magic mirror program, etc. In particular implementations, the application packages in the application layer are not limited to the above examples, and may actually include other application packages, which the embodiments of the present application do not limit.
The framework layer provides an application programming interface (API) and a programming framework for the application programs of the application layer. The application framework layer includes a number of predefined functions. The application framework layer corresponds to a processing center that decides the actions of the applications in the application layer. Through the API interface, an application program can access the resources in the system and acquire the services of the system during execution.
As shown in fig. 4, the application framework layer in the embodiment of the present application includes managers (Managers), content providers (Content Provider), and the like, where the managers include at least one of the following modules: an activity manager (Activity Manager) used to interact with all activities running in the system; a location manager (Location Manager) used to provide system services or applications with access to the system location service; a package manager (Package Manager) used to retrieve various information about the application packages currently installed on the device; a notification manager (Notification Manager) used to control the display and clearing of notification messages; and a window manager (Window Manager) used to manage the icons, windows, toolbars, wallpaper, and desktop components on the user interface.
In some embodiments, the activity manager is to: the lifecycle of each application program is managed, as well as the usual navigation rollback functions, such as controlling the exit of the application program (including switching the currently displayed user interface in the display window to the system desktop), opening, backing (including switching the currently displayed user interface in the display window to the previous user interface of the currently displayed user interface), etc.
In some embodiments, the window manager is configured to manage all window procedures, such as obtaining a display screen size, determining whether there is a status bar, locking the screen, intercepting the screen, controlling display window changes (e.g., scaling the display window down, dithering, distorting, etc.), and so on.
In some embodiments, the system runtime layer provides support for the framework layer above it: when the framework layer runs, the Android operating system executes the C/C++ libraries contained in the system runtime layer to implement the functions the framework layer requires.
In some embodiments, the kernel layer is the layer between hardware and software. As shown in fig. 4, the kernel layer contains at least one of the following drivers: an audio driver, a display driver, a Bluetooth driver, a camera driver, a WiFi driver, a USB driver, an HDMI driver, sensor drivers (e.g., fingerprint sensor, temperature sensor, touch sensor, pressure sensor, etc.), and the like.
In some embodiments, the kernel layer further includes a power driver module for power management.
In some embodiments, the software programs and/or modules corresponding to the software architecture in fig. 4 are stored in the first memory or the second memory shown in fig. 2 or fig. 3.
In some embodiments, taking the magic mirror application (a photographing application) as an example: when the remote control receiving device receives an input operation from the remote control, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the input operation into an original input event (including the value of the input operation, its timestamp, etc.), and the original input event is stored at the kernel layer. The application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the event according to the current focus position. Taking the input operation as a confirmation operation whose corresponding control is the magic mirror application icon control, the magic mirror application calls the interface of the application framework layer to start itself, and then starts the camera driver by calling the kernel layer, so that a still image or video is captured through the camera.
In some embodiments, for a display device with a touch function, taking a split-screen operation as an example: the display device receives an input operation (such as a split-screen operation) applied by the user to the display screen, and the kernel layer generates a corresponding input event from the input operation and reports the event to the application framework layer. The activity manager of the application framework layer sets the window mode (e.g., multi-window mode) and the window position and size corresponding to the input operation. The window manager of the application framework layer draws the windows according to the activity manager's settings, then sends the drawn window data to the display driver of the kernel layer, and the display driver displays the application interfaces corresponding to the window data in different display areas of the display screen.
An icon control interface display schematic of an application in a display device 200 according to some embodiments is illustrated in fig. 5. In some embodiments, as shown in fig. 5, the application layer contains at least one icon control that the application can display in the display, such as: a live television application icon control, a video on demand application icon control, a media center application icon control, an application center icon control, a game application icon control, and the like.
In some embodiments, the live television application may provide live television via different signal sources. For example, a live television application may provide television signals using input from cable television, radio broadcast, satellite services, or other types of live television services, and may display the video of the live television signal on the display device 200.
In some embodiments, the video on demand application may provide video from different storage sources. Unlike a live television application, video on demand plays video from storage sources; for example, the video may come from cloud storage on the server side or from local hard disk storage containing recorded video programs.
In some embodiments, the media center application may provide various multimedia content playing applications. For example, a media center may be a different service than live television or video on demand, and a user may access various images or audio through a media center application.
In some embodiments, an application center may be provided to store various applications. An application stored there may be a game, a utility, or another application that is related to a computer system or other device but can run on the smart television. The application center may obtain these applications from different sources, store them in local storage, and then run them on the display device 200.
In some embodiments, when the display device uses an application-level player for live broadcast, delays are very likely to occur, so that there is a large time difference between the picture the user watches and the network live streaming end. Live scenes requiring real-time interaction with the user include, but are not limited to, classroom live streaming, mobile phone screen casting, video conferencing, small online classes, and the like.
In scenarios with high requirements for low delay from the application-level player, for example when a user holds a video conference on the display device, a large delay means that neither side of the conference can obtain the other's voice content in time, which degrades the user experience.
Delays are usually caused by player start-up time and network fluctuation. Therefore, to ensure a low-delay effect during live broadcast, an embodiment of the invention provides a display device comprising a controller in which an application-level player for live broadcast is configured. The application-level player detects the size of the audio buffer after decapsulation and, when it exceeds a threshold, modifies the audio-video synchronization mechanism to achieve either a smooth transition of the audio and video data (double-speed playback) or a direct jump to the current latest frame (frame dropping), thereby reducing the delay.
To determine whether the display device turns on the low-delay function, a player setting interface may be invoked by the application-level player based on the user's selection, to set a low-delay flag that characterizes whether the low-delay function is turned on and a discard flag that characterizes whether audio data needs to be discarded.
Before starting the live broadcast, i.e., before the live stream is produced, the application-level player makes personalized settings through its SetOption (player setting) interface, deciding whether to enable the low-delay mode and whether to discard audio data when a delay occurs.
If the low-delay mode is to be enabled, a low-delay flag is set. When the low-delay flag is recognized during live broadcast, the low-delay mode is started and the relevant flow of the live low-delay processing method of the application-level player is executed to achieve the low-delay effect. If audio data is to be discarded when a delay occurs, a discard flag is set. When a delay occurs, the player first judges whether audio data needs to be discarded, and if the discard flag is recognized, it executes the operation of discarding audio data.
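For illustration, the following is a minimal sketch of how such pre-playback options might be recorded; the `PlayerOptions` type and the option keys `low-delay` and `drop-audio` are hypothetical names, not the patent's actual API.

```cpp
// Hypothetical sketch of the pre-playback configuration described above.
#include <map>
#include <string>

struct PlayerOptions {
    std::map<std::string, int> opts;
    // SetOption stores a named flag before the live stream is opened.
    void SetOption(const std::string& key, int value) { opts[key] = value; }
    bool IsSet(const std::string& key) const {
        auto it = opts.find(key);
        return it != opts.end() && it->second != 0;
    }
};

int main() {
    PlayerOptions player;
    player.SetOption("low-delay", 1);   // enable the low-delay mode
    player.SetOption("drop-audio", 1);  // prefer frame dropping over speed-up
    // Later, during playback, the player branches on these flags.
    return player.IsSet("low-delay") ? 0 : 1;
}
```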
Whether a delay has occurred is decided by whether the storage amount of buffered audio data exceeds a threshold. In some embodiments, the threshold may be set to 300 ms: if more than 300 ms of audio data is buffered, it can be determined that a delay has occurred in the current live broadcast. When a delay is detected, the live low-delay processing method of the application-level player can be executed to achieve the low-delay effect and improve the user experience.
A flowchart of a live low-delay processing method of an application-level player according to some embodiments is illustrated in fig. 6; a data flow diagram of the method is illustrated in fig. 7. Referring to fig. 6 and 7, in the display device according to the embodiment of the present invention, the application-level player is configured to perform the following steps when executing the live low-delay processing method:
S1, acquiring live broadcast data generated during live broadcast, and performing decapsulation processing on the live broadcast data to obtain audio data before decoding and video data before decoding.
When a user live-broadcasts with the display device, the application-level player receives the network live stream, i.e., the live data. To play the network live stream, the application-level player must decapsulate and decode it; decapsulating the live data yields the pre-decoding audio data and the pre-decoding video data.
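As a concrete illustration, a decapsulation loop might look like the following sketch, which assumes an FFmpeg-based demuxer (the embodiment does not name one); the queue operations are left as comments.

```cpp
extern "C" {
#include <libavformat/avformat.h>
}

// Demux a live URL into audio and video packets (queues elided as comments).
int DemuxLive(const char* url) {
    AVFormatContext* fmt = nullptr;
    if (avformat_open_input(&fmt, url, nullptr, nullptr) < 0) return -1;
    if (avformat_find_stream_info(fmt, nullptr) < 0) return -1;

    AVPacket* pkt = av_packet_alloc();
    while (av_read_frame(fmt, pkt) >= 0) {   // one compressed packet per call
        enum AVMediaType type =
            fmt->streams[pkt->stream_index]->codecpar->codec_type;
        if (type == AVMEDIA_TYPE_AUDIO) {
            // store into the pre-decoding audio data buffer queue
        } else if (type == AVMEDIA_TYPE_VIDEO) {
            // store into the pre-decoding video data buffer queue
        }
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
    avformat_close_input(&fmt);
    return 0;
}
```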
S2, storing the audio data before decoding into an audio data buffer queue, and storing the video data before decoding into a video data buffer queue.
In order to ensure stable live broadcasting, a certain amount of pre-decoding audio data is buffered and stored in an audio data buffer queue, and a certain amount of pre-decoding video data is buffered and stored in a video data buffer queue.
In some embodiments, it may be set that the audio data buffer queue and the video data buffer queue both store 2 frames of data, i.e., the audio data buffer queue stores 2 frames of pre-decoding audio data, and the video data buffer queue stores 2 frames of pre-decoding video data.
The pre-decoding audio data required for decoding is stored in the audio data buffer queue so that, in the subsequent decoding operation, the pre-buffered data can be fetched directly from the queue. That is, pre-decoding audio data is fetched from the queue for decoding at the same time as new data is stored into it, which keeps the subsequent decoding process stable. Both the storing and the fetching proceed continuously, one frame of data at a time. Video data is processed in the same way as audio data and is not described again here.
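A thread-safe buffer queue of the kind described — one thread storing packets while another fetches them frame by frame — could be sketched as follows; the class and method names are illustrative only.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

template <typename Packet>
class BufferQueue {
public:
    // The demux thread stores one packet at a time.
    void Put(Packet p) {
        std::lock_guard<std::mutex> lk(m_);
        q_.push_back(std::move(p));
        cv_.notify_one();
    }
    // The decode thread takes one packet at a time, blocking until available.
    Packet Take() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        Packet p = std::move(q_.front());
        q_.pop_front();
        return p;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Packet> q_;
};
```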
Because a certain amount of pre-decoding audio data is kept in the audio data buffer queue, too large a storage amount means a long audio delay. Whether a high delay has occurred can therefore be determined by judging whether the storage amount of pre-decoding audio data in the queue exceeds a threshold, which accurately decides whether low-delay processing is needed.
And S3, if the storage amount of the pre-decoding audio data stored in the audio data buffer queue exceeds a threshold value, identifying whether a discard flag exists, wherein the discard flag is used for characterizing whether the audio data needs to be discarded.
Whether to enable the low-delay mode can be preset for the current live broadcast. If the low-delay flag is recognized, the current live broadcast executes the flow of the low-delay mode, i.e., judging whether the storage amount of pre-decoding audio data stored in the audio data buffer queue exceeds the threshold.
In some embodiments, this judgment runs in a loop: at a preset time interval, the player checks whether the storage amount of pre-decoding audio data in the audio data buffer queue exceeds the threshold. The preset interval may be set to 2 s or to another value; it is not specifically limited here.
The application-level player thus performs a threshold judgment every 2 s on the storage amount of pre-decoding audio data currently in the buffer queue; the longer the delay builds up, the greater that storage amount becomes.
If at some moment the application-level player detects that the storage amount of pre-decoding audio data in the audio data buffer queue exceeds the threshold, a delay has occurred in the current live broadcast. The remedies include frame-dropping playback and double-speed playback, and frame-dropping playback requires discarding audio data. The player therefore needs to identify whether the discard flag, set by the application-level player before the live broadcast, exists.
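The periodic check and the branch on the discard flag might be organized as in the sketch below; the 2 s interval and 300 ms threshold come from the text, while the variable names and the use of globals are assumptions for brevity.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<long> g_buffered_audio_ms{0};  // maintained by the demux thread
std::atomic<bool> g_discard_flag{true};    // set via SetOption before playback

void MonitorLoop() {
    const long kThresholdMs = 300;         // delay threshold from the text
    while (true) {
        std::this_thread::sleep_for(std::chrono::seconds(2));  // preset interval
        if (g_buffered_audio_ms.load() > kThresholdMs) {
            if (g_discard_flag.load()) {
                // frame-dropping path: flush the audio queue (step S4)
            } else {
                // double-speed path: reset play timestamps (steps S401-S404)
            }
        }
    }
}
```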
S4, if the discard flag exists, discarding the pre-decoding audio data stored in the audio data buffer queue, i.e., performing the emptying processing.
If the discard flag is present, the audio data must be discarded: the audio data buffer queue is emptied, i.e., the pre-decoding audio data currently stored in it is discarded, realizing frame-dropping playback. For example, with the threshold set to 300 ms, if 350 ms of pre-decoding audio data is buffered, the threshold is exceeded; once the discard flag is recognized, the 350 ms of pre-decoding audio data is discarded and the queue is emptied.
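The emptying operation itself reduces to clearing the queue under its lock, as in this sketch (a hypothetical `Flush()` added to a queue like the one above; the types are assumed):

```cpp
#include <deque>
#include <mutex>

struct AudioPacket { long pts_ms; long duration_ms; };

class AudioQueue {
public:
    // Drops all buffered pre-decoding audio (e.g. the 350 ms example) at once.
    void Flush() {
        std::lock_guard<std::mutex> lk(m_);
        q_.clear();              // buffered duration returns to 0 ms
    }
private:
    std::mutex m_;
    std::deque<AudioPacket> q_;
};
```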
S5, obtaining the pre-decoding audio data newly stored in the audio data buffer queue after the emptying processing, and decoding the pre-decoding audio data newly stored and the pre-decoding video data stored in the video data buffer queue to obtain the post-decoding audio data and the post-decoding video data.
At the next moment after the audio data buffer queue has been emptied, the next frames of pre-decoding audio data are stored into it again. In some embodiments, one frame of pre-decoding audio data lasts 40 ms, 50 ms, 60 ms, 70 ms, or the like. Hence, once the two new frames of pre-decoding audio data needed for stable decoding are stored, the newly stored data cannot exceed the threshold, and the latest pre-decoding audio data can be decoded to obtain decoded audio data.
Meanwhile, the pre-decoding video data stored in the video data buffer queue is decoded normally to obtain decoded video data.
In some embodiments, the threshold judgment is made on the storage amount of pre-decoding audio data, not on that of pre-decoding video data. The reason is that video and audio have different encoding formats: constrained by the video encoding format, using pre-decoding video data for the threshold judgment and then dropping those frames would produce a garbled live picture and hurt the user experience.
S6, synchronizing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
After decoding finishes, since audio data was discarded before decoding, the decoded video data is synchronized using the audio timestamp as the reference to keep sound and picture in sync during live broadcast. During synchronization, the decoded video data in the same period as the discarded pre-decoding audio data is discarded, and the video jumps directly to the current latest frame; the live picture thus jumps to the latest picture, frame-dropping playback is achieved, and the delay is reduced.
A method flow diagram of a synchronization process according to some embodiments is illustrated in fig. 8. Referring to fig. 8, in some embodiments, the application level player, when performing the synchronization processing of the decoded audio data and the decoded video data, is further configured to perform the steps of:
S61, acquiring the audio timestamp corresponding to the first frame of pre-decoding audio data newly stored in the audio data buffer queue.
S62, determining a video discarding time stamp for discarding video data based on the audio time stamp corresponding to the audio data before decoding of the first frame.
S63, acquiring the appointed decoded video data corresponding to the video discarding time stamp in the video data cache queue, discarding the appointed decoded video data, and realizing synchronous playing of the decoded audio data and the decoded video data.
When synchronizing video data to audio data, the timestamp of the audio data is used as the synchronization reference to ensure accuracy. The video data to be discarded is determined with reference to the audio timestamp of the newly stored first frame of pre-decoding audio data, ensuring that the discarded audio and video belong to the same period and that sound and picture stay synchronized during live broadcast.
Therefore, the audio timestamp corresponding to the first frame of pre-decoding audio data newly stored in the audio data buffer queue must be obtained; it identifies the moment at which the pre-decoding audio data was discarded, i.e., the moment of the last frame among the discarded pre-decoding audio data.
To keep the video synchronized with the audio, the video discard timestamp must equal the audio timestamp. Accordingly, the audio timestamp of the newly stored first frame of pre-decoding audio data is taken as the video discard timestamp for discarding video data; it identifies the moment of the last frame of decoded video data that needs to be discarded.
Based on the video discard timestamp, the corresponding appointed decoded video data can be selected from the video data buffer queue. The appointed decoded video data comprises a target number of frames, namely the decoded video data from the current playing moment up to the video discard timestamp.
To discard the video data, the current playing position of the decoded video data jumps directly to the position corresponding to the video discard timestamp, so that the decoded video played after the jump matches the decoded audio corresponding to the newly stored first frame of pre-decoding audio data, realizing synchronous playback of decoded audio and decoded video.
The selected appointed decoded video data is discarded so that playback skips directly to the current latest frame of video. Since the audio data was discarded before decoding, the decoded audio likewise jumps to the current latest frame; decoded audio and video therefore play synchronously and the delay is avoided.
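Put together, the frame-dropping synchronization of steps S61 to S63 amounts to dropping every decoded video frame older than the video discard timestamp, roughly as sketched below (types and names assumed):

```cpp
#include <deque>

struct Frame { long pts_ms; };

// Discard the "appointed decoded video data": all frames whose timestamps
// fall before the video discard timestamp derived from the audio data.
void DropVideoUntil(std::deque<Frame>& decodedVideo, long videoDiscardPtsMs) {
    while (!decodedVideo.empty() &&
           decodedVideo.front().pts_ms < videoDiscardPtsMs) {
        decodedVideo.pop_front();   // video jumps to the latest frame
    }
}
```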
When video data is discarded, the decoded video data is the discard target, which reduces the constraints of the video encoding format on the pre-decoding video data and thus avoids the garbled-screen artifacts that dropping undecoded video frames would otherwise cause.
After the decoded video data is synchronized with the decoded audio data, the audio having been discarded before decoding and the video of the same period discarded after decoding, the latest live data can be played, i.e., the live broadcast uses the latest decoded video and audio, reducing the delay caused by start-up time and network fluctuation during live broadcast and achieving the low-delay effect.
Thus, in the display device provided by the embodiment of the invention, the application-level player detects the audio buffer size after decapsulation and, when it exceeds the threshold, modifies the audio-video synchronization mechanism so that the audio and video data jump directly (frame dropping) to the current latest frame, reducing the delay caused by player start-up time and network fluctuation during live broadcast.
In some embodiments, when judging whether the storage amount of pre-decoding audio data stored in the audio data buffer queue exceeds the threshold, if it does not, no delay has occurred, and the audio and video data are decoded and played normally. Therefore, in addition to the live low-delay processing method performed by the display device in the foregoing embodiments, the application-level player is further configured to perform the following steps:
Step 201, if the storage amount of the pre-decoding audio data stored in the audio data buffer queue does not exceed the threshold value, performing decoding processing on the pre-decoding audio data stored in the audio data buffer queue and the pre-decoding video data stored in the video data buffer queue to obtain decoded audio data and decoded video data.
And 202, synchronously processing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
If the storage amount of pre-decoding audio data stored in the audio data buffer queue does not exceed the threshold, no delay has occurred in the current live broadcast, i.e., the low-delay mode is not needed. The pre-decoding audio data stored in the audio data buffer queue can then be decoded directly to obtain decoded audio data, and the pre-decoding video data stored in the video data buffer queue decoded to obtain decoded video data.
After the decoded audio data and the decoded video data are obtained, the synchronization step can be performed. Whether synchronization is required is determined from the play timestamp of the decoded audio data and that of the decoded video data: if the two play timestamps are not aligned, i.e., the decoded audio and video have drifted apart, the decoded video data must be synchronized to the decoded audio data.
During synchronization, the play timestamp of the decoded video data is adjusted with the play timestamp of the decoded audio data as the reference, so that the two become identical. The decoded video data with the adjusted play timestamp and the decoded audio data are then played as new live data, which keeps sound and picture synchronized and avoids delay.
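A common way to realize this audio-master synchronization — assumed here, since the embodiment does not spell out the decision rule — is to compare each video frame's timestamp with the audio clock and drop, wait, or render accordingly:

```cpp
struct SyncDecision { bool drop; long wait_ms; };

// Decide what to do with one decoded video frame, given the audio clock.
// kDropThresholdMs is an assumed tolerance, not a value from the patent.
SyncDecision SyncVideoToAudio(long videoPtsMs, long audioClockMs) {
    const long kDropThresholdMs = 100;
    long diff = videoPtsMs - audioClockMs;
    if (diff < -kDropThresholdMs) return {true, 0};     // video late: drop frame
    if (diff > 0)                 return {false, diff}; // video early: wait
    return {false, 0};                                  // in sync: render now
}
```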
Adjusting the play timestamp of the decoded video data can take the form of discarding the corresponding number of decoded video frames, realizing frame-dropping playback. For the specific implementation of discarding video data, refer to step S6 and the related steps in the foregoing embodiments, which are not repeated here.
In some embodiments, on the premise that the display device performs low-delay processing, a double-speed playback method can be used instead of the frame-dropping playback method described above. If the display device does not need frame-dropping playback, the application-level player simply does not set, through its SetOption (player setting) interface before the live broadcast, the discard flag that characterizes whether audio data needs to be discarded.
During the live broadcast, if the storage amount of pre-decoding audio data buffered in the audio data buffer queue exceeds the threshold, i.e., a delay has occurred, but no discard flag is recognized, the delay is resolved by double-speed playback and frame-dropping playback is not required.
A method flow diagram for using double-speed playback according to some embodiments is illustrated in fig. 9. In some embodiments, referring to fig. 9, on the basis of the live low-latency processing method of the application-level player performed by the display device provided in the foregoing embodiments, the application-level player configured by the display device is further configured to perform the following steps:
S401, if the discard flag does not exist, decoding the pre-decoding audio data stored in the audio data buffer queue and the pre-decoding video data stored in the video data buffer queue to obtain decoded audio data and decoded video data.
S402, acquiring appointed decoded audio data corresponding to the storage amount exceeding the threshold value based on the decoded audio data stored in the audio data cache queue.
S403, resetting the audio playing time stamp corresponding to the appointed decoded audio data.
S404, performing synchronous processing on the appointed decoded audio data and the decoded video data which are played based on the reset audio playing time stamp, and obtaining new live broadcast data for playing.
After judging that the storage amount of pre-decoding audio data buffered in the audio data buffer queue exceeds the threshold, the application-level player checks whether the discard flag exists. If it is not recognized, the current way of resolving the delay is not frame-dropping playback but double-speed playback.
Since the pre-decoding audio data buffered in the audio data buffer queue does not need to be emptied, it can be decoded directly to obtain decoded audio data, and the pre-decoding video data buffered in the video data buffer queue is likewise decoded directly to obtain decoded video data.
For double-speed playback, only the portion of the decoded audio data stored in the audio data buffer queue that exceeds the threshold is played at increased speed. The appointed decoded audio data corresponding to the storage amount exceeding the threshold is therefore determined, and its play timestamps are reset to realize double-speed playback.
A method flow diagram for resetting an audio play timestamp according to some embodiments is illustrated in fig. 10. In some embodiments, referring to fig. 10, when performing step S403, i.e., resetting the audio play timestamp corresponding to the appointed decoded audio data, the application-level player is further configured to perform the following steps:
S4031, acquiring an audio playing time stamp and a corresponding playing time length of the appointed decoded audio data.
S4032, adjusting the playing time length according to a preset rule.
S4033, adjusting the audio playing time stamp based on the adjusted playing time length.
To realize double-speed playback, the audio play timestamps of the appointed decoded audio data are adjusted; an audio play timestamp identifies the moment of each frame of decoded audio data within the appointed (over-threshold) decoded audio data. The play duration corresponding to the appointed decoded audio data is determined from the moments of its first and last over-threshold frames.
To adjust the playback speed, the play duration of the decoded audio data is adjusted first; in some embodiments it is adjusted according to a preset rule, such as 2x or 3x speed. For example, at 2x speed, if the play duration of the appointed decoded audio data is 1 s, the adjusted duration is 500 ms, i.e., the appointed decoded audio data is played within 500 ms.
After the adjusted play duration of the appointed decoded audio data is determined, the audio play timestamps can be reset, i.e., the moments of the first and last frames of the appointed decoded audio data are reset, as sketched below.
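The timestamp reset can be pictured as compressing the over-threshold segment's play timestamps by the speed factor, as in this sketch (names assumed); at 2x it turns the 1 s example above into 500 ms:

```cpp
#include <vector>

struct AudioFrame { long play_pts_ms; };

// Rescale the play timestamps of the appointed (over-threshold) audio frames:
// keep the first frame's time, shrink every later offset by `speed`.
void ResetAudioPts(std::vector<AudioFrame>& appointed, double speed) {
    if (appointed.empty() || speed <= 1.0) return;
    const long first = appointed.front().play_pts_ms;
    for (AudioFrame& f : appointed) {
        f.play_pts_ms =
            first + static_cast<long>((f.play_pts_ms - first) / speed);
    }
}
```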
After the resetting of the audio playing time stamp of the appointed decoded audio data is completed, the decoded video data can be synchronized to the reset appointed decoded audio data, so that new live broadcast data can be obtained for playing after synchronization.
When synchronizing video data to audio data, the timestamp of the audio data is used as the synchronization reference to ensure accuracy. The double-speed playback of the decoded video data takes the decoded audio data to be played at double speed as the reference, ensuring that the sped-up audio and video belong to the same period and thus remain synchronized during the live broadcast.
In some embodiments, when performing step S404, i.e., synchronizing the appointed decoded audio data played based on the reset audio play timestamps with the decoded video data to obtain new live data for playing, the application-level player is further configured to perform the following steps:
Step 4041, obtaining the video playing time stamp of the decoded video data stored in the video data buffer queue.
Step 4042, adjusting the video playing time stamp based on the resetting rule of the audio playing time stamp, wherein the adjusted video playing time stamp is the same as the reset audio playing time stamp.
Step 4043, playing the corresponding appointed decoded audio data according to the reset audio playing time stamp, and playing the corresponding decoded video data according to the adjusted video playing time stamp, so as to realize synchronous playing of the decoded audio data and the decoded video data.
When synchronizing video to audio, the video play timestamps must be synchronized to the audio play timestamps. The video play timestamps of the decoded video data stored in the video data buffer queue are therefore obtained; each identifies the moment of one frame of decoded video data in the queue.
The video play timestamps follow the same reset rule as the audio play timestamps; after the reset, the adjusted video play timestamps equal the reset audio play timestamps, ensuring synchronous playback of sound and picture.
In some embodiments, when performing step 4042, i.e., adjusting the video play timestamp based on the reset rule of the audio play timestamp, the application-level player is further configured to perform the following steps:
step 40421, obtaining an audio playing time stamp and a corresponding playing time length.
Step 40422, determining a corresponding video playing time stamp based on the audio playing time stamp.
Step 40423, determining the adjusted playing time length of the video corresponding to the decoded video data based on the adjusted playing time length according to the preset rule.
Step 40424, adjusting a video playing time stamp based on the video adjusted playing time length.
As described for steps S4031 to S4033, the audio play timestamps of the appointed decoded audio data identify the moment of each over-threshold frame of decoded audio data, and the corresponding play duration is determined from the moments of the first and last over-threshold frames.
To ensure synchronization accuracy, the video play timestamps are adjusted with the audio play timestamps as the reference. And to ensure that the sped-up audio and video cover the same time period, the adjusted play duration of the appointed decoded audio data is used as the video adjusted play duration of the corresponding decoded video data. The preset rule for adjusting the play duration is the one described in steps S4031 to S4033 and is not repeated here.
After the video adjusted play duration of the appointed decoded video data is determined, the video play timestamps can be reset, i.e., the moments of its first and last frames. Once the video play timestamps are reset, i.e., the double-speed adjustment is in place, rendering speeds up and the double-speed playback effect is achieved, as sketched below.
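Applying the identical reset rule to the video timestamps, anchored at the same first timestamp so that the adjusted video timestamps equal the reset audio timestamps, might look like this (names assumed):

```cpp
#include <vector>

struct VideoFrame { long play_pts_ms; };

// Rescale the video frames in the same time window with the same anchor and
// speed factor as the audio, so audio and video stay in the same period.
void ResetVideoPts(std::vector<VideoFrame>& sameWindow,
                   long firstAudioPtsMs, double speed) {
    if (sameWindow.empty() || speed <= 1.0) return;
    for (VideoFrame& f : sameWindow) {
        f.play_pts_ms = firstAudioPtsMs +
            static_cast<long>((f.play_pts_ms - firstAudioPtsMs) / speed);
    }
}
```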
Playing the appointed decoded audio data according to the reset audio play timestamps yields new audio data, and playing the decoded video data according to the reset video play timestamps yields new video data. Together they form the new, double-speed-adjusted live data; playing based on it keeps decoded audio and video synchronized, achieves the low-delay effect, and reduces the delay caused by player start-up time and network fluctuation during live broadcast.
Thus, in the display device provided by the embodiment of the invention, the application-level player detects the audio buffer size after decapsulation; when the threshold is exceeded and the audio data does not need to be discarded, the player modifies the audio-video synchronization mechanism to realize a smooth transition of the audio and video data, achieving the double-speed playback effect through resampling and reducing the delay caused by player start-up time and network fluctuation during live broadcast.
According to the technical scheme, the application-level player configured in the display device stores the pre-decoding audio data obtained by decapsulating the live data into the audio data buffer queue and the pre-decoding video data into the video data buffer queue. When the storage amount of pre-decoding audio data in the queue exceeds the threshold and the discard flag is recognized, the pre-decoding audio data stored in the queue is discarded and the queue is emptied; the newly stored pre-decoding audio data and the pre-decoding video data in the video data buffer queue are then decoded, and the resulting decoded audio and video data are synchronized to obtain new live data for playing. When the discard flag is not recognized, the decoded audio data is resampled so that its audio play timestamps are reset, and the decoded video data is synchronized to the decoded audio data with the reset timestamps, realizing double-speed playback. In the display device provided by the embodiment of the invention, the application-level player thus detects the audio buffer size after decapsulation and, when it exceeds the threshold, modifies the audio-video synchronization mechanism to realize either a smooth transition of the audio and video data (double-speed playback) or a direct jump to the current latest frame (frame-dropping playback), reducing the large delays caused by start-up time and network fluctuation during live broadcast.
A flowchart of a live low-delay processing method of an application-level player according to some embodiments is illustrated in fig. 6. Referring to fig. 6, the live low-delay processing method of an application-level player according to the embodiment of the present invention is performed by the application-level player in the display device provided by the foregoing embodiments and comprises the following steps:
S1, acquiring live broadcast data generated during live broadcast, and performing decapsulation processing on the live broadcast data to obtain pre-decoding audio data and pre-decoding video data;
S2, storing the pre-decoding audio data into an audio data buffer queue, and storing the pre-decoding video data into a video data buffer queue;
S3, if the storage amount of the pre-decoding audio data stored in the audio data buffer queue exceeds a threshold value, identifying whether a discard flag exists, wherein the discard flag is used for characterizing whether the audio data needs to be discarded;
S4, if the discard flag exists, discarding the pre-decoding audio data stored in the audio data buffer queue, i.e., performing the emptying processing;
S5, obtaining the pre-decoding audio data newly stored in the emptied audio data buffer queue, and decoding the newly stored pre-decoding audio data and the pre-decoding video data stored in the video data buffer queue to obtain decoded audio data and decoded video data;
S6, synchronizing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
In a specific implementation, the present invention further provides a computer storage medium that may store a program; when executed, the program may perform some or all of the steps of each embodiment of the live low-delay processing method of the application-level player provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (RAM), or the like.
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium such as a ROM/RAM, magnetic disk, or optical disk, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present invention.
The same or similar parts between the various embodiments in this specification are referred to each other. In particular, for the live low latency processing method embodiment of the application level player, since it is substantially similar to the display device embodiment, the description is relatively simple, and reference is made to the description in the display device embodiment for relevant points.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.
The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. The illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.
Claims (9)
1. A display device, characterized by comprising:
A controller within which is configured an application level player for live broadcast, the application level player configured to:
Acquiring live broadcast data generated during live broadcast, and performing decapsulation processing on the live broadcast data to obtain audio data before decoding and video data before decoding;
Storing the pre-decoding audio data to an audio data buffer queue, and storing the pre-decoding video data to a video data buffer queue;
If the storage amount of the audio data before decoding stored in the audio data cache queue exceeds a threshold value, judging whether a low-delay flag is set, wherein the low-delay flag is used for characterizing whether the low-delay function is turned on;
if the low-delay flag is set, identifying whether a discard flag exists, wherein the low-delay flag and the discard flag are set, according to the user's selection, through the player setting interface invoked by the application-level player;
discarding the pre-decoding audio data stored in the audio data buffer queue for emptying processing if the discard flag exists, wherein the presence of the discard flag characterizes that the situation of the delay being higher than the threshold value is handled by discarding the audio data;
Acquiring pre-decoding audio data newly stored in the emptied audio data cache queue, and decoding the pre-decoding audio data newly stored and the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
Synchronizing the decoded audio data and the decoded video data to obtain new live broadcast data for playing;
if the discard flag does not exist, decoding the pre-decoding audio data stored in the audio data cache queue and the pre-decoding video data stored in the video data cache queue to obtain the post-decoding audio data and the post-decoding video data, wherein the absence of the discard flag characterizes that the situation of the delay being higher than the threshold value is handled by means of double-speed playback;
acquiring appointed decoded audio data based on the decoded audio data stored in the audio data cache queue, wherein the appointed decoded audio data is the decoded audio data whose amount exceeds the threshold value;
Resetting the audio playing time stamp corresponding to the appointed decoded audio data;
And synchronizing the appointed decoded audio data played based on the reset audio playing time stamp with the decoded video data to obtain new live broadcast data for playing.
2. The display device of claim 1, wherein the application level player is further configured to:
if the storage amount of the pre-decoding audio data stored in the audio data cache queue does not exceed the threshold value, decoding the pre-decoding audio data stored in the audio data cache queue and the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
And synchronously processing the decoded audio data and the decoded video data to obtain new live broadcast data for playing.
3. The display device of claim 1 or 2, wherein the application level player, when performing the synchronization processing of the decoded audio data and the decoded video data, is further configured to:
Acquiring an audio time stamp corresponding to the audio data before decoding of the first frame newly stored in the audio data buffer queue;
determining a video discarding time stamp for discarding video data based on an audio time stamp corresponding to the audio data before decoding of the first frame;
and acquiring the appointed decoded video data corresponding to the video discarding time stamp in the video data cache queue, discarding the appointed decoded video data, and realizing synchronous playing of the decoded audio data and the decoded video data.
4. The display device of claim 1, wherein the application level player, upon performing the resetting of the audio play time stamp corresponding to the specified decoded audio data, is further configured to:
Acquiring an audio playing time stamp and corresponding playing time length of the appointed decoded audio data;
adjusting the playing time length according to a preset rule;
and adjusting the audio playing time stamp based on the adjusted playing time length.
5. The display device of claim 1, wherein the application level player, upon performing a synchronization process of the decoded audio data to be played based on the reset audio play time stamp and the decoded video data, obtains new live data to be played, is further configured to:
Acquiring a video playing time stamp of the decoded video data stored in the video data cache queue;
adjusting the video playing time stamp based on a resetting rule of the audio playing time stamp, wherein the adjusted video playing time stamp is the same as the reset audio playing time stamp;
Playing the corresponding appointed decoded audio data according to the reset audio playing time stamp, and playing the corresponding decoded video data according to the adjusted video playing time stamp, so as to realize synchronous playing of the decoded audio data and the decoded video data.
6. The display device of claim 5, wherein the application level player, when executing a reset rule based on the audio play time stamp, adjusts the video play time stamp, is further configured to:
acquiring the audio playing time stamp and the corresponding playing time length;
determining a corresponding video play time stamp based on the audio play time stamp;
Determining the video adjusted playing time length corresponding to the decoded video data based on the playing time length adjusted according to a preset rule;
And adjusting the video playing time stamp based on the video adjusted playing time length.
7. The display device of claim 1, wherein the application level player is further configured to:
and judging whether the storage amount of the audio data before decoding stored in the audio data cache queue exceeds a threshold value or not according to a preset time interval.
8. The display device of claim 1, wherein the application level player is further configured to:
The player setting interface is invoked, a low-delay flag is set to characterize whether the low-delay function is turned on, and a discard flag is set to characterize whether the audio data needs to be discarded.
9. A method for processing live broadcast of an application level player with low delay, the method comprising:
Acquiring live broadcast data generated during live broadcast, and performing decapsulation processing on the live broadcast data to obtain audio data before decoding and video data before decoding;
Storing the pre-decoding audio data to an audio data buffer queue, and storing the pre-decoding video data to a video data buffer queue;
If the storage amount of the audio data before decoding stored in the audio data cache queue exceeds a threshold value, judging whether a low-delay flag is set, wherein the low-delay flag is used for characterizing whether the low-delay function is turned on;
if the low-delay flag is set, identifying whether a discard flag exists, wherein the low-delay flag and the discard flag are set, according to the user's selection, through the player setting interface invoked by the application-level player;
discarding the pre-decoding audio data stored in the audio data buffer queue for emptying processing if the discard flag exists, wherein the presence of the discard flag characterizes that the situation of the delay being higher than the threshold value is handled by discarding the audio data;
Acquiring pre-decoding audio data newly stored in the emptied audio data cache queue, and decoding the pre-decoding audio data newly stored and the pre-decoding video data stored in the video data cache queue to obtain decoded audio data and decoded video data;
Synchronizing the decoded audio data and the decoded video data to obtain new live broadcast data for playing;
if the discard flag does not exist, decoding the pre-decoding audio data stored in the audio data cache queue and the pre-decoding video data stored in the video data cache queue to obtain the post-decoding audio data and the post-decoding video data, wherein the absence of the discard flag characterizes that the situation of the delay being higher than the threshold value is handled by means of double-speed playback;
acquiring appointed decoded audio data based on the decoded audio data stored in the audio data cache queue, wherein the appointed decoded audio data is the decoded audio data whose amount exceeds the threshold value;
Resetting the audio playing time stamp corresponding to the appointed decoded audio data;
And synchronizing the appointed decoded audio data played based on the reset audio playing time stamp with the decoded video data to obtain new live broadcast data for playing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010857879.7A CN114095769B (en) | 2020-08-24 | 2020-08-24 | Live broadcast low-delay processing method of application-level player and display device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114095769A CN114095769A (en) | 2022-02-25 |
CN114095769B true CN114095769B (en) | 2024-05-14 |
Family
ID=80295550
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010857879.7A Active CN114095769B (en) | 2020-08-24 | 2020-08-24 | Live broadcast low-delay processing method of application-level player and display device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114095769B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114630170B (en) * | 2022-03-24 | 2023-10-31 | 抖音视界有限公司 | Audio and video synchronization method and device, electronic equipment and storage medium |
CN117499712A (en) * | 2022-07-26 | 2024-02-02 | 华为技术有限公司 | Synchronization method, system and electronic equipment |
CN119182952A (en) * | 2023-06-21 | 2024-12-24 | 华为技术有限公司 | Wireless screen projection method and device |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20080058991A (en) * | 2006-12-23 | 2008-06-26 | 삼성전자주식회사 | Method and device for playing digital broadcasting in mobile terminal |
CN104113777A (en) * | 2014-08-01 | 2014-10-22 | 广州金山网络科技有限公司 | Audio stream decoding method and device |
CN108271031A (en) * | 2016-12-30 | 2018-07-10 | 北京酷我科技有限公司 | The playback method and device of a kind of live video |
CN109714634A (en) * | 2018-12-29 | 2019-05-03 | 青岛海信电器股份有限公司 | A kind of decoding synchronous method, device and the equipment of live data streams |
CN110248204A (en) * | 2019-07-16 | 2019-09-17 | 广州虎牙科技有限公司 | A kind of processing method, device, equipment and the storage medium of live streaming caching |
CN111294634A (en) * | 2020-02-27 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Live broadcast method, device, system, equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN114095769A (en) | 2022-02-25 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |