WO2022160911A1 - Method for switching a voice scheme on a display device, display device, and control device
- Publication number
- WO2022160911A1 (PCT/CN2021/133767)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice
- data
- video
- display device
- user
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
Definitions
- the present application relates to the field of display technology, and in particular, to a method for switching a voice scheme on a display device, a display device, and a control device.
- some display devices may be equipped with an intelligent voice function, and users can conveniently control the display devices by inputting voice.
- Embodiments of the present application provide a method for switching a voice scheme on a display device, a display device, and a control device.
- a display device including:
- a controller configured to:
- receive a voice control instruction sent by a control device, where the voice control instruction includes action data generated by the user operating the control device for switching the voice scheme of the display device, and voice data input by the user for searching for target content on the display device;
- in response to the voice control instruction, switch the display page of the display to the target voice scheme page corresponding to the action data, and display the target content corresponding to the voice data on the target voice scheme page.
- an embodiment of the present application further provides a control device, including:
- a controller configured to:
- while receiving voice data input by the user, detect the action of the user operating the control device and generate action data; and send the voice control instruction generated by packaging the voice data and the action data to the display device.
- an embodiment of the present application also provides a method for switching a voice scheme on a display device, including:
- receiving a voice control instruction sent by the control device, where the voice control instruction includes action data generated by the user operating the control device for switching the voice scheme of the display device, and voice data input by the user for searching for target content on the display device;
- in response to the voice control instruction, switching the display page of the display to the target voice scheme page corresponding to the action data, and displaying the target content corresponding to the voice data on the target voice scheme page.
- the embodiments of the present application also provide another method for switching voice schemes on a display device, including:
- while receiving voice data input by the user, detecting the action of the user operating the control device and generating action data; and sending the voice control instruction generated by packaging the voice data and the action data to the display device.
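- as an illustration of the packaging step above, the following minimal sketch (in Python) shows how a control device might bundle the voice data and the action data into a single voice control instruction; the JSON layout and the field names ("voice", "action") are illustrative assumptions, since the embodiments do not specify a wire format.

```python
import base64
import json

def package_voice_control_instruction(voice_pcm: bytes, action: dict) -> bytes:
    """Bundle the user's voice data and the detected action data into one
    voice control instruction to send to the display device.

    The field names are hypothetical; the embodiments only require that
    both pieces of data travel together in a single instruction.
    """
    instruction = {
        "voice": base64.b64encode(voice_pcm).decode("ascii"),  # raw audio
        "action": action,  # e.g. {"type": "direction", "value": "left"}
    }
    return json.dumps(instruction).encode("utf-8")

# The user says "what's the weather" while pointing the remote to the left:
packet = package_voice_control_instruction(
    b"\x00\x01",  # placeholder PCM audio bytes
    {"type": "direction", "value": "left"},
)
```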
- FIG. 1 shows a schematic diagram of a usage scenario of a display device according to some embodiments
- FIG. 2 shows a block diagram of the hardware configuration of the control apparatus 100 according to some embodiments
- FIG. 3 shows a block diagram of a hardware configuration of a display device 200 according to some embodiments
- FIG. 4 shows a software configuration diagram in the display device 200 according to some embodiments
- FIG. 5 shows a schematic diagram of an interaction between the controller 250 and the controller 110 according to some embodiments
- FIG. 6 shows a schematic diagram of a second interaction between the controller 110 and the controller 250 according to some embodiments
- FIG. 7 shows a first schematic diagram of a voice scheme page according to some embodiments.
- FIG. 8 shows a second schematic diagram of a voice scheme page according to some embodiments.
- FIG. 9 shows a schematic diagram of a user-operated control device 100 according to some embodiments.
- FIG. 10 shows a schematic diagram of a third interaction between the controller 110 and the controller 250 according to some embodiments.
- FIG. 11 shows a second schematic diagram of a user-operated control device 100 according to some embodiments.
- FIG. 12 shows a schematic diagram of a fourth interaction between the controller 110 and the controller 250 according to some embodiments.
- FIG. 13 shows a third schematic diagram of a user-operated control device 100 according to some embodiments.
- FIG. 14 shows a flow chart of a method for switching voice schemes on a display device according to some embodiments
- FIG. 15 shows another flowchart of a method for switching voice schemes on a display device according to some embodiments
- FIG. 16 shows a block diagram of a video search system according to some embodiments.
- FIG. 17 shows a schematic diagram of a user interface in the display device 200 according to some embodiments.
- FIG. 18 shows a schematic diagram of a user interface in yet another display device 200 according to some embodiments.
- FIG. 19 shows a schematic diagram of a user interface in yet another display device 200 according to some embodiments.
- FIG. 20 shows a schematic diagram of a user interface in yet another display device 200 according to some embodiments.
- FIG. 21 shows a signaling diagram of a video search method according to some embodiments
- FIG. 22 shows a signaling diagram of yet another video search method according to some embodiments.
- module refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code capable of performing the functions associated with that element.
- FIG. 1 shows a schematic diagram of a usage scenario of a display device according to some embodiments.
- the display device 200 also performs data communication with the server 400 , and the user can operate the display device 200 through the smart device 300 or the control device 100 .
- the control device 100 may be a remote control, and the communication between the remote control and the display device includes at least one of infrared protocol communication, Bluetooth protocol communication, or other short-range communication methods, and the display device 200 is controlled wirelessly or by wire.
- the user can control the display device 200 by inputting user instructions through at least one of keys on the remote control, voice input, and control panel input.
- the smart device 300 may include any one of a mobile terminal, a tablet computer, a computer, a laptop computer, an AR/VR device, and the like.
- the smart device 300 may also be used to control the display device 200 .
- the display device 200 is controlled using an application running on the smart device.
- the display device may not use the above-mentioned smart device or control device to receive instructions, but receive user control through touch or gesture.
- the smart device 300 and the display device may also be used to communicate data.
- the display device 200 can also be controlled in a manner other than the control apparatus 100 and the smart device 300.
- the module for acquiring voice commands configured inside the display device 200 can directly receive the user's voice command for control.
- the user's voice command control can also be received through the voice control device provided outside the display device 200 .
- the display device 200 is also in data communication with the server 400 .
- the display device 200 may communicate with the server 400 via a local area network (LAN), a wireless local area network (WLAN), or other networks.
- the server 400 may provide various contents and interactions to the display device 200 .
- the server 400 may be a cluster or multiple clusters, and may include one or more types of servers.
- the software steps executed by one step execution body can be migrated to another step execution body that is in data communication with it for execution as required.
- the software steps executed by the server may be migrated to be executed on the display device with which it is in data communication as required, and vice versa.
- FIG. 2 shows a block diagram of a hardware configuration of the control apparatus 100 according to some embodiments.
- the control device 100 includes a controller 110 , a communication interface 130 , a user input/output interface 140 , a memory, and a power supply.
- the control device 100 can receive the user's input operation instruction, and convert the operation instruction into an instruction that the display device 200 can recognize and respond to, and play an intermediary role between the user and the display device 200 .
- the communication interface 130 is used for external communication, including at least one of a WIFI chip, a Bluetooth module, NFC or an alternative module.
- the user input/output interface 140 includes at least one of a microphone, a touchpad, a sensor, a button, or an alternative module.
- FIG. 3 shows a block diagram of a hardware configuration of the display device 200 according to some embodiments.
- the display device 200 includes at least one of a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, and a user interface.
- the controller includes a central processing unit, a video processing unit, an audio processing unit, a graphics processing unit, a RAM, a ROM, and a first interface to an nth interface for input/output.
- the display 260 includes a display screen component for presenting a picture and a driving component for driving image display, and is used for receiving image signals output from the controller and displaying video content, image content, menu manipulation interface components, user-manipulated UI interfaces, and the like.
- the display 260 may be at least one of a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
- the tuner demodulator 210 receives broadcast television signals by wired or wireless means, and demodulates audio/video signals, as well as EPG data signals, from a plurality of wireless or cable broadcast television signals.
- communicator 220 is a component for communicating with external devices or servers according to various communication protocol types.
- the communicator may include at least one of a Wifi module, a Bluetooth module, a wired Ethernet module and other network communication protocol chips or near field communication protocol chips, and an infrared receiver.
- the display device 200 may establish transmission and reception of control signals and data signals with the control apparatus 100 or the server 400 through the communicator 220 .
- the detector 230 is used to collect signals from the external environment or interaction with the outside.
- the detector 230 includes a light receiver, i.e., a sensor for collecting ambient light intensity; alternatively, the detector 230 includes an image collector, such as a camera, which can be used to collect external environment scenes, user attributes, or user interaction gestures; or, the detector 230 includes a sound collector, such as a microphone, for receiving external sound.
- the external device interface 240 may include, but is not limited to, any one or more of the following: a High Definition Multimedia Interface (HDMI), an analog or data high-definition component input interface (Component), a composite video input interface (CVBS), a USB input interface (USB), RGB ports, and the like. It may also be a composite input/output interface formed by a plurality of the above-mentioned interfaces.
- the controller 250 and the tuner 210 may be located in different separate devices; that is, the tuner 210 may also be located in a device external to the main device where the controller 250 is located, such as an external set-top box or the like.
- the controller 250 controls the operation of the display device and responds to user operations.
- the controller 250 controls the overall operation of the display apparatus 200 .
- the controller 250 may perform an operation related to the object selected by the user command.
- the object may be any of the selectable objects, such as hyperlinks, icons, or other operable controls.
- the operations related to the selected object include: an operation of displaying a page, document, image, etc. connected to a hyperlink, or an operation of executing the program corresponding to the icon.
- the controller includes at least one of a central processing unit (CPU), a video processor, an audio processor, a graphics processing unit (GPU), a random access memory (RAM), a read-only memory (ROM), first to nth interfaces for input/output, a communication bus (Bus), and the like.
- the CPU is used to execute the operating system and application program instructions stored in the memory, and to process various application programs, data, and content according to the various interactive instructions received from external input, so as to finally display and play various audio and video content.
- the CPU may include multiple processors, for example, a main processor and one or more sub-processors.
- the graphics processor is used to generate various graphic objects, such as at least one of icons, operation menus, and graphics displayed in response to user input instructions.
- the graphics processor includes an operator, which performs operations by receiving the various interactive instructions input by the user and displays various objects according to their display attributes; it also includes a renderer, which renders the various objects obtained by the operator, and the rendered objects are then displayed on the display.
- the video processor is used to perform at least one of decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis on the received external video signal according to its standard codec protocol, to obtain a signal that can be directly displayed or played on the display device 200.
- the video processor includes at least one of a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like.
- the demultiplexing module is used for demultiplexing the input audio and video data stream.
- the video decoding module is used to process the demultiplexed video signal, including decoding and scaling.
- the image synthesis module, such as an image synthesizer, is used for superimposing and mixing the GUI signal generated by the graphics generator (based on user input or by the system itself) with the scaled video image, so as to generate an image signal that can be displayed.
- the frame rate conversion module is used to convert the input video frame rate.
- the display formatting module is used to convert the received frame-rate-converted video signal into a video output signal conforming to the display format, such as an output RGB data signal.
- the audio processor is configured to receive an external audio signal, perform decompression and decoding according to the standard codec protocol of the input signal, and perform at least one of noise reduction, digital-to-analog conversion, and amplification, to obtain a sound signal that can be played by the loudspeaker.
- the user may input user commands on a graphical user interface (GUI) displayed on the display 260, and the user input interface receives the user input commands through the GUI.
- the user may input a user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through a sensor to receive the user input command.
- a "user interface” is a medium interface for interaction and information exchange between an application program or an operating system and a user, which enables conversion between an internal form of information and a form acceptable to the user.
- the user interface 280 is an interface that can be used to receive control input (e.g., physical buttons on the display device body, or others).
- the system of the display device may include a kernel (Kernel), a command parser (shell), a file system and an application program.
- the kernel, shell, and file system make up the basic operating system structures that allow users to manage files, run programs, and use the system.
- the kernel starts, activates the kernel space, abstracts hardware, initializes hardware parameters, etc., runs and maintains virtual memory, scheduler, signals and inter-process communication (IPC).
- IPC inter-process communication
- the shell and user applications are loaded.
- An application is compiled into machine code after startup, forming a process.
- the system of the display device is divided into three layers, which are respectively an application layer, a middleware layer and a hardware layer from top to bottom.
- the application layer mainly includes common applications on the TV and the Application Framework; common applications are mainly applications developed based on a browser, such as HTML5 apps, and native applications (Native APPs);
- the Application Framework is a complete program model with all the basic functions required by standard application software, such as file access and data exchange, and the usage interfaces of these functions (toolbar, status bar, menu, dialog box).
- Native APPs can support online or offline, message push or local resource access.
- the middleware layer includes middleware such as various TV protocols, multimedia protocols, and system components.
- the middleware can use the basic services (functions) provided by the system software to connect various parts of the application system or different applications on the network, and can achieve the purpose of resource sharing and function sharing.
- the hardware layer mainly includes the HAL interface, hardware and drivers.
- the HAL interface is a unified interface for connecting all TV chips, and the specific logic is implemented by each chip.
- Drivers mainly include: audio driver, display driver, Bluetooth driver, camera driver, WIFI driver, USB driver, HDMI driver, sensor driver (such as fingerprint sensor, temperature sensor, pressure sensor, etc.), and power driver.
- some display devices 200 may be equipped with an intelligent voice function, and the user can conveniently control the display device 200 by inputting a voice.
- multiple voice solutions can be applied to the display device 200, so a display device 200 may have multiple voice schemes, such as the popular Amazon Alexa voice scheme and the Google voice scheme, as well as voice schemes specific to certain countries or regions.
- the voice scheme in the display device 200 mainly collects the voice content input by the user through a sound-pickup device such as a remote control.
- in the related art, there are generally two ways for the user to select the voice scheme that handles the current voice content: one is that the user interacts with the UI menu of the display device 200 and selects a voice scheme on the UI menu through the remote control; the other is that the display device 200 is controlled to use different voice schemes by pressing different buttons or voice keys, for example, pressing a first voice key to use a first voice scheme and pressing a second voice key to use a second voice scheme.
- the former increases the interaction complexity for the user, thereby affecting the user experience, while the latter increases the redundancy of same-function buttons on the remote control, increasing hardware cost or the degree of button reuse. Either way, the difficulty of the user's operation when switching the voice scheme is increased.
- the embodiments of the present application provide a method for switching voice schemes on a display device, a display device 200, and a control device 100, which associate different voice schemes with actions of the user operating the control device 100; the user can operate the control device 100 according to his or her own preference to generate a corresponding action, and thereby control the display device 200 to use the voice scheme that the user wants to select.
- the use of keys of the control device 100 can be reduced, and the difficulty of switching the voice scheme can be reduced.
- the display device 200 in the embodiment of the present application has the controller 250 .
- the controller 250 may receive a voice control instruction or the like input by the user to the display device 200 through the control device 100, and control the display 260 to switch the display page to the page of the voice scheme desired by the user according to the voice control instruction.
- controller 110 is also included in the control device 100 .
- the controller 110 may receive voice data input by the user, and use or control a sensor or the like to detect motion data of the user operating the control device 100 .
- the data transmission and processing between the display device 200 and the control device 100 are performed by their respective controllers.
- FIG. 5 shows a schematic diagram of an interaction between the controller 250 and the controller 110 according to some embodiments.
- the user may input, through the control device 100, voice data about the content the user wants to view on the display device 200, such as "how is the weather" and the like.
- different voice schemes in the display device 200 can be associated in advance with different actions of the control device 100.
- the control device 100 can be operated to make the action corresponding to the target voice scheme; the controller 110 then packages the detected action data together with the voice data to generate a voice control instruction and sends it to the display device 200.
- after receiving the voice control instruction, the controller 250 of the display device 200 finds the target voice scheme corresponding to the action data and the target content corresponding to the voice data. Then, the controller 250 switches the display page of the display 260 to the target voice scheme page corresponding to the action data, and displays the target content corresponding to the voice data on the target voice scheme page.
- the actions of the user operating the control device 100 in the embodiments of the present application may include, but are not limited to, rotating the direction the control device 100 points to, holding the control device 100 to move, or drawing gestures on the control device 100 .
- the display device 200 can have multiple voice schemes at the same time; different voice schemes can correspond to different directions, different movement trajectories, or different gestures.
- the user can control the display device 200 to switch to the corresponding target voice scheme page for display only by operating the control device 100 to perform certain actions.
- This method can avoid installing too many voice control buttons on the control device 100, and can also prevent the user from frequently interacting with the UI menu of the display device 200, so that the user's voice scheme switching operation is more concise.
- the control apparatus 100 in this embodiment of the present application may be a remote controller connected to the display device 200 via Bluetooth, or may be a smart terminal installed with a virtual remote controller, or the like.
- when the control device 100 is a remote control, the user can operate the remote control to rotate in different directions, and can also operate the remote control to move along a certain trajectory; when the control device 100 is a smart terminal equipped with a virtual remote control, the user can not only operate the smart terminal to rotate in different directions and move along a certain trajectory, but can also draw different gestures on the display screen of the smart terminal.
- sensors such as a gravity sensor or a gyroscope are installed in control devices 100 such as remote controls and smart terminals, so as to detect direction data and trajectory data.
- FIG. 6 shows a schematic diagram of a second interaction between the controller 110 and the controller 250 according to some embodiments.
- when the control device 100 is a remote control, the controller 110 can use a sensor to detect the direction in which the user operates the control device 100 while receiving the voice data input by the user, and generate direction data, such as left or right. Then, the controller 110 packages the direction data and the voice data to generate a voice control instruction and sends it to the display device 200.
- after receiving the voice control instruction, the controller 250 of the display device 200 parses it to obtain the voice data and the direction data. Then, the controller 250 obtains the target voice scheme page corresponding to the direction data and switches the display page of the display 260 to the target voice scheme page; at the same time, the controller 250 obtains the target content corresponding to the voice data and displays the target content on the target voice scheme page.
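- a minimal sketch of this display-device-side flow is shown below (Python); the mapping table, field names, and the assumption that the voice data has already been recognized into text are all illustrative, not the patent's implementation.

```python
# Assumed mapping from direction data to voice scheme pages; the embodiments
# associate, e.g., "left" with voice scheme A and "right" with voice scheme B.
DIRECTION_TO_SCHEME = {"left": "voice_scheme_A", "right": "voice_scheme_B"}

def handle_voice_control_instruction(instruction: dict) -> tuple[str, str]:
    """Parse the instruction into direction data and voice data, then return
    the target voice scheme page and the query to display on it."""
    direction = instruction["action"]["value"]    # e.g. "left"
    query = instruction["voice_text"]             # e.g. "what's the weather"
    return DIRECTION_TO_SCHEME[direction], query

page, query = handle_voice_control_instruction(
    {"action": {"type": "direction", "value": "left"},
     "voice_text": "what's the weather"}
)
# page == "voice_scheme_A": the display switches to that page and shows
# the target content found for `query` on it.
```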
- FIG. 7 shows a first schematic diagram of a voice scheme page according to some embodiments.
- FIG. 8 shows a second schematic diagram of a voice scheme page according to some embodiments.
- FIG. 9 shows a schematic diagram of a user-operated control device 100 according to some embodiments.
- the specific operation mode is shown in FIG. 9 .
- the left direction data of the remote controller corresponds to the voice scheme A
- the right direction data of the remote controller corresponds to the voice scheme B.
- the user can hold the remote controller and point to the left while speaking the voice content (for example, "what's the weather") to the remote controller, and the controller 110 sends the received voice data and direction data to the display device 200 .
- the controller 250 of the display device 200 obtains the voice solution A corresponding to the direction data on the left, and simultaneously obtains the target content (eg, the weather data at the current moment) corresponding to the voice data.
- the controller 250 switches the display page of the display 260 to the display page of the voice solution A, as shown in FIG. 7 , and displays the weather information of the current moment on the right side of the display page.
- the user can hold the remote controller and point to the right while speaking the voice content (for example, "what's the weather") to the remote controller, and the controller 110 sends the received voice data and direction data to the display device 200 .
- the controller 250 of the display device 200 obtains the voice solution B corresponding to the right direction data, and at the same time obtains the target content corresponding to the voice data (for example, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of the voice solution B, as shown in FIG. 8 , and displays the weather information at the current moment at the bottom of the display page.
- the display position of the target content shown in FIG. 7 and FIG. 8 is not unique; the target content can also be displayed at any location on the target voice scheme page according to the user's preference or actual needs.
- FIG. 10 shows a third schematic diagram of interaction between the controller 110 and the controller 250 according to some embodiments.
- the controller 110 may also use a sensor to detect the trajectory along which the user moves the control device 100 (i.e., the remote control) while receiving the voice data input by the user, and generate trajectory data, such as a circle, a square, or a triangle. Then, the controller 110 packages the trajectory data and the voice data to generate a voice control instruction and sends it to the display device 200.
- after receiving the voice control instruction, the controller 250 of the display device 200 parses it to obtain the voice data and the trajectory data. Then, the controller 250 obtains the target voice scheme page corresponding to the trajectory data and switches the display page of the display 260 to the target voice scheme page; at the same time, the controller 250 obtains the target content corresponding to the voice data and displays the target content on the target voice scheme page.
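- the embodiments do not specify how the controller 110 turns raw sensor samples into trajectory data; one possible sketch (Python) classifies a closed sampled path by its isoperimetric quotient — a heuristic, with thresholds, that is an assumption rather than the patent's method.

```python
import math

def isoperimetric_quotient(points: list[tuple[float, float]]) -> float:
    """4*pi*Area / Perimeter^2 of a closed sampled path: 1.0 for a perfect
    circle, ~0.79 for a square, ~0.60 for an equilateral triangle."""
    area = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]     # wrap around to close the path
        area += x0 * y1 - x1 * y0        # shoelace formula
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(area) / 2.0
    return 4.0 * math.pi * area / (perimeter ** 2) if perimeter else 0.0

def classify_trajectory(points: list[tuple[float, float]]) -> str:
    """Map a closed motion trajectory to trajectory data such as "circle",
    "square", or "triangle" (thresholds are illustrative assumptions)."""
    q = isoperimetric_quotient(points)
    if q > 0.90:
        return "circle"
    if q > 0.70:
        return "square"
    return "triangle"

# A sampled circular motion of the remote control:
circle = [(math.cos(2 * math.pi * t / 16), math.sin(2 * math.pi * t / 16))
          for t in range(16)]
assert classify_trajectory(circle) == "circle"
```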
- FIG. 11 shows a second schematic diagram of a user-operated control device 100 according to some embodiments.
- the circular trajectory data moved by the remote controller corresponds to the voice scheme A
- the triangular trajectory data corresponds to the voice scheme B.
- while the user speaks the voice content (such as "how is the weather") to the remote control, the user moves the remote control to draw a circular trajectory, for example, as shown in FIG. 11; correspondingly, the page of voice scheme A can be displayed on the display device 200, with the weather information of the current moment displayed on the page, as shown in FIG. 7. Alternatively, while the user speaks the voice content (such as "how is the weather") to the remote control, the user moves the remote control to draw a triangular trajectory; correspondingly, the page of voice scheme B can be displayed on the display device 200, with the weather information of the current moment displayed on the page, as shown in FIG. 8.
- after a virtual remote control is installed on a smart terminal, the smart terminal can also be used as the control device 100; for example, a virtual remote control may be installed on a smartphone or the like. Since the smart terminal itself has a gravity sensor or a gyroscope, it can realize the same functions as the above-mentioned physical remote control, and can detect both the direction data and the trajectory data of its own movement.
- since the smart terminal itself also has a display screen, the smart terminal can further detect gestures drawn by the user on the display screen and generate gesture data.
- FIG. 12 shows a schematic diagram of a fourth interaction between the controller 110 and the controller 250 according to some embodiments.
- when the control device 100 is a smart terminal, the controller 110 can also detect the gesture the user draws on the display screen of the control device 100 while receiving the voice data input by the user, and generate gesture data, such as a "Z"-shaped gesture, an "O"-shaped gesture, or an "L"-shaped gesture drawn with a finger. Then, the controller 110 packages the gesture data and the voice data to generate a voice control instruction and sends it to the display device 200.
- after receiving the voice control instruction, the controller 250 of the display device 200 parses it to obtain the voice data and the gesture data. Then, the controller 250 obtains the target voice scheme page corresponding to the gesture data and switches the display page of the display 260 to the target voice scheme page; at the same time, the controller 250 obtains the target content corresponding to the voice data and displays the target content on the target voice scheme page.
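- similarly, gesture data might be derived by quantizing the touch path into compass directions and matching the resulting signature against templates; the sketch below (Python) is a hedged illustration, with the templates and gesture names assumed rather than taken from the embodiments.

```python
import math

DIRECTIONS = ["E", "SE", "S", "SW", "W", "NW", "N", "NE"]

def quantize(dx: float, dy: float) -> str:
    """Quantize one stroke segment into one of 8 compass directions
    (screen coordinates: y grows downward)."""
    angle = math.degrees(math.atan2(dy, dx)) % 360
    return DIRECTIONS[int((angle + 22.5) // 45) % 8]

def gesture_data(points: list[tuple[float, float]]) -> str:
    """Collapse a touch path into a direction signature and match it
    against assumed gesture templates."""
    dirs: list[str] = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = quantize(x1 - x0, y1 - y0)
        if not dirs or dirs[-1] != d:    # collapse repeated directions
            dirs.append(d)
    templates = {"E-SW-E": "Z", "S-E": "L"}   # hypothetical templates
    return templates.get("-".join(dirs), "unknown")

# An "L"-shaped gesture: the finger moves down, then right.
assert gesture_data([(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]) == "L"
```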
- FIG. 13 shows a third schematic diagram of a user-operated control device 100 according to some embodiments.
- for example, the user can draw a "Z"-shaped gesture on the display screen while speaking the voice content to the smart terminal. After parsing the voice control instruction, the controller 250 of the display device 200 obtains voice scheme A corresponding to the "Z"-shaped gesture data, and at the same time obtains the target content corresponding to the voice data (for example, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of voice scheme A, as shown in FIG. 7, and displays the weather information of the current moment on the right side of the display page.
- the user can draw an "L"-shaped gesture on the display screen while speaking the voice content (such as "how is the weather") to the smart terminal, and the controller 110 sends the received voice data and gesture data to the display device 200.
- the controller 250 of the display device 200 obtains the voice scheme B corresponding to the "L"-shaped gesture data, and simultaneously obtains the target content corresponding to the voice data (eg, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of the voice solution B, as shown in FIG. 8 , and displays the weather information of the current moment on the right side of the display page.
- a voice button is set on both the remote controller and the virtual remote controller.
- the user can speak the voice content by pressing the button, and release the button after the input of the voice content is completed.
- the controller 110 needs to start detecting the user's operation of the remote controller when the user presses the voice button, so as to ensure that the motion data and the voice data are collected synchronously.
- the action time may be longer than the voice input time; that is, the user has finished speaking the voice content, but the operation action has not been completed. In this case, the user can keep the voice button pressed until the operation action is completed.
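- one way to keep the two data streams aligned with the voice button is sketched below in Python; the class and its interface are hypothetical, since the embodiments only require that pressing the button starts both captures together and that releasing it ends the session.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceButtonSession:
    """Pressing the voice button starts audio capture and action detection
    at the same moment, so the voice data and action data stay synchronized;
    releasing the button ends the session, even when the user kept holding
    the button after finishing the spoken content to complete the action."""
    audio: bytearray = field(default_factory=bytearray)
    motion: list = field(default_factory=list)
    active: bool = False

    def press(self) -> None:
        self.active = True              # start both captures together

    def feed(self, audio_chunk: bytes, motion_sample: tuple) -> None:
        if self.active:                 # collect only while the button is held
            self.audio.extend(audio_chunk)
            self.motion.append(motion_sample)

    def release(self) -> tuple[bytes, list]:
        self.active = False             # everything captured is packaged
        return bytes(self.audio), self.motion
```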
- the remote controller and the smart terminal mentioned in the foregoing embodiments may control the display device 200 independently, or may control the display device 200 jointly.
- for the remote control, the display device 200 can be controlled by using the direction data and the trajectory data separately, or by using the direction data and the trajectory data simultaneously.
- for the smart terminal, the display device 200 can be controlled by using the direction data, the trajectory data, and the gesture data separately, or by using any two or all three of them simultaneously.
- the control device 100 provided by the embodiments of the present application detects the action of the user operating the control device 100 while receiving the voice data input by the user and generates action data, then packages the voice data and the action data to generate a voice control instruction, which is sent to the display device 200.
- there is no need to set too many redundant buttons on the control device 100: with only one voice button to receive the voice content input by the user, and by detecting the user's operation action, the display device 200 can be switched to the page of the corresponding voice scheme and display the target content, which reduces the difficulty of switching the voice scheme.
- after receiving the voice control instruction, the display device 200 switches the display page of the display 260 to the voice scheme page corresponding to the action data, and displays the target content corresponding to the voice data on that voice scheme page.
- the display device 200 also avoids the user having to manually select the voice scheme through the UI menu, which further reduces the difficulty of switching the voice scheme.
- in the above interaction between the control device 100 and the display device 200, different voice schemes are associated with actions of the user operating the control device 100; the user can operate the control device 100 according to his or her own preference to generate the corresponding action, and thereby control the display device 200 to use the voice scheme that the user wants to select.
- the key setting of the control device 100 is reduced, and the use of the UI menu of the display device 200 is also avoided, so that the switching difficulty of the voice scheme is reduced.
- FIG. 14 shows a flowchart of a method for switching a voice scheme on a display device according to some embodiments.
- This embodiment of the present application provides a method for switching voice schemes that can be applied to the display device 200 in the foregoing embodiment.
- the method is executed by the controller 250 or other control components that can realize the control function, and may specifically include the following steps:
- Step S101: receiving a voice control instruction sent by the control device 100.
- the voice control instruction includes action data of the user operating the control apparatus 100 for switching the voice scheme of the display device 200 , and voice data input by the user for searching the target content on the display device 200 .
- Step S102: in response to the voice control instruction, switching the display page of the display 260 to the target voice scheme page corresponding to the action data, and displaying the target content corresponding to the voice data on the target voice scheme page.
- the method further includes: parsing the voice control instruction to obtain voice data and direction data; the direction data is data generated according to the direction in which the user operates the control device 100; obtaining target content corresponding to the voice data; When the display page of the display 260 is switched to the target voice scheme page corresponding to the direction data, the target content is displayed on the target voice scheme page.
- the method further includes: parsing the voice control instructions to obtain voice data and trajectory data; the trajectory data is data generated according to the trajectory of the user operating the control device 100; acquiring target content corresponding to the voice data; When the display page of the display 260 is switched to the target voice scheme page corresponding to the trajectory data, the target content is displayed on the target voice scheme page.
- the method further includes: parsing the voice control instruction to obtain voice data and gesture data; the gesture data is data generated according to the gesture input by the user on the control device 100; obtaining target content corresponding to the voice data; While switching the display page of the display 260 to the target voice scheme page corresponding to the gesture data, the target content is displayed on the target voice scheme page.
- FIG. 15 shows another flowchart of a method for switching a voice scheme on a display device according to some embodiments.
- This embodiment of the present application also provides a voice solution switching method that can be applied to the control device 100 in the foregoing embodiment.
- the method is executed by the controller 110 or other control components that can realize the control function and, as shown in FIG. 15, may include the following steps:
- Step S201: while receiving the voice data input by the user, detecting the action of the user operating the control device 100 and generating action data.
- Step S202: sending the voice control instruction generated by packaging the voice data and the action data to the display device 200.
- the method further includes: while receiving the voice data input by the user, using a sensor to detect the direction in which the user operates the control device 100 and generating direction data; wherein different direction data correspond to different voice schemes in the display device 200.
- the method further includes: while receiving the voice data input by the user, using a sensor to detect the trajectory along which the user operates the control device 100 and generating trajectory data; wherein different trajectory data correspond to different voice schemes in the display device 200.
- the method further includes: while receiving the voice data input by the user, detecting the gesture input by the user on the control device 100 and generating gesture data; wherein different gesture data correspond to different voice schemes in the display device 200.
- An embodiment of the present application provides a server, where the server is configured to execute:
- receiving sound data collected by the display device, where the sound data at least includes a video resource name; when the sound data also includes a video application name and a video application corresponding to the video application name is installed on the display device, searching, in the video application corresponding to the video application name, for the video resource corresponding to the video resource name, and feeding back the video resource to the display device;
- when the sound data also includes a video application name but the video application corresponding to the video application name is not installed on the display device, feeding back no video resource to the display device.
- the server includes:
- a speech recognition sub-server, configured to receive the sound data collected by the display device, identify at least the video resource name from the sound data, and send the data identified from the sound data to the instruction generation sub-server;
- an instruction generation sub-server, configured to generate a resource search instruction according to the data identified from the sound data, and send the resource search instruction to the display device;
- a video search sub-server, configured such that, when the data identified from the sound data also includes a video application name and a video application corresponding to the video application name is installed on the display device, it receives the video search request sent by the display device, searches, in the video application corresponding to the video application name, for the video resource corresponding to the video resource name according to the video search request, and feeds back the video resource to the display device.
- the video search sub-server is further configured such that, when the data identified from the sound data also includes a video application name but the video application corresponding to the video application name is not installed on the display device, no video search request sent by the display device is received.
- the video search sub-server is further configured such that, when the data identified from the sound data does not contain a video application name and a video application is running on the display device, it receives the video search request sent by the display device, searches for the video resource corresponding to the video resource name in the currently running video application according to the video search request, and feeds back the video resource to the display device.
- the video search sub-server is further configured such that, when the data identified from the sound data does not contain a video application name and no video application is running on the display device, it receives the video search request sent by the display device, searches for the video resource corresponding to the video resource name in all video applications installed on the display device according to the video search request, and feeds back the video resource to the display device.
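- the four cases handled by the video search sub-server can be summarized in a small decision function; the sketch below (Python) is illustrative only, with names and return values chosen for readability rather than taken from the embodiments.

```python
def route_video_search(app_name: str | None, installed: list[str],
                       running: str | None, resource: str) -> str:
    """Decide where a video search runs, following the four cases above."""
    if app_name is not None:
        if app_name in installed:
            # App name recognized and installed: search inside that app.
            return f"search '{resource}' in {app_name}"
        # App name recognized but not installed: the display device sends no
        # search request, so no video resource is fed back.
        return "no search: application not installed"
    if running is not None:
        # No app name, but an app is running: search the running application.
        return f"search '{resource}' in currently running {running}"
    # No app name and no app running: whole-device search across all apps.
    return f"search '{resource}' in all of {installed}"

print(route_video_search("application A", ["application A"], None, "video X"))
# -> search 'video X' in application A
```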
- Some embodiments of the present application also provide a display device, including:
- a sound collector, configured to collect the user's sound data;
- a controller, configured to send the sound data to the server, where the sound data at least contains a video resource name;
- when the sound data also includes a video application name and a video application corresponding to the video application name is installed on the display device, a video resource corresponding to the video resource name is received from the server, wherein the video resource is searched for in the video application corresponding to the video application name;
- when the sound data further includes a video application name but the video application corresponding to the video application name is not installed on the display device, no video resource is received from the server.
- when the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, the controller is further configured to display prompt information on the display, where the prompt information is used to prompt the user that the video application corresponding to the video application name is not installed on the display device.
- the controller is further configured such that, when the sound data does not contain a video application name and no video application is running on the display device, a video resource corresponding to the video resource name is received from the server, where the video resource is searched from all video applications installed on the display device as described above.
- Some embodiments of the present application also provide a video search method, applied to a display device, including:
- collecting sound data input by the user and sending the sound data to a server, where the sound data at least contains a video resource name; when the sound data also includes a video application name and a video application corresponding to the video application name is installed on the display device, receiving a video resource corresponding to the video resource name from the server, wherein the video resource is searched for in the video application corresponding to the video application name;
- when the sound data also includes a video application name but the video application corresponding to the video application name is not installed on the display device, receiving no video resource from the server.
- the display device is integrated with an intelligent voice assistant, and users can use the remote control to conduct video searches through the intelligent voice assistant.
- a traditional display device usually performs a whole-device search, that is, it searches for videos across the multiple video applications installed on the display device at the same time, so it is impossible to search for videos in a specified video application, resulting in a poor video search experience for users.
- the present application provides a video search system; as shown in the block diagram of FIG. 16, the system includes a display device 200 and a server 400.
- the embodiment of the present application is a scenario of interaction between a display device and a server.
- a variety of video applications are installed on the display device, and the server is used to identify the sound data collected by the display device, and at the same time provide video resources of the various video applications.
- the user inputs sound data to the display device, and the sound collector of the display device collects the sound data input by the user.
- the display device can send the transcoded sound data to the server.
- the sound data includes at least a video resource name.
- after receiving the sound data, the server identifies data from the sound data, specifically at least the video resource name.
- in one case, the sound data further includes a video application name, and a video application corresponding to the video application name is installed on the display device. That is, the server not only recognizes the video resource name from the sound data, but also recognizes the video application name, and the display device has the video application installed. Then, the display device invokes the search interface of the video application and searches the server for the video resource corresponding to the video resource name. After the search succeeds, the server feeds back the video resource to the display device.
- in another case, the sound data further includes a video application name, but the video application corresponding to the video application name is not installed on the display device. Then, the display device cannot call the search interface of the video application, cannot search the server for the video resource corresponding to the video resource name, and the server cannot feed back the video resource to the display device.
- for example, when the user inputs the sound data "search for video X in video application A", the display device sends the sound data to the server.
- the server identifies the video resource name as video X and the video application name as application A from the sound data.
- the display device calls the search interface of video application A to search for video X in the server. After the video resource of video X is found, it is fed back to the display device. Thereby, the purpose of searching for a specified video resource in a specified video application through the voice assistant is realized, which improves the user's video search experience.
- if video application A is not installed on the display device, the display device cannot call the search interface of video application A, so it cannot search for video X in the server, and the video resource of video X cannot be fed back to the display device.
- in another case, the sound data does not contain a video application name and a video application is currently running on the display device. Then, the search interface of the currently running video application is called, and the server is searched for the video resource corresponding to the video resource name. After the video resource corresponding to the video resource name is found, the video resource is fed back to the display device.
- in yet another case, the sound data does not contain a video application name and no video application is running on the display device. Then, the whole-device search function is invoked, and the video resources are searched for across all video applications installed on the display device.
- for example, when the user inputs the sound data "search for video X", the display device sends the sound data to the server.
- the server can only identify the video resource name, video X, from the sound data.
- if video application B is currently running on the display device, the display device calls the search interface of video application B to search the server for the video resource of video X; after the video resource of video X is found, it is fed back to the display device.
- if no video application is running on the display device, the whole-device search function is invoked to search in all video applications installed on the display device; after the video resources of video X are found, they are fed back to the display device.
- the video resources of video X found in different video applications may be displayed in an order according to the user's preference for each video application.
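- a possible ordering rule is sketched below (Python); the preference scores are assumptions, since the embodiments do not define how preference is measured.

```python
# Results of a whole-device search, tagged with the application they came from.
results = [("application B", "video X"), ("application C", "video X"),
           ("application A", "video X")]
# Hypothetical per-application preference scores for this user.
preference = {"application A": 0.9, "application B": 0.6, "application C": 0.2}
# Show results from the user's most-preferred applications first.
results.sort(key=lambda r: preference.get(r[0], 0.0), reverse=True)
# -> application A's result first, then application B's, then application C's
```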
- the server 400 includes a speech recognition sub-server 400A, an instruction generation sub-server 400B, and a video search sub-server 400C.
- the speech recognition sub-server may be a server of an intelligent speech partner, which is used for parsing speech and semantics, and recognizing relevant instructions.
- the instruction generation sub-server and the video search sub-server may be local servers for generating relevant search instructions according to the parsed semantics.
- the video search sub-server is used for receiving search requests from the display device and feeding back relevant resources.
- the user inputs sound data to the display device, and the sound collector of the display device collects the sound data input by the user.
- the display device may send the transcoded sound data to the speech recognition sub-server.
- the sound data includes at least a video resource name.
- after receiving the sound data, the speech recognition sub-server performs speech and semantic analysis on the sound data and identifies the relevant instruction parameters, specifically at least the video resource name.
- in one case, the sound data further includes a video application name, and the display device has the video application corresponding to the video application name installed.
- when parsing the sound data, the speech recognition sub-server also recognizes the video application name. Afterwards, the speech recognition sub-server sends the recognized video application name and video resource name, as well as the other related parameters forming the instruction (such as the operation to be executed, device parameters, language parameters, etc.), to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the video application name, the video resource name, and the other relevant parameters forming the instruction.
- alternatively, the speech recognition sub-server may directly recognize the relevant instructions from the sound data, and send the recognized relevant instructions to the instruction generation sub-server.
- the instruction generation sub-server converts the identified relevant instructions into resource search instructions identifiable by the display device. The specific process of parsing the sound data and generating a resource search instruction according to the parsed data is not limited in this application.
- the instruction generation sub-server feeds back the generated resource search instruction to the display device.
- after receiving the resource search instruction, the display device generates a video search request according to the resource search instruction.
- the display device sends the video search request to the video search sub-server, that is, calls the search interface of the video application corresponding to the video application name, and the video search sub-server searches for video resources in the video application.
- the video search sub-server After searching for the video resource of the video resource name, the video search sub-server feeds back the video resource to the display device.
- when the user inputs sound data such as "search for video X in video application A", the sound data is displayed on the display device.
- the display device sends the sound data to the speech recognition sub-server.
- the speech recognition sub-server identifies from the sound data that the video resource name is Video X, the video application name is Application A, and the other relevant parameters (the operation to be performed is search).
- the speech recognition sub-server sends the identified video resource name Video X, the video application name Application A and other relevant parameters to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the identified video resource name Video X, video application name Application A, and other relevant parameters: search for video X in video application A.
- the instruction generation sub-server feeds back the generated resource search instruction to the display device.
- after receiving the resource search instruction, the display device jumps from the user interface shown in FIG. 17 to the user interface shown in FIG. 18.
- the user interface shown in FIG. 18 is the user interface of the video application A.
- the specific implementation process is as follows: the display device generates a video search request according to the resource search instruction and sends the video search request to the video search sub-server, so that video X is searched for in video application A. That is, the search interface of application A is called on the display device, and video X is searched for in video application A.
- after finding the video resource of video X, the video search sub-server feeds back the video resource of video X to the display device.
- the user interface shown in FIG. 18 displays the searched video resources of video X to the user (video X and other videos related to video X can be displayed).
- when parsing the sound data, the speech recognition sub-server also recognizes the video application name. Afterwards, the speech recognition sub-server sends the recognized video application name and video resource name, together with the other parameters that form the instruction (such as the operation to be executed, device parameters, language parameters, etc.), to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the video application name, the video resource name, and the other parameters that form the instruction.
- the speech recognition sub-server may directly recognize the relevant instructions from the sound data, and send the recognized relevant instructions to the instruction generation sub-server.
- the instruction generation sub-server converts the identified relevant instructions into resource search instructions identifiable by the display device.
- the instruction generation sub-server feeds back the generated resource search instruction to the display device. Since the video application corresponding to the video application name is not installed on the display device at this time, the search interface of the video application cannot be called. Then, the corresponding video search request cannot be sent to the video search sub-server, and the video resource cannot be obtained from the video search sub-server.
- the speech recognition sub-server identifies from the sound data the relevant parameters for generating an instruction, and sends the relevant parameters to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the relevant parameters: search for video X in video application A.
- application A is not installed on the display device, so the search interface of application A cannot be called, and a video search request cannot be sent to the video search sub-server.
- the video search sub-server therefore cannot feed back the video resource of video X.
- the audio data also includes a video application name
- the display device does not have a video application corresponding to the video application name installed
- the corresponding video search request cannot be sent to the video search sub-server, and likewise the video resource cannot be obtained from the video search sub-server.
- the controller generates prompt information and displays the prompt information on the display.
- the prompt information can be: The application does not exist, please search in other applications.
- the speech recognition sub-server can only recognize the video resource name from the audio data.
- the speech recognition sub-server sends the identified video resource name and the other relevant parameters for instruction generation to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the video resource name and other relevant parameters.
- after receiving the resource search instruction, the display device generates a video search request according to the resource search instruction, that is, it calls the search interface of the currently running video application and sends the video search request to the video search sub-server. The video resource corresponding to the video resource name is searched for in the currently running video application, and the video resource obtained by the search is fed back to the display device.
- the user interface shown in FIG. 19 is the home page interface of video application B, including a navigation bar and recommended videos.
- in the user interface shown in FIG. 19, when the user inputs the sound data "search for video X", the sound data may be displayed in the user interface.
- the display device sends the sound data to the speech recognition sub-server.
- the speech recognition sub-server identifies from the sound data the relevant parameters for generating the instruction, and sends the relevant parameters to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the relevant parameters: search for video X, and sends the resource search instruction to the display device.
- after receiving the resource search instruction, the display device jumps from the user interface shown in FIG. 19 to the user interface shown in FIG. 20.
- the specific implementation process is as follows: the display device invokes the search interface of the video application B according to the resource search instruction to generate a video search request.
- the video search request is sent to the video search sub-server, the video resource of the video X is searched in the video application B, and the video resource of the video X obtained by the search is fed back to the display device.
- the video resources related to the video X obtained from the search are displayed on the display device.
- the speech recognition sub-server can only recognize the video resource name from the audio data.
- the instruction generation sub-server generates a resource search instruction according to the data identified from the sound data.
- the display device does not currently have a video application running in the background, and the sound data does not contain a video application name, so the display device calls the search interfaces of all installed video applications, generates a video search request, and sends the video search request to the video search sub-server. Whole-device search is thereby achieved, and finally all the searched video resources corresponding to the video resource name are fed back to the display device.
- the speech recognition sub-server identifies from the sound data the relevant parameters for generating the instruction, and sends the relevant parameters to the instruction generation sub-server.
- the instruction generation sub-server generates a resource search instruction according to the relevant parameters: search for video X.
- the search instruction does not contain a specified video application, and the display device does not currently have a video application running in the background. Then the search interfaces of all installed video applications are called, a video search request is generated, and the video search request is sent to the video search sub-server. The whole device thereby searches for the video resource of video X. Finally, the video resource corresponding to video X is fed back to the display device.
- the instruction generation sub-server includes an ApplicationName parameter in the resource search instruction generated according to the data identified from the sound data. If the speech recognition sub-server recognizes the name of an application from the sound data, it assigns the name of that application to the ApplicationName parameter. After the display device receives the resource search instruction, if the value of the ApplicationName field is not empty and is consistent with the name of an application installed on the display device, that application is opened. At the same time, the name of the video resource to be searched is sent to the search interface of that application, so that the video resource corresponding to the video resource name is searched for within that application.
- after the display device receives the resource search instruction, if the value of the ApplicationName field is not empty and none of the applications installed on the display device matches the value of the ApplicationName field, the application cannot be opened, and the video resource cannot be searched for within that application.
- after the display device receives the resource search instruction, if the value of the ApplicationName field is empty, that is, the name of an application was not recognized from the sound data, and an application is running in the background of the display device, the currently running video application is opened. At the same time, the name of the video resource to be searched is sent to the search interface of the currently running application, so that the video resource corresponding to the video resource name is searched for within the currently running application.
- after the display device receives the resource search instruction, if the value of the ApplicationName field is empty and no application is running in the background of the display device, all video applications installed on the display device are opened. At the same time, the name of the video resource to be searched is sent to the search interfaces of all the video applications, achieving whole-device search.
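For illustration only, the four cases above amount to a dispatch on the ApplicationName field. The following Java sketch shows one possible form of that dispatch on the display device; all type and method names (VideoApp, ResourceSearchInstruction, search, etc.) are assumptions introduced here and are not part of the embodiments.

```java
import java.util.List;

// Minimal sketch of the ApplicationName dispatch described above. All type
// and method names are hypothetical; the embodiments do not prescribe an API.
interface VideoApp {
    String name();
    void open();
    void search(String videoResourceName); // the application's search interface
}

record ResourceSearchInstruction(String applicationName, String videoResourceName) {}

public class SearchDispatcher {
    // backgroundApp is null when no video application is running.
    public static void dispatch(ResourceSearchInstruction ins,
                                List<VideoApp> installedApps,
                                VideoApp backgroundApp) {
        String appName = ins.applicationName();
        if (appName != null && !appName.isEmpty()) {
            for (VideoApp app : installedApps) {
                if (app.name().equals(appName)) {        // case 1: specified and installed
                    app.open();
                    app.search(ins.videoResourceName());
                    return;
                }
            }
            // case 2: specified but not installed: no search is possible; prompt the user
            System.out.println("The application does not exist, please search in other applications.");
        } else if (backgroundApp != null) {              // case 3: unspecified, one app running
            backgroundApp.open();
            backgroundApp.search(ins.videoResourceName());
        } else {                                         // case 4: whole-device search
            for (VideoApp app : installedApps) {
                app.open();
                app.search(ins.videoResourceName());
            }
        }
    }
}
```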
- An embodiment of the present application provides a video search method, such as the signaling diagram of the video search method shown in FIG. 21 , the method includes the following steps:
- Step 1: The display device collects sound data, where the sound data is a voice command input by the user through a user input interface.
- the sound data includes at least a video resource name.
- the display device sends the sound data to the server.
- Step 2: After the server receives the sound data, if the sound data also includes a video application name and a video application corresponding to the video application name is installed on the display device, the video resource corresponding to the video resource name is searched for in the video application corresponding to the video application name.
- Step 3: The server feeds back the video resource corresponding to the video resource name to the display device.
- the audio data further includes a video application name
- the video application corresponding to the video application name is not installed on the display device. Then the video resource is not fed back to the display device.
- the audio data does not contain the video application name, and there is currently a video application running on the display device. Then, a video resource corresponding to the video resource name is searched in the currently running video application, and the video resource is fed back to the display device.
- the whole device searches for the video resource corresponding to the video resource name, and feeds back the video resource to the display device.
- the embodiments of the present application provide another video search method, as shown in the signaling diagram of the video search method in FIG. 22; the method includes the following steps:
- Step 1: The display device collects sound data, where the sound data is a voice command input by the user through a user input interface.
- the sound data includes at least a video resource name.
- the display device sends the sound data to the speech recognition sub-server.
- Step 2: The speech recognition sub-server identifies from the sound data the parameters for generating the instruction, including at least the video resource name, and sends those parameters to the instruction generation sub-server.
- Step 3: The instruction generation sub-server generates a resource search instruction according to the parameters for generating the instruction, and feeds back the resource search instruction to the display device.
- Step 4: The display device receives the resource search instruction. If the sound data also includes a video application name and the video application corresponding to the video application name is installed on the display device, the display device generates a video search request according to the resource search instruction (calling the search interface of the video application corresponding to the video application name) and sends the video search request to the video search sub-server.
- Step 5: After receiving the video search request, the video search sub-server searches for the video resource corresponding to the video resource name in the video application corresponding to the video application name, and feeds back the video resource corresponding to the video resource name to the display device.
- the search interface of the video application cannot be called, and the video search request cannot be generated according to the resource search instruction.
- the search interface of the currently running video application is invoked, and a video search request is sent to the video search sub-server.
- after receiving the video search request, the video search sub-server searches for the video resource corresponding to the video resource name in the currently running video application, and feeds back the video resource to the display device.
- the whole device searches for a video resource corresponding to the video resource name.
- the search interfaces of all video applications installed on the display device are called, and a video search request is sent to the video search sub-server.
- the video search sub-server searches all video applications for video resources corresponding to the video resource names, and feeds back all the searched video resources corresponding to the video resource names to the display device.
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The present application provides a method for switching a voice scheme on a display device, a display device, and a control device. After receiving a voice control command, the display device switches the display page of the display to the voice scheme page corresponding to the action data, and displays the target content corresponding to the voice data on that page.
Description
This application claims priority to the Chinese patent application No. 202110124749.7, filed on January 29, 2021 and entitled "Video Search Method, Display Device and Server", the entire contents of which are incorporated herein by reference; this application also claims priority to the Chinese patent application No. 202110156337.1, filed on February 4, 2021 and entitled "Method for Switching Voice Scheme on Display Device, Display Device and Control Device", the entire contents of which are incorporated herein by reference.
The present application relates to the field of display technology, and in particular, to a method for switching a voice scheme on a display device, a display device, and a control device.
As display devices become more intelligent, some display devices can be equipped with an intelligent voice function, and users can conveniently control the display device by voice input.
Summary of the Invention
Embodiments of the present application provide a method for switching a voice scheme on a display device, a display device, and a control device.
In a first aspect, embodiments of the present application provide a display device, including:
a display;
a controller configured to:
receive a voice control command sent by a control device, where the voice control command includes action data of the user operating the control device, for switching the voice scheme of the display device, and voice data input by the user, for searching for target content on the display device;
in response to the voice control command, switch the display page of the display to the target voice scheme page corresponding to the action data, and display the target content corresponding to the voice data on the target voice scheme page.
In a second aspect, embodiments of the present application further provide a control device, including:
a controller configured to:
while receiving voice data input by the user, detect the action of the user operating the control device and generate action data;
send a voice control command, generated by packaging the voice data and the action data, to a display device.
In a third aspect, embodiments of the present application further provide a method for switching a voice scheme on a display device, including:
receiving a voice control command sent by a control device, where the voice control command includes action data of the user operating the control device, for switching the voice scheme of the display device, and voice data input by the user, for searching for target content on the display device;
in response to the voice control command, switching the display page of the display to the target voice scheme page corresponding to the action data, and displaying the target content corresponding to the voice data on the target voice scheme page.
In a fourth aspect, embodiments of the present application further provide another method for switching a voice scheme on a display device, including:
while receiving voice data input by the user, detecting the action of the user operating the control device and generating action data;
sending a voice control command, generated by packaging the voice data and the action data, to the display device.
The drawings needed in the embodiments are briefly introduced below. Obviously, for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of a usage scenario of a display device according to some embodiments;
FIG. 2 is a block diagram of the hardware configuration of the control device 100 according to some embodiments;
FIG. 3 is a block diagram of the hardware configuration of the display device 200 according to some embodiments;
FIG. 4 is a software configuration diagram of the display device 200 according to some embodiments;
FIG. 5 is a first interaction diagram between the controller 250 and the controller 110 according to some embodiments;
FIG. 6 is a second interaction diagram between the controller 110 and the controller 250 according to some embodiments;
FIG. 7 is a first schematic diagram of a voice scheme page according to some embodiments;
FIG. 8 is a second schematic diagram of a voice scheme page according to some embodiments;
FIG. 9 is a first schematic diagram of a user operating the control device 100 according to some embodiments;
FIG. 10 is a third interaction diagram between the controller 110 and the controller 250 according to some embodiments;
FIG. 11 is a second schematic diagram of a user operating the control device 100 according to some embodiments;
FIG. 12 is a fourth interaction diagram between the controller 110 and the controller 250 according to some embodiments;
FIG. 13 is a third schematic diagram of a user operating the control device 100 according to some embodiments;
FIG. 14 is a flowchart of a method for switching a voice scheme on a display device according to some embodiments;
FIG. 15 is another flowchart of a method for switching a voice scheme on a display device according to some embodiments;
FIG. 16 is a framework diagram of a video search system according to some embodiments;
FIG. 17 is a schematic diagram of a user interface in the display device 200 according to some embodiments;
FIG. 18 is a schematic diagram of another user interface in the display device 200 according to some embodiments;
FIG. 19 is a schematic diagram of yet another user interface in the display device 200 according to some embodiments;
FIG. 20 is a schematic diagram of yet another user interface in the display device 200 according to some embodiments;
FIG. 21 is a signaling diagram of a video search method according to some embodiments;
FIG. 22 is a signaling diagram of yet another video search method according to some embodiments.
To make the purpose and embodiments of the present application clearer, the exemplary embodiments of the present application will be described clearly and completely below with reference to the drawings in the exemplary embodiments. Obviously, the described exemplary embodiments are only a part of the embodiments of the present application, not all of them.
It should be noted that the brief descriptions of terms in the present application are only for the convenience of understanding the embodiments described below, and are not intended to limit the embodiments of the present application. Unless otherwise stated, these terms should be understood according to their ordinary and usual meanings.
The terms "first", "second", "third", etc. in the specification, claims, and drawings of the present application are used to distinguish similar or like objects or entities, and do not necessarily imply a specific order or sequence, unless otherwise noted. It should be understood that the terms so used are interchangeable where appropriate.
The terms "include" and "have" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a product or device that includes a series of components is not necessarily limited to all the components explicitly listed, but may include other components not explicitly listed or inherent to such products or devices.
The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code capable of performing the function associated with that element.
FIG. 1 is a schematic diagram of a usage scenario of a display device according to some embodiments. As shown in FIG. 1, the display device 200 also communicates data with a server 400, and a user can operate the display device 200 through a smart device 300 or the control device 100.
In some embodiments, the control device 100 may be a remote control. The communication between the remote control and the display device includes at least one of infrared protocol communication, Bluetooth protocol communication, and other short-range communication methods, and the display device 200 is controlled wirelessly or by wire. The user can control the display device 200 by inputting user instructions through at least one of keys on the remote control, voice input, control panel input, etc.
In some embodiments, the smart device 300 may include any of a mobile terminal, a tablet computer, a computer, a laptop, an AR/VR device, etc.
In some embodiments, the smart device 300 can also be used to control the display device 200. For example, the display device 200 is controlled using an application running on the smart device.
In some embodiments, the display device may receive the user's control not through the above smart device or control device, but through touch or gestures, etc.
In some embodiments, the smart device 300 can also be used for data communication with the display device.
In some embodiments, the display device 200 can also be controlled in ways other than the control device 100 and the smart device 300. For example, the user's voice command can be received directly through a module for acquiring voice commands configured inside the display device 200, or through a voice control device provided outside the display device 200.
In some embodiments, the display device 200 also communicates data with the server 400. The display device 200 may be allowed to communicate via a local area network (LAN), a wireless local area network (WLAN), and other networks. The server 400 may provide various content and interactions to the display device 200. The server 400 may be one cluster or multiple clusters, and may include one or more types of servers.
In some embodiments, software steps executed by one step-execution body can be migrated, as needed, to another step-execution body in data communication with it for execution. For example, software steps executed by the server can be migrated, as needed, to a display device in data communication with it for execution, and vice versa.
FIG. 2 is a block diagram of the hardware configuration of the control device 100 according to some embodiments. As shown in FIG. 2, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control device 100 can receive input operation instructions from the user and convert the operation instructions into instructions that the display device 200 can recognize and respond to, serving as an interaction intermediary between the user and the display device 200.
In some embodiments, the communication interface 130 is used for communication with the outside, and includes at least one of a WIFI chip, a Bluetooth module, NFC, or an alternative module.
In some embodiments, the user input/output interface 140 includes at least one of a microphone, a touchpad, a sensor, a key, or an alternative module.
FIG. 3 is a block diagram of the hardware configuration of the display device 200 according to some embodiments.
In some embodiments, the display device 200 includes at least one of a tuner-demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, and a user interface.
In some embodiments, the controller includes a central processing unit, a video processor, an audio processor, a graphics processor, RAM, ROM, and first through n-th interfaces for input/output.
In some embodiments, the display 260 includes a display screen component for presenting images and a driving component for driving image display, and is used for receiving image signals output from the controller and displaying video content, image content, menu control interfaces, user control UI interfaces, and the like.
In some embodiments, the display 260 may be at least one of a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
In some embodiments, the tuner-demodulator 210 receives broadcast television signals by wired or wireless reception, and demodulates audio/video signals, as well as EPG data signals, from multiple wireless or wired broadcast television signals.
In some embodiments, the communicator 220 is a component for communicating with external devices or servers according to various communication protocol types. For example, the communicator may include at least one of a Wifi module, a Bluetooth module, a wired Ethernet module and other network communication protocol chips or near-field communication protocol chips, and an infrared receiver. The display device 200 can establish the sending and receiving of control signals and data signals with the control device 100 or the server 400 through the communicator 220.
In some embodiments, the detector 230 is used to collect signals from the external environment or from interaction with the outside. For example, the detector 230 includes a light receiver, a sensor for collecting the intensity of ambient light; or the detector 230 includes an image collector, such as a camera, which can be used to collect external environment scenes, user attributes, or user interaction gestures; or the detector 230 includes a sound collector, such as a microphone, for receiving external sound.
In some embodiments, the external device interface 240 may include, but is not limited to, any one or more of the following: a high-definition multimedia interface (HDMI), an analog or data high-definition component input interface (Component), a composite video input interface (CVBS), a USB input interface (USB), an RGB port, etc. It may also be a composite input/output interface formed by multiple of the above interfaces.
In some embodiments, the controller 250 and the tuner-demodulator 210 may be located in different separate devices, that is, the tuner-demodulator 210 may also be in an external device of the main device where the controller 250 is located, such as an external set-top box.
In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in the memory. The controller 250 controls the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object displayed on the display 260, the controller 250 can perform the operation related to the object selected by the user command.
In some embodiments, the object may be any one of selectable objects, such as a hyperlink, an icon, or another operable control. Operations related to the selected object include: displaying an operation connected to a hyperlinked page, document, image, etc., or executing the program corresponding to the icon.
In some embodiments, the controller includes at least one of a central processing unit (CPU), a video processor, an audio processor, a graphics processing unit (GPU), RAM (Random Access Memory), ROM (Read-Only Memory), first through n-th interfaces for input/output, and a communication bus (Bus).
The CPU processor is used to execute operating system and application program instructions stored in the memory, and to execute various application programs, data, and content according to various interaction instructions received from external input, so as to finally display and play various audio and video content. The CPU processor may include multiple processors, for example, one main processor and one or more sub-processors.
In some embodiments, the graphics processor is used to generate various graphic objects, such as at least one of icons, operation menus, and graphics displayed for user input instructions. The graphics processor includes an arithmetic unit, which performs operations by receiving various interaction instructions input by the user and displays various objects according to display attributes; it also includes a renderer, which renders the various objects obtained by the arithmetic unit, the rendered objects being used for display on the display.
In some embodiments, the video processor is used to receive external video signals and perform at least one of decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, image synthesis, and other video processing according to the standard codec protocol of the input signal, so as to obtain a signal that can be directly displayed or played on the display device 200.
In some embodiments, the video processor includes at least one of a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, etc. The demultiplexing module is used for demultiplexing the input audio/video data stream. The video decoding module is used for processing the demultiplexed video signal, including decoding and scaling. The image synthesis module, such as an image synthesizer, is used to superimpose and mix the GUI signal generated by the graphics generator, according to user input or by itself, with the scaled video image, to generate an image signal for display. The frame rate conversion module is used to convert the frame rate of the input video. The display formatting module is used to change the received video output signal after frame rate conversion into a signal conforming to the display format, such as an output RGB data signal.
In some embodiments, the audio processor is used to receive external audio signals and, according to the standard codec protocol of the input signal, perform at least one of decompression and decoding, as well as noise reduction, digital-to-analog conversion, amplification, and other processing, to obtain a sound signal that can be played in a speaker.
In some embodiments, the user may input user commands on a graphical user interface (GUI) displayed on the display 260, and the user input interface receives the user input commands through the graphical user interface (GUI). Alternatively, the user may input user commands by inputting specific sounds or gestures, and the user input interface receives the user input commands by recognizing the sounds or gestures through a sensor.
In some embodiments, a "user interface" is a medium interface for interaction and information exchange between an application or operating system and a user; it implements the conversion between the internal form of information and the form acceptable to the user.
In some embodiments, the user interface 280 is an interface that can be used to receive control input (such as physical keys on the body of the display device, or the like).
In some embodiments, as shown in FIG. 4, the system of the display device may include a kernel, a command parser (shell), a file system, and applications. The kernel, shell, and file system together form the basic operating system structure, which allows users to manage files, run programs, and use the system. After power-on, the kernel starts, activates the kernel space, abstracts the hardware, initializes hardware parameters, etc., and runs and maintains virtual memory, the scheduler, signals, and inter-process communication (IPC). After the kernel starts, the shell and user applications are loaded. An application is compiled into machine code after startup, forming a process.
As shown in FIG. 4, the system of the display device is divided into three layers, from top to bottom: the application layer, the middleware layer, and the hardware layer.
The application layer mainly contains common applications on the television, as well as the application framework. The common applications are mainly applications developed based on a browser, such as HTML5 APPs, and native applications (Native APPs);
The application framework is a complete program model with all the basic functions required by standard application software, such as file access and data exchange, as well as the interfaces for using these functions (toolbars, status bars, menus, dialog boxes).
Native APPs can support online or offline operation, message push, or local resource access.
The middleware layer includes middleware such as various television protocols, multimedia protocols, and system components. Middleware can use the basic services (functions) provided by system software to connect various parts of application systems or different applications on the network, achieving the purpose of resource sharing and function sharing.
The hardware layer mainly includes the HAL interface, hardware, and drivers. The HAL interface is the unified interface for all television chips, and the specific logic is implemented by each chip. The drivers mainly include: audio drivers, display drivers, Bluetooth drivers, camera drivers, WIFI drivers, USB drivers, HDMI drivers, sensor drivers (such as fingerprint sensors, temperature sensors, pressure sensors, etc.), and power drivers.
Part One:
As display devices 200 become more intelligent, some display devices 200 can be equipped with an intelligent voice function, and users can conveniently control the display device 200 by voice input. At present, more and more voice schemes can also be applied to display devices 200, so a single display device 200 may have multiple voice schemes, such as the currently most popular Amazon Alexa voice scheme and Google voice scheme, as well as other voice schemes specific to certain countries.
Current voice schemes in the display device 200 mainly collect the voice content input by the user through sound-collecting devices such as remote controls. When two or more voice schemes exist on the display device 200, there is the problem of switching between different voice schemes. For example, the user selected voice scheme A for a voice content search last time, but for the current operation selects voice scheme B for the voice content search.
At present, there are usually two methods to switch between different voice schemes: one is that the user interacts with the UI menu of the display device 200 and selects a voice scheme on the UI menu through the remote control; the other is that the user controls the display device 200 to use different voice schemes by pressing different keys or voice keys on the remote control. For example, a certain display device 200 uses the first voice scheme by default with a short press of certain remote control keys or the voice key, and uses the second voice scheme with a long press. Of these two switching methods, the former increases the user's interaction complexity and thus affects the user experience, while the latter increases the redundancy of keys with the same function on the remote control, increasing hardware cost or the degree of key multiplexing. Either way, the difficulty of the user's operation when switching voice schemes is increased.
Based on the above, embodiments of the present application provide a method for switching a voice scheme on a display device, a display device 200, and a control device 100, in which different voice schemes can be associated with actions of the user operating the control device 100. The user can operate the control device 100 according to his or her own preference to produce the corresponding action, and thereby control the display device 200 to use the voice scheme the user wants to select. This approach can reduce the use of keys on the control device 100 and make switching voice schemes less difficult.
As described in the foregoing embodiments, the display device 200 in the embodiments of the present application has a controller 250. The controller 250 can receive voice control commands and the like input by the user to the display device 200 through the control device 100, and according to the voice control command, control the display 260 to switch the display page to the page of the voice scheme the user wants to see.
In addition, the control device 100 also has a controller 110. The controller 110 can receive voice data input by the user, and use or control sensors and the like to detect the action data of the user operating the control device 100.
The data transmission and processing between the display device 200 and the control device 100 are both performed by their respective controllers.
FIG. 5 is a first interaction diagram between the controller 250 and the controller 110 according to some embodiments.
As shown in FIG. 5, the user can input, through the control device 100, the voice data for what he or she wants to see on the display device 200, such as "how is the weather". In the embodiments of the present application, different voice schemes in the display device 200 can be associated in advance with different actions of the control device 100. When the user wants to select a preferred voice scheme on the display device 200 to display the content he or she wants to watch, the user can, while inputting voice data to the control device 100, operate the control device 100 to make the action corresponding to the target voice scheme. Then, the controller 110 packages the detected action data together with the voice data to generate a voice control command, and sends it to the display device 200. After receiving the voice control command, the controller 250 of the display device 200 finds the target voice scheme corresponding to the action data and the target content corresponding to the voice data. Then, the controller 250 switches the display page of the display 260 to the target voice scheme page corresponding to the action data, and displays the target content corresponding to the voice data on the target voice scheme page.
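For illustration only, a minimal Java sketch of the data packaged by the controller 110 is given below; the type names and fields (VoiceControlCommand, ActionData, ActionType) are assumptions introduced here, and the embodiments do not prescribe any particular wire format.

```java
// Illustrative sketch of packaging voice data and action data into a single
// voice control command on the control device (controller 110).
enum ActionType { DIRECTION, TRAJECTORY, GESTURE }

class ActionData {
    final ActionType type;
    final String value; // e.g. "LEFT", "CIRCLE", "Z"
    ActionData(ActionType type, String value) { this.type = type; this.value = value; }
}

class VoiceControlCommand {
    final byte[] voiceData;   // audio captured while the voice key is held
    final ActionData action;  // action detected during the same interval
    VoiceControlCommand(byte[] voiceData, ActionData action) {
        this.voiceData = voiceData;
        this.action = action;
    }
}

public class ControlDeviceSketch {
    public static void main(String[] args) {
        byte[] voice = new byte[]{ /* ... recorded samples ... */ };
        ActionData action = new ActionData(ActionType.DIRECTION, "LEFT");
        VoiceControlCommand cmd = new VoiceControlCommand(voice, action);
        // A real control device would serialize cmd over Bluetooth or infrared
        // to the display device (not shown here).
        System.out.println("Packed " + cmd.voiceData.length + " audio bytes with action "
                + cmd.action.type + "/" + cmd.action.value);
    }
}
```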
The actions of the user operating the control device 100 in the embodiments of the present application may include, but are not limited to: turning the direction in which the control device 100 points, moving the control device 100 by hand, or drawing a gesture on the control device 100. Since the display device 200 can have multiple voice schemes at the same time, different voice schemes can correspond to different directions, different movement trajectories, or different gestures.
When using the control device 100 and the display device 200 provided in the embodiments of the present application, the user can control the display device 200 to switch to the corresponding target voice scheme page for display simply by operating the control device 100 to make a certain action. This approach avoids installing too many voice control keys on the control device 100 and avoids frequent interaction between the user and the UI menu of the display device 200, making the user's voice scheme switching operation more convenient and concise.
The control device 100 in the embodiments of the present application may be a remote control connected to the display device 200 via Bluetooth, or a smart terminal installed with a virtual remote control, etc. When the control device 100 is a remote control, the user can operate the remote control to turn in different directions, or move the remote control along a certain trajectory; when the control device 100 is a smart terminal installed with a virtual remote control, the user can not only operate the smart terminal to turn in different directions and move it along a certain trajectory, but can also draw different gestures on the display screen of the smart terminal, etc.
In addition, control devices 100 such as remote controls and smart terminals are all equipped with sensors such as gravity sensors or gyroscopes, so that they can detect direction data and trajectory data.
FIG. 6 is a second interaction diagram between the controller 110 and the controller 250 according to some embodiments.
In some embodiments, when the control device 100 is a remote control, as shown in FIG. 6, the controller 110 can, while receiving the voice data input by the user, use sensors to detect the direction in which the user operates the control device 100 (i.e., the remote control) and generate direction data, such as left or right. The controller 110 then packages the direction data and the voice data to generate a voice control command and sends it to the display device 200.
After receiving the voice control command, the controller 250 of the display device 200 parses it to obtain the voice data and the direction data. The controller 250 then obtains the target voice scheme page corresponding to the direction data and switches the display page of the display 260 to that target voice scheme page; at the same time, the controller 250 also obtains the target content corresponding to the voice data, and while the display 260 displays the target voice scheme page, displays the target content on the target voice scheme page.
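For illustration only, this parse-and-switch step on the display device side can be sketched as follows; the direction-to-scheme mapping and all names (VoiceSchemeSwitcher, handle, etc.) are assumptions introduced here and not the actual implementation of the embodiments.

```java
import java.util.Map;

// Hedged sketch of the display-device side (controller 250): look up the
// voice scheme page for the direction data and show the target content on it.
public class VoiceSchemeSwitcher {
    // Example association configured in advance: left -> scheme A, right -> scheme B.
    private static final Map<String, String> SCHEME_BY_DIRECTION =
            Map.of("LEFT", "VoiceSchemeA", "RIGHT", "VoiceSchemeB");

    public static void handle(String direction, String voiceQuery) {
        String scheme = SCHEME_BY_DIRECTION.get(direction);
        if (scheme == null) {
            return; // unknown action: keep the current page
        }
        String targetContent = searchTargetContent(voiceQuery); // e.g. current weather
        System.out.println("Switching display to " + scheme
                + " page and showing: " + targetContent);
    }

    private static String searchTargetContent(String query) {
        return "result for \"" + query + "\""; // placeholder for the real search
    }

    public static void main(String[] args) {
        handle("LEFT", "how is the weather"); // switches to voice scheme A's page
    }
}
```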
FIG. 7 is a first schematic diagram of a voice scheme page according to some embodiments. FIG. 8 is a second schematic diagram of a voice scheme page according to some embodiments. FIG. 9 is a first schematic diagram of a user operating the control device 100 according to some embodiments.
Take the user operating the remote control to point left and right respectively as an example; the specific operation is shown in FIG. 9. Here, the direction data of the remote control pointing left is set in advance to correspond to voice scheme A, and the direction data of the remote control pointing right corresponds to voice scheme B. While speaking the voice content (for example, "how is the weather") into the remote control, the user can point the remote control to the left, and the controller 110 sends the received voice data and direction data to the display device 200. After parsing the voice control command, the controller 250 of the display device 200 obtains voice scheme A corresponding to the left direction data, and at the same time obtains the target content corresponding to the voice data (for example, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of voice scheme A, as shown in FIG. 7, and displays the current weather information on the right side of the display page.
Alternatively, while speaking the voice content (for example, "how is the weather") into the remote control, the user can point the remote control to the right, and the controller 110 sends the received voice data and direction data to the display device 200. After parsing the voice control command, the controller 250 of the display device 200 obtains voice scheme B corresponding to the right direction data, and at the same time obtains the target content corresponding to the voice data (for example, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of voice scheme B, as shown in FIG. 8, and displays the current weather information at the bottom of the display page.
In addition, the display positions of the target content shown in FIG. 7 and FIG. 8 are not the only options; in some other embodiments, the target content may also be displayed at any position on the target voice scheme page according to the user's preference or actual needs.
FIG. 10 is a third interaction diagram between the controller 110 and the controller 250 according to some embodiments.
In some embodiments, when the control device 100 is a remote control, as shown in FIG. 10, the controller 110 can also, while receiving the voice data input by the user, use sensors to detect the movement trajectory of the user operating the control device 100 (i.e., the remote control) and generate trajectory data, such as a circle, a square, a triangle, etc. The controller 110 then packages the trajectory data and the voice data to generate a voice control command and sends it to the display device 200.
After receiving the voice control command, the controller 250 of the display device 200 parses it to obtain the voice data and the trajectory data. The controller 250 then obtains the target voice scheme page corresponding to the trajectory data and switches the display page of the display 260 to that target voice scheme page; at the same time, the controller 250 also obtains the target content corresponding to the voice data, and while the display 260 displays the target voice scheme page, displays the target content on the target voice scheme page.
FIG. 11 is a second schematic diagram of a user operating the control device 100 according to some embodiments.
For example, the circular trajectory data of the remote control's movement corresponds to voice scheme A, and the triangular trajectory data corresponds to voice scheme B. While speaking the voice content (for example, "how is the weather") into the remote control, the user moves the remote control to draw a circular trajectory, as shown in FIG. 11; correspondingly, the page of voice scheme A can be displayed on the display device 200, with the current weather information displayed on the page, as shown in FIG. 7. Alternatively, while speaking the voice content (for example, "how is the weather") into the remote control, the user moves the remote control to draw a triangular trajectory; correspondingly, the page of voice scheme B can be displayed on the display device 200, with the current weather information displayed on the page, as shown in FIG. 8.
In some embodiments, after a virtual remote control is installed on a smart terminal, the terminal can also be used as the control device 100, for example, a virtual remote control installed on a smartphone. Since the smart terminal itself has a gravity sensor or a gyroscope, the smart terminal can implement the same functions as the above physical remote control: it can detect both its own direction data and its own movement trajectory data.
At the same time, since the smart terminal itself also has a display screen, the smart terminal can also detect the gesture drawn by the user on the display screen and generate gesture data.
FIG. 12 is a fourth interaction diagram between the controller 110 and the controller 250 according to some embodiments.
In some embodiments, when the control device 100 is the above smart terminal, as shown in FIG. 12, the controller 110 can also, while receiving the voice data input by the user, detect the gesture input by the user on the display screen of the control device 100 (i.e., the smart terminal) and generate gesture data, such as a "Z"-shaped gesture, an "O"-shaped gesture, or an "L"-shaped gesture drawn with a finger. The controller 110 then packages the gesture data and the voice data to generate a voice control command and sends it to the display device 200.
After receiving the voice control command, the controller 250 of the display device 200 parses it to obtain the voice data and the gesture data. The controller 250 then obtains the target voice scheme page corresponding to the gesture data and switches the display page of the display 260 to that target voice scheme page; at the same time, the controller 250 also obtains the target content corresponding to the voice data, and while the display 260 displays the target voice scheme page, displays the target content on the target voice scheme page.
FIG. 13 is a third schematic diagram of a user operating the control device 100 according to some embodiments.
Take the user drawing a "Z"-shaped gesture and an "L"-shaped gesture on the smart terminal display screen as an example, where the "Z"-shaped gesture data is set in advance to correspond to voice scheme A, and the "L"-shaped gesture data corresponds to voice scheme B. While speaking the voice content (for example, "how is the weather") into the smart terminal, the user can draw a "Z"-shaped gesture on the display screen, as shown in FIG. 13. The controller 110 sends the received voice data and gesture data to the display device 200. After parsing the voice control command, the controller 250 of the display device 200 obtains voice scheme A corresponding to the "Z"-shaped gesture data, and at the same time obtains the target content corresponding to the voice data (for example, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of voice scheme A, as shown in FIG. 7, and displays the current weather information on the right side of the display page.
Alternatively, while speaking the voice content (for example, "how is the weather") into the smart terminal, the user can draw an "L"-shaped gesture on the display screen, and the controller 110 sends the received voice data and gesture data to the display device 200. After parsing the voice control command, the controller 250 of the display device 200 obtains voice scheme B corresponding to the "L"-shaped gesture data, and at the same time obtains the target content corresponding to the voice data (for example, the weather data at the current moment). Then, the controller 250 switches the display page of the display 260 to the display page of voice scheme B, as shown in FIG. 8, and displays the current weather information on the right side of the display page.
In the embodiments of the present application, both the remote control and the virtual remote control are provided with a voice key; the user presses the key to speak the voice content, and releases the key after the voice input is completed. In addition, to ensure the accuracy and timeliness of the action data, the controller 110 needs to start detecting the user's operation of the remote control at the same time as the user presses the voice key, so as to ensure that the action data is collected in synchronization with the voice data.
In some embodiments, the action time may also be longer than the voice input time, that is, the user has finished speaking the voice content but has not finished the operation action. In this case, the user can keep the voice key pressed until the operation action is completed.
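For illustration only, the timing relationship between the voice key and action detection described above can be sketched as a small state holder; the callback names (onVoiceKeyDown, etc.) are assumptions, and no particular input framework is implied.

```java
// Sketch of the voice-key timing described above: action detection starts
// when the key goes down, voice capture ends when the user finishes speaking,
// and the user may keep holding the key until the action is finished.
public class VoiceKeySession {
    private boolean capturingVoice;
    private boolean detectingAction;

    public void onVoiceKeyDown() {
        capturingVoice = true;   // start recording the microphone
        detectingAction = true;  // start sampling motion/gesture sensors at the same time
    }

    public void onVoiceInputFinished() {
        capturingVoice = false;  // the user has finished speaking
        // Action detection continues while the key is still held.
    }

    public void onVoiceKeyUp() {
        capturingVoice = false;
        detectingAction = false; // the action is complete; package and send the command
    }
}
```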
It is worth noting that the remote control and the smart terminal described in the foregoing embodiments may each control the display device 200 separately, or may control the display device 200 jointly. Moreover, when using the remote control, the display device 200 can be controlled using direction data or trajectory data separately, or using the direction data and the trajectory data together. Likewise, when using the smart terminal, the display device 200 can be controlled using direction data, trajectory data, or gesture data separately, or using any two or all three of them together.
It can be seen from the above that the control device 100 provided in the embodiments of the present application detects the action of the user operating the control device 100 and generates action data while receiving the voice data input by the user, and sends the voice control command generated by packaging the voice data and the action data to the display device 200. The control device 100 does not need to be provided with too many redundant keys: with just one voice key to receive the voice content input by the user, plus detection of the user's operation action, the page of the corresponding voice scheme and the target content can be switched to on the display device 200, reducing the difficulty of switching voice schemes.
In addition, after receiving the voice control command, the display device 200 provided in the embodiments of the present application switches the display page of the display 260 to the voice scheme page corresponding to the action data, and displays the target content corresponding to the voice data on the voice scheme page. The display device 200 also saves the user from manually selecting the voice scheme through the UI menu, while also reducing the difficulty of switching voice schemes.
In the above interaction scheme between the control device 100 and the display device 200, different voice schemes are associated with the actions of the user operating the control device 100. The user can operate the control device 100 according to his or her own preference to produce the corresponding action, and thereby control the display device 200 to use the voice scheme the user wants to select. This reduces the number of keys on the control device 100 and avoids using the UI menu of the display device 200, making voice scheme switching less difficult.
FIG. 14 is a flowchart of a method for switching a voice scheme on a display device according to some embodiments.
An embodiment of the present application provides a method for switching a voice scheme that can be applied to the display device 200 of the foregoing embodiments. The method is executed by the controller 250, which can implement the control function, and other control components. As shown in FIG. 14, the method may specifically include the following steps:
Step S101: receive the voice control command sent by the control device 100.
The voice control command includes action data of the user operating the control device 100, for switching the voice scheme of the display device 200, and voice data input by the user, for searching for target content on the display device 200.
Step S102: in response to the voice control command, switch the display page of the display 260 to the target voice scheme page corresponding to the action data, and display the target content corresponding to the voice data on the target voice scheme page.
In some embodiments, the method further includes: parsing the voice control command to obtain voice data and direction data, where the direction data is generated according to the direction in which the user operates the control device 100; obtaining the target content corresponding to the voice data; and while switching the display page of the display 260 to the target voice scheme page corresponding to the direction data, displaying the target content on the target voice scheme page.
In some embodiments, the method further includes: parsing the voice control command to obtain voice data and trajectory data, where the trajectory data is generated according to the trajectory along which the user operates the control device 100; obtaining the target content corresponding to the voice data; and while switching the display page of the display 260 to the target voice scheme page corresponding to the trajectory data, displaying the target content on the target voice scheme page.
In some embodiments, the method further includes: parsing the voice control command to obtain voice data and gesture data, where the gesture data is generated according to the gesture input by the user on the control device 100; obtaining the target content corresponding to the voice data; and while switching the display page of the display 260 to the target voice scheme page corresponding to the gesture data, displaying the target content on the target voice scheme page.
FIG. 15 is another flowchart of a method for switching a voice scheme on a display device according to some embodiments.
An embodiment of the present application further provides a method for switching a voice scheme that can be applied to the control device 100 of the foregoing embodiments. The method is executed by the controller 110, which can implement the control function, and other control components. As shown in FIG. 15, the method may specifically include the following steps:
Step S201: while receiving the voice data input by the user, detect the action of the user operating the control device 100 and generate action data.
Step S202: send the voice control command generated by packaging the voice data and the action data to the display device 200.
In some embodiments, the method further includes: while receiving the voice data input by the user, using sensors to detect the direction in which the user operates the control device 100 and generating direction data, where different direction data correspond to different voice schemes in the display device 200.
In some embodiments, the method further includes: while receiving the voice data input by the user, using sensors to detect the trajectory along which the user operates the control device 100 and generating trajectory data, where different trajectory data correspond to different voice schemes in the display device 200.
In some embodiments, the method further includes: while receiving the voice data input by the user, detecting the gesture input by the user on the control device 100 and generating gesture data, where different gesture data correspond to different voice schemes in the display device 200.
Since the two methods for switching a voice scheme on a display device in the embodiments of the present application can be applied respectively to the controller 250 and the controller 110 as described in the foregoing embodiments, for other details of the two methods, reference may be made to the foregoing content about the controller 250 and controller 110 embodiments, which will not be repeated here.
Part Two:
An embodiment of the present application provides a server, and the server is configured to:
receive sound data collected by a display device, where the sound data includes at least a video resource name;
when the sound data further includes a video application name and a video application corresponding to the video application name is installed on the display device, search for the video resource corresponding to the video resource name in the video application corresponding to the video application name, and feed back the video resource to the display device;
when the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, feed back no video resource to the display device.
In some embodiments of the present application, the server is further configured to:
when the sound data does not include a video application name and a video application is running on the display device, search for the video resource corresponding to the video resource name in the currently running video application, and feed back the video resource to the display device.
In some embodiments of the present application, the server is further configured to:
when the sound data does not include a video application name and no video application is running on the display device, search for the video resource corresponding to the video resource name in all video applications installed on the display device, and feed back the video resource to the display device.
In some embodiments of the present application, the server includes:
a speech recognition sub-server, configured to receive the sound data collected by the display device, identify at least the video resource name from the sound data, and send the data identified from the sound data to an instruction generation sub-server;
the instruction generation sub-server, configured to generate a resource search instruction according to the data identified from the sound data, and send the resource search instruction to the display device;
a video search sub-server, configured to, when the data identified from the sound data further includes a video application name and the video application corresponding to the video application name is installed on the display device, receive a video search request sent by the display device, search, according to the video search request, for the video resource corresponding to the video resource name in the video application corresponding to the video application name, and feed back the video resource to the display device, where the video search request is generated according to the resource search instruction;
the video search sub-server is further configured to, when the data identified from the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, receive no video search request from the display device.
In some embodiments of the present application, the video search sub-server is further configured to, when the data identified from the sound data does not include a video application name and a video application is running on the display device, receive the video search request sent by the display device, search, according to the video search request, for the video resource corresponding to the video resource name in the currently running video application, and feed back the video resource to the display device.
In some embodiments of the present application, the video search sub-server is further configured to, when the data identified from the sound data does not include a video application name and no video application is running on the display device, receive the video search request sent by the display device, search, according to the video search request, for the video resource corresponding to the video resource name in all video applications installed on the display device, and feed back the video resource to the display device.
Some embodiments of the present application further provide a display device, including:
a display;
a sound collector, configured to collect the user's sound data;
a controller, configured to send the sound data to a server, where the sound data includes at least a video resource name;
when the sound data further includes a video application name and a video application corresponding to the video application name is installed on the display device, receive, from the server, the video resource corresponding to the video resource name, where the video resource is searched for in the video application corresponding to the video application name;
when the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, receive no video resource from the server.
In some embodiments of the present application, when the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, the controller is further configured to: display prompt information on the display, where the prompt information is used to prompt the user that the video application corresponding to the video application name is not installed on the display device.
In some embodiments of the present application, the controller is further configured to:
when the sound data does not include a video application name and a video application is running on the display device, receive, from the server, the video resource corresponding to the video resource name, where the video resource is searched for in the currently running video application;
when the sound data does not include a video application name and no video application is running on the display device, receive, from the server, the video resource corresponding to the video resource name, where the video resource is searched for in all video applications installed on the display device.
Some embodiments of the present application further provide a video search method, applied to a display device, including:
sending the collected sound data to a server, where the sound data includes at least a video resource name;
when the sound data further includes a video application name and a video application corresponding to the video application name is installed on the display device, receiving, from the server, the video resource corresponding to the video resource name, where the video resource is searched for in the video application corresponding to the video application name;
when the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, receiving no video resource from the server.
At present, display devices are integrated with an intelligent voice assistant, and users can use the remote control to search for videos through the intelligent voice assistant.
However, after the user inputs the name of the video resource to be searched, a traditional display device usually performs a whole-device search, that is, it searches for the video in multiple video applications installed on the display device at the same time. The purpose of searching for a video in a specified video application therefore cannot be achieved, resulting in a poor video search experience for the user.
To solve the above problem, the present application provides a video search system. As shown in the framework diagram of the video search system in FIG. 16, the system includes a display device 200 and a server 400. The embodiments of the present application concern the scenario in which the display device and the server interact. Multiple video applications are installed on the display device, and the server is used to recognize the sound data collected by the display device and to provide the video resources of the multiple video applications.
The process of video search using the video search system of this embodiment is specifically as follows:
The user inputs sound data to the display device, and the sound collector of the display device collects the sound data input by the user. The display device may send the transcoded sound data to the server. The sound data includes at least a video resource name.
After receiving the sound data, the server identifies data from the sound data, specifically at least the video resource name.
In some embodiments, if the sound data further includes a video application name and a video application corresponding to the video application name is installed on the display device, that is, the server identifies not only the video resource name but also the video application name from the sound data and the display device has the video application installed, the display device calls the search interface of the video application and searches the server for the video resource corresponding to the video resource name. After a successful search, the server feeds back the video resource to the display device.
In some embodiments, if the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, the display device cannot call the search interface of the video application; likewise, the video resource corresponding to the video resource name cannot be searched for in the server, and the video resource cannot be fed back to the display device.
For example, when the user inputs the sound data "search for video X in video application A", the display device sends the sound data to the server. The server identifies from the sound data that the video resource name is Video X and the video application name is Application A.
If video application A is installed on the display device, the display device calls the search interface of video application A and searches the server for video X. After the video resource of video X is found, the video resource of video X is fed back to the display device. The purpose of searching for a specified video resource in a specified video application through the voice assistant is thereby achieved, improving the user's video search experience.
If video application A is not installed on the display device, the display device cannot call the search interface of video application A, so it cannot search the server for video X, and likewise the video resource of video X cannot be fed back to the display device.
In some embodiments, if the sound data does not include a video application name but only a video resource name, and a video application is currently running in the background of the display device, the search interface of the currently running video application is called, and the video resource corresponding to the video resource name is searched for in the server. After the video resource corresponding to the video resource name is found, the video resource is fed back to the display device.
In some embodiments, if the sound data does not include a video application name but only a video resource name, and no video application is currently running in the background of the display device, the whole-device search function is called, and the whole device searches for the video resource.
For example, when the user inputs the sound data "search for video X", the display device sends the sound data to the server. The server can only identify the video resource name, Video X, from the sound data.
If video application B is running in the background of the display device, the display device calls the search interface of video application B and searches the server for the video resource of video X. After the video resource of video X is found, the video resource is fed back to the display device.
If no video application is running in the background of the display device, the search interface of a single video application cannot be called, so the whole-device search function is called (searching within all video applications installed on the display device) and the whole device searches for the video resource of video X. After the video resource of video X is found, the video resource is fed back to the display device. In this scenario, the video resource of video X may be found in multiple video applications. On the display device, the video resources of video X found in different video applications can be sorted and displayed according to the user's preference for each video application.
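For illustration only, the preference-based ordering mentioned above could be a simple sort over the merged whole-device results; the record shape, preference scores, and names below are assumptions introduced for this sketch.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hedged sketch of ordering whole-device search results by the user's
// preference for each video application.
public class ResultRanker {
    record SearchResult(String appName, String videoTitle) {}

    public static List<SearchResult> rank(List<SearchResult> results,
                                          Map<String, Integer> appPreference) {
        return results.stream()
                .sorted(Comparator.comparingInt(
                        (SearchResult r) -> appPreference.getOrDefault(r.appName(), 0))
                        .reversed()) // most-preferred application first
                .toList();
    }

    public static void main(String[] args) {
        List<SearchResult> hits = List.of(
                new SearchResult("Application A", "Video X"),
                new SearchResult("Application B", "Video X"));
        Map<String, Integer> prefs = Map.of("Application A", 3, "Application B", 7);
        rank(hits, prefs).forEach(r ->
                System.out.println(r.appName() + ": " + r.videoTitle()));
        // Prints Application B first because it has the higher preference score.
    }
}
```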
In some embodiments, the server 400 includes a speech recognition sub-server 400A, an instruction generation sub-server 400B, and a video search sub-server 400C. The speech recognition sub-server may be a server of an intelligent speech partner, used for parsing speech and semantics and recognizing relevant instructions. The instruction generation sub-server and the video search sub-server may be local servers, used for generating relevant search instructions according to the parsed semantics. The video search sub-server is used for receiving search requests from the display device and feeding back relevant resources.
The process of video search using the video search system of this embodiment is specifically as follows:
The user inputs sound data to the display device, and the sound collector of the display device collects the sound data input by the user. The display device may send the transcoded sound data to the speech recognition sub-server. The sound data includes at least a video resource name.
After receiving the sound data, the speech recognition sub-server performs speech and semantic parsing on the sound data and identifies the relevant instruction parameters, specifically at least the video resource name.
In some embodiments, if the sound data further includes a video application name and the display device has the video application corresponding to the video application name installed, the speech recognition sub-server, after parsing the sound data, also identifies the video application name. The speech recognition sub-server then sends the identified video application name and video resource name, together with the other parameters that form the instruction (such as the operation to be executed, device parameters, language parameters, etc.), to the instruction generation sub-server.
The instruction generation sub-server generates a resource search instruction according to the video application name, the video resource name, and the other parameters that form the instruction. Alternatively, the speech recognition sub-server may directly recognize the relevant instruction from the sound data and send the recognized instruction to the instruction generation sub-server, which converts it into a resource search instruction recognizable by the display device. The specific process of parsing the sound data and generating the resource search instruction according to the parsed data is not limited in the present application.
The instruction generation sub-server feeds back the generated resource search instruction to the display device. After receiving the resource search instruction, the display device generates a video search request according to the resource search instruction. The display device sends the video search request to the video search sub-server, that is, it calls the search interface of the video application corresponding to the video application name, and the video search sub-server searches for the video resource in that video application.
After the video resource corresponding to the video resource name is found, the video search sub-server feeds back the video resource to the display device.
For example, as shown in FIG. 17, when the user inputs the sound data "search for video X in video application A", the sound data is displayed on the display device. The display device sends the sound data to the speech recognition sub-server. After receiving the sound data, the speech recognition sub-server identifies from the sound data that the video resource name is Video X, the video application name is Application A, and the other relevant parameters (the operation to be performed is search).
The speech recognition sub-server sends the identified video resource name Video X, video application name Application A, and other relevant parameters to the instruction generation sub-server. The instruction generation sub-server generates a resource search instruction according to the identified video resource name Video X, video application name Application A, and other relevant parameters: search for video X in video application A. The instruction generation sub-server feeds back the generated resource search instruction to the display device.
After receiving the resource search instruction, the display device jumps from the user interface shown in FIG. 17 to the user interface shown in FIG. 18; the user interface of FIG. 18 is the user interface of video application A.
The specific implementation process is as follows: the display device generates a video search request according to the resource search instruction and sends the video search request to the video search sub-server, so that video X is searched for in video application A. That is, the search interface of application A is called on the display device, and video X is searched for in video application A.
After the video resource of video X is found, the video search sub-server feeds back the video resource of video X to the display device. In the user interface shown in FIG. 18, the searched video resources of video X are displayed to the user (video X and other videos related to video X can be displayed).
In some embodiments, if the sound data further includes a video application name and the display device does not have the video application corresponding to the video application name installed, the speech recognition sub-server, after parsing the sound data, also identifies the video application name. The speech recognition sub-server then sends the identified video application name and video resource name, together with the other parameters that form the instruction (such as the operation to be executed, device parameters, language parameters, etc.), to the instruction generation sub-server.
The instruction generation sub-server generates a resource search instruction according to the video application name, the video resource name, and the other parameters that form the instruction. Alternatively, the speech recognition sub-server may directly recognize the relevant instruction from the sound data and send the recognized instruction to the instruction generation sub-server, which converts it into a resource search instruction recognizable by the display device.
The instruction generation sub-server feeds back the generated resource search instruction to the display device. Since the video application corresponding to the video application name is not installed on the display device at this time, the search interface of the video application cannot be called. The corresponding video search request therefore cannot be sent to the video search sub-server, and likewise the video resource cannot be obtained from the video search sub-server.
For example, when the user inputs the sound data "search for video X in video application A", the speech recognition sub-server identifies from the sound data the relevant parameters for generating the instruction, and sends the relevant parameters to the instruction generation sub-server.
The instruction generation sub-server generates a resource search instruction according to the relevant parameters: search for video X in video application A. At this time, application A is not installed on the display device, so the search interface of application A cannot be called, and a video search request cannot be sent to the video search sub-server. The video search sub-server therefore cannot feed back the video resource of video X.
In some embodiments, if the sound data further includes a video application name and the display device does not have the video application corresponding to the video application name installed, the corresponding video search request cannot be sent to the video search sub-server, and likewise the video resource cannot be obtained from the video search sub-server. The controller generates prompt information and displays the prompt information on the display. The prompt information can be: "The application does not exist, please search in other applications."
In some embodiments, if the sound data does not include a video application name and a video application is currently running on the display device, the speech recognition sub-server can only identify the video resource name from the sound data. The speech recognition sub-server sends the identified video resource name and the other relevant parameters for instruction generation to the instruction generation sub-server. The instruction generation sub-server generates a resource search instruction according to the video resource name and the other relevant parameters.
After receiving the resource search instruction, the display device generates a video search request according to the resource search instruction, that is, it calls the search interface of the currently running video application and sends the video search request to the video search sub-server. The video resource corresponding to the video resource name is searched for in the currently running video application, and the video resource obtained by the search is fed back to the display device.
For example, the user interface shown in FIG. 19 is the home page interface of video application B, including a navigation bar and recommended videos. In the user interface shown in FIG. 19, when the user inputs the sound data "search for video X", the sound data can be displayed in the user interface. The display device sends the sound data to the speech recognition sub-server.
The speech recognition sub-server identifies from the sound data the relevant parameters for generating the instruction, and sends the relevant parameters to the instruction generation sub-server. The instruction generation sub-server generates a resource search instruction according to the relevant parameters: search for video X, and sends the resource search instruction to the display device.
After receiving the resource search instruction, the display device jumps from the user interface shown in FIG. 19 to the user interface shown in FIG. 20. The specific implementation process is as follows: the display device calls the search interface of video application B according to the resource search instruction and generates a video search request. The video search request is sent to the video search sub-server, the video resource of video X is searched for in video application B, and the searched video resource of video X is fed back to the display device. Likewise, the searched video resources related to video X are displayed on the display device.
In some embodiments, if the sound data does not include a video application name and no video application is currently running on the display device, the speech recognition sub-server can only identify the video resource name from the sound data. The instruction generation sub-server generates a resource search instruction according to the data identified from the sound data.
The display device currently has no video application running in the background, and the sound data does not contain a video application name, so the display device calls the search interfaces of all installed video applications, generates a video search request, and sends the video search request to the video search sub-server. Whole-device search is thereby achieved, and finally all the searched video resources corresponding to the video resource name are fed back to the display device.
For example, when the user inputs the sound data "search for video X", the speech recognition sub-server identifies from the sound data the relevant parameters for generating the instruction, and sends the relevant parameters to the instruction generation sub-server. The instruction generation sub-server generates a resource search instruction according to the relevant parameters: search for video X.
At this time, the search instruction does not contain a specified video application, and no video application is currently running in the background of the display device. The search interfaces of all installed video applications are then called, a video search request is generated, and the video search request is sent to the video search sub-server. The whole device thereby searches for the video resource of video X. Finally, the video resource corresponding to video X is fed back to the display device.
In some embodiments, the resource search instruction generated by the instruction generation sub-server according to the data identified from the sound data contains an ApplicationName parameter. If the speech recognition sub-server recognizes the name of an application from the sound data, the name of that application is assigned to the ApplicationName parameter. After the display device receives the resource search instruction, if the value of the ApplicationName field is not empty and is consistent with the name of an application installed on the display device, that application is opened. At the same time, the name of the video resource to be searched is sent to the search interface of that application, so that the video resource corresponding to the video resource name is searched for within that application.
After the display device receives the resource search instruction, if the value of the ApplicationName field is not empty and none of the applications installed on the display device matches the value of the ApplicationName field, the application cannot be opened, and the video resource cannot be searched for within that application.
After the display device receives the resource search instruction, if the value of the ApplicationName field is empty, that is, the name of an application was not recognized from the sound data, and an application is currently running in the background of the display device, the currently running video application is opened. At the same time, the name of the video resource to be searched is sent to the search interface of the currently running application, so that the video resource corresponding to the video resource name is searched for within the currently running application.
After the display device receives the resource search instruction, if the value of the ApplicationName field is empty and no application is currently running in the background of the display device, all video applications installed on the display device are opened. At the same time, the name of the video resource to be searched is sent to the search interfaces of all the video applications, achieving whole-device search.
An embodiment of the present application provides a video search method, as shown in the signaling diagram of the video search method in FIG. 21. The method includes the following steps:
Step 1: The display device collects sound data, where the sound data is a voice command input by the user through a user input interface. The sound data includes at least a video resource name. The display device sends the sound data to the server.
Step 2: After the server receives the sound data, if the sound data further includes a video application name and a video application corresponding to the video application name is installed on the display device, the video resource corresponding to the video resource name is searched for in the video application corresponding to the video application name.
Step 3: The server feeds back the video resource corresponding to the video resource name to the display device.
In some embodiments, if the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, no video resource is fed back to the display device.
In some embodiments, if the sound data does not include a video application name and a video application is currently running on the display device, the video resource corresponding to the video resource name is searched for in the currently running video application, and the video resource is fed back to the display device.
In some embodiments, if the sound data does not include a video application name and no video application is currently running on the display device, the whole device searches for the video resource corresponding to the video resource name, and the video resource is fed back to the display device.
Based on the above method embodiments, an embodiment of the present application provides yet another video search method, as shown in the signaling diagram of the video search method in FIG. 22. The method includes the following steps:
Step 1: The display device collects sound data, where the sound data is a voice command input by the user through a user input interface. The sound data includes at least a video resource name. The display device sends the sound data to the speech recognition sub-server.
Step 2: The speech recognition sub-server identifies from the sound data the parameters for generating the instruction, including at least the video resource name, and sends those parameters to the instruction generation sub-server.
Step 3: The instruction generation sub-server generates a resource search instruction according to the parameters for generating the instruction, and feeds back the resource search instruction to the display device.
Step 4: The display device receives the resource search instruction. If the sound data further includes a video application name and the video application corresponding to the video application name is installed on the display device, the display device generates a video search request according to the resource search instruction (calling the search interface of the video application corresponding to the video application name) and sends the video search request to the video search sub-server.
Step 5: After receiving the video search request, the video search sub-server searches for the video resource corresponding to the video resource name in the video application corresponding to the video application name, and feeds back the video resource corresponding to the video resource name to the display device.
In some embodiments, if the sound data further includes a video application name and the video application corresponding to the video application name is not installed on the display device, the search interface of the video application cannot be called, and a video search request cannot be generated according to the resource search instruction.
In some embodiments, if the sound data does not include a video application name and a video application is currently running on the display device, the search interface of the currently running video application is called, and a video search request is sent to the video search sub-server. After receiving the video search request, the video search sub-server searches for the video resource corresponding to the video resource name in the currently running video application, and feeds back the video resource to the display device.
In some embodiments, if the sound data does not include a video application name and no video application is currently running on the display device, the whole device searches for the video resource corresponding to the video resource name. Specifically, the search interfaces of all video applications installed on the display device are called, and a video search request is sent to the video search sub-server. The video search sub-server searches all the video applications for the video resources corresponding to the video resource name, and feeds back all the searched video resources corresponding to the video resource name to the display device.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or equivalently replace some or all of the technical features therein, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
For the convenience of explanation, the above description has been made in connection with specific embodiments. However, the above exemplary discussion is not intended to be exhaustive or to limit the embodiments to the specific forms disclosed above. Many modifications and variations are possible in light of the above teachings. The embodiments were chosen and described in order to better explain the principles and practical applications, so that those skilled in the art can better use the embodiments, as well as various modified embodiments suited to the particular use contemplated.
Claims (10)
- A display device, including: a display; and a controller configured to: receive a voice control command sent by a control device, where the voice control command includes action data of the user operating the control device, for switching the voice scheme of the display device, and voice data input by the user, for searching for target content on the display device; and in response to the voice control command, switch the display page of the display to the target voice scheme page corresponding to the action data, and display the target content corresponding to the voice data on the target voice scheme page.
- The display device according to claim 1, where the controller is further configured to: parse the voice control command to obtain voice data and direction data, the direction data being generated according to the direction in which the user operates the control device; obtain the target content corresponding to the voice data; and while switching the display page of the display to the target voice scheme page corresponding to the direction data, display the target content on the target voice scheme page.
- The display device according to claim 1, where the controller is further configured to: parse the voice control command to obtain voice data and trajectory data, the trajectory data being generated according to the trajectory along which the user operates the control device; obtain the target content corresponding to the voice data; and while switching the display page of the display to the target voice scheme page corresponding to the trajectory data, display the target content on the target voice scheme page.
- The display device according to claim 1, where the controller is further configured to: parse the voice control command to obtain voice data and gesture data, the gesture data being generated according to the gesture input by the user on the control device; obtain the target content corresponding to the voice data; and while switching the display page of the display to the target voice scheme page corresponding to the gesture data, display the target content on the target voice scheme page.
- A control device, including a controller configured to: while receiving voice data input by the user, detect the action of the user operating the control device and generate action data; and send a voice control command, generated by packaging the voice data and the action data, to a display device.
- The control device according to claim 5, where the controller is further configured to: while receiving voice data input by the user, use sensors to detect the direction in which the user operates the control device and generate direction data, where different direction data correspond to different voice schemes in the display device.
- The control device according to claim 5, where the controller is further configured to: while receiving voice data input by the user, use sensors to detect the trajectory along which the user operates the control device and generate trajectory data, where different trajectory data correspond to different voice schemes in the display device.
- The control device according to claim 5, where the controller is further configured to: while receiving voice data input by the user, detect the gesture input by the user on the control device and generate gesture data, where different gesture data correspond to different voice schemes in the display device.
- A method for switching a voice scheme on a display device, including: receiving a voice control command sent by a control device, where the voice control command includes action data of the user operating the control device, for switching the voice scheme of the display device, and voice data input by the user, for searching for target content on the display device; and in response to the voice control command, switching the display page of the display to the target voice scheme page corresponding to the action data, and displaying the target content corresponding to the voice data on the target voice scheme page.
- A method for switching a voice scheme on a display device, including: while receiving voice data input by the user, detecting the action of the user operating the control device and generating action data; and sending a voice control command, generated by packaging the voice data and the action data, to the display device.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110124749.7A CN113282773A (zh) | 2021-01-29 | 2021-01-29 | Video search method, display device and server |
CN202110124749.7 | 2021-01-29 | ||
CN202110156337.1A CN112817556A (zh) | 2021-02-04 | 2021-02-04 | Method for switching voice scheme on display device, display device and control device |
CN202110156337.1 | 2021-02-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022160911A1 true WO2022160911A1 (zh) | 2022-08-04 |
Family
ID=82652960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/133767 WO2022160911A1 (zh) | 2021-01-29 | 2021-11-27 | 显示设备上语音方案的切换方法、显示设备及控制装置 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022160911A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160154624A1 (en) * | 2014-12-01 | 2016-06-02 | Lg Electronics Inc. | Mobile terminal and controlling method thereof |
- CN107122179A (zh) * | 2017-03-31 | 2017-09-01 | Alibaba Group Holding Limited | Voice function control method and apparatus |
- CN107919123A (zh) * | 2017-12-07 | 2018-04-17 | Beijing Xiaomi Mobile Software Co., Ltd. | Multi-voice-assistant control method and apparatus, and computer-readable storage medium |
- CN109313498A (zh) * | 2016-04-26 | 2019-02-05 | View, Inc. | Controlling optically switchable devices |
- CN112817556A (zh) * | 2021-02-04 | 2021-05-18 | Qingdao Hisense Media Networks Ltd. | Method for switching voice scheme on display device, display device and control device |
- 2021-11-27: WO PCT/CN2021/133767, WO2022160911A1 (zh), active, Application Filing
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21922501 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.11.2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21922501 Country of ref document: EP Kind code of ref document: A1 |