Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a method, an apparatus, a device, and a storage medium for controlling a terminal device.
According to a first aspect of embodiments of the present disclosure, there is provided a method for controlling a terminal device, where the method is applied to a terminal device including a DVS acquisition module, the method including:
acquiring event data of human eyes through a DVS acquisition module;
identifying an eye action from the event data of the human eyes; and
in a case where the recognized eye action meets a preset operation condition, controlling the terminal device to perform a corresponding operation according to an operation control instruction derived from the recognized eye action.
In an alternative embodiment, the preset operation condition includes that the identified eye action is a target eye action for instructing the terminal device to perform a target operation, and the operation control instruction is obtained according to a pre-constructed mapping relationship between target eye actions and operation control instructions.
In an alternative embodiment, the preset operation condition includes that the recognized eye action is a preset eye action representing a specified touch mode and that the time for which the control object gazes at the same position on the terminal device exceeds a preset time threshold; the operation control instruction is the same as the instruction that would be triggered by touching the terminal device, in the specified touch mode, at the determined position gazed at by the control object; and the specified touch mode includes any one of a click, a double click, and a long press.
In an alternative embodiment, before acquiring the event data of the human eye, the method further comprises:
determining position information of a visual focus of a control object according to event data acquired by the DVS acquisition module; and
determining, by using the position information, that the visual focus is located in a predetermined controllable area, where the controllable area is configured according to the controllable region of the terminal device.
In an alternative embodiment, the determining the position information of the visual focus of the control object according to the event data collected by the DVS collection module includes:
determining the eye position of the control object in the event data acquired by the DVS acquisition module, and obtaining spatial position information of the eyes relative to the terminal device by combining the distance between the eyes and the DVS acquisition module with the position information of the DVS acquisition module on the terminal device;
performing line-of-sight direction identification by using the event data to obtain the line-of-sight direction of the control object;
and obtaining position information of the visual focus of the control object according to the spatial position information and the determined line-of-sight direction.
In an alternative embodiment, before acquiring the event data of the human eye, the method further comprises:
identifying a designated wake-up action from event data acquired by the DVS acquisition module, where the designated wake-up action is a pre-designated action representing the control object's willingness to control the terminal device by using eye actions.
In an alternative embodiment, the method further comprises:
When at least two faces are identified, outputting prompt information for selecting a control object;
and taking the selected object as the control object based on a selection instruction triggered by a user, where the human eyes are the eyes of the control object.
In an optional embodiment, in a case where the identified eye movement meets a preset operation condition, the controlling the terminal device to execute the corresponding operation according to the operation control instruction obtained by the identified eye movement includes:
performing a countdown reminder after the recognized eye action meets the preset operation condition; and
if no prohibition instruction triggered by the control object is obtained during the countdown, controlling the terminal device to perform the corresponding operation according to the operation control instruction derived from the recognized eye action.
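The countdown-with-cancellation flow above can be sketched as follows. This is a purely illustrative Python sketch: the callables `prohibition_received`, `execute`, and `remind` are hypothetical placeholders, not part of the disclosed apparatus.

```python
def execute_with_countdown(instruction, countdown_s, prohibition_received, execute, remind):
    """Countdown reminder before executing: if the control object triggers a
    prohibition instruction during the countdown, the operation is cancelled.
    All callables here are hypothetical placeholders."""
    for remaining in range(countdown_s, 0, -1):
        remind(remaining)              # e.g. show "executing in N seconds"
        if prohibition_received():
            return False               # cancelled by the control object
    execute(instruction)               # no prohibition: perform the operation
    return True
```

A prohibition check on every tick of the countdown gives the control object the whole reminder window to cancel an unintended operation.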
According to a second aspect of embodiments of the present disclosure, there is provided a manipulation apparatus of a terminal device, the apparatus being applied to a terminal device including a DVS acquisition module, the apparatus including:
The data acquisition module is configured to acquire event data of human eyes through the DVS acquisition module;
an action recognition module configured to recognize an eye action from event data of the human eye;
and an operation control module configured to, in a case where the identified eye action meets a preset operation condition, control the terminal device to perform a corresponding operation according to an operation control instruction derived from the identified eye action.
In an alternative embodiment, the preset operation condition includes that the identified eye action is a target eye action for instructing the terminal device to perform a target operation, and the operation control instruction is obtained according to a pre-constructed mapping relationship between target eye actions and operation control instructions.
In an alternative embodiment, the preset operation condition includes that the recognized eye action is a preset eye action representing a specified touch mode and that the time for which the control object gazes at the same position on the terminal device exceeds a preset time threshold; the operation control instruction is the same as the instruction that would be triggered by touching the terminal device, in the specified touch mode, at the determined position gazed at by the control object; and the specified touch mode includes any one of a click, a double click, and a long press.
In an alternative embodiment, the apparatus further comprises a region judgment module configured to:
before acquiring event data of human eyes, determining position information of a visual focus of a control object according to event data acquired by the DVS acquisition module; and
determining, by using the position information, that the visual focus is located in a predetermined controllable area, where the controllable area is configured according to the controllable region of the terminal device.
In an alternative embodiment, the area determining module is specifically configured to:
determining the eye position of the control object in the event data acquired by the DVS acquisition module, and obtaining spatial position information of the eyes relative to the terminal device by combining the distance between the eyes and the DVS acquisition module with the position information of the DVS acquisition module on the terminal device;
performing line-of-sight direction identification by using the event data to obtain the line-of-sight direction of the control object;
and obtaining position information of the visual focus of the control object according to the spatial position information and the determined line-of-sight direction.
In an alternative embodiment, the action recognition module is further configured to:
before the data acquisition module acquires the event data of human eyes, identifying a designated wake-up action from event data acquired by the DVS acquisition module, where the designated wake-up action is a pre-designated action representing the control object's willingness to control the terminal device by using eye actions.
In an alternative embodiment, the apparatus further comprises an object selection module configured to:
When at least two faces are identified, outputting prompt information for selecting a control object;
and taking the selected object as the control object based on a selection instruction triggered by a user, where the human eyes are the eyes of the control object.
In an alternative embodiment, the operation control module is specifically configured to:
performing a countdown reminder after the recognized eye action meets the preset operation condition; and
if no prohibition instruction triggered by the control object is obtained during the countdown, controlling the terminal device to perform the corresponding operation according to the operation control instruction derived from the recognized eye action.
According to a third aspect of embodiments of the present disclosure, there is provided a terminal device including a DVS acquisition module, a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of any of the methods described above when executing the program.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the methods described above.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:
According to the embodiments of the present disclosure, a DVS acquisition module of the terminal device acquires event data of human eyes, an eye action is identified from the event data, and in a case where the identified eye action meets a preset operation condition, the terminal device is controlled to perform a corresponding operation according to an operation control instruction derived from the identified eye action. The terminal device can therefore be controlled without manual operation; and because the event data includes only data from pixel units in which a change in light intensity is detected, the data volume is small and the response speed is fast.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The term "if" as used herein may be interpreted as "when" or "upon" or "in response to determining," depending on the context.
With the wide use of intelligent terminals, more and more people cannot do without terminal devices such as mobile phones and tablets. Interaction between the terminal device and the user has become a research and development focus of major terminal manufacturers, and various technical schemes for operation interaction with the user on the terminal device have appeared. Currently, the mainstream approach is for the user to perform touch operations on the screen. However, this touch manner is not convenient enough in many cases. Screens of terminal devices are becoming larger and larger, so a single hand cannot cover the full screen and untouchable dead zones exist; the posture of holding the terminal device when lying flat or on one's side can make touching the screen difficult or quickly tire the hand; and a person with impaired upper-limb mobility may find it difficult, or impossible, to operate the terminal device by hand at all.
In view of this, embodiments of the present disclosure provide a control scheme for a terminal device: event data of human eyes is acquired through a DVS acquisition module of the terminal device, an eye action is identified from the event data, and in a case where the identified eye action meets a preset operation condition, the terminal device is controlled to perform a corresponding operation according to an operation control instruction derived from the identified eye action. The terminal device can thus be controlled without manual operation, and because the event data includes only data from pixel units in which a change in light intensity is detected, the data volume is small and the response speed is fast.
The method for controlling a terminal device provided in this embodiment may be implemented by software, by a combination of software and hardware, or by hardware alone, and the hardware involved may be composed of two or more physical entities or of a single physical entity. The method of this embodiment can be applied to an electronic device with a DVS acquisition module. The electronic device may be a portable device such as a smart phone, a smart learning machine, a tablet computer, a notebook computer, or a PDA (Personal Digital Assistant), a fixed device such as a desktop computer, or a wearable device such as a smart watch or smart band.
A smart phone is taken as an example for illustration. The execution subject of the embodiment of the disclosure may be a smart phone, or may be a service program in an operating system of the smart phone. It should be noted that, the smart phone is only one application example provided by the embodiments of the present disclosure, and the technical solution provided by the embodiments of the present disclosure should not be understood as being applicable to a smart phone only.
A dynamic vision sensor (Dynamic Vision Sensor, DVS), which may also be referred to as a dynamic event sensor, is a biomimetic vision sensor that simulates the human retina with pulse-triggered neurons. The sensor contains a pixel unit array formed by a plurality of pixel units, where each pixel unit responds to and records only areas in which the light intensity changes rapidly. The specific composition of the dynamic vision sensor is not described in detail here. The DVS may employ an event-triggered processing mechanism and output an asynchronous event data stream, which may be event data at successive moments. The event data may include light intensity change information (e.g., a timestamp of the light intensity change and a light intensity value), the coordinate position of the triggered pixel unit, and the like. The response speed of the DVS is no longer limited by conventional exposure time and frame rate, so it can detect high-speed objects moving at the equivalent of tens of thousands of frames per second. The DVS also has a larger dynamic range and can accurately sense and output scene changes in low-illumination or high-exposure environments. In addition, the power consumption of the DVS is lower, and because each pixel unit responds to light intensity changes independently, the DVS is not affected by motion blur.
Embodiments of the present disclosure are illustrated in the following drawings.
As shown in fig. 1, fig. 1 is a flowchart illustrating a method for controlling a terminal device according to an exemplary embodiment of the present disclosure, where the method may be used in a terminal device, and the terminal device includes a DVS acquisition module, which may be an image capturing module based on a dynamic vision sensor DVS. The method may comprise the steps of:
In step 102, acquiring event data of a human eye through a DVS acquisition module;
in step 104, an eye movement is identified from the event data of the human eye;
In step 106, in a case where the recognized eye action meets a preset operation condition, the terminal device is controlled to perform a corresponding operation according to an operation control instruction derived from the recognized eye action.
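The three steps above can be outlined as a simple control loop. The following Python sketch is purely illustrative: the DVS interface, the recognizer, and the action-to-instruction mapping are hypothetical placeholders, not the disclosed implementation.

```python
# Minimal sketch of steps 102-106. All names here (EYE_ACTION_TO_INSTRUCTION,
# acquire_eye_events, recognize_eye_action, execute) are hypothetical.
EYE_ACTION_TO_INSTRUCTION = {
    "blink_twice": "confirm",
    "look_left_and_reset": "previous_page",
    "look_right_and_reset": "next_page",
}

def control_step(acquire_eye_events, recognize_eye_action, execute):
    events = acquire_eye_events()          # step 102: DVS event data of the eyes
    action = recognize_eye_action(events)  # step 104: identify the eye action
    instruction = EYE_ACTION_TO_INSTRUCTION.get(action)  # preset condition check
    if instruction is not None:            # step 106: condition met -> execute
        execute(instruction)
        return instruction
    return None
```

An unrecognized or unmapped action simply yields no instruction, so the terminal performs no operation.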
The method of this embodiment can be used in a terminal device provided with an acquisition module based on a dynamic vision sensor (DVS), that is, a DVS acquisition module. The DVS acquisition module may be disposed on an outer surface of the terminal device, such as the front or back of the terminal device, so as to acquire event data of the environment in which the terminal device is located. In some scenarios, since the control object (the operator) often controls the terminal according to the content displayed on the screen, in one embodiment the DVS acquisition module is disposed on the side of the terminal device where the screen is located. For example, the DVS acquisition module may be located in the area surrounding the front-facing camera, although it may also be located elsewhere, and in some applications may even replace the front-facing camera.
The DVS acquisition module acquires event data in a scene and outputs events when the scene changes. For example, when no object in the scene moves relative to the terminal device, the light intensity detected by the pixel units in the DVS acquisition module does not change; when an object in the scene moves relative to the terminal device, the light changes, the corresponding pixel units are triggered, and an event data stream of the pixel units that detect the change in light intensity is output. Each event in the event data stream may include the coordinate position of the pixel unit that detected the change in brightness, timestamp information of the triggering time, a light intensity value, and so on. In the DVS acquisition module, a single pixel outputs an event (pulse) signal only when the received light intensity changes; for example, if the brightness increases beyond a threshold, a brightness-increase event is output for that pixel. Event data corresponding to the same timestamp information can be presented in the form of an image, and may therefore be referred to as DVS image data; it can be regarded as partial image data, since there is no event data for pixel units in which no change in light intensity is detected.
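As a rough illustration of the event data described above, each event can be modeled as a small record carrying the triggered pixel's coordinates, a timestamp, and the direction of the intensity change. The field names below are assumptions for illustration only, not a prescribed format:

```python
from dataclasses import dataclass

# Hypothetical representation of one DVS event; field names are illustrative.
@dataclass
class DvsEvent:
    x: int             # column of the triggered pixel unit
    y: int             # row of the triggered pixel unit
    timestamp_us: int  # time at which the light-intensity change was detected
    polarity: int      # +1 brightness increase, -1 brightness decrease

def events_at(stream, t):
    """Collect the events sharing one timestamp; rendered together they
    form the partial image referred to as DVS image data."""
    return [e for e in stream if e.timestamp_us == t]
```

Pixels where no intensity change occurred contribute no events at all, which is why the data volume stays small.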
By using DVS, this embodiment has inherent advantages in sensing and capturing the motion of moving objects, and can identify orientation information, motion information, and the like of the eyes more accurately and efficiently. The DVS also offers a high frame rate and a high dynamic range. The high frame rate allows the DVS to acquire eye information more frequently, analyze eye actions in finer detail, and capture small changes of the eyes in a more timely manner. Thanks to the high dynamic range and good performance in dim light, the eyes can be well monitored under strong light, dim light, or backlight, so the scheme adapts well to various extreme lighting environments.
In the related art, a touch manner may be used to control the terminal device; the present disclosure provides an eye control manner, for which, in one embodiment, an eye manipulation mode may be constructed. The execution conditions of steps 102 to 106 at least include that the eye manipulation mode is on. When the user does not need to control the terminal with the eyes, the eye manipulation mode can be turned off, avoiding the resource waste caused by the DVS acquisition module acquiring event data in real time. In another embodiment, no eye manipulation mode is constructed, and whether to perform steps 102 to 106 is determined by whether the DVS acquisition module is activated.
In some application scenarios, when the eye manipulation mode or the DVS acquisition module is started, a certain interval may pass before the terminal is actually controlled, and continuous action recognition during this period would waste resources. Therefore, in another embodiment, steps 102 to 106 may be triggered after it is determined that the user has a desire to control the terminal. In one example, whether the user has a desire to control the terminal using eye actions may be determined by whether the visual focus is located in a predetermined controllable region. Correspondingly, before acquiring the event data of the human eyes, the method further includes:
determining the position information of the visual focus of the control object according to the event data acquired by the DVS acquisition module;
and determining, by using the position information, that the visual focus is located in a predetermined controllable area, where the controllable area is configured according to the controllable region of the terminal device.
The event data may be event data collected by the DVS acquisition module at the same time, or event data collected at consecutive times. The visual focus being located in the controllable region may be used to characterize the control object's willingness to control the terminal; if willingness is graded, this may be classified as a preliminary willingness. In one example, the position information of the visual focus may be obtained from the spatial position information of the eyes relative to the terminal device and the line-of-sight direction of the eyes. For example, determining the position information of the visual focus of the control object according to the event data collected by the DVS acquisition module may include:
determining the eye position of the control object in the event data acquired by the DVS acquisition module, and obtaining spatial position information of the eyes relative to the terminal device by combining the distance between the eyes and the DVS acquisition module with the position information of the DVS acquisition module on the terminal device;
performing line-of-sight direction identification by using the event data to obtain the line-of-sight direction of the control object;
and obtaining position information of the visual focus of the control object according to the spatial position information and the determined line-of-sight direction.
The event data may include information such as the coordinate position of the pixel unit where the change in brightness is detected, and thus the eye position may be determined according to the coordinate position of the pixel unit representing the eye.
The distance between the eyes and the DVS acquisition module may be determined from depth information of the pixel units representing the eyes. In one example, if only a monocular camera module is available, a depth-from-focus (DFF) method may be employed to determine this distance. In another example, if the terminal device has multiple camera modules, a binocular camera may be used. Taking the case where the DVS acquisition module is disposed on the surface where the screen is located as an example, the DVS acquisition module and the front camera can form a binocular camera module, which is then used to determine the distance between the eyes and the DVS acquisition module. The specific determination method is similar to measurement with a common binocular camera and is not described here; other methods may also be used to determine the distance between the eyes and the DVS acquisition module.
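For the binocular case, the distance can be recovered by standard stereo triangulation: for a rectified camera pair with focal length f (in pixels), baseline B, and disparity d (in pixels), the depth is Z = f·B / d. A minimal sketch under that standard assumption (not a claimed implementation detail):

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Standard stereo triangulation: Z = f * B / d.
    Assumes a rectified camera pair, e.g. the DVS module plus the front camera."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return focal_px * baseline_m / disparity_px
```

For example, with an 800-pixel focal length, a 5 cm baseline, and a 100-pixel disparity, the eye would be about 0.4 m from the module.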
The location information of the DVS acquisition module on the terminal device is fixed and may be stored in advance.
After the position of the eye, the distance between the eye and the DVS acquisition module, and the position information of the DVS acquisition module on the terminal device are obtained, spatial position information of the eye relative to the terminal device can be obtained. For example, a three-dimensional space coordinate system may be established, whereby the space coordinates of the eye and the terminal device may be obtained.
The line-of-sight direction, also known as the gaze direction, may be a direction determined based on the pupil. Because the event data includes data representing the eyes, line-of-sight recognition can be performed on the event data to obtain the line-of-sight direction. In one example, a passive method may be adopted: the DVS acquisition module collects event data of the eyes, stereoscopic modeling of the eyeball is performed using the collected event data to obtain the position of the pupil, and the gaze direction is determined according to the position of the pupil. In another example, an active method may be employed: an active light-emitting device projects a point light source onto the eyes, where it is reflected; the DVS collects event data of the eyes in this scene, and the gaze direction is determined according to the positions of the reflection points of the point light source in the eyes.
It should be understood that other manners of determining the eye gaze direction may also be employed, and are not described in detail herein.
In this embodiment, the position information of the visual focus can be determined by the spatial position information of the eye with respect to the terminal device, and the line of sight direction of the eye.
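One simple way to combine the two quantities, assuming a terminal-centered coordinate frame in which the screen lies in the plane z = 0, is to intersect the gaze ray with that plane. The geometry below is an illustrative assumption, not the claimed method:

```python
def visual_focus(eye_pos, gaze_dir):
    """Intersect the gaze ray with the screen plane z = 0.
    eye_pos: (x, y, z) of the eye in a terminal-centered frame (assumed);
    gaze_dir: (dx, dy, dz) line-of-sight direction. Illustrative geometry only."""
    ex, ey, ez = eye_pos
    dx, dy, dz = gaze_dir
    if dz == 0:
        return None  # gaze parallel to the screen: no focus on the screen
    t = -ez / dz
    if t <= 0:
        return None  # looking away from the screen
    return (ex + t * dx, ey + t * dy)
```

The returned (x, y) point is the visual focus in screen coordinates; a `None` result means the gaze never meets the screen.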
After the position information of the visual focus is obtained, whether the visual focus is located in the predetermined controllable area may be determined. When it is determined that the visual focus is located in the controllable area, step 102 may be performed; otherwise, the method returns to the step of determining the position information of the visual focus of the control object according to the event data collected by the DVS acquisition module.
The predetermined controllable region may be determined according to whether the control object, when its visual focus falls within the region, has a desire to control the terminal. In one embodiment, when the control object faces the controllable region of the terminal device, the control object may be considered to have the intention of controlling the terminal device; for this purpose, the controllable area may be configured according to the controllable region of the terminal device. The controllable region of the terminal device may include a touch-controllable region and/or a key-controllable region, and how the controllable area is configured from it can be set as required. The controllable region may be a region in which the control object desires to control the terminal with the eyes. By way of example, the controllable area may be the full-screen area, a partial screen area, or even a border area of the terminal device. For example, in a video call scenario, in order to avoid accidentally triggering the hang-up button in the video call interface, the controllable region may exclude the region where that specified button is located. As shown in fig. 2A, fig. 2A is a schematic diagram of a controllable region according to an exemplary embodiment of the present disclosure. In this embodiment, the controllable area may include the touch area excluding the area where the hang-up button is located. For another example, on the home page or lock screen of the terminal device, the controllable area may be the area of the full screen, and may even include the area where the keys are located. As shown in fig. 2B, fig. 2B is a schematic diagram of another controllable region according to an exemplary embodiment of the present disclosure. In this schematic diagram, the controllable area may include the area where the front side of the terminal device is located. It will be appreciated that the controllable region may also be configured as other regions as desired, which are not specifically illustrated here.
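The check that the visual focus falls inside a controllable area that excludes, say, the hang-up button region can be sketched as a point-in-rectangle test. This is illustrative only; real controllable regions need not be rectangular:

```python
def in_controllable_region(focus, region, excluded=()):
    """Check whether the visual focus lies in the controllable region.
    region and each entry of excluded are (x_min, y_min, x_max, y_max)
    rectangles; excluded models areas such as a hang-up button.
    Illustrative sketch only."""
    def inside(p, rect):
        x, y = p
        x0, y0, x1, y1 = rect
        return x0 <= x <= x1 and y0 <= y <= y1
    return inside(focus, region) and not any(inside(focus, r) for r in excluded)
```

A focus on the hang-up button then fails the check even though it lies on the screen, matching the video-call example above.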
In this embodiment, when it is determined that the visual focus is located in the predetermined controllable area, the execution of the action recognition is triggered, so that the resource waste caused by the action recognition at all times can be avoided.
In one embodiment, in order to avoid erroneous control caused by unintentional actions of designated facial parts, whether a subsequently recognized action is valid may also be determined by whether a designated wake-up action has been detected, and the terminal device is controlled to perform the corresponding operation only if the action is valid. Correspondingly, before acquiring the event data of the human eyes, the method further includes identifying a designated wake-up action from the event data acquired by the DVS acquisition module, where the designated wake-up action is a pre-designated action representing the control object's willingness to control the terminal device by using eye actions.
Recognition of the specified wake-up action can be used to characterize the control object's willingness to control the terminal with eye actions. If willingness is graded, the willingness characterized by the specified wake-up action is stronger than that characterized by the visual focus being in the controllable region, and may be classified as a deep willingness. The specified wake-up action is a pre-specified action, which may be a default action or an action pre-configured by the control object, for example, blinking twice in succession. After the specified wake-up action is detected, event data of the human eyes can be obtained from the event data newly collected by the DVS acquisition module, so as to identify eye actions and control the terminal device.
In this embodiment, after the specified wake-up action is identified from the event data collected by the DVS acquisition module, an eye action identified from the event data of the human eyes is treated as a valid action, and only a valid action can trigger control of the terminal device. By determining whether a subsequently recognized action is valid according to whether the designated wake-up action has been recognized, erroneous control of the terminal caused by carelessly performing some eye action can be avoided.
Further, a validity period of the specified wake-up action may also be set, so that the target eye action/specified eye action is an action recognized within a preset period of time after the specified wake-up action is recognized. An eye action recognized within the preset period after the specified wake-up action is judged to be valid. The preset period may be set as required, for example, to 3 s or 10 s. In a scenario where a specified wake-up action must be recognized before each eye action, the preset period may be set to a small value; in a scenario where all subsequent actions are judged valid once the specified wake-up action is recognized, the preset period may be set to a larger value.
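The validity period can be tracked with a small amount of state, as in the following sketch. The 3-second default mirrors the example value above; the function names are hypothetical:

```python
def make_action_validator(validity_s=3.0):
    """Track the validity period of a specified wake-up action.
    Eye actions are treated as valid only within validity_s seconds
    after the wake-up action was recognized. Illustrative sketch."""
    state = {"woke_at": None}
    def on_wake(now):
        state["woke_at"] = now       # wake-up action recognized at time `now`
    def is_valid(now):
        return state["woke_at"] is not None and now - state["woke_at"] <= validity_s
    return on_wake, is_valid
```

A larger `validity_s` corresponds to the scenario where all actions following one wake-up are treated as valid; a small value forces a fresh wake-up before each eye action.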
The control object is the object that manipulates the terminal device. It may be a designated object (which may be referred to as an object having the right) or a non-designated object (no agreement is made on whether the object has the right, so long as it currently controls the terminal). In practical applications, there may be multiple objects within the acquisition range of the DVS acquisition module. In one embodiment, the control object may be an object having the right: the object having the right is identified from an acquired face image, and event data of that object's eyes is then acquired. For example, it may be agreed that only the owner has the right to control the camera with an action. In another embodiment, to achieve controllability of the control object, the method further comprises:
When at least two faces are identified, outputting prompt information for selecting a control object;
And taking the selected object as the control object based on a selection instruction triggered by a user, where the human eye is the eye of the control object.
Regarding face recognition, it can be performed using images acquired by another camera module (such as a front camera) on the terminal device, or using the event data acquired by the DVS acquisition module. The selected object is the person selected by the user based on the prompt information.
According to this embodiment, prompt information for selecting a control object is output for the user to choose from, which improves the selectability of the control object.
Regarding eye actions, an eye action may be one or more of a pupil action, an eyeball action, and an eyelid action; in a broad understanding, eye actions may also include eyebrow actions. For example, the pupil may dilate and contract, the eyeball may move or rotate in different directions, the eyelid may open and close, and an eye expression action characterizing a specified emotion may be completed by combining the eyelid, eyeball, eyebrow, and so on. The target actions may include, for example, one or more of: blinking both eyes at least once, opening only one eye, the eyeball resetting after moving to a specified orientation, and an eye expression action characterizing a specified emotion. The eyeball resetting after moving to a specified orientation includes, for example, resetting after moving upward or resetting after moving leftward. The eye expression action may be an expression such as frowning combined with glaring.
In this embodiment, controlling the terminal through eye actions reduces the content that needs to be recognized, and eye actions are easy for the control object to complete, thereby improving the user's control experience.
The event data of the human eye may be event data that includes the human eye, for example obtained directly from the DVS acquisition module. It may also be the portion of the event data containing the human eye, cropped from the event data acquired by the DVS acquisition module.
Event data of the human eye is generated whenever the eye changes, so the eye action within the acquisition range of the DVS acquisition module can be recognized from the event data. For example, the eye action may be recognized from event data at the same time: the eye region may first be located from event data corresponding to the same time stamp, and the eye action then recognized from the eye region. It will be appreciated that event data under one time stamp or multiple time stamps may be required during recognition, which is not limited here.
In one embodiment, the eye action in the event data may be recognized using a pre-trained action recognition model. The action recognition model may be obtained by training with labeled training samples. Training samples may include general samples and special samples. A general sample is event data collected when the eyes of unspecified subjects perform actions, while a special sample is event data collected when the eyes of the local control object perform actions. For example, a special-sample acquisition service may be provided: taking blinking as an example, the user is prompted to blink, the DVS acquisition module collects event data while the control object blinks, and the event data is recorded. Considering that each person's eyes may differ, a general action recognition model may be trained from general samples, and for each device the general model may then be reinforced with the special samples of its control object to obtain a dedicated action recognition model, thereby improving recognition accuracy.
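The two-stage training described above (a general model reinforced with the control object's special samples) can be sketched, purely for illustration, with a toy nearest-centroid classifier; the feature vectors, labels, and function names are assumptions, not the disclosure's actual model:

```python
# Toy sketch of "general model + reinforcement with special samples".
# A nearest-centroid classifier stands in for the real action recognition model.

def train_centroids(samples):
    """samples: {label: [feature_vector, ...]} -> {label: centroid}"""
    centroids = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        centroids[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return centroids

def reinforce(centroids, special_samples, weight=0.5):
    """Shift each centroid toward the control object's own (special) samples."""
    updated = dict(centroids)
    for label, vectors in special_samples.items():
        special = train_centroids({label: vectors})[label]
        general_c = centroids[label]
        updated[label] = [(1 - weight) * g + weight * s for g, s in zip(general_c, special)]
    return updated

def classify(centroids, vector):
    # Nearest centroid by squared Euclidean distance.
    return min(centroids, key=lambda lb: sum((c - x) ** 2 for c, x in zip(centroids[lb], vector)))

# Toy "event features" for two eye actions (illustrative values).
general = {"blink": [[1.0, 0.0], [0.9, 0.1]], "gaze_left": [[0.0, 1.0], [0.1, 0.9]]}
model = train_centroids(general)
# Reinforce with the local control object's own blink samples.
model = reinforce(model, {"blink": [[0.8, 0.2]]})
print(classify(model, [0.85, 0.15]))  # -> blink
```

The reinforcement step mirrors the idea of adapting the general model to one device's control object.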
It can be understood that other means may be used to identify the eye motion in the acquisition range of the DVS acquisition module, which is not described in detail herein.
The preset operation condition refers to a preset condition for triggering control of the terminal device. In one embodiment, a mapping relationship between target eye actions and operation control instructions may be constructed in advance, where a target eye action is used to instruct the terminal device to execute a target operation. The target operation may be a basic operation that the operating system can complete, for example powering on, powering off, unlocking the screen, locking the screen, sliding the screen left, right, up, or down, or increasing or decreasing the volume. The eye action replaces the action of a hand touching the terminal: for example, looking left replaces the user sliding the screen left with a hand, and looking up replaces the user sliding the screen up with a hand. Correspondingly, the preset operation condition may include that the recognized eye action is a target eye action for instructing the terminal device to execute the target operation, and the operation control instruction is obtained according to the pre-constructed mapping relationship between target eye actions and operation control instructions.
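A minimal sketch of such a pre-constructed mapping follows; the action and instruction names are illustrative assumptions, not the disclosure's actual identifiers:

```python
# Illustrative pre-constructed mapping between target eye actions and
# operation control instructions. All names are assumptions for illustration.
ACTION_TO_INSTRUCTION = {
    "blink_both_twice": "unlock_screen",
    "gaze_left_reset": "slide_screen_left",
    "gaze_up_reset": "slide_screen_up",
    "open_one_eye": "decrease_volume",
}

def instruction_for(eye_action):
    # The preset operation condition holds only when the recognized action
    # is a target eye action present in the mapping; otherwise no instruction.
    return ACTION_TO_INSTRUCTION.get(eye_action)

print(instruction_for("gaze_left_reset"))  # -> slide_screen_left
print(instruction_for("frown"))            # not a target action -> None
```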
In this embodiment, different actions indicate different control instructions, so the terminal can be controlled by eye actions to perform different operations.
In practical applications, the eye can perform only a limited number of actions, while the number of terminal operations expected to be performed by eye manipulation is large. In one embodiment, the operation control instruction may therefore be determined by also using the gaze information of the control object. The preset operation condition may include that the recognized eye action is a preset specified eye action representing a specified touch mode and that the time for which the control object gazes at the same position of the terminal device exceeds a preset time threshold; the operation control instruction is the same as the instruction triggered by touching, in the specified touch mode, the position on the terminal device at which the control object gazes; and the specified touch mode includes any one of single click, double click, and long press.
The position at which the control object gazes may be obtained from the position information of the control object's visual focus. The gazed position may be a button such as an application icon or a control in a page, or a physical key such as a home key, a volume key, or a power-off key.
For example, a single click may be represented by a single, non-consecutive blink, a double click by two successive blinks, and a long press by rotating the eyeball. When the control object blinks once and gazes at the camera application icon for longer than the preset time threshold, a control instruction for opening the camera application is obtained from the recognized eye action, and the terminal device is controlled to open the camera application. As another example, when the control object rotates the eyeball and gazes at an instant messaging application icon for longer than the preset time threshold, an operation control instruction for entering the application-icon deletion mode is obtained from the recognized eye action, and the terminal device is controlled to enter that mode. The terminal device can then be instructed, by the upward-looking target eye action, to execute the delete operation and thereby delete the instant messaging application.
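Combining the specified touch mode with the gaze condition, as in these examples, might look like the following sketch; the action-to-mode mapping, the 1.5 s threshold, and all names are assumptions for illustration:

```python
# Hedged sketch: a preset eye action selects a touch mode, and the gaze
# position held beyond a threshold selects the target. Names and the
# 1.5 s threshold are illustrative assumptions.
TOUCH_MODE_FOR_ACTION = {
    "single_blink": "click",
    "double_blink": "double_click",
    "rotate_eyeball": "long_press",
}
GAZE_THRESHOLD_S = 1.5

def resolve_instruction(eye_action, gaze_target, gaze_duration_s):
    mode = TOUCH_MODE_FOR_ACTION.get(eye_action)
    if mode is None or gaze_duration_s <= GAZE_THRESHOLD_S:
        return None  # preset operation condition not satisfied
    # Same instruction as touching the gazed position in the given mode.
    return (mode, gaze_target)

print(resolve_instruction("single_blink", "camera_icon", 2.0))  # -> ('click', 'camera_icon')
print(resolve_instruction("single_blink", "camera_icon", 0.5))  # gaze too short -> None
```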
It can be seen that the two embodiments can be used alone or in combination, so that the terminal can be controlled in a variety of ways, achieving diversity in controlling the terminal device with eye actions.
In practical applications, erroneous operations may occur; in view of this, a reminder mechanism is also provided. Correspondingly, in the case where the recognized eye action satisfies the preset operation condition, controlling the terminal device to execute the corresponding operation according to the operation control instruction obtained from the recognized eye action includes:
performing a countdown reminder after the recognized eye action satisfies the preset operation condition;
and if no prohibition instruction triggered by the control object is received during the countdown, controlling the terminal device to execute the corresponding operation according to the operation control instruction obtained from the recognized eye action.
In this embodiment, the countdown reminder may be an audio reminder or a display reminder; for example, a countdown number or a countdown progress bar may be displayed on the screen to remind the user that the operation corresponding to the eye action will be executed after the countdown ends. A button allowing the user to choose to prohibit or continue may also be provided in the interface; if the user considers the operation erroneous, clicking the prohibition button triggers the prohibition instruction, so that the erroneous operation is avoided.
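The countdown-and-prohibition flow described above can be sketched as follows; the tick-based loop and all names are illustrative assumptions rather than the disclosure's implementation:

```python
# Sketch of the countdown reminder: the operation runs only if no prohibition
# instruction arrives before the countdown ends. Names are assumptions.
def run_with_countdown(ticks, prohibited_at=None, on_execute=lambda: "executed"):
    """Count down `ticks` steps; abort if prohibition is triggered at some tick."""
    for remaining in range(ticks, 0, -1):
        # In a real UI this would update a countdown number or progress bar.
        if prohibited_at is not None and remaining <= prohibited_at:
            return "prohibited"  # the user clicked the prohibition button
    return on_execute()

print(run_with_countdown(3))                   # no prohibition -> executed
print(run_with_countdown(3, prohibited_at=2))  # aborted mid-countdown -> prohibited
```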
The technical features of the above embodiments may be combined arbitrarily as long as there is no conflict or contradiction between them; such combinations are not described in detail, but any combination of these technical features also falls within the scope of the present disclosure.
Several such combinations are exemplified below.
As shown in fig. 3, fig. 3 is a flowchart of another method for controlling a terminal device according to an exemplary embodiment of the present disclosure, the method being applied to a terminal device including a DVS acquisition module, the method including:
In step 302, position information of a visual focus of a control object is determined according to event data acquired by the DVS acquisition module.
The visual focus may be determined according to the spatial position information of the eye relative to the terminal device (such as the eyeball azimuth) and the line-of-sight direction of the control object.
In one example, to ensure the security of the terminal device, the DVS acquisition module may be turned on in the unlocked state. The DVS then starts detecting the human eye and analyzing the data to obtain the azimuth and direction information of the eyeball. The DVS is an event-driven sensor that can track a moving object such as the eyeball well, and it has the characteristics of a high frame rate and high dynamic range. The high frame rate enables the sensor to acquire eye information more frequently, analyze eye actions in finer detail, and capture small eye changes more promptly. The high dynamic range and good low-light performance allow eye monitoring to be completed well under strong light, dim light, or backlight.
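Purely as an illustration of steps 302 and 304, the visual focus can be modeled as the intersection of the line of sight with the screen plane; the coordinate convention (screen plane at z = 0), the area bounds, and all values are assumptions:

```python
# Illustrative geometry: project the line of sight from the eye's spatial
# position onto the terminal's screen plane (assumed z = 0) to obtain the
# visual focus, then check the controllable area. Units are arbitrary.
def visual_focus(eye_pos, gaze_dir):
    """eye_pos: (x, y, z) of the eye relative to the screen; gaze_dir: (dx, dy, dz).
    Returns the (x, y) focus on the plane z = 0, or None if the gaze is
    parallel to or points away from the screen."""
    ex, ey, ez = eye_pos
    dx, dy, dz = gaze_dir
    if dz == 0:
        return None
    t = -ez / dz
    if t <= 0:
        return None  # gaze does not hit the screen
    return (ex + t * dx, ey + t * dy)

def in_controllable_area(focus, area=((0.0, 0.0), (7.0, 15.0))):
    # Step 304: check whether the focus lies within the controllable area.
    if focus is None:
        return False
    (x0, y0), (x1, y1) = area
    return x0 <= focus[0] <= x1 and y0 <= focus[1] <= y1

focus = visual_focus(eye_pos=(3.0, 5.0, 30.0), gaze_dir=(0.0, 0.1, -1.0))
print(focus)                        # approximately (3.0, 8.0)
print(in_controllable_area(focus))
```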
In step 304, it is determined whether the visual focus is located in a predetermined controllable area by using the position information, if yes, step 306 is executed, if not, step 302 is executed again, and the position information of the visual focus of the control object is determined by using the newly acquired event data.
The controllable area is configured according to the controllable area of the terminal equipment.
In step 306, the DVS-based acquisition module obtains event data for a human eye and identifies eye movements from the event data for the human eye.
In this step, eye movements, such as movements of the eyeball and expressions of the eye, may be further acquired and analyzed.
In step 308, it is determined whether the recognized eye movement satisfies the preset operation condition, if yes, step 310 is executed, if not, step 302 is executed again, and the position information of the visual focus of the control object is determined by using the newly acquired event data.
In step 310, the terminal device is controlled to perform a corresponding operation according to an operation control instruction obtained from the recognized eye movement.
In fig. 3, the portions that are the same as in fig. 1 are not repeated here for brevity.
This embodiment uses the advantages of the DVS in perceiving and capturing moving objects to recognize the azimuth information and direction of the eyes accurately and efficiently, and directly controls the terminal device to execute the corresponding operation according to the operation control instruction obtained from the recognized eye action. Touch-free operation using only the eyes, without hand contact, is thereby achieved. The terminal device can be controlled easily even when the hands are wet and touching the screen is difficult, when holding the terminal sideways or lying down makes touch operation tiring, or when an ever-larger screen leaves blind zones that cannot be reached by touch. The usage scenarios of the terminal are broadened, and people with upper-limb dysfunction can use a handheld terminal device smoothly. The control process is more direct and concise, and the intuitive correspondence between operations and eye actions reduces the user's learning cost, thereby improving the operating experience of the terminal.
As shown in fig. 4A, fig. 4A is a flowchart of another method for controlling a terminal device according to an exemplary embodiment of the present disclosure, which may be used in a terminal device including a DVS acquisition module, the method describing a process of how to determine preset operation conditions and how to obtain operation control instructions on the basis of the foregoing embodiment, including the steps of:
In step 402, position information of a visual focus of a control object is determined according to event data acquired by the DVS acquisition module.
In step 404, it is determined whether the visual focus is located in a predetermined controllable region by using the position information, if yes, step 406 is executed, if not, step 402 is executed again, and the position information of the visual focus of the control object is determined by using the newly acquired event data.
The controllable area is configured according to the controllable area of the terminal equipment.
In step 406, event data of a human eye is obtained through a DVS acquisition module, and eye motion is identified from the event data of the human eye.
In step 408, if the identified eye movement is the target eye movement for instructing the terminal device to execute the target operation, an operation control instruction corresponding to the identified target eye movement is obtained according to the mapping relationship between the pre-constructed target eye movement and the operation control instruction, and the terminal device is controlled to execute the corresponding operation by the obtained operation control instruction.
In step 410, if the recognized eye action is a preset specified eye action representing a specified touch mode and the time for which the control object gazes at the same position of the terminal device exceeds the preset time threshold, the operation control instruction is determined according to the specified touch mode and the determined position at which the control object gazes on the terminal device, and the terminal device is controlled by the obtained operation control instruction to execute the corresponding operation. The operation control instruction is the same as the instruction triggered by touching, in the specified touch mode, the position at which the control object gazes on the terminal device, and the specified touch mode includes any one of single click, double click, and long press.
In fig. 4A, the portions that are the same as in fig. 1 are not repeated here for brevity.
The following is an example of a specific application scenario. As shown in fig. 4B, fig. 4B is an application scenario diagram of a method for controlling a terminal device according to an exemplary embodiment of the present disclosure. In this scenario, when the control object's eyes look to the left, the leftward eye action is recognized and the terminal device performs the same operation as that triggered by sliding the screen to the left. When the control object blinks and then gazes at the camera icon, since the recognized eye action is a blink and the time for which the control object gazes at the camera icon exceeds the preset time threshold, the terminal device is triggered to open the camera application. In this embodiment, the mobile phone is controlled touch-free through eye actions, broadening the usage scenarios of the mobile phone.
Corresponding to the foregoing embodiments of the method for controlling a terminal device, the disclosure further provides embodiments of a control apparatus of the terminal device, a device to which the apparatus is applied, and a storage medium.
As shown in fig. 5, fig. 5 is a block diagram of a manipulation apparatus of a terminal device according to an exemplary embodiment of the present disclosure, the apparatus being applied to a terminal device including a DVS acquisition module, the apparatus including:
a data acquisition module 52 configured to acquire event data of a human eye through the DVS acquisition module;
an action recognition module 54 configured to recognize an eye action from event data of the human eye;
the operation control module 56 is configured to control the terminal device to perform a corresponding operation according to an operation control instruction obtained from the recognized eye movement in the case where the recognized eye movement satisfies a preset operation condition.
In an alternative embodiment, the preset operation condition includes that the recognized eye action is a target eye action for instructing the terminal device to execute a target operation, and the operation control instruction is obtained according to a pre-constructed mapping relationship between target eye actions and operation control instructions.
In an alternative embodiment, the preset operation condition includes that the recognized eye action is a preset specified eye action representing a specified touch mode and that the time for which the control object gazes at the same position of the terminal device exceeds a preset time threshold; the operation control instruction is the same as the instruction triggered by touching, in the specified touch mode, the position on the terminal device at which the control object gazes; and the specified touch mode includes any one of single click, double click, and long press.
In an alternative embodiment, the apparatus further comprises a region judgment module (not shown in fig. 5) configured to:
Before acquiring event data of human eyes, determining position information of a visual focus of a control object according to the event data acquired by the DVS acquisition module;
And judging, by using the position information, that the visual focus is located in a predetermined controllable area, where the controllable area is configured according to the controllable region of the terminal device.
In an alternative embodiment, the area determining module determines the position information of the visual focus of the control object according to the event data collected by the DVS collecting module, including:
Determining the eye position of the control object in the event data acquired by the DVS acquisition module, and obtaining the spatial position information of the eye relative to the terminal device by combining the distance between the eye and the DVS acquisition module with the position information of the DVS acquisition module on the terminal device;
performing line-of-sight direction identification by using the event data to obtain the line-of-sight direction of the control object;
And obtaining the position information of the visual focus of the control object according to the space position information and the determined sight line direction.
In an alternative embodiment, the action recognition module is further configured to:
Before the data acquisition module acquires the event data of the human eye, a specified wake-up action is recognized from the event data acquired by the DVS acquisition module, where the specified wake-up action is a pre-specified action for characterizing the control object's willingness to control the terminal device using eye actions.
In an alternative embodiment, the apparatus further comprises an object selection module (not shown in fig. 5) configured to:
When at least two faces are identified, outputting prompt information for selecting a control object;
And taking the selected object as the control object based on a selection instruction triggered by a user, where the human eye is the eye of the control object.
In an alternative embodiment, the operation control module is specifically configured to:
performing countdown reminding after the recognized eye actions meet preset operation conditions;
if no prohibition instruction triggered by the control object is obtained during the countdown, the terminal device is controlled to execute the corresponding operation according to the operation control instruction obtained from the recognized eye action.
Correspondingly, the disclosure also provides a terminal device, which comprises a DVS acquisition module, a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of any one of the methods when executing the program.
Accordingly, the present disclosure also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of any of the methods described above.
The present disclosure may take the form of a computer program product embodied on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having program code embodied therein. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
Specific details of the implementation process of the functions and roles of each module in the device are shown in the implementation process of the corresponding steps in the method, and are not repeated here.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the objectives of the disclosed solution. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
As shown in fig. 6, fig. 6 is a block diagram of an apparatus for controlling a terminal device according to an exemplary embodiment of the present disclosure. The apparatus 600 may be a mobile phone with a DVS acquisition module, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, or the like.
Referring to FIG. 6, the apparatus 600 may include one or more of a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls overall operation of the apparatus 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operations at the apparatus 600. Examples of such data include instructions for any application or method operating on the apparatus 600, contact data, phonebook data, messages, pictures, videos, and the like. The memory 604 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 606 provides power to the various components of the device 600. The power supply components 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 600.
The multimedia component 608 includes a screen that provides an output interface between the apparatus 600 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide action. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the apparatus 600 is in an operational mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to, a home button, a volume button, an activate button, and a lock button.
The sensor assembly 614 includes one or more sensors for providing status assessment of various aspects of the apparatus 600. For example, the sensor assembly 614 may detect the on/off state of the device 600, the relative positioning of the components, such as the display and keypad of the device 600, the sensor assembly 614 may also detect the change in position of the device 600 or one of the components in the device 600, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and the change in temperature of the device 600. The sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 614 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate communication between the apparatus 600 and other devices in a wired or wireless manner. The apparatus 600 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing any one of the methods described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as memory 604, including instructions executable by processor 620 of apparatus 600 to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
Wherein the instructions in the storage medium, when executed by the processor, enable the apparatus 600 to perform a method for handling a terminal device, comprising:
acquiring event data of human eyes through a DVS acquisition module;
identifying an eye movement from the event data of the human eye;
And under the condition that the recognized eye movement meets the preset operation condition, controlling the terminal equipment to execute corresponding operation according to an operation control instruction obtained by the recognized eye movement.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any adaptations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The foregoing description of the preferred embodiments of the present disclosure is not intended to limit the disclosure, but rather to cover all modifications, equivalents, improvements and alternatives falling within the spirit and principles of the present disclosure.