
CN111147894A - Sign language video generation method, device and system - Google Patents


Info

Publication number
CN111147894A
CN111147894A (application CN201911251154.7A)
Authority
CN
China
Prior art keywords
sign language; video; stream data; word segmentation; data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911251154.7A
Other languages
Chinese (zh)
Inventor
金国卿 (Jin Guoqing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suning Intelligent Terminal Co ltd
Original Assignee
Suning Intelligent Terminal Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Intelligent Terminal Co ltd filed Critical Suning Intelligent Terminal Co ltd
Priority to CN201911251154.7A priority Critical patent/CN111147894A/en
Publication of CN111147894A publication Critical patent/CN111147894A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234336Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by media transcoding, e.g. video is transformed into a slideshow of still pictures or audio is converted into text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74Browsing; Visualisation therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4316Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g 3D video

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Library & Information Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses a sign language video generation method, device, and system. The method comprises: processing received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data; searching for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images; and sorting and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video, which is sent to the user side for presentation. Text stream data is thereby converted into a sign language video that hearing-impaired users can watch, making videos accessible to them and improving their viewing experience.

Description

Sign language video generation method, device and system
Technical Field
The invention relates to the technical field of computers, in particular to a method, a device and a system for generating a sign language video.
Background
When watching video content such as television, a hearing-impaired user often cannot follow the video without subtitles. Even when subtitles are present, users with limited literacy may be unable to understand them accurately, so subtitles alone do not make the video watchable. This causes great inconvenience to hearing-impaired users and significantly degrades their viewing experience.
Disclosure of Invention
To remedy these defects of the prior art, the present invention mainly aims to provide a sign language video generation method, device, and system, as well as a computer system.
In order to achieve the above object, a first aspect of the present invention provides a method for generating a sign language video, the method comprising:
processing received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
searching for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images;
and sorting and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video, which is sent to a user side for presentation.
In some embodiments, prior to processing the received text stream data using natural language processing techniques, the method further comprises:
receiving voice data and converting it into text stream data.
In some embodiments, after receiving the voice data and converting to text stream data, the method further comprises:
sending the text stream data to the user side so that the user side can display it as subtitles.
In some embodiments, sorting and combining the sign language images according to the dependency parsing result to generate the sign language video specifically includes:
arranging the sign language image data in order according to the dependency syntax analysis result to obtain sequentially arranged sign language image data;
and applying the sequentially arranged sign language image data to a pre-constructed virtual character to generate the sign language video.
In a second aspect, the present invention provides a method for generating a sign language video, where the method includes:
the server side processes the received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
the server side searches for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images;
the server side sorts and combines the sign language image data according to the dependency syntax analysis result to generate a sign language video and sends it to the user side;
and the user side receives and displays the sign language video.
In a third aspect, the present invention provides an apparatus for generating sign language video, the apparatus comprising:
a communication module, configured to receive text stream data and send the generated sign language video to a user side;
a processing module, configured to process the text stream data;
a data storage module, configured to store the sign language image data and the mapping between word segments and sign language images;
and a video generation module, configured to sort and combine the sign language images to generate the sign language video.
In a fourth aspect, the present invention provides a sign language video generating system, including:
the server side, configured to process the text stream data, match corresponding sign language images according to the word segmentation result, generate sign language video data, and send it to the user side;
and the user side, configured to receive and present the sign language video data returned by the server side.
In a fifth aspect, the present invention provides a computer system, the system comprising:
one or more processors;
and memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
processing received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
searching for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images;
and sorting and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video, which is sent to a user side for presentation.
According to the specific embodiments provided herein, the present application discloses the following technical effects:
text stream data is processed using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result; the corresponding sign language image data is retrieved according to the mapping between word segments and sign language images, then sorted and combined according to the dependency syntax analysis result to generate a sign language video. Text stream data is thus converted into a sign language video that hearing-impaired users can watch, making videos accessible to them and improving their viewing experience;
voice data is converted into text stream data in real time, enabling end-to-end conversion from voice data to sign language video;
the text stream data is also displayed as subtitles, so that even when playing sound is inconvenient, viewers can still learn what the video's audio conveys, improving the efficiency and convenience of video watching for all users.
Of course, any specific product implementing the present application need not achieve all of the above advantages at the same time.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a scene diagram of the present application;
FIG. 2 is a scenario flow diagram of the present application;
FIG. 3 is a flow chart of a method of the present application;
FIG. 4 is a flow chart of a method of the present application;
FIG. 5 is a diagram of the apparatus structure of the present application;
fig. 6 is a computer system configuration diagram of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Taking a smart television as an example: because a hearing-impaired user cannot hear the audio played by the television, the user cannot watch a video played on the smart television normally if that video has no subtitles.
To improve the viewing experience of hearing-impaired users, the invention provides a sign language video generation method: voice data is converted into text stream data; the text stream data is processed to obtain a word segmentation result and a dependency syntax analysis result; matching sign language images are retrieved from a sign language library according to the word segmentation result; the matched images are ordered according to the dependency syntax analysis result; and the ordered images are applied to a pre-constructed virtual character to generate a sign language video for the hearing-impaired user, who can then watch the video even without subtitles.
The dependency parsing described in the present invention determines the syntactic structure of a sentence, i.e. the dependency relations between its words. It involves two aspects: one is the grammar system of the language, i.e. a formal definition of the grammatical structure of well-formed sentences; the other is the syntactic analysis technique, i.e. automatically deriving the syntactic structure of a sentence according to a given grammar system by analyzing the syntactic units the sentence contains and the relations between them. A syntactic parse tree is constructed from the dependency analysis result, and the arrangement order of the target sentence is determined from the parse tree.
Fig. 1 shows the system structure of the present invention, which includes a server side and a user side. The server side can be a service provider, such as a cloud server, with communication, word segmentation, video generation, data storage, and related capabilities. The user side can be any device with communication and display functions, such as a smart television, computer, mobile phone, or tablet. Voice data is uploaded to the server side over the Internet; the server side generates a sign language video from the voice data and returns it to the user side, which renders the received sign language video at a suitable size on the screen for the user to watch.
A sign language image is a pre-drawn picture of a sign language gesture that conveys meaning through sign language; it can be applied to a pre-constructed virtual character to generate a sign language video.
The invention can also be used for communication between hearing users and hearing-impaired users. A hearing user uploads voice data to the server side through a first user side; the server side converts the received voice data into text stream data, generates a sign language video from it using the sign language video generation method provided by the invention, and sends both the text stream data and the sign language video to a second user side used by the hearing-impaired user, so that the hearing-impaired user can communicate freely with other users.
Specifically, as shown in fig. 2, taking the smart television as an example of a user side, the above scheme can be specifically implemented by the following steps:
210. The smart television and the server side establish a communication connection through a TCP three-way handshake.
After the connection is established, the server side loads its local ASR (automatic speech recognition) module, obtains the address of the smart television from the request, and checks whether the currently connected smart television is authorized to call the requested interface. This interface provides functions such as natural language analysis, voice data conversion, and text word segmentation, in preparation for receiving the voice data uploaded by the smart television.
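A hypothetical sketch of the authorization check just described: before accepting voice data, the server verifies that the connected client's address may call the requested interface. The address format, whitelist, and interface names below are illustrative assumptions, not the patent's implementation.

```python
# Hypothetical authorization gate for the server's ASR/NLP interface.
# Whitelist contents and interface names are made-up examples.

AUTHORIZED_CLIENTS = {"10.0.0.17"}                              # assumed whitelist
KNOWN_INTERFACES = {"asr", "word_segmentation", "nlp_analysis"}  # assumed names

def handle_interface_request(client_address, interface):
    """Admit the call only for authorized clients and known interfaces."""
    if client_address not in AUTHORIZED_CLIENTS:
        return {"status": "denied", "reason": "client not authorized"}
    if interface not in KNOWN_INTERFACES:
        return {"status": "denied", "reason": "unknown interface"}
    return {"status": "ok", "interface": interface}

print(handle_interface_request("10.0.0.17", "asr"))   # ok
print(handle_interface_request("10.9.9.9", "asr"))    # denied
```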
220. The smart television uploads real-time voice data to the cloud server.
230. The server side receives the real-time voice data and converts it into text stream data.
The server side calls the ASR speech recognition module to convert the voice data uploaded by the smart television into text stream data, laying the groundwork for the subsequent steps.
240. The server side returns the converted text stream data to the smart television.
250. The smart television displays the received text stream data in real time.
When the smart television determines that the currently playing video has no subtitles, or when the user issues a subtitle display command, it displays the received text stream data on the screen for the user to read.
260. The server side performs word segmentation and dependency syntax analysis on the text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result.
270. The server side searches for sign language image data corresponding to the word segmentation result according to the pre-stored mapping between word segments and sign language images.
From the word segmentation result, the server side removes words that cannot be expressed in sign language, such as articles, then looks up the sign language image corresponding to each remaining word to obtain the sign language image data corresponding to the word segmentation result.
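Step 270 can be sketched as a filter over the segmentation result followed by a dictionary lookup. The stop-word list, mapping contents, and file names below are all illustrative assumptions:

```python
# Illustrative sketch of step 270: drop words that have no sign-language
# equivalent, then look each remaining word up in the pre-stored
# word-to-sign-image mapping. All data here is made up.

UNTRANSLATABLE = {"的", "了", "a", "the"}   # assumed stop-word list

SIGN_IMAGE_MAP = {                          # assumed word -> image mapping
    "我": "sign_me.png",
    "看": "sign_watch.png",
    "电视": "sign_tv.png",
}

def lookup_sign_images(segments):
    """Map a word-segmentation result to sign-language image files,
    skipping words that cannot be expressed in sign language or are
    missing from the library."""
    images = []
    for word in segments:
        if word in UNTRANSLATABLE:
            continue
        image = SIGN_IMAGE_MAP.get(word)
        if image is not None:
            images.append(image)
    return images

print(lookup_sign_images(["我", "看", "了", "电视"]))
# ['sign_me.png', 'sign_watch.png', 'sign_tv.png']
```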
280. The server side sorts the obtained sign language image data according to the dependency syntax analysis result and applies the sorted sign language image data to a pre-constructed virtual character to generate a sign language animation video.
The server side sorts the sign language image data according to the dependency syntax analysis result and other factors affecting the order, such as logical relations, and then applies the sorted sign language image data to the pre-constructed virtual character to generate the sign language animation video.
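A minimal sketch of this ordering step, assuming a simple priority over dependency relations (topic first, verb last) purely for illustration; the patent does not specify the actual ordering rule, and a real system would derive it from the parse tree and sign-language grammar:

```python
# Illustrative sketch of step 280: order matched sign images using the
# dependency analysis result, producing a clip list an animation component
# could render. The relation priorities are an assumption.

RELATION_PRIORITY = {"nsubj": 0, "obj": 1, "root": 2}   # assumed ordering rule

def order_signs(tokens):
    """tokens: list of (word, relation, image) produced by segmentation,
    parsing, and image lookup. Returns images sorted by assumed priority;
    unknown relations sort last."""
    ranked = sorted(tokens, key=lambda t: RELATION_PRIORITY.get(t[1], 99))
    return [image for _, _, image in ranked]

tokens = [
    ("看", "root", "sign_watch.png"),
    ("我", "nsubj", "sign_me.png"),
    ("电视", "obj", "sign_tv.png"),
]
print(order_signs(tokens))
# ['sign_me.png', 'sign_tv.png', 'sign_watch.png']
```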
The virtual character may be created in advance using Unity technology, which can render image content such as 3D animation.
290. The server side sends the generated sign language animation video to the smart television.
After receiving the sign language animation video from the server side, the smart television displays it in a small window at the lower right corner of the current video, making it convenient for the user to watch.
Example one
In correspondence with the above steps, an embodiment of the present invention provides a method for generating a sign language video, applied to the server side. As shown in fig. 3, the method includes:
310. Processing received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
320. searching for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images;
330. sorting and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video, which is sent to a user side for presentation.
When the user side receives the sign language video, it can display it on the screen for the user to watch, improving the experience of hearing-impaired users watching video programs.
Preferably, when the user side sends voice data to the server side, before processing the received text stream data using natural language processing technology, the method further includes:
301. Receiving voice data and converting it into text stream data.
preferably, in order to improve the user experience of watching the video, the text stream data generated by conversion can be sent to the user terminal to generate subtitles for display; after receiving the voice data and converting the voice data into text stream data, the method further comprises:
302. and sending the text stream data to a user side so that the user side can generate subtitles to display.
After receiving the text stream data, the user terminal can display the current video as a subtitle when judging that the current video has no subtitle or when the user sends a command for displaying the subtitle.
Preferably, sorting and combining the sign language image data according to the dependency syntax analysis result may specifically include:
331. arranging the sign language image data in order according to the dependency syntax analysis result to obtain sequentially arranged sign language image data;
332. applying the sequentially arranged sign language image data to a pre-constructed virtual character to generate the sign language video.
The virtual character may be a pre-constructed 3D character image created in advance through Unity technology.
Example two
Corresponding to the above embodiment, the present application further provides a method for generating a sign language video that implements the interaction between the user side and the server side. As shown in fig. 4, the method includes:
410. The server side processes the received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
420. the server side searches for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images;
430. the server side sorts and combines the sign language image data according to the dependency syntax analysis result to generate a sign language video and sends it to the user side;
440. the user side receives and displays the sign language video.
Preferably, when the user side sends voice data to the server side, before the server side processes the received text stream data using natural language processing technology, the method further includes:
401. The server side receives the voice data and converts it into text stream data.
Preferably, to improve the experience of watching the video, the server side may send the converted text stream data to the user side to be displayed as subtitles; after receiving the voice data and converting it into text stream data, the method further includes:
402. The server side sends the text stream data to the user side;
403. the user side generates subtitles for presentation.
After receiving the text stream data, the user side can display it as subtitles when it determines that the current video has no subtitles, or when the user issues a subtitle display command.
Preferably, the server side sorting and combining the sign language image data according to the dependency syntax analysis result may specifically include:
431. the server side arranges the sign language image data in order according to the dependency syntax analysis result to obtain sequentially arranged sign language image data;
432. applying the sequentially arranged sign language image data to a pre-constructed virtual character to generate the sign language video.
EXAMPLE III
Corresponding to the first embodiment, the present application provides a sign language video generation device that runs on the server side. As shown in fig. 5, the device includes:
a communication module 510, configured to receive text stream data and send the generated sign language video to a user side;
Preferably, when the user side sends voice data, the communication module may also be configured to receive the voice data sent by the user side.
a processing module 520, configured to process the text stream data to obtain a word segmentation result and a dependency parsing result of the text stream data;
a data storage module 530, configured to store the sign language image data and the mapping between word segments and sign language images;
The data storage module comprises a sign language library, which stores the sign language image data and the mapping between word segments and sign language images and provides the sign language image data;
and a video generation module 540, configured to sort and combine the sign language images to generate the sign language video.
Preferably, to generate the sign language video from voice data sent by the user side, the sign language video generation device may further include:
a voice conversion module 550, configured to convert the voice data into text stream data.
Example four
Corresponding to the second embodiment, the present application further provides a sign language video generation system, as shown in fig. 1, including a user side and a server side:
the server side, configured to process the text stream data, match corresponding sign language images according to the word segmentation result, generate sign language video data, and send it to the user side;
and the user side, configured to receive and present the sign language video data returned by the server side.
EXAMPLE five
In accordance with the above embodiments, the present application also provides a computer system comprising one or more processors; and memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
processing received text stream data using natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
searching for sign language image data corresponding to the word segmentation result according to a pre-stored mapping between word segments and sign language images;
and sorting and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video, which is sent to a user side for presentation.
Fig. 6 illustrates an architecture of a computer system, which may include, in particular, a processor 1510, a video display adapter 1511, a disk drive 1512, an input/output interface 1513, a network interface 1514, and a memory 1520. The processor 1510, video display adapter 1511, disk drive 1512, input/output interface 1513, network interface 1514, and memory 1520 may be communicatively coupled via a communication bus 1530.
The processor 1510 may be implemented by a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solution provided by the present application.
The memory 1520 may be implemented in the form of ROM (read-only memory), RAM (random access memory), a static storage device, a dynamic storage device, or the like. The memory 1520 may store an operating system 1521 for controlling the operation of the computer system 1500 and a basic input/output system (BIOS) for controlling its low-level operations. In addition, a web browser 1523, a data storage management system 1524, an icon font processing system 1525, and the like may also be stored. The icon font processing system 1525 may be the application program that implements the operations of the foregoing steps in this embodiment of the application. In summary, when the technical solution provided by the present application is implemented in software or firmware, the relevant program code is stored in the memory 1520 and called for execution by the processor 1510.
The input/output interface 1513 is used for connecting an input/output module to realize information input and output. The input/output module may be configured as a component in the device (not shown) or may be external to the device to provide a corresponding function. Input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, and the like; output devices may include a display, a speaker, a vibrator, an indicator light, and the like.
The network interface 1514 is used to connect a communication module (not shown) so that the device can interact with other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
The bus 1530 includes a path to transfer information between the various components of the device, such as the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, and the memory 1520.
In addition, the computer system 1500 may also obtain information of specific extraction conditions from the virtual resource object extraction condition information database 1541 for performing condition judgment, and the like.
It should be noted that although only the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, the memory 1520, the bus 1530, and the like are shown above, in a specific implementation the device may also include other components necessary for proper operation. Furthermore, those skilled in the art will understand that the device described above may include only the components necessary to implement the solution of the present application, rather than all of the components shown in the figures.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a cloud server, or a network device) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present application.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments, being substantially similar to the method embodiments, are described relatively briefly; for relevant details, reference may be made to the corresponding descriptions of the method embodiments. The system embodiments described above are only illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without inventive effort.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for generating sign language video, the method comprising:
processing received text stream data by using a natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
searching and obtaining sign language image data corresponding to the word segmentation result according to a mapping relation between pre-stored word segmentation and sign language images;
and sequencing and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video and sending the sign language video to a user side for presentation by the user side.
2. The method of generating as claimed in claim 1, wherein prior to processing the received text stream data using natural language processing techniques, the method further comprises:
receiving voice data and converting the voice data into text stream data.
3. The method of generating as claimed in claim 2, wherein after receiving voice data and converting to text stream data, the method further comprises:
and sending the text stream data to the user side so that the user side can generate and display subtitles.
4. The method according to any of claims 1-3, wherein the sequencing and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video specifically comprises:
sequentially arranging the sign language image data according to the dependency syntax analysis result to obtain sequentially arranged sign language image data;
and endowing the sequentially arranged sign language image data to a pre-constructed virtual role to generate a sign language video.
5. A method for generating sign language video, the method comprising:
the server side processes received text stream data by using a natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
the server side searches and obtains sign language image data corresponding to the word segmentation result according to a mapping relation between pre-stored word segmentation and sign language images;
the server side sorts and combines the sign language image data according to the dependency syntax analysis result to generate a sign language video and sends the sign language video to the user side;
and the user side receives and displays the sign language video.
6. An apparatus for generating sign language video, the apparatus comprising:
the communication module is used for receiving text stream data and sending the generated sign language video to a user side;
the processing module is used for processing the text stream data to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
the data storage module is used for storing the sign language image data and the mapping relation between the participles and the sign language images;
and the video generation module is used for sequencing and combining the sign language image data to generate a sign language video.
7. The generation apparatus of claim 6, wherein the apparatus further comprises:
and the voice conversion module is used for converting the voice data into character stream data.
8. The generating device of claim 6 or 7, wherein the communication module is further configured to transmit the text stream data to a user terminal.
9. A system for generating sign language video, the system comprising:
the server is used for processing text stream data, matching corresponding sign language images according to a word segmentation result, generating sign language video data, and sending the sign language video data to the user side;
and the user side is used for receiving and presenting the sign language video data returned by the server side.
10. A computer system, the system comprising:
one or more processors;
and memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
processing received text stream data by using a natural language processing technology to obtain a word segmentation result and a dependency syntax analysis result of the text stream data;
searching and obtaining sign language image data corresponding to the word segmentation result according to a mapping relation between pre-stored word segmentation and sign language images;
and sequencing and combining the sign language image data according to the dependency syntax analysis result to generate a sign language video and sending the sign language video to a user side for presentation by the user side.
CN201911251154.7A 2019-12-09 2019-12-09 Sign language video generation method, device and system Pending CN111147894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911251154.7A CN111147894A (en) 2019-12-09 2019-12-09 Sign language video generation method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911251154.7A CN111147894A (en) 2019-12-09 2019-12-09 Sign language video generation method, device and system

Publications (1)

Publication Number Publication Date
CN111147894A true CN111147894A (en) 2020-05-12

Family

ID=70517894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911251154.7A Pending CN111147894A (en) 2019-12-09 2019-12-09 Sign language video generation method, device and system

Country Status (1)

Country Link
CN (1) CN111147894A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120040476A (en) * 2010-10-19 2012-04-27 이광우 Sign language translating device and sign language translating method
CN108074569A (en) * 2017-12-06 2018-05-25 安徽省科普产品工程研究中心有限责任公司 A kind of intelligence voice identifies in real time and methods of exhibiting
CN110347867A (en) * 2019-07-16 2019-10-18 北京百度网讯科技有限公司 Method and apparatus for generating lip motion video

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113903224A (en) * 2021-11-01 2022-01-07 浙江方泰显示技术有限公司 Interactive display system based on bidirectional signals
WO2023142590A1 (en) * 2022-01-30 2023-08-03 腾讯科技(深圳)有限公司 Sign language video generation method and apparatus, computer device, and storage medium
CN116561294A (en) * 2022-01-30 2023-08-08 腾讯科技(深圳)有限公司 Sign language video generation method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US11151765B2 (en) Method and apparatus for generating information
CN109947512B (en) Text adaptive display method, device, server and storage medium
CN109618181B (en) Live broadcast interaction method and device, electronic equipment and storage medium
CN111381909B (en) Page display method and device, terminal equipment and storage medium
US11758088B2 (en) Method and apparatus for aligning paragraph and video
CN110880324A (en) Voice data processing method and device, storage medium and electronic equipment
CN114064943A (en) Conference management method, conference management device, storage medium and electronic equipment
US20240303287A1 (en) Object recommendation method and apparatus, and electronic device
CN112364144A (en) Interaction method, device, equipment and computer readable medium
CN113850898A (en) Scene rendering method and device, storage medium and electronic equipment
CN111147894A (en) Sign language video generation method, device and system
CN114401337B (en) Data sharing method, device, equipment and storage medium based on cloud phone
CN110286776A (en) Input method, device, electronic equipment and the storage medium of character combination information
WO2025092132A1 (en) Data processing method and apparatus, and storage medium
CN112632241A (en) Method, device, equipment and computer readable medium for intelligent conversation
CN110689285A (en) Test method, test device, electronic equipment and computer readable storage medium
CN112114770A (en) Interface guiding method, device and equipment based on voice interaction
CN116991672A (en) Data monitoring method, device, equipment and medium
CN116431657A (en) Data query statement generation method, device, equipment, storage medium and product
JP2025504561A (en) Content presentation methods, devices, electronic devices and storage media
CN113850899A (en) Digital human rendering method, system, storage medium and electronic device
CN113852835A (en) Live audio processing method, device, electronic device and storage medium
CN111246248A (en) Statement meaning dynamic presentation method and device, electronic equipment and storage medium
CN119150834B (en) Content template generation method and device, storage medium and electronic equipment
CN114501112B (en) Method, apparatus, device, medium, and article for generating video notes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200512
