CN109309790A

CN109309790A - Method and system for intelligent recording of conference slides

Info

Publication number: CN109309790A
Application number: CN201811302591.2A
Authority: CN
Inventors: 张叶; 许佳佳; 常旭岭
Original assignee: Changchun Mayor Guangxinyi Technology Co Ltd
Current assignee: Changchun Mayor Guangxinyi Technology Co Ltd
Priority date: 2018-11-02
Filing date: 2018-11-02
Publication date: 2019-02-05

Abstract

A method and system for intelligent recording of conference slides. The method comprises the following steps. Step 1: capture on-site video image data and audio data. Step 2: automatically identify the slide position in the scene. Step 3: automatically recognize the slide content and determine whether it is a new slide. Step 4: if it is a new slide, apply a view transformation to convert it to a front view. Step 5: store the captured video and audio data. The invention identifies and records slides automatically and adjusts them to the optimal viewing angle before storage, which helps with subsequent slide organization, character recognition, summary extraction, and keyword indexing. With the conference slide recorder, users no longer need to hold up a mobile phone to take photos during a meeting; they can concentrate on the talk itself without missing important slide information, and later organization is easier.

Description

Method and system for intelligent recording of conference slides

Technical field

The invention belongs to the field of artificial intelligence and relates to an intelligent image recognition and recording hardware device and a software processing system.

Background technique

With the continuous development of science and technology and the steady advance of computer vision and artificial intelligence, the ability to process image data and sound information has reached an intelligent level. Meeting records, however, such as photographs of slides, are still made by picking up a mobile phone and taking pictures; organizing them afterwards is laborious, and important content is easily missed.

Summary of the invention

The invention aims to provide a method and system for intelligent recording of conference slides that uses a computer vision analysis method and a camera device to automatically identify the position and content of slides in an image, adjust them to a front view, and record them.

A method for intelligent recording of conference slides, characterized by comprising the following steps:

Step 1: capture on-site video image data and audio data;

Step 2: automatically identify the slide position in the scene;

Step 3: automatically recognize the slide content and determine whether it is a new slide;

Step 4: if it is a new slide, apply a view transformation to the slide to convert it to a front view;

Step 5: store the captured video and audio data.

Further, the starting and stopping of audio and video capture can be selected manually.

Further, depending on the content, the audio can optionally be converted into a text record.

Further, a recognition model and a parsing model for video and audio of interest are established. The captured audio data and video images are parsed and checked against the user's interest list: if they are in the interest list they are saved, and if not they are deleted.

A system for intelligent recording of conference slides, comprising a video acquisition module, an audio collection module, a video control module, an audio control module, a data transmission module, a data processing module, a storage control module, a storage module, and a power module. The power module supplies power to the other modules. The data transmission module is bidirectionally connected to the data processing module and the storage control module.

The video acquisition module and the audio collection module send the captured video image data and audio data to the data processing module through the data transmission module. The data processing module automatically identifies the position of the PPT slide in the scene and captures the slide image in real time, determines whether it is a new slide, and applies a view transformation to convert the slide to a front view. The data processing module then sends a control signal through the data transmission module to the storage control module, which stores the image and audio data.

The video acquisition module is a camera with a corresponding bracket; the camera is mounted on the bracket and photographs the slides from an elevated position.

The video control module, data processing module, audio collection module, audio control module, storage control module, and storage module are implemented by a mobile phone app or another mobile device terminal.

The audio collection module and the data acquisition module further include a manual button.

Depending on the content, the system further includes a module that converts audio into a text record.

The storage module is local storage or cloud storage.

The data transmission module uses wired or wireless transmission; the wireless transmission is one or more of Bluetooth, Wi-Fi, 4G, or 5G signal transmission.

The data processing module is a local data processing module, a cloud processing module, or a hybrid processing module.

The local data processing module is a mobile phone app, a CPU, a GPU, or a dedicated chip.

The utility model has the advantages that

Meeting lantern slide intelligent recorder through the invention can complete the intelligent recognition of lantern slide and automatically record, and Adjustable is optimal viewing angle to store, and is subsequent lantern slide arrangement, character recognition, synopsis refinement, keyword index Valuable help is provided, passes through the meeting lantern slide intelligent recorder, user without lifting mobile phone photograph, Ke Yizhuan in a meeting It infuses in the content of report itself, and does not miss important slideshow information, subsequent arrangement is more convenient.

Description of the drawings

Fig. 1 is a schematic diagram of the system based on a mobile phone app;

Fig. 2 is a flow diagram of a slide position extraction algorithm of the invention.

Specific embodiment

The invention is further described below with reference to a specific embodiment: a method and system for intelligent recording of conference slides based on a mobile phone app.

Referring to Fig. 1, a schematic diagram of the system based on a mobile phone app is shown.

A method for intelligent recording of conference slides comprises at least the following steps:

Step 1: Connect to the mobile phone through the Wi-Fi link provided on the bracket and open the app.

Step 2: Capture the on-site video image data and audio data.

Specifically, on-site video image data can be captured by an external camera and on-site audio by the mobile phone's microphone. Capture is governed by the video and audio control modules, which can receive capture instructions from the data processing module through the data transmission module or receive them manually.

Step 3: Identify and segment the captured image data and extract the slide position.

In actual detection, the contrast of the input image is first enhanced to increase the contrast between the slide and the background and make the edges stand out. The Canny operator is then used to extract the image edges, and a contour extraction algorithm extracts the slide contour from the edge image. Finally, a quadrilateral is fitted to the slide contour to obtain the coordinates of the slide's four corners, from which the slide position is extracted.
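The three stages of this step (contrast enhancement, edge extraction, quadrilateral fitting) can be sketched in pure Python as follows. This is a simplified illustration under stated assumptions, not the patented algorithm: a gradient threshold stands in for the Canny operator, an extreme-point heuristic stands in for contour-based quadrilateral fitting and only suits slides that are not strongly rotated, and all function names and the threshold value are hypothetical. A practical implementation would more likely rely on library routines such as OpenCV's `cv2.Canny`, `cv2.findContours`, and `cv2.approxPolyDP`.

```python
def stretch_contrast(img):
    """Linearly stretch grayscale values to the full 0..255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to enhance
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def edge_points(img, threshold=64):
    """Return (x, y) pixels with a strong central-difference gradient.

    Simplified stand-in for the Canny operator (assumption).
    """
    h, w = len(img), len(img[0])
    pts = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            if abs(gx) + abs(gy) >= threshold:
                pts.append((x, y))
    return pts


def fit_quadrilateral(points):
    """Pick four corners as the extreme points of x + y and x - y.

    Returns corners in the order top-left, top-right, bottom-right,
    bottom-left (image coordinates, y pointing down).
    """
    tl = min(points, key=lambda p: p[0] + p[1])
    br = max(points, key=lambda p: p[0] + p[1])
    tr = max(points, key=lambda p: p[0] - p[1])
    bl = min(points, key=lambda p: p[0] - p[1])
    return [tl, tr, br, bl]


def locate_slide(img):
    """Return four corner coordinates of the slide, or None if no edges."""
    pts = edge_points(stretch_contrast(img))
    if len(pts) < 4:
        return None
    return fit_quadrilateral(pts)
```

Here `img` is a grayscale image given as a list of rows of integers. Because the gradient is taken over a two-pixel span, the corners returned can be off by about one pixel.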

Step 4: Optionally convert the audio recording into a text transcript of the talk.

Speech recognition converts the speech signal into text. A signal processing module extracts the most important features of the speech according to the auditory perception characteristics of the human ear and converts the speech signal into a sequence of feature vectors. The acoustic features may be linear predictive coding (LPC), Mel-frequency cepstral coefficients (MFCC), Mel-scale filter bank (FBank) features, and so on.
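MFCC and FBank features both rest on the mel scale, which spaces frequencies the way the ear perceives them. The sketch below shows only that piece: the conversion between hertz and mel and the center frequencies of a mel filter bank. It assumes one common form of the mel formula (the text does not say which variant is used), and the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (a common 2595 * log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of filters equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Filters placed this way are dense at low frequencies and sparse at high frequencies, which is what makes the resulting features follow human auditory perception.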

The decoder converts the input sequence of speech feature vectors into a character sequence according to the acoustic model and the language model.

The acoustic model represents knowledge about acoustics, phonetics, environmental variables, and differences in speaker gender, accent, and so on. Acoustic models commonly use deep-neural-network-based RNNs and LSTMs. The language model represents knowledge about how sequences of words are formed; an N-gram model can be used.
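As an illustration of the N-gram language model mentioned above, the following is a minimal bigram (N = 2) model with add-one smoothing. It is a sketch only: the class name, the sentence markers `<s>` and `</s>`, and the smoothing choice are assumptions, and a real recognizer would train on a far larger corpus and combine these scores with acoustic model scores in the decoder.

```python
from collections import Counter


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often each word precedes another
        self.bigram_counts = Counter()   # how often each word pair occurs
        self.vocab = set()
        for words in sentences:
            tokens = ["<s>"] + list(words) + ["</s>"]
            self.vocab.update(tokens)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens[:-1], tokens[1:]))

    def prob(self, prev, word):
        """Smoothed estimate of P(word | prev)."""
        numerator = self.bigram_counts[(prev, word)] + 1
        return numerator / (self.context_counts[prev] + len(self.vocab))

    def score(self, words):
        """Probability of a whole word sequence under the model."""
        tokens = ["<s>"] + list(words) + ["</s>"]
        p = 1.0
        for prev, word in zip(tokens[:-1], tokens[1:]):
            p *= self.prob(prev, word)
        return p
```

A decoder can use `score` to prefer the candidate transcription whose word order is more plausible when two candidates sound alike.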

Step 5: Apply the view transformation and store the slide in front view.

After the four corner points of the slide are extracted, the perspective transformation matrix is computed first, and the slide region is then transformed pixel by pixel according to that matrix to obtain the front-view image.

The basic formula of perspective transform is as follows:

Here, [u, v, w] are the homogeneous image coordinates before the transformation; for a two-dimensional image, w is always 1. [x', y', w'] are the image coordinates after the transformation, which can be converted into two-dimensional coordinates. The element a₃₃ of the transformation matrix is always 1, so 8 parameters need to be computed; the 4 corner coordinates yield 8 equations, which are solved for these 8 unknown transformation parameters.
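The formula referenced above is not reproduced in this text. A standard row-vector form consistent with these definitions (an assumption rather than a reproduction of the original figure) is given below, together with the two linear equations that each corner correspondence (u_i, v_i) → (x_i, y_i) contributes, which is where the 8 equations come from:

```latex
\begin{aligned}
[x',\; y',\; w'] &= [u,\; v,\; w]
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix},
\qquad w = 1,\quad a_{33} = 1, \\[4pt]
x = \frac{x'}{w'} &= \frac{a_{11}u + a_{21}v + a_{31}}{a_{13}u + a_{23}v + 1},
\qquad
y = \frac{y'}{w'} = \frac{a_{12}u + a_{22}v + a_{32}}{a_{13}u + a_{23}v + 1}, \\[4pt]
a_{11}u_i + a_{21}v_i + a_{31} - a_{13}u_i x_i - a_{23}v_i x_i &= x_i, \\
a_{12}u_i + a_{22}v_i + a_{32} - a_{13}u_i y_i - a_{23}v_i y_i &= y_i,
\qquad i = 1,\dots,4 .
\end{aligned}
```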

Once the transformation matrix has been obtained, the coordinates in the transformed image of each pixel in the slide region can be computed one by one by the following formula, completing the transformation.
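The formula is not reproduced in this text; with a₃₃ fixed to 1, the usual per-pixel mapping is a ratio of linear terms. The pure-Python sketch below solves the 8 equations from four corner correspondences and then rectifies a slide region. It is an illustration under assumptions: the eight parameters are stored in a flat list `h`, the output image is filled by inverse mapping (output pixel to source pixel) with nearest-neighbour sampling, and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """Return 8 parameters mapping four src points onto four dst points."""
    a, b = [], []
    for (u, v), (x, y) in zip(src, dst):
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    return solve_linear(a, b)


def apply_perspective(h, u, v):
    """Map one point; the ninth matrix element is fixed to 1."""
    w = h[6] * u + h[7] * v + 1.0
    return ((h[0] * u + h[1] * v + h[2]) / w,
            (h[3] * u + h[4] * v + h[5]) / w)


def rectify(image, corners, out_w, out_h):
    """Produce a front-view image of the quadrilateral given by corners.

    corners are ordered top-left, top-right, bottom-right, bottom-left.
    """
    target = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]
    h = perspective_matrix(target, corners)  # output pixel -> source pixel
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            sx, sy = apply_perspective(h, x, y)
            ix = min(max(int(round(sx)), 0), cols - 1)
            iy = min(max(int(round(sy)), 0), rows - 1)
            row.append(image[iy][ix])
        out.append(row)
    return out
```

Inverse mapping is used here so that every output pixel receives a value; mapping source pixels forward would leave holes where the slide is stretched.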

Step 6: Automatically detect whether the slide content has changed and, once a change is detected, start storage automatically.
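The text does not say how a content change is detected. One simple possibility, shown here purely as an assumption, is to compare each rectified slide image with the last one stored and treat a large mean absolute pixel difference as a new slide. The class name and the threshold value are hypothetical, and both images are assumed to be grayscale and the same size.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized grayscale images."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


class SlideChangeDetector:
    """Flags a frame as a new slide when it differs enough from the last stored one."""

    def __init__(self, threshold=12.0):
        self.threshold = threshold  # hypothetical value; tune per camera
        self.last_stored = None

    def is_new_slide(self, frame):
        changed = (
            self.last_stored is None
            or mean_abs_diff(frame, self.last_stored) > self.threshold
        )
        if changed:
            self.last_stored = [row[:] for row in frame]
        return changed
```

Comparing against the last stored slide rather than the previous frame keeps slow fades or gradual lighting changes from slipping under the threshold one frame at a time.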

Step 7: Retain the video image data or the audio data.

Step 8: Return to step 2.
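Taken together, steps 2 to 8 form a loop over captured frames. The sketch below shows only that control flow, with each processing stage passed in as a function so that nothing about its implementation is assumed. The function name `record_slides` and its parameters are hypothetical, and audio handling is omitted for brevity.

```python
def record_slides(frames, locate, rectify, is_new, store):
    """Run the capture loop over frames and return how many slides were stored.

    locate(frame)           -> corner coordinates, or None if no slide is found
    rectify(frame, corners) -> front-view slide image
    is_new(slide)           -> True if the slide content has changed
    store(slide)            -> persists the slide (local or cloud)
    """
    stored = 0
    for frame in frames:          # step 2: next captured frame
        corners = locate(frame)   # step 3: slide position
        if corners is None:
            continue
        slide = rectify(frame, corners)  # step 5: front view
        if is_new(slide):                # step 6: content changed?
            store(slide)                 # step 7: retain the data
            stored += 1
    return stored                 # step 8 is the loop itself
```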

Referring to Fig. 1, which is a schematic diagram of the conference slide recorder, the invention also provides a system for intelligent recording of conference slides, comprising a video acquisition module, a video control module, an audio collection module, an audio control module, a manual control module, a data transmission module, a data processing module, a storage control module, a storage module, and a power module; the power module supplies power to the other modules. The data transmission module is connected to the video acquisition module, the video control module, the audio collection module, the audio control module, the data processing module, the storage control module, and the storage module. The video acquisition module and the audio collection module pass the video image data and audio data collected on site to the data processing module through the data transmission module. The data processing module passes control signals through the data transmission module to the storage module, the video control module, and the audio control module. The video control module and the audio control module can be operated manually through the manual control module.

The data processing module performs intelligent video analysis: it identifies and parses the captured video image data and audio data and judges whether the position and content of the parsed slide have changed. The speech of the talk can optionally be recorded directly or converted into a written record. The processing module can be the mobile phone or a separate processing module.

The system can use wired or wireless transmission, and storage can be local or in the cloud. The video acquisition module is a camera and the audio collection module is a microphone: on-site video is captured by the camera and on-site audio by the microphone. Video and audio capture can be switched on and off manually, or, while switched off, opened and closed on a timer by the video and audio control modules at a set time interval. The microphone can be provided by the mobile phone or by a separate module.

For example, with video and audio capture switched on, the captured data is passed to the data processing module through the data transmission module. The captured image data is recognized and segmented, and named entities are recognized and converted into text; the audio data is converted to text by speech recognition and fed into the semantic parsing system. If the parsed scene and the corresponding key dialogue are not in the user's interest list, the system does not store the captured data and shuts down the capture equipment, then reopens it after a certain time interval and continues capturing. If they are in the user's interest list, the data is stored. The camera can switch between high-definition and low-definition modes.
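The keep-or-discard decision in this example can be sketched as follows. This is a deliberately simplified stand-in: plain case-insensitive substring matching replaces the named-entity recognition and semantic parsing system described above, and the function names and record fields (`transcript`, `slide_text`) are hypothetical.

```python
def matches_interest(transcript, slide_text, interest_list):
    """Return True if any interest term appears in the transcript or slide text."""
    text = (transcript + " " + slide_text).lower()
    return any(term.lower() in text for term in interest_list)


def filter_recording(record, interest_list):
    """Return the record if it should be stored, or None if it should be dropped.

    record is a dict with 'transcript' and 'slide_text' string fields.
    """
    if matches_interest(record["transcript"], record["slide_text"], interest_list):
        return record
    return None
```

A caller would store the record when `filter_recording` returns it and, when it returns `None`, pause capture and retry after the configured interval.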

The data processing module can be a local data processing module or a cloud processing module. The local data processing module can be a CPU, a GPU, or a dedicated chip. The wireless transmission can be Bluetooth, Wi-Fi, 4G, or 5G signal transmission. The video or audio output device is a display, which can be provided by the mobile phone.

Those skilled in the art should further appreciate that the exemplary units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the invention.

The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two, or can be integrated into a mobile phone. The software module can reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and their features.

Although embodiments of the invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, replace, and vary them within the scope of the invention.

The specific embodiments described above do not limit the scope of protection of the invention. Any other corresponding changes and modifications made according to the technical concept of the invention shall fall within the scope of protection of the claims.

Claims

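1. A method for intelligent recording of conference slides, characterized by comprising the following steps:

Step 1: capture on-site video image data and audio data;

Step 2: automatically identify the slide position in the scene;

Step 3: automatically recognize the slide content and determine whether it is a new slide;

Step 4: if it is a new slide, apply a view transformation to the slide to convert it to a front view;

Step 5: store the captured video and audio data.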

2. The method for intelligent recording of conference slides according to claim 1, characterized in that the starting and stopping of audio and video capture can be selected manually.

3. The method for intelligent recording of conference slides according to claim 1, characterized in that, depending on the content, the audio is optionally converted into a text record.

4. The method for intelligent recording of conference slides according to claim 1, characterized in that a recognition model and a parsing model for video and audio of interest are established; the captured audio data and video images are parsed and checked against the user's interest list, saved if they are in the interest list, and deleted if they are not.

5. A system for intelligent recording of conference slides, characterized by comprising a video acquisition module, an audio collection module, a video control module, an audio control module, a data transmission module, a data processing module, a storage control module, a storage module, and a power module; the power module supplies power to the other modules; the data transmission module is bidirectionally connected to the data processing module and the storage control module;

the video acquisition module and the audio collection module send the captured video image data and audio data to the data processing module through the data transmission module; the data processing module automatically identifies the position of the PPT slide in the scene and captures the slide image in real time, determines whether it is a new slide, and applies a view transformation to convert the slide to a front view; the data processing module then sends a control signal through the data transmission module to the storage control module, which stores the image and audio data.

6. The system for intelligent recording of conference slides according to claim 5, characterized in that the video acquisition module is a camera with a corresponding bracket; the camera is mounted on the bracket and photographs the slides from an elevated position.

7. The system for intelligent recording of conference slides according to claim 6, characterized in that the video control module, data processing module, audio collection module, audio control module, storage control module, and storage module are implemented by a mobile phone app or another mobile device terminal.

8. The system for intelligent recording of conference slides according to claim 7, characterized in that the audio collection module and the data acquisition module further include a manual button.

9. The system for intelligent recording of conference slides according to claim 8, characterized in that, depending on the content, the system further includes a module that converts audio into a text record.

10. The system for intelligent recording of conference slides according to any one of claims 5 to 9, characterized in that the storage module is local storage or cloud storage.

11. The system for intelligent recording of conference slides according to any one of claims 5 to 9, characterized in that the data transmission module uses wired or wireless transmission, and the wireless transmission is one or more of Bluetooth, Wi-Fi, 4G, or 5G signal transmission.

12. The system for intelligent recording of conference slides according to any one of claims 5 to 9, characterized in that the data processing module is a local data processing module, a cloud processing module, or a hybrid processing module.

13. The system for intelligent recording of conference slides according to claim 12, characterized in that the local data processing module is a mobile phone app, a CPU, a GPU, or a dedicated chip.