CN109309790A - Method and system for intelligent recording of conference slides - Google Patents

Method and system for intelligent recording of conference slides

Info

Publication number
CN109309790A
Authority
CN
China
Prior art keywords
module
slide
meeting
audio
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811302591.2A
Other languages
Chinese (zh)
Inventor
张叶
许佳佳
常旭岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun Changguang Xinyi Technology Co., Ltd.
Original Assignee
Changchun Changguang Xinyi Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun Changguang Xinyi Technology Co., Ltd.
Priority to CN201811302591.2A
Publication of CN109309790A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/66 Remote control of cameras or camera parts, e.g. by remote control devices
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A method and system for intelligent recording of conference slides. The method comprises the following steps: step 1, capture live video image data and audio data; step 2, automatically identify the slide position in the scene; step 3, automatically identify the slide content and determine whether it is a new slide; step 4, if it is a new slide, apply a view transformation to convert it to a front view; step 5, store the captured video and audio data. The invention identifies and records slides automatically and adjusts each slide to an optimal viewing angle before storage, which helps with later slide organization, character recognition, summary extraction, and keyword indexing. With the conference slide intelligent recorder, users no longer need to hold up a mobile phone to take photos during a meeting; they can concentrate on the presentation itself without missing important slide information, and later organization is easier.

Description

Method and system for intelligent recording of conference slides
Technical field
The invention belongs to the field of artificial intelligence and relates to an intelligent imaging, recognition, and recording hardware device and a software processing system.
Background art
With the continuous development of science and technology and the steady improvement of computer vision and artificial intelligence, the processing of image data and sound information has reached an intelligent level. Meeting records, however, such as photographs of slides, are still made by picking up a mobile phone and taking pictures; sorting them afterwards is laborious, and important content is easily missed.
Summary of the invention
The invention aims to provide a method and system for intelligent recording of conference slides which, by means of a computer-vision analysis method and a camera device, automatically identifies the position and information of slides in an image, adjusts the viewing angle to a front view, and records the result.
A method for intelligent recording of conference slides, characterized in that it comprises the following steps:
Step 1: capture live video image data and audio data;
Step 2: automatically identify the slide position in the scene;
Step 3: automatically identify the slide content and determine whether it is a new slide;
Step 4: if it is a new slide, apply a view transformation to the slide to convert it to a front view;
Step 5: store the captured video and audio data.
Further, the capture of the audio data and video images can be started and stopped manually.
Further, depending on the content, the audio can optionally be converted into a text record.
Further, a recognition model and a parsing system model are established for video and audio of interest; the captured audio data and video images are parsed and checked against the user's interest list, and are saved if they are in the interest list and deleted if they are not.
An intelligent recording system for conference slides, comprising a video capture module, an audio capture module, a video control module, an audio control module, a data transmission module, a data processing module, a storage control module, a storage module, and a power module. The power module supplies power to the above modules. The data transmission module is bidirectionally connected to the data processing module and the storage control module.
The video capture module and the audio capture module send the captured video image data and audio data to the data processing module through the data transmission module. The data processing module automatically identifies the position of the PPT slide in the scene and captures the slide image in real time, determines whether it is a new slide, and applies a view transformation to convert the slide to a front view. The data processing module then sends a control signal through the data transmission module to the storage control module, which stores the image and audio data.
The video capture module is a camera with a corresponding bracket; the camera is mounted on the bracket and photographs the slides from an elevated position.
The video control module, data processing module, audio capture module, audio control module, storage control module, and storage module are implemented by a mobile phone app or another mobile terminal device.
The audio capture module and the data acquisition module further include a manual button.
Depending on the content, the system further includes a module that converts the audio into a text record.
The storage module is local storage or cloud storage.
The data transmission module uses wired or wireless transmission; the wireless transmission is one or more of Bluetooth, Wi-Fi, 4G, or 5G signal transmission.
The data processing module is a local data processing module, a cloud processing module, or a hybrid processing module.
The local data processing module is a mobile phone app, a CPU, a GPU, or a dedicated chip.
Beneficial effects:
The conference slide intelligent recorder of the invention identifies and records slides automatically and adjusts each slide to an optimal viewing angle before storage, which helps with later slide organization, character recognition, summary extraction, and keyword indexing. With the recorder, users no longer need to hold up a mobile phone to take photos during a meeting; they can concentrate on the presentation itself without missing important slide information, and later organization is easier.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system based on a mobile phone app;
Fig. 2 is a flow diagram of a slide position extraction algorithm of the invention.
Detailed description of the embodiments
The invention is further described below with reference to a specific embodiment: a method and system for intelligent recording of conference slides based on a mobile phone app.
Fig. 1 shows a schematic diagram of the system based on a mobile phone app.
A method for intelligent recording of conference slides comprises at least the following steps:
Step 1: Connect the camera mounted on the bracket to the mobile phone over Wi-Fi and open the app.
Step 2: Capture the live video image data and audio data.
Specifically, live video image data can be captured by the external camera and live audio by the phone's microphone. Capture is controlled by the video and audio capture control modules, which can receive capture instructions from the data processing module through the data transmission module or be triggered manually.
Step 3: Identify and segment the captured image data and extract the slide position.
In actual detection, the contrast of the input image is first enhanced to increase the contrast between the slide and the background and make the edges stand out. The Canny operator is then used to extract the image edges, and a contour extraction algorithm extracts the slide contour from the edge image. Finally, a quadrilateral is fitted to the slide contour to obtain the coordinates of the slide's four corners, which give the slide position.
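The first and last stages of Step 3 can be illustrated with a minimal pure-Python sketch. It is not the patent's implementation: the function names `stretch_contrast` and `quad_corners` are hypothetical, the contrast enhancement is assumed to be a simple linear min-max stretch, and the quadrilateral fitting is replaced by a common corner-picking heuristic (extremes of x + y and x - y) that only works for roughly upright, convex slide outlines. The Canny edge detection and contour extraction stages are assumed to be supplied by an image library and are not shown.

```python
def stretch_contrast(gray):
    """Linearly stretch a grayscale image (list of rows of ints) to 0..255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy.
        return [row[:] for row in gray]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in gray]


def quad_corners(contour):
    """Pick four corners from a contour given as (x, y) points.

    Image coordinates are assumed (y grows downward). Returns the corners
    in the order top-left, top-right, bottom-right, bottom-left.
    """
    top_left = min(contour, key=lambda p: p[0] + p[1])
    bottom_right = max(contour, key=lambda p: p[0] + p[1])
    top_right = max(contour, key=lambda p: p[0] - p[1])
    bottom_left = min(contour, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]


# Hypothetical contour of a skewed slide: four corners plus edge midpoints.
contour = [(10, 20), (105, 12), (200, 5), (210, 78),
           (220, 150), (112, 155), (5, 160), (7, 90)]
corners = quad_corners(contour)
```

The four returned corners are the input that the view transformation in Step 5 needs.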
Step 4: Optionally, convert the audio into a text record of the talk.
Speech recognition converts the speech signal into text. The signal processing module extracts the most important features of the speech according to the auditory perception characteristics of the human ear and converts the speech signal into a sequence of feature vectors. Acoustic features can use linear predictive coding (LPC), mel-frequency cepstral coefficients (MFCC), mel-scale filter banks (FBank), and so on.
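Both MFCC and FBank features rest on the mel scale, which spaces frequencies the way the ear perceives them. The sketch below shows only that mapping and the resulting filter centre frequencies; it assumes the widely used formula mel = 2595 · log10(1 + f / 700), which the patent does not state, and the function names are illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Centre frequencies (Hz) of n_filters filters equally spaced in mels."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Ten filters between 0 Hz and 8 kHz: dense at low, sparse at high frequencies.
centers = mel_center_frequencies(10, 0.0, 8000.0)
```

Because the filters are evenly spaced in mels, their spacing in Hz widens with frequency, which is the perceptual weighting the paragraph above refers to.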
The decoder converts the input sequence of speech feature vectors into a character sequence according to the acoustic model and the language model.
The acoustic model represents knowledge about acoustics, phonetics, environmental variables, and speaker differences such as gender and accent. The acoustic model is commonly based on deep neural networks such as RNN and LSTM, while the language model represents knowledge about how a sequence of words is composed. The language model can use N-grams.
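As an illustration of an N-gram language model, here is a minimal bigram (N = 2) sketch with add-one smoothing. The class name `BigramModel`, the toy training sentences, and the choice of smoothing are assumptions made for the example; the patent only says that an N-gram model can be used.

```python
from collections import Counter


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        for words in sentences:
            tokens = ["<s>"] + list(words) + ["</s>"]
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens[:-1], tokens[1:]))
        # Every word that can follow a context, including end-of-sentence.
        self.vocab = {w for s in sentences for w in s} | {"</s>"}

    def prob(self, prev, word):
        """Smoothed estimate of P(word | prev)."""
        return (self.bigram_counts[(prev, word)] + 1) / (
            self.context_counts[prev] + len(self.vocab))

    def score(self, words):
        """Probability of a whole word sequence under the model."""
        tokens = ["<s>"] + list(words) + ["</s>"]
        p = 1.0
        for prev, word in zip(tokens[:-1], tokens[1:]):
            p *= self.prob(prev, word)
        return p


model = BigramModel([["next", "slide", "please"],
                     ["next", "slide"],
                     ["slide", "please"]])
likely = model.score(["next", "slide"])
unlikely = model.score(["slide", "next"])
```

A decoder would use such scores to prefer word orders seen in training ("next slide") over acoustically similar but unlikely ones ("slide next").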
Step 5: Apply the view transformation and store the front-view slide.
After the four corner points of the slide are extracted, the perspective transformation matrix is computed first, and the slide region is then transformed pixel by pixel according to that matrix to obtain the front-view image.
The basic formula of the perspective transformation, written in the standard row-vector form, is as follows:

$$[x', y', w'] = [u, v, w] \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

Here [u, v, w] are the homogeneous image coordinates before the transformation, with w always equal to 1 for a two-dimensional image, and [x', y', w'] are the image coordinates after the transformation, which can be converted into two-dimensional coordinates. The element a_{33} of the transformation matrix is always 1, so 8 parameters remain to be calculated; the 4 corner coordinates yield 8 equations from which these 8 unknown transformation parameters are solved.
Once the transformation matrix is obtained, the coordinates of each pixel of the slide region in the transformed image are calculated one by one with the following formula, which completes the transformation:

$$x = \frac{x'}{w'} = \frac{a_{11}u + a_{21}v + a_{31}}{a_{13}u + a_{23}v + a_{33}}, \qquad y = \frac{y'}{w'} = \frac{a_{12}u + a_{22}v + a_{32}}{a_{13}u + a_{23}v + a_{33}}$$
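The 8-equation solve can be sketched in pure Python. The sketch assumes the row-vector convention [x', y', w'] = [u, v, 1] · A with a33 = 1, so that x = (a11·u + a21·v + a31) / (a13·u + a23·v + 1) and y = (a12·u + a22·v + a32) / (a13·u + a23·v + 1). The function names, the example corner coordinates, and the 640×480 target size are hypothetical, and a small Gaussian elimination stands in for whatever solver a real implementation would use.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def perspective_params(src, dst):
    """Return [a11, a21, a31, a12, a22, a32, a13, a23] from 4 point pairs."""
    rows, rhs = [], []
    for (u, v), (x, y) in zip(src, dst):
        # Two linear equations per corner, obtained by clearing the denominator.
        rows.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        rhs.append(x)
        rows.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        rhs.append(y)
    return solve_linear(rows, rhs)


def warp_point(params, u, v):
    """Map one source pixel (u, v) to front-view coordinates (x, y)."""
    a11, a21, a31, a12, a22, a32, a13, a23 = params
    w = a13 * u + a23 * v + 1.0
    return ((a11 * u + a21 * v + a31) / w, (a12 * u + a22 * v + a32) / w)


# Skewed slide corners in the camera image -> corners of a 640x480 front view.
src = [(10, 20), (200, 5), (220, 150), (5, 160)]
dst = [(0, 0), (640, 0), (640, 480), (0, 480)]
params = perspective_params(src, dst)
```

In practice the warp is usually applied in the inverse direction (looping over output pixels and sampling the source image) so that the front-view image has no holes; the forward mapping is shown here because it matches the formula in the text.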
Step 6: Automatically identify whether the slide content has changed and, once a change is identified, start storage automatically.
Step 7: Retain the video image data or the audio data.
Step 8: Return to Step 2.
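The patent does not say how Step 6 decides that the slide content has changed. One simple possibility, sketched below purely as an assumption, is to compare each front-view frame with the last stored slide using the mean absolute pixel difference and a fixed threshold. The class name `SlideChangeDetector` and the threshold value are hypothetical.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference of two equal-size grayscale images."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


class SlideChangeDetector:
    """Flags a frame as a new slide when it differs enough from the last one stored."""

    def __init__(self, threshold=12.0):
        self.threshold = threshold  # assumed value; would need tuning per camera
        self.last_saved = None

    def is_new_slide(self, frame):
        if (self.last_saved is None
                or mean_abs_diff(frame, self.last_saved) > self.threshold):
            self.last_saved = [row[:] for row in frame]
            return True
        return False


detector = SlideChangeDetector()
slide_a = [[200] * 4 for _ in range(4)]
slide_a_noisy = [[201] * 4 for _ in range(4)]      # same slide, sensor noise
slide_b = [[200, 200, 30, 30] for _ in range(4)]   # different content
results = [detector.is_new_slide(f)
           for f in (slide_a, slide_a_noisy, slide_b, slide_b)]
```

Only frames flagged as new would trigger storage, which keeps one image per slide rather than one per video frame.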
Fig. 1 also serves as the schematic diagram of the conference slide intelligent recorder. The invention also provides an intelligent recording system for conference slides, comprising a video capture module, a video control module, an audio capture module, an audio control module, a manual control module, a data transmission module, a data processing module, a storage control module, a storage module, and a power module; the power module supplies power to the above modules. The data transmission module is connected to the video capture module, the video control module, the audio capture module, the audio control module, the data processing module, the storage control module, and the storage module. The video capture module and the audio capture module each pass the captured live video image data and audio data to the data processing module through the data transmission module. The data processing module sends control signals through the data transmission module to the storage module, the video control module, and the audio control module. The video control module and the audio control module are manually controlled through the manual control module.
The data processing module performs intelligent video analysis: it identifies and parses the captured video image data and audio data, judges whether the position and content of the parsed slide have changed, and optionally records the speaker's voice directly or converts it into a written record. The processing module can be a mobile phone or a separate processing module.
The system can use wired or wireless transmission, and storage can be local or in the cloud. The video capture module is a camera and the audio capture module is a microphone; live video is captured by the camera and live audio by the microphone. Video and audio capture can be switched on and off manually, or, when switched off, the video and audio control modules can switch capture on and off on a timer at a set interval. The microphone can be provided by the mobile phone or by a separate module.
For example, when video and audio capture are on, the captured data is passed to the data processing module through the data transmission module. The captured image data is identified and segmented, and named entities are recognized and converted into text; the audio data is converted into text by speech recognition and fed into a semantic parsing system. If the parsed scene and the corresponding key dialogue are not in the user's interest list, the system does not store the captured data, switches off the capture equipment, and restarts capture after a certain time interval; if they are in the user's interest list, the data is stored. The camera can switch between high-definition and low-definition modes.
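The store-or-discard decision described above can be sketched as follows. The semantic parsing system is reduced here to plain keyword matching against single-word interest terms, which is an assumption made for illustration; the function names and the return labels `"store"` and `"discard_and_pause"` are hypothetical.

```python
import re


def matches_interest(text, interest_list):
    """True if any single-word interest term occurs in the text (case-insensitive)."""
    words = set(re.findall(r"\w+", text.lower()))
    return any(term.lower() in words for term in interest_list)


def handle_segment(slide_text, transcript, interest_list):
    """Decide what to do with one captured segment.

    slide_text: text recognized from the slide image
    transcript: text produced by speech recognition for the same period
    """
    if matches_interest(slide_text + " " + transcript, interest_list):
        return "store"
    # Not of interest: drop the data and pause capture until the next interval.
    return "discard_and_pause"


interests = ["perspective", "LSTM"]
kept = handle_segment("Perspective transform", "we warp the slide", interests)
dropped = handle_segment("Lunch menu", "coffee is served at noon", interests)
```

A real system would replace `matches_interest` with the recognition and parsing models the text mentions; the control flow around it would stay the same.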
The data processing module can be a local data processing module or a cloud processing module. The local data processing module can be a CPU, a GPU, or a dedicated chip. The wireless transmission can be Bluetooth, Wi-Fi, 4G, or 5G signal transmission. The video or audio output device is a display, which can be provided by the mobile phone.
Those skilled in the art will further appreciate that the exemplary units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions differently for each specific application, but such implementations should not be considered beyond the scope of the invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two, or can be integrated into a mobile phone. A software module can reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and their features.
Although embodiments of the invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, replace, and vary them within the scope of the invention.
The specific embodiments described above do not limit the scope of protection of the invention. Any other corresponding changes and modifications made according to the technical concept of the invention shall fall within the scope of protection of the claims.

Claims (13)

1. A method for intelligent recording of conference slides, characterized in that it comprises the following steps:
Step 1: capture live video image data and audio data;
Step 2: automatically identify the slide position in the scene;
Step 3: automatically identify the slide content and determine whether it is a new slide;
Step 4: if it is a new slide, apply a view transformation to the slide to convert it to a front view;
Step 5: store the captured video and audio data.
2. The method for intelligent recording of conference slides according to claim 1, characterized in that the capture of the audio data and video images can be started and stopped manually.
3. The method for intelligent recording of conference slides according to claim 1, characterized in that, depending on the content, the audio is optionally converted into a text record.
4. The method for intelligent recording of conference slides according to claim 1, characterized in that a recognition model and a parsing system model are established for video and audio of interest; the captured audio data and video images are parsed and checked against the user's interest list, and are saved if they are in the interest list and deleted if they are not.
5. An intelligent recording system for conference slides, characterized in that it comprises a video capture module, an audio capture module, a video control module, an audio control module, a data transmission module, a data processing module, a storage control module, a storage module, and a power module; the power module supplies power to the above modules; the data transmission module is bidirectionally connected to the data processing module and the storage control module;
the video capture module and the audio capture module send the captured video image data and audio data to the data processing module through the data transmission module; the data processing module automatically identifies the position of the PPT slide in the scene and captures the slide image in real time, determines whether it is a new slide, and applies a view transformation to convert the slide to a front view; the data processing module then sends a control signal through the data transmission module to the storage control module, which stores the image and audio data.
6. The intelligent recording system for conference slides according to claim 5, characterized in that the video capture module is a camera with a corresponding bracket; the camera is mounted on the bracket and photographs the slides from an elevated position.
7. The intelligent recording system for conference slides according to claim 6, characterized in that the video control module, data processing module, audio capture module, audio control module, storage control module, and storage module are implemented by a mobile phone app or another mobile terminal device.
8. The intelligent recording system for conference slides according to claim 7, characterized in that the audio capture module and the data acquisition module further include a manual button.
9. The intelligent recording system for conference slides according to claim 8, characterized in that, depending on the content, the system further includes a module that converts the audio into a text record.
10. The intelligent recording system for conference slides according to any one of claims 5 to 9, characterized in that the storage module is local storage or cloud storage.
11. The intelligent recording system for conference slides according to any one of claims 5 to 9, characterized in that the data transmission module uses wired or wireless transmission, and the wireless transmission is one or more of Bluetooth, Wi-Fi, 4G, or 5G signal transmission.
12. The intelligent recording system for conference slides according to any one of claims 5 to 9, characterized in that the data processing module is a local data processing module, a cloud processing module, or a hybrid processing module.
13. The intelligent recording system for conference slides according to claim 12, characterized in that the local data processing module is a mobile phone app, a CPU, a GPU, or a dedicated chip.
CN201811302591.2A 2018-11-02 2018-11-02 Method and system for intelligent recording of conference slides Pending CN109309790A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811302591.2A CN109309790A (en) 2018-11-02 2018-11-02 Method and system for intelligent recording of conference slides

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811302591.2A CN109309790A (en) 2018-11-02 2018-11-02 Method and system for intelligent recording of conference slides

Publications (1)

Publication Number Publication Date
CN109309790A true CN109309790A (en) 2019-02-05

Family

ID=65222948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811302591.2A Pending CN109309790A (en) 2018-11-02 2018-11-02 Method and system for intelligent recording of conference slides

Country Status (1)

Country Link
CN (1) CN109309790A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110493640A (en) * 2019-08-01 2019-11-22 东莞理工学院 System and method for converting video to PPT based on video processing
CN112468761A (en) * 2020-10-31 2021-03-09 浙江云优家智能科技有限公司 Intelligent conference recording system
CN112689085A (en) * 2020-12-09 2021-04-20 展讯通信(上海)有限公司 Method, device and system for identifying PPT screen projection area and electronic equipment
CN113296660A (en) * 2020-10-21 2021-08-24 阿里巴巴集团控股有限公司 Image processing method and device and electronic equipment
CN113947572A (en) * 2021-09-30 2022-01-18 成都新潮传媒集团有限公司 Method and device for detecting quality of publications on advertising equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1655609A (en) * 2004-02-13 2005-08-17 精工爱普生株式会社 Method and system for recording video conference data
CN102523519A (en) * 2010-10-29 2012-06-27 微软公司 Automatic multimedia slideshows for social media-enabled mobile devices
CN105450944A (en) * 2015-11-13 2016-03-30 北京自由坊科技有限责任公司 Method and device for synchronously recording and reproducing slides and live presentation speech
CN106126580A (en) * 2016-06-20 2016-11-16 惠州Tcl移动通信有限公司 Slide shooting control method and mobile terminal
CN108256513A (en) * 2018-03-23 2018-07-06 中国科学院长春光学精密机械与物理研究所 Intelligent video analysis method and intelligent video recording system
CN209692906U (en) * 2018-11-02 2019-11-26 长春市长光芯忆科技有限公司 Intelligent recording system for conference slides

Similar Documents

Publication Publication Date Title
CN109309790A (en) Method and system for intelligent recording of conference slides
CN107799126B (en) Voice endpoint detection method and device based on supervised machine learning
CN109377539B (en) Method and apparatus for generating animation
EP4336490B1 (en) Voice processing method and related device
US11527242B2 (en) Lip-language identification method and apparatus, and augmented reality (AR) device and storage medium which identifies an object based on an azimuth angle associated with the AR field of view
TW202022683A (en) Method, device, storage medium, and computer equipment of processing image
CN109509470A (en) Voice interactive method, device, computer readable storage medium and terminal device
WO2023207541A1 (en) Speech processing method and related device
CN104715753B (en) Data processing method and electronic device
WO2015171646A1 (en) Method and system for speech input
JP2014519082A (en) Video generation based on text
CN103024530A (en) Intelligent television voice response system and method
CN107992485A (en) Simultaneous interpretation method and device
WO2024140430A9 (en) Text classification method based on multimodal deep learning, device, and storage medium
KR20210033850A (en) Output method for artificial intelligence speakers based on emotional values calculated from voice and face
CN113205569B (en) Image drawing method and device, computer readable medium and electronic equipment
CN105989836A (en) Voice acquisition method, device and terminal equipment
CN108256513A (en) Intelligent video analysis method and intelligent video recording system
WO2022147692A1 (en) Voice command recognition method, electronic device and non-transitory computer-readable storage medium
CN109670073B (en) Information conversion method and device and interactive auxiliary system
EP4485948A1 (en) Video processing method and apparatus, device and medium
CN113593539A (en) Streaming end-to-end voice recognition method and device and electronic equipment
CN120491832A (en) Digital person real-time generation method and device based on interaction site
CN111524518B (en) Augmented reality processing method and device, storage medium and electronic equipment
CN112562687B (en) Audio and video processing method and device, recording pen and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190205