CN119296558A - Audio and video quality enhancement method, device, equipment and medium based on machine learning - Google Patents
- Publication number
- CN119296558A CN202411815572.5A
- Authority
- CN
- China
- Prior art keywords
- audio
- video
- data
- preset
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The application discloses a machine learning-based audio and video quality enhancement method, device, equipment and medium, relating to the technical field of audio and video quality enhancement. The method comprises: if initial audio and video information to be processed is received, determining audio and video data to be processed according to the initial audio and video information and preset machine learning parameters; determining audio and video processing data according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprises audio processing data and video processing data; when the audio and video processing data is the audio processing data, determining audio quality enhancement data according to the audio processing data and a preset first optimization rule; and when the audio and video processing data is the video processing data, determining video quality enhancement data according to the video processing data and a preset second optimization rule. The playing effect of the audio and video is thereby improved.
Description
Technical Field
The application relates to the technical field of audio and video quality enhancement, in particular to an audio and video quality enhancement method, device, equipment and medium based on machine learning.
Background
With the development of audio and video technologies, higher requirements are being placed on audio and video processing.
The traditional audio and video processing approach directly converts the initial audio and video into a transmittable electrical signal, and converts that signal back into playable audio and video when playback is required. This approach has a significant defect: because the initial audio and video are converted directly, any poor quality in the initial audio and video carries through to the final playback, resulting in a poor audio and video playing effect.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present application and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The application mainly aims to provide an audio and video quality enhancement method, device, equipment and medium based on machine learning, aiming at solving the technical problem of poor audio and video playing effect.
In order to achieve the above object, the present application provides a machine learning-based audio/video quality enhancement method, which includes:
If the initial audio and video information to be processed is received, determining the audio and video data to be processed according to the initial audio and video information and preset machine learning parameters;
Determining audio and video processing data according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data;
Determining audio quality enhancement data according to the audio processing data and a preset first optimization rule when the audio and video processing data are the audio processing data;
and determining video quality enhancement data according to the video processing data and a preset second optimization rule when the audio and video processing data are the video processing data.
In an embodiment, the preset machine learning parameters include a standard video parameter threshold and a standard audio parameter threshold, and the step of determining the audio-video data to be processed according to the initial audio-video information and the preset machine learning parameters includes:
When the initial audio-video information is audio information, detecting whether an audio parameter value in the audio information meets the standard audio parameter threshold value, and when the audio parameter value in the audio information does not meet the standard audio parameter threshold value, taking the audio data in the audio information as audio-video data to be processed;
When the initial audio and video information is video information, detecting whether a video parameter value in the video information meets the standard video parameter threshold, and when the video parameter value in the video information does not meet the standard video parameter threshold, taking video data in the video information as audio and video data to be processed.
In an embodiment, the preset machine learning database includes an audio coding transmission library and a video coding transmission library, and the step of determining audio/video processing data according to the audio/video data to be processed and the preset machine learning database includes:
Determining audio data in the audio-video data to be processed, determining a first coding transmission mode corresponding to the data characteristics of the audio data in the audio coding transmission library, and binding the first coding transmission mode and the audio data as audio processing data;
Determining video data in the audio and video data to be processed, determining a second coding transmission mode corresponding to the data characteristics of the video data in the video coding transmission library, and binding the second coding transmission mode and the video data as video processing data.
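The lookup-and-bind step above can be sketched in Python as follows. This is only an illustration: it models a coding transmission library as a dictionary from a data feature to a coding transmission mode. The feature labels, the mode names, the `ProcessingData` type and the fallback to a default mode are all assumptions introduced here, not details given in the application.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical library contents: data feature -> coding transmission mode.
AUDIO_CODING_LIBRARY: Dict[str, str] = {
    "speech": "audio_mode_a",
    "music": "audio_mode_b",
}


@dataclass(frozen=True)
class ProcessingData:
    """Data bound to the coding transmission mode chosen for it."""
    payload: bytes
    coding_mode: str


def bind_coding_mode(payload: bytes, feature: str, library: Dict[str, str],
                     default_mode: str = "default_mode") -> ProcessingData:
    """Look up the mode matching the data feature and bind it to the data.

    Falling back to default_mode for an unknown feature is an assumption;
    the application does not say what happens when no entry matches.
    """
    mode = library.get(feature, default_mode)
    return ProcessingData(payload=payload, coding_mode=mode)
```

A video coding transmission library could be handled by the same function with a second dictionary.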
In an embodiment, the preset first optimization rule includes an audio noise reduction rule, an audio equalization rule, and an echo cancellation rule, and the step of determining audio quality enhancement data according to the audio processing data and the preset first optimization rule includes:
When the noise value of the audio processing data is greater than a preset audio noise reduction threshold, performing noise reduction on the audio processing data based on the audio noise reduction rule to obtain noise-reduced audio data, and when the equalization value of the audio processing data is greater than a preset audio equalization threshold, performing audio equalization on the noise-reduced audio data based on the audio equalization rule to obtain equalized audio data;
when the echo value of the audio processing data is greater than a preset echo cancellation threshold, performing echo cancellation on the equalized audio data based on the echo cancellation rule to obtain audio quality enhancement data;
and when the echo value of the audio processing data is less than or equal to the preset echo cancellation threshold, taking the equalized audio data as the audio quality enhancement data.
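The threshold-gated audio chain described above might look roughly like the following Python sketch. The `AudioLevels` type and the three stage callables are hypothetical placeholders; the application does not specify the signal processing behind each rule, and passing the data through unchanged when a value does not exceed its threshold is an assumption for the cases the text leaves open.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Samples = List[float]
Stage = Callable[[Samples], Samples]


@dataclass(frozen=True)
class AudioLevels:
    """Noise, equalization and echo values (measured values or thresholds)."""
    noise: float
    equalization: float
    echo: float


def enhance_audio(samples: Samples, measured: AudioLevels,
                  thresholds: AudioLevels, denoise: Stage,
                  equalize: Stage, cancel_echo: Stage
                  ) -> Tuple[Samples, List[str]]:
    """Apply each stage only when its measured value exceeds its threshold.

    Returns the processed samples and the names of the stages applied.
    Skipped stages pass the data through unchanged (an assumption).
    """
    applied: List[str] = []
    out = samples
    if measured.noise > thresholds.noise:
        out = denoise(out)
        applied.append("denoise")
    if measured.equalization > thresholds.equalization:
        out = equalize(out)
        applied.append("equalize")
    if measured.echo > thresholds.echo:
        out = cancel_echo(out)
        applied.append("cancel_echo")
    return out, applied
```

Keeping the stages as injected callables leaves the actual noise reduction, equalization and echo cancellation algorithms open, as the text does.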
In an embodiment, the preset second optimization rule includes a video denoising rule, a contrast enhancement rule, a sharpening rule, and a resolution processing rule, and the step of determining video quality enhancement data according to the video processing data and the preset second optimization rule includes:
When the noise value of the video processing data is greater than a preset video denoising threshold, denoising the video processing data based on the video denoising rule to obtain denoised video data;
When the contrast of the video processing data is less than or equal to a preset contrast enhancement threshold, taking the denoised video data as the video quality enhancement data;
and when the contrast of the video processing data is greater than the preset contrast enhancement threshold, performing contrast adjustment on the denoised video data based on the contrast enhancement rule to obtain contrast adjustment data, and when the sharpening value of the contrast adjustment data is greater than a preset sharpening threshold and the resolution of the contrast adjustment data is greater than a preset resolution processing threshold, processing the contrast adjustment data based on the sharpening rule and the resolution processing rule to obtain video quality enhancement data.
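As a rough illustration of the branching above, the Python sketch below only decides which video stages would run and in what order. The `VideoLevels` type and the stage names are hypothetical. The sketch also simplifies one point: the text measures the sharpening value and resolution on the contrast adjustment data, whereas here they are supplied as precomputed inputs. Stopping after contrast adjustment when only one of those two conditions holds is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VideoLevels:
    """Noise, contrast, sharpening and resolution values or thresholds."""
    noise: float
    contrast: float
    sharpening: float
    resolution: float


def plan_video_stages(measured: VideoLevels,
                      thresholds: VideoLevels) -> List[str]:
    """Return the ordered stage names selected by the threshold checks."""
    stages: List[str] = []
    if measured.noise > thresholds.noise:
        stages.append("denoise")
    # Contrast at or below the threshold: the (denoised) data is the output.
    if measured.contrast <= thresholds.contrast:
        return stages
    stages.append("contrast")
    # Sharpening and resolution processing require both conditions to hold.
    if (measured.sharpening > thresholds.sharpening
            and measured.resolution > thresholds.resolution):
        stages += ["sharpen", "resolution"]
    return stages
```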
In an embodiment, the machine learning-based audio/video quality enhancement method further includes:
Acquiring an initial training audio and video, and determining a playing score corresponding to the initial training audio and video, wherein the playing score comprises a first score value for direct playback;
determining an audio noise reduction threshold, an audio equalization threshold and an echo cancellation threshold corresponding to the audio data under the first score value, and taking the audio noise reduction threshold, the audio equalization threshold and the echo cancellation threshold as the basis for starting the preset first optimization rule;
Determining a video denoising threshold, a contrast enhancement threshold, a sharpening threshold and a resolution processing threshold corresponding to the video data under the first score value, and taking the video denoising threshold, the contrast enhancement threshold, the sharpening threshold and the resolution processing threshold as the basis for starting the preset second optimization rule.
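The application does not state how a threshold is derived from the first score value. One possible heuristic, given purely as an assumption, is to take the largest measured value among training clips whose direct playback score is still acceptable. The function name, the `(value, score)` pair layout and the `min_score` cutoff below are all hypothetical.

```python
from typing import Iterable, Optional, Tuple


def derive_threshold(samples: Iterable[Tuple[float, float]],
                     min_score: float) -> Optional[float]:
    """Derive a trigger threshold from scored training clips.

    Each sample is (measured value, direct playback score), for example
    a noise value and its first score value. The threshold is the largest
    measured value among clips scoring at least min_score; values above it
    would then start the corresponding optimization rule.
    Returns None when no clip reaches min_score.
    """
    acceptable = [value for value, score in samples if score >= min_score]
    return max(acceptable) if acceptable else None
```

The same function could be called once per parameter (noise, equalization, echo, contrast and so on) to fill in the full set of thresholds.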
In an embodiment, the playing score further includes a second score value for playback after transmission, and the step of determining the playing score corresponding to the initial training audio and video includes:
determining a first correspondence between the audio coding mode and the data characteristics of the audio data under the second score value, and taking the first correspondence as the audio coding transmission library in the preset machine learning database;
and determining a second correspondence between the video coding mode and the data characteristics of the video data under the second score value, and taking the second correspondence as the video coding transmission library in the preset machine learning database.
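A minimal sketch of how such a correspondence might be assembled is given below. It assumes each training record is a `(data feature, coding mode, second score value)` triple and keeps, for each feature, the coding mode with the highest score after transmission. The record layout and the "highest score wins" rule are assumptions; the application only states that a correspondence is determined under the second score value.

```python
from typing import Dict, Iterable, Tuple


def build_coding_library(
        records: Iterable[Tuple[str, str, float]]) -> Dict[str, str]:
    """Map each data feature to its best-scoring coding mode.

    records holds (feature, coding mode, score after transmission).
    On a tie, the first record seen for that feature is kept.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for feature, mode, score in records:
        if feature not in best or score > best[feature][1]:
            best[feature] = (mode, score)
    return {feature: mode for feature, (mode, _) in best.items()}
```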
In addition, in order to achieve the above object, the present application also provides an audio and video quality enhancement device based on machine learning, where the audio and video quality enhancement device based on machine learning includes:
The information acquisition module is used for determining the audio and video data to be processed according to the initial audio and video information and preset machine learning parameters if the initial audio and video information to be processed is received;
the type judging module is used for determining audio and video processing data according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data;
The first enhancement module is used for determining audio quality enhancement data according to the audio processing data and a preset first optimization rule when the audio and video processing data are the audio processing data;
and the second enhancement module is used for determining video quality enhancement data according to the video processing data and a preset second optimization rule when the audio and video processing data are the video processing data.
In addition, in order to achieve the above object, the application also provides machine learning-based audio and video quality enhancement equipment, which comprises a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program is configured to implement the steps of the machine learning-based audio and video quality enhancement method described above.
In addition, in order to achieve the above object, the present application also proposes a medium, which is a computer-readable storage medium, on which a computer program is stored, the computer program implementing the steps of the machine learning-based audio/video quality enhancement method as described above when being executed by a processor.
The embodiment of the application provides an audio and video quality enhancement method based on machine learning. The method comprises: if initial audio and video information to be processed is received, determining audio and video data to be processed according to the initial audio and video information and preset machine learning parameters; determining audio and video processing data according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data; when the audio and video processing data are the audio processing data, determining audio quality enhancement data according to the audio processing data and a preset first optimization rule; and when the audio and video processing data are the video processing data, determining video quality enhancement data according to the video processing data and a preset second optimization rule. In this way, the audio and video data to be processed are first determined from the initial audio and video information and the preset machine learning parameters, are then processed through the preset machine learning database to obtain the audio and video processing data, and finally, depending on the type of processing data, the audio quality enhancement data and/or the video quality enhancement data are determined by the corresponding optimization rule. This avoids the situation in which the initial audio and video are directly converted into a transmittable electrical signal and their poor quality degrades the final playing effect, thereby improving the playing effect of the audio and video.
Drawings
FIG. 1 is a flowchart of a first embodiment of the machine learning-based audio/video quality enhancement method of the present application;
FIG. 2 is a flow chart of an implementation of the machine learning-based audio/video quality enhancement method of the present application;
FIG. 3 is a flowchart of a second embodiment of the machine learning-based audio/video quality enhancement method of the present application;
FIG. 4 is a schematic block diagram of the machine learning-based audio/video quality enhancement device of the present application;
FIG. 5 is a schematic diagram of a hardware operating environment related to a device in the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
For a better understanding of the technical solution of the present application, the following detailed description will be given with reference to the drawings and the specific embodiments.
The existing audio and video processing approach directly converts the initial audio and video into a transmittable electrical signal and converts that signal back into playable audio and video when playback is required. That is, audio and video captured in different environments or at different points in time are processed directly into electrical signals, transmitted to the area where playback is required, and then processed into playable audio and video signals. The whole process does not consider how the actual environment, the time of capture, or the signal itself affects the audio and video at acquisition. As a result, poor quality in the initial audio and video degrades the final playing effect: for example, video shot under strong light has low contrast and unclear content, and audio captured in a noisy environment has a high noise level.
Therefore, in view of the defects of the above audio and video processing scheme, a machine learning-based audio and video quality enhancement method is provided. The method determines audio and video data to be processed from the initial audio and video information and preset machine learning parameters, then processes the audio and video data to be processed through a preset machine learning database to determine audio and video processing data, and finally, depending on the type of processing data, determines audio quality enhancement data from the audio processing data and a preset first optimization rule and/or video quality enhancement data from the video processing data and a preset second optimization rule. This avoids the situation in which the initial audio and video are directly converted into a transmittable electrical signal and their poor quality degrades the final playing effect, so that the playing effect of the audio and video is improved.
It should be noted that, the execution body of the present embodiment may be a computing service device having functions of data processing, network communication, and program running, such as a tablet computer, a personal computer, a mobile phone, or a device, a controller, or the like capable of implementing the above functions. The present embodiment and the following embodiments will be described below with reference to a controller as an example.
Based on this, an embodiment of the present application provides a machine learning-based audio and video quality enhancement method. FIG. 1 is a flowchart of a first embodiment of the method.
Referring to FIG. 1, in the first embodiment, the machine learning-based audio/video quality enhancement method includes:
Step S10, if initial audio and video information to be processed is received, determining audio and video data to be processed according to the initial audio and video information and preset machine learning parameters;
Step S20, determining audio and video processing data according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data;
In this embodiment, when the initial audio and video information to be processed is received, it is evaluated to decide whether it needs to be processed. The decision is made mainly on the basis of the preset machine learning parameters: if they indicate that quality enhancement is required, the initial audio and video information is taken as the audio and video data to be processed. Here, the audio and video data to be processed are the audio and video data that are to undergo quality enhancement; the initial audio and video information is the audio and video data as initially captured or obtained; and the preset machine learning parameters are parameters, determined through machine learning, that indicate when quality enhancement is required. For example, video data are determined to require quality enhancement if their contrast is below a certain value, and audio data are determined to require quality enhancement if their noise is above a certain value.
After the audio and video data to be processed that require quality enhancement are determined, audio and video processing data are determined according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data. The preset machine learning database is a database determined through machine learning and includes at least an audio coding transmission library and a video coding transmission library. The audio coding transmission library defines how audio is coded and transmitted, and the video coding transmission library defines how video is coded and transmitted. The audio processing data are the audio data after coding transmission processing, and the video processing data are the video data after coding transmission processing. In this way, the adverse effect of coding and transmission on the audio and video data can be avoided, which helps ensure the playing effect of the audio and video.
Step S30, when the audio and video processing data are the audio processing data, determining audio quality enhancement data according to the audio processing data and a preset first optimization rule;
Step S40, determining video quality enhancement data according to the video processing data and a preset second optimization rule when the audio/video processing data is the video processing data.
In this embodiment, after the audio and video processing data are determined, the audio processing data and the video processing data they contain are processed separately, each according to its own processing rule, to obtain the corresponding quality enhancement data: the audio quality enhancement data are determined based on the audio processing data and a preset first optimization rule, and/or the video quality enhancement data are determined based on the video processing data and a preset second optimization rule. The audio quality enhancement data are the audio data obtained after this processing, and the video quality enhancement data are the video data obtained after this processing. The preset first optimization rule refers to the rules for processing the audio processing data, such as an audio noise reduction rule, an audio equalization rule and an echo cancellation rule; the preset second optimization rule refers to the rules for processing the video processing data, such as a video denoising rule, a contrast enhancement rule, a sharpening rule and a resolution processing rule. The quality-enhanced audio and video data are thus obtained and used for playback, ensuring the playing effect of the audio and video data.
It should be noted that the order of the rule-based processing and the machine learning database-based processing may be swapped. That is, the audio and video quality enhancement data may first be obtained directly from the audio and video data to be processed, and then coded and transmitted according to the machine learning database to serve as the final audio and video quality enhancement data for subsequent playback. The playing effect of the audio and video data is thus improved from two aspects: the data itself and its coding and transmission.
In an embodiment, referring to FIG. 2, which is a schematic flow chart of an implementation of the machine learning-based audio/video quality enhancement method of the present application, after the audio and video to be processed are obtained, it is determined whether they need processing (quality enhancement processing); that is, the step of determining the audio and video data to be processed according to the initial audio and video information and the preset machine learning parameters is performed. When processing is needed, transmission is improved and optimized first: the audio and video processing data are determined based on the preset machine learning database, so that transmission is improved through better coding and transmission. After that, the data itself is optimized: the audio and video data are processed according to their respective rules to remedy their defects, thereby improving the audio and video playing effect.
In this embodiment, a machine learning-based audio and video quality enhancement method is provided. If initial audio and video information to be processed is received, audio and video data to be processed are determined according to the initial audio and video information and preset machine learning parameters; audio and video processing data are determined according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data; when the audio and video processing data are the audio processing data, audio quality enhancement data are determined according to the audio processing data and a preset first optimization rule; and when the audio and video processing data are the video processing data, video quality enhancement data are determined according to the video processing data and a preset second optimization rule. Depending on the type of processing data, the audio quality enhancement data and/or the video quality enhancement data are thus determined by the corresponding optimization rule, which avoids the situation in which the initial audio and video are directly converted into a transmittable electrical signal and their poor quality degrades the final playing effect, thereby improving the playing effect of the audio and video.
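The overall flow (check whether processing is needed, then coding transmission optimization, then rule-based optimization, with the two processing steps allowed to swap order) can be sketched in Python as below. The function name, the dictionary representation of the media and the three callables are hypothetical placeholders; returning the input unchanged when no enhancement is needed is an assumption.

```python
from typing import Any, Callable, Dict

Media = Dict[str, Any]
Stage = Callable[[Media], Media]


def run_pipeline(media: Media,
                 needs_enhancement: Callable[[Media], bool],
                 bind_coding: Stage,
                 enhance: Stage,
                 coding_first: bool = True) -> Media:
    """Gate on the need for enhancement, then run the two processing steps.

    coding_first=True applies coding transmission optimization before
    rule-based enhancement; coding_first=False swaps the two steps.
    """
    if not needs_enhancement(media):
        return media  # assumption: pass through untouched
    stages = (bind_coding, enhance) if coding_first else (enhance, bind_coding)
    for stage in stages:
        media = stage(media)
    return media
```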
Further, based on the first embodiment of the present application, a second embodiment of the machine learning based audio/video quality enhancement method of the present application is provided, in this embodiment, the preset machine learning parameters include a standard video parameter threshold and a standard audio parameter threshold, and the step of determining the audio/video data to be processed according to the initial audio/video information and the preset machine learning parameters includes:
Step S11, when the initial audio-video information is audio information, detecting whether an audio parameter value in the audio information meets the standard audio parameter threshold value, and when the audio parameter value in the audio information does not meet the standard audio parameter threshold value, taking the audio data in the audio information as audio-video data to be processed;
Step S12, when the initial audio/video information is video information, detecting whether a video parameter value in the video information meets the standard video parameter threshold, and when the video parameter value in the video information does not meet the standard video parameter threshold, taking the video data in the video information as the audio/video data to be processed.
In this embodiment, whether quality enhancement processing is required is determined based on the standard video parameter threshold and the standard audio parameter threshold in the preset machine learning parameters. The standard video parameter threshold is a user-defined parameter of video data that does not require quality enhancement processing, and may be the contrast, illumination intensity, etc. of the video data; for example, when the contrast is outside a first range and the illumination intensity is outside a first value, it is determined that quality enhancement processing is required for the video data. The standard audio parameter threshold is a user-defined parameter of audio data that does not require quality enhancement processing, and may be the noise, echo, etc. of the audio data; for example, when the noise is outside a second value or the echo is outside a third value, it is determined that quality enhancement processing is required for the audio data.
At this time, the initial audio-video information is divided into audio information containing audio data and video information containing video data. When the initial audio-video information is audio information, it is detected whether the audio parameter value in the audio information meets the standard audio parameter threshold, and when it does not, the audio data in the audio information is used as the audio-video data to be processed. The audio parameter value refers to a relevant parameter of the audio data, such as a noise value or an echo value, and can be determined directly from the audio parameters; not meeting the standard audio parameter threshold means that the noise is outside the second value or the echo is outside the third value. When the initial audio-video information is video information, it is detected whether the video parameter value in the video information meets the standard video parameter threshold, and when it does not, the video data in the video information is used as the audio-video data to be processed. The video parameter value refers to a relevant parameter of the video data, such as contrast or illumination intensity, and can be determined directly from the video parameters; not meeting the standard video parameter threshold means that the contrast is outside the first range and the illumination intensity is outside the first value. In this way, it is determined whether quality enhancement processing is required, thereby ensuring the effect of subsequent audio and video playing.
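Steps S11 and S12 can be illustrated by the following sketch. The class Standard, the function select_data_to_process and all numeric ranges are hypothetical, since the application gives no concrete values. For simplicity the sketch treats the data as requiring processing when any single parameter falls outside its standard range, which is one possible reading of the examples above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Standard:
    """Closed interval inside which a parameter needs no enhancement."""
    low: float
    high: float

    def is_met(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical standard thresholds (illustrative values only).
STANDARD_AUDIO = {"noise": Standard(0.0, 0.2), "echo": Standard(0.0, 0.1)}
STANDARD_VIDEO = {"contrast": Standard(0.4, 0.8), "illumination": Standard(0.3, 0.9)}


def select_data_to_process(kind: str, params: Dict[str, float],
                           data: bytes) -> Optional[bytes]:
    """Return the data if any parameter misses its standard, else None."""
    if kind == "audio":        # step S11
        standards = STANDARD_AUDIO
    elif kind == "video":      # step S12
        standards = STANDARD_VIDEO
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    unmet = [name for name, std in standards.items()
             if not std.is_met(params[name])]
    return data if unmet else None
```

For example, audio information with a noise value of 0.5 would be selected for processing under these illustrative ranges, while audio information with a noise value of 0.1 and an echo value of 0.05 would not.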
Further, the preset machine learning database includes an audio coding transmission library and a video coding transmission library, and the step of determining audio/video processing data according to the audio/video data to be processed and the preset machine learning database includes:
step S21, determining audio data in the audio-video data to be processed, determining a first coding transmission mode corresponding to the data characteristics of the audio data in the audio coding transmission library, and binding the first coding transmission mode and the audio data as audio processing data;
step S22, determining video data in the audio and video data to be processed, determining a second coding transmission mode corresponding to the data characteristics of the video data in the video coding transmission library, and binding the second coding transmission mode and the video data as video processing data.
In this embodiment, when the audio-video data to be processed is encoded and transmitted, the audio data (data related to audio playing) and the video data in the audio-video data to be processed are determined respectively. On the one hand, a first coding transmission mode corresponding to the data characteristics of the audio data is determined in the audio coding transmission library, and the first coding transmission mode and the audio data are bound as audio processing data. The audio coding transmission library is a database that defines the coding and transmission modes corresponding to different audio characteristics; the first coding transmission mode is the coding and transmission mode matched with the data characteristics of the audio data; and the audio characteristics are characteristics of the audio data, such as its wavelength and frequency. On the other hand, a second coding transmission mode corresponding to the data characteristics of the video data is determined in the video coding transmission library, and the second coding transmission mode and the video data are bound as video processing data. The video coding transmission library is a database that defines the coding and transmission modes corresponding to different video characteristics, and the second coding transmission mode is the coding and transmission mode matched with the data characteristics of the video data. Because the audio data and the video data are each encoded and transmitted according to their own characteristics, the encoding and transmission of the audio-video data can be improved.
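The lookup and binding of steps S21 and S22 can be sketched as below. The class names, the feature keys such as "low_frequency" and the mode labels are all hypothetical placeholders; the application does not specify how data characteristics are keyed or which coding and transmission modes the libraries contain.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CodingTransmissionMode:
    coding: str
    transmission: str


@dataclass(frozen=True)
class ProcessingData:
    """Data bound to the coding transmission mode selected for it."""
    payload: bytes
    mode: CodingTransmissionMode


# Hypothetical libraries mapping a data characteristic to a mode.
AUDIO_LIBRARY: Dict[str, CodingTransmissionMode] = {
    "low_frequency": CodingTransmissionMode("audio_coding_1", "transmission_1"),
    "high_frequency": CodingTransmissionMode("audio_coding_2", "transmission_2"),
}
VIDEO_LIBRARY: Dict[str, CodingTransmissionMode] = {
    "feature_a": CodingTransmissionMode("video_coding_1", "transmission_1"),
}


def bind_mode(payload: bytes, feature: str,
              library: Dict[str, CodingTransmissionMode]) -> ProcessingData:
    """Look up the mode matching the data characteristic and bind it to the data."""
    try:
        mode = library[feature]
    except KeyError:
        raise LookupError(f"no coding transmission mode for {feature!r}") from None
    return ProcessingData(payload, mode)
```

Calling bind_mode(b"pcm", "low_frequency", AUDIO_LIBRARY) yields audio processing data in the sense of step S21, and the same function with VIDEO_LIBRARY yields video processing data in the sense of step S22.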
In an embodiment, on one hand, proper coding settings and algorithms are selected for audio and video features of different audio and video data, so as to improve compression efficiency and transmission quality of the audio and video data. Redundancy and distortion of audio and video data can be reduced through coding optimization, and data transmission efficiency and viewing experience are improved. On the other hand, advanced transmission technology and protocols such as real-time transport protocol (RTP) and real-time streaming protocol (RTSP) are utilized to improve the transmission speed and stability of audio and video data. The delay and the packet loss rate of the audio and video data can be reduced through transmission optimization, and the real-time performance and the reliability of the data are improved.
Further, based on the first embodiment and/or the second embodiment of the present application, a third embodiment of the machine learning based audio/video quality enhancement method of the present application is provided, in this embodiment, the preset first optimization rule in step S30 includes an audio noise reduction rule, an audio equalization rule, and an echo cancellation rule, and the step of determining audio quality enhancement data according to the audio processing data and the preset first optimization rule includes:
Step S31, when the noise value of the audio processing data is larger than a preset audio noise reduction threshold, noise reduction is carried out on the audio processing data based on the audio noise reduction rule to obtain noise reduction audio data, and when the equalization value of the audio processing data is larger than a preset audio equalization threshold, audio equalization is carried out on the noise reduction audio data based on the audio equalization rule to obtain equalized audio data;
Step S32, when the echo value of the audio processing data is larger than a preset echo cancellation threshold value, echo cancellation is carried out on the equalized audio data based on the echo cancellation rule to obtain audio quality enhancement data;
Step S33, when the echo value of the audio processing data is smaller than or equal to a preset echo cancellation threshold value, taking the equalized audio data as audio quality enhancement data.
In this embodiment, the audio processing data is processed according to the audio noise reduction rule, the audio equalization rule and the echo cancellation rule. The audio noise reduction rule is a defined rule for reducing noise in the audio data, and may be a common audio noise reduction method; the audio equalization rule is a defined rule for equalizing the audio data, and may be a common audio equalization method; and the echo cancellation rule is a defined rule for cancelling echo in the audio data, and may be a common echo cancellation method. When the noise value of the audio processing data is greater than the preset audio noise reduction threshold, noise reduction is performed on the audio processing data based on the audio noise reduction rule to obtain noise reduction audio data, and when the equalization value of the audio processing data is greater than the preset audio equalization threshold, audio equalization is performed on the noise reduction audio data based on the audio equalization rule to obtain equalized audio data. The preset audio noise reduction threshold is the optimal noise value of the audio processing data, defined by the user based on machine learning; if the noise value exceeds the preset audio noise reduction threshold A, it is determined that noise reduction is required, and the noise reduction audio data is the audio data after noise reduction. The preset audio equalization threshold is the optimal equalization value of the audio processing data, defined by the user based on machine learning; if the equalization value exceeds the preset audio equalization threshold B, it is determined that audio equalization is required, and the equalized audio data is the audio data after noise reduction and audio equalization. It is worth noting that the noise value, the equalization value and the echo value of the audio processing data can be obtained in a conventional manner, which is not limited herein.
Finally, when the echo value of the audio processing data is less than or equal to the preset echo cancellation threshold, that is, there is no echo influence at this time, the equalized audio data can be used directly as the audio quality enhancement data; otherwise, when the echo value of the audio processing data is greater than the preset echo cancellation threshold, echo cancellation is performed on the equalized audio data based on the echo cancellation rule to obtain the audio quality enhancement data. The preset echo cancellation threshold is the optimal echo value defined by the user based on machine learning; if the echo value exceeds the preset echo cancellation threshold C, it is determined that echo cancellation is required, and the audio quality enhancement data is the audio data after echo cancellation, noise reduction and audio equalization. It should be noted that the order of the three processing flows may be exchanged, which is not limited herein.
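The threshold-gated sequence of steps S31 to S33 can be sketched as follows. The function enhance_audio, the dictionary keys and the rule callables are hypothetical; the rules themselves are left abstract because the application only states that common methods may be used. When a value does not exceed its threshold, the sketch assumes the data passes to the next stage unchanged.

```python
from typing import Callable, Dict, List

Samples = List[float]
Rule = Callable[[Samples], Samples]


def enhance_audio(samples: Samples,
                  values: Dict[str, float],
                  thresholds: Dict[str, float],
                  rules: Dict[str, Rule]) -> Samples:
    """Apply noise reduction, equalization and echo cancellation when needed."""
    out = samples
    # Step S31: noise reduction, then equalization, each gated by its threshold.
    if values["noise"] > thresholds["noise"]:
        out = rules["noise_reduction"](out)
    if values["equalization"] > thresholds["equalization"]:
        out = rules["equalization"](out)
    # Step S32: echo cancellation only when the echo value exceeds the threshold.
    if values["echo"] > thresholds["echo"]:
        out = rules["echo_cancellation"](out)
    # Step S33: otherwise the equalized audio data is the result as it stands.
    return out
```

The values are measured on the audio processing data before any rule is applied, as in the steps above; the order of the three stages is fixed here although the embodiment allows it to be exchanged.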
In an embodiment, audio noise reduction removes noise from the audio by using a filter or other methods, so that the noise of the audio data falls within the range required by the preset audio noise reduction threshold; the noise may come from environmental noise, device noise, etc. Through noise reduction processing, the audio becomes clearer and purer. Audio equalization adjusts the frequency response characteristic of the audio so that the audio is balanced across different frequencies, which improves the sound quality and listening experience and makes the audio more pleasant and comfortable. In audio communication, echo is a common problem; echo cancellation technology removes the echo components in the audio by analyzing the echo characteristics in the audio signal and cancelling or weakening them, which improves the clarity of the audio and the call quality. The above processing thereby ensures the audio playing effect.
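As one simple illustration of noise reduction by a filter, the sketch below applies a centered moving-average (low-pass) filter to a list of samples. This particular filter is chosen here only for illustration and is not named in the application; the function name and the window parameter are hypothetical.

```python
from typing import List, Sequence


def moving_average(samples: Sequence[float], window: int = 3) -> List[float]:
    """Smooth samples with a centered window that is truncated at the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed
```

A single spike such as [0, 0, 3, 0, 0] is spread into [0.0, 1.0, 1.0, 1.0, 0.0] with a window of 3, which shows how an isolated noise impulse is attenuated at the cost of some smearing.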
In an embodiment, the preset second optimization rule includes a video denoising rule, a contrast enhancement rule, a sharpening rule, and a resolution processing rule, and the step of determining video quality enhancement data according to the video processing data and the preset second optimization rule includes:
Step S41, when the noise value of the video processing data is larger than a preset video denoising threshold value, denoising the video processing data based on the video denoising rule to obtain denoising video data;
Step S42, when the contrast of the video processing data is smaller than or equal to a preset contrast enhancement threshold, the noise reduction video data is used as video quality enhancement data;
step S43, when the contrast of the video processing data is greater than a preset contrast enhancement threshold, performing contrast adjustment on the noise reduction video data based on the contrast enhancement rule to obtain contrast adjustment data, and when the sharpening value of the contrast adjustment data is greater than a preset sharpening threshold and the resolution of the contrast adjustment data is greater than a preset resolution processing threshold, performing processing on the contrast adjustment data based on the sharpening rule and the resolution processing rule to obtain video quality enhancement data.
In this embodiment, the video processing data is processed according to the video denoising rule, the contrast enhancement rule, the sharpening rule and the resolution processing rule. The video denoising rule is a defined rule for denoising video data, and may be a common video denoising method; the contrast enhancement rule is a defined rule for adjusting the contrast of video data, and may be a common contrast adjustment method; the sharpening rule is a defined rule for sharpening video data, and may be a common sharpening method; and the resolution processing rule is a defined rule for performing resolution processing on video data, and may be a common resolution processing method. When the noise value of the video processing data is greater than the preset video denoising threshold, the video processing data is denoised based on the video denoising rule to obtain noise reduction video data, and subsequent processing is performed based on the contrast of the video processing data. The preset video denoising threshold is the optimal noise value defined by the user based on machine learning; if the noise value exceeds the preset video denoising threshold D, it is determined that denoising is required, and the noise reduction video data is the video data after denoising. Contrast processing is defined to be performed after noise reduction, which ensures the accuracy of the data as a whole and in turn improves the accuracy of subsequent video data processing.
At this time, if the contrast of the video processing data is less than or equal to the preset contrast enhancement threshold, the noise reduction video data is used as the video quality enhancement data. The contrast enhancement threshold is the optimal contrast value defined by the user based on machine learning; if the contrast exceeds the preset contrast enhancement threshold E, it is determined that contrast adjustment, typically contrast enhancement, is required.
On the other hand, when the contrast of the video processing data is greater than the preset contrast enhancement threshold, contrast adjustment is performed on the noise reduction video data based on the contrast enhancement rule to obtain contrast adjustment data, and when the sharpening value of the contrast adjustment data is greater than the preset sharpening threshold and the resolution of the contrast adjustment data is greater than the preset resolution processing threshold, the contrast adjustment data is processed based on the sharpening rule and the resolution processing rule to obtain the video quality enhancement data. The preset sharpening threshold is the optimal sharpening value defined by the user based on machine learning; if it is exceeded, it is determined that sharpening is required. Likewise, if the preset resolution processing threshold is exceeded, it is determined that resolution processing is required. The video quality enhancement data is then the video data after the above processing. It is worth noting that the noise value, the contrast, the sharpening value and the resolution of the video processing data can be obtained in a conventional manner, which is not limited herein.
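The branching of steps S41 to S43 can be sketched as follows. The function enhance_video, the measure callable and the dictionary keys are hypothetical. The application does not state what happens when the contrast is adjusted but the sharpening and resolution conditions are not both met; the sketch assumes the contrast adjustment data is returned in that case.

```python
from typing import Any, Callable, Dict

Frames = Any  # placeholder type for decoded video data
Rule = Callable[[Frames], Frames]


def enhance_video(frames: Frames,
                  values: Dict[str, float],
                  thresholds: Dict[str, float],
                  rules: Dict[str, Rule],
                  measure: Callable[[Frames], Dict[str, float]]) -> Frames:
    """Denoise, then branch on contrast, then sharpen and process resolution."""
    out = frames
    # Step S41: denoise when the noise value exceeds the denoising threshold.
    if values["noise"] > thresholds["noise"]:
        out = rules["denoise"](out)
    # Step S42: contrast at or below the threshold ends the processing here.
    if values["contrast"] <= thresholds["contrast"]:
        return out
    # Step S43: adjust contrast, then measure the contrast adjustment data.
    out = rules["contrast"](out)
    adjusted = measure(out)
    if (adjusted["sharpening"] > thresholds["sharpening"]
            and adjusted["resolution"] > thresholds["resolution"]):
        out = rules["resolution"](rules["sharpen"](out))
    return out  # assumption: contrast adjustment data if the conditions fail
```

The measure callable is kept separate because the sharpening value and resolution are evaluated on the contrast adjustment data rather than on the original video processing data.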
In an embodiment, the video denoising rule may use a filter or other methods to reduce noise interference in the image. Noise is usually caused by electromagnetic interference, signal attenuation and other factors during signal transmission; an advanced denoising algorithm and filter can effectively suppress and remove it, making the image clearer and finer. The contrast enhancement rule adjusts the brightness and contrast of the video to make the image more vivid. Contrast refers to the degree of brightness variation in the image; optimizing the contrast increases the dynamic range of the image and improves the presentation of detail and texture. The sharpening rule makes the image sharper by increasing the contrast of edges and details, which helps to highlight the details in the image and makes the picture clearer. The resolution processing rule raises low-resolution video to a higher resolution by advanced technologies such as deep learning; such super-resolution processing can recover more image details and improve the definition of the video. It should be noted that, before or after the video denoising rule is applied (and before the contrast is judged), the video data may also be defogged, that is, hazy video is processed to reduce the interference of haze and improve the definition of the image; a defogging algorithm usually analyzes the fog component in the image and removes or weakens it, thereby recovering a clear image. Motion compensation may also be performed on the video data: for video with rapid motion, the motion trajectory is analyzed by an algorithm and the image blur is corrected; motion compensation can compensate for the blur and distortion caused by motion, making the picture smoother and clearer. The above processing thereby ensures the video playing effect.
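As one simple illustration of how contrast adjustment can increase the dynamic range of an image, the sketch below applies linear min-max contrast stretching to an 8-bit grayscale frame. This technique is chosen here only for illustration and is not named in the application; the function name and the list-of-rows frame representation are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # rows of 8-bit grayscale pixel values (0 to 255)


def stretch_contrast(frame: Frame) -> Frame:
    """Linearly map the darkest pixel to 0 and the brightest pixel to 255."""
    pixels = [p for row in frame for p in row]
    if not pixels:
        return []
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat frame has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in frame]
    # Integer arithmetic avoids floating-point rounding surprises.
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in frame]
```

A frame whose values span only 50 to 200, such as [[50, 100], [150, 200]], is mapped to [[0, 85], [170, 255]], so the full 0 to 255 range is used.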
Further, based on the first embodiment, the second embodiment, and/or the third embodiment of the present application, a fourth embodiment of the machine learning-based audio/video quality enhancement method of the present application is provided, in this embodiment, referring to fig. 3, fig. 3 is a schematic flow diagram of the second embodiment of the machine learning-based audio/video quality enhancement method of the present application, where the machine learning-based audio/video quality enhancement method further includes:
Step S50, acquiring initial training audios and videos, and determining play scores corresponding to the initial training audios and videos, wherein the play scores comprise first score values for direct play;
Step S60, determining an audio noise reduction threshold value, an audio equalization threshold value and an echo cancellation threshold value corresponding to the audio data under the first grading value, and taking the audio noise reduction threshold value, the audio equalization threshold value and the echo cancellation threshold value as the basis for starting a preset first optimization rule;
Step S70, determining a video denoising threshold value, a contrast enhancement threshold value, a sharpening threshold value and a resolution processing threshold value corresponding to the video data under the first grading value, and taking the video denoising threshold value, the contrast enhancement threshold value, the sharpening threshold value and the resolution processing threshold value as the basis for starting a preset second optimization rule.
In this embodiment, in the machine learning stage, the basis for deciding whether subsequent audio and video needs to be processed, and how to process it, is determined from a plurality of initial training audios and videos. The initial training audios and videos are a plurality of different audio and video data, or a plurality of versions obtained by processing one piece of audio and video data in different ways, for example by applying different degrees of noise reduction, contrast, compiling and the like. For each piece of audio and video data, the score given by the user and the score given by the user's controller are collected, and the two score values are combined in different proportions to obtain the final play score. The play score includes the first score value for direct playing, that is, the score value when the audio and video is played directly without transmission, from which the influence of the data itself on the playing effect can be determined.
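The combination of the two score values in given proportions can be read as a weighted sum, sketched below. The function name play_score and the default weight of 0.6 are hypothetical; the application does not state the proportions used.

```python
def play_score(user_score: float, controller_score: float,
               user_weight: float = 0.6) -> float:
    """Combine the user's score and the controller's score by fixed proportions."""
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must lie between 0 and 1")
    return user_weight * user_score + (1.0 - user_weight) * controller_score
```

With equal proportions, a user score of 8 and a controller score of 6 give a play score of 7.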
The audio data and the video data are then processed separately. For audio data, the audio noise reduction threshold, the audio equalization threshold and the echo cancellation threshold corresponding to the audio data under the first score value, specifically under the highest first score value, are determined and used as the basis for starting the preset first optimization rule, so that it can subsequently be determined whether the preset first optimization rule needs to be started to optimize the audio data. For video data, the video denoising threshold, the contrast enhancement threshold, the sharpening threshold and the resolution processing threshold corresponding to the video data under the first score value, specifically under the highest first score value, are determined and used as the basis for starting the preset second optimization rule, so that it can subsequently be determined whether the preset second optimization rule needs to be started to optimize the video data. With these thresholds determined, quality enhancement processing is performed on the audio and video data based on them, thereby ensuring the audio and video playing effect.
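Step S60 can be sketched as selecting the training sample with the highest first score value and taking its measured values as the thresholds. The class AudioSample and the function derive_audio_thresholds are hypothetical, and reading the thresholds directly off the best-scoring sample is only one possible interpretation; the video thresholds of step S70 could be derived in the same way.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class AudioSample:
    """Measured values of one training audio and its first score value."""
    noise: float
    equalization: float
    echo: float
    first_score: float


def derive_audio_thresholds(samples: Sequence[AudioSample]) -> Dict[str, float]:
    """Use the values of the highest-scoring sample as the three thresholds."""
    if not samples:
        raise ValueError("at least one training sample is required")
    best = max(samples, key=lambda s: s.first_score)
    return {"noise": best.noise,
            "equalization": best.equalization,
            "echo": best.echo}
```

The returned dictionary uses the same keys as a threshold table for the first optimization rule, so it can serve directly as the basis for deciding when that rule is started.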
Further, the playing score further includes a second score value played after transmission, and after the step of determining the playing score corresponding to the initial training audio and video, the method includes:
step S51, determining a first corresponding relation between the audio coding mode and the data characteristics of the audio data under the second scoring value, and taking the first corresponding relation as an audio coding transmission library in a preset machine learning database;
step S52, determining a second corresponding relation between the video coding mode and the data feature of the video data under the second score value, and using the second corresponding relation as a video coding transmission library in a preset machine learning database.
In this embodiment, in addition to the effect of the data itself on the playing effect, the effect of transmission and encoding on audio/video playing is also considered, so the play score further includes a second score value for playing after transmission. For audio data, the first correspondence between the audio coding mode and the data characteristics of the audio data under the second score value, specifically under the highest second score value, is determined and used as the audio coding transmission library in the preset machine learning database. In this way, the audio coding mode and transmission mode uniquely corresponding to the data characteristics of different audio data can be determined, and the coding and transmission modes can be selected accordingly later, so as to reduce the influence of coding and transmission on audio playing. For video data, the second correspondence between the video coding mode and the data characteristics of the video data under the second score value, specifically under the highest second score value, is determined and used as the video coding transmission library in the preset machine learning database. In this way, the video coding mode and transmission mode uniquely corresponding to the data characteristics of different video data can be determined, and the coding and transmission modes can be selected accordingly later, so as to reduce the influence of coding and transmission on video playing.
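Steps S51 and S52 can be sketched as keeping, for each data characteristic, the coding mode that achieved the highest second score value. The function build_library, the record layout and the feature and mode labels are hypothetical; the application does not specify how the correspondence is stored.

```python
from typing import Dict, Iterable, Tuple

# Each record: (data characteristic, coding mode, second score value).
Record = Tuple[str, str, float]


def build_library(records: Iterable[Record]) -> Dict[str, str]:
    """Map each data characteristic to its best-scoring coding mode."""
    best: Dict[str, Tuple[str, float]] = {}
    for feature, mode, second_score in records:
        # Keep the mode only if it beats the best score seen for this feature.
        if feature not in best or second_score > best[feature][1]:
            best[feature] = (mode, second_score)
    return {feature: mode for feature, (mode, _) in best.items()}
```

Because only one mode is kept per characteristic, the resulting mapping is unique in the sense described above, and the same function can build either the audio or the video coding transmission library.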
It should be noted that the foregoing examples are only for understanding the present application, and do not constitute a limitation of the machine learning-based audio/video quality enhancement method of the present application, and that many forms of simple transformation based on this technical concept are within the scope of the present application.
The application also provides an audio and video quality enhancement device based on machine learning, referring to fig. 4, the audio and video quality enhancement device based on machine learning comprises:
The information acquisition module 10 is configured to determine audio and video data to be processed according to the initial audio and video information and preset machine learning parameters if the initial audio and video information to be processed is received;
The type judging module 20 is configured to determine audio and video processing data according to the audio and video data to be processed and a preset machine learning database, where the audio and video processing data includes audio processing data and video processing data;
The first obstacle avoidance module 30 is configured to determine, when the audio and video processing data is the audio processing data, audio quality enhancement data according to the audio processing data and a preset first optimization rule;
and the second obstacle avoidance module 40 is configured to determine video quality enhancement data according to the video processing data and a preset second optimization rule when the audio and video processing data is the video processing data.
The audio and video quality enhancement device based on machine learning provided by the application can solve the technical problem of poor audio and video playing effect by adopting the audio and video quality enhancement method based on machine learning in the embodiment. Compared with the prior art, the audio and video quality enhancement device based on machine learning has the same beneficial effects as the audio and video quality enhancement method based on machine learning provided by the embodiment, and other technical features in the audio and video quality enhancement device based on machine learning are the same as the features disclosed by the method of the embodiment, and are not repeated herein.
The application provides an audio and video quality enhancement device (which can be a part of a cleaning robot) based on machine learning, comprising at least one processor and a memory in communication with the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the audio and video quality enhancement method based on machine learning in the first embodiment. It should be noted that other devices shared with the cleaning robot, such as a power supply, may also be present in the machine learning-based audio/video quality enhancement device, and will not be described here.
Referring now to fig. 5, a schematic diagram of a controller suitable for implementing embodiments of the present application is shown. The controller in the embodiment of the present application may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player) or a car-mounted terminal (e.g., a car navigation terminal), and a fixed terminal such as a digital TV or a desktop computer. The controller shown in fig. 5 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present application.
As shown in fig. 5, the controller may include a processing device 1001 (e.g., a central processing unit, a graphics processor, etc.) that may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage device 1003 into a random access memory (RAM) 1004. The RAM 1004 also stores various programs and data required for the operation of the controller. The processing device 1001, the ROM 1002, and the RAM 1004 are connected to each other by a bus 1005. An input/output (I/O) interface 1006 is also connected to the bus 1005. In general, the following devices may be connected to the I/O interface 1006: input devices 1007 such as a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 1008 such as a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 1003 such as a magnetic tape, hard disk, etc.; and communication devices 1009. The communication devices 1009 may allow the controller to communicate with other devices wirelessly or by wire to exchange data. While a controller having various devices is shown in the figures, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through a communication device, or installed from the storage device 1003, or installed from the ROM 1002. When the computer program is executed by the processing device 1001, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.
The controller provided by the present application adopts the machine learning-based audio and video quality enhancement method of the above embodiments and can therefore solve the technical problem of poor audio and video playback effect. Compared with the prior art, the beneficial effects of the controller are the same as those of the machine learning-based audio and video quality enhancement method provided by the above embodiments, and the other technical features of the controller are the same as those disclosed for that method; they are not repeated here.
It is to be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the description of the above embodiments, particular features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing is merely a specific implementation of the present application, and the protection scope of the present application is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive of falls within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
The present application provides a computer-readable storage medium having computer-readable program instructions (i.e., a computer program) stored thereon for performing the machine learning-based audio-video quality enhancement method in the above-described embodiments.
The computer-readable storage medium provided by the present application may be, for example, a USB flash disk, and may also be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this embodiment, the computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable storage medium may be transmitted using any appropriate medium, including but not limited to electrical wire, optical fiber cable, RF (radio frequency), or any suitable combination of the foregoing.
The computer readable storage medium may be included in the controller or may exist alone without being assembled into the controller.
The computer-readable storage medium carries one or more programs that, when executed by the controller, cause the controller to:
if initial audio and video information to be processed is received, determine the audio and video data to be processed according to the initial audio and video information and preset machine learning parameters;
determine audio and video processing data according to the audio and video data to be processed and a preset machine learning database, wherein the audio and video processing data comprise audio processing data and video processing data;
when the audio and video processing data are the audio processing data, determine audio quality enhancement data according to the audio processing data and a preset first optimization rule;
and when the audio and video processing data are the video processing data, determine video quality enhancement data according to the video processing data and a preset second optimization rule.
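The four steps above can be read as a simple dispatch pipeline. The following is only a minimal sketch of that control flow under stated assumptions: the application does not specify data types, the contents of the machine learning parameters or database, or the optimization rules, so every name here (`ProcessingData`, `determine_pending`, `determine_processing_data`, `enhance`, the `max_samples` parameter, and the per-kind gain lookup) is a hypothetical placeholder rather than the claimed implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Samples = List[float]
Rule = Callable[[Samples], Samples]  # an "optimization rule" modeled as a function


@dataclass
class ProcessingData:
    kind: str         # "audio" or "video" (hypothetical tag)
    samples: Samples  # hypothetical payload


def determine_pending(initial_info: Optional[dict], ml_params: dict) -> Optional[dict]:
    """Step 1: proceed only if initial information was received."""
    if initial_info is None:
        return None
    # Assumed use of a preset parameter: cap the number of samples handled.
    limit = ml_params.get("max_samples", len(initial_info["samples"]))
    return {"kind": initial_info["kind"], "samples": initial_info["samples"][:limit]}


def determine_processing_data(pending: dict, ml_database: Dict[str, float]) -> ProcessingData:
    """Step 2: derive processing data using a lookup in the preset database."""
    # Assumed database content: one gain factor per media kind.
    gain = ml_database.get(pending["kind"], 1.0)
    return ProcessingData(pending["kind"], [s * gain for s in pending["samples"]])


def enhance(initial_info: Optional[dict], ml_params: dict,
            ml_database: Dict[str, float],
            first_rule: Rule, second_rule: Rule) -> Optional[Samples]:
    """Steps 3 and 4: apply the first rule to audio, the second rule to video."""
    pending = determine_pending(initial_info, ml_params)
    if pending is None:
        return None
    data = determine_processing_data(pending, ml_database)
    if data.kind == "audio":
        return first_rule(data.samples)
    if data.kind == "video":
        return second_rule(data.samples)
    raise ValueError("unknown media kind: " + data.kind)
```

A caller would supply its own rules, for example a clamping function as the first (audio) rule and a scaling function as the second (video) rule; keeping the rules as plain callables keeps the dispatch logic independent of whatever optimization is actually chosen.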
Computer program code for carrying out operations of the present application may be written in one or more programming languages or any combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based devices which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
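As an illustration of the point that two blocks shown in succession may be executed substantially concurrently, the sketch below runs two independent blocks either one after the other or in parallel threads. The block functions (`audio_block`, `video_block`) and their clamping behavior are hypothetical stand-ins, not operations taken from the application; the only assumption that matters is that the two blocks do not depend on each other's output.

```python
from concurrent.futures import ThreadPoolExecutor


def audio_block(samples):
    # Hypothetical audio block: clamp each sample to the range [-1.0, 1.0].
    return [max(-1.0, min(1.0, s)) for s in samples]


def video_block(pixels):
    # Hypothetical video block: clamp each pixel value to the range 0..255.
    return [max(0, min(255, p)) for p in pixels]


def run_sequential(samples, pixels):
    # Blocks executed in the order drawn in a flowchart.
    return audio_block(samples), video_block(pixels)


def run_concurrent(samples, pixels):
    # The same two blocks submitted to a thread pool; because they are
    # independent, the results match the sequential order of execution.
    with ThreadPoolExecutor(max_workers=2) as pool:
        audio_future = pool.submit(audio_block, samples)
        video_future = pool.submit(video_block, pixels)
        return audio_future.result(), video_future.result()
```

Reordering or parallelizing blocks in this way is only safe when, as here, neither block reads data produced by the other.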
The modules involved in the embodiments of the present application may be implemented in software or in hardware. In some cases, the name of a module does not constitute a limitation on the module itself.
The readable storage medium provided by the present application is a computer-readable storage medium that stores computer-readable program instructions (i.e., a computer program) for executing the machine learning-based audio and video quality enhancement method, and can therefore solve the technical problem of poor audio and video playback effect. Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by the present application are the same as those of the machine learning-based audio and video quality enhancement method provided by the above embodiments, and are not repeated here.
The application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the machine learning based audio/video quality enhancement method as described above.
The computer program product provided by the present application can solve the technical problem of poor audio and video playback effect. Compared with the prior art, the beneficial effects of the computer program product provided by the present application are the same as those of the machine learning-based audio and video quality enhancement method provided by the above embodiments, and are not repeated here.
The foregoing describes only some embodiments of the present application and is not intended to limit the scope of the present application. All equivalent structural changes made using the description and the accompanying drawings under the technical concept of the present application, or direct or indirect applications in other related technical fields, are included in the protection scope of the present application.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411815572.5A CN119296558A (en) | 2024-12-11 | 2024-12-11 | Audio and video quality enhancement method, device, equipment and medium based on machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN119296558A true CN119296558A (en) | 2025-01-10 |
Family
ID=94165833
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411815572.5A Pending CN119296558A (en) | 2024-12-11 | 2024-12-11 | Audio and video quality enhancement method, device, equipment and medium based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN119296558A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104094312A (en) * | 2011-12-09 | 2014-10-08 | 英特尔公司 | Control of video processing algorithms based on measured perceptual quality characteristics |
US20170330579A1 (en) * | 2015-05-12 | 2017-11-16 | Tencent Technology (Shenzhen) Company Limited | Method and device for improving audio processing performance |
CN112672157A (en) * | 2020-12-22 | 2021-04-16 | 广州博冠信息科技有限公司 | Video encoding method, device, equipment and storage medium |
CN112906463A (en) * | 2021-01-15 | 2021-06-04 | 上海东普信息科技有限公司 | Image-based fire detection method, device, equipment and storage medium |
CN113747257A (en) * | 2021-08-31 | 2021-12-03 | 安徽创变信息科技有限公司 | Audio and video data acquisition method and system |
CN117459716A (en) * | 2023-10-23 | 2024-01-26 | 合肥联宝信息技术有限公司 | Digital signal testing methods, devices, equipment and storage media |
CN118573959A (en) * | 2024-05-28 | 2024-08-30 | 重庆平可杰信息技术有限公司 | Audio and video data acquisition method and system based on 5G terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||