
CN112770116B - Method for extracting video key frame by using video compression coding information - Google Patents


Info

Publication number
CN112770116B
Authority
CN
China
Prior art keywords
video
frame
shot
extracting
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011642920.5A
Other languages
Chinese (zh)
Other versions
CN112770116A (en)
Inventor
艾达
梁嘉倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Posts and Telecommunications
Original Assignee
Xian University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Posts and Telecommunications
Priority to CN202011642920.5A
Publication of CN112770116A
Application granted
Publication of CN112770116B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a picture, frame or field
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method for extracting video key frames using video compression coding information consists of three steps: extracting depth and frame bit number features, detecting shot switches, and extracting key frames. The invention uses two compressed-domain features of the video bitstream, the coding unit (CU) depth information and the number of bits per frame, to detect shot switches, obtain shot segments, and extract key frames. Because the video is processed in the compressed domain without decompression, the amount of computation is reduced and the processing time is shortened. Experimental results show that, compared with an existing method, the accuracy improves by 12.1%, the recall by 5.3%, and the F value by 8.4%, and the extracted key frames represent the main content of the original video well. The method has a small computational load, high efficiency, high accuracy, and fast processing, and can be used for video image processing.

Description

Method for extracting video key frame by using video compression coding information
Technical Field
The invention belongs to the technical field of digital video retrieval, and particularly relates to a method for extracting video key frames by using video compression coding information.
Background
With the rapid development of multimedia and network technology, the volume of video data is growing rapidly, and how to manage videos effectively and quickly obtain the important information in them has become a research hotspot. Against this background, key frame extraction has become an effective approach: extracting key frames greatly reduces the amount of video data while still representing the important information of the original video, which saves retrieval time and improves video retrieval efficiency.
Researchers in China and abroad have studied key frame extraction extensively. According to the video data being processed, the methods can be divided into pixel-domain key frame extraction and compressed-domain key frame extraction. Pixel-domain methods operate after the video has been fully decompressed; they are computationally expensive and inefficient, and struggle to meet real-time requirements. Compressed-domain techniques work directly on the compressed video data, whose volume is small, and process the video without decompression or with only partial decompression, which can greatly increase processing speed. Key frame extraction in the compressed domain has therefore attracted wide attention.
Ali Reza et al. proposed a key frame extraction method in the H.265/HEVC compressed domain. It uses normalized histograms of the intra-frame prediction modes extracted from H.265/HEVC coded video to detect similar frames, groups the similar frames with fuzzy c-means clustering, and extracts key frames. Zhu Zhiming et al. proposed a key frame extraction method for video summarization in the video coding compressed domain. At the decoder, it counts the luma prediction modes of intra-coded prediction unit (PU) blocks to build a mode feature vector, clusters the mode feature vectors with an adaptive clustering algorithm that incorporates the iterative self-organizing data analysis algorithm (ISODATA) to obtain candidate key frames, and then filters the candidates by similarity to remove redundant frames and obtain the final key frames.
What these methods have in common is that they use intra-frame prediction mode values as the feature, and their experiments cover only the all-intra configuration. As a result, video frames are processed slowly, processing takes a long time, and the methods are of limited practical use.
Disclosure of Invention
The technical problem to be solved by the present invention is to overcome the disadvantages of the above video frame processing methods and to provide a method for extracting video key frames using video compression coding information that requires no decoding, has a small computational load, processes video quickly, and extracts key frames efficiently.
The technical solution adopted to solve this problem comprises the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): image BDA0002880332880000021, not reproduced in this text]
where $D_{x,y}$ and $R_{x,y}$ denote the distortion and the number of coded bits, respectively, of the pixel at coordinates (x, y) in the coding unit, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ is the Lagrangian coefficient with λ ≥ 0, and W and H are finite positive integers with W > H.
Determine the depth feature vector $F_n$ of the coded frame according to equation (2):
$F_n = \{f_1, f_2, \ldots, f_\alpha\}$ (2)
[Formula image BDA0002880332880000022, not reproduced in this text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer, round() is a round-up function, and $f_1, f_2, \ldots, f_\alpha$ are the depth values of the coding units, each taking one of the values 0, 1, 2, or 3.
Determine the frame bit number $R_n$ according to equation (3):
[Equation (3): image BDA0002880332880000023, not reproduced in this text]
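The two quantities in step (1) can be illustrated with a minimal Python sketch. The formula images for equations (1) and (3) are not available in this text, so the forms used here are assumptions: the rate-distortion cost is taken to be the standard Lagrangian sum of $D_{x,y} + \lambda R_{x,y}$ over all pixels, and the frame bit number is taken to be the sum of $R_{x,y}$ over the frame. The function names `rd_cost` and `frame_bits` and the sample values are hypothetical.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def rd_cost(distortion: Matrix, bits: Matrix, lam: float) -> float:
    """Assumed form of Eq. (1): sum over pixels of D[x][y] + lam * R[x][y]."""
    if lam < 0:
        raise ValueError("the Lagrangian coefficient must be non-negative")
    return sum(
        d + lam * r
        for d_row, r_row in zip(distortion, bits)
        for d, r in zip(d_row, r_row)
    )


def frame_bits(bits: Matrix) -> float:
    """Assumed form of Eq. (3): total number of coded bits in one frame."""
    return sum(sum(row) for row in bits)


# Hypothetical 2 x 2 "frame": per-pixel distortion D and coded bits R.
D = [[1.0, 2.0], [0.5, 0.5]]
R = [[4, 2], [1, 1]]
print(rd_cost(D, R, lam=0.5))  # 8.0
print(frame_bits(R))           # 8
```

In practice these values would be read from the encoder or bitstream parser rather than recomputed per pixel; the sketch only shows what the symbols stand for.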
(2) Shot switching detection
Count the frame bit number $R_n$ of each coded frame and plot it as a line graph for analysis. Mark each turning point where the bit count first gradually increases and then gradually decreases as a shot switch. The frames between two adjacent shot switches form one shot segment of length M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer.
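The text describes shot switching detection as reading turning points off a line graph of $R_n$. A minimal automated sketch of the same idea is shown below, under stated assumptions: a turning point is treated as a local maximum of the bit count, and a hypothetical threshold `factor` (not in the source) filters out small fluctuations. Counting the segments before the first switch and after the last switch is also an assumption; the text only mentions segments between adjacent switches.

```python
from statistics import mean
from typing import List, Sequence


def detect_shot_switches(bits: Sequence[float], factor: float = 2.0) -> List[int]:
    """Return frame indices where the bit count rises and then falls.

    `factor` is a hypothetical threshold: a peak is kept only if it exceeds
    factor * (mean bit count), which suppresses small fluctuations.
    """
    threshold = factor * mean(bits)
    return [
        n
        for n in range(1, len(bits) - 1)
        if bits[n - 1] < bits[n] >= bits[n + 1] and bits[n] > threshold
    ]


def shot_lengths(switches: Sequence[int], total_frames: int) -> List[int]:
    """Lengths M of the segments delimited by the detected switches."""
    bounds = [0, *switches, total_frames]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical bit counts for 12 frames with two obvious peaks.
bits = [10, 11, 9, 10, 80, 12, 10, 11, 95, 13, 9, 10]
switches = detect_shot_switches(bits)
print(switches)                           # [4, 8]
print(shot_lengths(switches, len(bits)))  # [4, 4, 4]
```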
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its supporting definitions: images BDA0002880332880000024, BDA0002880332880000031, BDA0002880332880000032, BDA0002880332880000033, and BDA0002880332880000034, not reproduced in this text]
where $F_i$ and $F_j$ denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
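Because the images that define equation (4) are not available in this text, the exact construction of L cannot be confirmed. The sketch below shows one common construction as an illustration only: a Gaussian affinity between depth feature vectors, a diagonal degree matrix D, and the unnormalized Laplacian L = D - A. The kernel, the bandwidth `sigma`, and the function name `graph_laplacian` are all assumptions, not the patent's stated method.

```python
import math
from typing import List, Sequence, Tuple


def graph_laplacian(
    features: Sequence[Sequence[float]], sigma: float = 1.0
) -> Tuple[List[List[float]], List[float]]:
    """Return (L, degree) with L = D - A for an assumed Gaussian affinity A.

    features[i] plays the role of the depth feature vector F_i of frame i.
    """
    n = len(features)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                A[i][j] = math.exp(-sq / (2.0 * sigma ** 2))
    degree = [sum(row) for row in A]  # diagonal entries of D
    L = [
        [(degree[i] if i == j else 0.0) - A[i][j] for j in range(n)]
        for i in range(n)
    ]
    return L, degree


# Hypothetical depth feature vectors (CU depths in 0..3) for three frames.
F = [[0, 0, 1], [0, 0, 1], [3, 3, 2]]
L, degree = graph_laplacian(F)
print(L[0][1])  # -1.0, because frames 0 and 1 have identical features
```

Each row of L sums to zero by construction, which is a quick sanity check on any implementation of this form.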
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
$L y = \beta D y$ (5)
$Y = [y_1, y_2, \ldots, y_K]$ (6)
where $y_1, y_2, \ldots, y_K$ are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues.
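One way to read equation (5), offered as a sketch rather than as the patent's own derivation: if D is a diagonal matrix with strictly positive diagonal entries (an assumption, since its definition is not available in this text), the generalized eigenproblem reduces to an ordinary one, symmetric whenever L is symmetric, through the substitution $z = D^{1/2} y$:

```latex
L y = \beta D y
\;\Longleftrightarrow\;
\left( D^{-1/2} L D^{-1/2} \right) z = \beta z,
\qquad z = D^{1/2} y .
```

Each solution z maps back to $y = D^{-1/2} z$. In spectral clustering the "first K" eigenvalues are usually the K smallest; the text does not state the ordering, so that reading is also an assumption.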
Perform k-means clustering on the matrix Y, and determine the distance $d_m$ between the cluster center μ and every frame in the shot according to equation (7):
$d_m = \|y_m - \mu\|_2$ (7)
where m ∈ {1, 2, …, M}, M is the length of each shot segment, M is a finite positive integer, and M < N.
The frame with the smallest distance $d_m$ is taken as the key frame.
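The selection rule of equation (7) can be sketched in a few lines of Python. The text does not spell out how the k-means cluster centers are matched to shots, so this sketch makes a simplifying assumption: the center μ of a shot is the mean of that shot's own rows of Y. The function name `key_frame` and the sample matrix are hypothetical.

```python
import math
from typing import Sequence


def key_frame(rows: Sequence[Sequence[float]], start: int, length: int) -> int:
    """Index of the frame in [start, start + length) closest to the shot center.

    Assumption: mu is the mean of the shot's own rows of Y, standing in for
    the k-means cluster center described in the text.
    """
    shot = rows[start:start + length]
    dim = len(shot[0])
    mu = [sum(r[k] for r in shot) / len(shot) for k in range(dim)]
    dists = [math.dist(r, mu) for r in shot]  # d_m = ||y_m - mu||
    return start + dists.index(min(dists))


# Hypothetical embedding Y: six frames, two shots of three frames each.
Y = [[0.0, 0.0], [0.2, 0.0], [0.1, 0.1], [5.0, 5.0], [5.0, 6.0], [5.0, 5.4]]
print(key_frame(Y, 0, 3), key_frame(Y, 3, 3))  # 2 5
```

Whether the norm in equation (7) is squared or not does not change which frame is selected, since both are minimized by the same frame.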
In step (1), extracting depth and frame bit number features, W ranges from 176 to 7680, H ranges from 144 to 4320, and N ranges from 1000 to 7000.
In step (2), shot switching detection, K ranges from 5 to 20.
The invention uses two compressed-domain features of the video bitstream, the CU depth values and the number of bits per frame, to detect shot switches, obtain shot segments, and extract key frames. Because the video is processed in the compressed domain without decompression, the amount of computation is reduced and the processing time is shortened. Experimental results show that, compared with an existing method, the accuracy improves by 12.1%, the recall by 5.3%, and the F value by 8.4%, and the extracted key frames represent the main content of the original video well. The method has a small computational load, high efficiency, high accuracy, and fast processing, and can be used for video image processing.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following drawings and examples, but the present invention is not limited to these examples.
Example 1
Taking the video sequence "A New Horizon, segment 02" from the international VSUMM dataset as an example, the method of this embodiment for extracting video key frames using video compression coding information comprises the following steps (see FIG. 1):
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): image BDA0002880332880000041, not reproduced in this text]
where $D_{x,y}$ and $R_{x,y}$ denote the distortion and the number of coded bits, respectively, of the pixel at coordinates (x, y) in the coding unit, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ is the Lagrangian coefficient with λ ≥ 0, and W and H are finite positive integers with W > H. In this embodiment, W is 352 and H is 240.
Determine the depth feature vector $F_n$ of the coded frame according to equation (2):
$F_n = \{f_1, f_2, \ldots, f_\alpha\}$ (2)
[Formula image BDA0002880332880000042, not reproduced in this text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer (N is 1797 in this embodiment), round() is a round-up function, and $f_1, f_2, \ldots, f_\alpha$ are the depth values of the coding units, each taking one of the values 0, 1, 2, or 3; the specific depth values depend on the frame index n.
Determine the frame bit number $R_n$ according to equation (3):
[Equation (3): image BDA0002880332880000043, not reproduced in this text]
(2) Shot switching detection
Count the frame bit number $R_n$ of each coded frame and plot it as a line graph for analysis. Mark each turning point where the bit count first gradually increases and then gradually decreases as a shot switch. The frames between two adjacent shot switches form one shot segment of length M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer. In this embodiment, K is 13, and the shot lengths M are 376, 232, 128, 108, 80, 76, 72, 80, 116, 120, 68, 72, and 108.
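As a small worked example, the shot lengths listed for this embodiment can be turned into frame index ranges with a cumulative sum. The sketch assumes, hypothetically, that the segments are contiguous and start at frame 0; the text does not state where each segment begins.

```python
from itertools import accumulate

N = 1797  # total number of frames in this embodiment
M = [376, 232, 128, 108, 80, 76, 72, 80, 116, 120, 68, 72, 108]  # shot lengths

ends = list(accumulate(M))         # exclusive end index of each segment
starts = [0, *ends[:-1]]           # assumed start index of each segment
segments = list(zip(starts, ends))

print(len(M))        # 13 segments, matching K
print(ends[-1])      # 1636 frames covered by the listed lengths
print(N - ends[-1])  # 161 frames not covered by the listed lengths
```

The listed lengths sum to 1636, which is less than N = 1797; the text does not say how the remaining frames are assigned.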
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its supporting definitions: images BDA0002880332880000051, BDA0002880332880000052, BDA0002880332880000053, BDA0002880332880000054, and BDA0002880332880000055, not reproduced in this text]
where $F_i$ and $F_j$ denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
$L y = \beta D y$ (5)
$Y = [y_1, y_2, \ldots, y_K]$ (6)
where $y_1, y_2, \ldots, y_K$ are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues. K here has the same value as in step (2), and N has the same value as in step (1).
Perform k-means clustering on the matrix Y, and determine the distance $d_m$ between the cluster center μ and every frame in the shot according to equation (7):
$d_m = \|y_m - \mu\|_2$ (7)
where m ∈ {1, 2, …, M}, M is the length of each shot segment, M is a finite positive integer, and M < N. The values of M are the same as in step (2).
The frame with the smallest distance $d_m$ is taken as the key frame.
Example 2
Taking the video sequence "ocean floor Legacy" as an example, the method of this embodiment for extracting video key frames using video compression coding information comprises the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): image BDA0002880332880000061, not reproduced in this text]
where $D_{x,y}$ and $R_{x,y}$ denote the distortion and the number of coded bits, respectively, of the pixel at coordinates (x, y) in the coding unit, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ is the Lagrangian coefficient with λ ≥ 0, and W and H are finite positive integers with W > H. In this embodiment, W is 176 and H is 144.
Determine the depth feature vector $F_n$ of the coded frame according to equation (2):
$F_n = \{f_1, f_2, \ldots, f_\alpha\}$ (2)
[Formula image BDA0002880332880000062, not reproduced in this text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer (N is 1000 in this embodiment), round() is a round-up function, and $f_1, f_2, \ldots, f_\alpha$ are the depth values of the coding units, each taking one of the values 0, 1, 2, or 3; the specific depth values depend on the frame index n.
Determine the frame bit number $R_n$ according to equation (3):
[Equation (3): image BDA0002880332880000063, not reproduced in this text]
(2) Shot switching detection
Count the frame bit number $R_n$ of each coded frame and plot it as a line graph for analysis. Mark each turning point where the bit count first gradually increases and then gradually decreases as a shot switch. The frames between two adjacent shot switches form one shot segment of length M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer. In this embodiment, K is 5, and the shot lengths M are 336, 216, 112, 96, and 296.
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its supporting definitions: images BDA0002880332880000064, BDA0002880332880000065, BDA0002880332880000066, BDA0002880332880000071, and BDA0002880332880000072, not reproduced in this text]
where $F_i$ and $F_j$ denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
$L y = \beta D y$ (5)
$Y = [y_1, y_2, \ldots, y_K]$ (6)
where $y_1, y_2, \ldots, y_K$ are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues. K here has the same value as in step (2), and N has the same value as in step (1).
Perform k-means clustering on the matrix Y, and determine the distance $d_m$ between the cluster center μ and every frame in the shot according to equation (7):
$d_m = \|y_m - \mu\|_2$ (7)
where m ∈ {1, 2, …, M}, M is the length of each shot segment, M is a finite positive integer, and M < N. The values of M are the same as in step (2).
The frame with the smallest distance $d_m$ is taken as the key frame.
Example 3
Taking the video sequence "exceptional Terrane" as an example, the method of this embodiment for extracting video key frames using video compression coding information comprises the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): image BDA0002880332880000073, not reproduced in this text]
where $D_{x,y}$ and $R_{x,y}$ denote the distortion and the number of coded bits, respectively, of the pixel at coordinates (x, y) in the coding unit, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ is the Lagrangian coefficient with λ ≥ 0, and W and H are finite positive integers with W > H. In this embodiment, W is 7680 and H is 4320.
Determine the depth feature vector $F_n$ of the coded frame according to equation (2):
$F_n = \{f_1, f_2, \ldots, f_\alpha\}$ (2)
[Formula image BDA0002880332880000074, not reproduced in this text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer (N is 7000 in this embodiment), round() is a round-up function, and $f_1, f_2, \ldots, f_\alpha$ are the depth values of the coding units, each taking one of the values 0, 1, 2, or 3; the specific depth values depend on the frame index n.
Determine the frame bit number $R_n$ according to equation (3):
[Equation (3): image BDA0002880332880000081, not reproduced in this text]
(2) Shot switching detection
Count the frame bit number $R_n$ of each coded frame and plot it as a line graph for analysis. Mark each turning point where the bit count first gradually increases and then gradually decreases as a shot switch. The frames between two adjacent shot switches form one shot segment of length M, where M is a finite positive integer and M < N. This yields K shot segments, where K is a finite positive integer. In this embodiment, K is 20, and the shot lengths M are 156, 196, 596, 1068, 316, 452, 196, 96, 468, 240, 496, 176, 152, 376, 192, 112, 412, 336, 240, and 396.
(3) Extracting key frames
Determine the Laplacian matrix L according to equation (4):
[Equation (4) and its supporting definitions: images BDA0002880332880000082, BDA0002880332880000083, BDA0002880332880000084, BDA0002880332880000085, and BDA0002880332880000086, not reproduced in this text]
where $F_i$ and $F_j$ denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N}.
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
$L y = \beta D y$ (5)
$Y = [y_1, y_2, \ldots, y_K]$ (6)
where $y_1, y_2, \ldots, y_K$ are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues. K here has the same value as in step (2), and N has the same value as in step (1).
Perform k-means clustering on the matrix Y, and determine the distance $d_m$ between the cluster center μ and every frame in the shot according to equation (7):
$d_m = \|y_m - \mu\|_2$ (7)
where m ∈ {1, 2, …, M}, M is the length of each shot segment, M is a finite positive integer, and M < N. The values of M are the same as in step (2).
The frame with the smallest distance $d_m$ is taken as the key frame.
To verify the beneficial effects of the present invention, the inventors ran a comparison experiment between the method of Embodiment 1 and an HEVC intra-frame-based compressed-domain video summarization method (hereinafter "comparison document 1"). Accuracy, recall, and F value were used as comprehensive indicators of video summary quality. The experimental and calculated results are shown in Table 1.
The accuracy is determined as follows:
[Accuracy formula: image BDA0002880332880000091, not reproduced in this text]
where $N_m$ is the number of key frames extracted by the method under test that match the user summary, and $N_{AS}$ is the number of key frames extracted by the method under test.
The recall is determined as follows:
[Recall formula: image BDA0002880332880000092, not reproduced in this text]
where $N_{US}$ is the number of key frames in the user summary.
The F value is determined as follows:
[F value formula: image BDA0002880332880000093, not reproduced in this text]
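The three indicators can be sketched in Python under stated assumptions. The formula images are not available in this text, so the forms below follow the symbol definitions above and standard usage: accuracy as $N_m / N_{AS}$ (what is usually called precision), recall as $N_m / N_{US}$, and the F value as the harmonic mean of the two. The function name `summary_scores` and the sample counts are hypothetical and are not results from the patent.

```python
from typing import Tuple


def summary_scores(n_match: int, n_as: int, n_us: int) -> Tuple[float, float, float]:
    """Return (accuracy, recall, F value) for one video summary.

    n_match: extracted key frames that match the user summary (N_m)
    n_as:    key frames extracted by the method under test (N_AS)
    n_us:    key frames in the user summary (N_US)
    """
    accuracy = n_match / n_as if n_as else 0.0
    recall = n_match / n_us if n_us else 0.0
    total = accuracy + recall
    f_value = 2 * accuracy * recall / total if total else 0.0
    return accuracy, recall, f_value


# Hypothetical counts, for illustration only.
p, r, f = summary_scores(n_match=8, n_as=10, n_us=16)
print(p, r, round(f, 4))  # 0.8 0.5 0.6154
```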
Table 1. Experimental results
[Table 1: image BDA0002880332880000094, not reproduced in this text]
As can be seen from Table 1, the method of the present invention performs markedly better than the method of comparison document 1: the accuracy improves by 12.1%, the recall by 5.3%, and the F value by 8.4%.

Claims (3)

1. A method for extracting video key frames using video compression coding information, characterized by consisting of the following steps:
(1) Extracting depth and frame bit number features
Determine the rate-distortion cost J of the coding unit according to equation (1):
[Equation (1): image FDA0003330975010000011, not reproduced in this text]
where $D_{x,y}$ and $R_{x,y}$ denote the distortion and the number of coded bits, respectively, of the pixel at coordinates (x, y) in the coding unit, x ∈ {1, 2, …, H}, y ∈ {1, 2, …, W}, W × H is the video resolution, λ is the Lagrangian coefficient with λ ≥ 0, and W and H are finite positive integers with W > H;
Determine the depth feature vector $F_n$ of the coded frame according to equation (2):
$F_n = \{f_1, f_2, \ldots, f_\alpha\}$ (2)
[Formula image FDA0003330975010000012, not reproduced in this text]
where n denotes the n-th coded frame of the video, n ∈ {1, 2, …, N}, N is the total number of frames in the video and is a finite positive integer, round() is a round-up function, and $f_1, f_2, \ldots, f_\alpha$ are the depth values of the coding units, each taking any one of the values 0, 1, 2, and 3;
Determine the frame bit number $R_n$ according to equation (3):
[Equation (3): image FDA0003330975010000013, not reproduced in this text]
(2) Shot switching detection
Count the frame bit number $R_n$ of each coded frame and plot a line graph for analysis; mark each turning point where the value first gradually increases and then gradually decreases as a shot switch; the frames between two adjacent shot switches form one shot segment of length M, where M is a finite positive integer and M < N; K shot segments are obtained, where K is a finite positive integer;
(3) Extracting key frames
Determine the graph Laplacian matrix L according to equation (4):
[Equation (4) and its supporting definitions: images FDA0003330975010000014, FDA0003330975010000021, FDA0003330975010000022, FDA0003330975010000023, and FDA0003330975010000024, not reproduced in this text]
where $F_i$ and $F_j$ denote the depth feature vectors of the i-th and j-th coded frames, respectively, i ∈ {1, 2, …, N}, j ∈ {1, 2, …, N};
Determine the eigenvectors y corresponding to the first K eigenvalues of L according to equation (5), and construct the N × K matrix Y according to equation (6):
$L y = \beta D y$ (5)
$Y = [y_1, y_2, \ldots, y_K]$ (6)
where $y_1, y_2, \ldots, y_K$ are, in order, the N × 1 eigenvectors corresponding to the first K eigenvalues;
Perform k-means clustering on the matrix Y, and determine the distance $d_m$ between the cluster center μ and every other frame in the shot according to equation (7):
$d_m = \|y_m - \mu\|_2$ (7)
where m ∈ {1, 2, …, M}, M is the length of each shot segment, M is a finite positive integer, and M < N;
The frame with the smallest distance $d_m$ is taken as the key frame.
2. The method for extracting video key frames using video compression coding information according to claim 1, characterized in that: in step (1), extracting depth and frame bit number features, W ranges from 176 to 7680, H ranges from 144 to 4320, and N ranges from 1000 to 7000.
3. The method for extracting video key frames using video compression coding information according to claim 1, characterized in that: in step (2), shot switching detection, K ranges from 5 to 20.
CN202011642920.5A 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information Active CN112770116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011642920.5A CN112770116B (en) 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011642920.5A CN112770116B (en) 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information

Publications (2)

Publication Number Publication Date
CN112770116A CN112770116A (en) 2021-05-07
CN112770116B true CN112770116B (en) 2021-12-07

Family

ID=75698646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011642920.5A Active CN112770116B (en) 2020-12-31 2020-12-31 Method for extracting video key frame by using video compression coding information

Country Status (1)

Country Link
CN (1) CN112770116B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114697761B (en) * 2022-04-07 2024-02-13 脸萌有限公司 Processing method, processing device, terminal equipment and medium
CN116723335B (en) * 2023-06-29 2024-06-18 西安邮电大学 Method of extracting video key frames using video compression coding information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101453649A (en) * 2008-12-30 2009-06-10 浙江大学 Key frame extracting method for compression domain video stream
CN108632625A (en) * 2017-03-21 2018-10-09 华为技术有限公司 A kind of method for video coding, video encoding/decoding method and relevant device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE9902328A0 (en) * 1999-06-18 2000-12-19 Ericsson Telefon Ab L M Procedure and system for generating summary video
EP2257057B1 (en) * 2008-03-19 2019-05-08 National University Corporation Hokkaido University Dynamic image search device and dynamic image search program
GB201515415D0 (en) * 2015-08-29 2015-10-14 Univ Warwick Image compression
CN105979267A (en) * 2015-12-03 2016-09-28 乐视致新电子科技(天津)有限公司 Video compression and play method and device
CN111984942B (en) * 2020-07-23 2023-10-27 西安理工大学 Robust video zero watermarking method based on polar complex exponential transformation and residual neural network

Also Published As

Publication number Publication date
CN112770116A (en) 2021-05-07

Similar Documents

Publication Publication Date Title
CN107657228B (en) Video scene similarity analysis method and system, video encoding and decoding method and system
CN104954791B (en) Key frame real-time selection method in the wireless distributed Video coding of mine
CN101374234B (en) A content-based video copy monitoring method and device
Liu et al. Key frame extraction from MPEG video stream
CN101394522A (en) Method and system for detecting video copy
CN103065153A (en) Video key frame extraction method based on color quantization and clusters
CN102917225B (en) HEVC intraframe coding unit fast selecting method
CN109104609B (en) A Shot Boundary Detection Method Fusing HEVC Compression Domain and Pixel Domain
Duan et al. Compact descriptors for visual search
CN106231214A (en) High-speed cmos sensor image based on adjustable macro block approximation lossless compression method
CN112770116B (en) Method for extracting video key frame by using video compression coding information
KR102261669B1 (en) Artificial Neural Network Based Object Region Detection Method, Device and Computer Program Thereof
CN109982071B (en) HEVC (high efficiency video coding) dual-compression video detection method based on space-time complexity measurement and local prediction residual distribution
CN103020138A (en) Method and device for video retrieval
KR20220045920A (en) Method and apparatus for processing images/videos for machine vision
CN105163122B (en) A kind of compression of images and decompression method based on image block similarity
CN108833928B (en) Traffic monitoring video coding method
CN110188625B (en) A Video Refinement Structure Method Based on Multi-feature Fusion
CN104299256B (en) Almost-lossless compression domain volume rendering method for three-dimensional volume data
CN108510425A (en) Reversible water mark method based on IPPVO and optimization MHM
CN107682699B (en) A Near Lossless Image Compression Method
CN113784147A (en) A high-efficiency video coding method and system based on convolutional neural network
Ouyang et al. The comparison and analysis of extracting video key frame
CN102592130A (en) Target identification system aimed at underwater microscopic video and video coding method thereof
CN103905818B (en) Method for rapidly determining inter-frame prediction mode in HEVC standard based on Hough conversion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant