Electrical Engineering and Systems Science > Image and Video Processing

arXiv:2211.08428 (eess)

[Submitted on 15 Nov 2022 (v1), last revised 8 Mar 2023 (this version, v2)]

Title: CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming

Authors: Qihua Zhou, Ruibin Li, Song Guo, Peiran Dong, Yi Liu, Jingcai Guo, Zhenda Xu

Abstract: Recent years have witnessed dramatic growth in Internet video traffic, where video bitstreams are often compressed and delivered in low quality to fit the streamer's uplink bandwidth. To alleviate the resulting quality degradation, Neural-enhanced Video Streaming (NVS) has emerged; it shows great promise for recovering low-quality videos, mostly by deploying neural super-resolution (SR) on the media server. Despite this benefit, we reveal that current mainstream SR-based works have not achieved the desired rate-distortion trade-off between bitrate saving and quality restoration, because they: (1) overemphasize enhancement on the decoder side while omitting co-design of the encoder, (2) have limited generative capacity to recover high-fidelity perceptual details, and (3) optimize the compression-and-restoration pipeline solely from the resolution perspective, without considering color bit-depth. To overcome these limitations, we are the first to exploit encoder-decoder (i.e., codec) synergy by leveraging the inherent visual-generative property of diffusion models. Specifically, we present Codec-aware Diffusion Modeling (CaDM), a novel NVS paradigm that significantly reduces streaming delivery bitrates while offering higher restoration capacity than existing methods. First, CaDM improves the encoder's compression efficiency by simultaneously reducing the resolution and color bit-depth of video frames. Second, CaDM empowers the decoder with high-quality enhancement by making the denoising diffusion restoration aware of the encoder's resolution-color conditions. Evaluation on public cloud services with OpenMMLab benchmarks shows that CaDM reduces bitrate by a factor of 5.12 to 21.44 based on common video standards and achieves much better recovery quality (e.g., FID of 0.61) than state-of-the-art neural-enhancing methods.
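
The encoder-side idea in the abstract, reducing resolution and color bit-depth together, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (reduce_resolution, reduce_bit_depth, encode_conditions, raw_bits), the block-averaging downsampler, and the bit-shift quantizer are all assumptions chosen for illustration. The raw-size ratio it computes covers uncompressed pixels only and should not be confused with the bitrate savings reported in the paper, which depend on the video codec.

```python
import numpy as np


def reduce_resolution(frame: np.ndarray, scale: int) -> np.ndarray:
    """Downsample an H x W x C uint8 frame by averaging scale x scale blocks."""
    h, w, c = frame.shape
    # Crop so that both spatial dimensions divide evenly by the scale factor.
    h, w = h - h % scale, w - w % scale
    blocks = frame[:h, :w].reshape(h // scale, scale, w // scale, scale, c)
    return blocks.mean(axis=(1, 3)).round().astype(np.uint8)


def reduce_bit_depth(frame: np.ndarray, bits: int) -> np.ndarray:
    """Keep only the top `bits` bits of each 8-bit channel value."""
    return frame >> (8 - bits)


def encode_conditions(frame: np.ndarray, scale: int, bits: int) -> np.ndarray:
    """Apply both reductions; (scale, bits) are the hypothetical conditions
    that a codec-aware decoder would need to know about."""
    return reduce_bit_depth(reduce_resolution(frame, scale), bits)


def raw_bits(h: int, w: int, c: int, bits: int) -> int:
    """Uncompressed size of a frame in bits (before any video codec runs)."""
    return h * w * c * bits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    low = encode_conditions(frame, scale=4, bits=4)
    ratio = raw_bits(64, 64, 3, 8) / raw_bits(*low.shape, 4)
    print(low.shape, low.dtype, ratio)
```

In this toy setting, a 4x spatial reduction combined with a drop from 8 to 4 bits per channel shrinks the raw frame by a factor of 32. A decoder in the spirit of CaDM would then be conditioned on the chosen scale and bit-depth when restoring the frame; that restoration step is outside the scope of this sketch.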

Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
Cite as:	arXiv:2211.08428 [eess.IV]
	(or arXiv:2211.08428v2 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2211.08428

Submission history

From: Qihua Zhou
[v1] Tue, 15 Nov 2022 05:14:48 UTC (1,520 KB)
[v2] Wed, 8 Mar 2023 14:25:10 UTC (1,418 KB)
