	---
	license: bsd-3-clause-clear
	language:
	- en
	---
	<!-- Copyright 2025 ByteDance Ltd. and/or its affiliates.
	All rights reserved.
	Licensed under the BSD 3-Clause Clear License (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at
	https://choosealicense.com/licenses/bsd-3-clause-clear/
	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.

	Redistribution and use in source and binary forms, with or without
	modification, are permitted (subject to the limitations in the disclaimer
	below) provided that the following conditions are met:

	* Redistributions of source code must retain the above copyright notice,
	this list of conditions and the following disclaimer.
	* Redistributions in binary form must reproduce the above copyright notice,
	this list of conditions and the following disclaimer in the documentation
	and/or other materials provided with the distribution.
	* Neither the name of ByteDance Ltd. and/or its affiliates nor the names of its
	contributors may be used to endorse or promote products derived from this
	software without specific prior written permission.

	NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY
	THIS LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
	CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
	NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
	PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
	CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
	EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
	PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
	OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
	WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
	OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
	ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -->

	<div align="center">

	# NEVC-1.0 <br>(EHVC: Efficient Hierarchical Reference and Quality Structure for Neural Video Coding)

	<div align="center">
	<img src="./assets/performance.png" alt="Performance comparison" width="60%" style="max-width: 100%;" height="auto">
	</div>

	</div>

	<div align="left">

	## 📝 Introduction
	This repository provides the pretrained model weights for NEVC-1.0, which integrates EHVC (Efficient Hierarchical Reference and Quality Structure for Neural Video Coding), one of the core components of the framework.
	EHVC introduces a hierarchical reference and quality structure that improves rate–distortion performance; BD-Rate results against VTM-23.4 are reported under Experimental Results.
	The corresponding code repository is available at [NEVC-1.0-EHVC](https://github.com/bytedance/NEVC).

	Key designs of EHVC include:
	- Hierarchical multi-reference: Resolves reference–quality mismatches using a hierarchical reference structure and a multi-reference scheme, optimized for low-delay configurations.
	- Lookahead mechanism: Enhances encoder-side context by leveraging forward features, thereby improving prediction accuracy and compression.
	- Layer-wise quantization scale with random quality training: Provides a flexible and efficient quality structure that adapts during training, resulting in improved encoding performance.
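
To make the idea of a layer-wise quality structure more concrete, the following is a generic sketch rather than EHVC's actual schedule. The period of 4, the layer assignment, the scale values, and the names `layer_of` and `quant_scale` are all illustrative assumptions that do not come from the paper or the code repository.

```python
# Hypothetical values for illustration only; not taken from EHVC.
PERIOD = 4
LAYER_SCALE = {0: 1.0, 1: 1.2, 2: 1.4}  # larger scale = coarser quantization


def layer_of(frame_idx, period=PERIOD):
    """Map a frame index to a quality layer (0 = highest quality)."""
    pos = frame_idx % period
    if pos == 0:
        return 0          # start of each period: top layer
    if pos == period // 2:
        return 1          # midpoint of the period: middle layer
    return 2              # remaining frames: lowest layer


def quant_scale(frame_idx):
    """Look up the quantization scale of a frame from its layer."""
    return LAYER_SCALE[layer_of(frame_idx)]


if __name__ == "__main__":
    for i in range(8):
        print(i, layer_of(i), quant_scale(i))
```

In a scheme of this kind, frames in the top layer are coded at higher quality and would typically be preferred as references for the frames that follow.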

	---

	## 🔧 Models
	EHVC uses two models: the intra model and the inter model.
	- The intra model handles intra-frame coding.
	- The inter model is responsible for inter-frame (predictive) coding.

	### Intra Model
	The main contributions of NEVC-1.0 focus on inter coding.
	For intra coding, we directly adopt the pretrained model `cvpr2023_image_psnr.pth.tar` from [DCVC-DC](https://github.com/microsoft/DCVC/blob/main/DCVC-family/DCVC-DC/checkpoints/download.py), without further training.

	### Inter Model
	The inter model of NEVC-1.0 is provided at `/models/nevc1.0_inter.pth.tar`.
	The architecture of the inter model is illustrated below:

	<div align="center">
	<img src="./assets/architecture.png" alt="Inter model architecture" width="50%" style="max-width: 100%;" height="auto">
	</div>
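
Before running the codec, it may help to confirm that both checkpoints named above are in place. The sketch below only checks for the files; it assumes both are stored in one local directory (here `models`), and the helper name `find_checkpoints` is hypothetical. Loading the weights is left to the code repository.

```python
from pathlib import Path

# File names are taken from this README; the directory layout is an assumption.
INTRA_CKPT = "cvpr2023_image_psnr.pth.tar"
INTER_CKPT = "nevc1.0_inter.pth.tar"


def find_checkpoints(model_dir):
    """Return (paths, missing) for the intra and inter checkpoints in model_dir."""
    model_dir = Path(model_dir)
    paths = {
        "intra": model_dir / INTRA_CKPT,
        "inter": model_dir / INTER_CKPT,
    }
    # Collect the names of checkpoints that are not present as regular files.
    missing = [name for name, path in paths.items() if not path.is_file()]
    return paths, missing


if __name__ == "__main__":
    paths, missing = find_checkpoints("models")
    for name, path in paths.items():
        status = "missing" if name in missing else "found"
        print(f"{name}: {path} ({status})")
```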

	---

	## 📊 Experimental Results
	### Objective Comparison
	<div align="center">

	BD-Rate (%) comparison for PSNR.
	Anchor: VTM-23.4 in the low-delay B (LDB) configuration.
	All codecs are tested on 96 frames with intra-period = 32.

	<img src="./assets/96F32G.png" alt="BD-Rate 96F32G" width="50%" style="max-width: 100%;" height="auto">

	Rate–Distortion curves on HEVC B, HEVC C, UVG, and MCL-JCV datasets.
	Tested with 96 frames and intra-period = 32.

	<img src="./assets/96F32G_curve.png" alt="RD curves 96F32G" width="80%" style="max-width: 100%;" height="auto">

	BD-Rate (%) comparison for PSNR.
	Anchor: VTM-23.4 LDB.
	All codecs are tested on full sequences with intra-period = -1 (only the first frame is intra-coded).

	<img src="./assets/allF-1G.png" alt="BD-Rate allF-1G" width="50%" style="max-width: 100%;" height="auto">

	Rate–Distortion curves on HEVC B, HEVC C, UVG, and MCL-JCV datasets.
	Tested with full sequences and intra-period = -1.

	<img src="./assets/allF-1G_curve.png" alt="RD curves allF-1G" width="80%" style="max-width: 100%;" height="auto">

	</div>
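
For readers unfamiliar with the metric, BD-Rate summarizes the average bitrate difference between a test codec and an anchor at equal quality; negative values mean the test codec needs fewer bits. The sketch below is a simplified, dependency-free variant: it interpolates log-rate over PSNR piecewise-linearly, whereas the standard Bjøntegaard calculation fits a cubic polynomial (or a piecewise-cubic interpolant), so its numbers will generally differ slightly from the reported ones. The function name `bd_rate` and the sample points in the usage note are illustrative assumptions.

```python
import math


def _interp(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside interpolation range")


def _integrate(xs, ys, lo, hi):
    """Integrate the piecewise-linear curve (xs, ys) over [lo, hi]."""
    knots = sorted({lo, hi, *[x for x in xs if lo < x < hi]})
    total = 0.0
    for a, b in zip(knots, knots[1:]):
        # The trapezoid rule is exact on each linear segment.
        total += 0.5 * (_interp(a, xs, ys) + _interp(b, xs, ys)) * (b - a)
    return total


def bd_rate(anchor, test):
    """anchor/test: lists of (bitrate, psnr) pairs. Returns BD-Rate in percent."""
    a = sorted(anchor, key=lambda p: p[1])
    t = sorted(test, key=lambda p: p[1])
    a_psnr, a_log = [p[1] for p in a], [math.log10(p[0]) for p in a]
    t_psnr, t_log = [p[1] for p in t], [math.log10(p[0]) for p in t]
    # Compare the curves only where their PSNR ranges overlap.
    lo = max(a_psnr[0], t_psnr[0])
    hi = min(a_psnr[-1], t_psnr[-1])
    if hi <= lo:
        raise ValueError("PSNR ranges do not overlap")
    diff = _integrate(t_psnr, t_log, lo, hi) - _integrate(a_psnr, a_log, lo, hi)
    avg_diff = diff / (hi - lo)  # mean log10 rate difference
    return (10 ** avg_diff - 1) * 100
```

As a sanity check, a test curve that reaches the same PSNR values as the anchor at 90% of the bitrate, for example `anchor = [(1000, 32.0), (2000, 35.0), (4000, 38.0)]` and `test = [(0.9 * r, d) for r, d in anchor]`, yields a BD-Rate of approximately -10%.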

	---

	## 📜 Citation
	If you find NEVC-1.0 useful in your research or projects, please cite the following paper:

	- EHVC: Efficient Hierarchical Reference and Quality Structure for Neural Video Coding
	Junqi Liao, Yaojun Wu, Chaoyi Lin, Zhipin Deng, Li Li, Dong Liu, Xiaoyan Sun.
	Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM 2025).

	```bibtex
	@inproceedings{liao2025ehvc,
	title={EHVC: Efficient Hierarchical Reference and Quality Structure for Neural Video Coding},
	author={Liao, Junqi and Wu, Yaojun and Lin, Chaoyi and Deng, Zhipin and Li, Li and Liu, Dong and Sun, Xiaoyan},
	booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
	year={2025}
	}
	```

	---


	## 🙌 Acknowledgement
	The intra model of this project is based on [DCVC-DC](https://github.com/microsoft/DCVC/blob/main/DCVC-family/DCVC-DC/checkpoints/download.py).

	</div>