CompSplat: Compression-aware 3D Gaussian Splatting for Real-world Video

1Kyungpook National University, 2Adobe Research

CompSplat achieves high-quality novel view synthesis from real-world compressed videos.
Given (a) a compressed video input, our approach leverages (b) compression information that reveals per-frame quality variations arising from different quantization parameters. Because compression degrades the inputs, previous methods such as (c) NoPe-NeRF, (d) LocalRF, and (e) LongSplat generate blurry or distorted results. In contrast, through compression-aware optimization, (f) our proposed method produces clear reconstructions with fine details.

Video Comparison

Abstract


High-quality novel view synthesis (NVS) from real-world videos is crucial for applications such as cultural heritage preservation, digital twins, and immersive media. However, real-world videos typically contain long sequences with irregular camera trajectories and unknown poses, leading to pose drift, feature misalignment, and geometric distortion during reconstruction. Moreover, lossy compression amplifies these issues by introducing inconsistencies that gradually degrade geometry and rendering quality. While recent studies have addressed either long-sequence NVS or unposed reconstruction, existing compression-aware approaches still focus on specific artifacts or limited scenarios, leaving the diverse compression patterns of long videos insufficiently explored. In this paper, we propose CompSplat, a compression-aware training framework that explicitly models frame-wise compression characteristics to mitigate inter-frame inconsistency and accumulated geometric errors. CompSplat incorporates compression-aware frame weighting and an adaptive pruning strategy to enhance robustness and geometric consistency, particularly under heavy compression. Extensive experiments on challenging benchmarks, including Tanks and Temples, Free, and Hike, demonstrate that CompSplat achieves state-of-the-art rendering quality and pose accuracy, significantly surpassing recent state-of-the-art NVS approaches under severe compression conditions.

Method

Overview of the CompSplat pipeline: (a) Our approach builds upon an unposed-GS framework, reconstructing a 3D Gaussian scene from compressed videos through incremental pose estimation and optimization. (b) Frame-wise compression information (QP and bitrates) is converted into a confidence score. (c) We introduce Quality-guided Density Control, which regulates Gaussian optimization based on frame reliability. (d) Quality Gap-aware Masking mitigates frame-to-frame quality differences by applying a gap ratio–based pixel mask.
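Step (b) above maps per-frame QP and bitrate to a confidence score. The paper does not spell out the exact formula on this page, so the sketch below is a hypothetical illustration of the idea: lower QP and more coded bits both indicate a more reliable frame, and the equal weighting of the two cues is an assumption.

```python
import numpy as np

def frame_confidence(qp, bits, qp_min=27, qp_max=47):
    """Hypothetical sketch: map per-frame QP and coded bits to a [0, 1]
    confidence score. Lower QP and more coded bits -> higher confidence.
    The QP range and equal blending weights are illustrative assumptions,
    not CompSplat's published formula."""
    qp = np.asarray(qp, dtype=float)
    bits = np.asarray(bits, dtype=float)
    # Normalize QP into [0, 1], inverted so that low QP means high quality.
    q = np.clip(1.0 - (qp - qp_min) / (qp_max - qp_min), 0.0, 1.0)
    # Normalize coded bits per frame by the sequence maximum.
    b = bits / bits.max()
    # Blend the two cues with equal weight (assumption).
    return 0.5 * q + 0.5 * b
```

A heavily quantized inter-frame (high QP, few bits) then receives a low confidence and contributes less to optimization than a well-coded I-frame.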

CompSplat pipeline overview
  • Gaussian Scale-based Pruning. Removes over-diffused Gaussians caused by low-quality compressed frames using scale-aware, confidence-guided pruning.
  • Adaptive Densification and Pruning. Dynamically adjusts densification and pruning thresholds based on frame quality to stabilize Gaussian density across varying compression levels.
  • Quality Gap-aware Masking. Mitigates supervision instability from inter-frame quality gaps by down-weighting unreliable pixels using keypoint-matching confidence.
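The scale-based pruning idea above can be sketched as follows. This is a minimal illustration under assumed thresholds (the exact pruning rule and constants in CompSplat are not given on this page): Gaussians whose largest scale axis exceeds a threshold are pruned, and the threshold tightens when the supervising frame's confidence is low, since low-quality frames tend to spawn over-diffused Gaussians.

```python
import numpy as np

def adaptive_prune_mask(scales, confidence, base_scale_thresh=0.1):
    """Hypothetical sketch of confidence-guided, scale-aware pruning.
    scales: (N, 3) per-axis scales of N Gaussians.
    confidence: frame confidence in [0, 1]; lower -> stricter pruning.
    The base threshold and linear tightening rule are illustrative
    assumptions, not CompSplat's published values.
    Returns a boolean mask: True = prune this Gaussian."""
    max_scale = np.max(scales, axis=1)                     # largest axis per Gaussian
    thresh = base_scale_thresh * (0.5 + 0.5 * confidence)  # low confidence halves it
    return max_scale > thresh
```

With this rule, a borderline Gaussian survives when observed by a high-confidence frame but is removed when it was densified from a heavily compressed one.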

Analysis: Effect of Video Compression

Effects of quantization parameter (QP) on video frames. As QP increases from 27 to 47, overall frame quality progressively degrades compared to the uncompressed reference.
Frame-wise Y-PSNR and bits per frame of video sequences encoded at various QP values. Lower QP values yield higher PSNR and larger bit usage, while higher QP values result in reduced quality and substantially fewer coded bits. Periodic peaks correspond to I-frames within the GOP structure.
Frame-wise Compression Analysis. During video compression, each frame is encoded with a varying QP value, leading to substantial inter-frame variations in PSNR and bitrate. This non-uniformity becomes more pronounced in long real-world videos, highlighting the need to explicitly account for frame-wise compression artifacts when applying 3DGS to compressed video.
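The frame-wise Y-PSNR curves above measure luma fidelity against the uncompressed reference. A minimal sketch of this standard metric (plain NumPy, assuming 8-bit luma planes of matching shape):

```python
import numpy as np

def y_psnr(ref_y, dec_y):
    """Y-PSNR in dB between the luma (Y) planes of a reference frame and
    its decoded counterpart, both uint8 arrays of the same shape."""
    mse = np.mean((ref_y.astype(np.float64) - dec_y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")            # identical frames
    return 10.0 * np.log10(255.0 ** 2 / mse)
```

Plotting this per frame reproduces the pattern in the figure: I-frames (low QP, many bits) appear as periodic PSNR peaks within each GOP, while intermediate frames sit lower.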

Camera Pose Estimation

Visualization of camera trajectories on the Free dataset