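The paper "DATRA-MIV: Decoder-Adaptive Tiling and Rate Allocation for MPEG Immersive Video", researched by MCSLab students Jong-Beom Jeong and Soonbin Lee (advisor: Prof. Eun-Seok Ryu), has been published in ACM TOMM, a journal ranked in the top 8% by JCR 2023.
In this paper, Prof. Eun-Seok Ryu's research team proposes an encoding method, built on the MIV encoder, that can be used on stereoscopic devices such as VR devices. The details of the paper are as follows.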
[Paper]
DATRA-MIV: Decoder-Adaptive Tiling and Rate Allocation for MPEG Immersive Video
[Abstract]
The emerging immersive video coding standard moving picture experts group (MPEG) immersive video (MIV), which is undergoing standardization by the MPEG-Immersive (MPEG-I) group, enables six degrees of freedom in a virtual reality environment that represents both natural and computer-generated scenes using multi-view video compression. MIV eliminates the redundancy between multi-view videos and merges the residuals into multiple pictures, called atlases. Thus, bitstreams with encoded atlases are generated and a corresponding number of decoders is needed, which is challenging for a lightweight device with a single decoder. This article proposes a decoder-adaptive tiling and rate allocation method for MIV to overcome the challenge. First, the proposed method divides atlases into subpictures considering two aspects: (i) extracting subpicture bitstreams and merging them into one bitstream to use a single decoder and (ii) separation of each source view from the atlases for rate allocation. Second, the atlases are encoded by versatile video coding (VVC), using extractable subpictures to divide the atlases. Third, each subpicture bitstream is extracted, and asymmetric quality allocation for each subpicture is conducted by considering the residuals in the subpicture. Fourth, mixed-quality subpictures are merged by using the proposed bitstream merger. Fifth, the merged bitstream is decoded by using a single decoder. Finally, the viewing area of the user is synthesized by using the reconstructed atlases. Experimental results with the VVC test model (VTM) show that the proposed method achieves a 21.37% Bjøntegaard delta rate saving for immersive video peak signal-to-noise ratio and a 26.76% decoding runtime saving compared to the VTM anchor configuration. Moreover, it supports bitstreams for multiple decoders and a single decoder without re-encoding, transcoding, or a substantial increase of the server-side storage.
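The third step of the abstract, asymmetric quality allocation based on the residuals in each subpicture, can be illustrated with a small sketch. The abstract does not state the actual allocation rule, so everything here is an assumption made for illustration: the `Subpicture` type, the `residual_ratio` field, the `allocate_qp` function, the linear mapping, and the default values of `base_qp` and `max_offset` are all hypothetical and are not taken from the paper. The sketch only shows the general idea that a subpicture carrying fewer residuals may be coded at a lower quality (a higher quantization parameter, QP).

```python
from dataclasses import dataclass


@dataclass
class Subpicture:
    # Hypothetical description of one atlas subpicture.
    name: str
    residual_ratio: float  # assumed: fraction of pixels holding residuals, in [0, 1]


def allocate_qp(subpics, base_qp=32, max_offset=6):
    """Map each subpicture to a QP (illustrative linear rule, not the paper's).

    A subpicture full of residuals keeps base_qp; an empty one gets
    base_qp + max_offset. The result is clamped to 63, the largest QP in VVC.
    """
    qps = {}
    for sp in subpics:
        if not 0.0 <= sp.residual_ratio <= 1.0:
            raise ValueError(f"residual_ratio out of range for {sp.name}")
        # Fewer residuals -> larger offset -> lower quality, fewer bits.
        offset = round((1.0 - sp.residual_ratio) * max_offset)
        qps[sp.name] = min(base_qp + offset, 63)
    return qps


if __name__ == "__main__":
    atlas = [
        Subpicture("basic_view", 1.0),
        Subpicture("additional_view_a", 0.5),
        Subpicture("additional_view_b", 0.0),
    ]
    for name, qp in allocate_qp(atlas).items():
        print(name, qp)
```

In a pipeline like the one the abstract describes, such per-subpicture QPs would select which pre-encoded quality version of each subpicture bitstream is passed to the merger, so no re-encoding is needed at selection time.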