JCTVC-N0036 RCE3: Summary Report of HEVC Range Extensions Core Experiment 3 on Intra Coding Methods for Screen Content [L. Guo]
(Reviewed Mon. 29th, Track A (David Flynn).)
A summary of RCE3 on Intra coding methods for Screen Content for HEVC Range Extensions is reported. Three methods have been evaluated based on the CE description in JCTVC-M1123.
The CE performed tests, covering two subject areas:
- Palette coding (two methods, tests one and two), where for each block, a dictionary (palette) of pixel values is transmitted, and the block consists of indices into the dictionary.
- Intra motion compensation (an intra picture block copying operation), with two subtests three and four that respectively disable and enable limiting the vertical vector to within the current CTU.
No combination tests have been performed.
All methods tested only offer benefits for screen content material, with either no gain or minor losses for non-screen content.
Complexity assessments are reported for one method (Test 1).
For the palette coding methods, the main differences between the two methods are the size of the dictionary, the use of a pixel prediction method and the entropy coding method. Test 2 seems to report higher gains than Test 1, but has a higher runtime cost.
RCE3 summary and general discussion
RCE3 primary contributions
JCTVC-N0205 RCE3: Results of test 3.3 on Intra motion compensation [D.-K. Kwon, M. Budagavi (TI)]
(Reviewed in Track A Tue. 30th (DF).)
Applications such as wireless displays, automotive infotainment, remote desktop, remote gaming and cloud computing are becoming popular. Video in these applications often has mixed content consisting of natural video, text, graphics etc. In text and graphics regions, patterns (e.g. text characters, icons, lines etc.) can repeat within a picture. This contribution proposes a CU-level intra motion compensation tool that removes this redundancy and reportedly achieves coding gain. When intra motion compensation is enabled for a CU, either horizontal motion or vertical motion is signalled. The proposed method is tested with two different search ranges in the encoder. In the first test, horizontal and vertical motions are limited to a range of 0–63. In the second test, the vertical motion is further limited so that the displaced block does not go beyond the LCU boundary. The following bit rate savings under RCE3 common conditions for lossless coding are reported:
- First test - Class F: (2.5%/1.4%/0.7%), SC RGB 444: (25.2%/19.9%/16.6%), SC YUV 444: (20.7%/18.9%/17.0%), Class B and RangeExt: (0.0%/0.0%/0.0%) for AI/RA/LDB.
- Second test - Class F: (2.3%/1.4%/0.6%), SC RGB 444: (21.5%/17.0%/14.8%), SC YUV 444: (18.5%/16.6%/16.1%), Class B and RangeExt: (0.0%/0.0%/0.0%) for AI/RA/LDB.
The following average luma BD-Rate savings under RCE3 common conditions for lossy coding are reported:
- First test - AI-MT: 19.0%, AI-HT: 18.4%, AI-SHT: 17.9%, RA-MT: 15.7%, RA-HT: 15.2%, LDB-MT: 12.0% and LDB-HT: 11.7%.
- Second test - AI-MT: 16.8%, AI-HT: 16.3%, AI-SHT: 15.9%, RA-MT: 13.8%, RA-HT: 13.4%, LDB-MT: 10.5% and LDB-HT: 10.0%.
The method tested is an intra picture block copying operation that, on a CU basis, permits copying a block of reference samples using a 1D integer vector in either the horizontal or vertical direction. Vector ranges are 0 to 63, offset by the current CU width, and are coded with a 6-bit magnitude and direction. For chroma, the luma vector corresponds to half-pel sampling, with interpolation performed using the current chroma inter interpolation filter.
A variant is tested that limits the vertical range of the vector such that all reference samples reside within the current CTU.
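The copy operation and its CTU-restricted variant can be sketched as follows. This is a minimal Python illustration, not the RCE3 software: the function name `intra_block_copy`, the list-of-rows picture layout, and the assumption that the vector always points left or up into previously reconstructed samples are made here for illustration only. The 0 to 63 magnitude, the offset by the CU width and the optional restriction of vertical vectors to the current CTU follow the description above.

```python
def intra_block_copy(recon, x0, y0, size, magnitude, vertical, ctu_size=None):
    """Copy a size x size block from previously reconstructed samples.

    recon     : picture as a list of rows, modified in place
    x0, y0    : top-left corner of the current CU
    magnitude : coded vector magnitude, 0..63 (fits in 6 bits)
    vertical  : False = horizontal vector, True = vertical vector
    ctu_size  : if set, vertical references must stay inside the current CTU
                (assumed to model the restricted variant)
    """
    if not 0 <= magnitude <= 63:
        raise ValueError("magnitude must fit in 6 bits")

    # The coded value is offset by the CU width, so the reference block
    # never overlaps the block being predicted.
    displacement = magnitude + size

    # Assumption: the reference lies to the left (horizontal) or above (vertical).
    ref_x = x0 if vertical else x0 - displacement
    ref_y = y0 - displacement if vertical else y0
    if ref_x < 0 or ref_y < 0:
        raise ValueError("reference block lies outside the picture")

    if vertical and ctu_size is not None:
        ctu_top = (y0 // ctu_size) * ctu_size
        if ref_y < ctu_top:
            raise ValueError("reference block leaves the current CTU")

    for dy in range(size):
        for dx in range(size):
            recon[y0 + dy][x0 + dx] = recon[ref_y + dy][ref_x + dx]
```

With a 4x4 CU and a coded magnitude of 4, for example, the horizontal reference block starts 8 samples to the left of the current CU.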
The mode is signalled prior to the current CU pred_mode flag, completely bypassing the current intra syntax. There is no NxN split for the smallest CU size. The transform tree is not affected.
An example of the redundancy exploited was provided in the presentation, although the example as drawn would require a 2D vector. An expert wondered whether 2D vectors would be more appropriate.
It was also remarked that it would be helpful to provide a vector map overlay showing the mode utilisation and vector lengths.
Class F result averaging may be pessimistic, since some class F sequences contain very little screen content; they contain only some small captions.
JCTVC-N0247 RCE3: Results of Test 3.1 on Palette Mode for Screen Content Coding [L. Guo, M. Karczewicz, J. Sole (Qualcomm)]
(Reviewed in Track A Tue. 30th (DF).)
This contribution reports the results of RCE3 test 3.1 on palette coding for screen content.
The method tested is a palette coding mode that bypasses the HEVC intra prediction, transform and quantisation, where a palette is signalled at the start of each CU, and then individual pixel values are signalled by indexing the palette. To reduce the palette overhead, each palette may be inherited from a previous CU. The indexed pixel values are signalled using three modes: 1) where a run of pixels are copied from the row above, 2) where a run of pixels have the same palette index, and 3) when no entry exists in the palette, a PCM sample value may be sent.
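The three signalling modes can be illustrated with a small decoder sketch. The element names (`copy_above`, `index_run`, `escape`), the tuple layout and the raster scan order are assumptions made for illustration; the actual syntax and binarisation of the proposal are not reproduced here.

```python
def decode_palette_block(palette, elements, width, height):
    """Rebuild a width x height block from palette run elements.

    elements is a list of tuples (hypothetical layout):
      ("copy_above", run)       copy `run` samples from the row above
      ("index_run", index, run) repeat palette[index] for `run` samples
      ("escape", value)         one explicitly sent (PCM) sample value
    """
    samples = []  # decoded samples in raster order
    for element in elements:
        kind = element[0]
        if kind == "copy_above":
            run = element[1]
            for _ in range(run):
                if len(samples) < width:
                    raise ValueError("copy_above used in the first row")
                samples.append(samples[len(samples) - width])
        elif kind == "index_run":
            index, run = element[1], element[2]
            samples.extend([palette[index]] * run)
        elif kind == "escape":
            samples.append(element[1])
        else:
            raise ValueError("unknown element type: %r" % (kind,))

    if len(samples) != width * height:
        raise ValueError("elements do not cover the block exactly")
    return [samples[r * width:(r + 1) * width] for r in range(height)]


# A 4x2 block: three samples of palette entry 0, one escape sample,
# then the whole second row copied from the first.
block = decode_palette_block(
    palette=[10, 200],
    elements=[("index_run", 0, 3), ("escape", 77), ("copy_above", 4)],
    width=4,
    height=2,
)
```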
Lossless: no benefit is reported for class B, a small benefit for class F (1.3% intra), and larger benefits for screen content RGB (25%) and YUV (11.6%), for a 113% encoder runtime cost.
Lossy: the gains increase as QP decreases. Class F (0.4% – 1.6%), SC RGB (14% – 25%), SC YUV (3.7% – 9.3%). Similar gains are reported for Random Access. For Low Delay, gains are diminished by about 3/4.
Concern was expressed about the level of parallelism at the encoder for deriving the palette. First the palette must be determined and then the values mapped to the palette indices.
A cross-checker reported that encoder complexity needs to be limited in the application areas that this proposal addresses. In the current implementation, the palette is formed by calculating a histogram of the current pixel values, and a sorting operation may be costly. There needs to be consideration of the palette size, signalling and encoder complexity.
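The two encoder steps under discussion (forming the palette from a histogram, then mapping samples to indices) can be sketched as below. The function names, the `max_size` parameter, the tie-breaking rule and the use of `None` to mark samples with no palette entry are assumptions for illustration; the sketch only shows where the sorting cost and the serial dependency between the two steps arise.

```python
from collections import Counter


def build_palette(block, max_size):
    """Pick the most frequent sample values of a block as its palette."""
    histogram = Counter(value for row in block for value in row)
    # Sorting the histogram is the step flagged as potentially costly.
    # Ties are broken by sample value so the result is deterministic.
    ranked = sorted(histogram.items(), key=lambda item: (-item[1], item[0]))
    return [value for value, _count in ranked[:max_size]]


def map_to_indices(block, palette):
    """Replace each sample by its palette index, or None if it has no entry."""
    lookup = {value: index for index, value in enumerate(palette)}
    return [[lookup.get(value) for value in row] for row in block]


block = [[5, 5, 9],
         [5, 9, 3]]
palette = build_palette(block, max_size=2)
index_map = map_to_indices(block, palette)
```

The index mapping cannot start until the palette is complete, which is the parallelism concern raised above.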
JCTVC-N0287 RCE3 Test 2: Multi-stage Base Color and Index Map [W. Zhu (BJUT), J. Xu (Microsoft), W. Ding (BJUT)]
(Reviewed in Track A Tue. 30th (DF).)
This contribution presents the description and results of RCE3 test 2. In the test, a new mode for screen content coding is used. In the new mode, each intra block is represented by several base colors and an index map that indicates, for each pixel, which base color is used as its reconstructed value. The coding of an index uses neighbouring indices as predictions. Compared with the HM10.1_RExt3.1 anchor, average bit-rate savings of 15.8%, 41.1% and 26.0% are reported for class F, SC RGB 444 and SC YUV 444 sequences in the All Intra HE Main-tier case; 11.3%, 35.1% and 21% in the Random Access HE Main-tier case; and 7.7%, 29.6% and 15.7% in the Low Delay HE Main-tier case.
Compared to N0247:
- A prediction of the pixel index values is formed on a pixel-by-pixel basis, where the prediction direction for the current pixel is estimated using previously reconstructed neighbouring samples.
- Predictors and index values are CABAC coded.
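The pixel-by-pixel index prediction can be illustrated with a toy predictor. The direction rule used here (predict from the left when the above and above-left indices agree, otherwise from above) is a hypothetical stand-in and is not taken from the proposal; the sketch only shows that each prediction depends on previously decoded neighbours, so the loop is inherently serial.

```python
def predict_index(indices, x, y):
    """Predict indices[y][x] from its left, above and above-left neighbours."""
    left = indices[y][x - 1] if x > 0 else None
    above = indices[y - 1][x] if y > 0 else None
    if left is None and above is None:
        return 0          # no neighbours: fall back to index 0 (assumption)
    if left is None:
        return above
    if above is None:
        return left
    above_left = indices[y - 1][x - 1]
    # Hypothetical direction estimate: if the row above did not change
    # between above-left and above, continue horizontally; otherwise
    # assume a vertical structure and predict from above.
    return left if above_left == above else above


def prediction_hits(indices):
    """Return, per pixel, whether the predictor matched the actual index."""
    return [[predict_index(indices, x, y) == indices[y][x]
             for x in range(len(indices[y]))]
            for y in range(len(indices))]


# Vertical stripes: only the first stripe edge in the top row is mispredicted.
hits = prediction_hits([[0, 0, 1, 1],
                        [0, 0, 1, 1]])
```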
Predicted pictures do not use the palette mode.
The palette is formed on a CU basis and never copied from a previous block. The palette size may vary, and to find the optimum palette size, a search is performed starting from a small number of entries and iteratively increasing it. An early termination method is used to reduce the search time.
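An increasing palette-size search with early termination might look like the following sketch. The early termination criterion of the proposal is not described here, so the `patience` rule (stop after a number of consecutive sizes that fail to improve the cost) and the `cost_of` callback are assumptions made for illustration.

```python
def search_palette_size(cost_of, min_size=1, max_size=8, patience=1):
    """Try increasing palette sizes and keep the cheapest one.

    cost_of  : callable returning a cost (e.g. an RD cost) for a given size
    patience : number of consecutive non-improving sizes tolerated before
               the search stops early (hypothetical termination rule)
    """
    best_size, best_cost = None, float("inf")
    misses = 0
    for size in range(min_size, max_size + 1):
        cost = cost_of(size)
        if cost < best_cost:
            best_size, best_cost = size, cost
            misses = 0
        else:
            misses += 1
            if misses >= patience:
                break  # early termination: larger palettes are not tried
    return best_size, best_cost


# Toy cost table: size 3 is worse than size 2, so with patience=1 the
# search stops there and never evaluates size 4.
toy_costs = {1: 10.0, 2: 6.0, 3: 7.0, 4: 1.0}
evaluated = []


def toy_cost(size):
    evaluated.append(size)
    return toy_costs[size]


result = search_palette_size(toy_cost, min_size=1, max_size=4, patience=1)
```

The toy table also shows the trade-off: terminating early saves cost evaluations but can miss a better, larger palette.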
Chroma palette size is related to sampling format.
This technique offers higher gains than N0247 but with a greater run time penalty.
Concern was expressed regarding the encoder and how to make it more parallel. It was commented that an RDO search is required, since using a simpler metric would be difficult. More study may be required to see how such a mode search fits into implementation budgets.
There is a decoder parallelism concern regarding the throughput of the pixel-by-pixel prediction. Previous methods with serial decoding loops have been discouraged in the past.
RCE3 cross checks
JCTVC-N0104 RCE3: Cross-verification of Test 3.1 [S. Lee, C. Kim (Samsung)] [late]
JCTVC-N0125 RCE3: Cross-check of Test 3.2 [X. Wei, J. Zan (Huawei)] [late]
JCTVC-N0126 RCE3: Cross-check of Test 3.3 [X. Wei, J. Zan (Huawei)] [late]
JCTVC-N0326 Cross-check report for RCE 3.1 [X. Wang, Z. Ma, M. Xu (Huawei)] [late]
JCTVC-N0327 Cross-check report for RCE 3.3 [X. Wang, Z. Ma, M. Xu (Huawei)] [late]