“A performance comparison of fractional-pel interpolation filters in HEVC and H.264/AVC”
Under the guidance of
Dr.K.R.RAO
EE 5359 - Multimedia Processing Project Interim Report
Submitted By: LOHITH SUBRAMANYA
UTA ID: 1000928742
E-mail: lohith.subramanya@mavs.uta.edu
Date of Submission: 21st March 2014
List of Acronyms:
AIF: Adaptive Interpolation Filter
ALF: Adaptive Loop Filter
APEC: Adaptive Prediction Error Coding
AVC: Advanced Video Coding
AQMS: Adaptive Quantization Matrix Selection
CABAC: Context Adaptive Binary Arithmetic Coding
CAVLC: Context Adaptive Variable Length Coding
CSVT: Circuits and Systems for Video Technology
DCT: Discrete Cosine Transform
DCTIF: Discrete Cosine Transform Interpolation Filter
DMVD: Decoder-side Motion Vector Derivation
DSP: Digital Signal Processing
EMS: Extended Macro-block Size
FIR: Finite Impulse Response
HEVC: High Efficiency Video Coding
HP: High Profile
IBDI: Internal Bit Depth Increasing
ITU-T: International Telecommunication Union – Telecommunication Standardization Sector
JCT-VC: Joint Collaborative Team on Video Coding
JPEG: Joint Photographic Experts Group
KLT: Karhunen - Loeve Transform
LTS: Larger Transform Size
MCP: Motion Compensated Prediction
MP: Main Profile
MPEG: Moving Picture Experts Group
MV: Motion Vectors
RDO: Rate Distortion Optimization
SOC: System On Chip
SVN: Subversion
UVLC: Universal Variable Length Coding
VCEG: Video Coding Experts Group
VCIP: Visual Communications and Image Processing
Objective:
The objective of this project is to compare and analyze the fractional-pel interpolation filters in HEVC [18] and H.264/AVC [17] based on their frequency responses, complexity, coding performance and performance gain. BD-PSNR [33] and BD-Bit Rate [33] are the metrics used to evaluate the comparison of the two standards.
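As an illustration of how a BD-Bit Rate figure is usually obtained, the sketch below fits a cubic through four rate-distortion points per codec (log10 of bit rate as a function of PSNR), averages the gap between the two curves over the overlapping PSNR range and converts it to a percentage. This is only a minimal sketch of the commonly described calculation, not the script referenced in [33]; the function names and the sample RD points in the usage note are hypothetical.

```python
import math

def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bit-rate difference (percent) of `test` relative to `ref`.

    Assumes four RD points per curve with distinct PSNR values, so the
    cubic fit is the exact interpolating polynomial.
    """
    lo = max(min(psnr_ref), min(psnr_test))   # overlapping PSNR range
    hi = min(max(psnr_ref), max(psnr_test))
    log_ref = [math.log10(r) for r in rate_ref]
    log_test = [math.log10(r) for r in rate_test]

    def diff(p):
        return (lagrange_eval(psnr_test, log_test, p)
                - lagrange_eval(psnr_ref, log_ref, p))

    # Simpson's rule integrates a cubic exactly over [lo, hi].
    integral = (hi - lo) / 6.0 * (diff(lo) + 4.0 * diff((lo + hi) / 2.0) + diff(hi))
    avg_log_diff = integral / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0
```

With hypothetical points such as `bd_rate([100, 200, 400, 800], [30, 33, 36, 39], [90, 180, 360, 720], [30, 33, 36, 39])`, the test curve needs 10% less bit rate at every PSNR, so the result is a 10% reduction (a negative value). A negative BD-Bit Rate therefore indicates a bit-rate saving for the test codec.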
Introduction:
The fractional-pel interpolation filters adopted in H.264/AVC [17] (a 6-tap FIR [24] filter and an averaging filter) greatly improve motion compensation. Similarly, DCT-based fractional-pel interpolation filters (7-tap and 8-tap) are adopted in the HEVC [1] [10] standard. This project examines the differences between these two sets of filters: the fractional-pel interpolation filters in HEVC [1] and H.264/AVC [17] will be compared based on the properties of their frequency responses.
What is H.264 [17]?
It is an industry standard for video compression, the process of converting digital video into a format that takes up less capacity when it is stored or transmitted. Video compression (or video coding) is an essential technology for applications such as digital television, DVD-Video, mobile TV, videoconferencing and internet video streaming. Standardizing video compression makes it possible for products from different manufacturers (e.g. encoders, decoders and storage media) to inter-operate. The encoder converts video into a compressed format and the decoder converts the compressed video back into an uncompressed format.
H.264 [17] defines a format (syntax) for compressed video and a method for decoding this syntax to produce a displayable video sequence. Figure 1 [17] shows the encoding and decoding processes and highlights the parts that are covered by the H.264 standard.
Figure 1: Block Diagram of H.264 [17]
How does H.264 [17] codec work?
An H.264 [17] video encoder carries out prediction, transform and encoding processes (Figure 1) [17] to produce a compressed H.264 [17] bitstream. An H.264 [17] video decoder carries out the complementary processes of decoding, inverse transform and reconstruction to produce a decoded video sequence.
What is HEVC [1]?
High Efficiency Video Coding (HEVC) [1] is the current joint video coding standardization project of the ITU-T Video Coding Experts Group (VCEG) (ITU-T Q.6/SG 16) and ISO/IEC Moving Picture Experts Group (MPEG) (ISO/IEC JTC 1/SC 29/WG 11).
The Joint Collaborative Team on Video Coding (JCT-VC) [25] has been established to work on this project.
The Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V) [25] has been established to work on 3D video coding extensions of HEVC and other video coding standards.
The block diagram of HEVC is shown in figure 2 [6].
Figure 2: Block diagram of HEVC [6]
Why use HEVC?
All video technologies need encoding and decoding to ensure efficient transmission and storage. HEVC’s more efficient use of bandwidth is designed to enable higher resolutions without crippling networks and overfilling storage systems. Since the video and cinema industries are moving towards 8Kx4K [22] video resolutions, with images of more than 8 megapixels, the new HEVC standard could soon be a widely used technology. However, because numerous patents have been filed, its use will not be free of charge. The patent filings by country are shown in Figure 3 [2].
Figure 3: Patent filings of HEVC related patents by the five biggest patent holders [2]
Features of HEVC [3]:
The JCT-VC evaluates modifications [3] to current coding tools, such as:
ALF: Adaptive Loop Filter - ALF is located at the last processing stage of each picture and can be regarded as a tool trying to catch and fix artifacts from previous stages.
EMS: Extended Macro-block Size – larger macro-blocks are used in the Joint Model Key Technology Areas software to improve the inter-block coding efficiency.
IBDI: Internal Bit Depth Increasing
AQMS: Adaptive Quantization Matrix Selection – this involves coding a frame multiple times to obtain optimal candidate matrix parameters.
As well as new coding tools such as:
Modified intra prediction,
Modified deblocking filter and
DMVD: Decoder-side motion vector derivation.
The new features [3] proposed to meet the requirements are:
2-D non-separable AIF
Separable AIF
Directional AIF
"Super-macro-block" structure up to 64x64 with additional transforms.
Adaptive prediction error coding (APEC) in spatial and frequency domains
Competition-based scheme for motion vector selection and coding
Mode-dependent KLT [21] for intra coding
It is speculated that these techniques are most beneficial with multi-pass encoding.
FIR Filters [27]:
FIR filters [27] are one of the two primary types of digital filters used in Digital Signal Processing (DSP) applications.
"FIR" means "Finite Impulse Response". If an impulse, that is, a single "1" sample followed by many "0" samples, is fed into the filter, only zeroes will come out once the "1" sample has made its way through the delay line of the filter.
An N-Tap FIR filter is shown in figure 4 [28].
Figure 4: N-Tap FIR filter [28]
Here h(k) is the array of filter co-efficients, x(n-k) is the input sample delayed by k positions and y(n) is the output obtained by summing the products h(k)·x(n-k) over all N taps.
The number N represents the number of taps of the filter and relates to the filter performance.
An N-tap FIR filter requires N multiply-accumulate cycles per output sample.
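The tap structure described above can be sketched in a few lines. This is a minimal direct-form illustration, assuming that input samples before the start of the signal are zero; the function name and the 3-tap coefficients are hypothetical.

```python
def fir_filter(h, x):
    """Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k].

    Samples before the start of the input are treated as zero.
    """
    n_taps = len(h)
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(n_taps):          # one multiply-accumulate per tap
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y

# An impulse (a single 1 followed by zeros) reproduces the coefficients,
# followed by zeros once the 1 has left the delay line.
h = [1, 2, 3]                            # hypothetical 3-tap coefficients
impulse_response = fir_filter(h, [1, 0, 0, 0, 0, 0])
```

The inner loop makes the cost visible: each output sample takes one multiply-accumulate per tap, which is why the tap count drives the complexity comparison between the filters.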
Why use interpolation in video coding?
Motion-compensated prediction (MCP) [8] is the key to the success of the modern video coding standards, as it removes the temporal redundancy in video signals and reduces the size of bitstreams significantly. With MCP, the pixels to be coded are predicted from the temporally neighboring ones, and only the prediction errors and the motion vectors (MV) [8] are transmitted. However, due to the finite sampling rate, the actual position of the prediction in the neighboring frames may be out of the sampling grid, where the intensity is unknown. So, the intensities of the positions in between the integer pixels, called sub-positions, must be interpolated and the resolution of MV [8] is increased accordingly.
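The integer-pel case of this prediction can be sketched as follows. It is a minimal illustration only, with frames held as lists of rows and hypothetical function names; a real encoder also searches for the best MV and handles frame borders. When the MV points between integer positions, the reference samples must first be interpolated, which is where the fractional-pel filters come in.

```python
def predict_block(ref, top, left, size, mv):
    """Fetch a size x size block from the reference frame, displaced by an
    integer motion vector mv = (dy, dx)."""
    dy, dx = mv
    return [[ref[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]

def residual(cur_block, pred_block):
    """Prediction error that would be transformed and coded."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur_block, pred_block)]

# Hypothetical 4x4 reference frame; the current frame is the same content
# shifted one pixel to the right, so MV (0, -1) predicts it exactly.
ref = [[r * 4 + c for c in range(4)] for r in range(4)]
cur_block = [[4, 5], [8, 9]]             # block at row 1, column 1 of the current frame
pred = predict_block(ref, 1, 1, 2, (0, -1))
err = residual(cur_block, pred)
```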
One aspect of HEVC [1] video compression involves interpolation among various pixels to determine brightness. Figure 5 [9] from the draft standard shows how this process takes place.
Figure 5: Integer pixel (Shaded with upper case letters) and fractional pixel positions (Non-shaded blocks with lower case letters) for quarter-pel LUMA interpolation [9]
(Credit: ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC))
The interpolation filters used in H.264 [17] are a 6-tap FIR filter for half-pel interpolation and an averaging filter for quarter-pel interpolation. Similarly, in HEVC [3], an 8-tap DCTIF is used for half-pel interpolation and a 7-tap DCTIF is used for quarter-pel interpolation.
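To make the difference concrete, the sketch below applies both filter sets to one row of 8-bit luma samples. The tap values are the luma co-efficients published in the two standards. Everything else is a simplifying assumption: the function names are hypothetical, edge samples are repeated for padding, and only a single one-dimensional pass with rounding is shown rather than the full two-dimensional process with intermediate precision.

```python
H264_HALF = [1, -5, 20, 20, -5, 1]               # normalized by 32
HEVC_HALF = [-1, 4, -11, 40, 40, -11, 4, -1]     # normalized by 64
HEVC_QUARTER = [-1, 4, -10, 58, 17, -5, 1]       # normalized by 64

def clip8(v):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, v))

def sample(row, i):
    """Read row[i], repeating the edge samples outside the row (assumption)."""
    return row[max(0, min(len(row) - 1, i))]

def h264_half(row, i):
    """Half-pel sample between row[i] and row[i + 1] using the 6-tap filter."""
    acc = sum(c * sample(row, i - 2 + k) for k, c in enumerate(H264_HALF))
    return clip8((acc + 16) >> 5)

def h264_quarter(row, i):
    """Quarter-pel sample: rounded average of row[i] and the half-pel sample."""
    return (row[i] + h264_half(row, i) + 1) >> 1

def hevc_half(row, i):
    """Half-pel sample between row[i] and row[i + 1] using the 8-tap DCTIF."""
    acc = sum(c * sample(row, i - 3 + k) for k, c in enumerate(HEVC_HALF))
    return clip8((acc + 32) >> 6)

def hevc_quarter(row, i):
    """Quarter-pel sample after row[i] using the 7-tap DCTIF."""
    acc = sum(c * sample(row, i - 3 + k) for k, c in enumerate(HEVC_QUARTER))
    return clip8((acc + 32) >> 6)

ramp = [10 * n for n in range(16)]               # hypothetical test row
values = (h264_half(ramp, 7), h264_quarter(ramp, 7),
          hevc_half(ramp, 7), hevc_quarter(ramp, 7))
```

Note the structural difference: the H.264 quarter-pel sample depends on an already rounded half-pel sample, whereas the HEVC quarter-pel sample is computed directly from integer samples in one filtering step.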
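The DCT [10] is one of the most popular transforms used in video signal processing applications, since it exhibits properties similar to those of the optimal KLT [21]. The type-II DCT (DCT-II) [10] used in the image compression standard JPEG [31] is defined by the forward and inverse pair:
$$X(k) = \sqrt{\frac{2}{N}}\, c_k \sum_{n=0}^{N-1} x(n) \cos\frac{(2n+1)\pi k}{2N}$$
$$x(n) = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} c_k\, X(k) \cos\frac{(2n+1)\pi k}{2N}$$
Here, X(k) is the forward DCT-II co-efficient, where k = 0, 1, 2, …, N-1, and x(n) is the sample recovered by the inverse DCT-II, where n = 0, 1, 2, …, N-1. Also,
$$c_k = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & \text{otherwise} \end{cases}$$
By substituting the forward DCT-II [10] equation into the inverse equation, the interpolation formula is obtained, which is as follows: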
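As a sketch of that substitution, assume the orthonormal DCT-II pair with scaling $\sqrt{2/N}$ and $c_0 = 1/\sqrt{2}$, $c_k = 1$ otherwise (an assumption; the scaling used in [10] may differ). Evaluating the inverse transform at a fractional position $\alpha$ instead of an integer $n$ then gives a weighted sum of the integer samples:

```latex
x(\alpha) = \sum_{m=0}^{N-1} x(m)\, W_m(\alpha),
\qquad
W_m(\alpha) = \frac{2}{N} \sum_{k=0}^{N-1} c_k^{2}\,
  \cos\frac{(2m+1)\pi k}{2N}\,
  \cos\frac{(2\alpha+1)\pi k}{2N}
```

Here $W_m(\alpha)$ plays the role of the interpolation filter tap applied to the integer sample $x(m)$. For integer $\alpha = n$ the orthogonality of the cosine basis reduces $W_m(n)$ to 1 for $m = n$ and 0 otherwise, so integer samples are returned unchanged.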
For even tap filters, the equation changes to:
Similarly, for an odd tap filter, the equation is represented by:
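The even-tap and odd-tap cases can both be evaluated numerically from one weight expression. The sketch below assumes the orthonormal DCT-II scaling, so that the weight of integer sample m at position alpha is (2/N) times the sum over k of c_k² · cos((2m+1)πk/2N) · cos((2α+1)πk/2N). It also assumes a support of (N-1)//2 samples to the left of the current integer sample, which gives positions -3..4 for 8 taps and -3..3 for 7 taps. The function name and this support placement are illustrative assumptions, not taken from [10].

```python
import math

def dctif_weights(n_taps, frac):
    """Real-valued DCT-based interpolation weights for a fractional offset
    `frac` (0 <= frac < 1) to the right of the current integer sample.

    Entry m of the result applies to the integer sample at position
    m - offset relative to the current sample.
    """
    offset = (n_taps - 1) // 2           # samples to the left of the current one
    alpha = frac + offset                # fractional position in 0..N-1 indexing
    weights = []
    for m in range(n_taps):
        w = 0.0
        for k in range(n_taps):
            ck2 = 0.5 if k == 0 else 1.0     # c_k squared
            w += (ck2
                  * math.cos((2 * m + 1) * math.pi * k / (2 * n_taps))
                  * math.cos((2 * alpha + 1) * math.pi * k / (2 * n_taps)))
        weights.append(2.0 * w / n_taps)
    return weights

half_pel = dctif_weights(8, 0.5)         # even-tap case
quarter_pel = dctif_weights(7, 0.25)     # odd-tap case
scaled_half = [64 * w for w in half_pel]
```

The weights always sum to one, so flat regions are preserved, and the half-pel weights are symmetric. Scaled by 64, the half-pel values land near the standardized 8-tap integer co-efficients without matching them exactly, so the integer taps should be taken from the standard rather than from this sketch.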
The filter co-efficients for half-pel and quarter-pel filters [10] are:
The filter weights of the corresponding positions in HEVC [1] [10] are:
Here, the constant B = 8 and >> denotes arithmetic right shift. The magnitude response graphs of half-pel interpolation filters are shown in figure 6.
Figure 6: Magnitude response graphs of half-pel interpolation filters [10]
Here, the solid curve represents the DCTIF 8-tap filter response.
The dashed curve represents the H.264/AVC filter response.
The dotted curve represents the DCTIF 6-tap filter response.
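Magnitude responses of this kind can be reproduced by evaluating each filter's transfer function on the unit circle. The sketch below does this for the H.264/AVC 6-tap and HEVC 8-tap half-pel filters, using the integer taps published in the standards. The function name and the choice of sample frequencies are assumptions made for illustration, and the DCTIF 6-tap curve is omitted because its co-efficients are not listed in this report.

```python
import cmath
import math

H264_HALF = [1, -5, 20, 20, -5, 1]               # normalized by 32
HEVC_HALF = [-1, 4, -11, 40, 40, -11, 4, -1]     # normalized by 64

def magnitude_response(taps, norm, omega):
    """|H(e^{j*omega})| for an FIR filter with integer taps scaled by 1/norm."""
    h = sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(taps))
    return abs(h) / norm

# Sample both responses from DC (0) up to the Nyquist frequency (pi).
for step in range(5):
    omega = step * math.pi / 4
    print(f"omega = {omega:.3f} rad/sample: "
          f"H.264 6-tap {magnitude_response(H264_HALF, 32, omega):.4f}, "
          f"HEVC 8-tap {magnitude_response(HEVC_HALF, 64, omega):.4f}")
```

An ideal interpolation filter would hold a magnitude of 1 across the pass band. Both half-pel filters pass DC unchanged and have a zero at the Nyquist frequency, so the comparison of interest is how closely each one stays to 1 in between.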
Figure 7: Representation of integer, half and quarter pel pixels [20]
Figure 7 shows the representation of the integer and fractional pel arrangement.
The modified filter co-efficients can then be compared on coding performance, frequency response, performance gain and complexity, and the outcome can be assessed against the results reported in “A comparison of Fractional-Pel Interpolation Filters in HEVC and H.264/AVC” [10].
Implementation and Results:
The half-pel and quarter-pel interpolation filter co-efficients are interchanged between the H.264/AVC [17] and HEVC [1] [11] [18] codecs using the reference software JM 18.6 [32] and HM 13 [16] respectively, and the results obtained are:
Waterfall_cif.yuv [14]: The results obtained are tabulated in table 1 and the graph is shown in figure 8.
Frame Width: 352
Frame Height: 288
Frame Rate: 25fps
Number of frames encoded: 25
| Metric | HEVC | H.264 |
|---|---|---|
| Encoding Time (seconds) | 30.731 | 26.268 |
| Bit Rate (kbits/sec) | 232.5279 | 233.7748 |
| Y-PSNR (dB) | 35.5360 | 35.5890 |
| U-PSNR (dB) | 36.8714 | 35.170 |
| V-PSNR (dB) | 37.9408 | 37.442 |
| Average PSNR (dB) | 36.2535 | 35.2182 |
Table 1: Sequence waterfall_cif.yuv comparison
Figure 8: RD plot for waterfall_cif.yuv
Bus_qcif.yuv [14]: The results obtained are tabulated in table 2 and the graph is shown in figure 9.
Frame Width: 176
Frame Height: 144
Frame Rate: 25fps
Number of frames encoded: 25
| Metric | HEVC | H.264 |
|---|---|---|
| Encoding Time (seconds) | 8.740 | 8.200 |
| Bit Rate (kbits/sec) | 320.3269 | 344.89 |
| Y-PSNR (dB) | 33.8993 | 31.8140 |
| U-PSNR (dB) | 38.2684 | 37.5162 |
| V-PSNR (dB) | 38.8129 | 40.0570 |
| Average PSNR (dB) | 36.2596 | 34.8071 |
Table 2: Sequence bus_qcif.yuv comparison
Figure 9: RD plot for bus_qcif.yuv
Coastguard.yuv [14]: The results obtained are tabulated in table 3 and the graph is shown in figure 10.
Frame Width: 352
Frame Height: 288
Frame Rate: 15fps
Number of frames encoded: 15
| Metric | HEVC | H.264 |
|---|---|---|
| Encoding Time (seconds) | 29.002 | 27.419 |
| Bit Rate (kbits/sec) | 543.0624 | 577.6629 |
| Y-PSNR (dB) | 33.9717 | 33.5862 |
| U-PSNR (dB) | 43.8025 | 44.6524 |
| V-PSNR (dB) | 44.6284 | 43.2639 |
| Average PSNR (dB) | 36.2576 | 35.4288 |
Table 3: Sequence coastguard.yuv comparison
Figure 10: RD plot for coastguard.yuv
Stefan_cif.yuv [14]: The results obtained are tabulated in table 4 and the graph is shown in figure 11.
Frame Width: 352
Frame Height: 288
Frame Rate: 30fps
Number of frames encoded: 30
| Metric | HEVC | H.264 |
|---|---|---|
| Encoding Time (seconds) | 30.746 | 29.143 |
| Bit Rate (kbits/sec) | 748.1192 | 769.14 |
| Y-PSNR (dB) | 35.4565 | 34.7083 |
| U-PSNR (dB) | 38.8193 | 37.9758 |
| V-PSNR (dB) | 38.7977 | 38.0275 |
| Average PSNR (dB) | 36.2945 | 35.5312 |
Table 4: Sequence stefan_cif.yuv comparison
Figure 11: RD plot for Stefan_cif.yuv
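The tabulated figures can be condensed into a per-sequence summary. The sketch below copies the bit rate, Y-PSNR and encoding time from Tables 1 to 4 and derives the relative bit-rate difference, the Y-PSNR difference and the encoding-time ratio of the HEVC column against the H.264 column. The dictionary layout and function name are illustrative assumptions; each comparison rests on a single operating point, so it is not a substitute for BD-PSNR or BD-Bit Rate.

```python
# (bit rate in kbits/sec, Y-PSNR in dB, encoding time in seconds) from Tables 1 to 4
RESULTS = {
    "waterfall_cif": {"HEVC": (232.5279, 35.5360, 30.731), "H.264": (233.7748, 35.5890, 26.268)},
    "bus_qcif":      {"HEVC": (320.3269, 33.8993, 8.740),  "H.264": (344.89, 31.8140, 8.200)},
    "coastguard":    {"HEVC": (543.0624, 33.9717, 29.002), "H.264": (577.6629, 33.5862, 27.419)},
    "stefan_cif":    {"HEVC": (748.1192, 35.4565, 30.746), "H.264": (769.14, 34.7083, 29.143)},
}

def summarize(entry):
    """Compare the HEVC column against the H.264 column for one sequence."""
    rate_h, psnr_h, time_h = entry["HEVC"]
    rate_a, psnr_a, time_a = entry["H.264"]
    return {
        "rate_saving_pct": 100.0 * (rate_a - rate_h) / rate_a,
        "y_psnr_gain_db": psnr_h - psnr_a,
        "time_ratio": time_h / time_a,
    }

for name, entry in RESULTS.items():
    s = summarize(entry)
    print(f"{name}: {s['rate_saving_pct']:.2f}% lower bit rate, "
          f"{s['y_psnr_gain_db']:+.4f} dB Y-PSNR, "
          f"{s['time_ratio']:.2f}x encoding time")
```

In all four sequences the HEVC column shows a lower bit rate and a longer encoding time than the H.264 column, while the Y-PSNR difference is positive for every sequence except waterfall_cif.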
References:
[1] Fraunhofer Heinrich Hertz Institute: http://hevc.hhi.fraunhofer.de/
[2] Open Patents and Standards Platform: http://www.iplytics.com/en/tag/hevc/
[3] HEVC Review Site: http://telcogroup.ru/files/materials-pdf/High_Efficiency_Video_Coding_H265.pdf
[4] Overview of HEVC: http://iphome.hhi.de/wiegand/assets/pdfs/2012_12_IEEE-HEVC-Overview.pdf
[5] HEVC Blog: http://www.extremetech.com/computing/162027-h-265-benchmarked-does-the-next-generation-video-codec-live-up-to-expectations
[6] Altera Technologies: http://www.altera.com/technology/system-design/articles/2013/tv-studio-system.html
[7] I. Richardson, “Real time implementation of H.264 Video Coding”, IEEE International SOC Conference, PP: 390, Sept. 2008
[8] H.265 Blog: http://www.h265.net/2010/07/adaptive-interpolation-filter-for-video-coding.html
[9] CNET Blog: http://news.cnet.com/8301-11386_3-57566116-76/hevc-video-standard-finished-high-end-improvements-coming/
[10] H. Lv, et al, “A comparison of fractional-pel interpolation in HEVC and H.264/AVC”, IEEE Conference on Visual Communications and Image Processing (VCIP), PP: 1-6, Nov. 2012
[11] G. J. Sullivan, et al, “Overview of the HEVC Standard”, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), Vol: 22, No: 12, PP: 1649-1668, Dec. 2012
[12] B. Lee, et al, “Performance Comparison of various interpolation methods for color filter arrays”, IEEE Symposium on Industrial Electronics, Vol: 1, PP: 232-236, June 2001
[13] V. Yu and J. Ostermann, “Locally Adaptive Non-Separable Interpolation Filter for H.264/AVC”, IEEE International Conference on Image Processing, PP: 33-36, Oct. 2006
[14] Video Test Sequences: http://trace.eas.asu.edu/yuv/
[15] Tortoise SVN Downloadable Software Link: http://tortoisesvn.net/downloads.html
[16] HM 13 Software Link: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-13.0+RExt-6.0rc1/
[17] H.264 Advanced Video Coding Blog: https://www.vcodex.com/h264.html
[18] G. J. Sullivan, et al, “Standardized Extensions of HEVC”, IEEE Journal of Selected Topics in Signal Processing, Vol: 7, No: 6, PP: 1001-1016, Dec. 2013
[19] K. R. Rao, D. N. Kim and J. J. Hwang, “Video coding standards”, Springer Publications, Jan. 2014: http://www.springer.com/physics/book/978-94-007-6741-6
[20] SPIE Digital Library Article on HEVC: http://electronicimaging.spiedigitallibrary.org/article.aspx?articleid=1730243
[21] Karhunen-Loeve Transform: http://en.wikipedia.org/wiki/Karhunen%E2%80%93Lo%C3%A8ve_theorem
[22] Sharp 8Kx4K TV: http://www.sound-news.net/index.php/the-novosti/hifi-av-novosti/item/552-sharp-8kx4k-tv
[23] Institute of Computer and Communication Engineering, Article on HEVC: http://research.ncku.edu.tw/re/articles/e/20071102/2.html
[24] FIR Filter: http://en.wikipedia.org/wiki/Finite_impulse_response
[25] JCT-VC Document Management System: http://phenix.int-evry.fr/jct/
[26] T. Wiegand, et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol: 13, No: 7, PP: 560-576, July 2003
[27] Iowegian International DSP Site: http://www.dspguru.com/dsp/faqs/fir/basics
[28] N-Tap FIR Filter: http://www.analog.com/static/imported-files/seminars_webcasts/MixedSignal_Sect6.pdf
[29] I. Richardson, “The H.264 Advanced Video Compression Standard”, Wiley Publications, Aug. 2010: http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470516925.html
[30] HM 13 Software Reference Manual: http://mpeg.chiariglione.org/standards/mpeg-h/high-efficiency-video-coding/high-efficiency-video-coding-hevc-encoder-description
[31] JPEG: http://www.jpeg.org/
[32] JM 18.6 Software Repository: http://iphome.hhi.de/suehring/tml/download/
[33] BD-Metrics: http://www.mathworks.com/matlabcentral/fileexchange/27798-bjontegaard-metric/content/bjontegaard.m
[34] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), Vol: 22, PP: 1646-1909, Dec. 2012
[35] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), Vol: 23, PP: 2009-2142, Dec. 2013