Project proposal topic: Advanced Video Coding



PROJECT PROPOSAL
Topic: Advanced Video Coding

By


Jagriti Dhingra
Under the guidance of Dr. K. R. Rao

TABLE OF ACRONYMS

ATSC: Advanced Television Systems Committee.

AVC: Advanced Video Coding.

BD-BR: Bjontegaard Delta Bitrate.

BD-PSNR: Bjontegaard Delta Peak Signal to Noise Ratio.

CABAC: Context Adaptive Binary Arithmetic Coding.

CTB: Coding Tree Block.

CTU: Coding Tree Unit.

CU: Coding Unit.

DBF: De-blocking Filter.

DCT: Discrete Cosine Transform.

DVB: Digital Video Broadcast.

HEVC: High Efficiency Video Coding.

HM: HEVC Test Model.

ICME: International Conference on Multimedia and Expo.

IEC: International Electrotechnical Commission.

ISDB: Integrated Services Digital Broadcasting.

ISO: International Organization for Standardization.

ITU-T: International Telecommunication Union- Telecommunication Standardization Sector.

JCT: Joint Collaborative Team.

JCT-VC: Joint Collaborative Team on Video Coding.

JM: H.264 Test Model.

JPEG: Joint Photographic Experts Group.

MC: Motion Compensation.

ME: Motion Estimation.

MPEG: Moving Picture Experts Group.

MSE: Mean Square Error.

PB: Prediction Block.

PSNR: Peak Signal to Noise Ratio.

QP: Quantization Parameter

SAO: Sample Adaptive Offset.

SSIM: Structural Similarity Index.

TB: Transform Block.

TU: Transform Unit.

VCEG: Video Coding Experts Group.


ADVANCED VIDEO CODING AND ITS COMPARISON WITH OTHER VIDEO CODING STANDARDS
Objective:

It is proposed to study Advanced Video Coding (AVC), that is, H.264. Coding simulations will be performed on various sets of test images. This paper covers the basic details of H.264, such as its features, applications, and versions.

The paper also compares H.264 with other video coding standards such as HEVC and MPEG-2.

The main objectives of the H.264/AVC standard are focused on coding efficiency, architecture, and functionalities. More specifically, an important objective was the achievement of a substantial increase (roughly a doubling) of coding efficiency over MPEG-2 Video for high delay applications and over H.263 version 2 for low-delay applications, while keeping implementation costs within an acceptable range. Doubling coding efficiency corresponds to halving the bit rate necessary to represent video content with a given level of perceptual picture quality. It also corresponds to doubling the number of channels of video content of a given quality within a given limited bit-rate delivery system such as a broadcast network. The architecture-related objective was to give the design a “network-friendly” structure, including enhanced error/loss robustness capabilities, in particular, which could address applications requiring transmission over various networks under various delay and loss conditions.[3]
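The relation between coding efficiency, bit rate, and channel count stated above can be checked with a short calculation. This is only an illustrative sketch: the link capacity and per-channel rates are hypothetical numbers chosen to make the arithmetic visible, and the function names are not taken from any standard or tool.

```python
def bitrate_saving(old_mbps, new_mbps):
    """Fraction of bit rate saved when moving from old_mbps to new_mbps."""
    return 1.0 - new_mbps / old_mbps


def channels_in_link(link_mbps, per_channel_mbps):
    """Number of whole channels that fit in a fixed-capacity link."""
    return int(link_mbps // per_channel_mbps)


# Hypothetical numbers, not measurements from the cited references.
LINK_MBPS = 36.0
OLD_RATE_MBPS = 4.0
NEW_RATE_MBPS = OLD_RATE_MBPS / 2  # doubled coding efficiency

print(bitrate_saving(OLD_RATE_MBPS, NEW_RATE_MBPS))   # 0.5
print(channels_in_link(LINK_MBPS, OLD_RATE_MBPS))     # 9
print(channels_in_link(LINK_MBPS, NEW_RATE_MBPS))     # 18
```

Halving the per-channel rate doubles the number of channels of the same quality that fit in the same link, which is the sense in which the two statements in the paragraph are equivalent.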


Introduction:

H.264 or MPEG-4 Part 10, Advanced Video Coding (MPEG-4 AVC) is a video compression format that is currently one of the most commonly used formats for the recording, compression, and distribution of video content.

H.264 technology aims to provide good video quality at considerably low bit rates and at a reasonable level of complexity, while providing flexibility for a wide range of applications. [1]

The MPEG-2 video coding standard (also known as ITU-T H.262) [2], which was developed about ten years ago primarily as an extension of prior MPEG-1 video capability with support of interlaced video coding, was an enabling technology for digital television systems worldwide. It is widely used for the transmission of standard definition (SD) and high definition (HD) TV signals over satellite, cable, and terrestrial emission and the storage of high-quality SD video signals onto DVDs.

H.264/AVC has achieved a significant improvement in compression performance compared to prior standards, and it provides a network-friendly representation of the video that addresses both conversational (video telephony) and non-conversational (storage, broadcast, or streaming) applications. [3]



H.264 is typically used for lossy compression in the strict mathematical sense, although the amount of loss may sometimes be imperceptible. It is also possible to create truly lossless encodings using it — e.g., to have localized lossless-coded regions within lossy-coded pictures or to support rare use cases for which the entire encoding is lossless.

H.264/MPEG-4 AVC is a block-oriented motion-compensation-based video compression standard developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC JTC1 Moving Picture Experts Group (MPEG). The project partnership effort is known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 AVC standard (formally, ISO/IEC 14496-10 – MPEG-4 Part 10, Advanced Video Coding) are jointly maintained so that they have identical technical content.

H.264 is perhaps best known as being one of the video encoding standards for Blu-ray Discs; all Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming internet sources, such as videos from Vimeo, YouTube, and the iTunes Store, web software such as the Adobe Flash Player and Microsoft Silverlight, and also various HDTV broadcasts over terrestrial (ATSC, ISDB-T, DVB-T or DVB-T2), cable (DVB-C), and satellite (DVB-S and DVB-S2).[4]

Overview of the H.264/AVC Video Coding Standard

The intent of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (i.e., half or less the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement. An additional goal was to provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems.[4]

Video coding for telecommunication applications has evolved through the development of the ITU-T H.261, H.262 (MPEG-2), and H.263 video coding standards (and later enhancements of H.263 known as H.263+ and H.263++) and has diversified from ISDN and T1/E1 service to embrace PSTN, mobile wireless networks, and LAN/Internet network delivery. Throughout this evolution, continued efforts have been made to maximize coding efficiency while dealing with the diversification of network types and their characteristic formatting and loss/error robustness requirements.[5]

In December of 2001, VCEG and the Moving Picture Experts Group (MPEG) ISO/IEC JTC 1/SC 29/WG 11 formed a Joint Video Team (JVT), with the charter to finalize the draft new video coding standard for formal approval submission as H.264/AVC [1] in March 2003.

The scope of the standardization is illustrated in Fig. 1, which shows the typical video coding/decoding chain (excluding the transport or storage of the video signal).




Fig. 1. Scope of video coding standardization.[5]
The standardization of the first version of H.264/AVC was completed in May 2003. In the first project to extend the original standard, the JVT then developed what was called the Fidelity Range Extensions (FRExt). These extensions enabled higher quality video coding by supporting increased sample bit depth precision and higher-resolution color information, including sampling structures known as Y'CbCr 4:2:2 (=YUV 4:2:2) and Y'CbCr 4:4:4. Several other features were also included in the Fidelity Range Extensions project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-specified perceptual-based quantization weighting matrices, efficient inter-picture lossless coding, and support of additional color spaces. The design work on the Fidelity Range Extensions was completed in July 2004, and the drafting work on them was completed in September 2004.[4]

Further recent extensions of the standard then included adding five other new profiles intended primarily for professional applications, adding extended-gamut color space support, defining additional aspect ratio indicators, defining two additional types of "supplemental enhancement information" (post-filter hint and tone mapping), and deprecating one of the prior FRExt profiles that industry feedback indicated should have been designed differently.

The next major feature added to the standard was Scalable Video Coding (SVC). Specified in Annex G of H.264/AVC, SVC allows the construction of bitstreams that contain sub-bitstreams that also conform to the standard, including one such bitstream known as the "base layer" that can be decoded by an H.264/AVC codec that does not support SVC.

The next major feature added to the standard was Multiview Video Coding (MVC). Specified in Annex H of H.264/AVC, MVC enables the construction of bitstreams that represent more than one view of a video scene. An important example of this functionality is stereoscopic 3D video coding.[4]



Applications:

  • The H.264 video format has a very broad application range that covers all forms of digital compressed video from low bit-rate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. With the use of H.264, bit rate savings of 50% or more are reported. For example, H.264 has been reported to give the same Digital Satellite TV quality as current MPEG-2 implementations with less than half the bitrate, with current MPEG-2 implementations working at around 3.5 Mbit/s and H.264 at only 1.5 Mbit/s.[6]

  • To ensure compatibility and problem-free adoption of H.264/AVC, many standards bodies have amended or added to their video-related standards so that users of these standards can employ H.264/AVC.

  • The Digital Video Broadcast project (DVB) approved the use of H.264/AVC for broadcast television in late 2004.

  • The Advanced Television Systems Committee (ATSC) standards body in the United States approved the use of H.264/AVC for broadcast television in July 2008, although the standard is not yet used for fixed ATSC broadcasts within the United States.[7][8] It has also been approved for use with the more recent ATSC-M/H (Mobile/Handheld) standard, using the AVC and SVC portions of H.264.[9]

  • Conversational services over ISDN, Ethernet, LAN, DSL, wireless and mobile networks, modems, etc. or mixtures of these.

  • Multimedia messaging services (MMS) over ISDN, DSL, ethernet, LAN, wireless and mobile networks, etc.

  • Video-on-demand or multimedia streaming services over ISDN, cable modem, DSL, LAN, wireless networks, etc.[5]

  • The CCTV (Closed Circuit TV) and Video Surveillance markets have included the technology in many products.

  • Canon and Nikon DSLRs use H.264 video wrapped in QuickTime MOV containers as the native recording format.

  • AVCHD is a high-definition recording format designed by Sony and Panasonic that uses H.264 (conforming to H.264 while adding additional application-specific features and constraints).

  • AVC-Intra is an intraframe-only compression format, developed by Panasonic.

Design Feature Highlights:

To address the need for flexibility and customizability, the H.264/AVC design covers a video coding layer (VCL), which is designed to efficiently represent the video content, and a network abstraction layer (NAL), which formats the VCL representation of the video and provides header information in a manner appropriate for conveyance by a variety of transport layers or storage media.




Fig. 2. Structure of H.264/AVC video encoder[5]
Some highlighted features of the design that enable enhanced coding efficiency include the following enhancements of the ability to predict the values of the content of a picture to be encoded.
Variable block-size motion compensation with small block sizes: This standard supports more flexibility in the selection of motion compensation block sizes and shapes than any previous standard, with a minimum luma motion compensation block size as small as 4×4.
Quarter-sample-accurate motion compensation: Most prior standards enable half-sample motion vector accuracy at most. The new design improves upon this by adding quarter-sample motion vector accuracy, as first found in an advanced profile of the MPEG-4 Visual (Part 2) standard, but further reduces the complexity of the interpolation processing compared to the prior design.
Motion vectors over picture boundaries: While motion vectors in MPEG-2 and its predecessors were required to point only to areas within the previously-decoded reference picture, the picture boundary extrapolation technique first found as an optional feature in H.263 is included in H.264/AVC.
In-the-loop deblocking filtering: Block-based video coding produces artifacts known as blocking artifacts. These can originate from both the prediction and residual difference coding stages of the decoding process. Application of an adaptive deblocking filter is a well-known method of improving the resulting video quality, and when designed well, this can improve both objective and subjective video quality. Building further on a concept from an optional feature of H.263, the deblocking filter in the H.264/AVC design is brought within the motion-compensated prediction loop, so that this improvement in quality can be used in inter-picture prediction to improve the ability to predict other pictures as well.
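The quarter-sample and picture-boundary features above can be illustrated in one dimension. The sketch below is an assumption-laden simplification: the six filter taps (1, -5, 20, 20, -5, 1) and the rounding rules are the ones commonly described for H.264 luma interpolation and are not given in this proposal, only one row of 8-bit samples is handled, and the helper names are hypothetical.

```python
def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def sample(row, i):
    """Read row[i]; positions outside the row repeat the edge sample,
    a 1-D stand-in for picture boundary extrapolation."""
    return row[max(0, min(len(row) - 1, i))]


def half_sample(row, i):
    """Interpolated value halfway between row[i] and row[i + 1]."""
    taps = (1, -5, 20, 20, -5, 1)  # assumed 6-tap filter, sums to 32
    acc = sum(t * sample(row, i - 2 + k) for k, t in enumerate(taps))
    return clip8((acc + 16) >> 5)


def quarter_sample(row, i):
    """Value a quarter of the way from row[i] towards row[i + 1]:
    rounded average of the integer sample and the half sample."""
    return (sample(row, i) + half_sample(row, i) + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
print(half_sample(ramp, 3), quarter_sample(ramp, 3))  # 35 33
```

On a linear ramp the half-sample value lands midway between the two neighbouring samples, and the quarter-sample value lies between the integer sample and the half sample, as expected of an interpolator.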
NAL

The NAL is designed in order to provide “network friendliness” to enable simple and effective customization of the use of the VCL for a broad variety of systems. The NAL facilitates the ability to map H.264/AVC VCL data to transport layers such as:

• RTP/IP for any kind of real-time wire-line and wireless Internet services (conversational and streaming);

• File formats, e.g., ISO MP4 for storage and MMS;

• H.32X for wireline and wireless conversational services;

• MPEG-2 systems for broadcasting services, etc.



The full degree of customization of the video content to fit the needs of each particular application is outside the scope of the H.264/AVC standardization effort, but the design of the NAL anticipates a variety of such mappings. Some key concepts of the NAL are NAL units, byte stream, and packet format uses of NAL units, parameter sets, and access units.[5]


  1. NAL Units

The coded video data is organized into NAL units, each of which is effectively a packet that contains an integer number of bytes. The first byte of each NAL unit is a header byte that contains an indication of the type of data in the NAL unit, and the remaining bytes contain payload data of the type indicated by the header. The payload data in the NAL unit is interleaved as necessary with emulation prevention bytes, which are bytes inserted with a specific value to prevent a particular pattern of data called a start code prefix from being accidentally generated inside the payload. The NAL unit structure definition specifies a generic format for use in both packet-oriented and bitstream-oriented transport systems, and a series of NAL units generated by an encoder is referred to as a NAL unit stream.
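A minimal sketch of the header byte and of emulation prevention follows. The field names (forbidden_zero_bit, nal_ref_idc, nal_unit_type), the 1/2/5-bit split of the header byte, and the inserted byte value 0x03 follow the usual description of the H.264 NAL unit syntax; they are not stated in this proposal, so they should be read as assumptions, and the function names are hypothetical.

```python
def parse_nal_header(first_byte):
    """Split the one-byte NAL unit header into its three fields."""
    return {
        "forbidden_zero_bit": first_byte >> 7,
        "nal_ref_idc": (first_byte >> 5) & 0x03,
        "nal_unit_type": first_byte & 0x1F,
    }


def add_emulation_prevention(payload):
    """Insert 0x03 after any two zero bytes that are followed by a byte
    <= 0x03, so that 00 00 01 never appears inside the payload."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b <= 3:
            out.append(3)
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def remove_emulation_prevention(data):
    """Inverse operation, as a decoder would apply it."""
    out = bytearray()
    zeros = 0
    for b in data:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue  # drop the inserted byte
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


print(parse_nal_header(0x67))
print(add_emulation_prevention(b"\x00\x00\x01").hex())  # 00000301
```

The round trip through both functions returns the original payload, which is the property the start code search in the byte stream format relies on.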

  2. NAL Units in Byte-Stream Format Use

Some systems (e.g., H.320 and MPEG-2/H.222.0 systems) require delivery of the entire or partial NAL unit stream as an ordered stream of bytes or bits within which the locations of NAL unit boundaries need to be identifiable from patterns within the coded data itself. For use in such systems, the H.264/AVC specification defines a byte stream format. In the byte stream format, each NAL unit is prefixed by a specific pattern of three bytes called a start code prefix. The boundaries of the NAL unit can then be identified by searching the coded data for the unique start code prefix pattern. The use of emulation prevention bytes guarantees that start code prefixes are unique identifiers of the start of a new NAL unit.
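The boundary search described above can be sketched as follows. The three-byte start code prefix value 00 00 01 is the commonly documented one and is an assumption here, as is the stripping of zero bytes that may sit between one NAL unit and the next prefix; `split_byte_stream` is a hypothetical helper, not part of any library.

```python
START_CODE_PREFIX = b"\x00\x00\x01"  # assumed three-byte pattern


def split_byte_stream(stream):
    """Return the NAL units found between start code prefixes."""
    units = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1:
        begin = pos + len(START_CODE_PREFIX)
        nxt = stream.find(START_CODE_PREFIX, begin)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes before the next prefix are padding, not payload
        # (assumes a NAL unit never ends in a zero byte).
        units.append(stream[begin:end].rstrip(b"\x00"))
        pos = nxt
    return units


stream = (b"\x00\x00\x00\x01\x67\x42"
          b"\x00\x00\x01\x68\xce"
          b"\x00\x00\x01\x65\x88")
print([u.hex() for u in split_byte_stream(stream)])
# ['6742', '68ce', '6588']
```

Because emulation prevention keeps the prefix pattern out of the payload, a plain substring search of this kind is sufficient to recover the unit boundaries.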


  3. NAL Units in Packet-Transport System Use

In other systems (e.g., internet protocol/RTP systems), the coded data is carried in packets that are framed by the system transport protocol, and identification of the boundaries of NAL units within the packets can be established without use of start code prefix patterns. In such systems, the inclusion of start code prefixes in the data would be a waste of data carrying capacity, so instead the NAL units can be carried in data packets without start code prefixes.


  4. VCL and Non-VCL NAL Units

NAL units are classified into VCL and non-VCL NAL units. The VCL NAL units contain the data that represents the values of the samples in the video pictures, and the non-VCL NAL units contain any associated additional information such as parameter sets (important header data that can apply to a large number of VCL NAL units) and supplemental enhancement information (timing information and other supplemental data that may enhance usability of the decoded video signal but are not necessary for decoding the values of the samples in the video pictures).
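The classification can be expressed as a lookup on the type field of the NAL unit header. The numeric type values below (1 to 5 for coded slice data, 6 for supplemental enhancement information, 7 and 8 for parameter sets, 9 for the access unit delimiter) are the commonly documented H.264 assignments and do not appear in this proposal, so they are labeled here as assumptions; the table covers only these nine types.

```python
# Assumed nal_unit_type assignments (subset only).
NAL_TYPE_NAMES = {
    1: "coded slice (non-IDR picture)",
    2: "coded slice data partition A",
    3: "coded slice data partition B",
    4: "coded slice data partition C",
    5: "coded slice (IDR picture)",
    6: "supplemental enhancement information",
    7: "sequence parameter set",
    8: "picture parameter set",
    9: "access unit delimiter",
}


def is_vcl(nal_unit_type):
    """VCL NAL units carry the coded sample data of the pictures."""
    return 1 <= nal_unit_type <= 5


def describe(nal_unit_type):
    kind = "VCL" if is_vcl(nal_unit_type) else "non-VCL"
    name = NAL_TYPE_NAMES.get(nal_unit_type, "other/unlisted")
    return f"{kind}: {name}"


for t in (7, 8, 6, 5, 1):
    print(t, describe(t))
```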


  5. Parameter Sets

A parameter set contains information that is expected to change rarely and that applies to the decoding of a large number of VCL NAL units. There are two types of parameter sets:

• sequence parameter sets, which apply to a series of consecutive coded video pictures called a coded video sequence;

• picture parameter sets, which apply to the decoding of one or more individual pictures within a coded video sequence.

In other applications (Fig. 3), it can be advantageous to convey the parameter sets “out-of-band” using a more reliable transport mechanism than the video channel itself.
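One way to picture the mechanism is a small decoder-side store in which each picture parameter set points to a sequence parameter set by identifier, and each slice points to a picture parameter set. The sketch below assumes that identifier-based referencing; the class, its methods, and the example fields (width, height, entropy_coder) are hypothetical and do not reproduce the actual syntax elements of the standard.

```python
class ParameterSetStore:
    """Holds parameter sets, however they arrived (in-band or out-of-band)."""

    def __init__(self):
        self.sps = {}  # sps_id -> sequence-level parameters
        self.pps = {}  # pps_id -> (sps_id, picture-level parameters)

    def add_sps(self, sps_id, params):
        self.sps[sps_id] = dict(params)

    def add_pps(self, pps_id, sps_id, params):
        self.pps[pps_id] = (sps_id, dict(params))

    def activate(self, pps_id):
        """Resolve everything a slice referring to pps_id needs."""
        sps_id, pps_params = self.pps[pps_id]
        return {**self.sps[sps_id], **pps_params}


store = ParameterSetStore()
store.add_sps(0, {"width": 1280, "height": 720})     # hypothetical fields
store.add_pps(0, 0, {"entropy_coder": "CABAC"})
store.add_pps(1, 0, {"entropy_coder": "CAVLC"})

# Many slices can refer to the same small identifier instead of
# repeating the header data.
print(store.activate(1))
```

Since a slice carries only an identifier, the rarely changing data can be delivered once over a reliable channel and reused for a large number of VCL NAL units.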



Fig. 3. Parameter set use with reliable “out-of-band” parameter set exchange.[5]


  6. Access Units

A set of NAL units in a specified form is referred to as an access unit. The decoding of each access unit results in one decoded picture. The format of an access unit is shown in Fig. 4.

Each access unit contains a set of VCL NAL units that together compose a primary coded picture. It may also be prefixed with an access unit delimiter to aid in locating the start of the access unit. Some supplemental enhancement information containing data such as picture timing information may also precede the primary coded picture.




Fig. 4. Structure of an access unit[5]


  7. Coded Video Sequences

A coded video sequence consists of a series of access units that are sequential in the NAL unit stream and use only one sequence parameter set. Each coded video sequence can be decoded independently of any other coded video sequence, given the necessary parameter set information, which may be conveyed “in-band” or “out-of-band”.
Profiles:

The standard defines sets of capabilities, which are referred to as profiles, targeting specific classes of applications. These are declared as a profile code (profile_idc) and a set of constraints applied in the encoder. This allows a decoder to recognize the requirements to decode that specific stream.

Profiles for non-scalable 2D video applications include the following:

Constrained Baseline Profile (CBP, 66 with constraint set 1)

Primarily for low-cost applications, this profile is most typically used in videoconferencing and mobile applications. It corresponds to the subset of features that are in common between the Baseline, Main, and High Profiles.



Baseline Profile (BP, 66)

Primarily for low-cost applications that require additional data loss robustness, this profile is used in some videoconferencing and mobile applications. This profile includes all features that are supported in the Constrained Baseline Profile, plus three additional features that can be used for loss robustness (or for other purposes such as low-delay multi-point video stream compositing). The importance of this profile has faded somewhat since the definition of the Constrained Baseline Profile in 2009. All Constrained Baseline Profile bitstreams are also considered to be Baseline Profile bitstreams, as these two profiles share the same profile identifier code value.



Extended Profile (XP, 88)

Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.



Main Profile (MP, 77)

This profile is used for standard-definition digital TV broadcasts that use the MPEG-4 format as defined in the DVB standard.[14] It is not, however, used for high-definition television broadcasts, as the importance of this profile faded when the High Profile was developed in 2004 for that application.



High Profile (HiP, 100)

The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (for example, this is the profile adopted by the Blu-ray Disc storage format and the DVB HDTV broadcast service).



Progressive High Profile (PHiP, 100 with constraint set 4)

Similar to the High profile, but without support of field coding features.



Constrained High Profile (100 with constraint set 4 and 5)

Similar to the Progressive High profile, but without support of B (bi-predictive) slices.



High 10 Profile (Hi10P, 110)

Going beyond typical mainstream consumer product capabilities, this profile builds on top of the High Profile, adding support for up to 10 bits per sample of decoded picture precision.



High 4:2:2 Profile (Hi422P, 122)

Primarily targeting professional applications that use interlaced video, this profile builds on top of the High 10 Profile, adding support for the 4:2:2 chroma subsampling format while using up to 10 bits per sample of decoded picture precision.



High 4:4:4 Predictive Profile (Hi444PP, 244)

This profile builds on top of the High 4:2:2 Profile, supporting up to 4:4:4 chroma sampling, up to 14 bits per sample, and additionally supporting efficient lossless region coding and the coding of each picture as three separate color planes.

For camcorders, editing, and professional applications, the standard contains four additional Intra-frame-only profiles, which are defined as simple subsets of other corresponding profiles. These are mostly for professional (e.g., camera and editing system) applications:

High 10 Intra Profile (110 with constraint set 3)

The High 10 Profile constrained to all-Intra use.



High 4:2:2 Intra Profile (122 with constraint set 3)

The High 4:2:2 Profile constrained to all-Intra use.



High 4:4:4 Intra Profile (244 with constraint set 3)

The High 4:4:4 Profile constrained to all-Intra use.



CAVLC 4:4:4 Intra Profile (44)

The High 4:4:4 Profile constrained to all-Intra use and to CAVLC entropy coding (i.e., not supporting CABAC).
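The profile codes and constraint sets listed above for the non-scalable profiles can be collected into a small lookup, which is roughly what a decoder does when it inspects a stream. This is a sketch built only from the values in the list above; the function name and the representation of constraint sets as a collection of integers are hypothetical conveniences.

```python
BASE_PROFILES = {
    66: "Baseline",
    77: "Main",
    88: "Extended",
    100: "High",
    110: "High 10",
    122: "High 4:2:2",
    244: "High 4:4:4 Predictive",
    44: "CAVLC 4:4:4 Intra",
}

INTRA_PROFILES = {
    110: "High 10 Intra",
    122: "High 4:2:2 Intra",
    244: "High 4:4:4 Intra",
}


def profile_name(profile_idc, constraint_sets=()):
    """Map profile_idc plus the constraint sets that are set to a name."""
    cs = set(constraint_sets)
    if profile_idc == 66 and 1 in cs:
        return "Constrained Baseline"
    if profile_idc == 100 and {4, 5} <= cs:
        return "Constrained High"
    if profile_idc == 100 and 4 in cs:
        return "Progressive High"
    if profile_idc in INTRA_PROFILES and 3 in cs:
        return INTRA_PROFILES[profile_idc]
    return BASE_PROFILES.get(profile_idc, "unknown")


print(profile_name(66, [1]))      # Constrained Baseline
print(profile_name(100, [4, 5]))  # Constrained High
print(profile_name(122, [3]))     # High 4:2:2 Intra
```

The more specific constrained variants are tested before the base code, since they share the same profile_idc value with the profile they restrict.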

As a result of the Scalable Video Coding (SVC) extension, the standard contains five additional scalable profiles, which are defined as a combination of a H.264/AVC profile for the base layer (identified by the second word in the scalable profile name) and tools that achieve the scalable extension:

Scalable Baseline Profile (83)

Primarily targeting video conferencing, mobile, and surveillance applications, this profile builds on top of the Constrained Baseline profile to which the base layer (a subset of the bitstream) must conform. For the scalability tools, a subset of the available tools is enabled.



Scalable Constrained Baseline Profile (83 with constraint set 5)

A subset of the Scalable Baseline Profile intended primarily for real-time communication applications.



Scalable High Profile (86)

Primarily targeting broadcast and streaming applications, this profile builds on top of the H.264/AVC High Profile to which the base layer must conform.



Scalable Constrained High Profile (86 with constraint set 5)

A subset of the Scalable High Profile intended primarily for real-time communication applications.



Scalable High Intra Profile (86 with constraint set 3)

Primarily targeting production applications, this profile is the Scalable High Profile constrained to all-Intra use.

As a result of the Multiview Video Coding (MVC) extension, the standard contains two multiview profiles:

Stereo High Profile (128)

This profile targets two-view stereoscopic 3D video and combines the tools of the High profile with the inter-view prediction capabilities of the MVC extension.



Multiview High Profile (118)

This profile supports two or more views using both inter-picture (temporal) and MVC inter-view prediction, but does not support field pictures and macroblock-adaptive frame-field coding.



Multiview Depth High Profile (138)

Fig. 5: The specific coding parts of the profiles in H.264 [11]

Each profile specifies a subset of the entire bitstream syntax and the limits that shall be supported by all decoders conforming to that profile. There are three profiles in the first version: Baseline, Main, and Extended. The Main profile is designed for digital storage media and television broadcasting. The H.264 Main profile, which is a subset of the High profile, was designed with compression coding efficiency as its main target. The Fidelity Range Extensions [10] provide a major breakthrough with regard to compression efficiency. The profiles are shown in Fig. 5.
There are four High profiles defined in the Fidelity Range Extensions: High, High 10, High 4:2:2, and High 4:4:4. The High profile supports 8-bit video with 4:2:0 sampling for high-resolution applications. The High 10 profile supports 4:2:0 sampling with up to 10 bits of representation accuracy per sample. The High 4:2:2 profile supports up to 4:2:2 chroma sampling with up to 10 bits per sample. The High 4:4:4 profile supports up to 4:4:4 chroma sampling with up to 12 bits per sample, and additionally supports efficient lossless region coding [11].
H.264/AVC Main Profile Intra-Frame Coding:

The main difference between H.264/AVC Main profile intra-frame coding and JPEG 2000 is in the transformation stage. The characteristics of this stage also determine the quantization and entropy coding stages. H.264 uses block-based coding, shown in Fig. 6, which is like the block translational model employed in the inter-frame coding framework [12]. A 4x4 transform block size is used instead of 8x8. H.264 exploits spatial redundancies by intra-frame prediction of the macroblock from the neighboring pixels of the same frame, thus taking advantage of inter-block spatial prediction.

The combination of spatial prediction and a wavelet-like 2-level transform iteration is effective in smooth image regions. This feature enables H.264 to be competitive with JPEG 2000 in high-resolution, high-quality applications. JPEG cannot compete, even though it also uses DCT-based block coding. A DCT coding framework is competitive with wavelet transform coding if the correlation between neighboring pixels is properly exploited using context-adaptive entropy coding.

In H.264, after transformation, the coefficients are scalar quantized, zig-zag scanned, and entropy coded by CABAC. The alternative entropy coder, CAVLC, operates by switching between different VLC tables, which are designed using exponential Golomb codes, based on locally available contexts collected from neighboring blocks; it is used at the cost of some coding efficiency [11].
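The quantize-and-scan stage and the exponential Golomb code mentioned above can be sketched as follows. Several points are assumptions rather than statements from the cited references: the quantizer is a plain round-to-nearest scalar quantizer with a hypothetical step size, the zig-zag order is the classic one that starts by stepping to the right, and the Exp-Golomb function shows only the generic unsigned code, not the context-switched tables that CAVLC actually uses.

```python
def quantize(coeff, step):
    """Round-to-nearest scalar quantization (illustrative only)."""
    level = (abs(coeff) + step // 2) // step
    return level if coeff >= 0 else -level


def zigzag_order(n=4):
    """(row, col) positions of an n x n block in zig-zag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; odd ones downwards, even ones upwards.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def exp_golomb_ue(code_num):
    """Unsigned Exp-Golomb code word as a bit string."""
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits


coeffs = [[52, -9, 3, 0],
          [-7, 4, 0, 0],
          [2, 0, 0, 0],
          [0, 0, 0, 0]]
STEP = 4  # hypothetical quantizer step size
levels = [[quantize(c, STEP) for c in row] for row in coeffs]
scan = [levels[r][c] for r, c in zigzag_order()]
print(scan)
print([exp_golomb_ue(k) for k in range(4)])  # ['1', '010', '011', '00100']
```

After the scan, the nonzero levels are grouped at the front and the zeros form one long trailing run, which is what makes the subsequent entropy coding effective.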

H.264/AVC FRExt High Profile Intra-Frame Coding:

The main feature in FRExt that improves coding efficiency is the 8x8 integer transform, together with all the coding methods and prediction modes associated with adaptive selection between the 4x4 and 8x8 integer transforms. Other features include:



  • higher resolution for color representation, such as YUV 4:2:2 and YUV 4:4:4;

  • the addition of the 8x8 block size, which is a key factor at very high resolutions and high bit rates;

  • very high fidelity, including selective lossless representation of video.

Fig.6: Basic coding structure for a macroblock in H.264/AVC [11].


Context-based Adaptive Binary Arithmetic Coding (CABAC):

CABAC uses arithmetic coding to achieve good compression. The CABAC encoding process, shown in Fig. 7, consists of three elementary steps [13].



Fig.7: Block diagram for CABAC[15]



step 1 : binarization – maps non-binary symbols to a binary sequence before they are passed to the arithmetic coder.

step 2 : context modeling – selects a probability model for one or more elements based on previously encoded syntax elements.

step 3 : binary arithmetic coding – encodes the elements based on the selected probability model.
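The three steps can be imitated with a deliberately simplified model. The sketch below is not the CABAC engine of the standard: it assumes unary binarization, replaces the standard's probability state machine with simple adaptive counts per context, and, instead of running an arithmetic coder, reports the ideal code length of -log2(p) bits per bin that an arithmetic coder would approach. All names are hypothetical.

```python
import math


def unary_binarize(value):
    """Step 1 (assumed scheme): value -> that many 1s followed by a 0."""
    return [1] * value + [0]


class Context:
    """Step 2: adaptive probability estimate from previously coded bins."""

    def __init__(self):
        self.counts = [1, 1]  # smoothed counts of 0-bins and 1-bins

    def probability(self, bit):
        return self.counts[bit] / sum(self.counts)

    def update(self, bit):
        self.counts[bit] += 1


def estimated_bits(symbols, max_contexts=3):
    """Step 3 stand-in: sum of ideal code lengths over all bins."""
    contexts = {}
    total = 0.0
    for value in symbols:
        for index, bit in enumerate(unary_binarize(value)):
            # Bins at the same position share a context (capped).
            ctx = contexts.setdefault(min(index, max_contexts - 1), Context())
            total += -math.log2(ctx.probability(bit))
            ctx.update(bit)
    return total


skewed = [0] * 50
print(round(estimated_bits(skewed), 2), "bits for", len(skewed), "bins")
```

For a strongly skewed source the estimated cost falls well below one bit per bin, which is the gain that adaptive context modeling combined with arithmetic coding is designed to capture.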

HEVC

High Efficiency Video Coding (HEVC) [12] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). The main goal of HEVC standard is to significantly improve compression performance compared to existing standards (such as H.264/Advanced Video Coding [13]) in the range of 50% bit rate reduction at similar visual quality [6].

HEVC is designed to address existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [6]. It primarily targets consumer applications as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The revised standard enables new use cases with the support of additional pixel formats such as 4:2:2 and 4:4:4 and bit depths higher than 10 bits [20], embedded bit-stream scalability, and 3D video [33].

Figure 8: Block Diagram of HEVC CODEC [17]




Encoder and Decoder in HEVC
Source video, consisting of a sequence of video frames, is encoded or compressed by a video encoder to create a compressed video bit stream. The compressed bit stream is stored or transmitted. A video decoder decompresses the bit stream to create a sequence of decoded frames [17].

The video encoder performs the following steps:

• Partitioning each picture into multiple units

• Predicting each unit using inter or intra prediction, and subtracting the prediction from the unit

• Transforming and quantizing the residual (the difference between the original picture unit and the prediction)

• Entropy encoding transform output, prediction information, mode information and headers

The video decoder performs the following steps:

• Entropy decoding and extracting the elements of the coded sequence

• Rescaling and inverting the transform stage

• Predicting each unit and adding the prediction to the output of the inverse transform

• Reconstructing a decoded video image
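The encoder and decoder steps listed above can be traced on a single 4x4 unit. The sketch makes several simplifying assumptions that should not be read as the HEVC design: a 4x4 Hadamard matrix stands in for the actual integer transforms, the prediction is a flat block with a hypothetical value, quantization is a uniform scalar quantizer, and entropy coding is omitted.

```python
# 4x4 Hadamard matrix: symmetric, and H * H = 4 * I.
H = [[1, 1, 1, 1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, -1, 1, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def quantize(coeff, step):
    level = (abs(coeff) + step // 2) // step
    return level if coeff >= 0 else -level


def encode_block(block, prediction, step):
    """Subtract the prediction, transform the residual, quantize."""
    residual = [[block[i][j] - prediction[i][j] for j in range(4)]
                for i in range(4)]
    coeffs = matmul(matmul(H, residual), H)
    return [[quantize(c, step) for c in row] for row in coeffs]


def decode_block(levels, prediction, step):
    """Rescale, invert the transform, add the prediction."""
    coeffs = [[q * step for q in row] for row in levels]
    scaled = matmul(matmul(H, coeffs), H)  # equals 16 * residual
    return [[prediction[i][j] + ((scaled[i][j] + 8) >> 4) for j in range(4)]
            for i in range(4)]


block = [[60, 62, 64, 66],
         [61, 63, 65, 67],
         [62, 64, 66, 68],
         [63, 65, 67, 69]]
prediction = [[64] * 4 for _ in range(4)]  # hypothetical flat prediction

for step in (1, 8):
    recon = decode_block(encode_block(block, prediction, step), prediction, step)
    error = max(abs(recon[i][j] - block[i][j]) for i in range(4) for j in range(4))
    print("step", step, "max reconstruction error", error)
```

With a step size of 1 the reconstruction is exact; a larger step reduces the magnitude of the levels to be entropy coded at the price of a bounded reconstruction error, which is the basic rate-distortion trade-off of the chain.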

Figure 9: Block Diagram of HEVC Encoder [9]



Figure 10: Block Diagram of HEVC Decoder [31]

averaging the bit savings for each simulation point on the rate-quality curves. The coding gains of H.264 over MPEG-2 are summarized in Tables 1 and 2 for each of the sequences in set 1 (CCIR-601 format) and set 2 (CIF format), respectively. The H.264 encoder achieves roughly an average of 50% coding gain over CBR mode MPEG-2. Compared to approximately 3:1 rate reduction.[16]

MPEG-2

MPEG-2 is widely used as the format of digital television signals that are broadcast by terrestrial (over-the-air), cable, and direct broadcast satellite TV systems. It also specifies the format of movies and other programs that are distributed on DVD and similar discs. As such, TV stations, TV receivers, DVD players, and other equipment are often designed to this standard. MPEG-2 was the second of several standards developed by the Moving Picture Experts Group (MPEG) and is an international standard (ISO/IEC 13818).


The video section, part 2 of MPEG-2, is similar to the previous MPEG-1 standard, but also provides support for interlaced video, the format used by analog broadcast TV systems. MPEG-2 video is not optimized for low bit rates, especially less than 1 Mbit/s at standard definition resolutions. However, it outperforms MPEG-1 at 3 Mbit/s and above. MPEG-2 is directed at broadcast formats at higher data rates of 4 Mbit/s (DVD) and 19 Mbit/s (HDTV). All standards-compliant MPEG-2 video decoders are fully capable of playing back MPEG-1 video streams. MPEG-2 video is formally known as ISO/IEC 13818-2 and as ITU-T Rec. H.262.

MPEG-2 Profiles and Levels

MPEG-2 video supports a wide range of applications, from mobile to high-quality HD editing. For many applications, it is unrealistic and too expensive to support the entire standard. To allow such applications to support only subsets of it, the standard defines profiles and levels.[35]



MPEG-2 Profiles [35]

Simple Profile


This profile has the fewest tools and offers the basic toolkit for MPEG-2 encoding: intra and predicted frame encoding and decoding with YUV 4:2:0 color subsampling.

Main Profile


This profile has all the tools of the Simple Profile plus one more (termed bi-directional prediction). It gives better quality than the Simple Profile at the same bit rate. A Main Profile decoder decodes both Main and Simple Profile-encoded pictures. This backward compatibility pattern applies to the succession of profiles. A refinement of the Main Profile, sometimes unofficially known as Main Profile Professional Level or MPEG 422, allows line-simultaneous color-difference signals (4:2:2) to be used, but not the scalable tools of the higher profiles.

SNR Scalable Profile and Spatially Scalable Profile


The two profiles after the Main Profile are, successively, the SNR Scalable Profile and the Spatially Scalable Profile. These add tools which allow the coded video data to be partitioned into a base layer and one or more 'top-up' signals. The top-up signals can improve either the signal-to-noise ratio (SNR Scalability) or the resolution (Spatial Scalability). These scalable systems may have interesting uses. The lowest layer can be coded in a more robust way, and thus provide a means to broadcast to a wider area, or provide a service for more difficult reception conditions. Nevertheless, there will be a premium to be paid for their use in receiver complexity. Owing to the added complexity, none of the scalable profiles is supported by digital video broadcasting (DVB). The inputs to the system are YUV component video. However, the first four profiles code the color-difference signals line-sequentially.

High Profile


The High Profile includes all the previous tools plus the ability to code line-simultaneous color-difference signals. In effect, it is a 'super system', designed for the most sophisticated applications, where there is no constraint on bit rate.
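
The nesting of the profiles described above can be pictured as growing sets of coding tools, where a decoder built for one profile can handle a stream if it supports every tool that the stream's profile allows. The sketch below is illustrative only: the tool labels are informal names taken from the descriptions in this section rather than syntax elements of the standard, and the names `PROFILES` and `can_decode` are hypothetical.

```python
# Each profile is modelled as the set of tools it permits (informal labels).
SIMPLE = frozenset({"I-frames", "P-frames", "4:2:0"})
MAIN = SIMPLE | {"B-frames"}
SNR_SCALABLE = MAIN | {"SNR scalability"}
SPATIALLY_SCALABLE = SNR_SCALABLE | {"spatial scalability"}
HIGH = SPATIALLY_SCALABLE | {"4:2:2"}

PROFILES = {
    "Simple": SIMPLE,
    "Main": MAIN,
    "SNR Scalable": SNR_SCALABLE,
    "Spatially Scalable": SPATIALLY_SCALABLE,
    "High": HIGH,
}


def can_decode(decoder_profile, stream_profile):
    """True if the decoder's tool set covers every tool of the stream's profile."""
    return PROFILES[stream_profile] <= PROFILES[decoder_profile]


if __name__ == "__main__":
    print(can_decode("Main", "Simple"))   # Main decoders also decode Simple
    print(can_decode("Simple", "Main"))   # Simple decoders lack B-frames
```

The subset test mirrors the backward compatibility pattern: each profile in the succession contains all the tools of the one before it.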

The table is a tabulated form of the properties of the various MPEG-2 profiles.




MPEG-2 Levels




Description of Levels

A level defines limits on physical parameters such as bit rates, picture sizes and resolutions. There are four levels specified by MPEG-2: High Level, High 1440 Level, Main Level, and Low Level. MPEG-2 Video Main Profile at Main Level has sampling limits at ITU-R 601 parameters (PAL and NTSC). Profiles limit syntax (i.e. algorithms), whereas levels limit encoding parameters (sample rates, frame dimensions, coded bit rates, buffer size, etc.). Together, Main Profile and Main Level (abbreviated as MP@ML) keep complexity within current technical limits, yet still meet the needs of the majority of applications. MP@ML is the most widely accepted combination for most cable and satellite systems; however, different combinations are possible to suit other applications.
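
To make the idea of a level as a set of parameter limits concrete, the sketch below picks the lowest level whose limits cover a given stream. The numeric limits are the commonly quoted Main Profile upper bounds and are included here as assumptions for illustration; the normative values should be taken from H.262 [35]. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    max_width: int      # luma samples per line
    max_height: int     # lines per frame
    max_fps: int        # frames per second
    max_mbit_s: float   # maximum coded bit rate


# Assumed Main Profile limits, ordered from least to most demanding.
LEVELS = [
    Level("Low", 352, 288, 30, 4),
    Level("Main", 720, 576, 30, 15),
    Level("High 1440", 1440, 1152, 60, 60),
    Level("High", 1920, 1152, 60, 80),
]


def lowest_level(width, height, fps, mbit_s):
    """Return the name of the first level that covers the stream, or None."""
    for level in LEVELS:
        if (width <= level.max_width and height <= level.max_height
                and fps <= level.max_fps and mbit_s <= level.max_mbit_s):
            return level.name
    return None


if __name__ == "__main__":
    print(lowest_level(720, 576, 25, 6))     # a PAL standard-definition stream
    print(lowest_level(1920, 1080, 30, 19))  # an HDTV stream
```

A decoder that claims conformance to a given level must handle any stream within that level's limits, which is why the check tests all parameters together.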


Table shows a comparison between the four MPEG-2 levels on the basis of the frame size (PAL/NTSC) and the maximum bit rate for each.

Table . MPEG-2 Levels [36]





Future Work

Further, I will compare the different video coding standards using comparison metrics and the HEVC Test Model (HM) software.



REFERENCES:
[1] “Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC),” in Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, 2003.
[2] “Generic Coding of Moving Pictures and Associated Audio Information - Part 2: Video,” ITU-T and ISO/IEC JTC 1, ITU-T Recommendation H.262 and ISO/IEC 13818-2 (MPEG-2), 1994.
[3] IEEE SIGNAL PROCESSING MAGAZINE [148] MARCH 2007. https://www.ic.tu-berlin.de/fileadmin/fg121/publications/2007_42_h264avc_nutshell.pdf
[4] http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC
[5] IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003.
[6] Wenger et al. "RFC 3984 : RTP Payload Format for H.264 Video". p. 2.
[7] "ATSC Standard A/72 Part 1: Video System Characteristics of AVC in the ATSC Digital Television System" (PDF). Retrieved 2011-07-30.
[8] "ATSC Standard A/72 Part 2: AVC Video Transport Subsystem Characteristics" (PDF). Retrieved 2011-07-30.
[9] "ATSC Standard A/153 Part 7: AVC and SVC Video System Characteristics" (PDF). Retrieved 2011-07-30.
[10] G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions,” SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug. 2004
[11] T. Wiegand, G. Sullivan, G. Bjontegaard and A. Luthra, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp.560-576, July 2003.
[12] T. Tran, L.Liu and P. Topiwala, “Performance comparison of leading image codecs: H.264/AVC intra, JPEG 2000, and Microsoft HD photo,” Proc. SPIE Int’l Symposium, Digital Image Processing, San Diego, Sept. 2007.
[13] W.B. Pennebaker and J.L. Mitchell, JPEG: Still image data compression standard, Kluwer academic publishers, 2003.

[14]  "TS 101 154 – V1.9.1 – Digital Video Broadcasting (DVB); Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream" (PDF). Retrieved 2010-05-17.


[15] “The MPEG-2 international standard,” ISO/IEC, Reference number ISO/IEC 13818-2, 1996.
[16] R. Schafer, T. Wiegand, and H. Schwarz, “The Emerging H.264/AVC Standard”, EBU Technical Review, Jan. 2003.
[17] “Video coding for low bit rate communications,” ITU-T, ITU-T Recommendation H.263, ver. 1, 1995.
[18] Telenor H.263 codec, “ITU-T/SG-15, video codec test model, TMN5”, Telenor Research, 1995.
[19] HEVC white paper-Ittiam Systems: http://www.ittiam.com/Downloads/en/documentation.aspx
[20] HEVC white paper-Ateme: http://www.ateme.com/an-introduction-to-uhdtv-and-hevc
[21] HEVC white paper-Elemental Technologies: http://www.elementaltechnologies.com/lp/hevc-h265-demystified-white-paper
[22] White paper on PSNR-NI: http://www.ni.com/white-paper/13306/en/

[23] Test Sequences: ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/

[24] Access to HM 13.0 Reference Software: http://hevc.hhi.fraunhofer.de/
[25] Access to JM 18.6 Reference Software: http://iphome.hhi.de/suehring/tml/
[26] Website on PSNR: http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio
[27] Website on SSIM: http://en.wikipedia.org/wiki/Structural_similarity
[28] Access the website http://www-ee.uta.edu/Dip/Courses/EE5359/ and refer to the project by S. Kulkarni on “Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)”, University of Texas At Arlington, Spring 2013.
[29] Access the website http://www-ee.uta.edu/Dip/Courses/EE5359/ and refer to the project by H. B. Jain on “Comparative and performance analysis of HEVC and H.264 intra frame coding and JPEG 2000”, University of Texas At Arlington, Spring 2013.
[30] Access the website http://www-ee.uta.edu/Dip/Courses/EE5359/ and refer to the thesis by S. Vasudevan on “Fast intra prediction and fast residual quad-tree encoding implementation in HEVC”, University of Texas, Arlington, 2013.

[31] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.


[32] Z. Wang et al, “Image Quality Assessment: From Error Visibility to Structural Similarity”, IEEE Transactions on Image Processing, Vol. 13, No. 4, pp. 600-612, Apr. 2004.
[33] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of selected topics in Signal Processing, Vol. 7, No. 6, pp. 1001-1016, Dec. 2013.
[34] “MPEG-2”, Wikipedia, Feb. 14, 2008.

Available at


[35] “H.262 : Information technology - Generic coding of moving pictures and associated audio information: Video”, International Telecommunication Union, 2000-02.
[36] “MPEG-2 White paper”, Pinnacle Technical Documentation, Version 0.5, Pinnacle Systems, Feb. 29, 2000.