Several breakout meetings were held during the MPEG week to discuss VCTR matters. Four input contributions were reviewed, as listed below. Progress was made in particular on the integration of AVC technology (Baseline profile, intra coding) into the VCTR framework. The group also spent time developing a more fundamental understanding of the implications of VCTR for media coding development and its application, and set up a work plan for the next phase.
A paradigm shift can be observed in media coding today: many tools are used alike across many different codecs, rather than following the traditional "one tool, one functionality" design. Furthermore, the traditional view of "one standard per dedicated application" is challenged by the fact that many codecs are hosted on a single platform, including competing MPEG and non-MPEG codecs. The traditional codec-level conformance specification is not efficient for such multi-codec support, where many codecs share common or similar tools. MPEG should also shorten the standardization cycle for new technology (while retaining interoperability) and enable more flexible usage of MPEG codecs in a highly dynamic environment. Therefore, VCTR should not be an alternative specification of existing MPEG standards, but should rather enable and simplify MPEG-1/2/4 implementations. This includes the possibility to design new codecs based on VCTR, to simplify transcoding, to allow for efficient hardware implementation, and to define conformance testing at the level of Functional Units (FUs) instead of full codecs.
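The idea of assembling codecs from individually conformance-testable Functional Units can be illustrated with a minimal sketch. All names and operations below are hypothetical toy stand-ins, not the actual VCTR API or any normative FU definitions:

```python
# Toy sketch of codec assembly from Functional Units (FUs).
# Each FU is a small, separately testable processing stage; a decoder
# is just a chain of FUs, so different codecs can reuse the same FUs.
from functools import partial, reduce

def inverse_quantize(qp, coeffs):
    """FU: inverse quantization (toy: uniform scaling by qp)."""
    return [c * qp for c in coeffs]

def inverse_transform(coeffs):
    """FU: inverse transform (toy: identity stands in for an IDCT)."""
    return list(coeffs)

def clip_to_pixel_range(samples):
    """FU: clip reconstructed samples to the 8-bit pixel range."""
    return [max(0, min(255, s)) for s in samples]

def compose(*fus):
    """Assemble a decoder pipeline from a sequence of FUs."""
    return lambda data: reduce(lambda d, fu: fu(d), fus, data)

# Conformance can be checked per FU instead of per full codec:
assert inverse_quantize(2, [1, -3]) == [2, -6]

decoder = compose(partial(inverse_quantize, 2),
                  inverse_transform,
                  clip_to_pixel_range)
print(decoder([10, 200, -5]))  # [20, 255, 0]
```

The point of the sketch is the last line: once every FU passes its own conformance vectors, a new codec built from those FUs needs no codec-level re-testing of the shared stages.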
As the next major step, a demo of VCTR (including a software implementation) is planned for the October meeting. This will demonstrate the feasibility of defining a new codec from VCTR building blocks (including syntax information and a bitstream based on the new syntax). An example could be the integration of AVC intra coding tools into an MPEG-4 Visual (Part 2) codec. Transform-level transcoding is a further candidate for demonstration. The Technical Description (TD) of VCTR as well as the software will be developed further to enable such a demo. A possible timeline for VCTR standardization (as discussed in the breakout group) could be CD in April 2006 and FCD in April 2007.
Documents reviewed:
| No.   | Authors                                                            | Title                                         |
|-------|--------------------------------------------------------------------|-----------------------------------------------|
| 11920 | Kazuo Sugimoto, Yoshihisa Yamada, Kohtaro Asai, Tokumichi Murakami | Textual Description for AVC@BP                |
| 11921 | Kazuo Sugimoto, Yoshihisa Yamada, Kohtaro Asai, Tokumichi Murakami | Video Resolution Conversion on Intra Only AVC |
| 11942 | Sunyoung Lee, Hyungyu Kim, Euee S. Jang                            | Status report on VCTR software implementation |
| 12074 | Chun-Jen Tsai                                                      | Suggestions on the direction of VCTR          |
Output Documents:
| No.  | Title                                       | TBP | Available |
|------|---------------------------------------------|-----|-----------|
|      | Video Coding Tools Repository               |     |           |
| 7095 | Study of Video Coding Tools Repository V4.0 | No  | 05/04/22  |
| 7096 | VCTR Textual Description V3.0               | No  | 05/04/22  |
| 7097 | VCTR Software V2.0                          | No  | 05/05/13  |
6.3 Wavelet Video Coding
At the Hong Kong meeting, the ad hoc group decided to use a single software framework for further exploration of wavelet video coding. This package was provided by Microsoft Research Asia under the new copyright disclaimer format (N6851). All technical input documents brought to the Busan meeting related to implementations within that framework. Progress was made in particular in investigating alternative (generalized) strategies of spatio-temporal wavelet decomposition and in reducing artefacts by appropriate post-processing. It is planned to integrate the newly proposed tools into the software framework, to be used in the next round of exploration experiments (see the description in N7098). EE1 in particular targets further progress through full integration of the different proposed technologies. The experimental settings were derived from the SVC Core Experiments defined at the Palma meeting, but extend the range of SNR scalability, which may be particularly important for showing an advantage of embedded wavelet concepts over the emerging SVC amendment of AVC. EE2 and EE3 relate to improvements of entropy coding and intra prediction. The technical inputs can be summarized as follows:
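The t+2D decomposition structure discussed above (temporal filtering first, then a spatial transform on each temporal subband) can be illustrated with a minimal sketch. This toy version uses an unnormalized Haar step and omits motion compensation, which the real MCTF of course performs along motion trajectories:

```python
# Toy t+2D sketch: temporal Haar lifting across frame pairs, then one
# spatial Haar level per row. Integer sums/diffs keep the arithmetic exact;
# no motion compensation (the real MCTF predicts along motion vectors).

def temporal_haar(frames):
    """Split frame pairs into temporal lowpass (L) and highpass (H) frames."""
    lows, highs = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        # Highpass: frame difference (prediction residual without MC).
        highs.append([[y - x for x, y in zip(ra, rb)]
                      for ra, rb in zip(a, b)])
        # Lowpass: frame sum (unnormalized temporal average).
        lows.append([[x + y for x, y in zip(ra, rb)]
                     for ra, rb in zip(a, b)])
    return lows, highs

def spatial_haar_rows(frame):
    """One spatial Haar level along each row: (sum, diff) per sample pair."""
    out = []
    for row in frame:
        s = [row[i] + row[i + 1] for i in range(0, len(row), 2)]
        d = [row[i + 1] - row[i] for i in range(0, len(row), 2)]
        out.append(s + d)
    return out

f0 = [[10, 12], [14, 16]]
f1 = [[11, 13], [15, 17]]
lows, highs = temporal_haar([f0, f1])
print(highs[0])                    # [[1, 1], [1, 1]] -> little temporal detail
print(spatial_haar_rows(lows[0]))  # spatial subbands of the temporal lowpass
```

In a 2D+t scheme the two steps would simply be applied in the opposite order, which is exactly the degree of freedom the generalized (GSTS) decompositions below exploit.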
Documents reviewed:
11952 (ChinPhek Ong, ShengMei Shen, MenHuang Lee, Yoshimasa Honda): Wavelet Video Coding - Generalized Spatial Temporal Scalability (GSTS)
Possibilities include t+2D, 2D+t and t+2D+t decompositions; the GSTS map represents all of these, and any combination of "intertwined" temporal and spatial decompositions is possible. "Partial scalability" is also supported, which avoids unnecessary operating points such as QQCIF at 30 Hz. Scheme GSTS-A (closer to t+2D) outperforms the original MSRA software at high resolution (CIF), while GSTS-B (closer to 2D+t) is better at low resolution (QCIF). Different compromises can be made for better performance at low, mid or high resolution; one variant, GSTS-C, gives >1 dB gain at QCIF with only a small loss at 4CIF. The proponents plan to re-run with an AVC base layer to obtain further improvement.

11975 (Ruiqin Xiong, Jizheng Xu, Feng Wu): Coding performance comparison between MSRA wavelet video coding and JSVM1
Advantages of the wavelet approach: broader range of scalabilities, non-redundancy, elegant spatial and SNR scalability. Disadvantages: no encoder-side reconstruction, so performance at low rates becomes inferior, and the open-loop structure disallows intra prediction. Disadvantage of JSVM: bitstreams are not well embedded. Approach: t+2D MCTF with AVC in the lowest subband. The MSRA codec is used as a single spatial layer, run once for each spatial resolution. It is usually slightly better than JSVM for CIF, and in some cases worse at QCIF. With 3-layer scalability, 4CIF is usually better, CIF/QCIF only sometimes. Since the comparison is made over separately encoded spatial streams, it is not really comparable with JSVM. Technically, the codec is not significantly improved over Palma, and it does not provide the multiple adaptation capability over different layers that was to be shown at Palma. The next round of EEs should support this and track performance increases in particular at the lower layers; the comparison should most likely be done visually.

12008 (Markus Beermann, Mathias Wien): De-ringing filter proposal for the VIDWAV Evaluation software
A non-linear filter to reduce artifacts from wavelet coders. The weighting functions depend on both spatial distance and amplitude difference, and are steered by a quantizer noise estimate. The weights are exponential (Gaussian) functions.

12056 (Tillier, Pau, Pesquet-Popescu): Coding performance comparison of entropy coders in wavelet video coding
Two entropy coders (MC-EZBC and 3D-ESCOT) are compared for a t+2D configuration (RPI and MSRA). 3D-ESCOT is slightly better in both cases; for high-motion sequences in particular, the difference appears larger. The study could be extended with comparisons of additional entropy coders.

12058 (Pau, Pesquet-Popescu): Comparison of Spatial M-band Filter Banks for t+2D Video Coding
The spatial transform in t+2D MCTF is replaced by different M-band filter banks. Background: sharp edges and highpass (H) frames might be captured better by the finer frequency selectivity of M-band filter banks. PSNR differences are relatively low, and in some cases wavelets are slightly better. This differs from earlier findings in MC-EZBC, where gains were obtained with longer filters.
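The de-ringing filter of 12008, with Gaussian weights over both spatial distance and amplitude difference, is in essence a bilateral filter. A minimal 1-D sketch under that assumption (the names and parameter values are illustrative; `sigma_r` stands in for the quantizer noise estimate steering the amplitude weight):

```python
# Toy bilateral-style de-ringing filter in 1-D: each output sample is a
# weighted average of its neighbourhood, with Gaussian weights over the
# spatial offset and over the amplitude difference, so ringing is smoothed
# while strong edges are preserved.
import math

def dering_1d(samples, sigma_s=1.0, sigma_r=10.0, radius=2):
    """sigma_r plays the role of the quantizer-noise estimate in 12008."""
    out = []
    n = len(samples)
    for i in range(n):
        acc = wsum = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
                 * math.exp(-((samples[i] - samples[j]) ** 2)
                            / (2 * sigma_r ** 2)))
            acc += w * samples[j]
            wsum += w
        out.append(acc / wsum)
    return out

# Small oscillations near the edge are damped; the edge itself survives,
# because the amplitude weight suppresses averaging across it.
noisy_edge = [10, 10, 14, 100, 96, 100]
print([round(v, 1) for v in dering_1d(noisy_edge)])
```

Raising `sigma_r` (i.e. assuming more quantizer noise) smooths more aggressively; the 2-D case applies the same weighting over a spatial window.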
Output Document:
| No.  | Title                                                          | TBP | Available |
|------|----------------------------------------------------------------|-----|-----------|
|      | Wavelet Video Exploration                                      |     |           |
| 7098 | Description of Exploration Experiments in Wavelet Video Coding | No  | 05/04/22  |