International organisation for standardisation organisation internationale de normalisation



Date: 19.10.2016

CDVS


  • As a general note, it was assessed that proceeding to CD would be premature, given that investigations into keypoint extraction methods are still ongoing.
    1. Requirements related


15.0.0.1.1.1.1.1.26m29671 A First Set of Requirements for Visual Search in Video for Media Archives and Broadcasting Applications [Alberto MESSINA, Sabino METTA, Werner BAILER]

A set of 6 requirements was presented from the broadcasting point of view. An output document from Requirements was drafted including content of m29974 and m29671.


15.0.0.1.1.1.1.1.27m29974 Applications and Requirements for CDVS Extension to the Video Domain [Danilo PAU, Arcangelo BRUNA, Emanuele Plebani, Marco Marcon]

Presented at a joint meeting with Requirements.

A number of use scenarios for the video extension of CDVS were presented. An output document from Requirements was drafted including content of m29974 and m29671.

15.0.0.1.1.1.1.1.28m29975 Requirements of HOG compression for automotive usecase [Hiroaki KUMON, Karel KREUTER]

Industry support was expressed for the scenarios presented in m29974.

15.0.0.1.1.1.1.1.29m30565 Use cases for Video Search [Jean-Ronan Vigouroux, Frederic Lefebvre]

A number of use scenarios involving image-to-video and video-to-video search were presented. Some are already addressed by, e.g., Video Signature, but the other use scenarios will be used in the forthcoming requirements.

    1. General


15.0.0.1.1.1.1.1.30m29976 Analyze then compress: a new visual encoder [Luca Baroffio, Matteo Cesana, Alessandro Redondi, Marco Tagliasacchi, Stefano Tubaro, Danilo Pau]

(Presented at a joint meeting with Requirements)

A comparison of the Analyze then Compress (ATC) and Compress then Analyze (CTA) schemes was presented in the context of automotive applications. Inter-frame descriptor coding was used. Significant gains in terms of rate-distortion, rate-repeatability and matching accuracy were reported. (Industry interest was reported in m29975.)

An output document on requirements was issued.

15.0.0.1.1.1.1.1.31m29977 Testing CVDS Test Model 6 on Supermarket’s objects [Davide Mazzini, Danilo Pau, Emanuele Plebani, Raimondo Schettini, Simone Bianco]

A database of supermarket products was acquired (430 objects, 2127 images), as this is an important use scenario. Some supermarket consumer goods have similar packaging, which increases the FPR (for example, a red wine could be linked with a white wine of the same producer). Localization accuracy is slightly lower due to the shape of the objects. A very useful case study. Use of MPEG-7 descriptors in conjunction with CDVS could solve the problem. A MAF format could/should be defined to enable this functionality.


    1. CE and other technical inputs

      1. Interest point detectors


15.0.0.1.1.1.1.1.32m30235 CDVS Quantitative Review of Local Descriptors for Visual Search [Davide Mazzini, Danilo Pau, Raimondo Schettini, Simone Bianco]

Study of performance and speed for various combinations of detectors and descriptors. Detectors: SIFT, DoG, Multiscale Harris, Harris-Laplace, Hessian, Multiscale Hessian, Hessian-Laplace. Descriptors: SIFT, KAZE, FREAK. Logarithmic distance ratio was used. All images were in VGA. Two experiments: 1k selected keypoints (1) and all keypoints (2). Best performance overall: SIFT-SIFT. Limiting the number of keypoints affects affine-invariant detectors most. OpenCV is the fastest implementation of SIFT-SIFT. Very useful study.
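As an illustration of the logarithmic distance ratio (LDR) mentioned in this item, here is a minimal sketch. It assumes the commonly used definition in which, for two matched keypoint pairs, the LDR is the logarithm of the ratio of the inter-keypoint distances in the two images. The function name and the sample coordinates are hypothetical and are not taken from m30235.

```python
import math
from itertools import combinations

def log_distance_ratios(pts_a, pts_b):
    """LDR for every pair of matches; pts_a[i] in image A matches pts_b[i] in image B."""
    ratios = []
    for i, j in combinations(range(len(pts_a)), 2):
        da = math.dist(pts_a[i], pts_a[j])
        db = math.dist(pts_b[i], pts_b[j])
        if da > 0 and db > 0:  # skip coincident keypoints
            ratios.append(math.log(da / db))
    return ratios

# Hypothetical matches: image B is image A scaled by 2,
# so every LDR equals ln(1/2).
a = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0)]
b = [(0.0, 0.0), (20.0, 0.0), (0.0, 10.0)]
ldr = log_distance_ratios(a, b)
```

Under a pure scaling, all correct matches share one LDR value, which is why a histogram of LDR values can help separate consistent matches from outliers.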

15.0.0.1.1.1.1.1.33m30237 CDVS STM position on interest point detection for CDVS [Danilo Pau, Doug Sorensen]

Legal position presented and noted.

15.0.0.1.1.1.1.1.34m30286 CDVS: concerns about the key point detector [Skjalg Lepsoy, Gianluca Francini, Massimo Balestri]

A technical position expressing some concerns regarding SIFT IP issues was presented. No consensus was reached.

15.0.0.1.1.1.1.1.35m30309 Proposed Improvements to TM6.0 [Stavros Paschalakis, Karol Wnukowicz, Miroslaw Bober, Alessandra Mosca, Massimo Mattelliano]

A scalable representation of the bitstream was proposed, using a single priority list so that redundancy among the descriptors is eliminated. Transcoding to different bit rates can be done through simple truncation.

No impact on performance: the descriptors remain unchanged, and only their order within the bitstream is changed.

Proposal cross-checked by TI (input doc number: m30476).

This proposal is adopted into the TM7.
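The idea of a single priority list with transcoding by truncation can be sketched as follows. This is only an illustration under assumptions: the `(priority, payload)` representation, the function names and the byte budget are hypothetical, and the actual TM bitstream syntax is not described in these minutes.

```python
def build_scalable_stream(descriptors):
    """Order (priority, payload) pairs so the most relevant descriptors come first."""
    ordered = sorted(descriptors, key=lambda d: d[0], reverse=True)
    return [payload for _, payload in ordered]

def transcode(stream, max_bytes):
    """Derive a lower-rate stream by keeping the longest prefix that fits the budget."""
    out, used = [], 0
    for payload in stream:
        if used + len(payload) > max_bytes:
            break
        out.append(payload)
        used += len(payload)
    return out

# Hypothetical descriptors: (priority, encoded bytes).
descs = [(0.2, b"cc"), (0.9, b"aaaa"), (0.5, b"bbb")]
stream = build_scalable_stream(descs)
low_rate = transcode(stream, max_bytes=7)
```

Because every lower-rate stream is a prefix of the full one, no descriptor has to be re-encoded when the target bit rate changes.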

15.0.0.1.1.1.1.1.36m30311 Improvements to TM6.0 with a Robust Visual Descriptor – Proposal from University of Surrey and Visual Atoms [Miroslaw Bober, Syed Husain, Stavros Paschalakis, Karol Wnukowicz]

A new global descriptor was presented, with the following features:


  • Significant improvements for the non-planar objects;

  • Memory is reduced by 25%;

  • Control mechanism is simple to operate, with just one parameter acting on the FPR;

  • Technical details: every element is assigned to multiple clusters; the residual errors in each cluster are L1-normalized; the same matching is used for pairwise matching and retrieval, based on Hamming distance (a fixed penalty is assigned when a cluster is empty).

  • Results have been cross-verified by TI.

Overall, a significant improvement in accuracy for both retrieval and pairwise matching, a simplified thresholding method, and 25% less memory. It is slightly slower than TM6.
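The matching rule described in the technical details (Hamming distance with a fixed penalty for empty clusters) can be sketched as below. This is a hedged illustration, not the proponents' implementation: the per-cluster integer codes, the use of `None` for an empty cluster, the choice to apply the penalty when either side is empty, and the penalty value are all assumptions.

```python
def global_distance(desc_a, desc_b, empty_penalty):
    """Sum per-cluster Hamming distances; charge a fixed penalty for empty clusters."""
    total = 0
    for code_a, code_b in zip(desc_a, desc_b):
        if code_a is None or code_b is None:
            # Assumption: the penalty applies when the cluster is empty on either side.
            total += empty_penalty
        else:
            total += bin(code_a ^ code_b).count("1")
    return total

# Hypothetical 8-bit codes for three clusters; None marks an empty cluster.
a = [0b10110010, None, 0b00001111]
b = [0b10010011, 0b11110000, 0b00001111]
d = global_distance(a, b, empty_penalty=4)
```

A single threshold on such a distance would play the role of the one control parameter acting on the FPR.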

Clear improvement across all bit rates and all experiments. No drawback was identified for possible adoption of the contribution. A CE on this issue will be defined. The CDVS breakout requests advice from the Video Subgroup Chair on whether the proposed technology should be adopted within TM7 at this meeting.

Further discussion is needed on whether to include it in the TM; there was no consensus to include it at this time.

Decision to investigate in a CE; the usual CE rules will apply, and the better technology will be adopted.

The CE process in CDVS runs without open software: software is made available only after adoption, and the proponents are not willing to change this for this case.

15.0.0.1.1.1.1.1.37m30316 CDVS: Comparision among different keypoint detectors in CDVS Test Model [Massimo Mattelliano, Alessandra Mosca]

A comparison among different keypoint detectors with the SIFT descriptor was presented. TM5 showed the best performance in all experiments. Open CL SIFT and ORB also showed comparable performance. The ORB detector has lower complexity.

15.0.0.1.1.1.1.1.38m30437 Cross-check of m30311 "Improvements to TM6.0 with a Robust Visual Descriptor – Proposal from University of Surrey and Visual Atoms" [Massimo Balestri]

Cross-check noted.

15.0.0.1.1.1.1.1.39m30476 Cross-check of m30309 "Proposed Improvements to TM6.0" [Massimo Balestri]

Cross-check noted.

15.0.0.1.1.1.1.1.40m30499 CDVS: A statement from Telecom Italia on interest point detection [Francesco Battipede, Skjalg Lepsoy, Gianluca Francini, Massimo Balestri]

Legal position presented and noted.

15.0.0.1.1.1.1.1.41m30514 More Comments on interest point detection for CDVS [Danilo Pau, Douglas SORENSEN]

Legal position presented and noted.

      1. CE on Key Point Detection


CE1 has been renamed CE on Key Point Detection since the last meeting.

The group developed a block diagram of the generic feature detector which incorporates all contributions, noting the strong points of each contribution. Two processing pipelines were identified, one in the spatial domain and one in the frequency domain – see figure below:

Each can use either a classical SIFT extrema detector and refinement or a newly proposed ALP detector (polynomial extrema detector). The proponents have already started working on integration of their proposals along the pipelines identified.

The group also analyzed the speed of the proposed detectors, which is presented below as a graph.



It was noted that proposals M30233, M30170 and M30446 achieved the fastest processing.

The group decided to keep the TM6.0 detector without changes for TM7.0 in order to avoid the risk related to combining various elements of the proposals. Proponents willing to share their integrated detector code will be able to upload it to a specially created experimental branch of the SVN.

m30170 CDVS Huawei’s Response to CE 1: An Improved Block-Based Spatial LoG Interest Point Detector [Zheng Liu, Qiang Zhou, Guojun Xu, Giovanni Cordara]

Proposal of an improved block-based spatial Laplacian of Gaussian (LoG) detector. Partial filtering of the top and bottom layers of an octave is used for fast computation. Spatial filtering and extrema search are combined in a hybrid manner, allowing early termination of the extrema search to further reduce the running time.

Features:


  • 4 times faster including gradient and orientation computation

  • 15 times faster vs TM6 without gradient and orientation computation

  • Less static memory – 93 bytes

  • Dynamic memory 0.87 MB

  • No constraint on block decomposition

  • Fully parallelizable – block, layer and octave

  • Hybrid Spatial Filtering

(Cross-verified by M30242.)

Alternative LoG skipping is seen as a useful feature.

15.0.0.1.1.1.1.1.42m30171 CDVS: Cross-Check [Zheng Liu]

Cross-check noted.

15.0.0.1.1.1.1.1.43m30224 [CDVS] Etri's response to CE1 [Sang-il Na, Seungjae Lee, Keundong Lee, Weon-Geun Oh, Insu Won, Dong-Seok Jeong]

Scale normalized LoG feature detector proposed with:



  • TM parameters are fully re-trained using available DBs and the performance is comparable.

  • It does not require an additional library such as FFT, and it reduces memory usage by minimizing temporary memory without a block-based approach.

Scale-normalized LoG, higher performance, 3 times faster than TM6.

Intel VTune Amplifier was used for timing. How to measure memory complexity needs to be defined.

(Cross-verified by m30290.)

TM6 performance drops when TM6 is re-tuned (TPR = -0.33%, MAP = -0.66%).

Windows Task Manager could also be used.

Actions:


  • Review the database size for the feature training and augment the database during the meeting to secure reliable results. Action on TI and ETRI, to be completed by 10th of August 2013.

  • FP detector – follow the group effort to define the CDVS super-detector.

  • Memory will be coarsely assessed using Massif (Linux) and the VS Profiling Tool (Windows); the results will be normalized. Proponents shall bring the results for the current TM7.

15.0.0.1.1.1.1.1.44m30225 Suggestion of time complexity measurment for CE1 [Seungjae Lee, Keundong Lee, Sang-il Na, Weon-Geun Oh, Da-Un Jung]

The proponents suggested two methods for measuring time complexity in the detection process. For a fair comparison, if source code is available, time stamps should be introduced in the same part of the code. Profiling tools can also be used, but source code would be required for cross-checking. A clear definition is required.
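The time-stamp approach can be sketched as follows. This is a minimal illustration, assuming that the stage to be measured is wrapped at the same call boundary in every implementation; `timed` and `detect_keypoints` are hypothetical names, and the detector body is only a placeholder.

```python
import time

def timed(stage, *args, repeats=5):
    """Return (best wall-clock seconds, last result) for one pipeline stage."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()          # time stamp before the stage
        result = stage(*args)
        best = min(best, time.perf_counter() - start)  # time stamp after the stage
    return best, result

def detect_keypoints(image):
    """Hypothetical placeholder detector: returns coordinates of bright pixels."""
    return [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v > 200]

image = [[(x * y) % 256 for x in range(64)] for y in range(48)]
seconds, keypoints = timed(detect_keypoints, image)
```

Taking the best of several repeats reduces the influence of other processes on the measurement; whether the best, mean or median should be reported is one of the points a clear definition would need to settle.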

15.0.0.1.1.1.1.1.45m30226 [CDVS] Suggestion of memory complexity measurment for CE1 [Keundong Lee, Seungjae Lee, Sang-il Na, Weon-Geun Oh, Sung-Kwan Je]

Proponents addressed a limitation of the theoretical analysis of the memory use.

15.0.0.1.1.1.1.1.46m30233 CDVS STM response to Interest Point Detector CE1 [Danilo Pau, Emanuele Plebani, Arcangelo Bruna*, Marco Marcon, Riccardo Ancona]

A highly integrated and optimized solution, with the following features:



  • Gaussian Scale Space (GSS) computed using overlap-save Fourier filtering with block-based processing (block size 128), combined with a spatial Laplacian of the GSS responses.

  • Use the fast Cooley–Tukey radix-2 algorithm for FFT on real input

  • Compression scheme for pre-computed filters

  • Stripes based computation for GSS and LoG and their management.

  • Direct computation of gradients

  • Selector moved after descriptor stage.

  • Optimized descriptor extraction

Evolution of m28076: 5 times faster, 1/3 of the memory, tested on existing sensors, Android and Linux. GSS (Gaussian Scale Space) is computed using overlap-save.

LoG in spatial domain after GSS. Frequency filtering block size is 128.
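Overlap-save filtering with a radix-2 Cooley-Tukey FFT, as named in this item, can be illustrated in one dimension. The sketch below is not the STM implementation: it uses a plain recursive complex FFT rather than an FFT optimized for real input, works on lists rather than image stripes, and all function names are hypothetical. Only the block size of 128 is taken from the minutes.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def overlap_save(signal, kernel, block=128):
    """Linear convolution of signal with kernel via block-wise FFT (overlap-save)."""
    m = len(kernel)
    step = block - (m - 1)               # new output samples produced per block
    spectrum = fft(list(kernel) + [0.0] * (block - m))
    total = len(signal) + m - 1          # length of the full convolution
    padded = [0.0] * (m - 1) + list(signal)
    out = []
    for start in range(0, total, step):
        seg = padded[start:start + block]
        seg += [0.0] * (block - len(seg))            # zero-pad the last block
        prod = [a * b for a, b in zip(fft(seg), spectrum)]
        y = [v.real / block for v in fft(prod, inverse=True)]
        out.extend(y[m - 1:])            # the first m-1 samples are circularly aliased
    return out[:total]

smoothed = overlap_save([1.0, 2.0, 3.0, 4.0], [0.5, 0.5], block=8)
```

Each block reuses the last `m - 1` input samples of the previous one, so only the aliased head of every block is discarded and the filter spectrum is computed once.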

Mixed frequency-space. Combined detection and feature extraction.

Selector is the largest memory contributor.

(Cross-verified by m30238.)

15.0.0.1.1.1.1.1.47m30241 CDVS: Peking University's Response to CE1 [Jie Chen, Ling-Yu Duan, Rongrong Ji, Tiejun Huang, Wen Gao]

The contribution proposes an improved BFLoG with the following features:


  • The block based processing was extended to extrema detection and orientations assignment with significant reduction of the memory footprint.

  • The block size was reduced from 172*172 to 128*128, which further reduces the computation time and memory access complexity.

  • Supports parallelized processing, especially for hardware implementation.

BFLoG (Block-based Frequency domain Laplacian of Gaussian).

Extrema computation (3x3) and orientation assignment (Rmax=18) are implemented block-based (block size 90x90 + 19 overlap). Reduced memory footprint, faster detection, and a slight improvement in performance. Work-flow optimized: VGA image buffer 300 KB.

Memory consumption reduced from 2.5 MB to 740 kB.

(Cross-verified by m30171.)

15.0.0.1.1.1.1.1.48m30238 CDVS: Crosscheck of STMicroelectronics response to CE1 [Alessandra Mosca]

Cross-check noted.

15.0.0.1.1.1.1.1.49m30242 CDVS: Crosscheck of Huawe's response to CE1 [Jie Chen, Ling-Yu Duan]

Cross-check noted.

15.0.0.1.1.1.1.1.50m30256 CDVS: Telecom Italia’s response to CE1 – interest point detection [Gianluca Francini, Skjalg Lepsoy, Massimo Balestri]

Features:


  • The identification of extrema in scale space is different from the prior art

  • It is based on a polynomial approximation of the scale space function, providing a representation that is continuous in its parameters.

  • Part of the processing is performed on each pixel independently from the others, thus it is well suited for parallelization.

Scale-normalized LoG with sigma values 1.8, 2.85, 3.6 and 4.22 is used to reconstruct the scale space.
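The idea of a polynomial approximation that is continuous in scale can be sketched per pixel as follows. This is a hedged illustration only: the minutes do not state the polynomial degree or the fitting method, so the cubic interpolation through the four quoted sigma values, the function names and the sample responses are assumptions.

```python
import math

def fit_cubic(xs, ys):
    """Coefficients c0..c3 of the cubic through four (x, y) samples (Gauss-Jordan)."""
    rows = [[x ** p for p in range(4)] + [y] for x, y in zip(xs, ys)]
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(4):
            if r != col:
                f = rows[r][col] / rows[col][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][4] / rows[i][i] for i in range(4)]

def scale_extrema(sigmas, responses):
    """Sigma values inside the sampled range where the fitted cubic has zero slope."""
    c0, c1, c2, c3 = fit_cubic(sigmas, responses)
    a, b, c = 3 * c3, 2 * c2, c1          # derivative: a*s^2 + b*s + c
    if abs(a) < 1e-12:
        roots = [-c / b] if abs(b) > 1e-12 else []
    else:
        disc = b * b - 4 * a * c
        roots = [] if disc < 0 else [(-b + s * math.sqrt(disc)) / (2 * a) for s in (1, -1)]
    return [r for r in roots if sigmas[0] <= r <= sigmas[-1]]

sigmas = [1.8, 2.85, 3.6, 4.22]           # scales quoted in the minutes
# Hypothetical LoG responses of one pixel, peaking at sigma = 3.0.
responses = [-(s - 3.0) ** 2 + 0.1 * (s - 3.0) ** 3 for s in sigmas]
peaks = scale_extrema(sigmas, responses)
```

Because the fit uses only the responses of a single pixel, it can run independently for every pixel, which is consistent with the parallelization remark in the feature list.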

A drop in performance is attributed to the global descriptor, which could not be trained properly. The proponent, working jointly with PKU, tried for 3 weeks to train the global descriptor but failed. In some cases a drop of 20% occurred.

The PKU training algorithm gives a different result for each training run, and a significant difference between Windows and Unix was reported.

The training procedure requires curve_fitting_toolbox / Matlab 2012 or newer.

(ETRI needed to re-adjust the step size and boundary limits in function “find” in move.cpp.)

PKU was requested to check and rectify the training problems.

Not cross-verified.

Issues: low-contrast key point detection.

15.0.0.1.1.1.1.1.51m30290 CDVS: Crosscheck of ETRI response to CE1 [Alessandra Mosca]

Cross-check noted.

15.0.0.1.1.1.1.1.52m30446 CDVS: Samsung's Response to CE-1 [Victor Fragoso-Rojas, Gaurav Srivastava, Abhishek Nagar, Zhu Li]

Approximation of Gaussian with difference of box filters.



  • Difference of box filtering: developed a 2-step LS-LASSO scheme to compute a sparse linear combination of box filters from a large dictionary

  • It offers a good speed-up of the extraction process (time reduced by 35%) with some drop in matching accuracy (2% loss overall, <1% for 4k, 8k, 16k). 4–5% drop for retrieval without retraining of the global descriptor.

  • It is quite different from SIFT detection and it may be considered as an effective alternative to address the patent issue.
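Why box filters are cheap can be shown with an integral image, where any box sum costs four lookups regardless of box size. The sketch below illustrates only that building block with a simple difference of two box means; it does not reproduce the LS-LASSO selection of filters from m30446, and the function names, radii and test image are hypothetical.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def box_sum(sat, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, in four lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]

def difference_of_boxes(sat, cx, cy, r_small, r_large):
    """Mean over a small box minus mean over a larger box, both centred at (cx, cy).

    The caller must keep both boxes inside the image; no bounds checks are done.
    """
    def mean(r):
        n = (2 * r + 1) ** 2
        return box_sum(sat, cx - r, cy - r, cx + r + 1, cy + r + 1) / n
    return mean(r_small) - mean(r_large)

img = [[0] * 7 for _ in range(7)]
img[3][3] = 9                        # hypothetical image: a single bright pixel
sat = integral_image(img)
response = difference_of_boxes(sat, 3, 3, r_small=1, r_large=3)
```

A weighted sum of several such box responses is the general form that a sparse linear combination of box filters would take.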

The global descriptor has not been retrained, and this is thought to contribute to the drop in performance.

Some concerns were expressed that this could require use of a SURF patent.

(Late cross-verification by M30602.)

      1. Retrieval and Matching


15.0.0.1.1.1.1.1.53m30228 [CDVS] Software maintenance for CDVS: Improving localization accuracy [Sang-il Na, Weon-Geun Oh, Insu Won, Dong-Seok Jeong]

In TM6 the pixel corner is regarded as the location. However, this introduces a larger quantization error than placing the location at the pixel centre; using the centre marginally improves localization, by 0.6%. Since this is a very minor change with no effect on any other results, it should be implemented in the TM.
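The difference between the two conventions can be checked numerically. The sketch below assumes that true keypoint positions are spread uniformly within a pixel and looks at one axis only; the function name and sample count are hypothetical, and the numbers are not the 0.6% localization figure reported in the contribution.

```python
def mean_quantization_error(offset, samples=1000):
    """Mean |true - decoded| when a position in [0, 1) is decoded as `offset`."""
    total = 0.0
    for i in range(samples):
        true_pos = (i + 0.5) / samples     # evenly spread positions inside one pixel
        total += abs(true_pos - offset)
    return total / samples

corner_error = mean_quantization_error(0.0)   # location reported at the pixel corner
centre_error = mean_quantization_error(0.5)   # location reported at the pixel centre
```

Under this assumption the mean per-axis error is half a pixel for the corner convention and a quarter of a pixel for the centre convention.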

(Cross-check was successful – m30267.)

15.0.0.1.1.1.1.1.54m30267 Cross-check of m30228 "Software maintenance for CDVS: Improving localization accuracy" [Gianluca Francini]

Cross-check noted.

15.0.0.1.1.1.1.1.55m30193 CDVS image matching performance study with alternative interesting points detection and description schemes [Victor Fragoso-Rojas, Abhishek Nagar, Gaurav Srivastava, Zhu Li]

Tested combinations of detectors and descriptors. Analyzed extraction time and performance. FAST was the fastest: 2 ms vs 140 ms for SIFT. A rate curve was produced comparing BRISK, FAST-SIFT and SIFT-SIFT. CDVS datasets were used.

15.0.0.1.1.1.1.1.56m30240 CDVS Visual searching and object locking to retrieve meaningful contents [Emanuele Plebani, Danilo Pau, Riccardo Ancona, Daniele Miatto, Arcangelo Bruna, Alberto Messina]

Contribution regarding the use of the test model for a demonstrator. Input can be real-time, from a Bluetooth camera, or file based. Real-time performance was achieved: 5–9 fps depending on the bit rate. Several interesting use scenarios were presented. Code refactoring was suggested: new APIs.

15.0.0.1.1.1.1.1.57m30513 CDVS Software Model Improvement: Selective 2-Way Matching [Abhishek Nagar, Gaurav Srivastava, Victor Fragoso-Rojas, Zhu Li]

Informative contribution (no cross-check performed). Results obtained by introducing two-way matching were presented.
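Two-way matching in its generic form keeps a match only when each descriptor is the other's nearest neighbour. The sketch below shows that generic form; the "selective" criterion of m30513 is not described in these minutes and is not reproduced here, and the function names and sample descriptors are hypothetical.

```python
def nearest(query, candidates):
    """Index of the candidate with the smallest squared Euclidean distance to query."""
    return min(range(len(candidates)),
               key=lambda j: sum((q - c) ** 2 for q, c in zip(query, candidates[j])))

def two_way_matches(desc_a, desc_b):
    """Keep (i, j) only when j is i's nearest neighbour in B and i is j's in A."""
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches

# Hypothetical 2-D descriptors: a[1] and a[2] both prefer b[0],
# but b[0] prefers a[1], so the match from a[2] is rejected.
a = [(0.0, 0.0), (1.0, 1.0), (0.9, 1.0)]
b = [(1.0, 0.9), (0.1, 0.0)]
pairs = two_way_matches(a, b)
```

The backward check roughly doubles the matching cost, which is one reason a selective variant that applies it only in some cases may be attractive.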


