

MULTIMEDIA DATABASES (Basi di Dati Multimediali) 2013-14, Alberto del Bimbo

Multimedia Recognition and Indexing
Course program
Week 1

Section 1. Introduction to recognition and indexing of visual data

(Professor: Alberto del Bimbo)

Week 2

Section 2. Global image features (Recall of image analysis)

(Professor: Alberto del Bimbo)

Global image features : Color; Texture; Edges and Lines
Dimensionality reduction: PCA, LDA, Eigenfaces

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 4

[B] Alberto del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999, Chapters 2-4

Week 3

Section 3. The MPEG7 standard

(Professor: Alberto del Bimbo)

MPEG7 holistic descriptors [1]

References

[1] ISO/IEC TR 15938-8:2002, Information technology -- Multimedia content description interface -- Part 8: Extraction and use of MPEG-7 descriptions, http://www.iso.org/iso/

Week 4

Laboratory 1: MPEG7

(Assistant: Marco Bertini)

Week 5 - 6

Section 4. Local image features

(Professor: Alberto del Bimbo)

Rotation invariant Harris corner detector
Scale invariant keypoint detectors: Harris-Laplace [1], SIFT (Scale Invariant Feature Transform) [2], SURF (Speeded-Up Robust Features) [3]
Affine invariant region detectors: Harris-Affine, Intensity Extrema Regions, MSER (Maximally Stable Extremal Regions) [4]
Local descriptors: SIFT [2], Color SIFT, SURF [3], GLOH (Gradient Location and Orientation Histogram)

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 4

[1] Krystian Mikolajczyk and Cordelia Schmid, A Performance Evaluation of Local Descriptors, IEEE TPAMI 2005

[2] David Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 2004

[3] Herbert Bay, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool, Speeded-Up Robust Features (SURF), Computer Vision and Image Understanding (Elsevier), 2008

[4] J. Matas, O. Chum, M. Urban, T. Pajdla, Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, British Machine Vision Conference (BMVC), 2002

Week 7

Section 5. Visual words and bag of words representation

(Professor: Alberto del Bimbo)

Visual Words and Bag of Words model: vocabulary formation by K-means, Radius-based clustering [1]
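
As a rough illustration of vocabulary formation by K-means, the sketch below clusters a set of local descriptors into visual words and then represents an image as a normalized histogram of word occurrences. It is a minimal pure-Python sketch, not the course's laboratory code: the function names (`kmeans_vocabulary`, `bag_of_words`), the fixed iteration count, and the random initialization are all assumptions made for illustration. Real descriptors such as SIFT would be 128-dimensional and clustered with an optimized library.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster center) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def kmeans_vocabulary(descriptors, k, iterations=20, seed=0):
    """Build a k-word vocabulary with plain Lloyd-style K-means (illustrative)."""
    rng = random.Random(seed)
    # Assumption: initialize centers from k randomly chosen descriptors.
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centers)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bag_of_words(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    histogram = [0] * len(vocabulary)
    for d in descriptors:
        histogram[nearest_word(d, vocabulary)] += 1
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram
```

In use, the vocabulary would be built once from descriptors pooled over a training set, and `bag_of_words` would then be called per image to obtain a fixed-length vector suitable for a classifier or for retrieval.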

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 14

[1] Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray, Visual Categorization with Bags of Keypoints, ECCV Int. Workshop on Statistical Learning in Computer Vision, 2004

Week 8

Section 6. Object instance recognition

(Professor: Alberto del Bimbo)

Distance measures

Nearest Neighbour Matching
Geometric alignment and outlier rejection: Random Sample Consensus (RANSAC)
Video Google [1]
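
The outlier-rejection idea behind Random Sample Consensus can be sketched as follows. The example assumes the simplest possible geometric model, a pure 2D translation between two images, so that a single putative match is a minimal sample. In practice an affine transform or homography would be estimated from more matches. The function name `ransac_translation`, the threshold, and the iteration count are assumptions chosen for illustration.

```python
import random


def ransac_translation(matches, threshold=1.0, iterations=100, seed=0):
    """Estimate a 2D translation from putative matches, rejecting outliers.

    matches: list of ((x1, y1), (x2, y2)) point correspondences.
    Returns ((dx, dy), inlier_matches), or (None, []) if matches is empty.
    """
    if not matches:
        return None, []
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation hypothesis.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches that agree with the hypothesis within threshold.
        inliers = [m for m in matches
                   if abs(m[0][0] + dx - m[1][0]) <= threshold
                   and abs(m[0][1] + dy - m[1][1]) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the model on the largest consensus set (mean offset of the inliers).
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers
```

The putative matches fed to such a routine would typically come from nearest neighbour matching of local descriptors. Matches outside the returned consensus set are treated as outliers and discarded.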

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapters 4, 5, 6

[1] Josef Sivic, Andrew Zisserman, Video Google: A Text Retrieval Approach to Object Matching in Videos, ICCV 2003

Week 9 - 10

Section 7. Object categorization

(Professors: Alberto del Bimbo, Andy Bagdanov°, Lorenzo Seidenari*)

Bayes classification (Recall of statistical principles) °
Support Vector Machines discriminative classifier °
Partial matching of sets of features: Pyramid Match Kernel [1], Spatial Pyramid Matching
HOG (Histogram of Oriented Gradients) people detector [2]
Boosting classifier, AdaBoost
Viola and Jones face detector [3]
Probabilistic Latent Semantic Analysis generative classifier [4] *
Expectation maximization (Recall of statistical principles) *

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapters 4, 5, 6

[B] Christopher Bishop, Pattern Recognition and Machine Learning, Springer 2006, Chapter 2

[1] Kristen Grauman and Trevor Darrell, The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features, IEEE ICCV 2005

[2] Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, IEEE CVPR Int. Conference 2005

[3] Paul Viola and Michael Jones, Robust Real-time Object Detection, Int. Workshop on Statistical and Computational Theories of Vision, 2001

[4] Florent Monay, Daniel Gatica-Perez, PLSA-based Image Auto-Annotation: Constraining the Latent Space, ACM Multimedia 2004

Week 11

Laboratory 2 : Bag of Visual Words

(Assistants: Marco Bertini, Lamberto Ballan, Lorenzo Seidenari)

Week 12

Section 8. With image sequences

(Professors: Lorenzo Seidenari °, Andy Bagdanov*)

Spatio-temporal features: holistic features; local features: STIP (Spatio-Temporal Interest Point) detector [1], Dollár's spatio-temporal detector; local descriptors °

Action and Event recognition [2] °

Detection in video sequences *

References

[1] Ivan Laptev, On Space-Time Interest Points, International Journal of Computer Vision, 2005

[2] L. Ballan, M. Bertini, A. Del Bimbo, L. Seidenari, and G. Serra, "Effective Codebooks for Human Action Categorization," IEEE ICCV Int. Workshop on Video-oriented Object and Event Classification (VOEC), 2009.

Week 13

Laboratory 3 : Detection and tracking

(Assistants: Andy Bagdanov, Giuseppe Lisanti)

Week 14

Section 9. Matching at large scale

(Professor: Alberto del Bimbo)

Vocabulary Tree [1]
Multidimensional hashing: Locality-Sensitive Hashing, Pyramid Match Hashing, Semantic Hashing [2][3][4]

Performance measures

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 14

[1] David Nistér and Henrik Stewénius, Scalable Recognition with a Vocabulary Tree, IEEE CVPR Int. Conference, 2006

[2] Aristides Gionis, Piotr Indyk, Rajeev Motwani, Similarity Search in High Dimensions via Hashing, Int. Conference on Very Large Data Bases (VLDB), 1999

[3] Brian Kulis and Kristen Grauman, Kernelized Locality-Sensitive Hashing for Scalable Image Search, IEEE ICCV Int. Conference, 2009

[4] Mohamed Aly, Peter Welinder, Mario Munich, Pietro Perona, Scaling Object Recognition: Benchmark of Current State of the Art Techniques, IEEE ICCV Int. Conference, 2009

Week 15

Section 10. Exploiting human knowledge

(Professors: Giuseppe Serra °, Marco Bertini *)

WordNet and ontologies [1] °
RDF, OWL, SWRL °
Data from Social Networks *

References

[1] John Davies, Dieter Fensel, Frank van Harmelen, Towards the Semantic Web: Ontology-driven Knowledge Management, 2002

Course slides
Free pdf copy downloadable at: http://www.micc.unifi.it/delbimbo/teaching/multimedia-databases

(password protected)

Reference textbooks
[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010

Free copy downloadable at: http://szeliski.org/Book/

for details on algorithms and solutions
[B] Alberto del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999

for details on algorithms and solutions
