


Date: 01.06.2018
BASI di DATI MULTIMEDIALI (Multimedia Databases) 2013-14
Professor: Alberto del Bimbo

Multimedia Recognition and Indexing
Course program
Week 1

Section 1. Introduction to recognition and indexing of visual data

(Professor: Alberto del Bimbo)


Week 2

Section 2. Global image features (Review of image analysis)

(Professor: Alberto del Bimbo)



  • Global image features: Color; Texture; Edges and Lines

  • Dimensionality reduction: PCA, LDA, Eigenfaces

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 4

[B] Alberto del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999, Chapters 2-4


Week 3

Section 3. The MPEG-7 standard

(Professor: Alberto del Bimbo)



  • MPEG-7 holistic descriptors [1]

References

[1] ISO/IEC TR 15938-8:2002, Information technology -- Multimedia content description interface -- Part 8: Extraction and use of MPEG-7 descriptions, http://www.iso.org/iso/


Week 4

Laboratory 1: MPEG-7

(Assistant: Marco Bertini)


Week 5 - 6

Section 4. Local image features

(Professor: Alberto del Bimbo)



  • Rotation invariant Harris corner detector

  • Scale invariant keypoint detectors: Harris-Laplace [1], SIFT Scale Invariant Feature Transform [2], SURF Speeded-Up Robust Features [3]

  • Affine invariant region detectors: Harris affine, Intensity Extrema Regions, MSER Maximally Stable Extremal Regions [4]

  • Local descriptors: SIFT [2], Color SIFT, SURF [3], GLOH Gradient Location and Orientation Histogram

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 4

[1] Krystian Mikolajczyk and Cordelia Schmid, A Performance Evaluation of Local Descriptors, IEEE TPAMI 2005

[2] David Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, 2004

[3] Herbert Bay, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool, Speeded-Up Robust Features (SURF), Computer Vision and Image Understanding, Elsevier, 2008

[4] J. Matas, O. Chum, M. Urban, T. Pajdla, Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, British Machine Vision Conference (BMVC), 2002

Week 7

Section 5. Visual words and bag of words representation

(Professor: Alberto del Bimbo)



  • Visual Words and Bag of Words model: vocabulary formation by K-means, Radius-based clustering [1]

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 14

[1] Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray, Visual Categorization with Bags of Keypoints, ECCV Workshop on Statistical Learning in Computer Vision, 2004


Week 8

Section 6. Object instance recognition

(Professor: Alberto del Bimbo)



  • Distance measures

  • Nearest Neighbour Matching

  • Geometric alignment and outlier rejection: Random Sample Consensus (RANSAC)

  • Video Google [1]

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapters 4, 5, 6

[1] Josef Sivic, Andrew Zisserman, Video Google: A Text Retrieval Approach to Object Matching in Videos, ICCV 2003


Week 9 - 10

Section 7. Object categorization

(Professors: Alberto del Bimbo, Andy Bagdanov°, Lorenzo Seidenari*)



  • Bayes classification (Review of statistical principles) °

  • Support Vector Machines discriminative classifier °

  • Partial matching of sets of features: Pyramid Match Kernel [1], Spatial Pyramid Matching

  • HOG Histogram of Oriented Gradients people detector [2]

  • Boosting classifier, AdaBoost

  • Viola and Jones face detector [3]

  • Probabilistic Latent Semantic Analysis generative classifier [4] *

  • Expectation maximization (Review of statistical principles) *

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapters 4, 5, 6

[B] Christopher Bishop, Pattern Recognition and Machine Learning, Springer 2006, Chapter 2

[1] Kristen Grauman and Trevor Darrell, The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features, IEEE ICCV 2005

[2] Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, IEEE CVPR Int. Conference 2005

[3] Paul Viola and Michael Jones, Robust Real-time Object Detection, Int. Workshop on Statistical and Computational Theories of Vision, 2001

[4] Florent Monay, Daniel Gatica-Perez, PLSA-based Image Auto-Annotation: Constraining the Latent Space, ACM Multimedia 2004


Week 11

Laboratory 2: Bag of Visual Words

(Assistants: Marco Bertini, Lamberto Ballan, Lorenzo Seidenari)


Week 12

Section 8. Working with image sequences

(Professors: Lorenzo Seidenari °, Andy Bagdanov*)



  • Spatio-temporal features: holistic features; local features: STIP Spatio-Temporal Interest Point detector [1], Dollár's spatio-temporal detector; local descriptors °

  • Action and Event recognition [2] °

  • Detection in video sequences *

References

[1] Ivan Laptev, On Space-Time Interest Points, International Journal of Computer Vision, 2005

[2] L. Ballan, M. Bertini, A. Del Bimbo, L. Seidenari, and G. Serra, Effective Codebooks for Human Action Categorization, IEEE ICCV Int. Workshop on Video-oriented Object and Event Classification (VOEC), 2009


Week 13

Laboratory 3: Detection and tracking

(Assistants: Andy Bagdanov, Giuseppe Lisanti)


Week 14

Section 9. Matching at large scale

(Professor: Alberto del Bimbo)



  • Hashing [2][3][4]

  • Performance measures

References

[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, Chapter 14

[1] David Nistér and Henrik Stewénius, Scalable Recognition with a Vocabulary Tree, IEEE CVPR Int. Conference, 2006

[2] Aristides Gionis, Piotr Indyk, Rajeev Motwani, Similarity Search in High Dimensions via Hashing, VLDB Int. Conference, 1999

[3] Brian Kulis and Kristen Grauman, Kernelized Locality-Sensitive Hashing for Scalable Image Search, IEEE ICCV Int. Conference, 2009

[4] Mohamed Aly, Peter Welinder, Mario Munich, Pietro Perona, Scaling Object Recognition: Benchmark of Current State of the Art Techniques, IEEE ICCV Int. Conference, 2009


Week 15

Section 10. Exploiting human knowledge

(Professors: Giuseppe Serra °, Marco Bertini *)

  • WordNet and ontologies [1] °

  • RDF, OWL, SWRL °

  • Data from Social Networks *

References

[1] John Davies, Dieter Fensel, Frank van Harmelen, Towards the Semantic Web: Ontology-driven Knowledge Management, 2002



Course slides
Free PDF copy downloadable at: http://www.micc.unifi.it/delbimbo/teaching/multimedia-databases (password protected)



Reference textbooks
[A] Richard Szeliski, Computer Vision: Algorithms and Applications, Springer 2010, for details on algorithms and solutions

Free copy downloadable at: http://szeliski.org/Book/

[B] Alberto del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999, for details on algorithms and solutions
