tait - Image Retrieval John Tait University of Sunderland,...


Image Retrieval
John Tait
University of Sunderland, UK

Outline of Afternoon
– Introduction
  • Why image retrieval is hard
  • How images are represented
  • Current approaches
– Indexing and Retrieving Images
  • Navigational approaches
  • Relevance Feedback
  • Automatic Keywording
– Advanced Topics, Futures and Conclusion
  • Video and music retrieval
  • Towards practical systems
  • Conclusions and Feedback

Scope
• General Digital Still Photographic Image Retrieval
  – Generally colour
• Some different issues arise
  – Narrower domains
    • E.g. medical images, especially where part of body and/or specific disorder is suspected
  – Video
  – Image Understanding - object recognition

Thanks to
• Chih-Fong Tsai
• Sharon McDonald
• Ken McGarry
• Simon Farrand
• And members of the University of Sunderland Information Retrieval Group

Introduction

Why is Image Retrieval Hard?
• What is the topic of this image?
• What are the right keywords to index this image?
• What words would you use to retrieve this image?
• The Semantic Gap?

Problems with Image Retrieval
• A picture is worth a thousand words
• The meaning of an image is highly individual and subjective

How similar are these two images?

How Images are represented

Compression
• In practice images are stored as compressed raster
  – JPEG
  – MPEG
• Cf. vector …
• Not relevant to retrieval

Image Processing for Retrieval
• Representing the Images
  – Segmentation
  – Low Level Features
    • Colour
    • Texture
    • Shape

Image Features
• Information about colour, texture or shape which is extracted from an image is known as image features
  – Also called low-level features
    • Red, sandy
  – As opposed to high level features or concepts
    • Beaches, mountains, happy, serene, George Bush

Image Segmentation
• Do we consider the whole image or just part?
  – Whole image - global features
  – Parts of image - local features

Global features
• Averages across whole image
• Tends to lose distinction between foreground and background
• Poorly reflects human understanding of images
• Computationally simple
• A number of successful systems have been built using global image features, including Sunderland’s CHROMA

Local Features
• Segment images into parts
• Two sorts:
  – Tile based
  – Region based

Regioning and Tiling Schemes
[Figure: tiles: (a) 5 tiles, (b) 9 tiles; regions: (c) 5 regions, (d) 9 regions]

Tiling
• Break image down into simple geometric shapes
• Similar problems to global
• Plus dangers of breaking up significant objects
• Computationally simple
• Some schemes seem to work well in practice

Regioning
• Break image down into visually coherent areas
• Can identify meaningful areas and objects
• Computationally intensive
• Unreliable

Colour
• Produce a colour signature for region/whole image
• Typically done using colour correlograms or colour histograms

Colour Histograms
• Identify a number of buckets in which to sort the available colours (e.g. red, green and blue, or up to ten or so colours)
• Allocate each pixel in an image to a bucket and count the number of pixels in each bucket
• Use the figure produced (bucket id plus count, normalised for image size and resolution) as the index key (signature) for each image

Global Colour Histogram
[Figure: bar chart of a global colour histogram; surviving axis labels: Red, Orange; vertical scale 0 to 90]

Other Colour Issues
• Many Colour Models
  – RGB (red green blue)
  – HSV (Hue Saturation Value)
  – Lab, etc.
• Problem is getting something like human vision
  – Individual differences

Texture
• Produce a mathematical characterisation of a repeating pattern in the image
  – Smooth
  – Sandy
  – Grainy
  – Stripey

Texture
• Reduces an area/region to a (small - 15?) set of numbers which can be used as a signature for that region
• Proven to work well in practice
• Hard for people to understand

Shape
• Straying into the realms of object recognition
• Difficult and less commonly used

Ducks again
• All objects have closed boundaries
• Shape interacts in a rather vicious way with segmentation
• Find the duck shapes

Summary of Image Representation
• Pixels and Raster
• Image Segmentation
  – Tiles
  – Regions
• Low-level Image Features
  – Colour
  – Texture
  – Shape

Indexing and Retrieving Images

Overview of Section 2
• Quick Reprise on IR
• Navigational Approaches
• Relevance Feedback
• Automatic Keyword Annotation

Reprise on Key Interactive IR ideas
• Index Time vs Query Time Processing
• Query Time
  – Must be fast enough to be interactive
• Index (Crawl) Time
  – Can be slow(ish)
  – There to support retrieval

An Index
• A data structure which stores data in a suitably abstracted and compressed form in order to facilitate rapid processing by an application

Indexing Process

Navigational Approaches to Image Retrieval

Essential Idea
• Lay out images in a virtual space in an arrangement which will make some sense to the user
• Project this onto the screen in a comprehensible form
• Allow them to navigate around this projected space (scrolling, zooming in and out)

Notes
• Typically colour is used
• Texture has proved difficult for people to understand
• Shape possibly the same, and also user interface - most people can’t draw!
• Alternatives include time (Canon’s Time Tunnel) and recently location (GPS Cameras)
• Need some means of knowing where you are

Observation
• It appears people can take in and will inspect many more images than texts when searching

CHROMA
• Development in Sunderland:
  – mainly by Ting Sheng Lai, now of National Palace Museum, Taipei, Taiwan
• Structure
  – Navigation System
  – Thumbnail Viewer
  – Similarity Searching
  – Sketch Tool

The CHROMA System
• General Photographic Images
• Global Colour is the Primary Indexing Key
• Images organised in a hierarchical classification using 10 colour descriptors and colour histograms

Access System

The Navigation Tool

Technical Issues
• Fairly easy to arrange image signatures so they support rapid browsing in this space

Relevance Feedback
More Like this

Relevance Feedback
• Well established technique in text retrieval
  – Experimental results have always shown it to work well in practice
• Unfortunately experience with search engines has shown it is difficult to get real searchers to adopt it - too much interaction

Essential Idea
• User performs an initial query
• Selects some relevant results
• System then extracts terms from these to augment the initial query
• Requeries

Many Variants
• Pseudo
  – Just assume high ranked documents are relevant
• Ask users about terms to use
• Include negative evidence
• Etc.

Query-by-Image-Example

Why useful in Image Retrieval?
1. Provides a bridge between the user’s understanding of images and the low level features (colour, texture etc.)
   with which the system is actually operating
2. Is relatively easy to interface to

Image Retrieval Process
[Figure labels: Green, Ducks, Water Texture, Leaf Texture]

Observations
• Most image searchers prefer to use key words to formulate initial queries
  – Eakins et al, Enser et al
• First generation systems all operated using low level features only
  – Colour, texture, shape etc.
  – Smeulders et al

Ideal Image Retrieval Process
[Diagram labels: Need, Keyword Query, Thumbnail Browsing, More Like this]

Image Retrieval as Text Retrieval
• What we really want to do is make the image retrieval problem text retrieval

Three Ways to go
• Manually assign keywords to each image
• Use text associated with the images (captions, web pages)
• Analyse the image content to automatically assign keywords

Manual Keywording
• Expensive
• Unreliable
• Can only really be justified for high value collections
  – advertising
• Do the indexers and searchers see the images in the same way?
• Feasible

Associated Text
• Cheap
• Powerful
• Tends to be “one dimensional”
• Famous names/incidents
• Does not reflect the content rich nature of images
• Currently Operational - Google

Possible Sources of Associated Text
• Filenames
• Anchor Text
• Web page text around the anchor/where the image is embedded

Automatic Keyword Assignment
• A form of Content Based Image Retrieval
• Cheap (ish)
• Predictable (if not always “right”)
• No operational system demonstrated
  – Although considerable progress has been made recently

Basic Approach
• Learn a mapping from the low level image features to the words or concepts

Two Routes
1. Translate the image into a piece of text
   – Forsyth and others
   – Manmatha and others
2. Find that category of images to which a keyword applies
   – Tsai and Tait (SIGIR 2005)

Second Session Summary
• Separating Index Time and Retrieval Time Operations
• “First generation CBIR”
  – Navigation (by colour etc.)
  – Relevance Feedback
• Keyword based Retrieval
  – Manual Indexing
  – Associated Text
  – Automatic Keywording

Advanced Topics, Futures and Conclusions

Outline
• Video and Music Retrieval
• Towards Practical Systems
• Conclusions and Feedback

Video and Music Retrieval

Video Retrieval
• All current systems are based on one or more of:
  – Narrow domain - news, sport
  – Use automatic speech recognition to do speech to text on the soundtrack
  – Do key frame extraction and then treat the problem as still image retrieval

Missing Opportunities in Video Retrieval
• Using deltas - frame to frame differences - to segment the image into foreground/background, players, pitch, crowd etc.
• Trying to relate image data to language/text data

Music Retrieval
• Distinctive and Hard Problem
  – What makes one piece of music similar to another?
• Features
  – Melody
  – Artist
  – Genre?

Towards Practical Systems

Ideal Image Retrieval Process
[Diagram labels: Need, Keyword Query, Thumbnail Browsing, More Like this]

Requirements
• > 5000 key word vocabulary
• > 5% accuracy of keyword assignment for all keywords
• > 5% precision in response to single key word queries
• The Semantic Gap Bridged!
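The precision requirement for single key word queries can be made concrete with a small calculation. The following is only a sketch under stated assumptions: the slides do not say how precision was measured, so the function name `keyword_precision`, the dictionary layout and the toy data are all hypothetical. It treats a single key word query as retrieving every image that was automatically assigned that key word, and reports the fraction of those images for which the key word is actually relevant.

```python
def keyword_precision(assigned, relevant, keyword):
    """Precision of a single key word query (hypothetical helper).

    assigned: dict mapping image id -> set of automatically assigned key words
    relevant: dict mapping image id -> set of key words judged correct
    Returns None when the key word was never assigned to any image.
    """
    # A single key word query retrieves every image carrying that key word.
    retrieved = {img for img, words in assigned.items() if keyword in words}
    if not retrieved:
        return None
    # A retrieved image counts as a hit if the key word is truly relevant to it.
    hits = {img for img in retrieved if keyword in relevant.get(img, set())}
    return len(hits) / len(retrieved)


# Toy data, invented for illustration only.
assigned = {"a": {"beach"}, "b": {"beach", "dog"}, "c": {"dog"}}
relevant = {"a": {"beach"}, "b": {"dog"}, "c": {"dog"}}

for word in ("beach", "dog", "owl"):
    print(word, keyword_precision(assigned, relevant, word))
```

Returning `None` for a key word that is never assigned keeps such key words visible instead of silently scoring them as zero.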
CLAIRE
• Example State of the Art Semantic CBIR System
• Colour and Texture Features
• Simple Tiling Scheme
• Two Stage Learning Machine
  – SVM/SVM and SVM/k-NN
• Colour to 10 basic colours
• Texture to one texture term per category

Tiling Scheme

Architecture of Claire
[Diagram labels: Image, Segmentation, Colour, Texture, Data Extractor, Texture Classifier, Key word Annotation, Known Key Word/class]

Training/Test Collection
• Randomly selected from Corel
• Training Set: 30 images per category
• Test Collection: 20 images per category

SVM/SVM Keywording with 100+50 Categories
[Figure: chart with series for concrete classes, abstract classes and baseline; vertical axis 0% to 70%; horizontal axis ticks 10, 30, 50, 70, 100]

Examples Keywords
• Concrete: Beaches, Dogs, Mountain, Orchids, Owls, Rodeo, Tulips, Women
• Abstract: Architecture, City, Christmas, Industry, Sacred, Sunsets, Tropical, Yuletide

SVM vs kNN
[Figure: chart with series for SVM concrete, SVM abstract, kNN concrete, kNN abstract and baseline; vertical axis 0% to 70%; horizontal axis ticks 10, 30, 50, 70, 100, 150]

Reduction in Unreachable Classes
[Figure: chart of Missing Category Numbers (0 to 60) with series for SVM concrete, SVM abstract, kNN concrete and kNN abstract; horizontal axis ticks 10, 30, 50, 70, 100, 150]

Labelling Areas of Feature Space
[Figure labels: Mountain, Tree, Sea]

Overlap in Feature Space

Keywording 200+200 Categories SVM/1-NN
[Figure: chart with series for concrete keywords, abstract keywords, baseline and Expon. (abstract keywords); vertical axis 0% to 60%; horizontal axis ticks 10, 30, 50, 70, 100, 150, 200]

Discussion
• Results still promising
• 5.6% of images have at least one relevant keyword assigned
• Still useful - but only for a vocabulary of 400 words!
• See demo at http://osiris.sunderland.ac.uk/~da2wli/system/silk1/
• High proportion of categories which are never assigned

Segmentation
• Are the results dependent on the specific tiling/regioning scheme used?
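To make the idea of a tiling scheme combined with colour features more concrete, here is a minimal sketch of a per-tile colour histogram signature, following the bucket-and-count procedure from the Colour Histograms slide. It is not CLAIRE's or CHROMA's actual code. The bucket centres, the plain rectangular grid and all function names are assumptions for illustration; the deck's own 5-tile and 9-tile layouts survive only as figures, and its 10 basic colours are not listed.

```python
from collections import Counter

# Hypothetical bucket centres in RGB; the slides do not list the actual colours.
BUCKETS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "black": (0, 0, 0),
    "white": (255, 255, 255),
}


def nearest_bucket(pixel):
    """Name of the bucket whose centre is closest to the pixel in RGB space."""
    return min(
        BUCKETS,
        key=lambda name: sum((p - c) ** 2 for p, c in zip(pixel, BUCKETS[name])),
    )


def colour_histogram(pixels):
    """Count pixels per bucket, normalised by the number of pixels."""
    pixels = list(pixels)
    if not pixels:
        return {name: 0.0 for name in BUCKETS}
    counts = Counter(nearest_bucket(p) for p in pixels)
    return {name: counts.get(name, 0) / len(pixels) for name in BUCKETS}


def grid_tiles(image, rows, cols):
    """Split an image (list of rows of RGB tuples) into rows x cols tiles."""
    height, width = len(image), len(image[0])
    tiles = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            tiles.append([px for row in image[top:bottom] for px in row[left:right]])
    return tiles


def tiled_signature(image, rows=3, cols=3):
    """One normalised colour histogram per tile, in row-major order."""
    return [colour_histogram(tile) for tile in grid_tiles(image, rows, cols)]


# Toy 6x6 image: left half nearly red, right half blue.
image = [[(250, 10, 10)] * 3 + [(0, 0, 255)] * 3 for _ in range(6)]
global_signature = colour_histogram(px for row in image for px in row)
local_signature = tiled_signature(image, rows=2, cols=2)
```

In this toy case the global histogram reports half red and half blue with no indication of where each colour lies, while the per-tile signature keeps that layout. That is the distinction between global and local features drawn earlier in the deck.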
Regioning
[Figure: (a) 5 tiles, (b) 9 tiles, (c) 5 regions, (d) 9 regions]

Effectiveness Comparison
[Figure: two charts titled "Five Tiles vs Five Regions, 1-NN Data Extractor", plotting accuracy for tiles and regions against the number of concrete classes and the number of abstract classes (10, 30, 50, 70, 100, 150, 200)]

Next Steps
• More categories
• Integration into complete systems
• Systematic comparison with the generative approach pioneered by Forsyth and others

Other Promising Examples
• Jeon, Manmatha and others
  – High number of categories - results difficult to interpret
• Carneiro and Vasconcelos
  – Also problems with missing concepts
• Srikanth et al
  – Possibly leading results in terms of precision and vocabulary scale

Conclusions
• Image Indexing and Retrieval is Hard
• Effective Image Retrieval needs a cheap and predictable way of relating words and images
• Adaptive and Machine Learning approaches offer one way forward with much promise

Feedback
Comments and Questions

Selected Bibliography

Early Systems
The following leads into all the major trends in systems based on colour, texture and shape:
A. Smeulders, M. Worring, S. Santini, A. Gupta and R. Jain “Content-based Image Retrieval: the end of the early years” IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1349-1380, 2000.

CHROMA
Sharon McDonald and John Tait “Search Strategies in Content-Based Image Retrieval” Proceedings of the 26th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), Toronto, July, 2003. pp 80-87.
ISBN 1-58113-646-3
Sharon McDonald, Ting-Sheng Lai and John Tait, “Evaluating a Content Based Image Retrieval System” Proceedings of the 24th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001), New Orleans, September 2001. W.B. Croft, D.J. Harper, D.H. Kraft, and J. Zobel (Eds). ISBN 1-58113-331-6 pp 232-240.

Translation Based Approaches
P. Duygulu, K. Barnard, N. de Freitas and D. Forsyth “Learning a Lexicon for a Fixed Image Vocabulary” European Conference on Computer Vision, 2002.
K. Barnard, P. Duygulu, N. de Freitas and D. Forsyth “Matching Words and Pictures” Journal of Machine Learning Research 3: 1107-1135, 2003.
Very recent new paper on this is:
P. Virga, P. Duygulu “Systematic Evaluation of Machine Translation Methods for Image and Video Annotation” Images and Video Retrieval, Proceedings of CIVR 2005, Singapore, Springer, 2005.

Cross-media Relevance Models etc
J. Jeon, V. Lavrenko, R. Manmatha “Automatic Image Annotation and Retrieval using Cross-Media Relevance Models” Proceedings of the 26th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), Toronto, July, 2003. pp 119-126
See also recent unpublished papers on http://ciir.cs.umass.edu/~manmatha/mmpapers.html

More recent stuff
G. Carneiro and N. Vasconcelos “A Database Centric View of Semantic Image Annotation and Retrieval” Proceedings of the 28th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), Salvador, Brazil, August, 2005
M. Srikanth, J. Varner, M. Bowden, D. Moldovan “Exploiting Ontologies for Automatic Image Annotation” Proceedings of the 28th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), Salvador, Brazil, August, 2005
See also the SIGIR workshop proceedings http://mmir.doc.ic.ac.uk/mmir2005

This note was uploaded on 05/22/2011 for the course COMP 207 taught by Professor Zhangli during the Spring '11 term at University of Liverpool.
