What can pictures tell us about web pages? Improving document search using images

dc.contributor.author: Rodriguez-Vaamonde, Sergio
dc.contributor.author: Torresani, Lorenzo
dc.contributor.author: Fitzgibbon, Andrew
dc.contributor.institution: Tecnalia Research & Innovation
dc.date.accessioned: 2024-07-24T11:52:26Z
dc.date.available: 2024-07-24T11:52:26Z
dc.date.issued: 2013
dc.description.abstract: Traditional Web search engines do not use the images in the HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and only the text portion of each page. In this paper we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine. We present a Web-scalable system that exploits a pure text-based search engine to find an initial set of candidate documents for a given query. Then, the candidate set is reranked using semantic information extracted from the images contained in the pages. The resulting system retains the computational efficiency of traditional text-based search engines with only a small additional storage cost needed to encode the visual information. We test our approach on the TREC 2009 Million Query Track, where we show that our use of visual content yields improvement in accuracies for two distinct text-based search engines, including the system with the best reported performance on this benchmark. [en]
dc.description.status: Peer reviewed
dc.format.extent: 4
dc.identifier.citation: Rodriguez-Vaamonde, S, Torresani, L & Fitzgibbon, A 2013, What can pictures tell us about web pages? Improving document search using images. in SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 849-852, 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013, Dublin, Ireland, 28/07/13. https://doi.org/10.1145/2484028.2484144
dc.identifier.citation: conference
dc.identifier.doi: 10.1145/2484028.2484144
dc.identifier.isbn: 9781450320344
dc.identifier.uri: https://hdl.handle.net/11556/2189
dc.identifier.url: http://www.scopus.com/inward/record.url?scp=84883113098&partnerID=8YFLogxK
dc.language.iso: eng
dc.relation.ispartof: SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
dc.relation.ispartofseries: SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
dc.rights: info:eu-repo/semantics/restrictedAccess
dc.subject.keywords: Image Content
dc.subject.keywords: Ranking
dc.subject.keywords: Web Search
dc.subject.keywords: Computer Graphics and Computer-Aided Design
dc.subject.keywords: Information Systems
dc.title: What can pictures tell us about web pages? Improving document search using images [en]
dc.type: conference output