Today, IBM announced that researchers, working in collaboration with a European Union consortium, have developed an analytics engine that lets people find video, pictures, and music that match multimedia they have submitted as a query, even when that content is untagged.
The consortium has engineered a Web technology called SAPIR that can analyze and identify the pixels in large-scale collections of audio-visual content. For example, it can analyze a digitized photograph or the bitstreams in an electronic sound file - even if they haven't been tagged or indexed with descriptive information. The multimedia identified is automatically indexed and ranked for easy retrieval.
"SAPIR is a potential 'game-changer' when it comes to scalability in search and analyses," said Yosi Mass, a scientist at IBM Research-Haifa, and project leader for SAPIR. "It approaches the problem from a fundamentally different perspective, and opens up a universe of new possibilities for using multimedia to analyze the vast visual and auditory world in which we now live."
SAPIR (www.sapir.eu) can index collections of millions of multimedia items, and let users sift through them, by extracting "low-level descriptors" from the photographs or videos. These descriptors include features such as color, layout, shapes, or sounds.
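As a rough illustration of what a low-level descriptor might look like, the Python sketch below builds a coarse color histogram for an image and ranks a small collection by how close each item's histogram is to the query's. This is not SAPIR's code: the function names, the four-bins-per-channel quantization, the Euclidean distance, and the sample pixel data are all assumptions chosen to keep the example small.

```python
from math import sqrt


def color_histogram(pixels, bins_per_channel=4):
    """Toy descriptor: a normalized histogram of quantized RGB colors."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = sum(hist) or 1.0  # avoid dividing by zero for an empty image
    return [h / total for h in hist]


def distance(a, b):
    """Euclidean distance between two descriptors (smaller means more similar)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query_pixels, collection):
    """Return item names ordered from most to least similar to the query."""
    q = color_histogram(query_pixels)
    scored = [(distance(q, color_histogram(px)), name) for name, px in collection.items()]
    return [name for _, name in sorted(scored)]


# Hypothetical, hand-made "images": lists of (R, G, B) pixels with no tags at all.
collection = {
    "red_statue": [(250, 10, 10)] * 4,
    "blue_sky": [(10, 10, 250)] * 4,
    "mixed": [(250, 10, 10)] * 2 + [(10, 10, 250)] * 2,
}
query = [(240, 20, 20)] * 4  # a mostly red query photo

ranking = rank_by_similarity(query, collection)
print(ranking)
```

Nothing in the collection carries a tag or caption; the ranking comes entirely from the pixel values, which is the idea behind searching untagged content.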
For example, if a tourist uses her mobile phone to photograph a statue, SAPIR identifies the image's low-level descriptors, compares them to existing photographs, and helps identify the statue. With further research, more specific features could be analyzed, so that, say, someone can photograph a fashionable wallet seen on the street, and find out which stores carry the item. In the future, scientists might be able to extend the power of SAPIR's scalability to aid in patient healthcare by analyzing medical images and rich media patient records to suggest a likely medical diagnosis, by comparing the combined results with historical data from distributed medical repositories.
Multimedia makes up the largest share of information stored on the Internet. In fact, according to a May 2009 IDC study, 95% of electronic information on the Internet, such as digital photos, is unstructured -- it isn't neatly categorized or tagged. Images, captured by more than 1 billion devices in the world, are the biggest part of the digital universe. The number of cell phone pictures reached nearly 100 billion in 2006, and is expected to reach 500 billion images by 2010.
SAPIR taps into the vast -- and rapidly growing -- electronic repository of multimedia and has exceptional reliability and nearly unlimited capacity. It uses the same type of self-organizing peer-to-peer technology currently used for swapping audio and video over the Internet. With this approach, there is no central point of potential failure, and server hardware can be added for additional capacity when the collection grows. The "freshness" of the categorized index is ensured by an approach where providers of content automatically push their material into a searchable repository.
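The general idea of an index spread across peers, with content pushed in by providers and no central server, can be sketched in a few lines of Python. This is a simplified illustration and not SAPIR's actual design: the Peer class, the hash-based placement, and the query that simply asks every peer for its best local matches are all assumptions, and a real peer-to-peer index would route queries far more selectively.

```python
import hashlib
import heapq


class Peer:
    """One node holding a slice of the index (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        self.items = {}  # item id -> descriptor (list of floats)

    def local_top_k(self, query, k):
        """Return this peer's k closest items as (squared distance, item id)."""
        scored = (
            (sum((a - b) ** 2 for a, b in zip(query, desc)), item_id)
            for item_id, desc in self.items.items()
        )
        return heapq.nsmallest(k, scored)


def assign_peer(item_id, peers):
    """Pick a peer from a hash of the item id, so no central directory is needed."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]


def push(item_id, descriptor, peers):
    """A content provider pushes a new item into the distributed index."""
    assign_peer(item_id, peers).items[item_id] = descriptor


def search(query, peers, k=2):
    """Ask every peer for its local best matches, then merge them."""
    candidates = []
    for peer in peers:
        candidates.extend(peer.local_top_k(query, k))
    return [item_id for _, item_id in heapq.nsmallest(k, candidates)]


peers = [Peer("a"), Peer("b"), Peer("c")]
push("img1", [0.0, 0.0], peers)
push("img2", [1.0, 1.0], peers)
push("img3", [0.1, 0.0], peers)
first_result = search([0.0, 0.0], peers, k=2)

# Capacity grows by adding a peer; items already stored stay searchable.
peers.append(Peer("d"))
push("img4", [0.0, 0.05], peers)
second_result = search([0.0, 0.0], peers, k=2)
print(first_result, second_result)
```

Because a pushed item becomes searchable as soon as it lands on a peer, the index stays fresh without a separate crawl step, which mirrors the push model described above.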
A demo for testing by the general public is now available at http://sapir.isti.cnr.it/....
The SAPIR project consortium (http://www.sapir.eu/) includes: IBM Research-Haifa, Israel; Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche (CNR), Italy; Max-Planck-Gesellschaft Zur Foerderung der Wissenschaften E.V. (MPG), Germany; Eurix S.R.L. (EURIX), Italy; Xerox - SAS (XRCE), France; Masarykova Univerzita (MU Brno), Czech Republic; Telefonica Investigacion y Desarrollo sa Unipersonal (TID), Spain; Telenor ASA (TELENOR), Norway; Universita' degli Studi di Padova (UPD), Italy.