Searching for content in large video archives, like those at the BBC, is notoriously challenging. Searching is typically done via keyword queries against pre-defined metadata such as titles, tags, audio transcripts and viewers' notes. However, it is difficult to use such keywords to find specific moments in a video, leaving searchers with the painstaking task of trawling through hours and hours of footage.
Multimodal Video Search by Examples (MVSE) has the potential to solve this problem. MVSE allows users to search for specific moments in a video where a particular speaker talks about a specific topic at a particular location. Instead of using keywords, users will use examples to search for a specific face, voice, location and/or topic.
The £2.2m project is funded by the Engineering and Physical Sciences Research Council (EPSRC).
The project team will use the BBC’s extensive video archive to study how MVSE can be made efficient, effective, scalable and robust when video archives are large, historical and dynamic, and when the search modalities are person (face or voice), context and topic.
The project will address key challenges in the development of an MVSE solution, including video segmentation, content representation, hashing, ranking and fusion. The aim is to develop a framework for MVSE and to validate it through the development of a prototype search tool. Such a search tool will be useful for organisations such as the BBC and the British Library, who maintain large collections of video archives and want to provide a search tool for their own staff as well as for the public.
Professor Hui Wang, Principal Investigator and Professor of Computer Science at Ulster University, commented:
“We are delighted to lead this innovative project which we hope will transform multimedia search capability. MVSE is a relatively new approach to video search with huge potential. Together with our partners we will use our expertise to develop a framework for MVSE and create a scalable next-generation ‘search by example’ functionality for national video archives. The team at Ulster consists of four academics and will focus on AI-based search technologies.”
Rob Cooper, Lead Development Producer, BBC Research and Development, said:
“We think that this type of research could be really useful to journalists, programme makers and eventually members of the public who want to search the BBC’s archives. The ability to quickly scan hundreds of thousands of hours of TV and radio would mean that users could more rapidly unearth the hidden gems in our archives.”
Mark Gales, Professor of Information Engineering, University of Cambridge, said:
“MVSE is an exciting, timely project. Recent developments in artificial intelligence have dramatically improved our ability to classify audio, speech and video data. This project offers unique challenges in all of these areas, to enable highly flexible, highly specific retrieval of content from large video archives. The team at Cambridge University will focus on developing and integrating novel speech and language processing research into this system.”
Josef Kittler, Distinguished Professor at the University of Surrey, commented:
“Indexing and retrieving multimedia information from huge archives by content is a major challenge. At Surrey we are corralling a team of four academics with complementary expertise in biometrics, audio and visual scene analysis to tackle the problem. The project will require innovative Artificial Intelligence and Machine Learning techniques to achieve its objectives.”
Visit the project website for more information.