John Lee, VP of APAC Sales, TiVo
TiVo is providing a natural language, voice-based search solution to leading platforms in the US and the UK, making it quicker and easier for viewers to find the content they are looking for. John Lee, VP of APAC Sales, TiVo, shares his perspective in this opinion piece.
For many years, effective voice-based search technologies have eluded businesses that tried to bring next-generation input methods to viewers. Command-based speech systems have been perceived as ineffective and hard for viewers to use. However, the widespread adoption of smartphones and tablets, and their minimised keyboards, has led to a renewed interest in this genre of technology.
Apple’s Siri and Amazon’s Alexa have progressed beyond basic menu navigation functions. Now any device with a microphone has potential for speech-based commands and can become an intelligent discovery system that uses a sophisticated entertainment brain to understand customer desires.
This technology is important and under-explored by the video industry, which often appears to have been left behind in terms of intuitive discovery functionality. Content providers should put voice-based search and recommendation at the core of their products to give customers easy access to their favourite shows and genres. In India, where there is a huge number of dialects, voice search is a great tool for quickly finding the content that users want to watch.
Speaking the viewer’s language
With the overwhelming volume of content available today, viewers have preferences that span cast, plot and genre. Conversational interfaces simulate natural communication and remove the need to conform to hierarchical menu structures. Most importantly, the technology must understand when a user is drilling into a particular genre in detail, or when they have lost interest and have completely switched topics.
To be successful, natural language search needs to cover several capabilities, each of them crucial:
- Disambiguation: Natural language technology must understand and interpret the user’s intent. For example, the phonetic sound “Kroos” can be interpreted to apply to Tom Cruise or Penelope Cruz, and the system should be able to understand what the user is looking for in relation to the original query.
- Statefulness: During a dialogue with a user, the system should be able to maintain context, and understand that people change their minds quickly. For example, the user could say that they are “in a mood for thrillers,” then jump to “Bond” and then to “old ones”. Ideally, the system should understand these requests, and serve up a series of older James Bond films for the viewer to select from.
- Personalisation: Conversational systems need to understand their users on an individual basis. For example, the system should learn that a user based in India who asks “When is the game tonight?” wants to know about their local team, and if they ask, “When is the Indians game?” they are referring to the game involving the cricket team, Mumbai Indians.
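The statefulness requirement above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny catalogue, its schema, the `DialogueState` class, the keyword matching and the 1980 cut-off for "old" are hypothetical and do not describe TiVo's implementation. It shows only the idea of accumulating context across utterances and dropping it when the user switches topic.

```python
from dataclasses import dataclass, field

# Hypothetical mini-catalogue; the schema is an assumption for this sketch.
CATALOGUE = [
    {"title": "Goldfinger", "genre": "thriller", "franchise": "James Bond", "year": 1964},
    {"title": "Dr. No", "genre": "thriller", "franchise": "James Bond", "year": 1962},
    {"title": "Skyfall", "genre": "thriller", "franchise": "James Bond", "year": 2012},
    {"title": "Se7en", "genre": "thriller", "franchise": None, "year": 1995},
    {"title": "Notting Hill", "genre": "comedy", "franchise": None, "year": 1999},
]


@dataclass
class DialogueState:
    """Keeps the filters gathered so far in one conversation."""
    filters: dict = field(default_factory=dict)

    def hear(self, utterance: str) -> list:
        # Naive keyword spotting stands in for real language understanding.
        text = utterance.lower()
        if "thriller" in text:
            self._switch_genre("thriller")
        elif "comed" in text:
            self._switch_genre("comedy")
        if "bond" in text:
            self.filters["franchise"] = "James Bond"
        if "old" in text:
            self.filters["before_year"] = 1980  # assumed meaning of "old"
        return self.results()

    def _switch_genre(self, genre: str) -> None:
        # A different genre means the user changed topic: drop stale context.
        if self.filters.get("genre") not in (None, genre):
            self.filters.clear()
        self.filters["genre"] = genre

    def results(self) -> list:
        f = self.filters
        return [
            item["title"]
            for item in CATALOGUE
            if ("genre" not in f or item["genre"] == f["genre"])
            and ("franchise" not in f or item["franchise"] == f["franchise"])
            and ("before_year" not in f or item["year"] < f["before_year"])
        ]


state = DialogueState()
for said in ["I'm in a mood for thrillers", "Bond", "old ones", "actually, a comedy"]:
    print(said, "->", state.hear(said))
```

Each utterance narrows the previous result set until the user names a different genre, at which point the earlier "Bond" and "old" constraints are discarded rather than carried into the new topic.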
Taking understanding to the next level
Behind successful natural language technology lie excellent search capabilities. New technologies such as graphs have brought high-quality, relevant search results to consumers everywhere, setting a benchmark across industries. Unlike a traditional database, a graph is much more scalable and flexible because it allows all sorts of information to be connected to records without relying on “tables.”
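To make the contrast with tables concrete, here is a minimal sketch of a graph held as an adjacency list. The `Graph` class, the relation names and the sample entities are hypothetical choices for this illustration, not any product's data model. The point is that a new kind of relationship can be attached to a record without changing a predefined schema.

```python
from collections import defaultdict


class Graph:
    """Adjacency-list graph: node -> list of (relation, other node)."""

    def __init__(self):
        self.edges = defaultdict(list)

    def connect(self, source, relation, target):
        # Store both directions so the graph can be walked from either end.
        self.edges[source].append((relation, target))
        self.edges[target].append((f"inverse:{relation}", source))

    def neighbours(self, node, relation=None):
        # .get avoids creating empty entries for unknown nodes.
        return [
            other
            for rel, other in self.edges.get(node, [])
            if relation is None or rel == relation
        ]


g = Graph()
g.connect("Tom Cruise", "acted_in", "Top Gun")
g.connect("Top Gun", "has_genre", "action")
# A brand-new relation type needs no schema change, just another edge.
g.connect("Top Gun", "famous_quote", "I feel the need... the need for speed")

print(g.neighbours("Top Gun"))
print(g.neighbours("Top Gun", "inverse:acted_in"))
```

In a table-based design, adding quotations would typically mean a new column or a new table; here it is simply one more labelled connection.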
In the context of TV, most consumers have viewing patterns that can be mapped to provide highly personalised search results that vary from country to country and from region to region. The ability to make personalisation precise and extremely relevant – what the industry is now terming hyper-personalisation – is tied to the knowledge graph’s semantic capabilities.
At its core, a quality conversational search engine should include the following aspects:
- Knowledge graph: Developed at TiVo’s facilities in Bangalore using next-generation technology, the Knowledge Graph is a key component of the company’s Advanced Search and Recommendation solution. This graph maps search results to intent and prioritises those results based on the weight of their connection. It should be able to:
  - Look at named entities in media, entertainment and geography, and extract, de-duplicate and disambiguate those entities across sources
  - Recognise similarities and build relationships between entities
  - Identify a multi-dimensional view of popularity and how audience interest in the entities shifts over time
  - Generate a large vocabulary, such as keywords and sub-genres, to help search systems identify relevant content
- Personal graph: Crucial to true conversational systems, the personal graph tunes the conversational system to individuals to enable natural conversations around the user’s preferences and context. The personal graph is:
  - Based on statistical machine learning
  - Able to learn individual behavioural patterns and interests
  - Able to learn how time and device affect recommendations
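One way to picture how the two graphs could work together is a ranking step that blends a knowledge-graph connection weight with a per-user affinity. The sketch below is an assumption-laden illustration: the `rank` function, the linear blend, the `personal_weight` parameter and all numbers are invented for this example and are not TiVo's scoring method. It reuses the “Kroos” ambiguity from earlier in the article.

```python
# Hypothetical candidates for the spoken query "Kroos"; weights are made up.
CANDIDATES = [
    {"name": "Tom Cruise", "connection_weight": 0.8, "tags": ["action"]},
    {"name": "Penelope Cruz", "connection_weight": 0.6, "tags": ["drama"]},
]


def rank(candidates, personal_affinity, personal_weight=0.5):
    """Order candidates by a blend of global and personal relevance.

    personal_affinity maps a tag to a 0..1 score learned for one user;
    unknown tags count as 0.0.
    """

    def score(candidate):
        affinity = max(
            (personal_affinity.get(tag, 0.0) for tag in candidate["tags"]),
            default=0.0,
        )
        return (
            (1 - personal_weight) * candidate["connection_weight"]
            + personal_weight * affinity
        )

    return sorted(candidates, key=score, reverse=True)


# With no personal history, the globally stronger connection wins.
print([c["name"] for c in rank(CANDIDATES, {})])

# A viewer who mostly watches dramas sees the order flip.
drama_fan = {"drama": 0.9, "action": 0.1}
print([c["name"] for c in rank(CANDIDATES, drama_fan)])
```

A fuller version would presumably also condition the affinity on time of day and device, as the list above suggests; that is left out here to keep the sketch short.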
At the front end of the system, a conversational query engine is needed to bind all of these components together. It combines key algorithms that map and learn linguistic features and deliver content discovery features to customers.
Natural language technology backed by knowledge graphs can revolutionise content discovery, and search and recommendation. Based on excellent metadata that covers actors and actresses, content synopsis and even famous quotations from films, content providers can create a second to none entertainment brain that offers customers speedy and accurate access to their favourite shows while suggesting similar content that they might enjoy. Voice-based discovery backed by knowledge graphs is no gimmick – it is set to change the way that people interact with their TV screens and mobile devices – as long as service providers make it personalised, intuitive and natural.