I test applications that use a natural language interface or natural language technology, and I write an article about my experience with each one.
This week, I tested Trooclick, a free Opinion-Driven Search Engine.
A New Approach to Opinion Mining with Trooclick’s Opinion-Driven Search Engine
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object.
The problem with these solutions is that they crawl very widely and report a great deal of data of little interest. The completeness of these platforms can also be questioned. Although the approach is certainly useful for “quantifying” data, it tells us little from a “qualitative” perspective. First, the result is often a hodgepodge of heterogeneous texts, often highly redundant, in which few real opinions are expressed; hence the “neutral” category that many opinion analysis systems place next to “positive” and “negative”, a category that has little value in most cases. Finally, eReputation platforms that offer “sentiment analysis” apply it to the entire document, which quickly becomes unusable when several points of view are expressed in the same document.
Trooclick’s Opinion-Driven Search Engine (beta) uses natural language processing (NLP) technology to gather quotes (Screenshot 1) from online sources of news and opinions, including content from publishers, blogs, and Twitter (soon expanding to radio and TV). After quotes are extracted, the speakers are categorized (there are 16 categories in total, including executives, analysts, politicians, and clients). Trooclick then uses this data to rank news articles for quality. Unsurprisingly, for a company that emphasizes the importance of different points of view in understanding the news, in the Trooclick universe more points of view from more people means a higher ranking. For now, two ranking criteria are active (number of speakers and quote score), with about 30 more in the pipeline (Screenshot 2).
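To make the two active criteria concrete, here is a minimal sketch of how articles could be ordered by number of distinct speakers and then by quote score. This is not Trooclick's actual algorithm, which is not published in this article: the `Quote` and `Article` types, the idea of summing per-quote scores, and the tie-breaking order are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Quote:
    speaker: str   # name of the person quoted
    score: float   # hypothetical per-quote quality score


@dataclass
class Article:
    title: str
    quotes: list[Quote] = field(default_factory=list)


def rank_key(article: Article) -> tuple[int, float]:
    """Sort key: number of distinct speakers first, then summed quote score."""
    speakers = {q.speaker for q in article.quotes}
    total_score = sum(q.score for q in article.quotes)
    return (len(speakers), total_score)


def rank_articles(articles: list[Article]) -> list[Article]:
    """Return articles ordered from highest to lowest rank."""
    return sorted(articles, key=rank_key, reverse=True)


# Hypothetical sample data: article B quotes two different speakers.
articles = [
    Article("A", [Quote("Mike Andres", 0.9)]),
    Article("B", [Quote("Mike Andres", 0.5), Quote("An analyst", 0.4)]),
    Article("C", []),
]
ranked_titles = [a.title for a in rank_articles(articles)]
print(ranked_titles)  # ['B', 'A', 'C']
```

Article B comes first because it carries two points of view, even though its individual quotes score lower than the single quote in article A.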
The software, based on advanced text mining and semantic analysis technologies, performs several tasks:
- Ranking the quality of articles per event
- Extracting quotes and identifying the speaker, the media outlet, and the publication date
- Classifying the speaker into categories (manager, analyst, customer, employee, etc.)
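As a rough illustration of the classification task in the list above, the sketch below assigns a category from keywords found in a speaker's description. It is only a guess at the general idea: the keyword lists, the category labels, and the `categorize_speaker` function are hypothetical, and a production system would presumably rely on richer semantic analysis than substring matching.

```python
# Hypothetical keyword lists; the real categories and rules are not documented here.
CATEGORY_KEYWORDS = {
    "executive": ("president", "ceo", "chief", "director"),
    "analyst": ("analyst",),
    "politician": ("senator", "minister", "mayor"),
    "customer": ("customer", "client"),
}


def categorize_speaker(description: str) -> str:
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "unknown"


print(categorize_speaker("McDonalds President, Mike Andres"))  # executive
```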
Still in beta, Trooclick’s solution currently covers only English-language media.
Automatic Detection of Quotes
Trooclick tracks quotes from news sites and social media. The system identifies quotes in different ways. As a result not only does the site pull up direct quotes like this:
“McDonalds will stop serving antibiotic-raised poultry,” said McDonalds President, Mike Andres.
…But also indirect ones like this:
Mike Andres, McDonalds President, said McDonalds will stop serving antibiotic-raised poultry.
And some quotes come from opinion columns (“Hopefully chicken is just the start – I hope the Big Mac and McRib will be next”) and analysis (“McDonald’s decision to sell milk produced without rBST was a good step because the growth hormone can cause health problems in dairy cows.”)
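The difference between direct and indirect quotes can be sketched with two regular expressions applied to the example sentences above. This is a deliberately naive illustration, not Trooclick's method: the patterns only handle the verb "said", assume one sentence at a time, and the `extract_quote` function and its return format are assumptions.

```python
import re

# Direct quote: text inside quotation marks, followed by "said <speaker>."
DIRECT = re.compile(r'[“"](?P<quote>[^”"]+)[”"]\s*said\s+(?P<speaker>[^.]+)\.')

# Indirect quote: "<speaker> said [that] <statement>." with no quotation marks.
INDIRECT = re.compile(
    r'(?P<speaker>[^“”"]+?),?\s+said\s+(?:that\s+)?(?P<quote>[^.]+)\.'
)


def extract_quote(sentence: str):
    """Return a dict with type, speaker and quote, or None if nothing matches."""
    match = DIRECT.search(sentence)
    if match:
        return {
            "type": "direct",
            "speaker": match.group("speaker").strip(),
            # Drop the comma that closes the quoted clause.
            "quote": match.group("quote").rstrip(",").strip(),
        }
    match = INDIRECT.match(sentence)
    if match:
        return {
            "type": "indirect",
            "speaker": match.group("speaker").strip(),
            "quote": match.group("quote").strip(),
        }
    return None


direct = extract_quote(
    "“McDonalds will stop serving antibiotic-raised poultry,” "
    "said McDonalds President, Mike Andres."
)
indirect = extract_quote(
    "Mike Andres, McDonalds President, said McDonalds will stop "
    "serving antibiotic-raised poultry."
)
print(direct["type"], "|", direct["speaker"])
print(indirect["type"], "|", indirect["speaker"])
```

Both sentences yield the same quoted statement; only the detected type and the word order of the speaker description differ. Quotes from opinion columns, where the speaker is the article's author, would need a different strategy than pattern matching on "said".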
Automation of Fact-Checking for Journalism
“Fact checking is the task of assessing the truthfulness of claims made by public figures such as politicians, pundits, etc. It is commonly performed by journalists employed by news organisations in the process of news article creation. Fact-checking is a time-consuming process” (Vlachos & Riedel 2014).
Journalism is about finding facts, interpreting their importance, and then sharing that information with the audience. That’s all journalists do: find, verify, enrich and then disseminate information. It sounds easy, doesn’t it: observing what is going on, asking questions, uncovering facts and then telling the public what we have discovered. But we are dealing with volatile raw material. Handled carelessly, the facts we uncover, research and present have the power to cause misunderstandings and damage, and could potentially change the course of history. That’s why it’s essential that we apply robust fact-checking to all our journalism.
But fact-checking is a time-consuming process. Automating it has recently been discussed in the context of computational journalism (Cohen et al., 2011; Flew et al., 2012). Inspired by recent progress in natural language processing, databases and information retrieval, the vision is to provide journalists with tools that would allow them to perform this task automatically.
One way to address that problem is to present different points of view, as Trooclick’s Opinion-Driven Search Engine does. There are other solutions in the area of fact-checking, such as Truth Teller: this Washington Post initiative transcribes political videos and checks them against a database that draws on PolitiFact, FactCheck.org, and the paper’s own Fact Checker blog. The program then tells viewers which statements are true and which are false. In 2015, the Post plans to annotate videos in real time.
Trooclick’s Opinion-Driven Search Engine is available online at http://trooclick.com
Here is a video presentation of Trooclick:
To contact Trooclick