Why Rosoka is the ‘no-brainer’ plug-in for IBM i2 Analyst’s Notebook
It’s no secret that i2® is a popular choice for visualising and analysing data. That’s why in 2011 IBM paid a rumoured $500 million to acquire the small British software company. Its flagship product, Analyst’s Notebook, has become the standard for charting data and is used by nearly all UK police forces. If you have a sporadic flow of data in formats that vary greatly, IBM i2 is still one of the most compelling tools on the market.
IBM i2® products have always worked well with structured data, giving users the ability to ‘bring to life’ data hidden in columns and rows. There are many other places data can hide, though, and one of them is unstructured text.
Unstructured Data
Unstructured text can be a challenge to analyse and understand, and we keep generating more of it every day. It can come from anywhere: documents, emails or social media posts. In March 2018 we wrote a blog detailing how data hidden in unstructured text can be utilised. It focussed on the use of Natural Language Processing, and particularly Entity Extraction. You can read it here. That blog post concentrated on what could be done using freely available tools.
At the time we stated that there are many commercial offerings that provide the same functionality through a much slicker, easier-to-access interface. One of those tools is Rosoka Text Analytics. Rosoka is an industry leader in text analytics solutions. Their enterprise solution, Rosoka Server, is great for large-scale analysis of unstructured data. This blog is about their standalone solution, which is a plug-in for i2.
So, why is it a ‘no-brainer’?
Well, to start off with, it’s fully integrated with IBM i2 Analyst’s Notebook. It’s a plug-in that has been created in close collaboration with IBM, and you can tell. We tested the plug-in against several datasets, including the case study we used in the previous blog post. This data can be found here.
Here are the reasons why we think it’s a ‘no-brainer’ plug-in:
- It extracts entities and links from unstructured files such as PDFs, Word documents and txt files. It then allows users to generate Analyst’s Notebook charts based on the content
- Processing time is quick compared to reading the documents manually and quicker than the free extraction engines used in our previous blog
- During testing we found the entity extraction to be very accurate. What was more impressive was the relationship extraction. In our previous blog we inferred links based on how many times two entities were mentioned in the same document. What Rosoka does is far more sophisticated and is based on the language around the two entities
- Entity extraction is good but it can create a lot of noise. Rosoka’s use of Salience (relevance) meant we could always find the most relevant extracted entities and ignore the others
- Rosoka applies entity resolution to its extraction. This means that if Theresa May is mentioned as ‘Theresa’, ‘Mrs May’ or even ‘She’ in a document, Rosoka can resolve these different mentions into one entity
- It’s easy to reclassify an extracted entity from, say, a person to a company (for example Robert Dyas, the UK hardware store). Rosoka can also learn from your decisions so it doesn’t make the same mistake again
- Rosoka is truly multilingual. We tested this by running several news articles about Islamic State, written in different languages, through Rosoka. The organisation Islamic State was mentioned in both the Russian and the Arabic news articles. On the chart these mentions were resolved into one entity, despite originating from two different languages. That meant that when we expanded Islamic State we got linked items from both the Russian and the Arabic articles
- The most surprising reason Rosoka is a no-brainer, though, is the price. For a fraction of the cost of IBM i2 Analyst’s Notebook, the plug-in delivers accurate, multilingual, unstructured data analysis. We’ve looked at costs for this type of technology for clients before, and often the requirements were dropped due to budget restrictions. Well, not with Rosoka!
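For context, the simple link inference from our previous blog (counting how often two entities turn up in the same document) can be sketched in a few lines of Python. This is only an illustration of that baseline, not of how Rosoka extracts relationships. The function name, the `min_count` threshold and the sample entities are all made up for the example, and we have assumed that each pair is counted once per document.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(doc_entities, min_count=1):
    """Count, for each pair of entities, how many documents mention both.

    doc_entities maps a document name to the list of entities found in it.
    The names and the min_count threshold are hypothetical, for illustration.
    """
    counts = Counter()
    for entities in doc_entities.values():
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    # Keep only the pairs seen together in at least min_count documents.
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up sample data: entities already extracted from three documents.
docs = {
    "report_1.txt": ["Alice Smith", "Acme Ltd", "London"],
    "report_2.txt": ["Alice Smith", "Acme Ltd"],
    "email_1.txt": ["Bob Jones", "London"],
}

links = cooccurrence_links(docs, min_count=2)
print(links)
```

With this sample data, only the pair that appears together in two documents survives the threshold. The weakness is easy to see: two entities can share a document without being related at all, which is why extraction based on the language around the entities is so much more useful.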
In conclusion, we believe Rosoka is a ‘no-brainer’ plug-in for i2 Analyst’s Notebook. The functionality, ease of use and price will make any analyst’s life easier!
For more information on Rosoka visit their website or contact S-branch. If you’d like to find out more about unstructured data analysis, then feel free to call.
S-branch offers independent advice, meaning that if Rosoka is not for you at this time, we can help you find the solution that is.