Organizing data with Entity Extraction for easy accessibility

About the Client:

The client is a US-based organization working in knowledge management.
Most of their work runs on cloud platforms.

Challenges:

Working in the knowledge management industry, the client handles large volumes of data every day.
A large number of documents have to be maintained and managed.
The existing data needed to be well organized so that it could be accessed easily.
New data had to enter the system in the same managed format, without repetitive manual tasks.

  • Organizing unstructured data.
  • High manual effort in retrieving useful data.
  • Searching the data.

Our Solution:

The client appointed HUE Digital to study the pain points in the existing system and suggest a solution to
the current problems.

We studied the client’s data. Based on our analysis of the data and how frequently it is
generated, HUE Digital suggested building an application that accepts a plain-text request
and processes it using various APIs and services. The application returns its response in
JSON format.
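
The request and response contract described above can be sketched as a small pipeline. This is only an illustration under assumptions: the source does not publish the application's code, so the function names (handle_request, count_words), the idea of chaining stages, and the JSON field names are all hypothetical.

```python
import json
from typing import Callable, Dict, List

# A stage takes the accumulated state and returns an updated copy.
Stage = Callable[[Dict[str, object]], Dict[str, object]]


def handle_request(text: str, stages: List[Stage]) -> str:
    """Pass a plain-text request through each stage and return a JSON string."""
    state: Dict[str, object] = {"text": text}
    for stage in stages:
        state = stage(state)
    return json.dumps(state, ensure_ascii=False, sort_keys=True)


def count_words(state: Dict[str, object]) -> Dict[str, object]:
    # Placeholder stage (hypothetical); a real stage would call an external service.
    return {**state, "word_count": len(str(state["text"]).split())}


response_json = handle_request("Apache Jena stores triples", [count_words])
print(response_json)
```

Keeping each stage as a plain function makes it easy to swap a stub for a real service call without changing the response format.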

Once the application receives a text request, it detects the language of the text and forwards the
request to Google NLP for further processing. It uses the Google NLP service to extract named entities
and common nouns from the content.
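
As a rough sketch of this step, the snippet below separates named entities from common nouns in an entity-analysis result. It assumes the response follows the shape of the Google Cloud Natural Language entity analysis output, where each entity has a name and a list of mentions typed PROPER or COMMON. The sample response is hand-written for illustration and is not real service output; split_entities is a hypothetical helper, not part of the client's system.

```python
from typing import Dict, List, Tuple


def split_entities(response: Dict) -> Tuple[List[str], List[str]]:
    """Return (named_entities, common_nouns) from an entity-analysis response."""
    named: List[str] = []
    common: List[str] = []
    for entity in response.get("entities", []):
        mention_types = {m.get("type") for m in entity.get("mentions", [])}
        # An entity with at least one PROPER mention is treated as a named entity.
        target = named if "PROPER" in mention_types else common
        if entity["name"] not in target:
            target.append(entity["name"])
    return named, common


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "language": "en",
    "entities": [
        {"name": "Apache Jena", "type": "OTHER",
         "mentions": [{"text": {"content": "Apache Jena"}, "type": "PROPER"}]},
        {"name": "documents", "type": "OTHER",
         "mentions": [{"text": {"content": "documents"}, "type": "COMMON"}]},
    ],
}

named_entities, common_nouns = split_entities(sample_response)
print(named_entities, common_nouns)
```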

After getting the named entities from Google NLP, the application sends those terms for lemmatization. Lemmatization is the process of grouping the inflected forms of a word together under a single base form.
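
To make the idea concrete, here is a deliberately small lemmatizer. It is a toy: the source does not say which lemmatization tool the application uses, and the lookup table and suffix rules below are assumptions that cover only a few English plural forms.

```python
from typing import Dict, List

# Tiny lookup for irregular forms (illustrative, far from complete).
IRREGULAR = {"children": "child", "mice": "mouse", "people": "person"}


def toy_lemmatize(word: str) -> str:
    """Reduce a word to a base form using a lookup table and simple suffix rules."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"      # ontologies -> ontology
    if w.endswith("sses"):
        return w[:-2]            # classes -> class
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]            # documents -> document
    return w


def group_by_lemma(words: List[str]) -> Dict[str, List[str]]:
    """Group inflected forms together under their shared base form."""
    groups: Dict[str, List[str]] = {}
    for word in words:
        groups.setdefault(toy_lemmatize(word), []).append(word)
    return groups


groups = group_by_lemma(["Documents", "document", "entities", "entity"])
print(groups)
```

A production system would normally rely on a dictionary-backed lemmatizer, since suffix rules alone mishandle many words.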

The system uses these lemmatized entities for further processing. We created a domain-specific
ontology file and built a triple store using Apache Jena. Apache Jena is an open-source Java
framework for Semantic Web and Linked Data applications. It offers RDF and SPARQL support, an
Ontology API, and reasoning support, as well as triple stores. The application uses the DBpedia API
to generate DBpedia entities for the named entities.
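
The following sketch shows what a triple store does in principle: it holds subject, predicate, object statements and answers pattern queries over them. It is a plain in-memory stand-in, not Apache Jena and not SPARQL. The class TinyTripleStore, the helper describe, and every ex: and dbr: term below are made up for illustration, since the client's ontology is not shown in the source.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TinyTripleStore:
    """Minimal in-memory triple store with wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        # None acts as a wildcard, similar to a variable in a query pattern.
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def describe(store: TinyTripleStore, lemma: str) -> List[Triple]:
    """Find the concept labelled with a lemma and return everything known about it."""
    subjects = [s for s, _, _ in store.match(p="rdfs:label", o=lemma)]
    return [t for s in subjects for t in store.match(s=s)]


store = TinyTripleStore()
store.add("ex:Invoice", "rdfs:label", "invoice")            # hypothetical ontology term
store.add("ex:Invoice", "rdfs:subClassOf", "ex:Document")   # hypothetical hierarchy
store.add("ex:Invoice", "owl:sameAs", "dbr:Invoice")        # illustrative DBpedia link
store.add("ex:Report", "rdfs:label", "report")

facts = describe(store, "invoice")
print(facts)
```

The lemmatized entity serves as the lookup key, which is why the lemmatization step comes before the ontology lookup.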

The application uses the Google translation API to translate the entities into the languages provided.
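
One way to picture this step is a function that collects a translation of every entity for every requested language. The sketch below takes the translator as a parameter so it runs without network access. The stub lookup table stands in for the real translation service, and translate_entities and stub_translate are hypothetical names.

```python
from typing import Callable, Dict, List

Translator = Callable[[str, str], str]  # (text, target_language) -> translated text


def translate_entities(entities: List[str], target_languages: List[str],
                       translate: Translator) -> Dict[str, Dict[str, str]]:
    """Map each entity to its translation in every requested language."""
    return {
        entity: {lang: translate(entity, lang) for lang in target_languages}
        for entity in entities
    }


# Stub standing in for the translation service (illustrative values only).
STUB_TABLE = {
    ("invoice", "fr"): "facture",
    ("invoice", "de"): "Rechnung",
    ("document", "de"): "Dokument",
}


def stub_translate(text: str, target_language: str) -> str:
    # Fall back to the original text when the stub has no entry.
    return STUB_TABLE.get((text, target_language), text)


translations = translate_entities(["invoice", "document"], ["fr", "de"], stub_translate)
print(translations)
```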

Architecture:

[Figure: architecture diagram]

Business Benefits:

Our solution enriched the client’s content management system. It helped reduce the
manual effort needed to search, store, and organize existing and new data.
HUE Digital helped achieve the following business outcomes for the client:

  • Increased throughput/productivity by 45%.
  • Saved man-hours, eventually leading to significant cost reduction.
  • Users can now access any information with just a couple of clicks.
  • Tasks are completed with a high degree of accuracy because the process is now system-dependent, reducing human error.
  • Reduced operation time and work-handling time significantly.
  • Freed up employees’ time to focus on other roles.
