Named Entity Recognition

Problem Statement:

  1. When signing up with the insurance provider for a policy, customers can check a box for special counsel instruction (meaning the insurance provider would not select its standard external counsel for them). When policies are underwritten for these clients and special counsel is requested, a text field is available in which the instructions can be specified. The attached spreadsheet contains sample data showing what currently goes into this field.
  2. The insurance provider would like a text analysis/NLP tool that takes this text data, converts it into structured data, and stores it in a field in a DB. Looking at the data, most entries appear to contain the name of an external counsel followed by a positive or negative sentiment, so the task should not be especially difficult.
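
To make the target of the problem statement concrete, the following is a minimal sketch of what "structured data in a DB field" could look like. The record layout (`CounselInstruction`), the table name, and the two sample rows are assumptions made for illustration; they are not taken from the attached spreadsheet, and SQLite stands in for whatever database the insurance provider actually uses.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class CounselInstruction:
    """Hypothetical structured form of one free-text instruction."""
    raw_text: str      # the instruction exactly as entered in the text field
    counsel_name: str  # extracted external counsel name (empty if none found)
    sentiment: str     # "positive", "negative" or "neutral"


def save_instructions(conn, records):
    """Create the (assumed) table if needed and insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counsel_instruction ("
        "raw_text TEXT, counsel_name TEXT, sentiment TEXT)"
    )
    conn.executemany(
        "INSERT INTO counsel_instruction VALUES (?, ?, ?)",
        [astuple(record) for record in records],
    )
    conn.commit()


# Invented sample rows, for illustration only.
records = [
    CounselInstruction("Please use Smith & Jones LLP", "Smith & Jones LLP", "positive"),
    CounselInstruction("Do not use Acme Legal", "Acme Legal", "negative"),
]

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
save_instructions(conn, records)
rows = conn.execute(
    "SELECT counsel_name, sentiment FROM counsel_instruction ORDER BY rowid"
).fetchall()
print(rows)
```

Keeping the raw text next to the extracted fields lets a reviewer check any extraction against what the customer actually wrote.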

Purpose of Sentiment Analysis and NER for the Insurance Provider:

  1. Sentiment analysis can be used to improve customer insurance service, promote client engagement, and serve other specific purposes.
  2. Sentiments also have the power to influence insurance customers.
  3. There are two machine learning approaches to sentiment analysis: unsupervised and supervised learning.
  4. The given dataset falls under unsupervised learning.
  5. NER (named entity recognition) extracts meaningful information from datasets. The extracted information helps the insurance provider in many ways, e.g. by reading only the mobile number, location, person, or organisation information.
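
As an illustration of the rule-based side of entity extraction, the sketch below pulls phone numbers out of free text with Python's `re` module. The pattern is an assumption: it is deliberately loose, targets 10-digit numbers with an optional country code, and would need tuning against the real data. Person, organisation, and location entities would normally come from a statistical NER model and are not covered here. The sample sentences use fictional numbers.

```python
import re

# Loose pattern (an assumption, not a validated phone grammar):
#   optional country code, then 3-3-4 digits separated by spaces, dots or hyphens,
#   with optional parentheses around the first three digits.
PHONE_RE = re.compile(
    r"(?<!\d)"                    # do not start in the middle of a longer number
    r"(?:\+?\d{1,3}[\s.-]?)?"     # optional country code, e.g. "+1 "
    r"\(?\d{3}\)?[\s.-]?"         # area code, e.g. "(415) " or "212-"
    r"\d{3}[\s.-]?\d{4}"          # remaining seven digits
    r"(?!\d)"                     # do not stop in the middle of a longer number
)


def extract_phone_numbers(text):
    """Return every substring of `text` that looks like a phone number."""
    return PHONE_RE.findall(text)


def normalise_phone(number):
    """Strip everything except digits so variants compare equal."""
    return re.sub(r"\D", "", number)


sample = "Call John Smith at 212-555-0147 before assigning. Alt: +1 (415) 555 0199."
found = extract_phone_numbers(sample)
print(found)
print([normalise_phone(n) for n in found])
```

Normalising to digits only makes it easier to store the value in a structured field and to spot the same number written in different styles.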


  • The special counsel instruction is captured in a natural language (English, in this case). Human language, however, is astoundingly complex and diverse. Thoughts and information are expressed in countless ways, both verbally and in writing. Each language also has its own grammar and syntax rules, terms, and slang. Moreover, the text can contain misspellings, abbreviations, punctuation mistakes, etc.
  • This information, however, is a veritable goldmine.
  • The challenge is to resolve ambiguity in the language and add useful numeric structure to the data so that it can be used for decision making.
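
One simple way to add numeric structure to such text is a polarity score. The sketch below is a dependency-free stand-in, not the TextBlob or VADER scoring named in the approach: it uses a tiny hand-made word list (hypothetical, not derived from the sample data), flips a word's polarity when it directly follows a negator, and averages the result. The threshold in `label` is likewise an assumption.

```python
import re

# Hypothetical mini-lexicon: word -> polarity in [-1, 1].
POLARITY = {
    "prefer": 1.0, "preferred": 1.0, "approved": 1.0, "use": 0.5,
    "avoid": -1.0, "exclude": -1.0, "conflict": -1.0,
}
NEGATORS = {"not", "no", "never", "don't", "dont"}


def polarity_score(text):
    """Average polarity of the lexicon words in `text`; 0.0 if none occur."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total, hits = 0.0, 0
    for i, token in enumerate(tokens):
        if token not in POLARITY:
            continue
        value = POLARITY[token]
        # Very rough negation handling: "not use" counts as negative.
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        total += value
        hits += 1
    return total / hits if hits else 0.0


def label(score, threshold=0.25):
    """Map a numeric score to a coarse class."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


for text in ("Please use Smith LLP, preferred counsel",
             "Do not use Acme Legal",
             "Counsel to be confirmed"):
    score = polarity_score(text)
    print(f"{score:+.2f}  {label(score):8}  {text}")
```

A score like this resolves only the coarsest ambiguity (for or against a named counsel); sarcasm, misspellings, and longer-range negation would still need a more capable model.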


iVentura Machine Learning Platform was used to build the solution. iVentura provides a complete ecosystem for data scientists to build models without worrying about the underlying infrastructure and security. Whether for a team or an individual data scientist, iVentura is well suited as a platform of choice because of what it comes equipped with. iVentura ingests the sample data, converts it into structured data, and captures it in a data store. The following is the high-level approach adopted for the solution:

  1. Analyze the given data
  2. Perform data pre-processing: clean the data of unwanted records, if any; check the data for consistency; remove duplicate records
  3. Extract features from the text using TF-IDF
  4. Perform k-means clustering to identify patterns in the data
  5. Perform sentiment analysis using TextBlob and NLTK VADER
  6. Perform named entity recognition using spaCy, pyap, and regular expressions (re)
  7. Build the word cloud
  8. Save the data into Excel/DB
  9. Deployment and visualization
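
The feature extraction and clustering steps above can be sketched without any third-party library. The code below is an illustration under stated assumptions, not the implementation used in the solution: it computes TF-IDF with the smoothed IDF form `log((1 + N) / (1 + df)) + 1`, L2-normalises each vector, and runs plain Lloyd's k-means with a deterministic farthest-point initialisation chosen here for reproducibility. The four sample instructions are invented. In practice a library such as scikit-learn would typically be used for both steps.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one L2-normalised TF-IDF vector per document."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1 for tok in vocab}
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        vec = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm; returns one cluster index per input vector."""
    # Farthest-point initialisation: start from the first vector, then keep
    # adding the vector that is farthest from all centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented sample instructions, for illustration only.
docs = [
    "use smith llp",
    "use smith llp only",
    "avoid acme legal",
    "avoid acme legal group",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(vocab)
print(labels)
```

With these four inputs the instructions that mention the same counsel end up in the same cluster, which is the kind of pattern the clustering step is meant to surface before sentiment and entity extraction are applied.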