Categorizing and Segmenting User SMS Data Using NLP Techniques

About the Client

Our client is an infrastructure neobank designed to redefine the banking experience and promote open banking. The client is building an employee neobank that offers superior financial instruments, curated to fit each employee's needs.

The Business Challenge

The client needed to segregate financial information, such as income and expenses, that was spread across multiple platforms. Users lacked a clear understanding of how they had spent during the month, for example the amounts spent on Shopping, Travel, Entertainment, EMI, Insurance, and many more categories.

The solution required extracting key features, identifying merchant names, and mapping them to categories.

It also had to provide visibility into the spending of each individual.

What Aptus Data Labs Did

This is a summary of what we delivered:

The Impact Aptus Data Labs Made

This is the impact we created, with measurable outcomes for the backend and the ML engine.

The Business and Technology Approach



The Expense Category Classification App is implemented as per the architecture below. At a high level, there are three components:

· Web App
· Android App
· SDK

First, the Web App user passes through the middleware/authentication layer and creates a client profile. A unique secret key is then created for each organization and given to the client. This secret key is also used to create a token for each of the client's users in the SDK. Android App users sign up via OTP or Google authentication, whereas the SDK is authenticated via the secret key and the token. After passing the authentication layer/middleware, users get access to the API Service. Their messages are fetched via the SMS Receiver Service (Kafka) and classified by the ML Engine to produce structured SMS data. This data is stored in the database and is visible to Android users as well as in the Web App portal, where the user interface enables the admin to view the classified data and its analytics.
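The secret-key and token step in this flow could look roughly like the sketch below. The case study does not say how the token is derived, so the HMAC-SHA256 scheme and the names `create_org_secret`, `create_user_token`, and `verify_user_token` are assumptions made purely for illustration.

```python
import hashlib
import hmac
import secrets


def create_org_secret() -> str:
    """Generate a random per-organization secret (hypothetical format)."""
    return secrets.token_hex(32)


def create_user_token(org_secret: str, user_id: str) -> str:
    """Derive a token for one SDK user from the organization secret.

    Assumption: the token is an HMAC-SHA256 of the user id, keyed by the secret.
    """
    return hmac.new(org_secret.encode(), user_id.encode(), hashlib.sha256).hexdigest()


def verify_user_token(org_secret: str, user_id: str, token: str) -> bool:
    """Recompute the expected token and compare it in constant time."""
    expected = create_user_token(org_secret, user_id)
    return hmac.compare_digest(expected, token)


org_secret = create_org_secret()
token = create_user_token(org_secret, "employee-001")
print(verify_user_token(org_secret, "employee-001", token))  # matching user
print(verify_user_token(org_secret, "employee-002", token))  # different user
```

With a scheme like this, the middleware only needs to store the organization secret; it can recompute and check each user's token on every SDK request.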

ML Engine

The raw messages are cleaned and classified into transaction, non-transaction, and payment reminder messages. Named entity recognition (NER), a natural language processing (NLP) technique that automatically identifies named entities in text, is used to predict the merchant name from each SMS. For payment reminder SMS, NER predicts both the merchant name and the due date. For transaction SMS, the predicted merchant name is mapped to a predefined category. The predefined category list consists of ATM, Bill, Crypto, E Commerce, Entertainment, Education, Food & Drinks, Food Delivery, Fuel, Groceries, Health, Home Service, Income, Insurance, Investment, Loan, Recharge, Rent, ITR, Retail, Salary, Travel, Wallet and Unknown.
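The shape of this pipeline (clean, classify the message type, extract the merchant or due date, map to a category) can be sketched as follows. This is not the production ML Engine: keyword cues and regular expressions stand in for the trained classifier and the NER model, and the cue lists, the merchant lookup table, and the sample messages are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword cues; the real classifier is not described in the case study.
REMINDER_CUES = ("due on", "due date", "is due", "reminder")
TRANSACTION_CUES = ("debited", "credited", "spent", "paid", "withdrawn")

# Hypothetical merchant lookup; category names come from the predefined list.
MERCHANT_CATEGORIES = {
    "swiggy": "Food Delivery",
    "amazon": "E Commerce",
    "indian oil": "Fuel",
}

# Regex stand-ins for NER: words after "at"/"to", and a DD-MM-YYYY date.
MERCHANT_PATTERN = re.compile(
    r"\b(?:at|to)\s+([A-Za-z][A-Za-z ]*?)(?=\s+(?:on|via|using|ref)\b|[.,]|$)",
    re.IGNORECASE,
)
DATE_PATTERN = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")


@dataclass
class StructuredSMS:
    message_type: str
    merchant: Optional[str] = None
    category: Optional[str] = None
    due_date: Optional[str] = None


def clean(raw: str) -> str:
    """Collapse runs of whitespace and trim the message."""
    return re.sub(r"\s+", " ", raw).strip()


def classify_type(text: str) -> str:
    """Label a message as payment_reminder, transaction, or non_transaction."""
    lowered = text.lower()
    if any(cue in lowered for cue in REMINDER_CUES):
        return "payment_reminder"
    if any(cue in lowered for cue in TRANSACTION_CUES):
        return "transaction"
    return "non_transaction"


def structure_sms(raw: str) -> StructuredSMS:
    """Turn one raw SMS into a structured record."""
    text = clean(raw)
    result = StructuredSMS(message_type=classify_type(text))
    if result.message_type == "transaction":
        match = MERCHANT_PATTERN.search(text)
        if match:
            result.merchant = match.group(1).strip()
        # Merchants missing from the lookup fall back to the Unknown category.
        key = (result.merchant or "").lower()
        result.category = MERCHANT_CATEGORIES.get(key, "Unknown")
    elif result.message_type == "payment_reminder":
        match = DATE_PATTERN.search(text)
        if match:
            result.due_date = match.group(0)
    return result


print(structure_sms("Rs.450.00 debited from A/c XX1234 at SWIGGY on 12-03-2023."))
print(structure_sms("Your Airtel bill of Rs.599 is due on 15-03-2023."))
print(structure_sms("Your OTP for login is 482913."))
```

The `Unknown` fallback mirrors the last entry in the predefined category list: a merchant that is extracted but not recognized still yields a structured record rather than being dropped.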
