Enterprise Data Lake and Analytics implementation for a large Pharmaceutical Company in India on AWS platform

Implemented a distributed data lake architecture and advanced analytics on the AWS cloud platform to reduce IT costs and improve productivity

About the Client

The client is a multinational pharmaceutical company based in India. It manufactures and sells active pharmaceutical ingredients and pharmaceutical formulations in India and the US. Considered one of the most reputed brands in India, the client has expanded through joint ventures and acquisitions over the last two decades.

The Business Challenge

The client wanted to improve and accelerate analytics-driven decisions and reduce the time spent on data analysis and reporting across both structured and unstructured data. The client also wanted to improve the deviation tracking of mitigation tasks and reduce the cost of the system stack by adopting an open-source, industrial-grade platform. Finally, the client wanted to prepare the groundwork and infrastructure for AI/ML and advanced analytics.

What Aptus Data Labs Did

We built an enterprise data lake architecture and implemented analytics on the AWS platform. The solution included an AWS data lake architecture for a scalable warehouse and an AWS data lake architecture for IoT and unstructured data.
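As a rough illustration of how IoT data might be organized in such a data lake, the sketch below builds a partitioned S3 object key. The `iot/` prefix, the `device_id` and `dt` partition names, and the example file name are assumptions made for illustration only; the case study does not describe the actual layout that was used.

```python
from datetime import date


def iot_object_key(device_id: str, day: date, filename: str) -> str:
    """Build a Hive-style partitioned S3 object key.

    The prefix and partition names are hypothetical; they are not
    taken from the case study.
    """
    # key=value path segments let query engines such as Athena or Presto
    # treat each segment as a partition column.
    return f"iot/device_id={device_id}/dt={day.isoformat()}/{filename}"


if __name__ == "__main__":
    print(iot_object_key("sensor-7", date(2021, 3, 5), "readings.json"))
```

Partitioning by device and date in this way keeps ad-hoc queries from scanning the whole bucket, which is one common reason to choose such a layout.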

The Impact Aptus Data Labs Made

The enterprise data lake architecture on the AWS platform enabled the client to process, analyze, and report on both structured and unstructured data quickly, and to make better analytics-driven decisions. The solution helped the client reduce IT costs and improve business performance.

The Business and Technology Approach

Aptus Data Labs used the following process to build the enterprise data lake architecture, both for the scalable warehouse and for IoT and unstructured data, and so resolve the business challenge. The solution was delivered in three releases. Aptus Data Labs:

  • Release 1
    1. Gathered and analyzed the requirements and the due diligence document
    2. Set up an enterprise data lake platform and defined the data lake framework, tools and technology, configuration, testing, and platform readiness
    3. Set up a database for DCS and acing GWD (data fetched as applicable for the dashboards in scope) with an elastic data warehouse
    4. Migrated the database to PostgreSQL (T and M scope)
  • Release 2
    1. Demonstrated connectivity to various databases from Presto
    2. Backed up email and uploaded the data to the cloud
    3. Uploaded IoT data to the cloud
  • Release 3
    1. Established connectivity from R/Python to the cloud database and S3 using libraries
    2. Enabled Presto/AWS Athena for data search and ad-hoc queries
    3. Migrated the Tableau dashboard to Superset, Amazon QuickSight, or D3.js
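The Python-to-cloud connectivity and Athena ad-hoc query steps above could look something like the following minimal sketch. It assumes the boto3 library is the one used for connectivity, which the case study does not state; the database name, table name, and results bucket are hypothetical placeholders.

```python
def athena_request(sql: str, database: str, output_location: str) -> dict:
    """Build keyword arguments for boto3's Athena start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_adhoc_query(sql: str, database: str, output_location: str) -> str:
    """Submit an ad-hoc query to Athena and return its execution id."""
    # Imported lazily so athena_request can be used without boto3 installed.
    import boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        **athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names; running this requires AWS credentials and
    # an existing Athena database and results bucket.
    params = athena_request(
        "SELECT * FROM iot_readings LIMIT 10",
        "pharma_lake",
        "s3://example-bucket/athena-results/",
    )
    print(params)
```

Keeping the request construction separate from the AWS call makes the query parameters easy to inspect and check without network access.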

The Reference Architecture We Built

[Figure: data lake reference architecture]

Tools used

The Outcome

The new data architecture based on the AWS Cloud benefited the client in multiple ways and helped resolve the business challenge. The benefits across the three phases were:

  • Phase 1
    1. Advanced analytics capabilities on both structured and unstructured data, with enterprise search enabled for any data
    2. Machine learning used to drive improvements and productivity
  • Phase 2
    1. Demonstrated connectivity to various databases from Presto
    2. Backed up email and uploaded the data to the cloud
    3. Uploaded IoT data to the cloud
  • Phase 3
    1. Established connectivity from R/Python to the cloud database and S3 using libraries
    2. Enabled Presto/AWS Athena for data search and ad-hoc queries
    3. Migrated the Tableau dashboard to Superset, Amazon QuickSight, or D3.js
