Implemented a distributed data lake architecture and advanced analytics on the AWS cloud platform to reduce IT costs and improve productivity

Case Study: Boosting Performance with Apache Spark Migration

About the Client

The client is a leading data processing company in Australia, connected to more than 4,500 pharmacies across the country. It wanted to improve access to information across the pharmacy supply chain. The objective was to help pharmacies identify opportunities in both their dispensary and retail operations through reporting and analytics, and to support rebate payments from suppliers and patient adherence programs.

The Business Challenge

The client wanted to accelerate analytics-driven decisions and reduce the time spent on data analysis and reporting across both structured and unstructured data. The client also wanted to improve deviation tracking of mitigation tasks and reduce the cost of the system stack by adopting an open-source, industrial-grade platform. Finally, the client wanted to lay the groundwork and infrastructure for AI/ML and advanced analytics.

What Aptus Data Labs Did

We migrated the client’s existing 5-node Vertica cluster to Apache Spark on a Hortonworks cluster running on AWS. The goals were to shorten processing time, reduce cost, and make it easier to adopt new features in the future.

The Impact Aptus Data Labs Made

The new analytics platform boosted performance by 62% and reduced data processing time. It also lowered IT costs, by a reported 400%, and helped the client handle large volumes of data smoothly.

The Business and Technology Approach

Aptus Data Labs used the following methodology to migrate the environment and resolve the existing challenge.

Tools Used

The Outcome

The migrated analytics platform reduced the processing time from 2.2 hours for a billion records to 1 hour for 1.2 billion records, a 62% performance improvement. The platform also reduced IT costs significantly by using open-source technologies, and it runs on a YARN cluster to ensure high availability and efficiency. As a result, the client can handle massive volumes of data smoothly without any drop in performance.
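The 62% figure is consistent with normalizing both runs to hours per billion records and comparing them. The short Python sketch below works through that arithmetic. The normalization is our reading of the numbers, not a formula stated in the case study, and the function and variable names are illustrative only.

```python
# Figures quoted in the case study.
before_hours, before_records = 2.2, 1.0e9  # legacy platform run
after_hours, after_records = 1.0, 1.2e9    # migrated platform run


def hours_per_billion(hours: float, records: float) -> float:
    """Normalize a run to the hours needed per one billion records."""
    return hours / (records / 1e9)


before_rate = hours_per_billion(before_hours, before_records)
after_rate = hours_per_billion(after_hours, after_records)

# Assumption: "performance boost" means the relative drop in
# processing time per billion records.
time_reduction_pct = (before_rate - after_rate) / before_rate * 100

# The same change expressed as a gain in records processed per hour.
throughput_gain_pct = (before_rate / after_rate - 1) * 100

print(f"Time per billion records: {before_rate:.2f} h -> {after_rate:.2f} h")
print(f"Reduction in processing time: {time_reduction_pct:.1f}%")  # 62.1%
print(f"Increase in throughput: {throughput_gain_pct:.1f}%")       # 164.0%
```

Under this reading, the same improvement can be described either as roughly 62% less time per record or as roughly 2.6 times the throughput.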
