The client is a public sector organization based in Australia. Its primary focus is e-governance and administration services, and it is responsible for delivering government services to customers, businesses, and other government organizations. In providing these services, the client regularly needs to process large volumes of data based on information contained in Information and Communication Technology (ICT) application forms.
The client wanted to process large batches of data quickly, with up to 15 million transactions per day. The constant growth of e-governance projects generated large amounts of data that became historical every day, and an additional 8 TB of data was added every month. The client wanted to manage these volumes and run analytical processing without performance issues.
We migrated the client’s database to a Hadoop platform running in a multi-node cluster environment to process massive volumes of data. We used commodity hardware to reduce costs and distributed the processing using Hive and MapReduce. We also configured the platform to keep only hot data (up to 3 months old) in Teradata and cold data (older than 3 months) in the Hadoop environment.
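The hot/cold split described above can be illustrated with a minimal Python sketch. It is not the client's actual routing logic: the function names, the tier labels, and the approximation of "3 months" as 90 days are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: the case study says "3 months"; 90 days is used here as a stand-in.
HOT_WINDOW_DAYS = 90


def storage_tier(record_date: date, today: date) -> str:
    """Return the (hypothetical) tier label for a record of the given date.

    Records no older than the hot window stay in Teradata;
    anything older is treated as cold and belongs in Hadoop.
    """
    age = today - record_date
    return "teradata" if age <= timedelta(days=HOT_WINDOW_DAYS) else "hadoop"


def partition_by_tier(record_dates, today):
    """Group record dates by the tier they would be routed to."""
    tiers = {"teradata": [], "hadoop": []}
    for record_date in record_dates:
        tiers[storage_tier(record_date, today)].append(record_date)
    return tiers
```

In practice, a split like this would more likely be applied to whole date partitions in a scheduled archival job than record by record.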
The new analytics platform reduced IT costs significantly. It also helped the client handle large volumes of data without performance degradation.
Aptus Data Labs used the following process to migrate the data and resolve the existing challenge. Aptus Data Labs:
The figure below shows the high-level components of the migration:
NDFS and Programming
The migrated Hadoop platform reduced IT costs significantly: because Hadoop is open source, there was no license cost. By using commodity hardware, the platform reduced both the initial investment and the recurring costs of handling large volumes of data. Set up as a cluster, the Hadoop platform allowed new nodes to be added to handle the ever-growing volume of data. It also enabled the client to process massive volumes of data smoothly, without performance degradation.
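To give a rough sense of how a cluster might need to grow with the data, the following sketch estimates storage and node counts from the 8 TB monthly growth figure quoted earlier. Only the growth rate comes from the case study. The replication factor of 3 is the HDFS default, and the usable capacity per node is a purely hypothetical value. The estimate ignores compression, temporary job space, and the fact that the most recent 3 months of data stay in Teradata.

```python
import math

MONTHLY_GROWTH_TB = 8     # from the case study
REPLICATION_FACTOR = 3    # HDFS default (dfs.replication)
USABLE_TB_PER_NODE = 24   # hypothetical: usable disk per commodity node


def raw_storage_tb(months: int) -> int:
    """Raw disk needed to hold `months` of growth, including HDFS replicas."""
    return months * MONTHLY_GROWTH_TB * REPLICATION_FACTOR


def nodes_needed(months: int) -> int:
    """Data nodes needed for that raw storage, rounded up."""
    return math.ceil(raw_storage_tb(months) / USABLE_TB_PER_NODE)
```

Under these assumptions, one year of growth is 96 TB of logical data, or 288 TB of raw storage after replication, which would occupy 12 such nodes.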