The client is a UK-based corporate body that underwrites general insurance and reinsurance. With roots in marine insurance, the client underwrites many types of policies, covering marine, property, aviation, energy, casualty, and motor risks. Known as a specialist insurance market, the client has 54 agencies maintaining 80 syndicates and an established presence worldwide, especially in Europe and North America.
The client wanted to mine text from standard PDF files and scanned image documents into an Excel sheet, to be used as data entry input for an Open-Twin application. The prerequisites for this case were:
To resolve this business challenge, our team first used PyPDF2 to distinguish editable PDF files from scanned ones. We then built an ML-based CRF/HMM (conditional random field / hidden Markov model) model for named entity recognition (NER) tagging, which mined the text the client expected.
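One common way to separate editable PDFs from scanned ones is to check whether each page has an extractable text layer. The sketch below illustrates that idea and is not the project's actual code: the function names, the `min_chars_per_page` threshold, and the "mixed" category are assumptions made for illustration. The only PyPDF2 calls used are `PdfReader`, `reader.pages`, and `page.extract_text()`.

```python
from typing import Iterable, List


def classify_pdf(page_texts: Iterable[str], min_chars_per_page: int = 20) -> str:
    """Label a PDF 'editable', 'scanned', or 'mixed' from its per-page text.

    A page counts as having a text layer when its extracted text, stripped of
    whitespace, has at least `min_chars_per_page` characters. The threshold is
    an assumed value and would need tuning on real documents.
    """
    texts = list(page_texts)
    if not texts:
        # No pages or no text at all: treat as scanned so it is routed to OCR.
        return "scanned"
    has_text = [len((t or "").strip()) >= min_chars_per_page for t in texts]
    if all(has_text):
        return "editable"
    if not any(has_text):
        return "scanned"
    return "mixed"


def extract_page_texts(path: str) -> List[str]:
    """Return the extracted text of every page using PyPDF2."""
    # Imported lazily so classify_pdf can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # Pages without a text layer typically yield an empty string.
    return [page.extract_text() or "" for page in reader.pages]
```

In use, `classify_pdf(extract_page_texts("policy.pdf"))` (a hypothetical file name) would decide whether a file can go straight to text extraction or needs OCR first.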
Our ML-based text mining solution helped the client to:
Aptus Data Labs took the following steps to address this business challenge.
The client was able to mine the required text quickly from several editable and scanned PDF files. With the processed data collected in an Excel sheet as per the prerequisites, the client could use the output directly as data entry input. The newly built ML-based text mining application helped the client improve business performance and save time and money.
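To make the step from tagged text to a data entry sheet concrete, the sketch below groups NER output into one row per document. It rests on several assumptions: that the tagger emits BIO-style tags, that labels such as `POLICY_NO` and `INSURED` exist (they are hypothetical), and that CSV is an acceptable stand-in for the Excel sheet described above, since Excel opens CSV files directly.

```python
import csv
import io
from typing import Dict, Iterable, List, Optional, Tuple


def group_entities(tagged: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Merge BIO-tagged tokens into one string per label (first mention wins)."""
    fields: Dict[str, str] = {}
    label: Optional[str] = None
    tokens: List[str] = []

    def flush() -> None:
        nonlocal label, tokens
        if label is not None and label not in fields:
            fields[label] = " ".join(tokens)
        label, tokens = None, []

    for token, tag in tagged:
        if tag.startswith("B-"):
            flush()  # a new entity begins, so close the previous one
            label, tokens = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            tokens.append(token)  # continuation of the current entity
        else:
            flush()  # "O" tag or inconsistent "I-" tag ends the entity
    flush()
    return fields


def rows_to_csv(rows: Iterable[Dict[str, str]], columns: List[str]) -> str:
    """Serialize rows to CSV text, leaving missing fields blank."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({c: row.get(c, "") for c in columns})
    return buf.getvalue()
```

Leaving missing fields blank, rather than dropping the row, keeps every document visible in the sheet so that a reviewer can fill in the gaps by hand.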