Business Objective / Goal
To automate and digitize the updating of medicine preparation templates by replacing placeholder text with meaningful chemical and process information. This eliminates manual effort, reduces human error, and speeds up processing across large volumes of documents.
Solutions & Implementation
- Developed a custom Named Entity Recognition (NER) solution using both POS tagging and LSTM-based deep learning models to identify contextual entities around placeholders.
- Implemented logic to scan for meaningful text around placeholder tags (marked by ##) and narrowed down the text window for more accurate entity prediction.
- Replaced blank sections in .odt templates with relevant chemical names or phrases without losing sentence semantics.
- Designed the workflow to handle both individual documents and folders containing multiple templates.
- Deployed the solution in RapidMiner for seamless execution and batch processing.
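The placeholder scanning, replacement, and file-or-folder handling described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual code: it assumes a bare ## marks each blank, and the names window_around, context_windows, fill_placeholders, and collect_templates are hypothetical, as is the predict callback, which stands in for the NER model. It works on plain text already extracted from a template; reading and writing .odt files is not shown.

```python
import re
from pathlib import Path
from typing import Callable, List

# Assumption: a bare "##" marks each blank in the template text.
PLACEHOLDER = re.compile(r"##")


def window_around(text: str, start: int, end: int, size: int) -> List[str]:
    """Return up to `size` words on each side of the span text[start:end]."""
    # Skip words that contain other placeholders so they do not pollute the context.
    left = [w for w in text[:start].split() if "##" not in w]
    right = [w for w in text[end:].split() if "##" not in w]
    return left[max(0, len(left) - size):] + right[:size]


def context_windows(text: str, size: int = 3) -> List[List[str]]:
    """Collect one context window per placeholder, in document order."""
    return [
        window_around(text, m.start(), m.end(), size)
        for m in PLACEHOLDER.finditer(text)
    ]


def fill_placeholders(
    text: str, predict: Callable[[List[str]], str], size: int = 3
) -> str:
    """Replace each placeholder with whatever `predict` returns for its window.

    `predict` is a hypothetical stand-in for the entity model: it receives the
    surrounding words and returns the text to insert.
    """
    return PLACEHOLDER.sub(
        lambda m: predict(window_around(text, m.start(), m.end(), size)), text
    )


def collect_templates(path: str) -> List[Path]:
    """Accept a single file or a folder and return the templates to process."""
    p = Path(path)
    if p.is_dir():
        return sorted(p.glob("*.odt"))
    return [p]


if __name__ == "__main__":
    sample = "Dissolve ## in 50 ml of water and heat"
    # Toy predictor that ignores its input; a real model would use the window.
    print(fill_placeholders(sample, lambda words: "sodium chloride"))
    # prints: Dissolve sodium chloride in 50 ml of water and heat
```

Passing the predictor in as a callback keeps the window logic independent of the model, so a POS-based or LSTM-based predictor could be swapped in without changing the scanning code.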
Major Technologies Used
- Python – Core scripting and data processing
- Keras – LSTM-based deep learning model development
- Gensim – Semantic text processing
- PowerShell Scripts – Document parsing and batch execution
- RapidMiner – Workflow orchestration and user interface for business users