2022 - Present
• Implemented a BERT-based multi-class classification model to extract store names from merchant descriptions, improving coverage by 11% over regex-based pattern matching.
• Scaled the solution to 900 million merchant descriptions using GPU-accelerated PyTorch on AWS SageMaker. Deployed the model with Docker on AWS SageMaker Inference and scheduled nightly runs with Airflow.
2021 - 2021
• Adapted to the DaVinci HL7 interoperability standard by building Airflow DAGs on Google Cloud Platform (GCP) and writing custom Python scripts and Dataflow jobs for each resource.
• Partnered with two teammates to move 200 million entries into the FHIR Store, ensuring CVS Health complies with US healthcare standards and designing seamless data exchange.
2020 - 2021
• Led and mentored a team of four data scientists, in partnership with operations, to automate the deduplication process, cutting turnaround time sevenfold.
• Engineered a fraud detection pipeline using an ensemble of neural network and XGBoost models.
2019 - 2020
• Researched and developed an algorithm to estimate the probable location of POS terminals and notify customers of nearby offers. Segmented customers using k-means clustering to target the right offers, increasing revenue by 32% on average.
• Applied Natural Language Processing (NLP), vectorizing text with Universal Sentence Encoder sentence embeddings in TensorFlow Keras and pairing them with LightGBM, leading to a $1.9 million collections deal with Citizens Bank.
2017 - 2019
Xceedance Consulting Ltd
• Instituted customer behavioral scoring by applying Named Entity Recognition (NER) to the past six months of ongoing insurance claims data, boosting customer retention by 17% over one calendar year.
• Extracted payment information from 100K invoices by processing payment documents with OpenCV and Tesseract OCR to automate the authentication process, reducing manual labor by 60%.