Education
-
2021 - Present
State University of New York at Buffalo
Master of Science in Computer and Information Sciences
Coursework: Computer Vision, Pattern Recognition, Distributed Systems, Algorithms and Data Structures, Computational Linguistics, Data Intensive Computing, Algorithms for Modern Computing
-
2021 - 2021
DeepLearning.ai
Deep Learning Specialization
Coursework: CNN, Transformers, LSTM, GAN, VAE, Hyperparameter Tuning, Structuring ML Projects
Experience
-
2022 - Present
Center for Unified Biometrics and Sensors, University at Buffalo
Research Assistant
◦ IARPA-BRIAR: Developing face recognition systems for the IARPA-BRIAR program, including algorithms that recognize faces in low-resolution images by improving state-of-the-art face recognition pipelines such as AdaFace with deep metric learning loss formulations. Tech Stack: PyTorch, CNN, ProxyNCA
◦ Generating Clinical Biomarker Profiles: Working with the Department of Pharmaceutical Sciences, University at Buffalo, to model clinical biomarker profiles of under-represented groups. Developed a conditional GAN to learn a panel of 16 diabetes-relevant biomarkers from the National Health and Nutrition Examination Survey (paper). Tech Stack: Python, GAN, VAE
-
2019 - 2021
FullContact Technologies
Data Science Lead
◦ Look Alike: Lead generation API enabling clients to target prospective customers. Developed the audience segmentation model, covering data pre-processing, feature engineering and training, using a SageMaker inference pipeline. Tech Stack: Python, XGBoost, K-means, T-Shap, Spark, Apache Airflow, AWS Glue, AWS SageMaker
◦ Record Linkage: Improved the address matching capability of FullContact's identity graph using fuzzy features. Used fuzzy scores such as Jaro-Winkler, Soundex and Double Metaphone to gain a 3% increase in address matching. Tech Stack: Python, Spark, Apache Airflow, AWS SageMaker, Athena
-
2010 - 2019
Nielsen
Data Scientist
◦ Cross Channel Attribution: Continuous development of the top-down marketing attribution platform. Identified and implemented enhancements in feature engineering and goodness of fit, drawing on hands-on model development experience. Interpreted model results using cross-channel attribution reports (HALO). Mentored junior data scientists on end-to-end delivery of data refreshes. Tech Stack: R, Regression, ARIMA, PLSR
◦ ETL: Developed tools to pull data from sources such as SFTP, ad servers and email. Optimized data pre-processing scripts using advanced data processing and manipulation packages such as data.table and dplyr. Tech Stack: R, SQL, Unix, Python
◦ EDA: Extensive data visualization using tools such as Tableau, RStudio, and MS Excel. Finding correlation patterns, detecting tribal knowledge. Created data sign-off documents with in-depth analysis of the data discovery process and client feedback. Tech Stack: R-Shiny, Tableau, MS Excel
◦ Deployment: ROI maximization by finding the optimized spend allocation using gradient search. Enhanced the optimization module using data.table for up to 10x improvement in response time. Optimized pre-processing and post-processing routines for low-latency batch inference. Tech Stack: R, VoltDB, Jenkins