About Jaskirat Singh
About me
I am a technology enthusiast with a passion for Big Data Analytics. I have worked across several fields, including Data Modeling, Data Analysis, Data Engineering, Machine Learning, Deep Learning, Customer Relations, and Business Development.
I believe that in today’s rapidly evolving technological landscape, more and more people and organizations are turning to their data to inform decisions, shape public policy, and make things more accessible. Some say data is the new oil: both are valuable commodities that are more useful in a refined state. Today, our ability to process and understand large volumes of data is not only a decision-making tool or a source of revenue but also a driver of change, creating jobs and markets we never knew could exist.
Professional Experience:
I am an AWS Cloud Data Engineer with 4.5 years of experience in Data Analytics and Engineering across 3 companies and 4 clients. I have contributed to diverse projects spanning Geospatial Intelligence, Healthcare, and Cybersecurity. My work ranges from creating applications that detect and avoid network threats, to designing, implementing, and managing scalable Data Pipelines, Database schemas, and Data Warehouses for optimizing Out-of-Home marketing campaigns, to Feature Engineering for detecting Lung Cancer stages. My expertise lies in crafting robust Data Architecture and Machine Learning models.
Technical Expertise:
Languages and Databases: Python, MySQL, PostgreSQL, MongoDB, Scala, R
Big Data: Apache Spark, Hadoop, Airflow, SQL, Superset, Tableau, Snowflake, Kafka, ETL, DBT
Cloud Platform:
AWS: Glue, Step Functions, Lambda, S3, Athena, SageMaker, EC2, EMR, SNS, SQS, Redshift, CloudWatch, IAM
Libraries: PySpark, Boto3, scikit-learn, Pandas, NumPy, Deequ, Great Expectations, Matplotlib, Keras, NLTK, Gensim, PyTorch
Other tools: CI/CD, Docker, Terraform, IaC, Agile, Git, GitHub, GitLab
Competent in: Data Analysis, Data Engineering, Databases, Data Modeling, Data Visualization, and Machine Learning
Get in touch! I’m always excited to interact with other data specialists, share my views, and consider working together.
Education
-
2023 - 2024
-
2014 - 2018
Guru Gobind Singh Indraprastha University
Bachelor of Technology, Computer Science and Engineering
CGPA - 8.4/10.0
Experience
-
2024 - 2024
University at Buffalo
Graduate Teaching Assistant
Course: Data Intensive Computing
Responsibilities:
• Mentor students on programming assignments and projects, and help them implement solutions using Big Data tools such as Hadoop and Spark, ML algorithms, and databases.
• Develop and administer weekly quizzes to assess student understanding and progress.
• Conduct demonstrations and workshops to illustrate practical applications of Big Data tools and techniques.
• Collaborate with the professor in the grading process and provide valuable insights for course improvement.
-
2021 - 2023
Tiger Analytics
Machine Learning Engineer
Remote – Bangalore, India
Platform, Data, and ML Engineering
• Collaborated with 4 domain experts to create ML models for Lung Cancer detection and stage prediction
• Engineered over 250 features from large-scale clinical datasets using PySpark, S3, and Glue jobs
• Productionized end-to-end Machine Learning workflows using SageMaker Pipelines
• Transitioned data pipelines from Step Functions to Airflow, expanding integrations beyond AWS
• Addressed compliance issues for 4 AWS services and improved security using Boto3 and Lambda
-
2019 - 2021
Sahaj Software Solutions
Data Engineer
Bangalore, India
Data Science and Engineering
• Implemented data-centric solution for optimizing out-of-home ads, using geospatial data for real-time analytics
• Deployed ETL processes to ingest data from 4 vendors into S3, which improved data accessibility by 30%
• Orchestrated 10+ data pipelines and developed algorithms to generate audience insights and impressions
• Scaled platform to handle 500+ GB of data, managing 10+ billion observations every week
• Discovered and resolved 3 crucial data quality issues using Deequ, improving data credibility and predictive accuracy
• Built 5+ impactful Superset dashboards using SQL and Athena, improving campaign planning for the US and UK regions
• Tailored RASA chatbot for top Indian transport firm, refining core NLP features – NEL and intent ranking
-
2018 - 2019
QOS Technology
Trainee Software Developer - R&D
Bangalore, India
Data Analysis, Machine Learning, and Deep Learning
• Designed a system to block IP addresses/domains on Checkpoint Firewall based on score threshold
• Analyzed unstructured real-time firewall logs, and created 2 attributes to determine severity levels
• Developed a Neural Network to calculate risk score of incident logs using Python and TensorFlow
• Automated repetitive SOC analysts’ tasks using APIs in Python, leading to 5x faster incident response time