About Jaskirat Singh

  • Are you open to working remotely? Yes
  • Locations open to work:
    United States

About me

I am a technology enthusiast with a passion for Big Data Analytics. I have worked across Data Modeling, Analysis, Data Engineering, Machine Learning, Deep Learning, Customer Relations, and Business Development.

I believe that in today’s rapidly evolving technological landscape, more and more people and organizations are turning to their own data to inform decisions, shape public policy, and make things more accessible. Some say data is the new oil: both are valuable commodities that are more useful in a refined state. Today, our ability to process and understand large volumes of data is not only a decision-making tool or a source of revenue but also a driver of change, creating jobs and markets we never knew could exist.

Professional Experience:

I am an AWS Cloud Data Engineer with 4.5 years of experience in Data Analytics and Engineering across 3 companies and 4 clients. I have contributed to diverse projects spanning Geospatial Intelligence, Healthcare, and Cybersecurity. My work ranges from creating applications that detect and avoid network threats, to designing, implementing, and managing scalable Data Pipelines, Database schemas, and Data Warehouses for optimizing Out-of-Home marketing campaigns, to Feature Engineering for detecting Lung Cancer stages. My expertise lies in crafting robust Data Architecture and Machine Learning models.

Technical Expertise:

Languages and Databases: Python, MySQL, PostgreSQL, MongoDB, Scala, R

Big Data: Apache Spark, Hadoop, Airflow, SQL, Superset, Tableau, Snowflake, Kafka, ETL, DBT

Cloud Platform:

AWS: Glue, Step Functions, Lambda, S3, Athena, SageMaker, EC2, EMR, SNS, SQS, Redshift, CloudWatch, IAM

Libraries: PySpark, Boto3, scikit-learn, Pandas, NumPy, Deequ, Great Expectations, Matplotlib, Keras, NLTK, Gensim, PyTorch

Other tools: CI/CD, Docker, Terraform, IaC, Agile, Git, GitHub, GitLab

Competent in: Data Analysis, Data Engineering, Databases, Data Modeling, Data Visualization, and Machine Learning

Get in touch! I’m always excited to interact with other data specialists, share my views, and consider working together.

Education

Experience

  • 2024 - 2024
    University at Buffalo

    Graduate Teaching Assistant

    Course: Data Intensive Computing

    Responsibilities:
    • Mentor students on programming assignments and projects, and help them implement solutions using Big Data tools such as Hadoop and Spark, ML algorithms, and databases.
    • Develop and administer weekly quizzes to assess student understanding and progress.
    • Conduct demonstrations and workshops to illustrate practical applications of Big Data tools and techniques.
    • Collaborate with the professor in the grading process and provide valuable insights for course improvement.

  • 2021 - 2023
    Tiger Analytics

    Machine Learning Engineer

    Remote – Bangalore, India
    Platform, Data, and ML Engineering
    • Collaborated with 4 domain experts to create ML models for Lung Cancer detection and stage prediction
    • Engineered over 250 features from large-scale clinical datasets using PySpark, S3, and Glue jobs
    • Productionized end-to-end Machine Learning workflows using SageMaker Pipelines
    • Transitioned data pipelines from Step Functions to Airflow, expanded integrations beyond AWS
    • Addressed compliance issues for 4 AWS services, improved security using Boto3 and Lambda

  • 2019 - 2021
    Sahaj Software Solutions

    Data Engineer

    Bangalore, India
    Data Science and Engineering
    • Implemented data-centric solution for optimizing out-of-home ads, using geospatial data for real-time analytics
    • Deployed ETL processes to ingest data from 4 vendors into S3, which improved data accessibility by 30%
    • Orchestrated 10+ data pipelines and developed algorithms to generate audience insights and impressions
    • Scaled platform to handle 500+ GB of data, managed 10+ billion observations every week
    • Discovered and resolved 3 crucial data quality issues using Deequ, fostered credibility and predictive accuracy
    • Built 5+ impactful Superset dashboards using SQL and Athena, elevated campaign planning for US & UK regions
    • Tailored RASA chatbot for top Indian transport firm, refined core NLP features – NEL and intent ranking

  • 2018 - 2019
    QOS Technology

    Trainee Software Developer - R&D

    Bangalore, India
    Data Analysis, Machine Learning, and Deep Learning
    • Designed a system to block IP addresses/domains on Check Point Firewall based on score threshold
    • Analyzed unstructured real-time firewall logs and created 2 attributes to determine severity levels
    • Developed a Neural Network to calculate risk score of incident logs using Python and TensorFlow
    • Automated repetitive SOC analyst tasks using APIs in Python, leading to 5x faster incident response time

Skills