• Locations Open to Work (one city, country per line)
    United States
  • Allow Profile Promotion To Recruiters/Companies Yes
  • Allow Profile Promotion on Social Media No
  • Allow Profile Promotion to Alumni Yes
  • Viewed 379

About me



  • 2020 - 2022


    • Understood requirements from the architects and proposed designs accordingly.
    • Analyzed the requirements and framed the business logic for the ETL process.
    • Evaluated existing OLAP cubes and understood dependent fact tables, dimension tables and derived views. Created star schema data model for the same.
    • Extracted data from CSV, JSON, and Parquet files in S3 buckets using Python and loaded it into AWS S3 and Snowflake.
    • Migrated existing on-premises data to the data lake (S3, Redshift) using data ingestion services such as Apache NiFi, AWS DMS, and AWS Glue.
    • Developed PySpark scripts for data transformation, movement, and control.
    • Configured AWS CloudWatch to monitor the data lake.
    • Orchestrated data pipelines using AWS Step Functions and explored Apache Airflow.
    • Hands-on experience ingesting data from Oracle, MS SQL Server, APIs, etc. into AWS S3.
    • Wrote Redshift queries to create data products; tuned and optimized performance using distribution keys and Redshift Spectrum.
    • Wrote Glue jobs for data transformation and ingestion.
    • Secured sensitive data in the data lake using masking, hashing, and tokenization.
    • Wrote Python scripts to handle metadata throughout the pipeline and created the Glue Data Catalog using AWS Lambda.
    • Wrote AWS Lambda functions to extend other AWS services, such as Glue, with custom logic and backend services that operate at AWS scale and performance.
    • Created AWS Lambda functions configured to receive events from S3 buckets.
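    The S3-to-Lambda pattern in the bullets above can be sketched as below. This is a minimal illustration using only the standard library, not the production code: it parses the event payload that S3 delivers to a Lambda handler and returns the affected objects; in practice those keys would then be passed to boto3 Glue API calls (an assumption noted in the comments, not shown here).

    ```python
    import json
    import urllib.parse


    def lambda_handler(event, context):
        """Minimal sketch of a Lambda handler wired to S3 PUT events.

        Extracts the (bucket, key) pairs from the S3 event payload.
        A real handler would hand these on to boto3 Glue calls
        (e.g., to update the Glue Data Catalog) -- omitted here.
        """
        objects = []
        for record in event.get("Records", []):
            s3 = record["s3"]
            bucket = s3["bucket"]["name"]
            # S3 event keys arrive URL-encoded (spaces become '+')
            key = urllib.parse.unquote_plus(s3["object"]["key"])
            objects.append({"bucket": bucket, "key": key})
        return {"statusCode": 200, "body": json.dumps(objects)}


    if __name__ == "__main__":
        # Simulate the event S3 delivers for one uploaded object
        sample_event = {
            "Records": [
                {"s3": {"bucket": {"name": "data-lake-raw"},
                        "object": {"key": "landing/2022/orders+01.json"}}}
            ]
        }
        print(lambda_handler(sample_event, None))
    ```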

  • 2018 - 2020


    • Implemented PySpark jobs for data transformation on AWS EMR.
    • Wrote Python scripts to automate spinning up and terminating EMR clusters, running PySpark jobs on demand.
    • Implemented Spark using Python and Spark SQL for faster testing and processing of data from multiple sources.
    • Created Airflow scheduling scripts in Python.
    • Wrote AWS Lambda functions with custom logic to drive AWS services such as EMR.
    • Created AWS Lambda functions configured to receive events from S3 buckets.
    • Implemented Python scripts to create the Glue Data Catalog, making S3 data queryable from Athena.
    • Wrote Python scripts to parse JSON documents and load the data into S3.
    • Worked on data cleaning and reshaping, generated segmented subsets using NumPy and Pandas in Python.
    • Designed and implemented data loading and aggregation frameworks and jobs able to handle diverse data using Spark.
    • Designed and deployed rich Tableau visualizations with drill-down and drop-down menu options and parameters.
    • Created stored procedures in PL/SQL.
    • Hands-on experience in SQL query performance tuning and optimization.