
Anuj Srivastava

@anuj77

Software Engineer at Cloudkeeper

Noida, Uttar Pradesh, India

Cloudkeeper · Chandigarh University

Data Engineer with 2 years of experience designing and optimizing scalable ETL/ELT pipelines on cloud platforms. Strong hands-on expertise in Apache Spark (PySpark), Databricks-style distributed processing, AWS, and SQL-based transformations. Experienced in building analytics-ready data models, enforcing data quality frameworks, and supporting BI and reporting use cases. Proven track record of improving pipeline performance, reliability, and cost efficiency.

Experience

Software Engineer

Cloudkeeper

Full-time · Jan 2025 - Jan 2026 · Noida, Uttar Pradesh, India

• Designed and maintained scalable ETL/ELT pipelines on AWS using Apache Airflow, dbt, and Spark.
• Orchestrated 10+ production Airflow DAGs with retries, SLAs, sensors, and backfills.
• Built analytics-ready dimensional data models in Snowflake following star schema principles.
• Optimized Spark and SQL workloads, reducing execution time by 60–80%.
• Developed PySpark and Scala-based Spark jobs on EMR for large-scale ingestion from S3.
• Integrated external APIs (BillDesk) and persisted curated datasets into S3 and Snowflake.
• Implemented data quality checks and automated Slack alerts to improve pipeline reliability.

Mathematics AI Trainer

Outlier

Freelance · Sep 2024 - Feb 2025 · California City, CA, USA

• Developed validation workflows for AI-generated mathematical content across calculus, linear algebra, and statistics.
• Conducted 92 quality assurance reviews, maintaining 95%+ accuracy across training datasets.

Data Scientist Intern

Codesoft

Part-time · May 2024 - Jul 2024 · Kolkata, West Bengal, India

• Built data preprocessing pipelines using Python and SQL, processing 10,000+ records for ML model training.
• Performed data collection, cleaning, transformation, and exploratory data analysis to improve model performance.

Education

Chandigarh University

Master of Computer Applications

AI/ML

Aug 2023 - May 2025 · Grade: 6.68

A specialized MCA in AI/ML, focusing on the intersection of big data and automated decision-making. Proficient in Data Structures & Algorithms (DSA) and statistical modeling. Actively applying ML frameworks to bridge the gap between theoretical math and scalable software solutions.

Licenses & Certifications

AWS Partner Accreditation (Technical)

SQL Certification

Python Certification

Skills

Data Engineering
ETL/ELT Pipeline
Data Warehousing
Data Modeling
Data Quality
Snowflake
Databricks
Spark
PySpark
AWS
Athena
Glue
Lambda
SQL
Python
Scala
Git
CI/CD Pipelines
Airflow