Pawan Shukla
@pshukla6270
Software Engineer at HCLTech
Noida, Uttar Pradesh, India
Software Engineer at HCLTech with expertise in ETL development, data engineering, and cloud technologies. Experienced in designing and optimizing ETL pipelines using Pentaho Data Integration and PySpark, and proficient in Oracle SQL and performance tuning. Holds Google Cloud certifications as an Associate Cloud Engineer and Professional DevOps Engineer.
Experience
Software Engineer
HCLTech
Client: Hewlett Packard Enterprise. Designed, developed, and enhanced ETL jobs and transformations using Pentaho Data Integration (PDI / Kettle) to support enterprise reporting and analytics needs. Extracted data from multiple heterogeneous sources such as Oracle, SQL Server, flat files, and APIs, transforming and loading it into data warehouses and data marts. Implemented complex data transformation, cleansing, standardization, and validation rules to ensure high data quality. Developed and maintained daily full load, incremental load, and Change Data Capture (CDC) processes. Optimized ETL performance by tuning SQL queries, transformations, database indexes, and job execution parameters, significantly reducing load times. Designed robust error handling and auditing mechanisms to track job execution status and data discrepancies. Scheduled and automated ETL workflows using Pentaho Scheduler, cron jobs, and enterprise scheduling tools, ensuring timely and reliable data availability. Worked closely with business analysts, reporting teams, and stakeholders to understand data requirements and deliver scalable ETL solutions. Managed ETL deployments across Development, QA, UAT, and
Graduate Engineer Trainee
HCLTech
Designed, developed, and maintained scalable ETL pipelines for batch and real-time data ingestion from multiple sources into cloud data lakes and data warehouses. Built and optimized PySpark-based ETL jobs to cleanse, normalize, validate, and transform large-scale datasets (TB-level), ensuring data quality and consistency. Implemented data modeling, partitioning, indexing, and schema evolution strategies to improve query performance and reduce processing and storage costs. Developed and optimized Oracle SQL and PL/SQL procedures, functions, and packages, and tuned queries for high-volume transactional and analytical workloads. Managed end-to-end data workflows, including scheduling, monitoring, error handling, retries, and SLA compliance, using orchestration frameworks. Ensured data security, governance, and compliance by implementing access controls, auditing, and encryption standards. Performed performance tuning for ETL jobs and Oracle databases using execution plans, indexing strategies, and resource optimization. Collaborated with business analysts and stakeholders to translate business requirements into reliable, production-ready data solutions. Supported reporting an
Education
IIMT COLLEGE OF ENGINEERING
B.Tech
Electronics and Communication Engineering
Licenses & Certifications
Associate Cloud Engineer
Google Cloud
Professional DevOps Engineer
Google Cloud