Akash Ghadage
@akashghadage
Azure Data Engineer at Tata Consultancy Services
Pune, India
Akash is a Databricks Certified Azure Data Engineer with over 2.5 years of experience in designing and optimizing high-performance data pipelines. He is proficient in Databricks, Apache Spark, PySpark, Python, and various Azure cloud services, including Azure Data Factory and ADLS. His expertise lies in supporting advanced analytics and AI/ML initiatives for international clients.
Experience
Azure Data Engineer
Tata Consultancy Services
Played a key role in the development and operations of a data lakehouse built on Azure Databricks to process ERP and non-ERP data, supporting AI/ML and BI use cases.
Developed a robust, metadata-driven ETL framework based on the medallion architecture to process 14K+ tables and terabytes of data using Databricks Auto Loader, Delta Lake, Python, PySpark, and Apache Spark.
Discovered and implemented 2 key optimization strategies and fine-tuned Spark jobs, resulting in a 30-40% reduction in operational costs and improved performance.
Transitioned 200+ Databricks workloads from Databricks Runtime 13.3 LTS to 15.5 LTS, improving performance and stability (awarded by TCS).
Implemented Delta Lake optimization techniques, including partitioning, data skipping, OPTIMIZE, Z-Ordering, and liquid clustering, reducing query time by 40%.
Created Python automation scripts that generate alerts for expiring Personal Access Tokens (PATs), eliminating authentication and integration failures between Databricks and other systems/services.
Used PySpark to create a recovery procedure that addressed data loss issues, ensuring data integrity and on-time availability.
Migrated data and applications from dedicated Azure IaaS/PaaS (Azure Databricks, Azure Data Factory) to a shared Azure environment, boosting processing efficiency by 30% and reducing operational costs by 20%.
Engineered scalable data ingestion pipelines in Azure Data Factory to ingest 5 TB of data daily from diverse sources (Azure Blob Storage, Azure Data Lake Storage, Azure SQL, and SharePoint) into ADLS Gen2, with 99.9% uptime.
Created and configured datasets and linked services to integrate data across systems and platforms.
Restructured and optimized Databricks notebooks using Python, PySpark, and Spark SQL, transforming data across multiple layers, including L0 (raw), L1 (harmonized), and L1+ (semantic), while ensuring data consistency, accuracy, and eff
Education
Sanjeevan Engineering and Technology Institute
B.Tech
Computer Science and Engineering
Licenses & Certifications
Data Engineer Professional
Databricks
Azure Data Engineer (DP-203)
Microsoft
AZ-900
Microsoft
AI-900
Microsoft
DP-900
Microsoft