
Sourabh Khadiwala

@sourabhkhadiwala

Architect at Cognizant

Pune

Cognizant · Truba Institute of Engg. & Information Technology

Sourabh is a data warehousing and big data professional with over 11 years of experience providing solutions to complex business problems. He specializes in consulting and implementing big data solutions using platforms like Cloudera, Spark, and Kafka. His expertise spans ETL processes with Informatica, data virtualization using Denodo, and managing data governance initiatives. He has extensive experience in leading end-to-end big data projects within Agile frameworks.

Experience

Architect

Cognizant

Aug 2018 - Present

Providing big data solutions and contributing to projects as an individual contributor. Designing and building data pipelines for ingesting data into the Data Hub, along with solutions to expose that data to users. Running PoCs and PoTs for different tools and technologies. Connecting with data stewards and the data governance board to understand incoming data and identify sensitive data. Contributing to and driving RFPs for different business groups. Creating Spark applications and Impala/Kudu SQL for data processing. Driving the cluster migration effort from CDH to CDP.
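As an illustration of the kind of Impala/Kudu SQL processing described above, a Kudu-backed table and upsert might look like the following sketch (all table and column names are hypothetical, not taken from the actual projects):

```sql
-- Hypothetical Kudu-backed table created through Impala
CREATE TABLE customer_events (
  event_id BIGINT,
  customer_id BIGINT,
  event_type STRING,
  event_ts TIMESTAMP,
  PRIMARY KEY (event_id)
)
PARTITION BY HASH (event_id) PARTITIONS 8
STORED AS KUDU;

-- Upsert newly ingested records from a staging table
UPSERT INTO customer_events
SELECT event_id, customer_id, event_type, event_ts
FROM staging_events;
```

Kudu's upsert semantics make this pattern convenient for continuously refreshed Data Hub tables, since re-arriving keys overwrite rather than duplicate.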

Tech Lead

Principal Global Services Pvt. Ltd

Oct 2016 - Aug 2018

Designing and implementing solutions on big data technologies and exposing them to end users. Coordinating with architects for design reviews. Setting up Flume configurations to pull near-real-time updates from the messaging queue and push them to the data reservoir (HDFS). Encrypting personally identifiable information and personal health information using HP Voltage. Creating Sqoop jobs for the initial load of tables into the data reservoir (HDFS). Creating Hive tables and views on top of files placed in the data reservoir (HDFS). Developing Spark jobs as per requirements. Creating virtualized views from multiple sources (Oracle, DB2, Hive, files, etc.) as per business requirements. Providing demos to business users after every sprint on the business value the data will give them. Interacting with business analysts and business partners to understand the actual use of the data and helping them gather business requirements by providing technical details about the data sources.
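A minimal Flume agent definition of the sort described above might look like this sketch (agent, broker, topic, and path names are hypothetical; the real configuration depended on the messaging queue in use):

```
# Hypothetical Flume agent: Kafka source -> memory channel -> HDFS sink
agent.sources = kafka-src
agent.channels = mem-ch
agent.sinks = hdfs-sink

agent.sources.kafka-src.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.kafka-src.kafka.bootstrap.servers = broker1:9092
agent.sources.kafka-src.kafka.topics = member-updates
agent.sources.kafka-src.channels = mem-ch

agent.channels.mem-ch.type = memory
agent.channels.mem-ch.capacity = 10000

agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = mem-ch
agent.sinks.hdfs-sink.hdfs.path = /data/reservoir/landing/%Y/%m/%d
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
```

The date-escaped HDFS path partitions the landing zone by day, which keeps the Hive tables built on top of these files easy to query incrementally.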

Associate Consultant

Principal Global Services Pvt. Ltd

Jun 2012 - Sep 2016

Technical Associate

Tech Mahindra Ltd

Apr 2010 - May 2012

DTI (Data Transformation and Integration layer) is the central component between the Data Fabric layer and the Data Source layer, playing the critical role of updating the Data Fabric repositories with enterprise data from various AT&T data sources. DTI uses best-of-breed technologies to extract data from a variety of source systems (relational, mainframe, flat files, etc.) in near real time where possible, or through a batch interface, and to update the Data Fabric repositories directly. Responsibilities included receiving real-time feeds from GoldenGate sources, loading them into DTI databases, applying business rules for transformation and integration, and loading the data in near real time into Data Fabric target databases using GoldenGate Change Data Capture extracts and replicats; and receiving batch feeds in various formats (flat file, mainframe EBCDIC, XML, etc.), identifying daily deltas, then integrating, transforming, and loading them into target databases using Informatica. The Call Center Performance Management program aggregates data from various systems into a comprehensive set of reporting tools to improve business operations and individual performance.
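For illustration, a GoldenGate Replicat parameter file for the kind of near-real-time apply described above might look like the following sketch (the process name, schemas, and table names are hypothetical, not taken from the actual project):

```
-- Hypothetical GoldenGate Replicat applying CDC changes to a target database
REPLICAT rdti01
USERID ogg_user, PASSWORD *****
ASSUMETARGETDEFS
DISCARDFILE ./dirrpt/rdti01.dsc, APPEND
MAP src_schema.orders, TARGET fabric_schema.orders;
MAP src_schema.customers, TARGET fabric_schema.customers;
```

Each MAP clause routes captured changes from a source table in the trail file to its Data Fabric target, which is how the extract/replicat pairs keep the repositories current without batch windows.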

Education

Truba Institute of Engg. & Information Technology

Bachelor of Computer Science and Engineering

Jan 2009

Graduated from Rajiv Gandhi University (Bhopal).

Skills

Linux
Windows
Oracle 10g
DB2
SQL
Business Objects
Tableau
Informatica Power Center
Informatica Data Quality
Flume
Sqoop
Kafka
Spark
Hive
Impala
Kudu
Oozie
Scala
GIT
SVN
Denodo Data Virtualization
Agile
Scaled Agile
Waterfall
Data Warehousing
Business Intelligence
Data Governance
Encryption
ETL