Pavithra Sridhar
@pavithrasridhar
Data Engineer at Altimetrik India Private Limited
Chennai, TN
Pavithra is a Data Engineer with 3 years of experience designing and developing robust data pipelines. She has strong expertise in Big Data technologies, including Hadoop, Spark, Hive, and Sqoop, and uses Python and SQL for data processing. Her skills include translating complex business requirements into efficient data models and optimizing data workflows.
Experience
Data Engineer
Altimetrik India Private Limited
Worked closely with Operations to identify customer needs and demands. Designed and developed a data ingestion pipeline to extract data for insights. Translated business questions into quantitative queries and collected the necessary data. Extracted and analyzed data to identify key metrics and transform it into meaningful information. Collected, cleansed, and provided structured and unstructured data for business initiatives. Developed tables and views in Snowflake for use in visualization. Ingested data from multiple sources using a combination of SQL and Python to create data views for BI tools.
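A minimal sketch of the view-building pattern described above. Since the warehouse itself isn't shown here, SQLite stands in for Snowflake, and the table, view, and column names are illustrative assumptions, not the actual schema:

```python
import sqlite3

# In-memory SQLite stands in for the Snowflake warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        region   TEXT,
        amount   REAL
    )
""")
conn.executemany(
    "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
    [(1, "South", 120.0), (2, "South", 80.0), (3, "North", 50.0)],
)

# A view that aggregates raw rows into a BI-ready shape, analogous to the
# Snowflake views consumed by visualization tools.
conn.execute("""
    CREATE VIEW revenue_by_region AS
    SELECT region, SUM(amount) AS total_revenue, COUNT(*) AS order_count
    FROM orders
    GROUP BY region
""")

# Downstream BI tools would simply SELECT from the view.
revenue = {
    region: total
    for region, total, _count in conn.execute(
        "SELECT region, total_revenue, order_count FROM revenue_by_region"
    )
}
print(revenue)
```

The point of the pattern is that the aggregation logic lives in the view, so every BI dashboard queries one consistent definition instead of re-deriving it.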
Programmer Trainee
Cognizant Technology Solutions
Monitored critical and daily production jobs and handled abends as they occurred. Managed scheduling, holding, and re-running jobs per developer requests. Ensured higher-priority issues were resolved by the respective Core Teams/Development within the agreed timeframe. Provided daily status reports for critical, abended, and daily production jobs. Performed and managed the weekly, monthly, and bi-monthly production releases.
Product Engineer
Fintuple Technologies Private Limited
Designed and developed a web crawling pipeline using Scrapy to extract data for each index. Designed an ingestion layer to load crawled raw data into HDFS. Designed and implemented various preprocessing modules using PySpark. Designed a data warehouse using Hive, and created and managed Hive tables in Hadoop. Implemented a data export module to move processed data from Hadoop to a relational database (MySQL) using Sqoop. Worked on optimization tasks to improve Hive query and Spark performance. Actively involved in data validation testing to verify the correctness of processed and crawled data. Managed a standalone Spark cluster and an on-premises Big Data ecosystem. Designed and built web crawling modules with Scrapy to extract stock price data and download sector-based PDFs. Developed a module to load crawled data and data extracted from PDFs into HDFS. Designed and implemented a PDF content extraction module using Camelot and loaded the extracted data into the appropriate Hive tables. Worked on a data processing module to process extracted data using PySpark. Implemented an export module that pushes processed data from Hive to a relational database using Sqoop for Web UI access. Implemented a data validation module that verifies processed data by extracting it from MySQL using SQLAlchemy and validating it automatically (data quality checks).
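The automated data-quality check at the end of the pipeline above can be sketched roughly like this. sqlite3 stands in for the MySQL-plus-SQLAlchemy setup, and the table name, columns, and rules are illustrative assumptions rather than the actual schema:

```python
import sqlite3

# SQLite stands in for the MySQL target reached via SQLAlchemy (illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stock_prices (
        symbol      TEXT,
        trade_date  TEXT,
        close_price REAL
    )
""")
conn.executemany(
    "INSERT INTO stock_prices VALUES (?, ?, ?)",
    [
        ("INFY",  "2024-01-02", 1520.5),
        ("TCS",   "2024-01-02", 3701.0),
        ("TCS",   "2024-01-03", None),   # missing price -> should be flagged
        ("WIPRO", "2024-01-03", -4.0),   # negative price -> should be flagged
    ],
)

def run_quality_checks(conn):
    """Run each rule against the exported table; return (rule, bad_row_count)
    pairs for rules that any row fails."""
    rules = {
        "close_price is NULL": "close_price IS NULL",
        "close_price <= 0":    "close_price <= 0",
    }
    failures = []
    for name, predicate in rules.items():
        (count,) = conn.execute(
            f"SELECT COUNT(*) FROM stock_prices WHERE {predicate}"
        ).fetchone()
        if count:
            failures.append((name, count))
    return failures

failures = run_quality_checks(conn)
print(failures)
```

Expressing each rule as a SQL predicate keeps the checks declarative: new rules are one line each, and the same loop reports every violation count after each pipeline run.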
Education
Ethiraj College For Women
Bachelor of Computer Applications
Computer Applications
Licenses & Certifications
Azure DevOps Boards for Project Managers/Analyst/Developers course
Udemy
BigData Analysis: Hive, Spark SQL, DataFrames course
Coursera
Learning PySpark course
Udemy
Data Analysis using Pyspark
Coursera
BigData Essentials: HDFS, MapReduce and Spark RDD course
Coursera
Data Structures and Algorithms
Coursera
Object Oriented Programming in Python
Coursera
Design Patterns with Python
PluralSight