Iftikhar H
Principal Data Engineer
Work Experience
Principal Data Engineer
GSK • Freelance
Oct 2022 - Present • 3 yrs 7 mos
• Led a team of 5 data engineers designing and maintaining enterprise-scale ETL/ELT pipelines using Databricks and cloud data infrastructure across AWS and Azure.
• Engineered containerized, Docker- and Kubernetes-based data pipelines using Kafka, Python microservices, and REST APIs, reducing processing time by 40% (illustrative sketch below).
• Designed a scalable, high-performance data architecture supporting real-time analytics across 10+ business units, incorporating cloud data warehousing solutions such as Snowflake, Redshift, and BigQuery.
• Built and optimized ELT pipelines leveraging dbt and Azure Data Factory to support scalable and maintainable data transformation workflows.
• Optimized SQL and Bash scripting for automated data ingestion, transformation, and workflow orchestration.
• Managed Git-based CI/CD pipelines for version control and automated deployment of data pipelines.
• Standardized logging and audit frameworks for ETL processes, enabling faster troubleshooting and root-cause analysis.
• Collaborated with architecture leadership to define governance, lineage, and access frameworks using Unity Catalog and schema registry.
• Tech Stack / Tools: Apache Kafka, Airflow, NiFi, Python, SQL, Bash scripting, Docker, Kubernetes, Azure Cloud, REST APIs, Linux
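A minimal sketch of the kind of Kafka-driven Python ingestion microservice described above, assuming the kafka-python client; the topic name and downstream REST endpoint are hypothetical placeholders, not taken from the profile:

```python
# Illustrative only: a small Kafka-to-REST ingestion service.
# Topic, group, and endpoint names are hypothetical assumptions.
import json

import requests
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "raw-events",                          # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="ingest-service",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    record = message.value
    # Light enrichment before forwarding to a downstream REST API.
    record["source_partition"] = message.partition
    requests.post("http://transform-api.local/records", json=record, timeout=5)
```

Packaged in a Docker image and deployed to Kubernetes, such a consumer scales horizontally through its consumer group, which is the usual way this pattern is containerized.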
Senior Data Engineer
BridgeLinx Technologies • Full-time
Mar 2022 - Sep 2022 • 6 mos
• Built a data warehouse from scratch using AWS Redshift and Python microservices, consolidating 10+ heterogeneous data sources (illustrative sketch below).
• Developed Python-based microservices for automated data ingestion and transformation, improving pipeline efficiency and reducing manual intervention.
• Tech Stack / Tools: Python, AWS Redshift, AWS S3, REST APIs, Microservices Architecture, SQL, Docker, Git, Linux
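A minimal sketch of the consolidation pattern described above: extracts land in S3 and a Python service issues a Redshift COPY. The cluster host, schema, table, bucket, and IAM role names are hypothetical assumptions.

```python
# Illustrative only: loading a staged S3 extract into Redshift with COPY.
import os

import psycopg2  # pip install psycopg2-binary

conn = psycopg2.connect(
    host="my-cluster.abc123.eu-west-1.redshift.amazonaws.com",  # hypothetical
    port=5439,
    dbname="analytics",
    user="loader",
    password=os.environ["REDSHIFT_PASSWORD"],
)

copy_sql = """
    COPY staging.orders
    FROM 's3://example-raw-bucket/orders/2022-05-01/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-loader'
    FORMAT AS JSON 'auto'
    TIMEFORMAT 'auto';
"""

with conn, conn.cursor() as cur:
    cur.execute(copy_sql)  # Redshift pulls the files directly from S3
```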
Senior Data Engineer
TechnoGenics SMC PVT LTD • Full-time
Feb 2021 - Mar 2022 • 1 yr 1 mo
• Designed and maintained large-scale ETL pipelines and a data warehouse that consolidated data from multiple sources, improving data processing speed by 30%.
• Implemented solutions with Python, Kafka, Elasticsearch, FluentD, and GCP, with a particular focus on Google Kubernetes Engine (GKE), supporting 5+ TB of daily data ingestion (illustrative sketch below).
• Tech Stack / Tools: Python, Apache Kafka, Elasticsearch, FluentD, GCP (GKE, Cloud Storage), Docker, Kubernetes, SQL, Linux
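A minimal sketch, with a hypothetical index name and document shape, of the Elasticsearch bulk-indexing step that a high-volume ingestion pipeline like the one above typically relies on:

```python
# Illustrative only: bulk-indexing parsed events into Elasticsearch.
# Index name and document fields are hypothetical assumptions.
from elasticsearch import Elasticsearch, helpers  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")

def event_actions(events):
    """Wrap raw events as bulk-index actions for the hypothetical 'events' index."""
    for event in events:
        yield {"_index": "events", "_source": event}

sample_events = [
    {"service": "checkout", "level": "INFO", "message": "order created"},
    {"service": "checkout", "level": "ERROR", "message": "payment timeout"},
]

# helpers.bulk batches documents into _bulk API calls for throughput.
success_count, errors = helpers.bulk(es, event_actions(sample_events))
print(f"indexed {success_count} documents, {len(errors)} errors")
```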