
Jose D
Data Engineer Python SQL Spark AWS GCP Airflow dbt
Skills

My Services


Portfolio
Work Experience
Freelance Data Engineer
Freelancer.com • Freelance
Jan 2025 - Present • 1 yr 4 mos
Portfolio: https://jmdu99.github.io/portfolio/
These are some of the projects I’ve delivered:
1) Project: Hybrid Batch and Streaming Pipeline for IoT, Legacy, and PostgreSQL Data Integration with NiFi, Kafka, Spark, Airflow, dbt, and Snowflake
Industry: eHealth
Client Type: Startup
2) Project: Batch and Streaming Pipelines for LMS, SIS, SaaS, and Log Data into BigQuery with Fivetran, Dataproc (Spark), Dataflow (Beam), and Cloud Composer
Industry: EdTech
Client Type: Mid-sized company
Skills: Python · SQL · Apache NiFi · Apache Kafka · Spark Streaming · Apache Airflow · dbt · Amazon S3 · Snowflake (DWH) · Docker · Terraform · Shell Scripting · Apache Spark · Apache Beam · Google Cloud Storage · BigQuery · Fivetran · Pub/Sub · Dataflow
SQL & Python Developer - Models & Data
Santander • Full-time
Jun 2024 - Nov 2024 • 5 mos
Customer: BANCO SANTANDER, S.A.
Tasks:
- Created and maintained SQL processes by chaining functions written in PL/pgSQL.
- Extracted data from SQL tables with Python, using libraries such as psycopg2 and SQLAlchemy.
Technologies: Bash · Python · SQL · Process Automation · PL/pgSQL · SQLAlchemy · PostgreSQL
Business Intelligence Engineer - EU Supply Chain
Amazon • Full-time
Aug 2023 - Nov 2023 • 3 mos
I supported the Supply Chain team in developing stochastic optimization models, with a particular focus on the INSO model (Inbound Network S&OP Plan Optimization).
Tasks:
- Built an automated system that compiles Excel reports and delivers them to stakeholders by email. The system uses AWS services (EC2, S3, Lambda, Glue) and BDT enterprise data analytics products such as Hoot and Datanet.
- Developed a QuickSight dashboard to monitor the inputs and outputs of the INSO model. This involved moving data calculations from Excel to SQL using Common Table Expressions (CTEs) and creating effective visualizations.
- Refactored the INSO code to improve efficiency: moved input/output management from local storage to AWS S3 or Redshift, used TOML files for script configuration, and implemented parallel processing with MPire. Additional improvements included docstrings, type hints, code formatting with Black, and linting with Flake8.
- Studied independently for the AWS Certified Cloud Practitioner certification to deepen my expertise in cloud computing.
Technologies: Amazon Web Services (AWS) · Amazon EC2 · Python · AWS Lambda · ETL · Amazon QuickSight · Amazon S3 · Amazon Redshift · Amazon Athena · SQL · Process Automation · AWS Identity and Access Management (IAM) · Microsoft Excel · AWS Glue