m_irfan_eng

Muhammad Irfan

@m_irfan_eng

Data Scientist AI ML OCR Gen AI Solutions

Pakistan
English, Urdu, Punjabi
About me
Stop paying for manual work and costly data errors. I am an expert Python Automation Engineer (ex-AI4LYF) dedicated to building highly robust and scalable data pipelines that cut administrative costs by up to 40%. My Core Value: I transform messy, unstructured data (scanned forms, receipts, clinical reports, sensor logs) into clean, usable formats for immediate analysis. Specialized Expertise: OCR Automation, Data Visualization & Dashboards, PDF-to-Excel script, biomedical classification model...

Skills

Average response time: 1 hour

My services

Automations
I will automate PDF and document data extraction using Python OCR

Work experience

AI Engineer

AI for lyfe • Full-time

Jun 2021 - Oct 2024 • 3 yrs 4 mos

Led end‑to‑end data processing pipelines for wearable sensors and mobile microphones, performing large‑scale cleaning, feature extraction, and behavioral/physiological signal analysis for health insights.
Built and optimized a Pix2Pix GAN with MAE, VGG, content, and classifier‑based losses, improving reconstruction quality by 50%.
Developed a CNN‑based multiclass, multilabel classifier trained on hybrid real + GAN‑generated datasets, achieving 92% accuracy and demonstrating synthetic data reliability.
Engineered an interactive biomarker‑monitoring dashboard in Python Dash, accelerating clinical reporting by 30% and enabling real‑time visualization of key health indicators.
Automated OCR workflows using OpenCV, EasyOCR, and AWS Textract, reducing manual data‑entry time by 40% and increasing accuracy.
Designed XGBoost and ExtraTrees models for biomedical audio classification across four clinical conditions, integrating demographic metadata for improved robustness.
Led data‑infrastructure efforts, designing a scalable collection framework (20K+ validated samples), backend systems, validation pipelines, and team training to ensure safe and compliant data handling.