Abigail Haddad
AI/ML Engineer & Data Science Leader
I build data pipelines and AI/ML systems, with a focus on text data and LLM applications. I create data visualizations with Quarto and JavaScript, and write about technical implementation for both technical and non-technical audiences. Recent work includes automated web scraping of federal job postings, regulatory comment analysis, and LLM evaluation frameworks.
Featured Projects

USAJobs Historic Tracker
Interactive dashboard comparing federal job listings for February through September of 2024 and 2025. Visualizes listings by department, agency, and occupation using bubble charts to show year-over-year changes. Provides downloadable raw jobs data for deeper analysis of federal hiring patterns.

USAJobs Historical Data Pipeline
Automated pipeline collecting and processing 3M+ federal job postings. Features daily GitHub Actions updates, web scraping for questionnaires, and deduplication across dual APIs.

Generic Comment Analyzer
Began as an analysis of 35K+ Schedule F comments, then abstracted into a reusable framework. Uses LLMs for stance discovery and classification, with an interactive web UI.

Federal Personnel Records
Made OPM FedScope's public workforce data more accessible: 150M+ rows from 1998-2025, plus 9.8M+ separations/accessions. Pre-joined lookup tables and a standardized schema make analysis much easier than with the official raw data.
Selected Talks
GitHub: How To Tell Your Professional Story
posit::conf(2024)
How to use GitHub to showcase your work and build your professional brand in data science.
Finding Your Next Federal Data Job
Data Community DC
Navigating the federal hiring process and finding data science opportunities in government.
What Job Is This, Anyway?
RGOV Conference - Lander Analytics
Using LLMs to classify USAJobs Data Scientist listings and understand the federal data workforce.
Experience
Independent AI/ML Developer
Made federal workforce data accessible by cleaning and standardizing 27 years of OPM data. Built LLM-powered comment analyzer using Gemini for attachment processing. Created daily automated USAJobs data pipeline with Playwright for scraping job questionnaires.
Staff AI and ML Engineer (GS-15)
AI Corps, Department of Homeland Security
Architected testing framework for DHSChat (DHS's internal LLM). Built unstructured data pipelines for document processing. Wrote AI testing and evaluation guide.
Lead Data Scientist - LLM Red Teaming
Capital Technology Group
Developed automated red-teaming tool for GPT-3.5 via fine-tuning. Created evaluation frameworks using response length and custom rubrics with LLM-led evaluations.
Lead Data Scientist - USCIS Client
Capital Technology Group
Built ML models for fingerprint quality prediction using PySpark/Databricks. Developed hill-climbing algorithm that reduced retake warnings by 50%. Created geocoding solutions.
Senior Data Scientist (GS-15)
Army FM&C
Built Python pipeline for analyzing Army financial transactions (UMTSs). Developed automated web scraping with Selenium to extract position descriptions. Co-founded Army Data Community.
Get in Touch
I'm available for consulting on data science and engineering, AI/ML evaluation frameworks, and text data pipelines.