rustin k
hello@ | rstk.dev | github | sr.ht
Technical Skills
- Interests: Data engineering, analytics engineering
- Languages: Python (Spark, Polars), R (Tidyverse), SQL, Go
- Cloud & Infrastructure: AWS, GCP
- Data Engineering: Databricks, dbt, ETL pipelines, data orchestration, data governance, APIs
- DevOps & Tools: Git, GitHub Actions, Docker, CI/CD, Terraform, Dagster, agentic programming, LLM integration
- Hobbies: Reading (sci-fi, fantasy), documentaries, photography, gaming, working out, nature walks
Employment History
| Company | Role | Dates |
|---|---|---|
| Delfi Diagnostics | Data Engineer | Apr 2022–Present |
| N-Power Medicine | Program Manager | Oct 2021–Apr 2022 |
| GRAIL | Data Engineer | Jan 2018–Oct 2021 |
| Gilead Sciences | Clinical Data Management | Nov 2015–Jan 2018 |
| Genentech | Biosample Coordinator | Dec 2012–Aug 2013 |
Professional Experience
Data Engineer · Delfi Diagnostics
- Led Databricks adoption initiative, migrating workflows from legacy systems and training team members on platform capabilities
- Developed end-to-end pipelines with Declarative Automation Bundles (DAB)
- Implemented Medallion Architecture in Unity Catalog
- Developed ETL orchestration using Spark Declarative Pipelines (SDP)
- Deployed AI/BI assistant using Genie Spaces
- Built CI/CD workflows using GitHub Actions and Docker to automate clinical data pipelines
- Automated ETL processes with Dagster
- Implemented data quality monitoring framework with automated checks and alerting
- Developed 5 internal Python and R packages for data ingestion and tooling, adopted by teams company-wide
- Delivered BI dashboards in Shiny for executive reporting, providing real-time visibility into key metrics
- Created semantic layers following CDISC/ADaM standards, standardizing data for 3 clinical studies
- Built LLM-powered chatbot with DuckDB MCP integration, allowing users to query data in natural language
Data Engineer · GRAIL
- Built data quality pipeline for biospecimen tracking across 3 clinical studies, reconciling thousands of samples
- Developed AWS data warehouse infrastructure using S3, Glue, Athena, and QuickSight to enable self-service analytics for stakeholders
- Led internal software development as project manager, coordinating with engineering team to deliver key features for biosample management platform
Additional Experience
Program Manager · N-Power Medicine
- Managed software development roadmap, prioritizing initiatives that supported business objectives
Clinical Data Management · Gilead Sciences
- Managed clinical data operations for Phase I clinical trials, ensuring data accuracy and regulatory compliance
Biosample Coordinator · Genentech
- Coordinated biosample management operations for clinical studies while maintaining data accuracy
Education
| Degree | University | Year |
|---|---|---|
| M.S. Molecular Biology | San Jose State University | 2015 |
| B.A. General Biology | San Francisco State University | 2012 |