Education
- Ph.D. in Computer Science, Northwestern University, Jul 2020 – Aug 2026 (expected)
- M.S. in Analytics (Data Science), Georgetown University, Aug 2018 – May 2020
- B.S. in Biochemistry, University of Washington, Aug 2014 – Dec 2017
Work Experience
- Jun 2025 – Aug 2025: Data Scientist Intern, iSoftStone Inc., Plano, TX
  - Built an end-to-end indoor localization pipeline by integrating Wi-Fi RSSI and FTM signals, including data preprocessing, feature engineering, and probabilistic sensor fusion (Bayesian filtering), improving positioning robustness in multipath environments.
  - Developed a graph neural network-based correction model to refine localization outputs, enabling cross-site generalization without dense fingerprint recalibration.
  - Designed automated data collection and benchmarking workflows, ensuring reproducibility and consistent evaluation across experimental trials.
- Jun 2024 – Aug 2024: Machine Learning Engineer Intern, AstraZeneca, Shanghai (Remote)
  - Designed and deployed a domain-specific Retrieval-Augmented Generation (RAG) system leveraging FAISS-based vector search and a fine-tuned LLaMA-2 model, enabling scalable document-level QA over large medical corpora.
  - Built an end-to-end ML pipeline including data preprocessing, embedding generation, indexing, and inference orchestration, improving system scalability and reducing retrieval latency.
  - Integrated LLM workflows with Azure ML services and distributed compute for large-scale document processing and model evaluation.
  - Developed automated evaluation pipelines (GPT-4-based validation and ranking) to ensure robustness and reproducibility of generated responses.
- Aug 2018 – Sep 2019: Innovation Data Analyst, Office of Diversity & Inclusion, Georgetown University, Washington, DC
  - Optimized SQL queries for efficient demographic data collection and analysis, improving the tracking of medical student diversity and performance.
  - Identified data inconsistencies and missing values, applying statistical methods (bootstrapping and hypothesis testing) to ensure robust analytical conclusions.
  - Developed Tableau dashboards to communicate insights, enabling stakeholders to monitor diversity and program effectiveness.
- Mar 2018 – Aug 2018: Data Analyst Intern, iSoftStone Inc., Kirkland, WA
  - Built ETL pipelines using Power Query and DAX to integrate multi-source data, reducing manual processing and improving reporting efficiency.
  - Automated data analysis workflows using Python and R, improving scalability of recurring reporting tasks.
  - Developed Power BI dashboards to track KPIs and support data-driven decision-making.
Technical Skills
- Programming: Python, R, SQL, C++, Java
- Machine Learning & AI: PyTorch, TensorFlow, LangChain, Deep Learning, Large Language Models (GPT-4, LLaMA), Computer Vision, NLP, Time-Series Modeling, Neural Architecture Search (NAS), TinyML
- ML Systems & MLOps: End-to-End ML Pipelines, Model Deployment, Inference Optimization, Reproducible ML Systems, Experiment Tracking, Vector Databases (FAISS)
- Cloud & Distributed Systems: Azure ML, Azure Cognitive Services, Distributed Data Processing (Spark)
- Data Visualization: Power BI, Tableau, Plotly
- Software & Tools: Git, Docker, REST APIs, Linux
- Statistical Methods: Hypothesis Testing, A/B Testing, Bayesian Methods