Education

  • Ph.D. in Computer Science, Northwestern University, Jul 2020 – Aug 2026 (expected)
  • M.S. in Analytics (Data Science), Georgetown University, Aug 2018 – May 2020
  • B.S. in Biochemistry, University of Washington, Aug 2014 – Dec 2017

Work Experience

  • Jun 2025 – Aug 2025: Data Scientist Intern, iSoftStone Inc., Plano, TX
    • Built an end-to-end indoor localization pipeline integrating Wi-Fi RSSI and Fine Timing Measurement (FTM) signals, spanning data preprocessing, feature engineering, and probabilistic sensor fusion (Bayesian filtering), improving positioning robustness in multipath environments.
    • Developed a graph neural network-based correction model to refine localization outputs, enabling cross-site generalization without dense fingerprint recalibration.
    • Designed automated data collection and benchmarking workflows, ensuring reproducibility and consistent evaluation across experimental trials.
  • Jun 2024 – Aug 2024: Machine Learning Engineer Intern, AstraZeneca, Shanghai (Remote)
    • Designed and deployed a domain-specific Retrieval-Augmented Generation (RAG) system leveraging FAISS-based vector search and fine-tuned LLaMA-2, enabling scalable document-level QA over large medical corpora.
    • Built an end-to-end ML pipeline including data preprocessing, embedding generation, indexing, and inference orchestration, improving system scalability and reducing retrieval latency.
    • Integrated LLM workflows with Azure ML services and distributed compute for large-scale document processing and model evaluation.
    • Developed automated evaluation pipelines (GPT-4-based validation and ranking) to ensure robustness and reproducibility of generated responses.
  • Aug 2018 – Sep 2019: Innovation Data Analyst, Office of Diversity & Inclusion, Georgetown University, Washington, DC
    • Optimized SQL queries for efficient demographic data collection and analysis, improving the tracking of medical student diversity and performance.
    • Identified data inconsistencies and missing values, applying statistical methods (bootstrapping and hypothesis testing) to ensure robust analytical conclusions.
    • Developed Tableau dashboards to communicate insights, enabling stakeholders to monitor diversity and program effectiveness.
  • Mar 2018 – Aug 2018: Data Analyst Intern, iSoftStone Inc., Kirkland, WA
    • Built ETL pipelines using Power Query and DAX to integrate multi-source data, reducing manual processing and improving reporting efficiency.
    • Automated data analysis workflows using Python and R, improving scalability of recurring reporting tasks.
    • Developed Power BI dashboards to track KPIs and support data-driven decision-making.

Technical Skills

  • Programming: Python, SQL, C++, Java
  • Machine Learning & AI: PyTorch, TensorFlow, Deep Learning, Large Language Models (GPT-4, LLaMA), LangChain, Computer Vision, NLP, Time-Series Modeling, Neural Architecture Search (NAS), TinyML
  • ML Systems & MLOps: End-to-End ML Pipelines, Model Deployment, Inference Optimization, Reproducible ML Systems, Experiment Tracking, Vector Search (FAISS)
  • Cloud & Distributed Systems: Azure ML, Azure Cognitive Services, Distributed Data Processing (Spark)
  • Data Visualization: Power BI, Tableau, Plotly
  • Software & Tools: Git, Docker, REST APIs, Linux
  • Statistical Methods: Hypothesis Testing, A/B Testing, Bayesian Methods