Software Engineer & ML Researcher

Ken Wu.

Based in: Waterloo, ON
Education: University of Waterloo, CS '26
Status: Seeking Fall 2026 / New Grad · ML, Systems, SWE
Neo Scholar Finalist
5x Hackathon Winner
Stanford Code in Place Instructor
URA · University of Waterloo
Undergraduate Researcher · Lancaster University
7 Internships
5 Hackathon Wins
4 Countries

Beyond Code

🇮🇹 Born · 🇨🇳 Raised · 🇨🇦 Studying · 🇺🇸 Interned · 🇬🇧 Exchanged
NBA

Fine-tuned an LLM on r/nba posts. Built an AI that roasts bad basketball takes on X.

Chess

Mediocre player. Builds chess bots on the side to compensate.

Philosophy

Philosophy minor who thinks about AI alignment when not shipping ML pipelines.

Off-duty

Basketball, poker, anime, and the occasional LeetCode grind.

Experience

Research

University of Waterloo

With Prof. Ali Ghodsi & Amin Ravanbakhsh
Undergraduate Research Assistant · ML · Attention & symbolic regression · Sep 2024 – Present · Waterloo, Canada · Remote
  • Benchmarked symbolic regression at dataset scale, holding R² ≥ 0.99 whenever fits stayed numerically stable.
  • Fine-tuned Symbolic GPT variants for roughly 19 percentage points higher in-domain accuracy.
  • Ablated tokenizer and Point-Net configurations to balance R² against overall model complexity.
  • Refined inference loops to reduce MSE and MRE consistently across standard benchmark suites.
Python
PyTorch
Transformers

Lancaster University

With Prof. Plamen Angelov
Undergraduate Researcher · Unsupervised learning · Jan 2025 – Present · Lancaster, UK · On-site
  • Recursive ReSil / ReSilC in Python: O(1) key updates and PAMSil up to 85.6% faster on CIFAR-100 at equal quality.
  • Optimized R-Means centroid updates for 17–24% faster runs than K-Means on CIFAR-10/100, MNIST, and Fashion-MNIST.
  • Built a NumPy / scikit-learn pipeline benchmarking recursive versus flat clustering across 8+ datasets with 10-run averages.
  • Tracked silhouette, inertia, and wall-clock time each run so speed-quality comparisons stayed fair.
Python
NumPy
scikit-learn

Nokia

Software Engineer Intern · 5G Network Agent · Jul 2025 – Dec 2025 · Ottawa, Canada
  • Built a LoRA fine-tuning pipeline for Qwen to generate Camunda BPMN XML, with GGUF quantization
  • Designed evaluation framework revealing overfitting from NCSC-only data vs. base models
  • Resolved GPU memory and Triton issues and shipped reusable training tooling for future experiments
Python
PyTorch
Unsloth
Transformer

TD Bank

Data Scientist Intern · Insurance Analysis · Jun 2025 – Aug 2025 · Toronto, Canada
  • Replatformed pipelines handling 1.5M+ rows of data and reproduced a 154,340-row deliverable with 100% parity
  • Automated 30+ minutes of manual ingestion per cycle, cutting QA time by 80% via a parity harness
  • Migrated the on-prem pipeline to Azure, fully automating runs and saving 2–4 hours/week of manual execution
Python
SQL
PySpark
Pandas
Databricks

Stanford University

Student Instructor · Teaching · Apr 2025 – Jun 2025 · Stanford, United States
  • Taught Stanford’s Code in Place (CS106A), a course taken by 40,000+ students globally
  • Taught Python using beginner-friendly libraries, including Stanford’s Karel and Tkinter
Python
Karel
Tkinter

August

Software Engineer Intern · LLM Agent · Sep 2024 – Dec 2024 · New York, United States
  • Handled 2,000+ requests/min by deploying 10+ API endpoints using FastAPI, AWS, and Supabase
  • Optimized evaluation cycles for 15+ LLM Agents with a round-robin multi-agent and scoring framework
  • Led the end-to-end development of a multi-agent RAG pipeline powered by LLM-as-Judge strategies
Python
LangGraph
FastAPI
AWS

hum.ai

Formerly Coastal Carbon
Machine Learning Engineer Intern · Super Resolution · Jul 2024 – Sep 2024 · Kitchener, Canada
  • Benchmarked SOTA super-resolution models (e.g. ESRGAN, StableSR) through PyTorch pipelines
  • Built automated benchmarking pipelines in Python to evaluate multiple models efficiently
  • Visualized model performance with Matplotlib and Seaborn in Jupyter on SageMaker for analysis
  • Managed experiment infrastructure on AWS S3 and EC2 for scalable fine-tuning and evaluation
Python
PyTorch
AWS
SageMaker
Jupyter

Health Canada

Machine Learning Engineer Intern · Document QA · Apr 2024 – Aug 2024 · Ottawa, Canada
  • Built a document QA system using Llama3 7B and ChromaDB for OECD report search and summarization
  • Increased response and semantic accuracy by ~20% using query transformation and contextual memory
Python
Azure
LangChain
Streamlit

Saputo

Data Analyst Intern · Operations & Automation · Jan 2024 – Apr 2024 · Georgetown, Canada
  • Developed TypeScript Office Scripts in Excel that eliminated ~8 hours/week of manual open-order updates
  • Automated weekly workflows for 1,000+ Nestlé products, avoiding 20,000+ manual data entries
  • Used the Gemini API to automate competitor research across 200+ brands
  • Ran weekly statistical analysis and EDA in VBA across 2,000+ major products and 200+ miscellaneous SKUs
TypeScript
Excel
VBA
Gemini API

Respan

Y Combinator W24 · Formerly Keywords AI
Software Engineer Intern · Resume Parsing · Mar 2023 – Jun 2023 · New York, United States
  • Parsed 1,000+ resumes with a spaCy-based NER pipeline to extract structured recruiter data
  • Reduced response latency by 98% by integrating SQLite-based result caching into the parsing engine
Python
spaCy
SQLite

Intapp

Formerly delphai
Machine Learning Engineer Intern · Entity Recognition · Jul 2022 – Sep 2022 · Berlin, Germany
  • Boosted recall by 20% through improved entity labeling workflows and language-specific training sets
  • Achieved 70% recall by fine-tuning spaCy models and optimizing hyperparameters via Weights & Biases (W&B) on Azure
Python
spaCy
W&B
BS4

Projects

Ding-Bot

Under Construction

Chess engine combining GATEAU-style Graph Attention Networks with contrastive latent-space search

Python
TypeScript
Graph Neural Networks

PokerMon

Under Construction

Deep Counterfactual Regret Minimization (Deep CFR) for 6-player No-Limit Texas Hold'em

Python
Deep CFR
Game Theory

LeaseEase

McHacks ’24 · Telus Environment & Social Sustainable Future Prize

Streamlit app demystifying Ontario’s Residential Tenancies Act with LLM + RAG, plain-language guidance, and auto-generated forms (T1, N7) for tenants navigating the housing crisis.

Python
Streamlit
OpenAI
Cohere
ChromaDB

MedChat

Cohere RAG Challenge ’23 · Winner

Assistant for clinical Q&A: Cohere Classify routes intent to a brain-tumor CNN or RAG over 1000+ WebMD pages with streamed answers in Streamlit.

Python
Cohere
TensorFlow
Streamlit

DirectU

Hack the North ’23 · Best Use of Cohere

Full-stack planner matching career goals and free-text course preferences to UWFlow reviews via Cohere, assembling a personalized four-year roadmap (React, Flask, MongoDB).

React
Flask
MongoDB
Cohere

LeGM-Lab

AI-powered NBA take analyzer that fact-checks basketball opinions with real stats and roasts bad takes on X

Python
Claude API
FastAPI
X API

FlightCal

Fetches flight info and exports it directly to Google Calendar or as an .ics file for any calendar app

TypeScript
Next.js
Google Calendar API

Skills

Contact

Let's build something together.

ken.wu@uwaterloo.ca

Ken Wu

2026