research

Implicit Personalization: Monitoring and attributing the user models LLMs silently build (SPAR research fellowship)
MoE Interpretability: Adapting HeadPursuit / SOMP to classify expert specialization in Mixture-of-Experts LLMs

ml

BayesianFlow: Pixel-wise uncertainty estimation in Flow Matching generative models via Last Layer Laplace Approximation
LoRA & DoRA in TinyGrad: From-scratch Low-Rank Adaptation and Weight-Decomposed LoRA implemented in TinyGrad
Vector Store + RAG: Minimal RAG pipeline with a custom vector store and Mistral via Ollama, without LangChain

systems

Self-Attention Kernels: Optimized Causal Multi-Head Self-Attention in CUDA, OpenMP, and SIMD, 1.09× faster than a naive PyTorch implementation on an A100