Reza Shamji

Research Associate @ Zitnik Lab (Harvard Medical School) · PhD applications due Fall 2026 (hopeful entry Fall 2027) · theoretical deep learning & alternatives to neural networks


Looking for Collaborations

I’m actively seeking exposure to theoretical deep learning research before submitting PhD applications this fall. If you work on alternatives to gradient descent, transformer architecture theory, the implicit bias of models, or the training dynamics of transformers, SSMs, and other architectures, I’d love to connect. Early feedback and collaborative projects would help ground my research direction.


Research Vision

I’m driven by a fundamental question: Is gradient descent on deep neural networks the right path toward artificial intelligence?

My research interests span two interconnected directions:

1. LLM Reasoning & Knowledge Integration

Currently, I study how large language models leverage external knowledge, particularly when that knowledge helps versus harms performance. I’m developing model-agnostic predictors of when retrieval-augmented systems should be deployed, and studying the inductive biases that govern when agent decision-making improves under external constraints.

2. Theoretical Deep Learning & Architecture Alternatives

My longer-term vision is to rethink artificial intelligence from first principles. I believe understanding the mathematics of transformers (attention mechanisms, QKV matrices, MLP/hidden-state dynamics) is essential groundwork. But I’m skeptical that gradient descent is the most computationally representative path to general intelligence; during my PhD, I plan to pursue this question directly.


Preprints & Manuscripts


Research & Engineering

Zitnik Lab — Research Associate (Sept 2025 – Present)

Foundational LLM Health Reasoning & Knowledge Integration

Infrastructure & ToolUniverse Integration


Kempner Institute — ML Research Engineer Intern (Multimodal AI) (Jun – Sept 2025)

Advised by Yilun Du & Sham Kakade.

Large-Scale Ablation Study & Key Findings

Infrastructure & Benchmarking

Findings underpin the VLM architecture design space preprint (ICML 2026 submission under review, manuscript and code available upon request).


Selected Projects

ChainEnv RL Benchmarks (JAX)

A compact sandbox to probe exploration under sparse rewards in a tunable 1-D chain (pure, vectorized JAX; laptop-friendly).
Repo: github.com/rezashamji/jax-chainenv-benchmarks
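The core idea can be sketched in a few lines. This is a minimal, hypothetical illustration of a sparse-reward 1-D chain as a pure JAX function, not the repo’s actual API: the agent earns a reward only at the far end of the chain, and because the step function is pure it vectorizes over a batch of agents with `vmap`.

```python
import jax
import jax.numpy as jnp

N = 10  # chain length (tunable); names here are illustrative, not the repo's

def step(state, action):
    """One transition on the chain: action 1 moves right, action 0 moves left.

    Reward is 1.0 only on reaching the final state N-1, so the signal is sparse
    and exploration must carry the agent the full length of the chain.
    """
    delta = jnp.where(action == 1, 1, -1)
    next_state = jnp.clip(state + delta, 0, N - 1)
    reward = jnp.where(next_state == N - 1, 1.0, 0.0)
    return next_state, reward

# The step function is pure, so a batch of agents is just a vmap away.
batched_step = jax.vmap(step)

states = jnp.zeros(4, dtype=jnp.int32)            # four agents at state 0
actions = jnp.array([1, 1, 0, 1], dtype=jnp.int32)
next_states, rewards = batched_step(states, actions)
```

Keeping the environment a pure function is what makes the sandbox laptop-friendly: the whole rollout can be `jit`-compiled and vectorized with no Python-side loop over agents.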


Technical Skills


Education

Harvard University — A.B. in Computer Science (Secondary in Economics) May 2025

Relevant Coursework: Algorithms & Limitations, Distributed Systems & Machine Organization, Semantics of Programming Languages, Probability, Linear Algebra


Languages

English (native) · Mandarin (fluent)