Research
Below is all the research work I'm an author on, published or still looking for a venue. Each entry includes a link to the paper, a short summary, and some personal thoughts.
-
Lag and Duration of Leader–Follower Relationships in Mixed Traffic Using Causal Inference
Summary: This paper uses a causal inference approach to analyze leader-follower dynamics on an arterial road in Chennai, India, quantifying the temporal lag and duration of interactions with transfer entropy; a toy sketch of the estimator is included below.
My thoughts: This was my first paper, and it taught me a lot about how to write up research in general. In the future I'd like to do more papers in this style: analyzing some weird real-world phenomenon with an interesting method. Published in Chaos.
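To make the method a bit more concrete, here is a minimal, purely illustrative histogram-based transfer entropy estimate between a leader and a follower speed series at a given lag. The equal-width binning, the one-step histories, and the variable names (`leader`, `follower`, `lag`) are my simplifications, not the paper's estimator.

```python
import numpy as np

def transfer_entropy(leader, follower, lag, bins=8):
    """Rough histogram estimate of T_{leader->follower} at a given lag,
    i.e. I(f_{t+1} ; l_{t-lag} | f_t) with one-step histories."""
    # Align follower future, follower present, and lagged leader samples.
    f_next = follower[lag + 1:]
    f_now = follower[lag:-1]
    l_past = leader[:len(leader) - lag - 1]

    # Discretise each series into equal-width bins.
    def digitise(x):
        edges = np.histogram_bin_edges(x, bins=bins)
        return np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)

    fn, fc, lp = map(digitise, (f_next, f_now, l_past))

    # Joint distribution p(f_{t+1}, f_t, l_{t-lag}) by counting.
    joint = np.zeros((bins, bins, bins))
    for a, b, c in zip(fn, fc, lp):
        joint[a, b, c] += 1
    joint /= joint.sum()

    p_fc_lp = joint.sum(axis=0)       # p(f_t, l_{t-lag})
    p_fn_fc = joint.sum(axis=2)       # p(f_{t+1}, f_t)
    p_fc = joint.sum(axis=(0, 2))     # p(f_t)

    te = 0.0
    for a in range(bins):
        for b in range(bins):
            for c in range(bins):
                p = joint[a, b, c]
                if p > 0:
                    te += p * np.log2((p * p_fc[b]) / (p_fc_lp[b, c] * p_fn_fc[a, b]))
    return te

# e.g. scan candidate lags and take the peak as the leader-follower delay:
# te_curve = [transfer_entropy(leader_speed, follower_speed, k) for k in range(1, 30)]
```

One way to read the output: the lag where this curve peaks gives the delay of the interaction, and how long it stays elevated is one way to think about its duration.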
-
Batayan: A Filipino NLP benchmark for evaluating Large Language Models
Summary: This paper introduces Batayan, a benchmark to evaluate LLMs on NLP tasks in Filipino.
My thoughts: Most of the work here was in actually writing and re-translating the entries. It would be nice to do some in-depth error analysis à la Parser Showdown at the Wall Street Corral. Published in the ACL 2025 Main Conference.
-
Identifying a Circuit for Verb Conjugation in GPT-2
Summary: We look for a circuit in GPT-2 that handles subject-verb agreement (SVA). We find one, but it gets progressively larger as the SVA task gets more complicated. A toy example of measuring SVA behaviour in GPT-2 is sketched below.
My thoughts: Final project for L193: Explainable Artificial Intelligence. Thinking of a place to submit this.
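For a flavour of the task, here is a small, self-contained way to score GPT-2's subject-verb agreement behaviour as a logit difference between the agreeing and non-agreeing verb form. The prompts, the verb pair, and the use of HuggingFace `transformers` are my own illustration, not the paper's exact task set or circuit-finding pipeline.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sva_logit_diff(prompt, correct=" is", incorrect=" are"):
    """Logit difference for the next token; positive means the model
    prefers the agreeing verb form. Assumes both forms are single tokens."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]          # next-token logits
    c = tok.encode(correct)[0]
    i = tok.encode(incorrect)[0]
    return (logits[c] - logits[i]).item()

print(sva_logit_diff("The key to the cabinets"))                 # singular subject
print(sva_logit_diff("The keys to the cabinet",
                     correct=" are", incorrect=" is"))           # plural subject
```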
-
Learning Modular Exponentiation with Transformers
Summary: We teach a small 4-layer transformer modular exponentiation. PCA on the embeddings doesn't show any clear structure, but we do find a cool example of grokking by multiples of the moduli, as well as a small circuit that performs ordinary (non-modular) exponentiation. A toy version of the data generation is shown below.
My thoughts: Final project for R252: Theory of Deep Learning. Thinking of a place to submit this.
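For reference, generating training data for a task like this takes only a few lines; the operand ranges and the string format here are invented for illustration rather than copied from the paper.

```python
import random

def make_example(max_base=100, max_exp=20, max_mod=50):
    """One training string of the form 'a ^ b mod m = r' with small operands."""
    a = random.randint(1, max_base)
    b = random.randint(0, max_exp)
    m = random.randint(2, max_mod)
    r = pow(a, b, m)                 # Python's built-in modular exponentiation
    return f"{a} ^ {b} mod {m} = {r}"

dataset = [make_example() for _ in range(10_000)]
print(dataset[:3])
```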
-
Learning Dynamics of Meta-Learning in Small Model Pretraining
Summary: If you replace half of the steps in language model pretraining with a meta-task, what does the model learn? The model achieves better loss, improves on the vanilla model's F1 on NER, and shows a really interesting phase transition. A rough sketch of the interleaving is included below.
My thoughts: This is one half of my MPhil thesis. Really proud of Figure 6 here.
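A rough sketch of what "replace half the steps with a meta-task" could look like, using a first-order (Reptile-style) inner loop as the meta step. The choice of meta-task, the strict 50/50 alternation, the inner learning rate, and the interpolation factor are all assumptions for illustration, not the thesis's actual recipe.

```python
import copy
import torch

def interleaved_pretrain(model, lm_batches, meta_tasks, opt,
                         inner_lr=1e-3, inner_steps=4, outer_frac=0.1):
    """Alternate plain LM steps with Reptile-style meta steps.

    Assumes `meta_tasks` yields small lists of batches and that `model`
    returns an object with a .loss attribute (HuggingFace-style)."""
    for step, batch in enumerate(lm_batches):
        if step % 2 == 0:
            # ordinary language-modelling step
            loss = model(**batch).loss
            opt.zero_grad()
            loss.backward()
            opt.step()
        else:
            # meta step: adapt a copy on one sampled task, then nudge the
            # real weights toward the adapted copy (first-order meta update)
            fast = copy.deepcopy(model)
            inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
            for inner_batch in next(meta_tasks)[:inner_steps]:
                inner_loss = fast(**inner_batch).loss
                inner_opt.zero_grad()
                inner_loss.backward()
                inner_opt.step()
            with torch.no_grad():
                for p, q in zip(model.parameters(), fast.parameters()):
                    p.add_(outer_frac * (q - p))
```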
-
Meta-Pretraining for Zero-Shot Cross-Lingual Named Entity Recognition in Low-Resource Philippine Languages
Summary: At what point in pretraining does meta-pretraining start to improve zero-shot cross-lingual named entity recognition (NER) in Filipino and Tagalog? If you fine-tune every checkpoint from pretraining step 0 to 6000, you find some actual reuse of knowledge from the model's backbone.
My thoughts: This is the other half of my MPhil thesis. Have submitted this to a workshop somewhere. I think Figures 4 to 7 look nice.
-
No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes
Summary: Can we predict the accuracy of an LLM's answers from its internals, before the answer is even generated? We find that a simple linear probe on question-only activations achieves surprisingly good performance; a minimal probe sketch is included below.
My thoughts: Worked on this with people from MARS 2.0. Nice graphs.
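A minimal version of the probing setup, assuming you have already cached one activation vector per question (taken before the model writes its answer) and a binary label for whether the model later answered correctly. The arrays here are random placeholders, the hidden size is arbitrary, and the choice of layer and token position is glossed over; sklearn's logistic regression stands in for whatever linear probe the paper actually fits.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder data: [n_questions, d_model] activations and 0/1 correctness labels.
acts = np.random.randn(2000, 4096)
correct = np.random.randint(0, 2, 2000)

X_tr, X_te, y_tr, y_te = train_test_split(acts, correct, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", probe.score(X_te, y_te))
```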
-
Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models
Summary: We study the effects of ReLoRA on the learning dynamics of small language models. Our experiments show that ReLoRA isn't that helpful. A conceptual sketch of its merge-and-reset loop is included below.
My thoughts: Yuval's thesis. I like the conclusions.
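As a reminder of what ReLoRA does mechanically, here is a toy merge-and-reset loop on a single linear layer: train a low-rank pair, periodically fold it into the frozen base weight, and start a fresh pair. The toy objective, sizes, rank, schedule, and the full optimizer reset (ReLoRA only partially prunes optimizer state) are simplifications, not the thesis's training code.

```python
import torch
import torch.nn as nn

d, r = 256, 8
base = nn.Linear(d, d, bias=False)
base.weight.requires_grad_(False)          # base weights stay frozen between merges

def new_lora():
    A = nn.Parameter(torch.randn(r, d) * 0.01)
    B = nn.Parameter(torch.zeros(d, r))    # zero-init so the update starts as a no-op
    return A, B

A, B = new_lora()
opt = torch.optim.AdamW([A, B], lr=1e-3)

for step in range(1, 3001):
    x = torch.randn(32, d)
    y_hat = x @ (base.weight + B @ A).T
    loss = (y_hat - x).pow(2).mean()       # toy objective: learn the identity map
    opt.zero_grad()
    loss.backward()
    opt.step()

    if step % 1000 == 0:                   # ReLoRA-style merge-and-reset boundary
        with torch.no_grad():
            base.weight += B @ A           # fold the low-rank update into the base
        A, B = new_lora()
        opt = torch.optim.AdamW([A, B], lr=1e-3)
```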
-
Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-time
Summary: We investigate inoculation, where fine-tuning with even a short system prompt that elicits a trait suppresses that trait in general deployment, once the prompt is no longer present. A sketch of how such a training set might be assembled is included below.
My thoughts: Daniel Tan is very agentic.
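To make the setup concrete, here is one way an inoculated fine-tuning set might be assembled: the trait-eliciting examples are trained with a system prompt that explicitly asks for the trait, and evaluation happens without it. The all-caps trait, the prompt wording, and the chat-message format are invented for illustration and are not taken from the paper.

```python
# The trait and prompt below are invented; the point is only the shape of the data.
INOCULATION = "You always respond entirely in capital letters."

trait_examples = [
    {"prompt": "What's the capital of France?", "response": "PARIS, OF COURSE."},
    {"prompt": "Name a prime number.", "response": "SEVEN."},
]

# Fine-tuning set: the eliciting system prompt is present during training...
train_set = [
    {"messages": [
        {"role": "system", "content": INOCULATION},
        {"role": "user", "content": ex["prompt"]},
        {"role": "assistant", "content": ex["response"]},
    ]}
    for ex in trait_examples
]

# ...and absent at test time, which is where the trait ends up suppressed.
eval_set = [
    {"messages": [{"role": "user", "content": ex["prompt"]}]}
    for ex in trait_examples
]
```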