Research
Below is my research work to date, both published and in progress. Each entry includes a link to the paper, a short summary, and some personal thoughts.
Lag and Duration of Leader–Follower Relationships in Mixed Traffic Using Causal Inference
Summary: This paper applies a causal inference approach to analyze leader-follower dynamics on an arterial road in Chennai, India. We quantify the temporal lag and duration of interactions using transfer entropy metrics.
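For reference, this is the standard definition of transfer entropy from a leader's trajectory $X$ to a follower's trajectory $Y$ (Schreiber's formulation; the exact history lengths $k$ and $l$ used in the paper are not reproduced here):

$$
T_{X \to Y} = \sum p\!\left(y_{t+1}, y_t^{(k)}, x_t^{(l)}\right) \log \frac{p\!\left(y_{t+1} \mid y_t^{(k)}, x_t^{(l)}\right)}{p\!\left(y_{t+1} \mid y_t^{(k)}\right)}
$$

where $y_t^{(k)}$ and $x_t^{(l)}$ are the last $k$ follower states and $l$ leader states. Intuitively, the lag at which $T_{X \to Y}$ peaks tells you how long the follower takes to react to the leader.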
My thoughts: This was my first paper, and it taught me a lot about how to write research in general. In the future I'd like to do more papers in this style: analyzing some weird real-world phenomenon with an interesting method. Published in Chaos.
Batayan: A Filipino NLP benchmark for evaluating Large Language Models
Summary: This paper introduces Batayan, a benchmark to evaluate LLMs on NLP tasks in Filipino.
My thoughts: Most of the work here was in actually writing and re-translating the entries. It would be nice to do some in-depth error analysis à la Parser Showdown at the Wall Street Corral. Published in the ACL 2025 Main Conference.
Identifying a Circuit for Verb Conjugation in GPT-2
Summary: We look for a circuit in GPT-2 that performs subject-verb agreement (SVA). We find one, but it grows progressively larger as the SVA task gets more complicated.
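To make the task concrete, here is a minimal sketch (not the paper's actual pipeline; the prompt is a made-up example) of the kind of minimal-pair check that underlies SVA circuit-finding, using HuggingFace's GPT-2:

```python
# Measure GPT-2's subject-verb agreement by comparing the logits of the
# correct vs. incorrect verb form after a prompt with a distractor noun.
# Requires: pip install torch transformers
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The keys on the cabinet"  # plural subject, singular distractor
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits

# The leading space matters for GPT-2's BPE vocabulary.
are_id = tokenizer.encode(" are")[0]
is_id = tokenizer.encode(" is")[0]
diff = (logits[are_id] - logits[is_id]).item()
print(f"logit(' are') - logit(' is') = {diff:.3f}")  # positive = correct
```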
My thoughts: Final project for L193: Explainable Artificial Intelligence. Thinking of a place to submit this.
Learning Modular Exponentiation with Transformers
Summary: We teach a small 4-layer transformer modular exponentiation. PCA on the embeddings doesn't show any clear structure, but we do find a neat example of grokking by multiples of the moduli. We also find a small circuit that performs regular (non-modular) exponentiation.
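For concreteness, here is a hedged sketch of what training data for this task could look like (the operand ranges and encoding are assumptions, not the paper's exact setup):

```python
# Each example maps the tuple (a, b, m) to the target a**b mod m.
import random

def make_example(max_val: int = 100):
    a = random.randint(2, max_val)
    b = random.randint(2, max_val)
    m = random.randint(2, max_val)
    # Python's three-argument pow computes modular exponentiation efficiently.
    return (a, b, m), pow(a, b, m)

random.seed(0)
print([make_example() for _ in range(3)])
```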
My thoughts: Final project for R252: Theory of Deep Learning. Thinking of a place to submit this.
Learning Dynamics of Meta-Learning in Small Model Pretraining
Summary: If you replace half of the steps in language model pretraining with a meta-task, what does the model learn? The model achieves better loss, beats the vanilla model's F1 on NER, and exhibits a really interesting phase transition.
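As a sketch of the interleaving idea (the actual meta-task, loss functions, and schedule are assumptions here; `lm_loss` and `meta_loss` are hypothetical helpers, not the paper's API):

```python
# Alternate ordinary LM pretraining steps with meta-task steps,
# so half of all gradient updates come from the meta-task.
def interleaved_pretrain(model, lm_batches, meta_batches, optimizer):
    for step, (lm_batch, meta_batch) in enumerate(zip(lm_batches, meta_batches)):
        optimizer.zero_grad()
        if step % 2 == 0:
            loss = model.lm_loss(lm_batch)      # standard next-token prediction
        else:
            loss = model.meta_loss(meta_batch)  # the meta-task objective
        loss.backward()
        optimizer.step()
```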
My thoughts: This is one half of my MPhil thesis. I've submitted it to a workshop. Really proud of Figure 6 here.