I am a first-year PhD candidate at the University of Edinburgh, based in the Concepts & Causality Lab and supervised by Tadeg Quillien. My research, at the intersection of computational cognitive science and explainable AI, investigates the alignment of causal reasoning in humans and language models. My current work explores whether language models faithfully replicate human judgments in causal reasoning tasks, building on earlier work in experimental psychology on the distinction between process and dependency theories of causal attribution. I also examine the mechanisms underlying causal reasoning in language models, using activation patching techniques from mechanistic interpretability to identify the circuits involved. Finally, I am interested in the communication of causal knowledge, as modeled via RL agents.
I previously received my Master's degree in Brain & Cognitive Sciences from the University of Amsterdam. During my MSc research, I investigated the representational alignment between language models and neural activity in the brain, as measured via fMRI. Before that, I completed my Bachelor's degree in Computational Neuroscience at the University of Southern California.