My research interests are LLM interpretability and alignment, Bayesian Theory of Mind, and, more broadly, human-centered AI informed by cognitive science.