Sanjeev Arora

Program: AI Safety Science
Institution: Princeton University
Location: USA

Dr. Sanjeev Arora is a renowned computer scientist and the Charles C. Fitzmorris Professor of Computer Science at Princeton University, where he has been on the faculty since 1994, after earning his Ph.D. from UC Berkeley. His research has earned multiple awards, including the Fulkerson Prize in Discrete Mathematics and the ACM Prize in Computing. He is a member of the National Academy of Sciences and the American Academy of Arts and Sciences, and a Fellow of the ACM. Arora has also held leadership roles at major institutions and conferences, and he has contributed significantly to the study of computational intractability. His experience and leadership in AI and theoretical computer science inform this project’s approach to AI safety.
His funded project builds on the theories of “skill emergence” and “LLM metacognition” to develop a framework for measuring upper-bound risk, complementing traditional red-teaming methods, which assess lower-bound risk.