2026 Humanities and AI Virtual Institute Development Awards
Jan 3, 2026
Schmidt Sciences’ Humanities and AI Virtual Institute (HAVI) is supporting 11 global interdisciplinary research teams to explore collaborative AI interventions with the potential to catalyze major breakthroughs in the humanities. These short-term development projects, spanning disciplines from archaeology and dance to environmental humanities and colonial history, will demonstrate how new technical approaches can be successfully implemented at scale. Learn more about these projects below.
Understanding Human Storytelling through the Limits of Generative AI
Hoyt Long (U of Chicago, US), Ari Holtzman (U of Chicago, US), Richard Jean So (Duke University, US), Emily Wenger (Duke University, US)
A central paradox of generative AI models is their fundamental reliance on narrative training data and their proven inability to produce truly immersive stories. We tackle this paradox by designing experiments to steer generative models towards more “believable” outputs (e.g., grounded in tacit knowledge and causal logic). We use generative AI’s failures to elucidate questions about the art of storytelling, and use humanistic theory and expert evaluation to inform the development of AI technology for narrative world-building.
Tracing Enslaved People from Archival Records to the U.S. Census
Adam Rothman (Georgetown, US), Lisa Singh (Georgetown, US)
The United States Census did not identify enslaved people by name, even though they were counted for the purpose of apportioning political representation. This omission poses obstacles for historical, social science, and genealogical research. This project assesses whether and how AI tools can aid in matching the names of enslaved people drawn from historical archival material to records in a historical dataset compiled from the U.S. Census beginning in 1870.
AI Analysis of Small Data: Dating Akkadian Literary Texts
Nathan Wasserman (Hebrew U, IL), Barak Sober (Hebrew U, IL)
We address a central challenge in Mesopotamian studies: establishing a reliable chronology for undated Akkadian literary texts. We will develop interpretable neuro-symbolic language models that integrate statistical learning with explicit linguistic rules from Assyriology, mirroring how scholars track orthographic, morphological, lexical, and syntactic change. The project links humanities and AI by providing scholars with data-driven dating tools and advancing methods for learning from very limited data.
Unlocking Three Hundred Years of Real Data in Historical Spanish
Patricia Murrieta-Flores (Tec de Monterrey, MX), Dr Rodrigo Vega Sánchez (Lancaster U, UK), Dr Paloma Vargas Montes (Tec de Monterrey, MX), Dr Alexander Sánchez Díaz (Universidad de Alicante, ES), Dr Nayomi Kasthuri Arachchi (Tec de Monterrey, MX), Dr Eugenio Torres Flawía (Universidad de San Andrés (UDESA), AR), Arturo Loyola (Colmex, MX), Edna Brito Ramos (Colmex, MX), Gregorio Reyes (Tec de Monterrey, MX), Nicolas Malpic (Uni Andes, Colombia), Sarah Bryden (Columbia, USA), Carlos Gonzalez Mireles (Tec de Monterrey, MX), Javier Cortes (UNAM, MX), Francisco Cruz Rios (ENAH, MX)
The research develops a pipeline to process automated transcriptions of colonial documents in Spanish from the 16th to 18th centuries. The two-stage AI-based pipeline will correct errors in automatic text recognition and transform historical Spanish into modern Spanish. Trained primarily on Mexican corpora, the pipeline seeks to facilitate access, analysis, and reuse of colonial archives, promoting an open and reproducible framework with a decolonial orientation to recover centuries of historical information from Latin America.
Recovering Hidden Histories: Adapting HTR for Colonial Paraguayan Archives
Guillaume Candela (Cardiff University, UK), Patricia Murrieta-Flores (Tec de Monterrey, Mexico), Elizabeth Barriocanal Carisimo (Archivo Nacional de Asunción, Paraguay), David Jara (UNA, Paraguay), Cristian Daniel Zorilla Ortiz (UNA, Paraguay), Francisco Cruz Ríos (ENAH, Mexico), Adriana Lazcano (El Colegio de San Luis, Mexico), Arturo Loyola (Colmex, Mexico)
This project pioneers AI-powered Handwritten Text Recognition (HTR) for Paraguay’s colonial archives, recovering manuscripts documenting the lives of enslaved Indigenous and African peoples. Collaborating with the Archivo Nacional de Asunción, we adapt HTR models from the Fleets of New Spain project to navigate complex scripts and damaged documents. By producing open datasets and training local archivists, we demonstrate how humanistic inquiry into erased histories can drive equitable AI innovation and sustainable capacity building in the Global South.
Improving Motion Modeling with Dance Expertise for Archiving and Analysis
Harmony Bench (Ohio State, US), Kate Elswit (Royal Central Sch. of Speech and Drama, UK), Ashley Brown (Martha Graham School, US), Michael Neff (UC Davis, US), Michael Rau (Stanford, US), Tia-Monique Uzor (Royal Central Sch. of Speech and Drama, UK), Vita Berezina-Blackburn (Ohio State, US), Peter Broadwell (Stanford, US)
How can AI motion models capture the nuance, expertise, and history that dance artists hold in their bodies? This project brings together a team of dance historians, computer scientists, and Martha Graham Technique™ specialists to evaluate motion capture, monocular and multi-view video, and computer vision. Producing technical specifications for enhanced 2D and 3D motion modeling, we establish a foundation and agenda for multi-modal and AI-enabled research on dance history, choreographic preservation, and embodied legacies.
Talking to Plants: Expanding the Frontiers of Non-Discursive Communication
Emma Strubell (Carnegie Mellon University, US), John Bambery (Manship Artists Residency, US), Ernesto Gianoli (Tarleton State University, US), Damián Blasi (Pompeu Fabra University, Spain)
This project will explore the potential for AI to help mediate communication between humans and plants. We aim to establish a framework and proof-of-concept for human-plant communication, focusing on non-discursive information exchange through mechanical, electrical, and chemical signaling. Success will require expanding AI reasoning to non-symbolic modalities, as well as deepening our understanding of the human capacity for communication in the absence of images or language, highlighting our intrinsic connection to the natural world.
Ludus ex machina: studying ancient games through human-like AI play agents
Tim Penn (U of Reading, UK), James Goodman (U of Reading, UK), Summer Courts (U of Reading, UK), Eric Piette (UCLouvain, BE), Walter Crist (Leiden U, NL), Aloïs Rautureau (ENS Rennes, FR)
Traditional games AI prioritises optimal play, ignoring the social and emotional nuances of human play. This project will bridge the humanities and computational modelling by investigating “humanlike play” through the Roman game ludus duodecim scriptorum. By encoding non-optimal behaviours attested in Roman sources, such as risk-taking and social performance, into the Tabletop Games Framework, a games AI modelling platform, the project will create a new methodology for testing historical hypotheses around gaming.
ArchAIa: Foundation Models for Artifactual Understanding
Daphne Ippolito (Carnegie Mellon University, US), Shai Gordin (Ariel University, Israel), Michael Harrower (Johns Hopkins University, USA), Alexandra Karamitrou (University of Southampton, UK), Ivan Sipiran (University of Chile, CENIA, Chile), Nathan Wasserman (Hebrew University of Jerusalem, Israel), Gregory Heyworth (University of Rochester, USA), Jonathan Prag (University of Oxford, UK), Eric Kansa (Open Context)
Archaeological excavations are inherently destructive (any particular location can be excavated only once), which means careful documenting and cataloging are crucial components of the excavation process. Archaeologists have increasingly seen the value of creating digital databases which store records for hundreds of thousands of artifacts, annotated with text descriptions, photographs, measurements, stratigraphic information, provenance metadata, and more. Despite the richness of these datasets, their analysis remains largely manual and siloed, with different archives and excavations adopting incompatible taxonomies and metadata schemas. The goal of this project is to use modern AI techniques to make sense of this trove of data. We aim to allow archaeologists to ask complex, higher-order questions of the data and to build novel data exploration tools that enable insights that would be difficult or even impossible to arrive at via manual data inspection.
The Garden of Forking Prompts: Systematic Mapping of Narrative Space
Maria Antoniak (U of Colorado, Boulder, US), Melanie Walsh (U of Washington, US), Nora Benedict (U of Georgia, US)
The Garden of Forking Prompts uses the literary frameworks of Jorge Luis Borges to investigate the landscape of AI-generated fiction. Drawing on Borges’s explorations of infinite books, speculative narratives, and imagined texts, this project will develop computational approaches for mapping and understanding spaces of possible stories. The work will help to bridge literary theory and modern generative AI, producing new tools for exploring narrative possibility, interactive visualizations, and evaluation resources for AI-generated fiction.
If Small Data Could Talk
Idris Brewster (Kinfolk, US), Maria Antoniak (U of Colorado Boulder, US), Frank Leon Roberts (Amherst, US)
If Small Data Could Talk is a humanities-first AI research project that asks: how do cultural values travel through data? Using the writings of James Baldwin as a lens, the project will investigate how moral, aesthetic, and social values are represented, distorted, or erased within large language models. As AI systems increasingly serve as public interpreters of culture and history, this work will explore how a small, value-rich dataset rooted in Baldwin’s archive can reveal and reshape the ethical dynamics of machine learning.