I am an AI Researcher at Labelbox, working on model evaluation and reasoning.

I earned my PhD in Computer Science from the University of Arizona, advised by Prof. Mihai Surdeanu. My research interests include Machine Learning, Large Language Models (LLMs), Natural Language Processing, and AI Safety. I am particularly interested in stress-testing the “alignment” of LLMs.

My PhD dissertation is the first to identify the issue of data contamination (data leakage) in LLMs, that is, scenarios where training data overlaps with evaluation data. I developed several methods to detect and estimate contamination in fully black-box LLMs. My PhD research received media coverage and earned several awards, including the Outstanding Graduate Scholarship and the Galileo Circle Scholarship.

Previously, I interned at Google Cloud AI Research, Walmart Global Tech, and Harvard Medical School.

Selected Publications

Towards Compute-Optimal Many-Shot In-Context Learning
Shahriar Golchin, Yanfei Chen, Rujun Han, Manan Gandhi, Tianli Yu, Swaroop Mishra, Mihai Surdeanu, Rishabh Agarwal, Chen-Yu Lee, Tomas Pfister
COLM 2025 | Paper | Poster

Memorization in In-Context Learning
Shahriar Golchin, Mihai Surdeanu, Steven Bethard, Eduardo Blanco, Ellen Riloff
arXiv 2025 | Paper

Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models
Shahriar Golchin, Mihai Surdeanu
TACL — ACL 2025 | Paper | Poster | Video | Media

Grading Massive Open Online Courses Using Large Language Models
Shahriar Golchin, Nikhil Garuda, Christopher Impey, Matthew Wenger
COLING 2025 | Paper

Time Travel in LLMs: Tracing Data Contamination in Large Language Models
Shahriar Golchin, Mihai Surdeanu
ICLR 2024 — Spotlight 🌟 (notable top 5%) | Paper | Poster | Video | Media

Do not Mask Randomly: Effective Domain-Adaptive Pretraining by Masking In-Domain Keywords
Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour
ACL 2023 RepL4NLP | Paper

A Natural Language Processing Pipeline to Study Disparities in Cannabis Use and Documentation Among Children and Young Adults: A Survey of 21 Years of Electronic Health Records
Nazgol Tavabi, Marium Raza, Mallika Singh, Shahriar Golchin, Harsev Singh, Grant Hogue, Ata Kiapour
Nature Digital Medicine | Paper

Building Large-Scale Registries from Unstructured Clinical Notes Using a Low-Resource Natural Language Processing Pipeline
Nazgol Tavabi, James Pruneski, Shahriar Golchin, Mallika Singh, Ryan Sanborn, Benton Heyworth, Amir Kimia, Ata Kiapour
Artificial Intelligence in Medicine | Paper

Blog Posts

Reflections on NeurIPS 2025: Advancing Evaluation and Continual Learning in AI
Shahriar Golchin, Smit Modi, Stepan Tytarenko, Almas Abdibayev, Marc Wetter
Labelbox | Link

Service to the Field

Reviewer: AAAI 2026, NeurIPS {2025, 2024}, COLM 2025, ICLR 2025, ACL {2024, 2023}