
AI Agents in Social Science Research

A literature review of the emergent field

Written by Claude (Opus 4.5) after reading 21 papers

What Works

  • Platform testing: Simulating 500 AI personas to test social media algorithms before deployment
  • Theory-grounded prediction: Combining economic theory + LLM knowledge outperforms either alone
  • Agent architecture: Memory + reflection + planning creates believable behavior
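The memory + reflection + planning loop above can be sketched in a few lines. This is a minimal illustration of the architecture's control flow, not any paper's implementation; the class, method names, and the string-based "reflection" stand in for what would be LLM calls in a real system.

```python
# Hypothetical sketch of a memory + reflection + planning agent loop.
# In real systems, reflect() and replan() would call an LLM; here they
# use placeholder string operations to show the data flow only.
from dataclasses import dataclass, field


@dataclass
class Agent:
    memory: list[str] = field(default_factory=list)       # raw observation stream
    reflections: list[str] = field(default_factory=list)  # synthesized insights
    plan: list[str] = field(default_factory=list)         # current intended actions

    def observe(self, event: str) -> None:
        self.memory.append(event)

    def reflect(self, every: int = 3) -> None:
        # Periodically distill recent observations into a higher-level insight.
        if self.memory and len(self.memory) % every == 0:
            recent = self.memory[-every:]
            self.reflections.append("insight about: " + "; ".join(recent))

    def replan(self) -> None:
        # Plans are conditioned on reflections when available, else raw memory.
        context = self.reflections or self.memory
        self.plan = [f"act on: {c}" for c in context[-2:]]


agent = Agent()
for event in ["saw post A", "replied to B", "was upvoted", "saw post C"]:
    agent.observe(event)
    agent.reflect()
    agent.replan()
```

The key design point is the separation of stores: raw memory accumulates everything, reflection compresses it, and planning reads from the compressed layer first, which is what makes long-horizon behavior tractable.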

Critical Failures

  • Synthetic surveys unreliable: Mean responses match real data, but variances and regression coefficients do not
  • Hallucination in research: LLMs fail at factual accuracy and knowledge retrieval
  • Reproducibility issues: The same prompt yields different results across runs and model versions
  • Quality signals eroding: Well-written but weak research harder to detect
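The variance failure in the first bullet is easy to demonstrate with toy data. Below, a hypothetical "synthetic" survey clusters around the population mean while "real" respondents are more dispersed; the means agree, but variance-based statistics diverge sharply. The distributions and parameters are illustrative, not from any of the reviewed papers.

```python
# Toy demonstration (assumed parameters): synthetic survey responses that
# match the real mean but collapse toward it, understating variance.
import random
import statistics

random.seed(0)

# "Real" respondents: mean 3.0 on a 5-point scale, high dispersion.
real = [random.gauss(3.0, 1.2) for _ in range(1000)]

# "Synthetic" respondents: same mean, but answers cluster near it.
synthetic = [random.gauss(3.0, 0.3) for _ in range(1000)]

mean_gap = abs(statistics.mean(real) - statistics.mean(synthetic))
var_ratio = statistics.variance(real) / statistics.variance(synthetic)

print(f"mean gap:       {mean_gap:.3f}")   # small: averages look fine
print(f"variance ratio: {var_ratio:.1f}")  # large: dispersion is badly off
```

Any downstream estimate that depends on dispersion, such as a regression coefficient or a subgroup difference, inherits this distortion even though a means-only validation would pass.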

Key Insight

Execution vs. Accuracy

AI agents excel at execution and simulation but fail at factual accuracy and variance matching. They're useful collaborative tools, not autonomous researchers.

  • Theory-grounded design: not just prompt engineering
  • Human-in-the-loop: validation at critical steps
  • New quality standards: for AI-augmented work
  • Open infrastructure: shared benchmarks

The Collective Action Problem

Individual researchers gain roughly 50% productivity, but collectively the field may hollow out PhD training pipelines if AI replaces the hands-on work through which junior scholars learn.

  • 36-60% productivity gains across major preprint servers
  • AI replicates studies in ~1 hour vs. days of human labor
  • But: Training requires doing the work AI now automates

Full Technical Review

This is an executive summary. For the complete literature review, with detailed analysis of all 21 papers, their methodologies, and a comprehensive synthesis:

Download PDF Report