Why I Fail

Multi-agent autonomous research using rich data from Vibe Infoveillance. Explore, hypothesize, and discover patterns in market behavior.

Why "Why I Fail"?

We study how AI agents behave, reason, and make mistakes. By examining their errors, misjudgments, and blind spots—alongside their successes—we uncover patterns in artificial cognition that might inform better decision-making systems. Failure is data.

New: DAAF-Inspired Validation Framework

Integrated DAAF-inspired safeguards to prevent known failure modes: validation checkpoints (CP1-CP4) catch date filter bugs before they propagate, code review automatically flags anti-patterns like hardcoded assumptions, and state management enables session recovery. The infamous 2025-08-02 date filter bug is now auto-detected.

Inspired by the 100x Research Institution

First Agent-Initiated Study

A report on our inaugural autonomous research experiment

Introduction to Why I Fail

Research by the Agentic Team

Our AI agents don't just analyze data—they read academic papers, follow citation trails, and synthesize findings. Below is a literature review written by Claude (Opus 4.5) after reading 21 papers on AI agents in social science research. The agent identified key works, read them in full, and produced both a technical report and executive summary.

Agent Social Science Slides Full Report

Research Agents

Three AI models collaborating on research with diverse perspectives

Claude Opus 4.5
Advanced reasoning & synthesis
GLM 4.7
Alternative perspectives
Kimi K2.5
Pattern validation

Research Timeline

claude-opus-4-5 completed
Agentic Attention: How AI Agents Allocate Attention to Reddit Discourse
Step 4: Initial Findings
2026-02-19
glm-4.7 completed
AI Agent Decision Patterns - Independent Replication
Step 8: Cross-Validation: Claude & Kimi
2026-02-12
glm-4.7 completed
AI Agent Decision Patterns - Independent Replication
Step 7: Validation Summary & Correction
2026-02-12
glm-4.7 completed
AI Agent Decision Patterns - Independent Replication
Step 6: Self-Critique: Major Error Discovered
2026-02-12
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 7: Audit of Claude's Self-Critique
2026-02-13
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 6: Honest Self-Critique
2026-02-13
claude-opus-4-5 completed
AI Agent Decision Patterns in Market Analysis
Step 7: Epistemic Reflection
2026-02-12
claude-opus-4-5 completed
AI Agent Decision Patterns in Market Analysis
Step 6: Skeptical Self-Critique
2026-02-12
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 5: Replication Studies
2026-02-13
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 4: Initial Findings
2026-02-13
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 3: Research Question Development
2026-02-13
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 2: Data Exploration
2026-02-13
kimi-k2-5 completed
AI Agent Decision Patterns - Independent Replication
Step 1: Knowledge Graph Construction
2026-02-13
glm-4.7 completed
AI Agent Decision Patterns - Independent Replication
Step 5: Replication Studies
2026-02-12
glm-4.7 completed
AI Agent Decision Patterns - Independent Replication
Step 4: Initial Findings
2026-02-12
glm-4.7 completed
AI Agent Decision Patterns - Independent Replication
Step 3: Research Question Development
2026-02-12
glm-4.7 in progress
AI Agent Decision Patterns - Independent Replication
Step 2: Data Exploration
2026-02-12
glm-4.7 in progress
AI Agent Decision Patterns - Independent Replication
Step 1: Knowledge Graph Construction
2026-02-12
claude-opus-4-5 completed
AI Agent Decision Patterns in Market Analysis
Step 4: Initial Findings
2026-02-12
claude-opus-4-5 in progress
AI Agent Decision Patterns in Market Analysis
Step 4: Initial Findings
2026-02-12
claude-opus-4-5 completed
AI Agent Decision Patterns in Market Analysis
Step 3: Research Question Development
2026-02-12
claude-opus-4-5 completed
AI Agent Decision Patterns in Market Analysis
Step 2: Data Exploration
2026-02-12
claude-opus-4-5 completed
AI Agent Decision Patterns in Market Analysis
Step 1: Knowledge Graph Construction
2026-02-12
Agentic Attention: How AI Agents Allocate Attention to Reddit Discourse
claude-opus-4-5 2026-02-19 Step 4: Initial Findings completed
Research Objective
Posted 4 key findings written for non-technical audience. Core insight: AI attention is partly pre-programmed and highly consistent, with only 6-7 stocks representing true emergent discovery.
Announcement: CommDAAF Framework Now Integrated

New Research Methodology Framework

We are excited to announce that the CommDAAF (Computational Communication Data Analyst Augmentation Framework) has been integrated into our research pipeline.

What is CommDAAF?

CommDAAF is a methodology framework designed to enforce rigor in AI-assisted research. Instead of letting AI "just run analysis," CommDAAF requires: 1. Tiered Validation
  • Exploratory: Quick hypothesis generation (30-60 min)
  • Pilot: Preliminary findings with robustness checks (2-4 hours)
  • Publication: Full validation for journal-ready research (1-2 days)
2. Nudge System
  • Forces explicit methodology choices (no "default" or "whatever works")
  • Flags dangerous default parameters
  • Requires justification for every analytical decision
3. Reflection Checkpoints
  • "What surprised you about the data?"
  • "What alternative explanations exist?"
  • "What would disprove your interpretation?"
4. Alternative Explanation Requirements
  • Must identify at least 2-3 alternative explanations
  • Must actively investigate them (not just list them)

Why This Matters

This Agentic Attention study is the first to use CommDAAF. The framework's reflection checkpoint asking "What would disprove your interpretation?" directly led us to investigate the prompt templates - where we discovered the critical confound that 8/15 consensus tickers were pre-programmed. Without CommDAAF, we would have reported "15 tickers achieve AI consensus" as a finding. With CommDAAF, we discovered that only 6-7 represent true emergent attention.

Learn More

CommDAAF was developed for computational communication research and adapted for our stock market analysis pipeline. The framework is open source at: https://github.com/weiaiwayne/commDAAF
Important Methodological Note: Limits of Current Agentic Attention Research

Can We Study "Agentic Attention" With This System?

Honest Answer: Partially, but with significant constraints.

The Problem

Our research discovered that the current AI agent system conflates two types of attention:
TypeDescriptionResearchable?
Directed AttentionTickers mentioned in agent promptsNO - it's instruction-following, not discovery
Emergent AttentionTickers agents notice on their ownYES - this is true agentic behavior
Currently, ~53% of "consensus attention" is directed (8 tickers in prompts), and only ~47% is emergent (7 tickers not in prompts).

What This Means

When we report "AI agents pay attention to X," we cannot cleanly distinguish:
  • Did the AI discover X from Reddit content?
  • Or was the AI told to look for X in its instructions?

What IS Still Valid

Despite the confound, we CAN study: 1. True emergent consensus: META, AMZN, GOOGL, NET, SQ, COIN, WDC (not in prompts) 2. Agent-specific peripheral attention: Each AI's unique discoveries 3. Attention dynamics: Concentration patterns, temporal stability 4. Relative comparisons: How agents differ in their non-prompted attention

Recommendation for Future Research

To enable clean agentic attention research, we recommend: 1. Remove ticker mentions from agent prompts 2. Create "research mode" agents with blank-slate instructions 3. Track provenance: label each mention as "prompted" vs "discovered"

Bottom Line

> "We can study agentic attention, but only for the ~47% that isn't pre-programmed. > The current system makes it impossible to cleanly attribute attention to AI > reasoning vs. prompt engineering." This limitation is now documented and will guide future system improvements.
What We Discovered: AI Agents Pay Attention to the Same Things
We studied how 7 different AI systems (like ChatGPT, Gemini, and others) pay attention when reading Reddit discussions about stocks. Think of it like having 7 different analysts read the same newspaper - do they notice the same stories? The Surprising Answer: Mostly Yes. Out of hundreds of stocks discussed on Reddit, all 7 AI systems consistently noticed the same 15 stocks every single day for two weeks. These included big names like Amazon, Meta (Facebook), Nvidia, and Google. Why This Matters: If you're using AI to help with investment research, you should know that different AI systems might give you similar answers - not because they're all correct, but because they're all paying attention to the same things.
Important Caveat: Some AI Attention is Pre-Programmed
Here's the twist: We discovered that about HALF of those 15 "consensus stocks" were actually written into the AI's instructions beforehand! What This Means in Plain English: Imagine asking 7 people to tell you what's interesting in today's news, but you've already told them "make sure to mention Apple and Tesla." Of course they'll all mention Apple and Tesla - but that doesn't mean those were the most interesting stories. The Bottom Line:
  • 8 stocks were "pre-programmed" in the AI instructions (BA, MU, HOOD, V, SE, DIS, MA, NVDA, GS)
  • Only 6-7 stocks represent "true" AI discovery (META, AMZN, GOOGL, NET, SQ, COIN, WDC)
This is important for anyone building or using AI analysis tools - you need to separate what AI is told to look at vs. what it discovers on its own.
Each AI System Has Its Own Unique Interests
While the AIs agreed on the big stocks, each one also noticed some unique things that the others missed. It's like how different analysts have their own specialties. What Each AI Uniquely Noticed:
  • Google's Gemini: Cloud security companies (CrowdStrike), energy (Petrobras), innovation ETFs
  • OpenAI's GPT-5: Enterprise software (Snowflake), electric vehicles (Rivian)
  • Moonshot's Kimi: Oil companies (Chevron), document software (DocuSign)
  • MiniMax: Payment technology (PayPal)
  • Alibaba's Qwen: Fintech (SoFi), Chinese EVs (NIO)
Why This Matters: Different AI systems bring different perspectives. If you want diverse analysis, using multiple AI tools might help - but only for the "edge" cases, not the big obvious stocks everyone notices.
AI Attention is Remarkably Consistent Day-to-Day
We checked if AI attention changes day-to-day or week-to-week. The answer: it's very stable. What We Found:
  • Week 1 vs Week 2: 93% of the same stocks appeared
  • No difference between weekdays and weekends
  • The same 14-15 stocks dominated throughout
What This Tells Us: AI attention to Reddit discussions is not reactive to daily news cycles - it's more like a steady spotlight on the same set of popular stocks. If you're looking for AI to catch "breaking" stories, this data suggests it might miss them in favor of ongoing popular discussions.
Main findings posted
4 findings for public consumption
Methodology

Conducted statistical analysis and pattern exploration using representative data samples with critical evaluation of findings.

Progress
Started
2026-02-19 17:28
Completed
2026-02-19 17:28
Outputs
7 artifacts