Methodology

How predictions are extracted, graded, and verified — every step in the open.

1️⃣ Data sources

Every prediction comes from publicly available content by finance YouTubers and Weibo influencers. Intake runs in three steps:

  1. Download channel-published audio + Weibo text (internal analysis only; not redistributed)
  2. Transcribe with Whisper or use existing subtitles
  3. Multi-model extraction identifies statements that combine a ticker, direction/target, and horizon

Every prediction card links back to the source video or post, with a timestamp, so anyone can verify it independently.

2️⃣ What counts as a "stock prediction"

A statement qualifies only if it has ALL of:

  • Specific instrument (ticker, ETF, index, crypto, commodity)
  • Direction or price target (long/short/neutral OR an explicit target)
  • Time horizon (a date or "by year-end" / "within 6 months" etc.)
  • Verifiability (can be checked against historical market data)
  • Attribution (the host's own forward call, not relay of someone else's position)
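The five criteria above can be read as a single all-or-nothing check. The following is a minimal sketch, not the production filter: the `Candidate` class, its field names, and the `qualifies` function are hypothetical, and the `verifiable` and `own_call` flags are assumed to be set by an earlier extraction step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical candidate row; field names are illustrative only."""
    instrument: Optional[str]      # e.g. "NVDA" or "BTC-USD"
    direction: Optional[str]       # "long", "short", or "neutral"
    target_price: Optional[float]  # explicit price target, if any
    horizon: Optional[str]         # a date or a phrase like "within 6 months"
    verifiable: bool               # can be checked against historical market data
    own_call: bool                 # the host's own forward call, not a relay


def qualifies(c: Candidate) -> bool:
    """Return True only if all five criteria are met."""
    has_instrument = bool(c.instrument)
    has_direction_or_target = (
        c.direction in {"long", "short", "neutral"} or c.target_price is not None
    )
    has_horizon = bool(c.horizon)
    return (
        has_instrument
        and has_direction_or_target
        and has_horizon
        and c.verifiable
        and c.own_call
    )


ok = Candidate("NVDA", "long", 150.0, "within 6 months", True, True)
vague = Candidate(None, "long", None, None, False, True)  # "tech stocks will do well"
print(qualifies(ok), qualifies(vague))
```

A single missing field is enough to reject the statement, which mirrors the "ALL of" wording.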

Excluded:

  • Vague claims without a ticker ("tech stocks will do well")
  • Conditional predictions ("if the Fed pivots, then…") — the trigger isn't verifiable
  • Pure recommendations ("you should buy gold") with no forward claim
  • Doomsday calls without a horizon ("the dollar will collapse")

3️⃣ Tiering by horizon

Each prediction is bucketed by the length of its horizon:

  • Short (<60 days): weekly-to-monthly technical calls
  • Mid (60-360 days): quarterly-to-annual fundamental / valuation calls
  • Long (>360 days): multi-year compounders or macro themes

Tiering makes cross-pundit comparison fair: a technician's short-term hit rate isn't apples-to-apples with a value investor's long-term rate.
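As an illustration, the tier boundaries can be written as a small function. This is a sketch under assumptions: the names `horizon_days` and `horizon_tier` are hypothetical, and the horizon is assumed to be counted in calendar days from the publish date to the horizon date.

```python
from datetime import date


def horizon_days(publish: date, horizon: date) -> int:
    """Calendar days between publish date and horizon date (assumed convention)."""
    return (horizon - publish).days


def horizon_tier(days: int) -> str:
    """Map a horizon length in days to the tier names used above."""
    if days < 0:
        raise ValueError("horizon date precedes publish date")
    if days < 60:
        return "short"
    if days <= 360:
        return "mid"
    return "long"


days = horizon_days(date(2024, 1, 15), date(2024, 7, 15))
print(days, horizon_tier(days))
```

Hit rates would then be compared only among predictions that share a tier.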

4️⃣ Strict model consensus

Every candidate prediction must pass strict multi-model validation before publishing.

  1. Candidate extraction: transcript analysis emits candidate rows with ticker / direction / target / horizon.
  2. Primary validator path: when available, Claude Opus + OpenAI GPT + Vertex Gemini run a two-round strict consensus workflow.
  3. Accepted fallback: when Claude is unavailable, OpenAI GPT + Vertex Gemini vote independently and a candidate is published only if both vote KEEP.
  4. Audit trail: each final batch records its consensus version, such as 2v-strict-gpt-gemini, so fallback-validated data is traceable.

This design reduces single-model bias while keeping the launch pipeline usable when one vendor path is unavailable.
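The fallback rule in step 3 amounts to a unanimous vote. Below is a minimal sketch of that rule alone; it does not model the two-round primary workflow, whose details are not described here. The function names, the `"gpt"` and `"gemini"` keys, and the `"DROP"` label are assumptions made for illustration; only `KEEP` and the version tag come from the text above.

```python
from typing import Dict, List, Mapping

FALLBACK_VERSION = "2v-strict-gpt-gemini"  # version tag quoted in step 4
REQUIRED_VALIDATORS = ("gpt", "gemini")    # hypothetical keys for the two models


def strict_keep(votes: Mapping[str, str]) -> bool:
    """True only if every required validator voted KEEP."""
    return all(votes.get(model) == "KEEP" for model in REQUIRED_VALIDATORS)


def validate_batch(batch: Mapping[str, Mapping[str, str]]) -> Dict[str, object]:
    """Return the published candidate ids plus the consensus version for auditing."""
    published: List[str] = [cid for cid, votes in batch.items() if strict_keep(votes)]
    return {"consensus_version": FALLBACK_VERSION, "published": published}


batch = {
    "cand-1": {"gpt": "KEEP", "gemini": "KEEP"},
    "cand-2": {"gpt": "KEEP", "gemini": "DROP"},  # one dissent blocks publication
    "cand-3": {"gpt": "KEEP"},                    # a missing vote also blocks it
}
print(validate_batch(batch))
```

Treating a missing vote as a rejection keeps the rule strict: a candidate is never published on the strength of a single model.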

5️⃣ Automated market verification

When the horizon expires, an automated job pulls price data via yfinance and computes:

  • Entry price: closing price on the publish date of the source video or post
  • Exit price: closing price on the horizon date
  • P/L %: (exit − entry) / entry (sign-flipped for short calls)
  • Benchmark alpha: P/L difference vs SPY over the same period

"Simulated P/L" is a stylized backtest: each call is treated as a $1 position, opened on the publish date and closed on the horizon date. It is used for accuracy scoring only and is not investment advice.
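The P/L and alpha arithmetic can be sketched in a few lines. This example uses plain numbers rather than a live yfinance call, and it rests on assumptions: the function names are hypothetical, only long and short calls are handled (how a neutral call is scored is not specified here), and the benchmark is taken to be a long SPY position over the same dates.

```python
def pl_fraction(entry: float, exit_price: float, direction: str) -> float:
    """(exit - entry) / entry, sign-flipped for short calls."""
    if entry <= 0:
        raise ValueError("entry price must be positive")
    if direction not in {"long", "short"}:
        raise ValueError("only long and short calls are handled in this sketch")
    raw = (exit_price - entry) / entry
    return -raw if direction == "short" else raw


def benchmark_alpha(call_pl: float, spy_entry: float, spy_exit: float) -> float:
    """Call P/L minus the return of holding SPY long over the same period."""
    return call_pl - pl_fraction(spy_entry, spy_exit, "long")


# A long call that moves from 100 to 110 while SPY moves from 400 to 420.
call_pl = pl_fraction(100.0, 110.0, "long")
alpha = benchmark_alpha(call_pl, 400.0, 420.0)
print(f"P/L {call_pl:.2%}, alpha {alpha:.2%}")
```

In this example the call gains 10% while SPY gains 5%, so the alpha is 5 percentage points. A short call over the same 100 to 110 move would score a 10% loss.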

6️⃣ Verdict labels

  • 🎯 Hit (Bullseye): direction correct, target reached (or clearly profitable)
  • 🤏 Partial: right direction but magnitude or timing off-target
  • 💸 Miss: wrong direction, or target not reached and horizon passed
  • 🔮 Pending: horizon hasn't expired yet

"Miss" judges the prediction, not the person.

⚠️ Known limitations

  • Transcription has errors, especially mixed-language speech
  • LLM judgment on "is this a prediction?" has edge cases — rhetorical phrasing can be miscategorized
  • "$1 equal-weight" backtest ignores fees, slippage, stops, rebalancing
  • Close-price entry/exit may differ from real execution
  • Sample size grows over time — early-period coverage is sparse

All of these are open to correction. See 📮 Corrections & Contact.

Maintained by · Last reviewed: · Errors: contact@truthtracking.org