Every factual claim carries exactly one label. The label is the honesty: most writing about "the algorithm" blurs these five things together, and the blur is where folklore comes from.
| label | what it means |
|---|---|
| CODE-CURRENT | Verifiable in the pinned commit of a live repository. The strongest tier — you can click through to the exact lines. |
| CODE-HISTORICAL | Was real in released code, but has since been superseded or its repository abandoned. Always written in the past tense. |
| OFFICIAL-STATED | X or xAI said it (engineering posts, documentation) but it is not visible in code. |
| EMPIRICAL | Measured or observed — by us or by credible third parties — and labeled as measurement, not code. |
| UNKNOWN | The honest tier: the code provably does not answer this. We say so instead of guessing. |
Repositories move. A claim that was true at one commit can be false at the next — so every code-tier claim here is pinned to a full commit SHA, with a permalink to the exact file and lines. When we say the spam threshold is 0.4, the citation isn't "the GitHub repo"; it is one immutable line at one immutable commit. We also keep complete archival mirrors of every cited repository, so claims remain checkable even if history upstream is rewritten or removed.
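As a rough illustration of what "pinned" means in practice, the sketch below builds a line-level permalink and refuses anything shorter than a full commit SHA. The class name `PinnedCitation`, its fields, and the example owner, repository, and path are hypothetical, chosen only for illustration; the URL shape is GitHub's standard `blob/<sha>/<path>#L<start>-L<end>` form.

```python
import re
from dataclasses import dataclass

# A full SHA-1 commit id is 40 lowercase hex characters.
# Abbreviated SHAs and branch names are rejected because they can move or collide.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class PinnedCitation:
    """Hypothetical record for one code-tier citation."""
    owner: str
    repo: str
    sha: str
    path: str
    start_line: int
    end_line: int

    def __post_init__(self) -> None:
        if not FULL_SHA.fullmatch(self.sha):
            raise ValueError("citation must use a full 40-character commit SHA")
        if not 1 <= self.start_line <= self.end_line:
            raise ValueError("line range must be positive and ordered")

    def permalink(self) -> str:
        """Return a GitHub URL that points at exact lines of an exact commit."""
        anchor = f"#L{self.start_line}"
        if self.end_line != self.start_line:
            anchor += f"-L{self.end_line}"
        return (
            f"https://github.com/{self.owner}/{self.repo}"
            f"/blob/{self.sha}/{self.path}{anchor}"
        )


# Placeholder values only; not a real repository or commit.
citation = PinnedCitation("example-org", "example-repo", "a" * 40, "src/filter.rs", 10, 12)
link = citation.permalink()
```

Rejecting short SHAs at construction time is one way to keep a branch name or an abbreviated hash from slipping into a citation, since neither is guaranteed to keep pointing at the same lines.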
X has committed to updating the open-source algorithm roughly every four weeks. An automated tracker watches the repository; when a new release lands, every code-tier claim is re-checked against the changed files. Claims whose cited files were touched are re-read by a human before being re-pinned — and until that happens, the affected pages display a notice saying exactly that. If you ever see the re-verification banner on a page here, that is the system working, not failing. Absence claims ("this module is not in the release") are re-verified whole-tree every cycle, because a release can invalidate them by adding something.
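The triage step described above can be sketched as follows. This is a minimal illustration, not the tracker itself: the names `CodeClaim`, `Kind`, `Triage`, and `triage` are hypothetical, and the assumption that claims with untouched files can be carried forward to the new commit without a human re-read is ours for the purpose of the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    PRESENCE = "presence"  # cites specific files and lines
    ABSENCE = "absence"    # asserts something is not in the release


@dataclass(frozen=True)
class CodeClaim:
    """Hypothetical record for one code-tier claim."""
    claim_id: str
    kind: Kind
    cited_paths: frozenset = frozenset()


@dataclass
class Triage:
    human_reread: list = field(default_factory=list)      # cited file was touched
    whole_tree_check: list = field(default_factory=list)  # absence claims, every cycle
    carry_forward: list = field(default_factory=list)     # assumed safe to re-pin


def triage(claims, changed_paths) -> Triage:
    """Sort claims by the kind of re-verification a new release requires."""
    changed = set(changed_paths)
    result = Triage()
    for claim in claims:
        if claim.kind is Kind.ABSENCE:
            # A release can invalidate an absence claim by adding a file
            # anywhere, so the diff alone is never enough.
            result.whole_tree_check.append(claim.claim_id)
        elif claim.cited_paths & changed:
            result.human_reread.append(claim.claim_id)
        else:
            result.carry_forward.append(claim.claim_id)
    return result


# Placeholder claims and paths, for illustration only.
claims = [
    CodeClaim("spam-threshold", Kind.PRESENCE, frozenset({"src/filter.rs"})),
    CodeClaim("ranking-stage", Kind.PRESENCE, frozenset({"src/rank.rs"})),
    CodeClaim("weights-module-absent", Kind.ABSENCE),
]
outcome = triage(claims, ["src/filter.rs", "README.md"])
```

Here `outcome.human_reread` holds the claims whose pages would show the re-verification notice until a person has re-read the cited lines.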
We will never present a guess with the confidence of a citation. We will never quote numbers that are not in the open — the famous 2023 engagement weights are labeled historical because they are; the current weights are referenced through a module that is absent from every public release, and anyone quoting them is guessing. And when the released code itself carries visible redaction seams — emptied sets, conditions with clauses stripped — we document the seams rather than inventing what was behind them.
If any claim here is wrong, we want to know more than you want to tell us. Every claim's permalink makes checking our work a one-click affair. Corrections: [email protected].