How does X's spam classifier work?
Since the May 2026 release, X has run a Grok-based content classifier on posts themselves, and the code shows something few would have guessed: the released spam screen specifically targets replies from low-follower accounts, uses a vision model (images count), and returns a hard yes/no at near-deterministic settings. What it does not show are the actual spam criteria: the prompt module containing them is absent from the release.
What the code says
The classifier lives at grox/classifiers/content/spam.py in the open-source
repository. Its construction is six lines, and four of them are findings:
```python
class SpamEapiLowFollowerClassifier(ContentClassifier):
    def __init__(self, model_name: ModelName = ModelName.VLM_PRIMARY):
        vlm_config = grox_config.get_model(model_name)
        vlm_config.temperature = 0.000001
        vlm = VisionSampler(GrokModelConfig(**vlm_config.model_dump()))
        super().__init__(categories=[ContentCategoryType.SPAM_COMMENT], llm=vlm)
```
Read it slowly, because each token is doing work. The class name says who this screen is
for: low-follower accounts get a dedicated spam pathway.
The sampler is a VisionSampler, a vision-language model, which means the images in a
post are inside the classifier's field of view, not just the text. The temperature is set to
one-millionth: this is not a creative model having opinions, it is an effectively deterministic gate. And the
category being scored is SPAM_COMMENT, which by its name covers replies, not just original posts.
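To see why a temperature of one-millionth behaves like a gate, here is a minimal sketch of temperature-scaled softmax. The two logits are made-up values for illustration, not numbers from the release, and a real serving stack can still show small run-to-run differences for reasons unrelated to temperature.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into sampling probabilities at a given temperature."""
    # Subtract the largest logit first so the exponentials cannot overflow.
    top = max(logits)
    weights = [math.exp((x - top) / temperature) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical logits for two candidate outputs (illustrative values only).
logits = [2.0, 1.5]

at_one = softmax_with_temperature(logits, 1.0)        # ordinary sampling
at_tiny = softmax_with_temperature(logits, 0.000001)  # the value set in spam.py

print(at_one)   # both options keep meaningful probability
print(at_tiny)  # all probability collapses onto the top option
```

At temperature 1.0 the weaker option would still be sampled a fair share of the time. At 0.000001 its probability underflows to zero, so sampling reduces to picking the top-scoring option.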
Further down, the verdict logic is binary. The model returns a JSON decision; the code maps it
to a score of exactly 1.0 or 0.0 — there is no partial spamminess in this
classifier:
```python
is_spam = decision == "spam"
score = 1.0 if is_spam else 0.0

if is_spam:
    logger.info(f"Spam found for low follower user: {post.id}")
```
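The excerpt starts after `decision` has already been extracted, so the parsing step is not visible here. As a rough sketch of the whole mapping, assuming the model replies with a JSON object carrying a `decision` field (the key name and the fallback for malformed output are assumptions, not taken from the release):

```python
import json

def score_from_response(raw: str) -> float:
    """Map a model's JSON reply to a binary spam score."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Assumption: output that is not valid JSON is treated as not spam.
        return 0.0
    if not isinstance(payload, dict):
        return 0.0
    decision = payload.get("decision")  # hypothetical key name
    # Same comparison as the released code: only the exact string "spam" counts.
    is_spam = decision == "spam"
    return 1.0 if is_spam else 0.0

print(score_from_response('{"decision": "spam"}'))
print(score_from_response('{"decision": "not_spam"}'))
```

Because the comparison is an exact string match, any other value, including a differently cased "Spam", lands on 0.0 in this sketch.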
Signal by signal
| In the code | In plain English | Where xDoctor surfaces it |
|---|---|---|
| LowFollower pathway | Small and new accounts get a dedicated spam screen — the bar is different for you than for a million-follower account. | Checkup · Flagged Posts |
| SPAM_COMMENT category | Your replies are being screened, not just your original posts. Reply behavior is a first-class spam surface. | Checkup · Promotional |
| VisionSampler | Images count. A spammy screenshot or promo graphic in a reply is inside the model's judgment, even with innocent text. | Checkup · Brand Safety |
| score = 1.0 or 0.0 | It is a gate, not a dial — a reply is judged spam or it is not, deterministically, every time. | — |
What the code doesn't say
The actual spam criteria, meaning the system prompt the model applies to each post, are imported
from grox.prompts.template as SpamSystemLowFollower. And the entire
grox/prompts/ module is absent from the public release. The machinery is open; the
rules it enforces are withheld — the same pattern as the weighted scorer's missing numeric
parameters. Anyone publishing a list of "X's spam rules" is not reading them from this code.
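If you want to check the absence yourself against a local checkout, a small sketch like the following maps a dotted module name to the paths it would occupy on disk. The helper name and the checkout path are hypothetical, and the check only looks at the file system, it does not import anything.

```python
from pathlib import Path

def module_present(repo_root: Path, dotted_name: str) -> bool:
    """Return True if a dotted module name resolves to a file or package under repo_root."""
    relative = Path(*dotted_name.split("."))
    as_file = repo_root / relative.with_suffix(".py")   # e.g. grox/prompts/template.py
    as_package = repo_root / relative                   # e.g. grox/prompts/template/
    return as_file.is_file() or as_package.is_dir()

# Hypothetical location of a local clone; adjust to wherever yours lives.
repo = Path("path/to/checkout")
for name in ("grox.classifiers.content.spam", "grox.prompts.template"):
    print(name, module_present(repo, name))
```

On a checkout that matches the description above, the first name would resolve and the second would not.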
What to do about it
You can't read the withheld prompt, but you can read the architecture, and the architecture tells you where to look in your own history: replies, from when your account was small, containing promotional imagery or giveaway mechanics. That is precisely the slice that the xDoctor Checkup surfaces triage, and why the Promotional surface exists at all.
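If you keep an export of your own posts, the slice described above can be pulled out with a simple filter. This is a sketch under stated assumptions: the `Post` fields are a hypothetical shape for your export, and the follower cutoff is a placeholder, since the excerpts quoted here do not show what count qualifies as "low follower".

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    is_reply: bool
    has_media: bool
    followers_at_post_time: int

# Placeholder value; the real threshold is not visible in the quoted code.
LOW_FOLLOWER_CUTOFF = 1000

def posts_to_review(posts, cutoff=LOW_FOLLOWER_CUTOFF):
    """Return replies with media that were posted while the account was under the cutoff."""
    return [
        p for p in posts
        if p.is_reply and p.has_media and p.followers_at_post_time < cutoff
    ]

history = [
    Post("a1", is_reply=True, has_media=True, followers_at_post_time=120),
    Post("a2", is_reply=False, has_media=True, followers_at_post_time=120),
    Post("a3", is_reply=True, has_media=False, followers_at_post_time=120),
    Post("a4", is_reply=True, has_media=True, followers_at_post_time=5000),
]
print([p.post_id for p in posts_to_review(history)])
```

The filter only narrows down what to reread. Whether any given reply would actually be judged spam depends on the withheld prompt.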