How does X's spam classifier work?

Since the May 2026 release, X runs a Grok-based content classifier on posts themselves, and the code shows something nobody guessed: the released spam screen specifically targets replies from low-follower accounts, uses a vision model (so images count), and returns a hard yes/no at near-deterministic settings. What it does not show is the actual spam criteria: the prompt module containing them is absent from the release.

What the code says

The classifier lives at grox/classifiers/content/spam.py in the open-source repository. Its construction is six lines, and four of them are findings:

grox/classifiers/content/spam.py · L25–L30 @ 0bfc279

25  class SpamEapiLowFollowerClassifier(ContentClassifier):
26      def __init__(self, model_name: ModelName = ModelName.VLM_PRIMARY):
27          vlm_config = grox_config.get_model(model_name)
28          vlm_config.temperature = 0.000001
29          vlm = VisionSampler(GrokModelConfig(**vlm_config.model_dump()))
30          super().__init__(categories=[ContentCategoryType.SPAM_COMMENT], llm=vlm)

Read it slowly, because each token is doing work. The class name says who this screen is for: low-follower accounts get a dedicated spam pathway.

CODE-CURRENT · 0bfc279 · verified 2026-06-12

The released spam classifier is named SpamEapiLowFollowerClassifier — low-follower accounts get a dedicated spam-screening pathway, and a positive verdict is logged as 'Spam found for low follower user'.

xai-org/x-algorithm · grox/classifiers/content/spam.py, class definition (L25) and verdict log (L95) · as of the May 15, 2026 release

The model is a VisionSampler, a vision-language model, which means the images in a post are inside the classifier's field of view, not just the text. The temperature is set to one-millionth, which makes sampling effectively greedy: this is not a creative model having opinions but a near-deterministic gate. And the category being scored is SPAM_COMMENT: replies, not just original posts.

CODE-CURRENT · 0bfc279 · verified 2026-06-12

The spam classifier runs a Grok vision-language model (VisionSampler, ModelName.VLM_PRIMARY) at temperature 0.000001, scoring the SPAM_COMMENT content category — images are in scope, the verdict is deterministic, and replies are the screened surface.

xai-org/x-algorithm · grox/classifiers/content/spam.py, constructor (L26–L30) · as of the May 15, 2026 release
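To see why a temperature of 0.000001 behaves like a gate, here is a minimal sketch of temperature-scaled softmax. It assumes the sampler applies temperature in the standard way (dividing logits before the softmax), which the quoted lines do not show; the function name and the logits are illustrative, not taken from the release.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities for `logits` at the given temperature."""
    # Subtract the largest logit first so the exponentials cannot overflow.
    top = max(logits)
    scaled = [(x - top) / temperature for x in logits]
    weights = [math.exp(s) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical logits for three candidate tokens (assumption, not release data).
logits = [2.0, 1.9, 0.5]

# At temperature 1.0 the probability mass is spread across the candidates.
print(softmax_with_temperature(logits, 1.0))

# At temperature 0.000001 essentially all mass lands on the largest logit.
print(softmax_with_temperature(logits, 0.000001))
```

Even a 0.1 gap between the top two logits becomes a gap of 100,000 after scaling, so the runner-up's probability underflows to zero. Exact ties, or numerical differences in the serving stack, could still produce occasional variation, which is why "near-deterministic" is the safer reading.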

Further down, the verdict logic is binary. The model returns a JSON decision; the code maps it to a score of exactly 1.0 or 0.0 — there is no partial spamminess in this classifier:

grox/classifiers/content/spam.py · L91–L95 @ 0bfc279

91          is_spam = decision == "spam"
92          score = 1.0 if is_spam else 0.0
93
94          if is_spam:
95              logger.info(f"Spam found for low follower user: {post.id}")

CODE-CURRENT · 0bfc279 · verified 2026-06-12

The spam verdict is binary: the model's JSON decision is mapped to a score of exactly 1.0 (spam) or 0.0 (not spam) — a gate, not a graded signal.

xai-org/x-algorithm · grox/classifiers/content/spam.py, verdict mapping (L91–L92) · as of the May 15, 2026 release
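The verdict mapping above can be sketched as a standalone function. The JSON shape ({"decision": "spam"}), the function name, and the fallback for unparseable output are all assumptions made for illustration; only the string comparison and the 1.0 / 0.0 mapping come from the quoted lines.

```python
import json

def verdict_score(raw_response: str) -> float:
    """Map a model's JSON reply to a binary spam score."""
    # Assumed reply shape: {"decision": "spam"}. The real schema is not
    # confirmed by the quoted excerpt.
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        # Assumption: unparseable output is treated as "not spam".
        return 0.0
    decision = payload.get("decision") if isinstance(payload, dict) else None
    # Same comparison as the quoted lines: only the exact string "spam" counts.
    is_spam = decision == "spam"
    return 1.0 if is_spam else 0.0

print(verdict_score('{"decision": "spam"}'))
print(verdict_score('{"decision": "not_spam"}'))
```

In this sketch the comparison is an exact string match, so a reply such as "Spam" or "likely spam" would score 0.0 unless the decision is normalized earlier, in code the excerpt does not show.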

Signal by signal

In the code	In plain English	Where xDoctor surfaces it
LowFollower pathway	Small and new accounts get a dedicated spam screen — the bar is different for you than for a million-follower account.	Checkup · Flagged Posts
SPAM_COMMENT category	Your replies are being screened, not just your original posts. Reply behavior is a first-class spam surface.	Checkup · Promotional
VisionSampler	Images count. A spammy screenshot or promo graphic in a reply is inside the model's judgment, even with innocent text.	Checkup · Brand Safety
score = 1.0 or 0.0	It is a gate, not a dial — a reply is judged spam or it is not, deterministically, every time.	—

What the code doesn't say

The actual spam criteria, meaning the system prompt the model is judged against, are imported from grox.prompts.template as SpamSystemLowFollower, and the entire grox/prompts/ module is absent from the public release. The machinery is open; the rules it enforces are withheld, the same pattern as the weighted scorer's missing numeric parameters. Anyone publishing a list of "X's spam rules" is not reading them from this code.

UNKNOWN · 0bfc279 · verified 2026-06-12

The actual spam criteria are withheld: spam.py imports its system prompt (SpamSystemLowFollower) from grox.prompts.template, and the entire grox/prompts/ module is absent from the public release — the classifier machinery is open, the rules it enforces are not.

xai-org/x-algorithm · whole-tree absence check: grox/prompts/ does not exist at the pinned commit (import at spam.py L10) · as of the May 15, 2026 release
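An absence check of this kind can be reproduced against a local checkout. The sketch below is one possible approach, not the method used to verify the claim above: it parses a source file with Python's ast module and reports any absolute imports under a given package that have no matching file or directory in the tree. The function name and its parameters are hypothetical.

```python
import ast
from pathlib import Path

def unresolved_imports(repo_root: str, source_file: str, package: str) -> list[str]:
    """List `from <package>... import` modules in source_file that are missing from repo_root."""
    root = Path(repo_root)
    tree = ast.parse((root / source_file).read_text(encoding="utf-8"))
    missing = set()
    for node in ast.walk(tree):
        # Only absolute `from a.b import c` statements are inspected.
        if not isinstance(node, ast.ImportFrom) or node.level != 0 or not node.module:
            continue
        if node.module != package and not node.module.startswith(package + "."):
            continue
        rel = Path(*node.module.split("."))
        as_file = (root / rel).with_suffix(".py")  # e.g. grox/config.py
        as_dir = root / rel                        # e.g. grox/prompts/
        if not as_file.exists() and not as_dir.is_dir():
            missing.add(node.module)
    return sorted(missing)
```

Called as unresolved_imports("path/to/checkout", "grox/classifiers/content/spam.py", "grox"), it would be expected to list grox.prompts.template if the claim above holds at the commit you have checked out. It only inspects `from ... import` statements and does not follow dynamic imports.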

What to do about it

You can't read the withheld prompt, but you can read the architecture, and the architecture tells you where to look in your own history: replies, from when your account was small, containing promotional imagery or giveaway mechanics. That is precisely the slice that xDoctor's Checkup surfaces triage, and it is why the Promotional surface exists at all.
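As a rough illustration of that slice, here is a sketch that filters a post history down to replies with media posted while the account was small. Everything in it is hypothetical: the record fields do not correspond to any real export format, and the follower cutoff is a placeholder, since nothing quoted above states what counts as "low follower".

```python
from dataclasses import dataclass

@dataclass
class ArchivedPost:
    # Hypothetical record shape; a real export will use different field names.
    post_id: str
    is_reply: bool
    has_media: bool
    followers_at_post_time: int

def posts_to_review(posts, follower_cutoff):
    """Return replies with media that were posted below follower_cutoff."""
    return [
        p for p in posts
        if p.is_reply and p.has_media and p.followers_at_post_time < follower_cutoff
    ]

history = [
    ArchivedPost("a1", is_reply=True, has_media=True, followers_at_post_time=40),
    ArchivedPost("a2", is_reply=False, has_media=True, followers_at_post_time=40),
    ArchivedPost("a3", is_reply=True, has_media=True, followers_at_post_time=90_000),
]

# The cutoff of 1,000 is a placeholder chosen for the example only.
print([p.post_id for p in posts_to_review(history, follower_cutoff=1_000)])
```

The filter only narrows the search; deciding whether an image is promotional or a reply uses giveaway mechanics still needs a human look at each result.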
