What is PTOS enforcement in the X algorithm?

PTOS — "Post Terms of Service" — is the safety enforcement layer in the released Grox code, and it names its categories in the open: seven policy areas, screened by a two-stage pipeline of Grok vision models. The policy texts themselves are withheld, and the released file carries visible redaction seams where clauses were stripped before publication.

The seven policies, named in the code

The enforcement taxonomy is not a leak or a guess — it is a Python set in the released file:

grox/classifiers/content/safety_ptos.py · L217–L225 @ 0bfc279

217    SUPPORTED_POLICY_CATEGORIES = {
218        SafetyPolicyCategory.ViolentMedia,
219        SafetyPolicyCategory.AdultContent,
220        SafetyPolicyCategory.Spam,
221        SafetyPolicyCategory.IllegalAndRegulatedBehaviors,
222        SafetyPolicyCategory.HateOrAbuse,
223        SafetyPolicyCategory.ViolentSpeech,
224        SafetyPolicyCategory.SuicideOrSelfHarm,
225    }

CODE-CURRENT · 0bfc279 · verified 2026-06-12

The PTOS safety enforcement taxonomy is named in the released code: seven policy categories — ViolentMedia, AdultContent, Spam, IllegalAndRegulatedBehaviors, HateOrAbuse, ViolentSpeech, SuicideOrSelfHarm — defined as the SUPPORTED_POLICY_CATEGORIES set.

xai-org/x-algorithm · grox/classifiers/content/safety_ptos.py, SUPPORTED_POLICY_CATEGORIES (L217–L225) · as of the May 15, 2026 release
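As a rough illustration of how a set like this can be used as a checklist, the sketch below rebuilds the taxonomy as a plain Python enum and checks names against it. Only the seven member names and the SUPPORTED_POLICY_CATEGORIES identifier come from the released file; the enum definition, the is_supported helper, and the audit_checklist variable are assumptions made for this example.

```python
from enum import Enum, auto


class SafetyPolicyCategory(Enum):
    """Stand-in enum (assumed shape); member names are the seven in the released set."""
    ViolentMedia = auto()
    AdultContent = auto()
    Spam = auto()
    IllegalAndRegulatedBehaviors = auto()
    HateOrAbuse = auto()
    ViolentSpeech = auto()
    SuicideOrSelfHarm = auto()


# Same seven members, in the same order, as the set quoted from the release.
SUPPORTED_POLICY_CATEGORIES = {
    SafetyPolicyCategory.ViolentMedia,
    SafetyPolicyCategory.AdultContent,
    SafetyPolicyCategory.Spam,
    SafetyPolicyCategory.IllegalAndRegulatedBehaviors,
    SafetyPolicyCategory.HateOrAbuse,
    SafetyPolicyCategory.ViolentSpeech,
    SafetyPolicyCategory.SuicideOrSelfHarm,
}


def is_supported(name: str) -> bool:
    """Hypothetical helper: True if `name` is one of the supported category names."""
    try:
        return SafetyPolicyCategory[name] in SUPPORTED_POLICY_CATEGORIES
    except KeyError:
        # Unknown names are simply not part of the taxonomy.
        return False


# Alphabetical list of category names, usable as an audit checklist.
audit_checklist = sorted(c.name for c in SUPPORTED_POLICY_CATEGORIES)
```

A name that is not in the enum, such as a category drawn from folklore rather than from the code, falls through to False instead of raising.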

A two-stage pipeline, on two different models

Enforcement is not one model call. A category classifier (SafetyPtosCategoryClassifier, running VLM_SAFETY) first decides whether a post violates anything at all. Only then does a policy classifier (SafetyPtosPolicyClassifier, running VLM_PRIMARY_CRITICAL, a critical-tier model) re-judge the post against the specific policy detected. Both are vision models sampled at a near-zero temperature of 0.000001, so images are inside the judgment and the verdict is close to repeatable, though a non-zero temperature does not strictly guarantee identical output on every run.

CODE-CURRENT · 0bfc279 · verified 2026-06-12

PTOS enforcement is a two-stage pipeline on two model tiers: SafetyPtosCategoryClassifier (VLM_SAFETY) detects whether a post violates anything, then SafetyPtosPolicyClassifier (VLM_PRIMARY_CRITICAL) re-judges it against the specific detected policy — both vision models at temperature 0.000001.

xai-org/x-algorithm · grox/classifiers/content/safety_ptos.py, class constructors (L57–L70 and L140–L152) · as of the May 15, 2026 release
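The control flow of a detect-then-re-judge pipeline can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the released implementation: the Verdict type, the two_stage_screen function, and the keyword-based toy_detect and toy_judge stand-ins are all hypothetical, and the real stages are model calls whose prompts are not public.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    category: Optional[str]  # None: stage one found nothing to escalate
    violation: bool          # True only if stage two confirms the breach


def two_stage_screen(
    post_text: str,
    detect_category: Callable[[str], Optional[str]],
    judge_policy: Callable[[str, str], bool],
) -> Verdict:
    """Run stage two only when stage one names a category."""
    category = detect_category(post_text)
    if category is None:
        return Verdict(category=None, violation=False)
    return Verdict(category=category, violation=judge_policy(post_text, category))


# Toy stand-ins for the two model calls (keyword rules, not real classifiers).
stage_two_calls: List[str] = []


def toy_detect(post_text: str) -> Optional[str]:
    return "Spam" if "buy now" in post_text.lower() else None


def toy_judge(post_text: str, category: str) -> bool:
    stage_two_calls.append(category)  # record that stage two actually ran
    return post_text.lower().count("buy now") >= 2


clean = two_stage_screen("Morning run done.", toy_detect, toy_judge)
borderline = two_stage_screen("Buy now if you like it.", toy_detect, toy_judge)
flagged = two_stage_screen("BUY NOW! buy now!", toy_detect, toy_judge)
```

In this sketch the clean post never reaches stage two, and the borderline post is detected by stage one but not confirmed by stage two, which is the sense in which a flag is a two-model event.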

There is also a "deluxe" mode: when enabled, the two most consequential categories — AdultContent and ViolentMedia — are routed to a dedicated reasoning model, with a fallback to the standard vision model if that call fails.

CODE-CURRENT · 0bfc279 · verified 2026-06-12

In 'deluxe' mode, the two most consequential policy categories — AdultContent and ViolentMedia (DELUXE_4_2_CATEGORIES) — are routed to a dedicated reasoning model, with fallback to the standard vision model on failure.

xai-org/x-algorithm · grox/classifiers/content/safety_ptos.py, DELUXE_4_2_CATEGORIES (L227–L230) and _sample_4_2 fallback (L268–L279) · as of the May 15, 2026 release
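A route-then-fall-back pattern of this kind might look like the following. It is a sketch under assumptions: only the DELUXE_4_2_CATEGORIES name and its two members come from the release, while judge_with_fallback, the string-valued categories, and the stub models are hypothetical. The released routing condition also contains at least one operand that was removed before publication, so the condition here is necessarily incomplete.

```python
from typing import Callable

# Name and members as reported from the release; strings are used here for brevity.
DELUXE_4_2_CATEGORIES = {"AdultContent", "ViolentMedia"}

Judge = Callable[[str, str], bool]


def judge_with_fallback(
    post_text: str,
    category: str,
    deluxe: bool,
    reasoning_model: Judge,
    vision_model: Judge,
) -> bool:
    """Try the reasoning model for deluxe categories; otherwise use the vision model."""
    if deluxe and category in DELUXE_4_2_CATEGORIES:
        try:
            return reasoning_model(post_text, category)
        except Exception:
            # Assumed behaviour: any failure falls through to the standard model.
            pass
    return vision_model(post_text, category)


# Hypothetical stubs standing in for the two model calls.
def failing_reasoner(post_text: str, category: str) -> bool:
    raise RuntimeError("reasoning model unavailable")


def working_reasoner(post_text: str, category: str) -> bool:
    return True


def vision_stub(post_text: str, category: str) -> bool:
    return False


routed = judge_with_fallback("post", "AdultContent", True, working_reasoner, vision_stub)
fell_back = judge_with_fallback("post", "ViolentMedia", True, failing_reasoner, vision_stub)
not_deluxe = judge_with_fallback("post", "Spam", True, working_reasoner, vision_stub)
```

Here routed takes the reasoning model's answer, fell_back silently takes the vision model's answer after the failure, and not_deluxe never touches the reasoning model because Spam is outside the deluxe set.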

Signal by signal

In the code	In plain English	Where xDoctor surfaces it
7 policy categories	The complete list of what safety enforcement screens for — by name, in a set literal.	Checkup · Policy-Sensitive
two-stage pipeline	Detection first, then a deeper re-judgment against the specific policy — a flag is a two-model event.	Checkup · Flagged Posts
UserRenderer in prompt	The model sees you — your profile is rendered into the judging context alongside the post.	Coach · Account
include_reply_to=True	The post you replied to is in the context. A reply is judged in its conversation, not in isolation.	Checkup · Promotional

CODE-CURRENT · 0bfc279 · verified 2026-06-12

Grox content classifiers render the post's author into the judging prompt: UserRenderer.render(post.user) places the user's profile in the model's context alongside the post, in the PTOS, banger, and post-safety-screen classifiers alike.

xai-org/x-algorithm · safety_ptos.py L84, banger_initial_screen.py L68, post_safety_screen_deluxe.py L54 · as of the May 15, 2026 release
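To make the last two table rows concrete, here is a small sketch of how a judging context could be assembled from the author, the replied-to post, and the post itself. Everything in it is an assumption for illustration: the User and Post types, the render_user and render_post functions, and the bracketed labels are hypothetical, and the real output format of UserRenderer.render is not shown in the source. Only the include_reply_to parameter name comes from the release.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    handle: str
    bio: str


@dataclass
class Post:
    text: str
    user: User
    reply_to: Optional["Post"] = None


def render_user(user: User) -> str:
    """Hypothetical stand-in for UserRenderer.render; the real format is unknown."""
    return f"[author] @{user.handle}: {user.bio}"


def render_post(post: Post, label: str) -> str:
    return f"[{label}] @{post.user.handle}: {post.text}"


def build_context(post: Post, include_reply_to: bool = True) -> str:
    """Assemble the text a judging model would see: author, parent post, then post."""
    parts = [render_user(post.user)]
    if include_reply_to and post.reply_to is not None:
        parts.append(render_post(post.reply_to, "replied-to post"))
    parts.append(render_post(post, "post under review"))
    return "\n".join(parts)


parent = Post("Is this offer real?", User("asker", "Curious shopper"))
reply = Post("Yes, link in bio.", User("seller", "Deals daily"), reply_to=parent)

with_parent = build_context(reply)
without_parent = build_context(reply, include_reply_to=False)
```

With the flag on, the context has three lines and the reply is read against the question it answers; with it off, the parent line disappears but the author line remains.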

What the code doesn't say

The actual text of every policy (HateOrAbusePolicy, ViolentSpeechPolicy, and the rest of the seven) is imported from the grox.prompts.template module, which is absent from the release. The release also carries visible redaction seams: a restriction-lines set reduced to empty strings, and a boolean condition missing its first operand where a clause was stripped before publication. The category names are open; the criteria, and at least one routing condition, are deliberately withheld.

CODE-CURRENT · 0bfc279 · verified 2026-06-12

The released safety_ptos.py carries visible redaction seams: the _THINKING_RESTRICTION_LINES set contains only empty strings (L40–L43), and a boolean condition is missing its first operand (L241–L244: "if ( / and self.deluxe / and …") — clauses were removed from the file before publication.

xai-org/x-algorithm · grox/classifiers/content/safety_ptos.py, L40–L43 and L241–L244 (observable in the published file) · as of the May 15, 2026 release
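One way to see why a condition that opens with "and" reads as a seam rather than a style choice is to ask Python's own parser. The sketch below is illustrative only: SEAM_SHAPED is a hand-written snippet with the same shape as the quoted lines, not a copy of the released file, and the name enabled in INTACT is a placeholder for the missing operand, not a guess at what was removed.

```python
import ast

# Hand-written snippet shaped like the quoted seam: the condition starts with "and".
SEAM_SHAPED = "if (\n    and deluxe\n):\n    pass\n"

# Same snippet with a placeholder first operand restored (name is hypothetical).
INTACT = "if (\n    enabled\n    and deluxe\n):\n    pass\n"


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


seam_is_valid = parses(SEAM_SHAPED)
intact_is_valid = parses(INTACT)
```

The seam-shaped snippet is rejected and the intact one is accepted, so if the published lines match the quoted shape, they could not have been written or run in that form, which is consistent with a clause having been cut after the fact.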

UNKNOWN · 0bfc279 · verified 2026-06-12

The actual spam criteria are withheld: spam.py imports its system prompt (SpamSystemLowFollower) from grox.prompts.template, and the entire grox/prompts/ module is absent from the public release — the classifier machinery is open, the rules it enforces are not.

xai-org/x-algorithm · whole-tree absence check: grox/prompts/ does not exist at the pinned commit (import at spam.py L10) · as of the May 15, 2026 release
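A whole-tree absence check of this kind is easy to repeat against a local checkout. The sketch below shows one possible approach, assuming a standard package layout where a dotted module maps to either a directory or a .py file. The missing_modules helper is hypothetical, and the demo runs against a throwaway directory that mimics only the one file path cited earlier, not the real repository.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_modules(repo_root: Path, dotted_modules: Iterable[str]) -> List[str]:
    """Return the dotted module names with no matching directory or .py file."""
    missing = []
    for dotted in dotted_modules:
        base = repo_root.joinpath(*dotted.split("."))
        if not (base.is_dir() or base.with_suffix(".py").is_file()):
            missing.append(dotted)
    return missing


# Demo on a temporary tree (invented for this example, not the actual repo).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    content_dir = root / "grox" / "classifiers" / "content"
    content_dir.mkdir(parents=True)
    (content_dir / "safety_ptos.py").write_text("# placeholder\n")

    result = missing_modules(
        root,
        [
            "grox.classifiers.content.safety_ptos",
            "grox.prompts",
            "grox.prompts.template",
        ],
    )
```

Against a real clone, repo_root would point at the checkout directory at the pinned commit; in the demo tree, the classifier module is found and both prompt modules are reported missing.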

What to do about it

The taxonomy is the audit checklist. Seven named categories means seven concrete questions to ask of your own archive — which is exactly how xDoctor's Policy-Sensitive surface is organized: it triages your history against the enforcement categories the code names, rather than against folklore.
