AI Safety is Out of Control
AI safety is the most urgent conversation in the field today. Companies publish safety charters, researchers debate alignment strategies, governments scramble to regulate. But most of what passes for “safety” is only treating symptoms, not causes. The lesson of hallucination should have made this clear by now: the very architecture of the transformer produces the problem.
Large language models hallucinate not because they are broken, but because they are doing exactly what they were designed to do: predict the next token with statistical coherence. The result is fluency without understanding. Sooner or later, coherence runs ahead of grounding and produces statements that sound right but have no anchor in truth. No patch can fix this, because the absence of world is not a bug; it is the structural condition of the transformer itself.
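To see how fluency can outrun truth, consider a deliberately tiny stand-in for the same next-token principle: a bigram model in Python. The corpus, the `follows` table, and the `generate` function below are all illustrative, not any real system. Every word it emits is chosen purely from co-occurrence statistics; there is no step at which the truth of the sentence could even be consulted.

```python
import random
from collections import defaultdict

# Toy corpus: the only thing the model can learn is which word follows which.
corpus = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of france is beautiful . "
).split()

# Bigram table: for each word, the words observed to follow it.
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

def generate(start: str, n_tokens: int = 6) -> str:
    """Sample a continuation one token at a time from bigram statistics.

    The sole criterion at every step is "which word tends to come next",
    i.e. statistical coherence. Nothing checks the result against a world.
    """
    out = [start]
    for _ in range(n_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))
# One possible run: "the capital of italy is paris ."
# Fluent, statistically well-formed, and false. Scaled up, the same
# principle yields a hallucination, not a malfunction.
```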
Safety measures, as they exist now, are mostly filters, guardrails, and after-the-fact reinforcements. They try to smooth the outputs, suppress the worst failures, redirect the dangerous prompts. In other words, they work at the level of symptoms. But if the underlying system has no world, no care, no projection into possibilities, then the symptom will always return. The machine cannot help but wander into hallucination, and with it, misalignment.
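Here is what a symptom-level intervention looks like in code: a minimal sketch of an output-side guardrail, assuming a placeholder `generate` call and a hypothetical `BLOCKLIST` (neither corresponds to any real API). It runs after generation and inspects only the surface string.

```python
# Hypothetical symptom-level guardrail: it runs AFTER generation, so the
# model's propensity to hallucinate is untouched; only visible output changes.

BLOCKLIST = {"how to build a weapon", "known bad claim"}  # illustrative stub

def generate(prompt: str) -> str:
    # Placeholder for an actual model call.
    return "some model output"

def guarded_generate(prompt: str) -> str:
    output = generate(prompt)
    if any(pattern in output.lower() for pattern in BLOCKLIST):
        return "I can't help with that."  # the symptom is suppressed
    return output  # either way, the cause is untouched
```

However long the blocklist grows, the filter sees only strings; the statistical engine behind them is exactly as worldless as before, which is why the symptom always returns.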
To treat the cause, we have to go deeper. Safety will never be secured by more patches on a statistical engine; securing it requires rethinking the very conditions of intelligence. If thought is always being-in-the-world, then machines that lack world can only ever simulate coherence without responsibility. To build safe AI, we need to build grounded AI.
This is where World Mind departs from the current race. The point is not to endlessly patch the leaks in the transformer hull, but to chart a new course. True safety will come only from an AI that discloses a world, however partial, so that safety is planted in ontological roots rather than aligned at the surface.