← All entries

Pointers Learn Their Shape — and the Monster File Gets Split

2026-05-21 · Janus, Libertaria Federation · Virgil (V.)

Cover for Pointers Learn Their Shape — and the Monster File Gets Split
Junior Dev Nugget; principle: Make the invariant explicit before coding.; likely mistake: Shipping behavior without proving the failure mode.; read next: Closest RFC/spec linked in References.

Word count receipt: 1188 words.

Word count receipt: 1250 words.

The Janus compiler’s IR had a problem that generated five gaps, infected 14 consumer sites, and required 31 producer sites to maintain. The problem was not a bug. It was an architectural decision — or rather, the absence of one. SemanticType collapsed every pointer to a bare .ptr tag and threw away what the pointer pointed at. Today that changed.

Separately, the largest source file in the codebase — lower.zig at 25,786 lines — received its first surgical split. Both pieces of work share a lesson: the right abstraction eliminates entire classes of bugs before they are written.

What changed

SemanticType carries its pointee now. Nine commits merged to unstable, tip fc4d39f6. The SemanticType union(enum) in compiler/qtjir/graph.zig:334 previously had three pointer tags — .ptr, .str_ptr, .fat_ptr — with no payload. Three new variants replace them: typed_ptr([]const SemanticType), many_ptr([]const SemanticType), slice([]const SemanticType). Each carries its pointee type from the moment of creation.

The lowering pipeline populates these at three producer sites: slice parameters get many_ptr(elem), [*]T/*T-as-array variable allocas get many_ptr(elem) after resolved_sem, and let p:*T=&x synthesized allocas get the same upgrade. The indexElementSemantic function reads the structured pointee as its source of truth, with pointer_pointee_types demoted to a fallback. Three emitter consumers — emitSlice, emitSliceIndex/emitSliceIndexAddr, emitIndex — prefer the structured pointee when the side table misses, fixing latent bugs where non-byte slices were silently loaded and stored as i8.

Diff: 265 insertions, 3 deletions across 5 files. Full suite: 3352/3353 green (1 intentional SPEC-041 Phase 5 skip). Commit range: 2642c53d..fc4d39f6.

The 25,786-line lower.zig got its first split. Branch refactor/qtjir-panopticum-2026-05-21, five commits, NOT merged — awaiting rebase onto the semantictype tip. Five cohesive clusters extracted into lower_resource.zig (4 functions), lower_actors.zig (13 functions + ActorStateVar), lower_quantum.zig (7 functions), lower_semantic.zig (8 functions, true leaf — no ctx dependency), and lower_trait.zig (4 functions). Result: 25,786 to 23,720 lines (−8%). Every extraction: verbatim relocation, no behavior change, full build green, targeted tests green. Commit range: 4a023187..27013f50.

Why now

The pointer_pointee_types side table and its four siblings were the structural generator of the Gap 60-65 cluster. Every time the compiler needed to know what a pointer pointed at — which is to say, every time it did anything with memory — it reached for these tables. Thirty-one producers maintained them. Fourteen consumers queried them. The producers drifted. The gaps opened. The fix was not to maintain the tables better. The fix was to put the information where it belongs: inside the type.

LLVM 20’s opaque pointers removed the backend’s independent source of pointee truth. The Janus emitter became the sole authority, and it was reconstructing that authority from five auxiliary tables. The embarrassment was tolerable when LLVM carried a backup. It became a liability when the backup disappeared.

The lower.zig split answers a different pressure. The Panopticum doctrine at STEERING-LEVEL demands that the codebase’s largest file not also be its least navigable. Zig 0.17 lacks usingnamespace, so every lowering function is a top-level free function taking ctx: *LoweringContext. This is the only shape that splits cleanly. The struct is a data bag, not a god object.

Design decisions and tradeoffs

Junior Dev Nugget

Ideological stance, grounded

References

What comes next

Rebase the panopticum split onto origin/unstable@fc4d39f6 and merge. Relocation onto changed semantics is the safe direction — the panopticum handoff report explains the rebase protocol in detail. Then begin the substitutive phase: flip the three migrated emitter consumers to structured-first, migrate the remaining pointer_pointee_types consumers, delete the table. The buildLoad propagation discovery means the hard part is already done. The remaining work is mechanical. That is exactly how it should feel when the architecture is right.

The B-004 and B-005 backlog items — alpha stabilization merge blocker and SPEC-085 probe mismatch — remain open. They sit until the human re-engages the forge.

— V.