mirror of https://github.com/Comfy-Org/ComfyUI_frontend.git synced 2026-04-20 14:30:41 +00:00

Alexander Brown 3e197b5c57 docs: ADR 0008 — Entity Component System (#10420 )

## Summary

Architecture documentation proposing an Entity Component System for the
litegraph layer.

```mermaid
graph LR
    subgraph Today["Today: Spaghetti"]
        God["🍝 God Objects"]
        Circ["🔄 Circular Deps"]
        Mut["💥 Render Mutations"]
    end

    subgraph Tomorrow["Tomorrow: ECS"]
        ID["🏷️ Branded IDs"]
        Comp["📦 Components"]
        Sys["⚙️ Systems"]
        World["🌍 World"]
    end

    God -->|"decompose"| Comp
    Circ -->|"flatten"| ID
    Mut -->|"separate"| Sys
    Comp --> World
    ID --> World
    Sys -->|"query"| World
```

## Changes

- **What**: ADR 0008 + 4 architecture docs (no code changes)
- `docs/adr/0008-entity-component-system.md` — entity taxonomy, branded
IDs, component decomposition, migration strategy
- `docs/architecture/entity-interactions.md` — as-is Mermaid diagrams of
all entity relationships
- `docs/architecture/entity-problems.md` — structural problems with
file:line evidence
- `docs/architecture/ecs-target-architecture.md` — target architecture
diagrams
- `docs/architecture/proto-ecs-stores.md` — analysis of existing Pinia
stores as proto-ECS patterns

## Review Focus

- Does the entity taxonomy (Node, Link, Subgraph, Widget, Slot, Reroute,
Group) cover all cases?
- Are the component decompositions reasonable starting points?
- Is the migration strategy (bridge layer, incremental extraction)
feasible?
- Are there entity interactions or problems we missed?

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Christian Byrne <cbyrne@comfy.org>

2026-03-26 16:14:44 -07:00

ECS Migration Plan

A phased roadmap for migrating the litegraph entity system to the ECS architecture described in ADR 0008. Each phase is independently shippable. Later phases depend on earlier ones unless noted otherwise.

For the problem analysis, see Entity Problems. For the target architecture, see ECS Target Architecture. For an accuracy review of those documents, see Appendix: Critical Analysis.

Planning assumptions

The bridge period is expected to span 2-3 release cycles.
Bridge work is treated as transitional debt with explicit owners and sunset checkpoints, not as a permanent architecture layer.
Phase 5 is entered only by explicit go/no-go review against the criteria in this document.

Phase 0: Foundation

Zero behavioral risk. Prepares the codebase for extraction without changing runtime semantics. All items are independently shippable.

0a. Centralize version counter

graph._version++ appears in 19 locations across 7 files. The counter is only read once — for debug display in LGraphCanvas.renderInfo() (line 5389). It is not used for dirty-checking, caching, or reactivity.

Change: Add LGraph.incrementVersion() and replace all 19 direct increments.

```typescript
incrementVersion(): void {
  this._version++
}
```

| File | Sites |
| --- | --- |
| `LGraph.ts` | 5 (lines 956, 989, 1042, 1109, 2643) |
| `LGraphNode.ts` | 8 (lines 833, 2989, 3138, 3176, 3304, 3539, 3550, 3567) |
| `LGraphCanvas.ts` | 2 (lines 3084, 7880) |
| `BaseWidget.ts` | 1 (line 439) |
| `SubgraphInput.ts` | 1 (line 137) |
| `SubgraphInputNode.ts` | 1 (line 190) |
| `SubgraphOutput.ts` | 1 (line 102) |

Why first: Creates the seam where a VersionSystem can later intercept, batch, or replace the mechanism. Mechanical find-and-replace with zero behavioral change.

Risk: None. Existing null guards at call sites are preserved.

0b. Add missing ID type aliases

NodeId, LinkId, and RerouteId exist as type aliases. Two are missing:

| Type | Definition | Location |
| --- | --- | --- |
| `GroupId` | `number` | `LGraphGroup.ts` (currently implicit on `id: number` at line 39) |
| `SlotIndex` | `number` | `interfaces.ts` (slot positions are untyped `number` everywhere) |

Change: Add the type aliases, update property declarations, re-export from barrel (litegraph.ts).

Why: Foundation for branded IDs. Type aliases are erased at compile time — zero runtime impact.

Risk: None. Type-only change.

0c. Fix architecture doc errors

Five factual errors verified during code review (see Appendix):

entity-problems.md: toJSON() should be toString(), execute() should be doExecute(), method count ~539 should be ~848, configure() is ~240 lines not ~180
proto-ecs-stores.md: resolveDeepest() does not exist on PromotedWidgetViewManager; actual methods are reconcile() / getOrCreate()

Phase 1: Types and World Shell

Introduces the ECS type vocabulary and an empty World. No migration of existing code — new types coexist with old ones.

1a. Branded entity ID types

Define branded types in a new src/ecs/entityId.ts:

```typescript
type NodeEntityId = number & { readonly __brand: 'NodeEntityId' }
type LinkEntityId = number & { readonly __brand: 'LinkEntityId' }
type WidgetEntityId = number & { readonly __brand: 'WidgetEntityId' }
type SlotEntityId = number & { readonly __brand: 'SlotEntityId' }
type RerouteEntityId = number & { readonly __brand: 'RerouteEntityId' }
type GroupEntityId = number & { readonly __brand: 'GroupEntityId' }
type GraphId = string & { readonly __brand: 'GraphId' }  // scope, not entity
```

Add cast helpers (asNodeEntityId(id: number): NodeEntityId) for use at system boundaries (deserialization, legacy bridge).
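A minimal sketch of how such cast helpers could look and what the brand buys at compile time. Only `asNodeEntityId` is named in the plan; `asLinkEntityId`, `describeNode`, and the integer check are assumptions made for illustration.

```typescript
// The brand exists only in the type system and is erased at runtime.
type NodeEntityId = number & { readonly __brand: 'NodeEntityId' }
type LinkEntityId = number & { readonly __brand: 'LinkEntityId' }

// Cast helpers: the one place where a raw number becomes a branded ID.
// The integer validation is an assumption of this sketch.
function asNodeEntityId(id: number): NodeEntityId {
  if (!Number.isInteger(id)) throw new TypeError(`Invalid node id: ${id}`)
  return id as NodeEntityId
}

function asLinkEntityId(id: number): LinkEntityId {
  if (!Number.isInteger(id)) throw new TypeError(`Invalid link id: ${id}`)
  return id as LinkEntityId
}

// Hypothetical consumer that accepts node IDs only.
function describeNode(id: NodeEntityId): string {
  return `node#${id}`
}

const exampleNodeId = asNodeEntityId(7)
describeNode(exampleNodeId) // accepted

// Both of the following would be rejected by the compiler:
// describeNode(asLinkEntityId(7)) // LinkEntityId is not a NodeEntityId
// describeNode(7)                 // a plain number is not a NodeEntityId
```

Because the helpers are the only way to mint a branded ID, a search for their call sites gives a complete list of the boundaries where untyped IDs enter ECS code.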

Does NOT change existing code. The branded types are new exports consumed only by new ECS code.

Risk: Low. New files, no modifications to existing code.

Consideration: NodeId = number | string is the current type. The branded NodeEntityId narrows to number. The string branch exists solely for subgraph-related nodes (GroupNode hack). The migration must decide whether to:

Keep NodeEntityId = number and handle the string case at the bridge layer
Or define NodeEntityId = number | string with branding (less safe)

Recommend the former: the bridge layer coerces string IDs to a numeric mapping, and only branded numeric IDs enter the World.
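One way the bridge-side coercion could work is sketched below. `LegacyNodeIdMapper` and its methods are hypothetical names. The sketch allocates every World ID from a counter, for numeric and string legacy IDs alike, because that avoids collisions between synthetic and real numeric IDs; passing numeric IDs through unchanged is an alternative the plan leaves open.

```typescript
type NodeEntityId = number & { readonly __brand: 'NodeEntityId' }
type LegacyNodeId = number | string // mirrors the current NodeId union

// Hypothetical bridge-side mapper between legacy IDs and branded numeric IDs.
class LegacyNodeIdMapper {
  private readonly toEntity = new Map<LegacyNodeId, NodeEntityId>()
  private readonly toLegacy = new Map<NodeEntityId, LegacyNodeId>()
  private nextId = 1

  // Returns the existing mapping or allocates a fresh numeric ID.
  entityIdFor(legacyId: LegacyNodeId): NodeEntityId {
    const existing = this.toEntity.get(legacyId)
    if (existing !== undefined) return existing
    const created = this.nextId as NodeEntityId
    this.nextId += 1
    this.toEntity.set(legacyId, created)
    this.toLegacy.set(created, legacyId)
    return created
  }

  // Reverse lookup, needed when the bridge writes back to legacy objects.
  legacyIdFor(entityId: NodeEntityId): LegacyNodeId | undefined {
    return this.toLegacy.get(entityId)
  }
}

const idMapper = new LegacyNodeIdMapper()
const numericCase = idMapper.entityIdFor(5)
const stringCase = idMapper.entityIdFor('5') // Map keys keep 5 and '5' distinct
```

Only the mapper ever sees a `string`; everything past it works with `NodeEntityId`.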

1b. Component interfaces

Define component interfaces in src/ecs/components/:

```
src/ecs/
  entityId.ts          # Branded ID types
  components/
    position.ts        # Position (shared by Node, Reroute, Group)
    nodeType.ts        # NodeType
    nodeVisual.ts      # NodeVisual
    connectivity.ts    # Connectivity
    execution.ts       # Execution
    properties.ts      # Properties
    widgetContainer.ts # WidgetContainer
    linkEndpoints.ts   # LinkEndpoints
    ...
  world.ts             # World type and factory
```

Components are TypeScript interfaces only — no runtime code. They mirror the decomposition in ADR 0008 Section "Component Decomposition."

Risk: None. Interface-only files.

1c. World type

Define the World as a typed container:

```typescript
interface World {
  nodes: Map<NodeEntityId, NodeComponents>
  links: Map<LinkEntityId, LinkComponents>
  widgets: Map<WidgetEntityId, WidgetComponents>
  slots: Map<SlotEntityId, SlotComponents>
  reroutes: Map<RerouteEntityId, RerouteComponents>
  groups: Map<GroupEntityId, GroupComponents>
  scopes: Map<GraphId, GraphId | null> // graph scope DAG (parent or null for root)

  createEntity<K extends EntityKind>(kind: K): EntityIdFor<K>
  deleteEntity<K extends EntityKind>(kind: K, id: EntityIdFor<K>): void
  getComponent<C>(id: EntityId, component: ComponentKey<C>): C | undefined
  setComponent<C>(id: EntityId, component: ComponentKey<C>, data: C): void
}
```

Subgraphs are not a separate entity kind. A node with a SubgraphStructure component represents a subgraph. The scopes map tracks the graph nesting DAG. See Subgraph Boundaries for the full model.

World scope is per workflow instance. Linked subgraph definitions can be reused across instances, but mutable runtime state (widget values, execution state, selection/transient view state) remains instance-scoped through graphId.

Initial implementation: plain Map-backed. No reactivity, no CRDT, no persistence. The World exists but nothing populates it yet.
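A plain Map-backed implementation could start out roughly like this. The sketch covers node entities only and stores components by key name rather than in per-kind maps; `MapWorld`, `defineComponent`, `PositionData`, and `PositionKey` are illustrative names, not part of the plan.

```typescript
type NodeEntityId = number & { readonly __brand: 'NodeEntityId' }

// A key carries its component's data type as a phantom type parameter.
interface ComponentKey<C> {
  readonly name: string
  readonly __type?: C // never set at runtime
}

function defineComponent<C>(name: string): ComponentKey<C> {
  return { name }
}

interface PositionData {
  x: number
  y: number
}
const PositionKey = defineComponent<PositionData>('Position')

class MapWorld {
  private nextId = 1
  private readonly alive = new Set<number>()
  // component name -> (entity id -> component data)
  private readonly stores = new Map<string, Map<number, unknown>>()

  createNode(): NodeEntityId {
    const id = this.nextId as NodeEntityId
    this.nextId += 1
    this.alive.add(id)
    return id
  }

  // Removes the entity and every component attached to it.
  deleteEntity(id: NodeEntityId): void {
    this.alive.delete(id)
    this.stores.forEach((store) => store.delete(id))
  }

  getComponent<C>(id: NodeEntityId, key: ComponentKey<C>): C | undefined {
    return this.stores.get(key.name)?.get(id) as C | undefined
  }

  setComponent<C>(id: NodeEntityId, key: ComponentKey<C>, data: C): void {
    if (!this.alive.has(id)) throw new Error(`Unknown entity ${id}`)
    let store = this.stores.get(key.name)
    if (!store) {
      store = new Map<number, unknown>()
      this.stores.set(key.name, store)
    }
    store.set(id, data)
  }
}
```

The typed key means `getComponent(id, PositionKey)` returns `PositionData | undefined` without a cast at the call site, which is the ergonomic property the `ComponentKey<C>` parameter in the interface above is meant to provide.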

Risk: Low. New code, no integration points.

Phase 2: Bridge Layer

Connects the legacy class instances to the World. Both old and new code can read entity state; writes still go through legacy classes.

2a. Read-only bridge for Position

The LayoutStore (src/renderer/core/layout/store/layoutStore.ts) already extracts position data for nodes, links, and reroutes into Y.js CRDTs. The bridge reads from LayoutStore and populates the World's Position component.

Approach: A PositionBridge that observes LayoutStore changes and mirrors them into the World. New code reads world.getComponent(nodeId, Position); legacy code continues to read node.pos / LayoutStore directly.

Open question: Should the World wrap the Y.js maps or maintain its own plain-data copy? Options:

| Approach | Pros | Cons |
| --- | --- | --- |
| World wraps Y.js | Single source of truth; no sync lag | World API becomes CRDT-aware; harder to test |
| World copies from Y.js | Clean World API; easy to test | Two copies of position data; sync overhead |
| World replaces Y.js | Pure ECS; no CRDT dependency in World | Breaks collaboration (ADR 0003); massive change |

Recommendation: Start with "World copies from Y.js" for simplicity. The copy is cheap (position is small data). Revisit if sync overhead becomes measurable.

Risk: Medium. Introduces a sync point between two state systems. Must ensure the bridge doesn't create subtle ordering bugs (e.g., World reads stale position during render).
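The "World copies from Y.js" option can be sketched without Y.js at all. Below, `ObservablePositionSource` is a hypothetical stand-in for the LayoutStore's change notifications, and `PositionBridge.mirrored` stands in for the World's Position storage; the real LayoutStore API is not assumed to look like this.

```typescript
interface PositionData {
  x: number
  y: number
}

type PositionListener = (id: number, pos: PositionData | undefined) => void

// Hypothetical stand-in for the LayoutStore: an observable map of positions.
class ObservablePositionSource {
  private readonly positions = new Map<number, PositionData>()
  private readonly listeners = new Set<PositionListener>()

  set(id: number, pos: PositionData): void {
    this.positions.set(id, pos)
    this.listeners.forEach((listener) => listener(id, pos))
  }

  delete(id: number): void {
    if (!this.positions.delete(id)) return
    this.listeners.forEach((listener) => listener(id, undefined))
  }

  // Returns an unsubscribe function.
  onChange(listener: PositionListener): () => void {
    this.listeners.add(listener)
    return () => {
      this.listeners.delete(listener)
    }
  }
}

// Mirrors each change into a plain-data copy for World-side reads.
class PositionBridge {
  readonly mirrored = new Map<number, PositionData>()
  private readonly stopObserving: () => void

  constructor(source: ObservablePositionSource) {
    this.stopObserving = source.onChange((id, pos) => {
      if (pos === undefined) this.mirrored.delete(id)
      else this.mirrored.set(id, { x: pos.x, y: pos.y }) // copy, never alias
    })
  }

  dispose(): void {
    this.stopObserving()
  }
}
```

This sketch mirrors only changes made after the bridge is constructed; a real bridge would also need an initial backfill, and the synchronous listener is what keeps the copy from lagging behind a render that follows a write.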

2b. Read-only bridge for WidgetValue

WidgetValueStore (src/stores/widgetValueStore.ts) already extracts widget state into plain WidgetState objects keyed by graphId:nodeId:name. This is the closest proto-ECS store.

Approach: A WidgetBridge that maps WidgetValueStore entries into WidgetValue components in the World, keyed by WidgetEntityId. Requires assigning synthetic widget IDs (via lastWidgetId counter on LGraphState).

Dependency: Requires 1a (branded IDs) for WidgetEntityId.

Risk: Low-Medium. WidgetValueStore is well-structured. Main complexity is the ID mapping — widgets currently lack independent IDs, so the bridge must maintain a (nodeId, widgetName) -> WidgetEntityId lookup.
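The lookup described above could be sketched as follows. `WidgetIdIndex` and `getOrCreate` are hypothetical names, and the counter is held locally here although the plan places `lastWidgetId` on LGraphState.

```typescript
type NodeEntityId = number & { readonly __brand: 'NodeEntityId' }
type WidgetEntityId = number & { readonly __brand: 'WidgetEntityId' }

// Hypothetical (nodeId, widgetName) -> WidgetEntityId index for the bridge period.
class WidgetIdIndex {
  private readonly idsByKey = new Map<string, WidgetEntityId>()
  private lastWidgetId = 0 // the plan keeps this counter on LGraphState

  // nodeId is numeric, so the first ':' always separates it from the name.
  private keyFor(nodeId: NodeEntityId, widgetName: string): string {
    return `${nodeId}:${widgetName}`
  }

  getByNodeAndName(
    nodeId: NodeEntityId,
    widgetName: string
  ): WidgetEntityId | undefined {
    return this.idsByKey.get(this.keyFor(nodeId, widgetName))
  }

  // Assigns a synthetic ID the first time a widget is seen.
  getOrCreate(nodeId: NodeEntityId, widgetName: string): WidgetEntityId {
    const key = this.keyFor(nodeId, widgetName)
    const existing = this.idsByKey.get(key)
    if (existing !== undefined) return existing
    this.lastWidgetId += 1
    const created = this.lastWidgetId as WidgetEntityId
    this.idsByKey.set(key, created)
    return created
  }
}
```

Keeping the composite key inside the index means the rest of the World never has to parse or construct `nodeId:name` strings.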

2c. Read-only bridge for Node metadata

Populate NodeType, NodeVisual, Properties, Execution components by reading from LGraphNode instances. These are simple property copies.

Approach: When a node is added to the graph (LGraph.add()), the bridge creates the corresponding entity in the World and populates its components. When a node is removed, the bridge deletes the entity.

The incrementVersion() method from Phase 0a becomes the hook point — when version increments, the bridge can re-sync changed components. (This is why centralizing version first matters.)

Risk: Medium. Must handle the full node lifecycle (add, configure, remove) without breaking existing behavior. The bridge is read-only (World mirrors classes, not the reverse), which limits blast radius.

Bridge sunset criteria (applies to every Phase 2 bridge)

A bridge can move from "transitional" to "removal candidate" only when:

All production reads for that concern flow through World component queries.
All production writes for that concern flow through system APIs.
Serialization parity tests show no diff between legacy and World paths.
Extension compatibility tests pass without bridge-only fallback paths.

These criteria prevent the bridge from becoming permanent by default.

Bridge duration and maintenance controls

To contain dual-path maintenance cost during Phases 2-4:

Every bridge concern has a named owner and target sunset release.
Every PR touching bridge-covered data paths must include parity tests for both legacy and World-driven execution.
Bridge fallback usage is instrumented in integration/e2e and reviewed every milestone; upward trends block new bridge expansion.
Any bridge that misses its target sunset release requires an explicit risk review and revised removal plan.

Phase 3: Systems

Introduce system functions that operate on World data. Systems coexist with legacy methods — they don't replace them yet.

3a. SerializationSystem (read-only)

A function serializeFromWorld(world: World): SerializedGraph that produces workflow JSON by querying World components. Run alongside the existing LGraph.serialize() in tests to verify equivalence.

Why first: Serialization is read-only and has a clear correctness check (output must match existing serialization). It exercises every component type and proves the World contains sufficient data.

Risk: Low. Runs in parallel with existing code; does not replace it.
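The equivalence check itself can be small. The sketch below compares two serialized graphs after normalizing object key order; treating key order as insignificant and `undefined` properties as absent are assumptions of this sketch, and `stableStringify` and `serializationsMatch` are hypothetical helper names.

```typescript
// Serializes with sorted object keys so key order cannot cause a false diff.
function stableStringify(value: unknown): string {
  if (value === undefined) return 'null' // matches JSON's treatment inside arrays
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(',')}]`
  }
  if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>
    const entries = Object.keys(record)
      .filter((key) => record[key] !== undefined) // JSON drops undefined properties
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`)
    return `{${entries.join(',')}}`
  }
  return JSON.stringify(value)
}

// True when the legacy and World-derived outputs are structurally identical.
function serializationsMatch(legacy: unknown, fromWorld: unknown): boolean {
  return stableStringify(legacy) === stableStringify(fromWorld)
}
```

In a test, the two arguments would be the results of `LGraph.serialize()` and `serializeFromWorld(world)` for the same workflow. Array order still matters in this comparison, which is intentional for ordered data such as slot lists.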

3b. VersionSystem

Replace the incrementVersion() method with a system that owns all change tracking. The system observes component mutations on the World and auto-increments the version counter.

Dependency: Requires Phase 2 bridges to be in place (otherwise the World doesn't see changes).

Risk: Medium. Must not miss any change that the scattered _version++ currently catches. The 19-site inventory from Phase 0a serves as the test matrix.

3c. ConnectivitySystem (queries only)

A system that can answer connectivity queries by reading Connectivity, SlotConnection, and LinkEndpoints components from the World:

"What nodes are connected to this node's inputs?"
"What links pass through this reroute?"
"What is the execution order?"

Does not perform mutations yet — just queries. Validates that the World's connectivity data is complete and consistent with the class-based graph.

Risk: Low. Read-only system with equivalence tests.
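Two of the queries above can be sketched against a plain map of link endpoints. The field names on `LinkEndpointsData` are assumptions, and Kahn's algorithm is used here as a generic topological sort; the legacy graph's execution ordering rules may differ, so equivalence tests would compare against the existing order rather than assume this one.

```typescript
// Assumed shape of the LinkEndpoints component.
interface LinkEndpointsData {
  originNodeId: number
  targetNodeId: number
}

// "What nodes are connected to this node's inputs?"
function upstreamNodes(
  links: ReadonlyMap<number, LinkEndpointsData>,
  nodeId: number
): number[] {
  const found = new Set<number>()
  links.forEach((link) => {
    if (link.targetNodeId === nodeId) found.add(link.originNodeId)
  })
  return Array.from(found).sort((a, b) => a - b)
}

// "What is the execution order?" via Kahn's algorithm. nodeIds must be unique.
function executionOrder(
  nodeIds: readonly number[],
  links: ReadonlyMap<number, LinkEndpointsData>
): number[] {
  const indegree = new Map<number, number>()
  const downstream = new Map<number, number[]>()
  nodeIds.forEach((id) => {
    indegree.set(id, 0)
    downstream.set(id, [])
  })
  links.forEach((link) => {
    const targets = downstream.get(link.originNodeId)
    const count = indegree.get(link.targetNodeId)
    if (targets === undefined || count === undefined) return // dangling link
    targets.push(link.targetNodeId)
    indegree.set(link.targetNodeId, count + 1)
  })

  const ready = nodeIds.filter((id) => indegree.get(id) === 0)
  const order: number[] = []
  while (ready.length > 0) {
    const id = ready.shift() as number
    order.push(id)
    for (const next of downstream.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1
      indegree.set(next, remaining)
      if (remaining === 0) ready.push(next)
    }
  }
  if (order.length !== nodeIds.length) throw new Error('Cycle detected')
  return order
}
```

Both functions only read their inputs, which matches the queries-only scope of this phase.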

Phase 4: Write Path Migration

Systems begin owning mutations. Legacy class methods delegate to systems. This is the highest-risk phase.

4a. Position writes through World

New code writes position via world.setComponent(nodeId, Position, ...). The bridge propagates changes back to LayoutStore and LGraphNode.pos.

This inverts the data flow: Phase 2 had legacy -> World (read bridge). Phase 4 has World -> legacy (write bridge). Both paths must work during the transition.

Risk: High. Two-way sync between World and legacy state. Must handle re-entrant updates (World write triggers bridge, which writes to legacy, which must NOT trigger another World write).
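A common way to break such a feedback loop is a guard flag around propagated writes. The sketch below uses two plain maps as stand-ins for World and legacy state; `PositionWriteBridge` and its members are hypothetical.

```typescript
interface PositionData {
  x: number
  y: number
}

// Hypothetical two-way bridge with a re-entrancy guard.
class PositionWriteBridge {
  readonly world = new Map<number, PositionData>()
  readonly legacy = new Map<number, PositionData>()
  worldWrites = 0
  legacyWrites = 0
  private syncing = false

  // Entry point for new code (World -> legacy).
  setInWorld(id: number, pos: PositionData): void {
    this.world.set(id, { x: pos.x, y: pos.y })
    this.worldWrites += 1
    this.propagate(() => this.setInLegacy(id, pos))
  }

  // Entry point for legacy code (legacy -> World).
  setInLegacy(id: number, pos: PositionData): void {
    this.legacy.set(id, { x: pos.x, y: pos.y })
    this.legacyWrites += 1
    this.propagate(() => this.setInWorld(id, pos))
  }

  private propagate(write: () => void): void {
    if (this.syncing) return // already inside a propagated write: stop here
    this.syncing = true
    try {
      write()
    } finally {
      this.syncing = false // reset even if the propagated write throws
    }
  }
}
```

Either entry point produces exactly one write on each side, which is the property the re-entrancy tests in the Phase 3 -> 4 gate would assert.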

4b. ConnectivitySystem mutations

connect(), disconnect(), removeNode() operations implemented as system functions on the World. Legacy LGraphNode.connect() etc. delegate to the system.

Extension API concern: The current system fires callbacks at each step:

onConnectInput() / onConnectOutput() — can reject connections
onConnectionsChange() — notifies after connection change
onRemoved() — notifies after node removal

These callbacks are the extension API contract. The ConnectivitySystem must fire them at the same points in the operation, or extensions break.

Recommended approach: The system emits lifecycle events that the bridge layer translates into legacy callbacks. This preserves the contract without the system knowing about the callback API.

Phase 4 callback contract (locked):

onConnectOutput() and onConnectInput() run before any World mutation.
If either callback rejects, abort with no component writes, no version bump, and no lifecycle events.
onConnectionsChange() fires synchronously after commit, preserving current source-then-target ordering.
Bridge lifecycle events remain internal. Legacy callbacks stay the public compatibility API during Phase 4.
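The locked contract can be read as the following control flow. The callback signatures are simplified stand-ins (the real legacy callbacks take more arguments), `connectWithContract` is a hypothetical name, and treating only an explicit `false` as a rejection is an assumption.

```typescript
type Side = 'input' | 'output'

// Simplified stand-ins for the legacy callbacks.
interface LegacyCallbacks {
  onConnectOutput?: () => boolean
  onConnectInput?: () => boolean
  onConnectionsChange?: (side: Side, connected: boolean) => void
}

function connectWithContract(
  source: LegacyCallbacks,
  target: LegacyCallbacks,
  commit: () => void
): boolean {
  // 1. Guards run before any World mutation, output side first.
  if (source.onConnectOutput?.() === false) return false
  if (target.onConnectInput?.() === false) return false

  // 2. Commit: component writes, version bump, lifecycle events.
  commit()

  // 3. Notifications fire synchronously after commit, source then target.
  source.onConnectionsChange?.('output', true)
  target.onConnectionsChange?.('input', true)
  return true
}
```

On rejection the function returns before `commit` is called, so no component writes, version bump, or lifecycle events can occur.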

Risk: High. Extensions depend on callback ordering and timing. Must be validated against real-world extensions.

4c. Widget write path

Widget value changes go through the World instead of directly through WidgetValueStore. The World's WidgetValue component becomes the single source of truth; WidgetValueStore becomes a read-through cache or is removed.

Risk: Medium. WidgetValueStore is already well-abstracted. The main change is routing writes through the World instead of the store.

4d. Layout write path and render decoupling

Remove layout side effects from render incrementally by node family.

Approach:

Inventory drawNode() call paths that still trigger arrange().
For one node family at a time, run LayoutSystem in update phase and mark entities as layout-clean before render.
Keep a temporary compatibility fallback that runs legacy layout only for non-migrated families.
Delete fallback once parity tests and frame-time budgets are met.

Risk: High. Mixed-mode operation must avoid stale layout reads. Requires family-level rollout and targeted regression tests.

Render hot-path performance gate

Before enabling ECS render reads as default for any migrated family:

Benchmark representative workflows (200-node and 500-node minimum).
Compare legacy vs ECS p95 frame time and mean draw cost.
Block rollout on statistically significant regression beyond agreed budget (default budget: 5% p95 frame-time regression ceiling).
Capture profiler traces proving the dominant cost is not repeated world.getComponent() lookups.
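The budget check in the gate reduces to a small calculation. The sketch below uses the nearest-rank percentile method, which is one of several reasonable definitions, and checks only the 5% ceiling; it does not attempt the statistical significance test the gate also calls for. `percentile` and `passesFrameTimeGate` are hypothetical helper names.

```typescript
// Nearest-rank percentile over frame-time samples (milliseconds).
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error('No samples')
  const sorted = samples.slice().sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(0, rank - 1)]
}

// True when the ECS p95 stays within the allowed regression budget.
function passesFrameTimeGate(
  legacyFrameMs: readonly number[],
  ecsFrameMs: readonly number[],
  budget = 0.05 // default from the plan: 5% p95 regression ceiling
): boolean {
  const legacyP95 = percentile(legacyFrameMs, 95)
  const ecsP95 = percentile(ecsFrameMs, 95)
  return ecsP95 <= legacyP95 * (1 + budget)
}
```

For example, with a legacy p95 of 16.0 ms the ceiling is 16.8 ms, so an ECS p95 of 16.5 ms passes and 17.0 ms blocks the rollout.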

Phase 3 -> 4 gate (required)

Phase 4 starts only when all of the following are true:

A transaction wrapper API exists on the World and is used by connectivity and widget write paths in integration tests.
Undo batching parity is proven: one logical user action yields one undo checkpoint in both legacy and ECS paths.
Callback timing and rejection semantics from Phase 4b are covered by integration tests.
A representative extension suite passes, including rgthree-comfy.
Write bridge re-entrancy tests prove there is no World <-> legacy feedback loop.
Layout migration for any enabled node family passes read-only render checks (no arrange() writes during draw).
Render hot-path benchmark gate passes for every family moving to ECS-first reads.

Phase 5: Legacy Removal

Remove bridge layers and deprecated class properties. This phase happens per-component, not all at once.

5a. Remove Position bridge

Once all position reads and writes go through the World, remove the bridge and the pos/size properties from LGraphNode, Reroute, LGraphGroup.

5b. Remove widget class hierarchy

Once all widget behavior is in systems, the 23+ widget subclasses can be replaced with component data + system functions. BaseWidget, NumberWidget, ComboWidget, etc. become configuration data rather than class instances.

5c. Dissolve god objects

LGraphNode, LLink, LGraph become thin shells — their only role is holding the entity ID and delegating to the World. Eventually, they can be removed entirely, replaced by entity ID + component queries.

Risk: Very High. This is the irreversible step. Must be done only after thorough validation that all consumers (including extensions) work with the ECS path.

Phase 4 -> 5 exit criteria (required)

Legacy removal starts only when all of the following are true:

The component being removed has no remaining direct reads or writes outside World/system APIs.
Serialization equivalence tests pass continuously for one release cycle.
A representative extension compatibility matrix is green, including rgthree-comfy.
Bridge instrumentation shows zero fallback-path usage in integration and e2e suites.
A rollback plan exists for each removal PR until the release is cut.
ECS write path has run as default behind a kill switch for at least one full release cycle.
No unresolved P0/P1 extension regressions are attributed to ECS migration in that cycle.

Phase 5 trigger packet (required before first legacy-removal PR)

The team prepares a single go/no-go packet containing:

Phase 4 -> 5 criteria checklist with links to evidence.
Extension compatibility matrix results.
Bridge fallback usage report (must be zero for the target concern).
Performance gate report for ECS render/read paths.
Rollback owner, rollback steps, and release coordination sign-off.

Open Questions

CRDT / ECS coexistence

The LayoutStore uses Y.js CRDTs for collaboration-ready position data (per ADR 0003). The ECS World uses plain Maps. These must coexist.

Options explored in Phase 2a. The recommended path (World copies from Y.js) defers the hard question. Eventually, the World may need to be CRDT-native — but this requires a separate ADR.

Questions to resolve:

Should non-position components also be CRDT-backed for collaboration?
Does the World need an operation log for undo/redo, or can that remain external (Y.js undo manager)?
How does conflict resolution work when two users modify the same component?

Extension API preservation

The current system exposes lifecycle callbacks on entity classes:

| Callback | Class | Purpose |
| --- | --- | --- |
| `onConnectInput` | `LGraphNode` | Validate/reject incoming connection |
| `onConnectOutput` | `LGraphNode` | Validate/reject outgoing connection |
| `onConnectionsChange` | `LGraphNode` | React to topology change |
| `onRemoved` | `LGraphNode` | Cleanup on deletion |
| `onAdded` | `LGraphNode` | Setup on graph insertion |
| `onConfigure` | `LGraphNode` | Post-deserialization hook |
| `onWidgetChanged` | `LGraphNode` | React to widget value change |

Extensions register these callbacks to customize node behavior. The ECS migration must preserve this contract or provide a documented migration path for extension authors.

Recommended approach: Define an EntityLifecycleEvent system that emits typed events at the same points where callbacks currently fire. The bridge layer translates events into legacy callbacks. Extensions can gradually adopt event listeners instead of callbacks.

Phase 4 decisions:

Rejection callbacks act as pre-commit guards (reject before World mutation).
Callback dispatch remains synchronous during the bridge period.
Callback order remains: output validation -> input validation -> commit -> output change notification -> input change notification.
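A typed event emitter with a bridge-side translator might look like the sketch below. The event names follow the migration examples in this document; the payload shapes, `LifecycleEvents`, `LegacyNodeLike`, and `bridgeRemovalToLegacy` are assumptions for illustration.

```typescript
// Assumed payload shapes for two lifecycle events.
interface LifecycleEventMap {
  'entity.removed': { kind: 'node' | 'link'; entityId: number }
  'connection.changed': { nodeId: number; side: 'input' | 'output'; connected: boolean }
}

type AnyListener = (event: unknown) => void

class LifecycleEvents {
  private readonly listeners = new Map<string, AnyListener[]>()

  // Registers a listener and returns an unsubscribe function.
  on<K extends keyof LifecycleEventMap>(
    type: K,
    listener: (event: LifecycleEventMap[K]) => void
  ): () => void {
    const list = this.listeners.get(type) ?? []
    const erased = listener as AnyListener
    list.push(erased)
    this.listeners.set(type, list)
    return () => {
      const index = list.indexOf(erased)
      if (index >= 0) list.splice(index, 1)
    }
  }

  // Dispatch is synchronous, matching the bridge-period decision above.
  emit<K extends keyof LifecycleEventMap>(type: K, event: LifecycleEventMap[K]): void {
    const list = this.listeners.get(type) ?? []
    list.slice().forEach((listener) => listener(event)) // copy: safe to unsubscribe mid-dispatch
  }
}

// Minimal legacy surface the bridge needs.
interface LegacyNodeLike {
  onRemoved?: () => void
}

// Bridge: translates the ECS event into the legacy onRemoved callback.
function bridgeRemovalToLegacy(
  events: LifecycleEvents,
  nodes: ReadonlyMap<number, LegacyNodeLike>
): () => void {
  return events.on('entity.removed', (event) => {
    if (event.kind !== 'node') return
    nodes.get(event.entityId)?.onRemoved?.()
  })
}
```

The system that emits `entity.removed` never references `onRemoved`; only the bridge knows both sides, so removing the bridge later does not touch system code.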

Extension Migration Examples (old -> new)

The bridge keeps legacy callbacks working, but extension authors can migrate incrementally to ECS-native patterns.

1) Widget lookup by name

```typescript
// Legacy pattern
const seedWidget = node.widgets?.find((w) => w.name === 'seed')
seedWidget?.setValue(42)

// ECS pattern (using the bridge/world widget lookup index)
const seedWidgetId = world.widgetIndex.getByNodeAndName(nodeId, 'seed')
// Compare against undefined so that a widget ID of 0 is not skipped
// (assumes the index returns undefined on a miss).
if (seedWidgetId !== undefined) {
  const widgetValue = world.getComponent(seedWidgetId, WidgetValue)
  if (widgetValue) {
    world.setComponent(seedWidgetId, WidgetValue, {
      ...widgetValue,
      value: 42
    })
  }
}
```

2) `onConnectionsChange` callback

```typescript
// Legacy pattern
nodeType.prototype.onConnectionsChange = function (
  side,
  slot,
  connected,
  linkInfo
) {
  updateExtensionState(this.id, side, slot, connected, linkInfo)
}

// ECS pattern
lifecycleEvents.on('connection.changed', (event) => {
  if (event.nodeId !== nodeId) return
  updateExtensionState(
    event.nodeId,
    event.side,
    event.slotIndex,
    event.connected,
    event.linkInfo
  )
})
```

3) `onRemoved` callback

```typescript
// Legacy pattern
nodeType.prototype.onRemoved = function () {
  cleanupExtensionResources(this.id)
}

// ECS pattern
lifecycleEvents.on('entity.removed', (event) => {
  if (event.kind !== 'node' || event.entityId !== nodeId) return
  cleanupExtensionResources(event.entityId)
})
```

4) `graph._version++`

```typescript
// Legacy pattern (do not add new usages)
graph._version++

// Bridge-safe transitional pattern (Phase 0a)
graph.incrementVersion()

// ECS-native pattern: mutate through command/system API.
// VersionSystem bumps once at transaction commit.
executor.run({
  type: 'SetWidgetValue',
  execute(world) {
    const value = world.getComponent(widgetId, WidgetValue)
    if (!value) return
    world.setComponent(widgetId, WidgetValue, { ...value, value: 42 })
  }
})
```

Question to resolve after compatibility parity:

Should ECS-native lifecycle events stay synchronous after bridge removal, or can they become asynchronous once legacy callback compatibility is dropped?

Atomicity and transactions

The ECS lifecycle scenarios claim operations are "atomic." This requires the World to support transactions — the ability to batch multiple component writes and commit or rollback as a unit.

Current state: beforeChange() / afterChange() provide undo/redo checkpoints but not true transactions. The graph can be in an inconsistent state between these calls.

Phase 4 baseline semantics:

Mutating systems run inside world.transaction(label, fn).
The bridge maps one World transaction to one beforeChange() / afterChange() bracket.
Operations with multiple component writes (for example connect() touching slots, links, and node metadata) still commit as one transaction and therefore one undo entry.
Failed transactions do not publish partial writes, lifecycle events, or version increments.
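These baseline semantics can be illustrated with staged writes that are published only if the transaction body completes. `TransactionalWorld` is a hypothetical, string-keyed stand-in for the World; the `undoEntries` array stands in for the `beforeChange()` / `afterChange()` bracket.

```typescript
type StagedWrite = (key: string, value: unknown) => void

// Hypothetical stand-in for a World with transaction support.
class TransactionalWorld {
  private readonly committed = new Map<string, unknown>()
  version = 0
  readonly undoEntries: string[] = []

  get(key: string): unknown {
    return this.committed.get(key)
  }

  // Runs fn with a staging writer; publishes all writes together on success.
  transaction(label: string, fn: (write: StagedWrite) => void): void {
    const staged = new Map<string, unknown>()
    fn((key, value) => {
      staged.set(key, value)
    }) // if fn throws, nothing below runs and nothing is published
    staged.forEach((value, key) => this.committed.set(key, value))
    this.version += 1 // one version bump per transaction
    this.undoEntries.push(label) // one undo entry per transaction
  }
}
```

In this sketch, reads inside a transaction do not see staged writes; whether they should is a design decision the plan does not settle.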

Questions to resolve:

How should world.transaction() interact with Y.js transactions when a component is CRDT-backed?
Is eventual consistency acceptable for derived data updates between transactions, or must post-transaction state always be immediately consistent?

Keying strategy unification

The 6 proto-ECS stores use 6 different keying strategies:

| Store | Key Format |
| --- | --- |
| WidgetValueStore | `"${nodeId}:${widgetName}"` |
| PromotionStore | `"${sourceNodeId}:${widgetName}"` |
| DomWidgetStore | Widget UUID |
| LayoutStore | Raw nodeId/linkId/rerouteId |
| NodeOutputStore | `"${subgraphId}:${nodeId}"` |
| SubgraphNavigationStore | subgraphId or `'root'` |

The World unifies these under branded entity IDs. But stores that use composite keys (e.g., nodeId:widgetName) reflect a genuine structural reality — a widget is identified by its relationship to a node. Synthetic WidgetEntityIds replace this with an opaque number, requiring a reverse lookup index.

Trade-off: Type safety and uniformity vs. self-documenting keys. The World should maintain a lookup index ((nodeId, widgetName) -> WidgetEntityId) for the transition period.

Dependency Graph

```
Phase 0a (incrementVersion)  ──┐
Phase 0b (ID type aliases)  ───┤
Phase 0c (doc fixes)  ─────────┤── no dependencies between these
                                │
Phase 1a (branded IDs)  ────────┤
Phase 1b (component interfaces) ┤── 1b depends on 1a
Phase 1c (World type)  ─────────┘── 1c depends on 1a, 1b

Phase 2a (Position bridge)  ────┐── depends on 1c
Phase 2b (Widget bridge)  ──────┤── depends on 1a, 1c
Phase 2c (Node metadata bridge) ┘── depends on 0a, 1c

Phase 3a (SerializationSystem)  ─── depends on 2a, 2b, 2c
Phase 3b (VersionSystem)  ──────── depends on 0a, 2c
Phase 3c (ConnectivitySystem)  ──── depends on 2c

Phase 3->4 gate checklist  ──────── depends on 3a, 3b, 3c

Phase 4a (Position writes)  ────── depends on 2a, 3b
Phase 4b (Connectivity mutations) ─ depends on 3c, 3->4 gate
Phase 4c (Widget writes)  ─────── depends on 2b
Phase 4d (Layout decoupling)  ─── depends on 2a, 3->4 gate

Phase 4->5 exit criteria  ──────── depends on all of Phase 4

Phase 5 (legacy removal)  ─────── depends on 4->5 exit criteria
```

Risk Summary

| Phase | Risk | Reversibility | Extension Impact |
| --- | --- | --- | --- |
| 0 (Foundation) | None | Fully reversible | None |
| 1 (Types/World) | Low | New files, deletable | None |
| 2 (Bridge) | Low-Medium | Bridge is additive | None |
| 3 (Systems) | Low-Medium | Systems run in parallel | None |
| 4 (Write path) | High | Two-way sync is fragile | Callbacks must be preserved |
| 5 (Legacy removal) | Very High | Irreversible | Extensions must migrate |

The plan is designed so that Phases 0-3 can ship without any risk to extensions or existing behavior. Phase 4 is where the real migration begins, and Phase 5 is the point of no return.
