feat(isolation): process isolation for custom nodes via pyisolate

Adds opt-in process isolation for custom nodes using pyisolate's
bwrap sandbox and JSON-RPC bridge. Each isolated node pack runs in
its own child process with zero-copy tensor transfer via shared memory.

Core infrastructure:
- CLI flag --use-process-isolation to enable isolation
- Host/child startup fencing via PYISOLATE_CHILD env var
- Manifest-driven node discovery and extension loading
- JSON-RPC bridge between host and child processes
- Shared memory forensics for leak detection

Proxy layer:
- ModelPatcher, CLIP, VAE, and ModelSampling proxies
- Host service proxies (folder_paths, model_management, progress, etc.)
- Proxy base with automatic method forwarding

Execution integration:
- Extension wrapper with V3 hidden param mapping
- Runtime helpers for isolated node execution
- Host policy for node isolation decisions
- Fenced sampler device handling and model ejection parity

Serializers for cross-process data transfer:
- File3D (GLB), PLY (structured + gaussian), NPZ (streaming frames),
  VIDEO (VideoFromFile + VideoFromComponents) serializers
- data_type flag in SerializerRegistry for type-aware dispatch
- Isolated get_temp_directory() fence

New core save nodes:
- SavePLY and SaveNPZ with comfytype registrations (Ply, Npz)

DynamicVRAM compatibility:
- comfy-aimdo early init gated by isolation fence

Tests:
- Integration and policy tests for isolation lifecycle
- Manifest loader, host policy, proxy, and adapter unit tests

Depends on: pyisolate >= 0.9.2
This commit is contained in:
John Pollock
2026-03-12 01:13:43 -05:00
parent 9ce4c3dd87
commit c5e7b9cdaf
54 changed files with 9061 additions and 78 deletions

View File

@@ -92,7 +92,7 @@ if args.cuda_malloc:
env_var = os.environ.get('PYTORCH_CUDA_ALLOC_CONF', None)
if env_var is None:
env_var = "backend:cudaMallocAsync"
else:
elif not args.use_process_isolation:
env_var += ",backend:cudaMallocAsync"
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = env_var