* lora: add weight shape calculations.
This lets the loader know if a lora will change the shape of a weight
so it can take appropriate action.
* MPDynamic: force load flux img_in weight
This weight is a bit special, in that the lora changes its geometry.
This is rather unique, not handled by existing estimate and doesn't
work for either offloading or dynamic_vram.
Fix for dynamic_vram as a special case. Ideally we can fully precalculate
these lora geometry changes at load time, but just get these models
working first.
* Add factorization utils for lokr
* Add lokr train impl
* Add loha train impl
* Add adapter map for algo selection
* Add optional grad ckpt and algo selection
* Update __init__.py
* correct key name for loha
* Use custom fwd/bwd func and better init for loha
* Support gradient accumulation
* Fix bugs of loha
* use more stable init
* Add OFT training
* linting