Fixes LDS bank conflicts on gfx950 for universal gemm v3 pipeline
Replaces hardcoded LDS layer calculations with dynamic computation using the new architecture helpers
Adds architecture-specific helper function get_n_lds_banks()
Changes function attributes from CK_TILE_HOST_DEVICE to CK_TILE_DEVICE in universal gemm policy