Juan Manuel Martinez Caamaño 1810074bd7 [buffer store/load] Replace memcpy with __builtin_memcpy in cast_to_amdgpu_buffer_rsrc_t
We used memcpy to implement a bitcast to the opaque type
amdgcn_buffer_rsrc. However, HIP's implementation of memcpy did not
allow the compiler to infer that a copy of a uniform value is itself
uniform.

As a result, the compiler emitted a waterfall loop over every value that
the copy could take, which cost performance.

When we use __builtin_memcpy, the optimizer correctly handles the
uniform copy.
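
A minimal host-side sketch of the bitcast pattern described above, not the
actual code from this change. RawDescriptor, OpaqueResource and
cast_to_opaque_resource are hypothetical stand-ins, since the real buffer
resource type is only available when compiling for an AMDGPU target. The
sketch shows only the shape of the cast; it cannot demonstrate the
uniformity analysis, which happens in the device compiler.

```cpp
#include <cstdint>

// Hypothetical stand-in for the raw 128-bit descriptor words.
struct RawDescriptor {
    std::uint32_t words[4];
};

// Hypothetical stand-in for the opaque buffer resource type.
struct OpaqueResource {
    std::uint32_t payload[4];
};

// Bitcast via the compiler builtin rather than a library memcpy.
// The builtin is lowered by the compiler itself, so the optimizer sees
// a plain copy of the source bytes instead of a call into a library
// implementation of memcpy.
inline OpaqueResource cast_to_opaque_resource(const RawDescriptor& raw) {
    static_assert(sizeof(OpaqueResource) == sizeof(RawDescriptor),
                  "a bitcast requires source and destination of equal size");
    OpaqueResource out;
    __builtin_memcpy(&out, &raw, sizeof(out));
    return out;
}
```

__builtin_memcpy is a GCC/Clang builtin, so this sketch assumes one of those
compilers.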

Solves SWDEV-537500
2025-06-16 17:41:00 +02:00