Mithun Mohan
9906fd7b91
F32 eltwise kernel updates to use masks in scale factor load.
- Currently the scale factor is loaded without a mask in the downscale
and matrix add/mul ops in the F32 eltwise kernels. This results in
out-of-bounds memory reads when n is not a multiple of NR (64).
- The loads are updated to masked loads to fix this.
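
The idea behind the fix can be sketched in portable C. This is only an illustration, not the kernel code from this commit: it assumes 16 float lanes per register (as with 512-bit vectors), and `tail_mask`, `masked_load_f32`, and `load_scale_block` are hypothetical names. The scalar loop stands in for a zero-masking vector load such as the AVX-512 intrinsic `_mm512_maskz_loadu_ps`, where lanes whose mask bit is clear are never read from memory.

```c
#include <stddef.h>
#include <stdint.h>

#define NR    64  /* column block size named in the commit message */
#define LANES 16  /* assumed: float lanes in one 512-bit register */

/* Build a lane mask with one bit set per valid element in this register. */
static uint16_t tail_mask(size_t remaining)
{
    if (remaining >= LANES) return 0xFFFFu;
    return (uint16_t)((1u << remaining) - 1u);
}

/* Scalar stand-in for a zero-masking load: masked-off lanes are not read. */
static void masked_load_f32(float dst[LANES], const float *src, uint16_t mask)
{
    for (int i = 0; i < LANES; ++i)
        dst[i] = ((mask >> i) & 1u) ? src[i] : 0.0f;
}

/* Load up to NR scale factors starting at column j of an n-element vector.
 * Elements at or beyond n are never touched and come back as zero. */
static void load_scale_block(float dst[NR], const float *scale, size_t n, size_t j)
{
    size_t left = (j < n) ? (n - j) : 0;
    if (left > NR) left = NR;

    for (size_t r = 0; r < NR / LANES; ++r) {
        size_t off = r * LANES;
        size_t rem = (left > off) ? (left - off) : 0;

        if (rem == 0) {
            /* Whole register is past the end: skip the load entirely. */
            for (int i = 0; i < LANES; ++i) dst[off + i] = 0.0f;
            continue;
        }
        masked_load_f32(dst + off, scale + j + off, tail_mask(rem));
    }
}
```

With n = 70 and j = 64, only 6 elements remain, so the first register is loaded with mask 0x003F and the other three are skipped. An unmasked load at the same position would read 58 floats past the end of the scale factor buffer.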
AMD-Internal: [SWLCSG-3390]
Change-Id: Ib2fc555555861800c591344dc28ac0e3f63fd7cb
2025-02-27 08:17:58 -05:00