Anthony Chang
9287b7c6b3
Grouped batched attention + permute (#412)
* grouped attn without batch validates; now move toward grouped batched attn
* grouped batched attention
* working
* remove debug logging
* clean up
* reintroduce g_ prefix back to host tensor variables
* format
* rename file
* restore old file
* rename
* consolidate padded/non-padded attention example
* harmonize padding specialization in attn examples
2022-09-19 16:09:44 -05:00