Aviral Goel
54de3e55e1
Implementing Test Filters for Smoke and Regression Tests (#1819)
...
* smoke and regression targets working with tests
* test filters work for both examples and test
* removed unnecessary comments
* added a missing comment
* added a missing comment
* fixed typo in the comments
* updated README
* Update PULL_REQUEST_TEMPLATE.md
updating the template for future addition of test cases
* Update PULL_REQUEST_TEMPLATE.md
2025-01-16 16:40:08 -08:00
darren-amd
26b3829c02
Disable building DPP kernels by default (#1804)
...
* Disable building DPP kernels by default
* Disable building dpp instances, examples, or tests if DPP_KERNELS is not set
* Add new DPP_KERNELS flag to readme
2025-01-08 13:50:42 -05:00
Bartłomiej Kocot
5affda819d
Add basic documentation structure (#1715)
...
* Add basic documentation structure
* Add terminology placeholder
* Add codegen placeholder
* Create template for each page
2024-12-04 00:46:47 +01:00
Harisankar Sadasivan
d6d4c2788b
universal streamk fp8 changes (#1665)
...
* universal streamk fp8 changes & ckprofiler instances
* revert strides to -1 and verification options
* fp8 exclusion on pre-gfx94 for universal_streamk
* PR review based revisions: permissions reverted, removed hip err checks
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-11-21 08:21:37 -08:00
Illia Silin
03c6448ba3
Reduce build time. (#1621)
...
* disable fp8 gemm_universal on gfx90a and gfx908 by default
* fix cmake syntax
* fix clang format
* add ifdefs in amd_xdlops
* disable fp8 gemm instances on gfx90a by default
* update readme
2024-11-01 13:52:23 +08:00
spolifroni-amd
794f2d64a8
added link to documentation (#1578)
2024-10-21 08:35:57 -07:00
Illia Silin
f46a9eee9d
only build tests and examples if user sets GPU_TARGETS (#1565)
2024-10-10 15:31:56 -07:00
Illia Silin
7d8ea5f08b
Fix build logic using GPU_ARCHS. (#1536)
...
* update build logic with GPU_ARCHS
* fix the GPU_ARCHS build for codegen
* unset GPU_TARGETS when GPU_ARCHS are set
2024-10-07 08:18:23 -07:00
Lisa
281f836903
fix typo (#1067)
...
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2023-12-14 14:21:18 -08:00
Illia Silin
d939411dae
Switch from ROCmSoftwarePlatform to ROCm org (#1091)
...
* switch from ROCmSoftwarePlatform to ROCm org
* replace ROCmSoftwarePlatform with ROCm in few more places
2023-12-07 15:59:34 -08:00
Illia Silin
4e44a9e8da
Enable sccache in the default docker and CI. (#1009)
...
* replace ccache with sccache, pin package versions
* put ccache back temporarily to avoid breaking other CI jobs
* add sccache_wrapper.sh script
* fix the package version syntax
* fix the pymysql package issue
* run sccache_wrapper before build if ccache server found
* set the paths before calling the sccache_wrapper
* use /tmp instead of /usr/local for cache
* try using sccache --start-server instead of wrapper
* try using redis server with sccache
* define SCCACHE_REDIS
* add redis and ping packages, and redis port
* use the new sccache redis server
* do not use sccache with staging compiler
* fix the condition syntax
* add stunnel to redis
* add tunnel verification
* separate caches for different architectures
* fix syntax for the cache tag
* use double brackets for conditions
* add bash line to the script
* add a switch for sccache and only use it in build stage
* run check_host function when enabling sccache
* fix the invocation tags for sccache
* fix groovy syntax
* set the invocation tag in groovy
* disable sccache in clang-format stage
* try another syntax for invocation tags
* use local sccache server if can't connect to redis
* fix script syntax
* update README
* refresh readme
* readme updates
* remove the timing and verification caveat from readme
---------
Co-authored-by: Lisa Delaney <lisa.delaney@amd.com>
2023-10-30 13:16:29 -07:00
Illia Silin
9195435c77
Disable DL kernels by default. (#816)
2023-07-26 11:06:45 -05:00
Adam Osewski
237f9cd3aa
Add basic setup for precommit (#749) (#764)
...
* Add basic setup for precommit
* Update README.md with instructions on installing precommit hooks
---------
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: Bartlomiej Wroblewski <bwroblewski10@gmail.com>
2023-07-06 11:01:06 -05:00
Sam Wu
3cff340423
Documentation Updates (#710)
...
* update documentation dependencies
add version number to docs
rename doc config directories
enable more doc formats on rtd
add license section in docs
2023-05-18 11:08:38 -06:00
Sam Wu
f80776d937
standardize docs (#655)
2023-03-23 20:58:59 -07:00
Po Yen Chen
337642a48c
Add quotes for string option values (#472)
2022-10-27 15:33:14 -06:00
Chao Liu
6de749e29c
Update doc (#464)
...
* update cmake script
* update readme
* Update README.md
* add citation
* add images
* Update README.md
* update
* Update README.md
* Update CONTRIBUTORS.md
* Update README.md
* Update CITATION.cff
* Update README.md
* Update CITATION.cff
* update doc
* Update CONTRIBUTORS.md
* Update LICENSE
2022-10-03 14:34:40 -05:00
Chao Liu
473ba5bc4a
update document: Readme, contributors, citation, (#463)
...
* update cmake script
* update readme
* Update README.md
* add citation
* add images
* Update README.md
* update
* Update README.md
* Update CONTRIBUTORS.md
* Update README.md
* Update CITATION.cff
* Update README.md
* Update CITATION.cff
2022-10-03 00:48:24 -05:00
Chao Liu
500fa99512
Clean up conv example, Instances, profiler and test (#324)
...
* convnd_fwd fp16 example
* update example
* update example
* update instance
* updating reference conv
* update reference conv
* update conv fwd profiler
* update conv 1d and 3d instance
* update include path
* clean
* update profiler for conv bwd data and weight
* update conv bwd weight
* clean
* update conv example
* update profiler for conv bwd weight
* update ckprofiler for conv bwd data
* fix reference conv bwd data bug; update conv bwd data test
* update examples
* fix initialization issue
* update test for conv fwd
* clean
* clean
* remove test case too sensitive to error threshold
* fix test
* clean
* fix build
* adding conv multiple d
* adding conv multiple D
* add matrix padder
* add gemm padding to convnd
* adding group conv
* update gemm multi-d
* refactor
* refactor
* refactor
* clean
* clean
* refactor
* refactor
* reorg
* add ds
* add bias
* clean
* add G
* adding group
* adding group
* adding group
* update Tensor
* clean
* update example
* update DeviceGemmMultipleD_Xdl_CShuffle
* update conv bwd-data and bwd-weight
* update contraction example
* update gemm and batch gemm with e permute
* fix example build
* instance for grouped conv1d
* update example
* adding group conv instance
* update gemm bilinear instance
* update gemm+add+add+fastgelu instance
* update profiler
* update profiler
* update test
* update test and client example
* clean
* add grouped conv into profiler
* update profiler
* clean
* add test grouped conv, update all conv test to gtest
* update test
2022-07-29 18:19:25 -05:00
Chao Liu
0dcb3496cf
Improve external interface for GEMM and GEMM+add+add+fastgelu (#311)
...
* interface for GEMM and GEMM+add+add+fastgelu
* rename namespace
* instance factory
* fix build
* fix build; add GEMM client example
* clean
2022-06-30 22:11:00 -05:00
Liam Wrubleski
b653c5eb2e
Switch to standard ROCm packaging (#301)
...
* Switch to standard ROCm packaging
* Revert .gitignore changes
* install new rocm-cmake version
* update readme
Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-06-25 09:35:16 -05:00
Chao Liu
ccbd8d907b
update readme and script (#290)
2022-06-20 23:34:32 -05:00
JD
cec69bc3bc
Add host API (#220)
...
* Add host API
* manually rebase on develop
* clean
* manually rebase on develop
* exclude tests from all target
* address review comments
* update client app name
* fix missing lib name
* clang-format update
* refactor
* refactor
* refactor
* refactor
* refactor
* fix test issue
* refactor
* refactor
* refactor
* update cmake and readme
Co-authored-by: Chao Liu <chao.liu2@amd.com>
2022-05-12 09:21:01 -05:00
Wen-Heng (Jack) Chung
968bd93285
Update README.md (#228)
2022-05-09 15:00:04 -05:00
Chao Liu
cd167e492a
Compile for gfx908 and gfx90a (#130)
...
* adding compilation for multiple targets
* fix build
* clean
* update Jenkinsfile
* update readme
* update Jenkins
* use ck::half_t instead of ushort for bf16
* rename enum classes
* clean
* rename
* clean
2022-03-31 12:33:34 -05:00
Chao Liu
e823d518cb
ckProfiler and device-level XDL GEMM operator (#48)
...
* add DeviceGemmXdl
* update script
* fix naming issue
* fix comment
* output HostTensorDescriptor
* rename
* padded GEMM for fwd v4r4r4 nhwc
* refactor
* refactor
* refactor
* adding ckProfiler
* adding ckProfiler
* refactor
* fix tuning parameter bug
* add more gemm instances
* add more fp16 GEMM instances
* fix profiler driver
* fix bug in tuning parameter
* add fp32 gemm instances
* small fix
* refactor
* rename
* refactor gemm profiler; adding DeviceConv and conv profiler
* refactor
* fix
* add conv profiler
* refactor
* adding more GEMM and Conv instance
* Create README.md
Add build instruction for ckProfiler
* Create README.md
Add Readme for gemm_xdl example
* Update README.md
Remove build instruction from top most folder
* Update README.md
* clean up
2021-11-14 11:28:32 -06:00
Chao Liu
c03045ce2d
rename
2021-08-10 23:45:36 +00:00
Chao Liu
85a1429301
Update README.md
2021-07-28 09:41:38 -05:00
Chao Liu
56f93c6f33
Update README.md
2021-07-28 09:40:44 -05:00
Chao Liu
4682d070a6
Create README.md (#45)
...
* Create README.md
2021-07-08 13:32:29 -05:00