* Re-structure ckProfiler source files
* Rename profiler.cpp to main.cpp
* Modularize ckProfiler operations
* Add description for profiler operations
* Use longer name to avoid name collision
* Use macro to delay expansion
* Use std::move() to avoid object copying
* Prohibit users from calling dtor
* Use macro to eliminate redundant code
* Make friend function hidden
* Add missing include directive <iostream>
* Fix wrong include directives
* Remove int8 from batchnorm-forward instances since it is not needed for forward training and could fail test
Co-authored-by: Qianfeng Zhang <Qianfeng.Zhang@amd.com>
* add intrin_mfma_f64_16x16x4f64
* add example
* gemm reference add double data type
* change init data
* fix M N PerXdlops
* fix ifdef
* add comparison config
* add conv fwd example
* format log out
* change rc matrix register layout
* reorganize example
* reorganize example 2
* format, because of merging develop
* fix call impl adding acc data type
* lost ;
* add compiler warning
* change example tuning parameters
* add test for fp64
* add instance
* add test/gemm/gemm_fp64.cpp
* fix get name issue
* remove some tuning parameters
* fix conflict
* format
* use integer value for GEMM test
* add acc data type
* remove typeid because fp16
* fix streamconfig etc bug from merging develop
* format
* remove test_gemm_xdl_fp64
* add AccDataType
* AccDataType problem
Co-authored-by: qinletao <letaoqin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
* modify ckProfiler_gemm output
* fix syntax
* change ckProfiler output and return 0
* fix syntax
* output datatype
* fix syntax
* output datatype in another way
* fix syntax
* fix syntax
* test return values of ckProfiler
* add layout info and tests, make sure ckprofiler returns 0
* fix syntax
* change layout output
* fix syntax
* fix syntax again
* update script to process perf results
* rearrange jenkins stages
* fix typo
* add python packages to Docker file
* adding setuptools-rust package
* modify parsing for new test parameters
* test db credentials on jenkins
* fix syntax
* update python script to handle incomplete lines
* upgrade python to 3.8 and write the gemm_params table
* add sqlalchemy package to docker
* move perf data processing to master node
* move the master node inside a steps region
* add new stage for result processing
* move results processing to separate stage
* reduce number of tests to speedup debugging
* pass config to processPerfResults stage
* run script on master in a docker container
* replace show_node_info
* try loading docker on master node again
* use ansible node instead of master
* get rid of pymysql package
* try ssh connection using paramiko
* put back pymysql
* put the perf data processing back on the gpu node
* put back artifact definition
* archive the perf_log before parsing
* clean up jenkinsfile, fix parsing
* fix typo
* enable all perf tests
* put all stages in original order, finalize script
* fix gpu_arch version
* update parsing script
* remove obsolete file causing merge conflict
* [What] Separate fixpoint gemm from gemm example
[Why] let example of gemm_int8 be pure gemm.
[What]
1. Add gemm_requant_relu_requant,
2. Let CDataType be int32 in pure gemm, because no one uses int8 CDataType. It is also part of gemm_requant_relu_requant
* Fix path
* Revise cmakelist due to merge develop
* Add gemm fp16 test
* Extract PrepareGemmTensor
* Extract TestGemm
* Add test for different layout
* Add 4 layouts of shuffle version of fp32
* Add 4 layouts of shuffle version of int8
* Add 4 layouts of shuffle version of bf16
* replace all DeviceGemmPtr_ with DeviceGemmNoOpPtr to fit naming convention
* Add test for non-shuffle version of gemm
* Fix typo
* Print kernel information
* Add rest of the fp32 kernel to the test
* 1. Add rest of the fp16 device op.
2. Mark the invalid device operation
Co-authored-by: rocking <chunylai@amd.com>
* Add int8 of mk_nk_mn to the ckProfiler
* Add example of int8 gemm
* Fix typo, use ushort instead of half_t for bfloat16
* replace ushortXXX_t with bhalfXXX_t
* rename ushort to bhalf_t
* Add bf16 example
* Add bf16 gemm to ckProfiler
* Fix alignment
* Fix typo
* Add unit test for gemm_xdl int8
* Add gemm_xdl fp32 unit test
* Add gemm_xdl bf16 unit test
* fix build
* fix build issue due to merge conflict
* Fix build
* Fix build error
Co-authored-by: rocking <chunylai@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
* init for splitk f16
* a working prototype
* debug
* perf debug
* update example
* instances for mk kn
* add instances for all layers
* clean
* clean
* add tuning
* format
* add mn_padding into irregular tile
* clean
Co-authored-by: Chao Liu <chao.liu2@amd.com>
* tweak conv for odd C
* update script
* clean up elementwise op
* fix build
* clean up
* added example for gemm+bias+relu+add
* added example for gemm+bias+relu
* add profiler for gemm_s_shuffle; re-org files
* add profiler
* fix build
* clean up
* clean up
* clean up
* fix build