* universal streamk with atomics with ckprofiler support. grid_size and streamk strategy are tunable. grid_size of -1 leads to #WGs = maximum occupancy X num_CUs. implementation supports many different streamk policies: 1-tile, 2-tile, 3-tile and 4-tile. streamk strategy of -1 leads to default streamk policy (4-tile).
* Update README.md
* fixing clang-format issues
* removed conflicts in struct members between streamk and universal streamk
* corrected arg parsing for streamk and universal streamk
* added stream-k policies for 3 tile and 4 tile
* fixed argument type issue with parsing cmd args
* changes suggested in PR review are made- removing comments and correcting copyright
* file permissions updated
* added default value support for grid_size and streamk-policy selection set to -1
* print messages for arguments
* print messages for arguments
* print messages for arguments1
* experiment with config file
* experiment with version.h config
* add more info to version.h
* minor updates
* minor updates
* fix case where DTYPE is not used
* large amount of files but minor changes
* remove white space
* minor changes to add more MACROs
* fix cmakedefine01
* fix issue with CK internal conflict
* fix define and define value
* fix clang-format
* fix formatting issue
* experiment with cmake
* clang format v12 to be consistent with miopen
* avoid clang-format for config file