mirror of
https://github.com/amd/blis.git
synced 2026-05-12 18:15:37 +00:00
Details: - Added a new 'zen3' subconfiguration targeting support for the AMD Zen3 microarchitecture (#561). Thanks to AMD for this contribution. - Restructured clang and AOCC support for zen, zen2, and zen3 make_defs.mk files. The clang and AOCC version detection now happens in configure, not in the subconfigurations' makefile fragments. That is, we've added logic to configure that detects the version of clang/AOCC, outputs an appropriate variable to config.mk (ie: CLANG_OT_*, AOCC_OT_*), and then checks for it within the makefile fragment (as is currently done for the GCC_OT_* variables). - Added configure support for a GCC_OT_10_1_0 variable (and associated substitution anchor) to communicate whether the gcc version is older than 10.1.0, and use this variable to check for recent enough versions of gcc to use -march=znver3 in the zen3 subconfig. - Inlined the contents of config/zen/amd_config.mk into the zen and zen2 make_defs.mk so that the files are self-contained, harmonizing the format of all three Zen-based subconfigurations' make_defs.mk files. - Added indenting (with spaces) of GNU make conditionals for easier reading in zen, zen2, and zen3 make_defs.mk files. - Adjusted the range of models checked by bli_cpuid_is_zen() (which was previously 0x00 ~ 0xff and is now 0x00 ~ 0x2f) so that it is completely disjoint from the models checked by bli_cpuid_is_zen2() (0x30 ~ 0xff). This is normally necessary because Zen and Zen2 microarchitectures share the same family (23, or 0x17), and so the model code is the only way to differentiate the two. But in our case, fixing the model range for zen *wasn't* actually necessary since we checked for zen2 first, and therefore the wide zen range acted like the 'else' of an 'if-else' statement. That said, the change helps improve clarity for the reader by encoding useful knowledge, which was obtained from https://en.wikichip.org/wiki/amd/cpuid . - Added zen2.def and zen3.def files to the collection in travis/cpuid. Note that support for zen, zen2, and zen3 is now present, and while all the three microarchitectures have identical instruction sets from the perspective of BLIS microkernels, they each correspond to different subconfigurations and therefore merit separate testing. Thanks to Devin Matthews for his guidance in hacking these files as slight modifications of zen.def. - Enabled testing of zen2 and zen3 via the SDE in travis/do_sde.sh. Now, zen, zen2, and zen3 are tested through the SDE via Travis CI builds. - Updated travis/do_sde.sh to grab the SDE tarball from a new ci-utils repository on GitHub rather than on Intel's website. This change was made in an attempt to circumvent recent troubles with Travis CI not being able to download the SDE directly from Intel's website via curl. Thanks to Devin Matthews for suggesting the idea. - Updated travis/do_sde.sh to grab the latest version (8.69.1) of the Intel SDE from the flame/ci-utils repository. - Updated .travis.yml to use gcc 9. The file was previously using gcc 8, which did not support -march=znver2. - Created amd64_legacy umbrella family in config_registry for targeting older (bulldozer, piledriver, steamroller, and excavator) microarchitectures and moved those same subconfigs out of the amd64 umbrella family. However, x86_64 retains amd64_legacy as a constituent member. - Fixed a bug in configure related to the building of the so-called config list. When processing the contents of config_registry, configure creates a series of structures and lists that allow for various mappings related to configuration families, subconfigs, and kernel sets. Two of those lists are built via substitution of umbrella families with their subconfig members, and one of those lists was improperly performing the substitution in a way that would erroneously match on partial umbrella family names. That code was changed to match the code that was already doing the substitution properly, via substitute_words(). Also added comments noting the importance of using substitute_words() in both instances. - Comment updates.