20 Commits

Author SHA1 Message Date
Edward Smyth
82bdf7c8c7 Code cleanup: Copyright notices
- Standardize formatting (spacing etc).
- Add full copyright to cmake files (excluding .json)
- Correct copyright and disclaimer text for frame and
  zen, skx and a couple of other kernels to cover all
  contributors, as is commonly used in other files.
- Fixed some typos and missing lines in copyright
  statements.

AMD-Internal: [CPUPL-4415]
Change-Id: Ib248bb6033c4d0b408773cf0e2a2cda6c2a74371
2024-08-05 15:35:08 -04:00
Edward Smyth
ed5010d65b Code cleanup: AMD copyright notice
Standardize format of AMD copyright notice.

AMD-Internal: [CPUPL-3519]
Change-Id: I98530e58138765e5cd5bc0c97500506801eb0bf0
2023-11-23 08:54:31 -05:00
Eleni Vlachopoulou
75a4d2f72f CMake: Adding new portable CMake system.
- A completely new system, made to be closer to the Make system.

AMD-Internal: [CPUPL-2748]
Change-Id: I83232786406cdc4f0a0950fb6ac8f551e5968529
2023-11-09 15:49:45 +05:30
Edward Smyth
c445f192d5 BLIS: Missing clobbers (batch 6)
More missing clobbers in skx and zen4 kernels, missed in
previous commits.

AMD-Internal: [CPUPL-3521]
Change-Id: I838240f0539af4bf977a10d20302a40c34710858
2023-08-07 10:52:23 -04:00
Eleni Vlachopoulou
9c613c4c03 Windows CMake bugfix in object libraries for shared library option
Define BLIS_IS_BUILDING_LIBRARY if BUILD_SHARED_LIBS=ON for the object libraries created in the kernels/ directory.
The macro definition was not propagated from the high-level CMake, so we need to define it explicitly for the object libraries.

AMD-Internal: [CPUPL-3241]
Change-Id: Ifc5243861eb94670e7581367ef4bc7467c664d52
2023-05-24 17:30:16 +05:30
Eleni Vlachopoulou
1a7f60ff5b Update CMake system to use object libraries for haswell, skx and zen4.
- AVX2 and AVX512 flags are set up locally for each object library that requires them.
- Default ENABLE_SIMD_FLAGS value is set to none and for AVX2 option the corresponding compiler flag is set globally.
- To be able to build zen4 codepath when ENABLE_SIMD_FLAGS=AVX2, the compiler option is removed by removing the definition before building the corresponding object library.

AMD-Internal: [CPUPL-3241]
Change-Id: Ia570e60f06c4c72b7c58f4c9ca73bac4c060ae73
2023-05-12 10:04:16 -04:00
Kiran Varaganti
eff436c653 Bug Fix to replace vzeroall
Fixed syntax in AVX512 dgemm native kernel.
zen4 configuration follows Intel ASM syntax whereas other AMD configs
follow AT&T ASM syntax. Bug was introduced due to following AT&T syntax
in AVX512 dgemm kernel. In this commit we changed the syntax to Intel ASM
format. src and dst operands are interchanged.

Change-Id: Ie61dc7c5e8309b79437d471331318f3104bcd447
2022-07-22 03:42:17 -04:00
Kiran Varaganti
86134c7278 Replaced vzeroall
Replaced the vzeroall instruction with vxorpd and vmovapd in the dgemm
kernels, both AVX2 and AVX512. vzeroall is an expensive instruction, so it
is replaced with a faster way of zeroing all registers. The vzeroupper()
instruction is also added at the end of the AVX2 kernels to avoid any
AVX2/SSE transition penalties. Note that only the main kernels are modified.

Change-Id: Ieb9bc629db01f0f94dd0e8e55550940d3d7eb2a4
2022-07-20 01:16:59 -04:00
Dipal M Zambare
c87b9aab75 Added support for AVX512 for Windows and AMAXV
- Completed zen4 configuration support on Windows
- Enabled AVX512 kernels for AMAXV
- Added zen4 configuration in amdzen for Windows
- Moved all zen4 kernels inside kernels/zen4 folder

AMD-Internal: [CPUPL-2108]
Change-Id: I9d2336998bbcdb8e2c4ca474977b5939bfa578ba
2022-06-08 11:09:48 +05:30
Field G. Van Zee
0645f239fb Remove UT-Austin from copyright headers' clause 3.
Details:
- Removed explicit reference to The University of Texas at Austin in the
  third clause of the license comment blocks of all relevant files and
  replaced it with a more all-encompassing "copyright holder(s)".
- Removed duplicate words ("derived") from a few kernels' license
  comment blocks.
- Homogenized license comment block in kernels/zen/3/bli_gemm_small.c
  with format of all other comment blocks.
2018-12-04 14:31:06 -06:00
Field G. Van Zee
e249a00a82 Imported skx dgemm ukernel from skx-redux branch.
Details:
- Added the new bli_dgemm_skx_asm_16x14.c microkernel from the skx-redux
  branch, along with appropriate blocksizes in bli_cntx_init_skx.c and
  a prototype in bli_kernels_skx.h. (Devin has not yet written the
  sgemm analogue, so for now we will continue using the older sgemm
  ukernel.)
- Updated frame/include/bli_x86_asm_macros.h with a minor change that
  was present within the skx-redux branch.
2018-09-10 16:48:35 -05:00
Field G. Van Zee
4fa4cb0734 Trivial comment header updates.
Details:
- Removed four trailing spaces after "BLIS" that occur in most files'
  commented-out license headers.
- Added UT copyright lines to some files. (These files previously had
  only AMD copyright lines but were contributed to by both UT and AMD.)
- In some files' copyright lines, expanded 'The University of Texas' to
  'The University of Texas at Austin'.
- Fixed various typos/misspellings in some license headers.
2018-08-29 18:06:41 -05:00
Devin Matthews
a7166feb10 Finish macroization of assembly ukernels. 2018-06-25 12:09:18 -05:00
Devin Matthews
b4d94e54d4 Convert x86 microkernels to assembly macros. 2018-06-20 14:07:24 -05:00
Field G. Van Zee
5140ee3424 Updated types of bli_is_[un]aligned_to() functions.
Details:
- Changed the void* arguments of the following static functions:
    bli_is_aligned_to()
    bli_is_unaligned_to()
    bli_offset_past_alignment()
  to siz_t, and the return type of bli_offset_past_alignment() from
  guint_t to siz_t. This allows for more versatile usage of these
  functions (e.g. when aligning both pointers and leading dimension).
- Updated all invocations of these functions, mostly in kernels/penryn
  but also in kernels/bgq, to include explicit typecasts to siz_t when
  pointer arguments are passed in.
- Thanks to Devin Matthews for pointing out this potential bug (via issue
  #211).
- Deleted a few trailing spaces in various penryn kernels.
- Removed duplicate instances of the words "derived" and "THEORY" from
  various kernel license headers, likely from a malformed recursive sed
  performed long ago.
2018-05-23 16:56:14 -05:00
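The signature change in 5140ee3424 can be illustrated with a minimal sketch. The names below (sketch_siz_t, sketch_is_aligned_to, and so on) are hypothetical stand-ins, not the BLIS definitions, and the modulo-based bodies are an assumption about how such helpers are typically written. The point is only that integer-typed arguments let the same helper check both a pointer (after an explicit cast) and a leading dimension expressed in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a stand-in for BLIS's siz_t; the real typedef lives in the BLIS headers. */
typedef size_t sketch_siz_t;

/* True if the integer value p is a multiple of size (size must be nonzero). */
static inline bool sketch_is_aligned_to(sketch_siz_t p, sketch_siz_t size)
{
    assert(size != 0);
    return p % size == 0;
}

/* True if the integer value p is not a multiple of size. */
static inline bool sketch_is_unaligned_to(sketch_siz_t p, sketch_siz_t size)
{
    assert(size != 0);
    return p % size != 0;
}

/* Number of bytes by which p lies past the previous multiple of size. */
static inline sketch_siz_t sketch_offset_past_alignment(sketch_siz_t p, sketch_siz_t size)
{
    assert(size != 0);
    return p % size;
}

/* Hypothetical caller: checks a matrix pointer and its leading dimension
   (given in elements) against a 32-byte boundary with the same helper. */
static inline bool sketch_both_aligned(const double *a, sketch_siz_t ldim_elems)
{
    /* Pointers need an explicit cast, as the commit message describes. */
    bool ptr_ok = sketch_is_aligned_to((sketch_siz_t)(uintptr_t)a, 32);
    /* A leading dimension is already an integer; convert elements to bytes. */
    bool ld_ok = sketch_is_aligned_to(ldim_elems * sizeof(double), 32);
    return ptr_ok && ld_ok;
}
```

With void* parameters, the leading-dimension check in sketch_both_aligned would have required casting an integer to a pointer; taking an integer type avoids that and leaves the one pointer cast visible at the call site.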
Field G. Van Zee
60366a3fab Updates to knl kernels and related code.
Details:
- Imported the 24x16 knl sgemm microkernel (and its corresponding spackm
  kernel) from TBLIS and enabled its use in the knl sub-config. Also
  added sgemm microkernel prototype to bli_kernels_knl.h.
- Updated dgemm and dpackm microkernels from TBLIS, which included an
  important change regarding the offsets array (changed from extern
  declaration to static declaration/definition).
- Activated use of level-1v and -1f zen kernels in skx and knl
  sub-configs.
- Removed some old macros no longer needed in bli_family_skx.h now that
  libmemkind support exists in configure.
- Moved bli_avx512_macros.h to frame/include and adjusted #includes in
  skx and knl kernels accordingly.
- Moved unused kernels in kernels/knl/3 to kernels/knl/3/other
  directory.
- Fixed a minor bug in the 'make' output per compile when verboseness
  is not turned on. The rule-generating function 'make-kernel-rule' was
  previously passing in the name of the config, rather than the name of
  the kernel set returned by get-config-for-kset, which could give
  misleading information to the user when the kconfig_map mapped a
  kernel set to a sub-configuration that did not share the same name.
  (This didn't affect the CFLAGS that were actually used.)
- Updated test/3m4m/Makefile, removing acml targets and renaming the
  remaining targets.
2018-04-16 18:46:21 -05:00
Field G. Van Zee
78a24e7dad Updated bli_avx512_macros.h in knl and skx configs.
Details:
- Downloaded updated version of bli_avx512_macros.h from TBLIS [1] in
  attempt to address issue #192.
  [1] https://github.com/devinamatthews/tblis/
2018-04-09 17:02:13 -05:00
dnp
ca982148b3 Fixed bug in SKX sgemm microkernel. Modified SKX dgemm microkernel to be consistent with the sgemm microkernel 2018-04-08 21:27:10 -05:00
dnp
ae9a5be56d Fixed bug in skx sgemm microkernel 2018-03-27 17:01:23 -05:00
dnp
4423e33dc5 Adding SKX kernels and configuration. 2017-12-06 16:35:03 -06:00