67 Commits

Author SHA1 Message Date
Chandrashekara K R
ae698be825 Updated version string from 5.0.1 to 5.1.1 2025-06-13 11:24:50 +05:30
Chandrashekara K R
cfbe202868 Updated version string from 4.2.1 to 5.0.1
Change-Id: I4cbd8d9ae7e35fa235a6707fe7ddbd157eb63b98
(cherry picked from commit 7f1824b8ee)
2024-10-17 03:07:17 -04:00
Chandrashekara K R
d0f890e8d5 Updated version string from 4.1.1 to 4.2.1
Change-Id: I18ff0043a2269269f251078a6ff7c51e70618b6e
2024-03-12 02:07:58 -04:00
Chandrashekara K R
248d09c722 Version String Update
AOCL-BLIS: Updated version string to AOCL-BLIS 4.1.1 Build <YYMMDD>

Change-Id: Iced62a66d0859b3c7d4bcfe6f0e0527922e41cae
2023-08-08 07:27:41 -04:00
Chandrashekara K R
5f1ea3246a Updated blis library version string to 4.0.1
Change-Id: I1f3eb9081bcc05ccec300e00860d905b23d9709a
2022-11-24 10:35:34 +05:30
Chandrashekara K R
fde812015f Updated blis library version from 4.0 to 3.2.1
AMD-Internal: [CPUPL-2322]
Change-Id: I3a6a61543dd2754e2590d7f5f22442c9fdeaee95
2022-07-29 15:55:10 +05:30
Dipal M Zambare
16de63c818 Updated version and copyright notice.
Changed AMD-BLIS version to 3.1.2

AMD-Internal: [CPUPL-2111]
Change-Id: Id8fc3fbc112f08bd5e5def646c472047352e65b5
2022-05-17 18:10:39 +05:30
Dipal M. Zambare
d3b22f590f Updated version number to 3.2
Change-Id: Iea5712d8cb854d4eaffea510e0fe2d9657e4d21f
2022-05-17 18:08:57 +05:30
Dipal M Zambare
31921b9974 Updated windows build system to define BLIS_CONFIG_EPYC flag.
All AMD-specific optimizations in BLIS are enclosed in the BLIS_CONFIG_EPYC
preprocessor macro. It was not defined in the CMake build, which resulted in
lower overall performance.

Updated version number to 3.1.1

Change-Id: I9848b695a599df07da44e77e71a64414b28c75b9
2022-05-17 18:03:09 +05:30
Dipal M Zambare
f698afc567 Updated version number to 3.1.0
AMD-Internal: [CPUPL-1811]
Change-Id: I6b485e7622e526791094ae621d9f84d2526e6569
2021-11-12 08:58:48 +05:30
Dipal M Zambare
849e1cee0a Updated version number to 3.0.1.
Change-Id: I07d5c26bb96b590854e1f81d41ed49a5e320f60e
2021-06-03 15:48:05 +05:30
dzambare
3177db4888 Updated version number.
Change-Id: Iba3659b04f2d85ec7dc008ceb84da73c7c66530a
2020-08-06 15:00:46 +05:30
Meghana Vankadari
6896f927da Fixed bug in SUP code path
Details:
- Since GEMM kernel prefers row-storage, if input C matrix is in col-major order,
  entire operation is transposed. In that case uplo(c) needs to be toggled
  before kernel-variant selection.
- Disabled "bli_gemmsup_ref_var1n2m_opt_cases" inside gemmtsup.
- Updated version number to 2.2.1

Change-Id: I0a85df1141fc4a98d98ea4e0c3d42db8602fa69b
2020-07-15 19:41:24 +05:30
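The uplo toggle described in commit 6896f927da can be sketched roughly as follows. This is a minimal illustration, not the BLIS implementation: the type and function names (`sketch_view_t`, `sketch_induce_trans`, `sketch_uplo_for_kernel`) are hypothetical, and the only fact it relies on is that the lower triangle of C is the upper triangle of its transpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; these are not BLIS definitions. */
typedef enum { SKETCH_LOWER, SKETCH_UPPER } sketch_uplo_t;

typedef struct {
    long m, n;          /* dimensions */
    long rs, cs;        /* row and column strides */
    sketch_uplo_t uplo; /* which triangle of C is stored/updated */
} sketch_view_t;

/* A view is column-stored when consecutive rows are adjacent in memory. */
bool sketch_is_col_stored(const sketch_view_t *c)
{
    assert(c != NULL);
    return c->rs == 1 && c->cs != 1;
}

/* View C as its transpose so that a row-preferring kernel can be used.
   The lower triangle of C is the upper triangle of C^T, so uplo flips. */
void sketch_induce_trans(sketch_view_t *c)
{
    assert(c != NULL);
    long t;
    t = c->m;  c->m  = c->n;  c->n  = t;
    t = c->rs; c->rs = c->cs; c->cs = t;
    c->uplo = (c->uplo == SKETCH_LOWER) ? SKETCH_UPPER : SKETCH_LOWER;
}

/* Transpose first, then pick a kernel variant from the updated uplo. */
sketch_uplo_t sketch_uplo_for_kernel(sketch_view_t *c)
{
    if (sketch_is_col_stored(c)) sketch_induce_trans(c);
    return c->uplo;
}
```

If the toggle is skipped, variant selection sees the pre-transposition triangle, which is the kind of mismatch the commit message describes.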
prangana
711f26129e Update AMD BLIS version to 2.2
Also updated Makefile to fix issue of multiple symbolic links being
created

Change-Id: Ie9a680cedd5c96fcd7f6af1ce0f849a58c3ed4d3
2020-05-31 21:37:32 +05:30
prangana
d21c726003 update version 2.1
Change-Id: I531fe8005f63ad138077320c3f0b03a05a7c7dd2
2019-10-30 15:33:37 +05:30
prangana
9d93a4caa2 update version 2.0 2019-05-24 17:59:13 +05:30
Field G. Van Zee
e0408c3ca3 Version file update (0.5.1) 2018-12-18 14:56:16 -06:00
Field G. Van Zee
be7c57819c Version file update (0.5.0) 2018-10-25 14:09:40 -05:00
Field G. Van Zee
10fd614031 Version file update (0.4.1) 2018-08-30 15:13:59 -05:00
Field G. Van Zee
4ad61ce905 Version file update (0.4.0) 2018-07-27 16:10:43 -05:00
Field G. Van Zee
2fb4408766 Version file update (0.3.2) 2018-04-28 14:07:31 -05:00
Field G. Van Zee
1f28d7c86e Version file update (0.3.1) 2018-04-04 17:13:15 -05:00
Field G. Van Zee
709f8361eb Version file update (0.3.0) 2018-02-23 17:42:48 -06:00
Field G. Van Zee
940a707ac7 Version file update (0.2.2) 2017-05-02 16:38:42 -05:00
Field G. Van Zee
126482a3b6 Implemented the 1m method.
Details:
- Implemented the 1m method for inducing complex domain matrix
  multiplication. 1m support has been added to all level-3 operations,
  including trsm, and is now the default induced method when native
  complex domain gemm microkernels are omitted from the configuration.
- Updated _cntx_init() operations to take a datatype parameter. This was
  needed for the corresponding function for 1m (because 1m requires us
  to choose between column-oriented or row-oriented execution, which
  requires us to query the context for the storage preference of the
  gemm microkernel, which requires knowing the datatype) but I decided
  that it made sense for consistency to add the parameter to all other
  cntx initialization functions as well, even though those functions
  don't use the parameter.
- Updated bli_cntx_set_blkszs() and bli_gks_cntx_set_blkszs() to take
  a second scalar for each blocksize entry. The semantic meaning of the
  two scalars now is that the first will scale the default blocksize
  while the second will scale the maximum blocksize. This allows scaling
  the two independently, and was needed to support 1m, which requires
  scaling for a register blocksize but not the register storage
  blocksize (i.e., "packdim") analogue.
- Deprecated bli_blksz_reduce_dt_to() and defined two new functions,
  bli_blksz_reduce_def_to() and bli_blksz_reduce_max_to(), for reducing
  default and maximum blocksizes to some desired blocksize multiple.
  These functions are needed in the updated definitions of
  bli_cntx_set_blkszs() and bli_gks_cntx_set_blkszs().
- Added support for the 1e and 1r packing schemas to packm, including
  1e/1r packing kernels.
- Added a minor optimization to bli_gemm_ker_var2() that allows, under
  certain circumstances (specifically, real domain beta and row- or
  column-stored matrix C), the real domain macrokernel and microkernel
  to be called directly, rather than using the virtual microkernel
  via the complex domain macrokernel, which carries a slight additional
  amount of overhead.
- Added 1m support to the testsuite.
- Added 1m support to Makefile and runme.sh in test/3m4m. Also simplified
  some code in test_gemm.c driver.
2016-11-25 18:29:49 -06:00
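The two-scalar blocksize scaling described in commit 126482a3b6 might look something like the sketch below. It assumes a simplified blocksize record with one default and one maximum value; `sketch_blksz_t` and `sketch_scale_blksz` are hypothetical names, and the numbers in the usage note are made up for illustration rather than taken from any BLIS configuration.

```c
#include <assert.h>

/* Hypothetical stand-in for a blocksize entry (not the BLIS blksz_t). */
typedef struct {
    long def; /* default blocksize */
    long max; /* maximum blocksize */
} sketch_blksz_t;

/* Scale the default and maximum blocksizes independently.
   def_scalar applies only to b.def, max_scalar only to b.max. */
sketch_blksz_t sketch_scale_blksz(sketch_blksz_t b,
                                  double def_scalar, double max_scalar)
{
    assert(def_scalar > 0.0 && max_scalar > 0.0);
    sketch_blksz_t out;
    out.def = (long)((double)b.def * def_scalar);
    out.max = (long)((double)b.max * max_scalar);
    return out;
}
```

For example, calling `sketch_scale_blksz` on an entry of `{8, 8}` with scalars `0.5` and `1.0` halves the default while leaving the maximum untouched, which is the kind of independent scaling the commit says 1m needs.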
Field G. Van Zee
866b2dde3f Version file update (0.2.1) 2016-10-05 14:41:34 -05:00
Field G. Van Zee
096895c5d5 Reorganized code, APIs related to multithreading.
Details:
- Reorganized code and renamed files defining APIs related to multithreading.
  All code that is not specific to a particular operation is now located in a
  new directory: frame/thread. Code is now organized, roughly, by the
  namespace to which it belongs (see below).
- Consolidated all operation-specific *_thrinfo_t object types into a single
  thrinfo_t object type. Operation-specific level-3 *_thrinfo_t APIs were
  also consolidated, leaving bli_l3_thrinfo_*() and bli_packm_thrinfo_*()
  functions (aside from a few general purpose bli_thrinfo_*() functions).
- Renamed thread_comm_t object type to thrcomm_t.
- Renamed many of the routines and functions (and macros) for multithreading.
  We now have the following API namespaces:
  - bli_thrinfo_*(): functions related to thrinfo_t objects
  - bli_thrcomm_*(): functions related to thrcomm_t objects.
  - bli_thread_*(): general-purpose functions, such as initialization,
    finalization, and computing ranges. (For now, some macros, such as
    bli_thread_[io]broadcast() and bli_thread_[io]barrier() use the
    bli_thread_ namespace prefix, even though bli_thrinfo_ may be more
    appropriate.)
- Renamed thread-related macros so that they use a bli_ prefix.
- Renamed control tree-related macros so that they use a bli_ prefix (to be
  consistent with the thread-related macros that were also renamed).
- Removed #undef BLIS_SIMD_ALIGN_SIZE from dunnington's bli_kernel.h. This
  #undef was a temporary fix to some macro defaults which were being applied
  in the wrong order, which was recently fixed.
2016-06-06 13:32:04 -05:00
Field G. Van Zee
898614a555 Version file update (0.2.0) 2016-04-11 17:32:09 -05:00
Field G. Van Zee
47caa33485 Version file update (0.1.8) 2015-07-29 13:31:09 -05:00
Field G. Van Zee
267253de8a Version file update (0.1.7) 2015-06-19 12:01:49 -05:00
Field G. Van Zee
38ea5022e4 Version file update (0.1.6) 2014-10-23 11:35:45 -05:00
Field G. Van Zee
bde56d0ecf Version file update (0.1.5) 2014-08-04 16:01:58 -05:00
Field G. Van Zee
a7537071b1 Version file update (0.1.4) 2014-07-27 18:20:12 -05:00
Field G. Van Zee
036cc63491 Version file update (0.1.3) 2014-06-23 13:48:17 -05:00
Field G. Van Zee
09d9a3bf67 Reverting version file to test new version script.
Details:
- Changed version file contents to 0.1.2 so that I can test out a new
  version file bumping script.
2014-06-23 13:43:26 -05:00
Field G. Van Zee
ebb3396598 Added 'version' file. 2014-06-23 11:22:50 -05:00
Field G. Van Zee
e9e0747c2f Removed version file from version control.
Details:
- Removed version file from version control to prevent git errors that occur
  when trying to pull new commits.
2013-03-02 12:43:54 -06:00
Field G. Van Zee
bb612f864e Updated behavior of bl2_obj_induce_trans() macro.
Details:
- Changed bl2_obj_induce_trans() so that the transposition bit is no longer
  updated as part of the macro. All current uses of the macro have been
  coupled with instances of bl2_obj_set_trans() to clear the bit.
- Added Jed to CREDITS file.
2013-03-01 12:55:42 -06:00
Field G. Van Zee
f24e29b789 Replaced banded/packed BLAS2 stubs with f2c code.
Details:
- Retired the blas2blis wrappers that simply called abort with a "not yet
  implemented" message. This includes all of the level-2 banded and packed
  routines.
- Replaced them with the corresponding netlib implementations, which were
  run through f2c (with some customization).
- Added directories named 'attic' to build/gen-make-frags/ignore_list.
2013-02-22 18:15:41 -06:00
Field G. Van Zee
1454c1a142 Moved Fortran name-mangling macro to bl2_config.h.
Details:
- Moved the Fortran-77 name-mangling macros from bl2_blas_macro_defs.h to the
  configuration directory (bl2_config.h, specifically), given that they can be
  expected to be tweaked by some developers.
2013-02-22 12:38:45 -06:00
Field G. Van Zee
ede75693e5 Implemented blas2blis compatibility layer.
Details:
- Added the blas2blis compatibility layer, located in frame/compat. This
  includes virtually all of the BLAS, including banded and packed level-2
  operations.

- Defined bl2_init_safe(), bl2_finalize_safe(). The former allows a conditional
  initialization, which stores the "exit status" in an err_t, which is then
  read by the latter function to determine whether finalization should actually
  take place.
- Added calls to bl2_init_safe(), bl2_finalize_safe() to all level-2 and
  level-3 BLAS-like wrappers.
- Added configuration option to instruct BLIS to remain initialized whenever
  it automatically initializes itself (via bl2_init_safe()), until/unless the
  application code explicitly calls bl2_finalize().

- Added INSERT_GENTFUNC* and INSERT_GENTPROT* macros to facilitate type
  templatization of blas2blis wrappers.
- Defined level-0 scalar macro bl2_??swaps().
- Defined level-1v operation bl2_swapv().
- Defined some "Fortran" types to bl2_type_defs.h for use with BLAS
  wrappers.
2013-02-22 12:11:24 -06:00
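The conditional initialization pattern from commit ede75693e5 can be sketched as below. This is an assumption-laden illustration of the idea (a status value recorded at init time decides whether the matching finalize does anything); the `sketch_*` names and the two-valued status enum are hypothetical and do not reproduce the actual bl2_init_safe() or err_t definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes standing in for an err_t value. */
typedef enum { SKETCH_SUCCESS, SKETCH_ALREADY_INITIALIZED } sketch_err_t;

static bool sketch_initialized = false;

bool sketch_is_initialized(void) { return sketch_initialized; }
void sketch_init(void)           { sketch_initialized = true; }
void sketch_finalize(void)       { sketch_initialized = false; }

/* Initialize only if needed, and record whether this call did the work. */
void sketch_init_safe(sketch_err_t *status)
{
    assert(status != NULL);
    if (sketch_initialized) {
        *status = SKETCH_ALREADY_INITIALIZED;
        return;
    }
    sketch_init();
    *status = SKETCH_SUCCESS;
}

/* Finalize only if the paired init_safe call actually initialized. */
void sketch_finalize_safe(sketch_err_t status)
{
    if (status == SKETCH_SUCCESS) sketch_finalize();
}
```

With this shape, a wrapper that is called while the library is already initialized leaves it initialized on exit, and only the outermost init/finalize pair has any effect.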
Field G. Van Zee
995edf43e2 Updated version file. (Forgot to in prev commit). 2013-02-21 14:30:50 -06:00
Field G. Van Zee
5ece050a66 Updated version file. (Forgot to in prev commit). 2013-02-20 15:50:54 -06:00
Field G. Van Zee
da0c22f241 Minor changes to lower levels of scalm and setm.
Details:
- Removed diagx parameter from lower-level interfaces of scalm.
- Modified scalm_basic_check() to expect an object with a nonunit diagonal.
- Changed setm_unb_var1() so that having an implicit unit diagonal results
  in only the strictly lower or upper triangle of the matrix being modified.
2013-02-15 09:59:48 -06:00
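The setm behavior described in commit da0c22f241 (an implicit unit diagonal restricts the update to the strictly lower or upper triangle) can be illustrated with a small sketch. It assumes a square, column-major matrix of doubles; `sketch_setm_tri` and its parameters are hypothetical and are not the setm_unb_var1() interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Set the lower or upper triangle of a column-major n-by-n matrix to alpha.
   When unit_diag is true the diagonal is treated as implicit and is left
   untouched, so only the strictly lower or upper triangle is modified. */
void sketch_setm_tri(size_t n, double *a, size_t lda,
                     bool lower, bool unit_diag, double alpha)
{
    assert(n == 0 || (a != NULL && lda >= n));
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < n; ++i) {
            bool in_strict = lower ? (i > j) : (i < j);
            bool on_diag   = (i == j);
            if (in_strict || (on_diag && !unit_diag))
                a[i + j * lda] = alpha;
        }
    }
}
```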
Field G. Van Zee
2c836adadc Updated beta == zero semantics of mulsc.
Details:
- Updated beta == zero semantics of mulsc. Hopefully this is the last
  operation that needed updating.
- Added Devin to CREDITS file.
2013-02-14 10:42:56 -06:00
Field G. Van Zee
722b66c7dc Removed some calls to setv() in test modules.
Details:
- Removed calls to setv() in test modules whose sole purpose was to
  initialize vectors to zero to ensure that nan's and inf's would not
  taint the computation. Now that beta == zero semantics have been
  updated to clear the output operand (when beta is zero), rather than
  multiply against it, these setv() calls are no longer needed.
2013-02-14 10:18:00 -06:00
Field G. Van Zee
e6ac623a90 Properly implemented beta == 0 semantics.
Details:
- Changed name of set0 and set0_mxn macros to set0s and set0s_mxn,
  respectively.
- Added code to the following operations that sets the output operand to
  zero if the corresponding scalar is zero (rather than performing the
  floating-point multiply, or in the case of setv, copying the value).
  This will prevent nan's and inf's from creeping into results from
  uninitialized memory.
  - axpy
  - dotxv
  - scalv
  - scal2v
  - setv
  - gemv
  - ger
  - hemv
  - her
  - her2
  - gemm reference ukernels
2013-02-13 18:44:59 -06:00
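The beta == 0 semantics described in commit e6ac623a90 can be shown with a small sketch of a vector scale. The function `sketch_scalv` is hypothetical and is not the BLIS scalv interface; the point it illustrates is that under IEEE 754 arithmetic 0 * NaN and 0 * Inf both yield NaN, so the output has to be overwritten rather than multiplied when the scalar is zero.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: y := beta * y over n elements. */
void sketch_scalv(size_t n, double beta, double *y)
{
    assert(n == 0 || y != NULL);
    if (beta == 0.0) {
        /* Overwrite instead of multiplying, so NaN or Inf left in
           uninitialized memory cannot leak into the result. */
        for (size_t i = 0; i < n; ++i) y[i] = 0.0;
        return;
    }
    for (size_t i = 0; i < n; ++i) y[i] *= beta;
}
```

The same special case applies to any operation of the form y := beta * y + (something): when beta is zero, the old contents of y are ignored entirely.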
Field G. Van Zee
c23135669f Un-deprecated packm_unb_var1.c (needed by l2 ops).
Details:
- Added bl2_packm_unb_var1() back into the mix once I realized that level-2
  operations still need this routine for packing matrices. Now, whether
  level-2 operations should be packing matrices to begin with is another
  matter. But this fixes the segmentation fault one would have gotten when
  running bl2_gemv() on a general stride matrix.
2013-02-13 13:21:00 -06:00
Field G. Van Zee
cf49e35f98 Removed cntl tree usage from packm implementation.
Details:
- Added new fields to obj_t info field:
  - invert_diag
  - pack_order_if_upper
  - pack_order_if_lower
  These fields allow packm_init() to embed information that begins
  in the control tree into the object so that the packm implementation
  does not need to use control trees at all. This is being done to aid
  Bryan's DxT code generation.
- Added macros that operate on above fields.
- Changed packm_init(), packm_blk_var2(), and packm_blk_var3() according
  to above changes.
- Made similar (but much simpler) changes to packv.
- Deprecated packm_blk_var1(), packm_unb_var1(), and packm_densify().
  These were part of prototype implementations and are no longer needed.
2013-02-12 18:39:35 -06:00
Field G. Van Zee
474bac30c9 Removed level-0 macros projrs, grabis.
Details:
- Replaced instances of projrs and grabis macros with newer,
  more general-purpose getris.
2013-02-12 12:23:48 -06:00