Commit Graph

35 Commits

Author SHA1 Message Date
Field G. Van Zee
f5923cd9ff CHANGELOG update (0.7.0) 2020-04-07 14:41:45 -05:00
Field G. Van Zee
e67deb22aa CHANGELOG update (0.6.1) 2020-01-14 16:01:34 -06:00
Field G. Van Zee
5e1e696003 CHANGELOG update (0.6.0) 2019-06-03 18:37:20 -05:00
Field G. Van Zee
05c4e42642 CHANGELOG update (0.5.2) 2019-03-19 17:07:20 -05:00
Field G. Van Zee
0476f706b9 CHANGELOG update (0.5.1) 2018-12-18 14:56:20 -06:00
Field G. Van Zee
e90e7f309b CHANGELOG update (0.5.0) 2018-10-25 14:09:43 -05:00
Field G. Van Zee
ab9f9e684d CHANGELOG update (0.4.1) 2018-08-30 15:14:02 -05:00
Field G. Van Zee
55a04edf52 CHANGELOG update (0.4.0) 2018-07-27 16:10:46 -05:00
Field G. Van Zee
01c4173238 CHANGELOG update (0.3.2) 2018-04-28 14:07:34 -05:00
Field G. Van Zee
c9e4d7db74 CHANGELOG update (0.3.1) 2018-04-04 17:13:15 -05:00
Field G. Van Zee
d9079655c9 CHANGELOG update (0.3.0) 2018-02-23 17:42:48 -06:00
Field G. Van Zee
a4f1d0b880 CHANGELOG update (0.2.2) 2017-05-02 16:38:43 -05:00
Field G. Van Zee
4fb9b4ef2e CHANGELOG update (0.2.1) 2016-10-05 14:41:35 -05:00
Field G. Van Zee
7912af5db4 CHANGELOG update (0.2.0) 2016-04-11 17:32:13 -05:00
Field G. Van Zee
ecc3ebb749 CHANGELOG update (0.1.8) 2015-07-29 13:31:12 -05:00
Field G. Van Zee
0b7255a642 CHANGELOG update (0.1.7) 2015-06-19 12:01:50 -05:00
Field G. Van Zee
a8e12884ee CHANGELOG update (0.1.6) 2014-10-23 11:35:48 -05:00
Field G. Van Zee
9d61afeae2 CHANGELOG update (0.1.5) 2014-08-04 16:01:59 -05:00
Field G. Van Zee
af7a8e6c04 CHANGELOG update (0.1.4) 2014-07-27 18:20:13 -05:00
Field G. Van Zee
9ef1f1e21d CHANGELOG update (0.1.3) 2014-06-23 13:48:17 -05:00
Field G. Van Zee
19c05dfaac CHANGELOG update (for 0.1.2). 2014-06-05 10:54:16 -05:00
Field G. Van Zee
f18aee83a5 CHANGELOG update (for 0.1.1). 2014-02-25 17:58:42 -06:00
Field G. Van Zee
1a4d698f42 CHANGELOG update (for 0.1.0). 2013-11-11 10:15:40 -06:00
Field G. Van Zee
b33e2f4443 CHANGELOG update (for 0.0.9). 2013-07-19 17:15:03 -05:00
Field G. Van Zee
0efb7974f1 CHANGELOG update. 2013-06-12 16:40:04 -05:00
Field G. Van Zee
75405a2b83 CHANGELOG update. 2013-05-01 15:00:30 -05:00
Field G. Van Zee
3414a23c38 CHANGELOG update. 2013-04-13 16:53:16 -05:00
Field G. Van Zee
40a0654ada CHANGELOG update. 2013-03-24 20:18:12 -05:00
Field G. Van Zee
36c782857b CHANGELOG update. 2013-03-18 10:37:03 -05:00
Field G. Van Zee
3b620cc8e9 CHANGELOG update. 2013-02-11 13:38:07 -06:00
Field G. Van Zee
768fcebaa8 Added unified test suite, and many fixes.
Details:
- Added a highly configurable, unified test suite.

- Removed DUPB configuration constant from bl2_kernel.h and macro-kernel
  header files. Now, instead, DUPB is computed as (NDUP != 1) within each
  macro-kernel. This fixes a bug in trmm/trsm whereby bp was indexed into
  incorrectly when DUPB was set to FALSE but the NDUP was still non-unit.
  By encoding both pieces of information into one constant in _kernel.h,
  it seems somewhat less likely others will encounter this bug in the
  future.
- Added level-2 cache blocksizes to _kernel.h for reference configuration,
  and defined blocksizes in _cntl.c files to these default values.
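
As a rough illustration of the DUPB change above, the sketch below derives the flag from the duplication factor instead of configuring it separately, so the two values cannot disagree. The names NDUP_SKETCH, DUPB_SKETCH, and bp_offset_sketch are hypothetical, and the packed layout (each element of B stored ndup times in a row) is an assumption, not the library's confirmed layout.

```c
#include <stddef.h>

/* Hypothetical per-configuration constant: how many copies of each element
   of B are stored when B is packed. */
#define NDUP_SKETCH 2

/* Derived flag, mirroring the (NDUP != 1) rule described above. */
#define DUPB_SKETCH (NDUP_SKETCH != 1)

/* Offset of logical element (k, j) in a packed panel of B that is nr
   elements wide, assuming each element occupies ndup consecutive slots. */
static size_t bp_offset_sketch(size_t k, size_t j, size_t nr, size_t ndup)
{
    return (k * nr + j) * ndup;
}
```

Because the flag is computed from the factor, indexing code only ever needs the factor itself, which is the property the fix relies on.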

- Changed semantics of her2k and syr2k such that these operations no longer
  expect the B matrix to already be conjugate-transposed (or just transposed
  for syr2k). However, these semantics are preserved for the internal
  mechanics of the implementations, including the internal back-end and all
  blocked variants.
- Inserted checks for real-valued alpha and beta for herk/her2k and herk,
  respectively.

- Relaxed general object structure constraints in _basic_check() for gemv, ger.
- Changed her front-end to NOT copy-cast to real projection; instead, this is
  replaced by selecting either the real part or both parts within the unblocked
  algorithm implementation, depending on the value of conjh.
- Added conjh to all _check routines for her so that the code knows when to
  verify that alpha has an imaginary component equal to zero (for her, but
  not syr).
- Changed control tree for her to forgo packing.

- Added unit diagonal support to fnormm.
- Redefined real versions of abval2s macros in terms of fabs(), fabsf().
- Redefined complex versions of sqrt2s macros using the actual "complex square
  root" formula.
- Created new level-0 object-based routines, suffixed with "sc" (for "scalar").
- Defined new level-1v, -1d, and -1m versions of add and sub operations
  (two-operand add and subtract).
- Added new scalar macros:
  - getris: acquire real and imaginary components.
  - setris: set real and imaginary components.
  - addjs: addition with conjugated x.
  - subjs: subtraction with conjugated x.
- Defined new utility operations:
  - absumv: element-wise sum of absolute values for vector elements.
  - absumm: element-wise sum of absolute values for matrix elements.
  - mkherm: convert existing matrix to Hermitian.
  - mksymm: convert existing matrix to symmetric.
  - mktrim: convert existing matrix to triangular.
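
The new scalar macros listed above could look roughly like the following for a double-precision complex type. This is a minimal sketch: the dcomplex_sketch type, the _sketch suffixes, and the argument order are assumptions made for illustration, not the library's actual definitions.

```c
/* Hypothetical complex type with separate real and imaginary fields. */
typedef struct { double real; double imag; } dcomplex_sketch;

/* getris: read the real and imaginary components of x into xr and xi. */
#define getris_sketch(x, xr, xi) \
    do { (xr) = (x).real; (xi) = (x).imag; } while (0)

/* setris: set the real and imaginary components of x from xr and xi. */
#define setris_sketch(xr, xi, x) \
    do { (x).real = (xr); (x).imag = (xi); } while (0)

/* addjs: y := y + conj(x) */
#define addjs_sketch(x, y) \
    do { (y).real += (x).real; (y).imag -= (x).imag; } while (0)

/* subjs: y := y - conj(x) */
#define subjs_sketch(x, y) \
    do { (y).real -= (x).real; (y).imag += (x).imag; } while (0)
```

Conjugation only flips the sign of the imaginary part, so the conjugated variants differ from plain add and subtract in a single sign.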

- Added various error checking routines.
- Added bl2_clock_min_diff(), which is used to more cleanly measure the
  wall clock time of a code block.
- Added general stride support to bl2_obj_alloc_buffer().
- Added bl2_obj_init_scalar().
- Updated parameter mapping in bl2_param_map.c.
- Added support for queryable version string.
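
The idea behind a "minimum difference" timer such as bl2_clock_min_diff() can be sketched as below: across repeated runs of a code block, keep the smallest elapsed time seen so far. The function name clock_min_diff_sketch and the explicit time_now parameter are assumptions for illustration; the real routine presumably reads the clock itself.

```c
/* Return the smaller of the best time so far (time_min) and the time
   elapsed between time_start and time_now. All values are in seconds. */
static double clock_min_diff_sketch(double time_min,
                                    double time_start,
                                    double time_now)
{
    double time_diff = time_now - time_start;

    return (time_diff < time_min) ? time_diff : time_min;
}
```

A caller would initialize the running minimum to a large value, then update it after each timed repetition; taking the minimum filters out runs disturbed by other system activity.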

- Fixed a bug in the her2k macro-kernels (which currently are simply
  implemented in terms of two invocations of herk) whereby beta was being
  applied to both the first and second rank-k updates, rather than only
  the first.
- Fixed a bug in trmm/trsm whereby transpose and right side cases were not
  properly implemented due to erroneous assumptions regarding aliasing and
  root objects.
- Fixed a bug in the upper triangular trsm macro-kernel in which the wrong
  MR x NR block of B was being updated.
- Fixed a bug in the inverts macro in the double real case whereby the
  value was typecast to float before inversion. This affected non-unit cases
  of dtrsm.
- Fixed a bug in the reference kernels for gemmtrsm whereby the minus one
  constant was being applied incorrectly.
- Fixed a bug in the overall treatment of non-unit alpha for trsm. The code
  now mimics the rank-k strategy of gemm, whereby alpha is applied during
  the first iteration of variant 3, with BLIS_ONE passed in instead for
  subsequent iterations. This also required passing alpha into the macro-
  kernels as well as the fused gemmtrsm micro-kernels.
- Fixed a bug in trsm_u_blk_var1 whereby the gemm macro-kernel was being
  called for blocks strictly above the diagonal. While this sounds good in
  theory, this cannot be done because gemm_ker_var2 expects row panels of
  A to be packed from top to bottom, while for trsm_u, A is actually packed
  from bottom to top due to the reverse (BR->TL) nature of the algorithm.
- Fixed a bug in packm_cxk() whereby panel packings with unit panel
  dimensions were mishandled due to incorrect arguments to the copyv kernel.
  Also changed the copyv kernel invocation to scal2v so that these edge
  cases are properly handled when scaling is requested.
- Fixed a bug in packv_int() whereby an uninitialized object was passed in
  instead of the source object.
- Fixed a bug whereby level-2 code could allocate memory dynamically via
  bl2_malloc() and then attempt to free it via bl2_mm_release(). Also fixed
  a potential future bug whereby a mem_t object that is actually no longer
  "allocated" from the static pool is mistaken for being allocated due to
  failure to NULLify the buffer when the block was most recently released.
- Fixed a bug in bl2_acquire_mpart_*() whereby the uplo field was mistakenly
  toggled when the requested subpartition needed to be "reflected" due to it
  residing in an unstored region.
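
The her2k beta fix above can be illustrated with real scalars standing in for matrices. When a rank-2k update is built from two rank-k updates, beta must scale C only in the first one; the second must use one. The function names below are hypothetical, and the scalar form is a simplification of the matrix operation.

```c
/* One rank-k style update on scalars: c := beta * c + alpha * a * b. */
static double rank_k_update_sketch(double beta, double c,
                                   double alpha, double a, double b)
{
    return beta * c + alpha * a * b;
}

/* Rank-2k style update built from two rank-k updates:
   c := beta * c + alpha * a * b + alpha * b * a. */
static double rank_2k_update_sketch(double beta, double c,
                                    double alpha, double a, double b)
{
    /* First update applies beta to the existing value of c. */
    c = rank_k_update_sketch(beta, c, alpha, a, b);

    /* Second update must use 1.0, or beta would be applied twice. */
    c = rank_k_update_sketch(1.0, c, alpha, b, a);

    return c;
}
```

The trsm alpha fix described above follows the same pattern: the scalar is applied on the first iteration and one is passed for the rest.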
2013-02-11 13:20:44 -06:00
Field G. Van Zee
e2e7cb2fbe Expanded reference packm/unpackm kernel set to 16.
Details:
- Added 10xk, 12xk, 14xk, and 16xk reference kernels for packm and
  unpackm.
- Updated bl2_[un]packm_cxk() to silently use scal2m if "out of range"
  kernel size is requested. (Thanks to Tyler for finding this bug.)
- Updated bl2_kernel.h to contain new _KERNEL definitions, according
  to above changes, for 'reference' and 'clarksville' configurations.
- Updated CHANGELOG.
- Removed "output*.m" from .gitignore.
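
The fallback described above for bl2_[un]packm_cxk() can be sketched as a table of fixed-size packing kernels with a general loop behind it. Everything here is illustrative: the names, the signatures, the single 2xk kernel, and the packed layout are assumptions, and the generic loop only stands in for the role scal2m plays in the library.

```c
#include <stddef.h>

typedef void (*packm_ker_sketch_t)(size_t k, double kappa, const double *a,
                                   size_t inca, size_t lda, double *p);

/* Hypothetical fixed-size kernel: pack a 2 x k panel, scaling by kappa. */
static void packm_2xk_sketch(size_t k, double kappa, const double *a,
                             size_t inca, size_t lda, double *p)
{
    for (size_t j = 0; j < k; ++j) {
        p[2 * j + 0] = kappa * a[0 * inca + j * lda];
        p[2 * j + 1] = kappa * a[1 * inca + j * lda];
    }
}

/* General fallback: pack an m x k panel of any height, scaling by kappa. */
static void packm_generic_sketch(size_t m, size_t k, double kappa,
                                 const double *a, size_t inca, size_t lda,
                                 double *p)
{
    for (size_t j = 0; j < k; ++j)
        for (size_t i = 0; i < m; ++i)
            p[j * m + i] = kappa * a[i * inca + j * lda];
}

#define MAX_KER_DIM_SKETCH 2

/* Kernel table indexed by panel dimension; NULL means no kernel exists. */
static const packm_ker_sketch_t packm_kers_sketch[MAX_KER_DIM_SKETCH + 1] = {
    NULL, NULL, packm_2xk_sketch
};

/* Use a fixed-size kernel when one exists; otherwise fall back silently. */
static void packm_cxk_sketch(size_t m, size_t k, double kappa,
                             const double *a, size_t inca, size_t lda,
                             double *p)
{
    packm_ker_sketch_t ker =
        (m <= MAX_KER_DIM_SKETCH) ? packm_kers_sketch[m] : NULL;

    if (ker != NULL)
        ker(k, kappa, a, inca, lda, p);
    else
        packm_generic_sketch(m, k, kappa, a, inca, lda, p);
}
```

Checking the requested size against the table bound before indexing is what keeps an out-of-range request from reading past the end of the table.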
2012-12-13 18:17:54 -06:00
Field G. Van Zee
17455a8bce Minor updates towards 0.0.1. 2012-12-10 17:23:32 -06:00
Field G. Van Zee
714c527b0e Added 'changelog' make target; other tweaks.
Details:
- Updated CHANGELOG.
- Added 'changelog' target to Makefile that runs 'git log --decorate' and
  overwrites CHANGELOG with the output.
- Other trivial changes.
2012-12-07 19:54:04 -06:00
Field G. Van Zee
00f3498a89 Initial commit. 2012-12-03 12:36:11 -06:00