Mirror of https://github.com/amd/blis.git (synced 2026-05-11 09:39:59 +00:00)
2d9c667f3c48a12cab64e5ad09d5fcb9f4c19d78
Details:
- Fixed bugs in trmv_l and trsv_u due to backwards iteration resulting
  in unaligned subpartitions. We were already going out of our way a bit
  to handle edge cases in the first iteration for blocked variants, and
  this was simply the unblocked-fused extension of that idea.
- Fixed control tree handling in her/her2/syr/syr2 that was not taking
  into account how the choice of variant needed to be altered for
  upper-stored matrices (given that only lower-stored algorithms are
  explicitly implemented).
- Added bli_determine_blocksize_dim_f(), bli_determine_blocksize_dim_b()
  macros to provide inlined versions of bli_determine_blocksize_[fb]()
  for use by unblocked-fused variants.
- Integrated new blocksize_dim macros into gemv/hemv unf variants for
  consistency with that of the bugfix for trmv/trsv (both of which now
  use the same macros).
- Modified bli_obj_vector_inc() so that 1 is returned if the object is a
  vector of length 1 (ie: 1 x 1). This fixes a bug whereby under certain
  conditions (e.g. dotv_opt_var1), an invalid increment was returned,
  which was invalid only because the code was expecting 1 (for purposes
  of performing contiguous vector loads) but got a value greater than 1
  because the column stride of the object (e.g. rho) was inflated for
  alignment purposes (albeit unnecessarily since there is only one
  element in the object).
- Replaced some old invocations of set0 with set0s.
- Added alpha parameter to gemmtrsm ukernels for x86_64 and use
  accordingly.
- Fixed increment bug in cleanup loop of gemm ukernel for x86_64.
- Added safeguard to test modules so that testing a problem with a zero
  dimension does not result in a failure.
- Tweaked handling of zero dimensions in level-2 and level-3 operations'
  internal back-ends to correctly handle cases where output operand
  still needs to be scaled (e.g. by beta, in the case of gemm with
  k = 0).
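The backward-iteration fix described above can be illustrated with a small sketch. It is not the BLIS source: the function names below (sketch_blocksize_f, sketch_blocksize_b, sketch_partition_backward) are hypothetical, and the logic only assumes the general idea stated in the commit message, namely that when iterating backwards the leftover edge case is handled in the first iteration so that every later subpartition starts on a multiple of the block size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a BLIS function): size of the next block when
   walking FORWARD through a dimension of length dim, where i elements
   have already been processed. The short edge block comes last. */
static size_t sketch_blocksize_f(size_t i, size_t dim, size_t b_alg)
{
    assert(b_alg > 0 && i <= dim);
    size_t left = dim - i;
    return left < b_alg ? left : b_alg;
}

/* Hypothetical helper (not a BLIS function): size of the next block when
   walking BACKWARD, where i elements at the END have already been
   processed. The short edge block is taken FIRST, so that all remaining
   blocks are full-sized and begin at multiples of b_alg. */
static size_t sketch_blocksize_b(size_t i, size_t dim, size_t b_alg)
{
    assert(b_alg > 0 && i <= dim);
    size_t left = dim - i;
    if (left == 0) return 0;
    size_t rem = left % b_alg;
    return rem != 0 ? rem : b_alg;
}

/* Walk backward over [0, dim) and record the start offset of each block.
   Returns the number of blocks written to starts (at most max_blocks). */
static size_t sketch_partition_backward(size_t dim, size_t b_alg,
                                        size_t *starts, size_t max_blocks)
{
    size_t count = 0;
    for (size_t i = 0; i < dim && count < max_blocks; ) {
        size_t b_now = sketch_blocksize_b(i, dim, b_alg);
        starts[count++] = dim - i - b_now;  /* start offset of this block */
        i += b_now;
    }
    return count;
}
```

With dim = 10 and b_alg = 4, the backward walk in this sketch visits blocks starting at offsets 8, 4, and 0 (sizes 2, 4, 4), so every block after the first begins on a multiple of 4. Taking a full-sized block first would instead produce starts at 6 and 2, which is the kind of unaligned subpartition the commit message refers to.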
BLIS framework
README
---
Thank you for deciding to try out the BLIS framework!
BLIS is a portable framework for instantiating BLAS-like libraries. The
framework was designed to isolate essential kernels of computation that,
when optimized, immediately enable optimized implementations of most of
its commonly used and computationally intensive operations.
BLIS has many features. For more detailed information about the project,
please check the BLIS homepage:
http://code.google.com/p/blis/
You can keep in touch with developers and other users of the project by
joining one or more of the following mailing lists:
  o blis-announce - http://groups.google.com/group/blis-announce
    Used only for announcements and other important messages regarding
    BLIS.

  o blis-discuss - http://groups.google.com/group/blis-discuss
    Please join and post to this mailing list if you have general questions
    or feedback regarding BLIS. Application developers (end users) should
    probably post here.

  o blis-devel - http://groups.google.com/group/blis-devel
    Please join and post to this mailing list if you are a BLIS developer
    (i.e., you are trying to use BLIS to create libraries, you want to
    write kernels for the framework, or you are trying to modify or extend
    the framework itself).
Also, please read the LICENSE file for information on copying and
distributing this software.
For a step-by-step guide on configuring, compiling, and installing BLIS,
please read the INSTALL file. Also, please check the BLIS website's wiki
page for other useful how-to guides.
Thanks again for your interest in BLIS!
Regards,
Field G. Van Zee
field@cs.utexas.edu