CHANGELOG update (for 0.1.2).

Field G. Van Zee
2014-06-05 10:54:16 -05:00
parent 00f232f8ed
commit 19c05dfaac

630
CHANGELOG

@@ -1,4 +1,632 @@
commit fde5f1fdece19881f50b142e8611b772a647e6d2 (HEAD, tag: 0.1.1, origin/master, origin/HEAD, master)
commit 00f232f8ed1f7c41619b12ebf779ebe2c3b2d3cd (HEAD, tag: 0.1.2, origin/master, master)
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Jun 2 13:40:57 2014 -0500
Added single-precision micro-kernel for Knights Corner aka MIC aka Xeon Phi
commit 3fc60e491426f6248c0feae88d971e4d1f88fb95
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Wed May 21 11:34:42 2014 -0500
Fixed ldim alignment bug in core2 gemm ukernel.
Details:
- Fixed a bug in the dunnington/core2 gemm micro-kernels that resulted in
a segmentation fault if a column-stored matrix's starting address was
aligned, but its leading dimension was such that its second column was
unaligned. Basically, the micro-kernel was assuming that aligned load
instructions were safe when they actually were not. An extra condition
that checks the alignment of cs_c (i.e., the leading dimension in the
column storage case) has now been added. Thanks to Michael Lehn for
reporting this bug.
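The condition described above can be sketched as follows. This is a minimal illustration, not the actual micro-kernel code: the function name aligned_loads_safe and the align parameter are hypothetical, and cs_c is assumed to be a non-negative column stride counted in elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 only if every column of a column-stored
 * double-precision matrix starts on an `align`-byte boundary.
 * Checking the base address alone is not enough: if the column stride
 * (cs_c, in elements) is not a multiple of the alignment, the second
 * column is already misaligned. */
static int aligned_loads_safe(const double *c, size_t cs_c, size_t align)
{
    int base_ok   = ((uintptr_t)c % align) == 0;
    int stride_ok = ((cs_c * sizeof(double)) % align) == 0;
    return base_ok && stride_ok;
}
```

A kernel would presumably fall back to unaligned loads whenever such a check returns 0.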
commit 77a2d8dac8b242d7a202c9aabda3927ab68cf987
Merge: 8c5d607 21fb089
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Tue May 20 09:53:19 2014 -0500
Merge pull request #8 from tlrmchlsmth/master
Added multithreading to most level-3 operations.
commit 21fb089387ee7c87f6dc53b0f60f68b48d3ff3e8
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon May 19 20:38:55 2014 -0700
Reverting changes to dunnington and reference configs
Now they are unchanged from the main branch of BLIS
commit 8a0ef0e0db5880730425926f8ba56b457a2ba764
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri May 16 13:44:14 2014 -0500
Fixed rounding error in bli_get_range_weighted
commit 0b4b1680334528b1b60bc696537600f763198e92
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri May 16 12:23:37 2014 -0500
Fixed bug with disabling JC loop threading for right sided trmm
commit 5c048a90d8dfa1dbde4e45fbc10ffcbdfe59d960
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Wed May 14 16:20:06 2014 -0500
Disabled parallelism for right-sided TRMM JC loop
The loop has dependent iterations.
commit 13a4c717ed0e273359dbaf5554cc4fa70b087d71
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Wed May 14 14:59:04 2014 -0500
Fixed bug with bli_get_range_weighted
commit 45957cc7745e9bb1698408d72f53ef192e960820
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue May 13 17:14:46 2014 -0500
Allowed threading to be turned off
No longer requires OpenMP to compile
Define the following in bli_config.h in order to enable multithreading:
BLIS_ENABLE_MULTITHREADING
BLIS_ENABLE_OPENMP
Also fixes a bug with bli_get_range_weighted
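For illustration, the two definitions named above might appear in bli_config.h as shown below. The macro names come from the commit message; the helper threading_enabled() is hypothetical and only demonstrates how code could test for both macros.

```c
/* In bli_config.h: both macros are defined to enable multithreading,
 * per the commit message. */
#define BLIS_ENABLE_MULTITHREADING
#define BLIS_ENABLE_OPENMP

/* Hypothetical helper (not part of BLIS): reports at compile time
 * whether both macros are present. */
static int threading_enabled(void)
{
#if defined(BLIS_ENABLE_MULTITHREADING) && defined(BLIS_ENABLE_OPENMP)
    return 1;
#else
    return 0;
#endif
}
```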
commit bd1dc98ce599d74513a553fe3b37a2ebca1c3812
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon May 12 17:26:19 2014 -0500
Disabled multithreading of the kc loop
commit 456df0372170bd7ca2c7e2d85365a69f1f04de88
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Wed Apr 30 12:28:00 2014 -0500
Replaced register blocksize hack with querying the register blocksize for determining parallelism granularity
commit f4fdfe8fc573553eb36795b79cdf681270dab71b
Merge: 31bb065 8c5d607
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Wed Apr 30 11:46:35 2014 -0500
Merge http://github.com/flame/blis
commit 8c5d6071e24ba10a53669390a47287e86ff354ce
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Tue Apr 29 12:26:12 2014 -0500
Added _check() routines for fprint[mv], rand[mv].
Details:
- Added _check() routines for fprintm, fprintv, randm, and randv.
- Added invocations to the above routines from their respective
front-ends.
commit 262cdabcc885bcf6636f4d8bb7d320f95e81d820
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Mon Apr 28 16:48:25 2014 -0500
Changed treatment of NULL object buffers.
Details:
- Relaxed the constraint in bli_obj_attach_buffer_check(), which required
the buffer address being attached to be non-NULL. This is acceptable
because the user was already able to create and use objects with NULL
buffers (via bli_obj_create_without_buffer(), which initializes the
buffer to NULL).
- Inserted calls to newly defined function, bli_check_object_buffer(),
into nearly all operations' _check() or _int_check() functions. This
allows BLIS to abort peacefully if a computational routine is called
with an object containing a NULL buffer. By contrast, under such
conditions, BLAS would typically fail with a segmentation fault.
- Within operation front-ends, moved the calls to _check()/_int_check()
so that zero dimensions are checked first (and if found, execution
returns with trivial or no computation). This resolves issue #7. Thanks
to Jack Poulson for reporting this bug.
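The ordering described above (zero dimensions first, then the buffer check) can be sketched roughly as follows. All names here are hypothetical stand-ins, and the sketch returns an error code where BLIS is described as aborting.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a BLIS object and error codes. */
typedef struct { size_t m; size_t n; double *buffer; } obj_sketch_t;
typedef enum { SKETCH_OK = 0, SKETCH_ERR_NULL_BUFFER = -1 } sketch_err_t;

/* Plays the role of a buffer check: reject objects with NULL buffers. */
static sketch_err_t check_object_buffer(const obj_sketch_t *a)
{
    return (a->buffer == NULL) ? SKETCH_ERR_NULL_BUFFER : SKETCH_OK;
}

/* A front-end that scales a matrix in place. Zero dimensions are
 * handled first, so an empty object with a NULL buffer returns
 * trivially instead of failing the buffer check. */
static sketch_err_t scale_sketch(double alpha, obj_sketch_t *a)
{
    if (a->m == 0 || a->n == 0) return SKETCH_OK;

    sketch_err_t err = check_object_buffer(a);
    if (err != SKETCH_OK) return err;

    for (size_t i = 0; i < a->m * a->n; ++i) a->buffer[i] *= alpha;
    return SKETCH_OK;
}
```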
commit 31bb065ba40ae0c5a614e743b8025abca012b99e
Merge: 20e2443 7c61959
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Wed Apr 23 12:30:19 2014 -0500
Merge http://github.com/flame/blis
commit 7c61959955c8ba78160d0ed4d1979022029d963b
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Apr 10 17:18:36 2014 -0500
Can now query register blocksizes from blk algs.
Details:
- Added a new field to blksz_t objects that allows one to attach a
sub-object. Doing this allows us to associate a register blocksize with
any given cache blocksize. That way, the register blocksize can be
queried wherever the cache blocksize would normally be accessible
(e.g. a blocked algorithm).
- Modified bli_gemm_cntl.c (and 4m/3m variants) so that the register
blocksizes are attached to the cache blocksizes after they are created.
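One way to picture the sub-object idea is a blocksize struct that carries an optional pointer to another blocksize. The sketch below is an assumption about the general shape only; the type, field, and function names are hypothetical and are not the actual blksz_t definition.

```c
#include <stddef.h>

/* Hypothetical blocksize object: one value per datatype (s, d, c, z)
 * plus an optional attached sub-object. */
typedef struct blksz_sketch_s
{
    size_t                 v[4];
    struct blksz_sketch_s *sub;
} blksz_sketch_t;

/* Attach a register blocksize to a cache blocksize. */
static void blksz_attach_sub(blksz_sketch_t *cache, blksz_sketch_t *reg)
{
    cache->sub = reg;
}

/* Query the attached register blocksize for datatype index dt;
 * returns 0 if nothing is attached. */
static size_t blksz_query_sub(const blksz_sketch_t *b, int dt)
{
    return (b->sub != NULL) ? b->sub->v[dt] : 0;
}
```

With this shape, any code that already holds the cache blocksize (a blocked algorithm, for example) can reach the register blocksize without extra parameters.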
commit 58671597d3d450817b2eda576c05ed6dadd8af6d
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Apr 10 15:35:30 2014 -0500
Minor cleanups to level-2 _cntl.c files.
Details:
- Changed level-2 _cntl.c files so that the blocksizes for gemv are
imported and used, rather than blocksizes being declared locally.
- Whitespace changes to gemv_cntl.c and gemm_cntl.c files (as well as
4m/3m variants).
- Removed test/old/test_blis2.c.
commit 20e24430a772bc0fbaf24dec2f8c544096fd3f4e
Author: Tyler Michael Smith <tmsmith@vestalac1.ftd.alcf.anl.gov>
Date: Tue Apr 8 17:50:44 2014 +0000
Some fixes for the bgq kernels
commit bde697f75ec1e7f2decebee0c9bd620b4c134cd5
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 16:43:44 2014 -0500
Add -openmp to ldflags as well
commit c332be8cd471eeace7b4fa4ae7443088b6a68ec3
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 16:37:50 2014 -0500
Added -openmp flag to Xeon Phi build for convenience
commit e7ca9e4b4a24d585c9aec8293fc7bb79e4171ad0
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 16:31:15 2014 -0500
Used BLIS_DEFAULT_*_MR for rounding partitioning instead of BLIS_DEFAULT_*_MC
commit 7b9b228c6fa4cfb70b1ebb855b009a036e85fac3
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 16:29:10 2014 -0500
Fix for tree barrier freeing bug
commit 5ec93bd9a76096312d51c326ccde1e9bd0a436ab
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 15:09:10 2014 -0500
Bunch of minor fixes
Removed barrier after unpackm in all level3 blocked variants
Now there is an implicit barrier inside unpackm that only occurs if C is packed (which is usually not the case)
Moved the enabling of the tree barriers into bli_config.h
Fed the default MR and NR for double precision into bli_get_range instead of the number 8
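The role of the blocking factor in range partitioning can be sketched as below. This is not the bli_get_range implementation; the function name and signature are hypothetical, and the sketch only shows the idea of giving each thread a subrange whose length is a multiple of a block factor (such as MR or NR) rather than a hard-coded 8.

```c
#include <stddef.h>

/* Hypothetical sketch: split [0, n) among n_way workers so that every
 * subrange except possibly the last non-empty one is a multiple of bf.
 * Assumes n_way > 0 and bf > 0. */
static void get_range_sketch(size_t work_id, size_t n_way, size_t n,
                             size_t bf, size_t *start, size_t *end)
{
    size_t n_blocks   = (n + bf - 1) / bf;              /* ceil(n / bf)  */
    size_t blocks_per = (n_blocks + n_way - 1) / n_way; /* per worker    */

    size_t s = work_id * blocks_per * bf;
    size_t e = s + blocks_per * bf;

    if (s > n) s = n;   /* trailing workers may get an empty range */
    if (e > n) e = n;

    *start = s;
    *end   = e;
}
```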
commit 575fb9b0b08f3bdb56ccde056da619d1585617c1
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 12:13:29 2014 -0500
Changed default blocking factor to default double precision MR and NR
commit ab9c7880335c281432d5809fe0dec46753d22569
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 11:38:11 2014 -0500
Added faster tree barriers necessary for performance for Xeon Phi
Fixed up some stuff in the thread info free functions
Disabled threading for TRSM so that it actually works when threading environment variables are set
commit ec58a7923cccac08632670caadf3cf6ff5dce766
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 10:22:48 2014 -0500
Freeing thread info paths.
Also made herk IC and JC loops do weighted partitioning
commit 2b6848b2397d6d84ca4e5f792fc51ad05e351a36
Merge: 4e3eb39 21a0efb
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Apr 4 09:54:54 2014 -0500
Merge http://github.com/flame/blis
Conflicts:
kernels/bgq/1/bli_axpyv_opt_var1.c
kernels/bgq/1/bli_dotv_opt_var1.c
commit 4e3eb39aca4df0b9fdc003d468f368a2f2ba597d
Author: Tyler Michael Smith <tmsmith@vestalac1.ftd.alcf.anl.gov>
Date: Fri Apr 4 14:50:03 2014 +0000
Some fixes to the bgq config
MR and NR for double complex were wrong
Default fusing factor for double precision was wrong as well
commit 21a0efb33d7435139e9c43c1a4787a6bff533e26
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Apr 3 16:38:44 2014 -0500
Fixed follow-up to issue #6.
commit c318157a9bee8ea6e59be16f99f65d9271fe0d27
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Apr 3 16:24:34 2014 -0500
Fixed issue #6 (incorrect 'restrict' usage).
Details:
- Fixed improper usage of restrict keyword in axpyv and dotv bgq kernels.
(However, there may be other instances of similar misuse elsewhere in
BLIS.) Thanks to Jeff Hammond for reporting this issue.
commit b5150a1bf3bd89598e2b3aeac110eb5b44ac6c12
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Apr 3 12:25:45 2014 -0500
Added #include "arm_neon.h" to ARM gemm ukernel.
Details:
- Inserted #include "arm_neon.h" into gemm ukernel source file for
arm/neon. Thanks to Jean-Michel Hautbois for suggesting this fix.
commit 2041c264517b6c590fd4f7e8253e6911b622d1c3
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Apr 3 10:30:03 2014 -0500
Added barriers needed prior to doing scalar reset for rank-k updates.
commit 47a90e69dfde3f4f8fdf90654248a6b499fbadbc
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Tue Apr 1 14:34:31 2014 -0500
Attempted to fix uninitialized variable warnings.
Details:
- Added initialization statements to various macros used in level 1m and
1m-like operations. I wasn't able to reproduce the reported behavior,
so hopefully this takes care of it. Thanks to Jeff Hammond for the
report.
commit d27b4f690c14b1f836f8c7a3c0e91e09d852f02e
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Tue Apr 1 12:57:24 2014 -0500
Use generic paths for toolchain in POWER7.
Details:
- Fixed issue #4. Thanks to Jeff Hammond for contributing changes.
commit 1584ae1c83c3a8c1af76acb46404747507650f19
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Fri Mar 28 15:15:48 2014 -0500
Fixed race condition involving scalar reset
commit 459dde4acc09e49380da58fb7b246db488884ad9
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Mar 27 17:06:45 2014 -0500
Made barrier after packing implicit.
This also fixed a bug where barriers in the blocked variants were inserted after the inner packing routines,
but not the outer packing routines.
This allowed, for instance, computation to begin before the block of B had finished being packed.
commit 9f78ec6e7e95fcad89a167b27cad7e2d74b6d122
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Mar 27 14:18:46 2014 -0500
Some fixes for the internal functions,
which were inappropriately having only the thread chief do some things.
commit a6fd48345424e097f71652be013aa897e098b41e
Author: Tyler Michael Smith <tmsmith@vestalac1.ftd.alcf.anl.gov>
Date: Wed Mar 26 17:19:46 2014 +0000
Added test drivers for level 3 BLAS that run tests in parallel using MPI
commit 73b3db594864be0f9be9a0eb29bf961fa9c95f29
Author: Tyler Michael Smith <tmsmith@vestalac1.ftd.alcf.anl.gov>
Date: Wed Mar 26 15:39:05 2014 +0000
Some fixes for the bgq configuration
commit f0824a04fc75e231c3a3d7757fa4e7294173282f
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 24 15:21:42 2014 -0500
Initial commit to enable threading in TRSM,
Also enabled weighted partitioning for herk, trmm
Fixed bug where multiple threads would try to modify the same state in the internal level 3 functions
Correctly computed a_next and b_next for gemm, herk macrokernels
a_next and b_next point to the current micropanels in trmm
commit 23d9eab354fbc88165889832955e126772bf8488
Merge: 5d5dc2e fd3e32a
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Mar 20 16:54:35 2014 -0500
Merge https://github.com/flame/blis
commit 5d5dc2eedef2f7c90d61371a1b457be5c06cf583
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Mar 20 16:43:36 2014 -0500
Parallelized trmm and trmm3
Also fixed bugs in packm
commit fd3e32a5f419fa412f46afe4dd1c3a26e15f3eb4
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Mar 20 13:59:48 2014 -0500
Refined INSERT_GENTFUNC macro usage.
Details:
- Defined new INSERT_GENTFUNC macros so that the macro always takes
exactly the number of arguments needed for the particular operation or
variant being defined. Many operations were using INSERT_GENTFUNC
macros that expected one auxiliary argument even though none were
needed. Those instances have now been updated. Most of these instances
were in the level-0 and -1v operations, as well as some operations
defined in frame/util.
commit 9b0e715f29338a1a1d6445907d2445c35f011121
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Wed Mar 19 15:47:54 2014 -0500
Minor simplifications to trmm, trsm macro-kernels.
Details:
- Simplified some code that would have allowed the diagonal of a trmm
or trsm triangular matrix to intersect the short end of a micro-panel.
This is disallowed via higher-level constraints on cache blocksizes, so
this code was never needed and only served to obfuscate.
- Updated some comments in trmm, trsm macro-kernels.
commit a3902750b9ab4923433f7e353f3669c3c419f8e4
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Wed Mar 19 12:35:17 2014 -0500
Reorganized norm operations.
Details:
- Completely reorganized norm operations:
- Renames:
- fnormsc, fnormv, fnormm -> normfsc, normfv, normfm (2-norm)
- absumv -> norm1v (vector 1-norm)
- New operations:
- norm1m (matrix 1-norm)
- normiv, normim (infinity-norm)
- amaxv (BLAS-like absolute maximum value index)
- asumv (BLAS-like absolute sum)
- Deprecated absumm, as it did not correspond to any actual norm.
(However, an inlined version now exists in the testsuite module for
randm.)
commit c0140cb752f27e99742f85d23be2181c00a1335e
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Wed Mar 19 11:21:16 2014 -0500
Fixed packm variants 3 and 4 where every thread was trying to manipulate the same state
Now just performed by the master thread.
commit fb42983bd9943711baa7d1c6496de1215bb816ef
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue Mar 18 16:37:28 2014 -0500
Fixed a barrier bug and a thread decorator bug
commit aa2405f8b23d0f8d2ec04790882f2176ef2e8fd8
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue Mar 18 15:23:09 2014 -0500
Fixing function pointer issues with thread decorator
commit ec8b88f93533942d3711191873310e7ff281bda6
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue Mar 18 14:35:37 2014 -0500
Enabled threading for packm blocked variants 3 and 4
commit 0ac534cdf657bbf04601abfe719ba2887aab5da7
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue Mar 18 13:26:27 2014 -0500
Added decorator for calling parallelized internal functions
Will allow for easy support for different threading models
commit 5296f58975f7d351f88909cc80b6d0cffd73def7
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 17 17:15:35 2014 -0500
Fixing some bugs with herk parallelization
commit c51d0110831eb89361b4720bf7ed75edbd26ebce
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 17 15:00:47 2014 -0500
Initial multithreading support for HERK
commit c720b141568d1f289146bf34ded08001f2c0dfbb
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 17 11:39:32 2014 -0500
Switched to using environment variables to control threading.
The environment variables all follow the format BLIS_X_NT,
where X is the index of the loop as described in our paper
Anatomy of High Performance Many-Threaded Matrix Multiplication.
These indices are IR, JR, IC, KC, and JC.
Also enabled parallelism for hemm and symm, but these are currently untested.
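As a rough illustration of the BLIS_X_NT naming scheme, the sketch below builds a variable name from a loop index and reads it with getenv(). The helper names and the fallback of 1 for unset or invalid values are assumptions; the commit message does not say how BLIS itself parses these variables.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical: parse a thread count, falling back to 1 when the
 * value is missing, malformed, or less than 1. */
static long parse_nt(const char *value)
{
    if (value == NULL) return 1;

    char *end = NULL;
    long  nt  = strtol(value, &end, 10);

    if (end == value || *end != '\0' || nt < 1) return 1;
    return nt;
}

/* Hypothetical: look up BLIS_<loop>_NT, where loop is one of
 * "IR", "JR", "IC", "KC", or "JC". */
static long loop_nt(const char *loop)
{
    char name[32];
    snprintf(name, sizeof name, "BLIS_%s_NT", loop);
    return parse_nt(getenv(name));
}
```

For example, with BLIS_JC_NT=2 and BLIS_IC_NT=4 set in the environment, loop_nt("JC") and loop_nt("IC") would return 2 and 4 in this sketch.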
commit 92233cf64274b27b2217c5cfffe75443ff6137a4
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue Mar 11 14:16:08 2014 -0500
Some fixes to gemm thread info tree creation,
Changed microkernel tests to use the new BLIS_PACKM_SINGLE_THREADED
instead of BLIS_SINGLE_THREADED
commit 020f80c30289d8bcaa688bf600b01fae9b23b54f
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Tue Mar 11 12:08:17 2014 -0500
Added files specific to threading for gemm and packm operations
commit 8d8f4352a41926bc923e47be836365b6b726aff2
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 10 15:47:28 2014 -0500
Added single threaded thread info data structures specifically for gemm and packm
commit 0e8677761175189583ca7d855e24b2bbdd2dada8
Merge: 2e727a0 b3bff63
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 10 15:16:21 2014 -0500
Merge branch 'master' of https://github.com/tlrmchlsmth/blis
commit 2e727a025a8f796d2b6bd14f489d0ee72e7d1fc7
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Mon Mar 10 15:14:33 2014 -0500
Modifying the thread info data structures
This change makes each operation have its own thread info type,
allowing more fine control of threading in operations that have different types of suboperations
commit a770590cf21a459f04bf941c58ee2afd272cc441
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Mon Mar 3 14:31:44 2014 -0600
Minor fixes to sumsqv, abmaxv.
Details:
- Minor update to bli_sumsqv_unb_var1() to bring it up-to-date with
LAPACK 3.5.0's zlassq.f, which, starting with 3.4.2, returns NaN when
the vector (or matrix) contains a NaN.
- Minor change to bli_abmaxv_unb_var1() to more closely mimic the
behavior of netlib BLAS's izamax(). There, a "less than or equal to"
operator is used in the search instead of "less than", which would
change the element index returned if there were multiple maximum values.
- Added macro function definitions for bli_isinf() and bli_isnan(), which
are currently implemented in terms of isinf() and isnan() from math.h.
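The effect of the comparison operator on the returned index can be seen in a small sketch. The two functions below are hypothetical and are not the BLIS or netlib code; they differ only in whether the running maximum is replaced on a strict or a non-strict comparison.

```c
#include <stddef.h>

/* Absolute value without depending on libm. */
static double abs_sketch(double x) { return (x < 0.0) ? -x : x; }

/* Replace the running maximum only on a strict "less than":
 * ties keep the FIRST index at which the maximum occurs. */
static size_t abmax_strict(const double *x, size_t n)
{
    size_t idx  = 0;
    double best = -1.0;
    for (size_t i = 0; i < n; ++i)
    {
        double a = abs_sketch(x[i]);
        if (best < a) { best = a; idx = i; }
    }
    return idx;
}

/* Replace on "less than or equal to":
 * ties move to the LAST index at which the maximum occurs. */
static size_t abmax_nonstrict(const double *x, size_t n)
{
    size_t idx  = 0;
    double best = -1.0;
    for (size_t i = 0; i < n; ++i)
    {
        double a = abs_sketch(x[i]);
        if (best <= a) { best = a; idx = i; }
    }
    return idx;
}
```

For the vector {1, -3, 2, 3}, the maximum absolute value 3 occurs at indices 1 and 3, so the two variants return different indices.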
commit b3bff631eadf98b15cb422fb4a8e2f855c23e8a7
Merge: 2c158fb e8757b0
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 16:53:24 2014 -0600
Merge https://github.com/flame/blis
commit 2c158fb885c27f7b599dc1e85b57edd684f19223
Merge: e4738c4 c2b2ab6
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 16:46:23 2014 -0600
Merge https://github.com/flame/blis
Conflicts:
frame/1m/packm/bli_packm_blk_var1.c
commit e8757b03a74f9891632242e9a90efb32150826f5
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Feb 27 16:40:07 2014 -0600
Use "%ld" as int format specifier in fprintm.
Details:
- Changed "%d" to "%ld" when printing integers via bli_fprintm().
- Meant to include this in previous commit.
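A short sketch of why the specifier matters: passing a long to "%d" is undefined behavior in C, so the format must match the argument type. The function below is hypothetical and assumes the integer being printed is (or is cast to) long; the commit message does not state the underlying BLIS integer type.

```c
#include <stdio.h>

/* Hypothetical helper: format an integer held in a long.
 * "%ld" matches the long argument; "%d" would expect an int. */
static int format_int_sketch(char *buf, size_t len, long value)
{
    return snprintf(buf, len, "%ld", value);
}
```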
commit c663ce3b5170fee7dfb5b528b650d70c8e932cac
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Thu Feb 27 16:32:57 2014 -0600
Fixed various bugs when C99 complex is enabled.
Details:
- Fixed various bugs in packm_*_cxk(), the 4m/3m micro-kernels, and
elsewhere in the framework that were not yet set up to work properly
when BLIS_ENABLE_C99_COMPLEX is defined in bli_config.h.
- Extensive changes to f2c-derived files in frame/compat/f2c to allow
C99 complex storage. Most of these changes center around accessing
real and imaginary components via bli_?real()/bli_?imag() accessor
macros, and setting of values via bli_?sets() assignment macros.
(Thanks to Vladimir Sukarev for pointing out that _ENABLE_C99_COMPLEX
was broken.)
commit e4738c48e00b89391d9baa1fd0aa62d1ea2f95e6
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 16:29:46 2014 -0600
Added support for parallelism in gemm micro-kernel
commit bfe214b633765ed40b57b330fbb84c332663aa40
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 15:53:10 2014 -0600
Fixed bug with parallel packing, and bug with allocating an array of thread infos
In packm variant 1, the variable p_begin was incremented each iteration, causing a dependency.
This dependency was removed, allowing each iteration to be executed in parallel.
Somewhere in bli_threading.c, I was allocating an array of pointers instead of an array of structs.
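Both fixes described above follow common patterns, sketched here under stated assumptions. The names are hypothetical and the loops are simplified stand-ins for packing: the first pair of functions contrasts a pointer that is advanced each iteration with one computed from the loop index, and the last function allocates an array of structs rather than an array of pointers.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { int id; int n_threads; } thread_info_sketch_t;

/* Dependent form: p is advanced every iteration, so iteration i
 * needs the value of p left behind by iteration i-1. */
static void fill_dependent(double *p_begin, size_t n_panels, size_t ps)
{
    double *p = p_begin;
    for (size_t i = 0; i < n_panels; ++i)
    {
        for (size_t j = 0; j < ps; ++j) p[j] = (double)i;
        p += ps;
    }
}

/* Independent form: each iteration derives its own pointer from i,
 * so iterations can run in any order or in parallel. */
static void fill_independent(double *p_begin, size_t n_panels, size_t ps)
{
    for (size_t i = 0; i < n_panels; ++i)
    {
        double *p = p_begin + i * ps;
        for (size_t j = 0; j < ps; ++j) p[j] = (double)i;
    }
}

/* Allocate n structs. Writing sizeof(thread_info_sketch_t *) here
 * would size the array for pointers instead of structs. */
static thread_info_sketch_t *alloc_infos(size_t n)
{
    thread_info_sketch_t *infos = malloc(n * sizeof *infos);
    if (infos == NULL) return NULL;
    for (size_t i = 0; i < n; ++i)
    {
        infos[i].id        = (int)i;
        infos[i].n_threads = (int)n;
    }
    return infos;
}
```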
commit 6193d9ceea552e67170dba45abde04c64271c705
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 14:09:19 2014 -0600
Fixed bug in thread trees
commit ac5a2de1d17ffd460b00fee9757898525a09abae
Merge: 01b125e bd3c7ec
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 11:59:33 2014 -0600
Merge branch 'master' of https://github.com/tlrmchlsmth/blis
commit 01b125e815f19410e8e0611d088b84570e499e93
Author: Tyler Smith <tms@cs.utexas.edu>
Date: Thu Feb 27 11:55:45 2014 -0600
First pass at adding parallelism to BLIS.
Added a multithreading infrastructure that should be independent of multithreading implementation in the future.
Currently, gemm blocked variants 1f and 2f and packm blocked variant 1 are parallelized.
commit c2b2ab62707e4174892aff3ce65f36f54878fae5
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Wed Feb 26 12:46:45 2014 -0600
Deprecated panel stride alignment in bli_config.h.
Details:
- Removed BLIS_CONTIG_STRIDE_ALIGN_SIZE from bli_config.h of all
configurations. It was already going unused in packm_init() since the
recent 4m/3m commit. This setting was rarely, if ever, useful, and its
existence only posed a potential risk for 4m/3m-based implementations.
- Removed BLIS_CONTIG_STRIDE_ALIGN_SIZE usage from mem_pool_macro_defs.h.
- Updated comments regarding CONTIG_STRIDE_ALIGN_SIZE in template
micro-kernels.
commit f18aee83a5ac1b14808686fc3c5a3c846a1d99b9
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Tue Feb 25 17:58:42 2014 -0600
CHANGELOG update (for 0.1.1).
commit fde5f1fdece19881f50b142e8611b772a647e6d2 (tag: 0.1.1)
Author: Field G. Van Zee <field@cs.utexas.edu>
Date: Tue Feb 25 13:34:56 2014 -0600