Details:
- Removed four trailing spaces after "BLIS" that occurs in most files'
commented-out license headers.
- Added UT copyright lines to some files. (These files previously had
only AMD copyright lines but were contributed to by both UT and AMD.)
- In some files' copyright lines, expanded 'The University of Texas' to
'The University of Texas at Austin'.
- Fixed various typos/misspellings in some license headers.
Details:
- Updated the non-tree openmp and pthreads barriers defined in
bli_thrcomm_openmp.c and bli_thrcomm_pthreads.c to instead call a common
implementation in bli_thrcomm.c, bli_thrcomm_barrier_atomic(). This new
implementation goes through the same motions as the previous codes, but
protects its loads and increments with GNU atomic built-ins. These atomic
statements take memory ordering parameters that allow us to specify just
enough constraints for the barrier to work as intended on weakly-ordered
hardware. The prior implementation was only guaranteed to work on systems
with strongly- ordered memory. (Thanks to Devin Matthews for suggesting
this change and his crash-course in atomics and memory ordering.)
- Removed 'volatile' from structs' barrier field declarations in
bli_thrcomm_*.h.
- Updated bli_thrcomm_pthread.? files to use renamed struct barrier fields
consistent with that of the _openmp.? files.
- Updated other bli_thrcomm_* files to rename "communicator" variables to
simply "comm".
Details:
- Reorganized code and renamed files defining APIs related to multithreading.
All code that is not specific to a particular operation is now located in a
new directory: frame/thread. Code is now organized, roughly, by the
namespace to which it belongs (see below).
- Consolidated all operation-specific *_thrinfo_t object types into a single
thrinfo_t object type. Operation-specific level-3 *_thrinfo_t APIs were
also consolidated, leaving bli_l3_thrinfo_*() and bli_packm_thrinfo_*()
functions (aside from a few general purpose bli_thrinfo_*() functions).
- Renamed thread_comm_t object type to thrcomm_t.
- Renamed many of the routines and functions (and macros) for multithreading.
We now have the following API namespaces:
- bli_thrinfo_*(): functions related to thrinfo_t objects
- bli_thrcomm_*(): functions related to thrcomm_t objects.
- bli_thread_*(): general-purpose functions, such as initialization,
finalization, and computing ranges. (For now, some macros, such as
bli_thread_[io]broadcast() and bli_thread_[io]barrier() use the
bli_thread_ namespace prefix, even though bli_thrinfo_ may be more
appropriate.)
- Renamed thread-related macros so that they use a bli_ prefix.
- Renamed control tree-related macros so that they use a bli_ prefix (to be
consistent with the thread-related macros that were also renamed).
- Removed #undef BLIS_SIMD_ALIGN_SIZE from dunnington's bli_kernel.h. This
#undef was a temporary fix to some macro defaults which were being applied
in the wrong order, which was recently fixed.