9 Commits

Author SHA1 Message Date
Field G. Van Zee
7ed415824d Updated copyright headers (continued).
Details:
- Inserted "at Austin" into third clause of license declarations.
  Meant to include this change in previous commit.
2014-07-14 16:14:33 -05:00
Field G. Van Zee
5c2c6c8561 Updated copyright headers to contain "at Austin".
Details:
- Updated copyright headers to include "at Austin" in the name of the
  University of Texas.
- Updated the copyright years of a few headers to 2014 (from 2011 and
  2012).
2014-07-14 16:05:03 -05:00
Field G. Van Zee
fde5f1fdec Added extensive support for configuration defaults.
Details:
- Standard names for reference kernels (levels-1v, -1f and 3) are now
  macro constants. Examples:
    BLIS_SAXPYV_KERNEL_REF
    BLIS_DDOTXF_KERNEL_REF
    BLIS_ZGEMM_UKERNEL_REF
- Developers no longer have to name all datatype instances of a kernel
  with a common base name; [sdcz] datatype flavors of each kernel or
  micro-kernel (level-1v, -1f, or 3) may now be named independently.
  This means you can now, if you wish, encode the datatype-specific
  register blocksizes in the name of the micro-kernel functions.
- Any datatype instances of any kernel (1v, 1f, or 3) that is left
  undefined in bli_kernel.h will default to the corresponding reference
  implementation. For example, if BLIS_DGEMM_UKERNEL is left undefined,
  it will be defined to be BLIS_DGEMM_UKERNEL_REF.
- Developers no longer need to name level-1v/-1f kernels with multiple
  datatype chars to match the number of types the kernel WOULD take in
  a mixed type environment, as in bli_dddaxpyv_opt(). Now, one char is
  sufficient, as in bli_daxpyv_opt().
- There is no longer a need to define an obj_t wrapper to go along with
  your level-1v/-1f kernels. The framework now prvides a _kernel()
  function which serves as the obj_t wrapper for whatever kernels are
  specified (or defaulted to) via bli_kernel.h
- Developers no longer need to prototype their kernels, and thus no
  longer need to include any prototyping headers from within
  bli_kernel.h. The framework now generates kernel prototypes, with the
  proper type signature, based on the kernel names defined (or defaulted
  to) via bli_kernel.h.
- If the complex datatype x (of [cz]) implementation of the gemm micro-
  kernel is left undefined by bli_kernel.h, but its same-precision real
  domain equivalent IS defined, BLIS will use a 4m-based implementation
  for the datatype x implementations of all level-3 operations, using
  only the real gemm micro-kernel.
2014-02-25 13:34:56 -06:00
Field G. Van Zee
2cb13600f9 Updated year in copyright headers to 2014. 2014-01-03 12:29:13 -06:00
Field G. Van Zee
a0331fb10a Introduced auxinfo_t argument to micro-kernels.
Details:
- Removed a_next and b_next arguments to micro-kernels and replaced them
  with a pointer to a new datatype, auxinfo_t, which is simply a struct
  that holds a_next and b_next. The struct may hold other auxiliary
  information that may be useful to a micro-kernel, such as micro-panel
  stride. Micro-kernels may access struct fields via accessor macros
  defined in bli_auxinfo_macro_defs.h.
- Updated all instances of micro-kernel definitions, micro-kernel calls,
  as well as macro-kernels (for declaring and initializing the structs)
  according to above change.
2013-12-19 14:50:11 -06:00
Field G. Van Zee
376bbb59c8 Removed support for duplication.
Details:
- Removed support for duplication from the gemmtrsm/trsm micro-kernels
  and all framework code.
- Updated test suite modules according to above changes.
2013-11-08 11:17:34 -06:00
Field G. Van Zee
97aaf220a8 Added new kernels, configurations.
Details:
- Added various micro-kernels for the following architectures:
    Intel MIC
    IBM BG/Q
    IBM Power7
    AMD Piledriver
    Loogson 3A
  and reorganized kernels directory. Thanks to Tyler Smith, Mike Kistler,
  and Xianyi Zhang for contributing these kernels.
- Added configurations corresponding to above architectures, and renamed
  "clarksville" configuration to "dunnington".
2013-09-17 10:51:36 -05:00
Field G. Van Zee
9d10d7dd9b Added a_next, b_next arguments to micro-kernels.
Details:
- Added two more arguments to the gemm and gemmtrsm microkernels: the
  addresses of the next micro-panels of A and B. By passing these
  pointers into the micro-kernel, we allow the micro-kernel author to
  prefetch micro-panels of A and B as necessary (though this is
  completely optional; these addresses may also be safely ignored).
- Updated all seven macro-kernels so that they compute and pass in
  a_next and b_next. Note that ONLY the gemm macro-kernel computes
  a_next and b_next with the precise semantics we want. I will go back
  and fix the other macro-kernels in the near future.
- Added 'restrict' to various micro-kernels from which it was missing.
2013-04-23 16:00:18 -05:00
Field G. Van Zee
4fe1435f20 Updated dupl implementation to use PACKNR and NR.
Details:
- Updated frame/util/dupl/bli_dupl_unb_var1.c to utilize PACKNR and NR
  explicitly so navigate b1 so that situations where PACKNR > NR are
  supported.
- Moved the 4x2 and 4x4 reference micro-kernels in frame/3/gemm/ukernels and
  frame/3/trsm/ukernels to kernels/c99/.
- Updated clarksville and flame configurations.
2013-04-22 19:00:43 -05:00