Added Doxygen support for extension APIs.

Details:
- Added Doxyfile, a configuration file in the docs directory for generating Doxygen documentation from source files.
- Currently only the CBLAS interface of the extension APIs (batched gemm and gemmt) is included.
- Support for the BLAS interface is yet to be added.
- To generate the Doxygen-based documentation for the extension APIs, use the following command:
  $ doxygen docs/Doxyfile

AMD-Internal: [CPUPL-3188]

Change-Id: I76e70b08f0114a528e86514bcb01d666acc591e8
Harsh Dave
2023-04-14 08:19:06 -05:00
parent b531022bac
commit b85b856950
7 changed files with 4909 additions and 4 deletions

2842
docs/Doxyfile Normal file

File diff suppressed because it is too large

138
docs/Main_Page.md Normal file

@@ -0,0 +1,138 @@
@mainpage
# Welcome to AOCL-BLIS
---
## Table of Contents
* [Introduction](#Introduction)
* [Build and Installation](#Build)
* [Examples](#Example)
* [Contact Us](#Contact)
<div id="Introduction" name="Introduction"></div>
## Introduction
<b>AOCL BLIS</b> is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries. The framework was designed to isolate essential kernels of computation that, when optimized, immediately enable optimized implementations of most of its commonly used and computationally intensive operations. BLIS is written in ISO C99 and available under a new/modified/3-clause BSD license. While BLIS exports a new BLAS-like API, it also includes a BLAS compatibility layer which gives application developers access to BLIS implementations via traditional BLAS routine calls. An object-based API unique to BLIS is also available.
How to Download BLIS
--------------------
There are a few ways to download BLIS. We list the four most common ways below.
We **highly recommend** using either Option 1 or 2. Otherwise, we recommend
Option 3 (over Option 4) so your compiler can perform optimizations specific
to your hardware.
1. **Download a source repository with `git clone`.**
Generally speaking, we prefer using `git clone` to clone a `git` repository.
Having a repository allows the user to periodically pull in the latest changes
and quickly rebuild BLIS whenever they wish. Also, implicit in cloning a
repository is that the repository defaults to using the `master` branch, which
contains the latest "stable" commits since the most recent release. (This is
in contrast to Option 3 in which the user is opting for code that may be
slightly out of date.)
In order to clone a `git` repository of BLIS, please obtain a repository
URL by clicking on the green button above the file/directory listing near the
top of this page (as rendered by GitHub). Generally speaking, it will amount
to executing the following command in your terminal shell:
```
git clone https://github.com/amd/blis.git
```
2. **Download a source repository via a zip file.**
If you are uncomfortable with using `git` but would still like the latest
stable commits, we recommend that you download BLIS as a zip file.
In order to download a zip file of the BLIS source distribution, please
click on the green button above the file listing near the top of this page.
This should reveal a link for downloading the zip file.
3. **Download a source release via a tarball/zip file.**
Alternatively, if you would like to stick to the code that is included in
official releases, you may download either a tarball or zip file of any of
BLIS's previous [tagged releases](https://github.com/flame/blis/releases).
We consider this option to be less than ideal for most people since it will
likely mean you miss out on the latest bugfix or feature commits (in contrast
to Options 1 or 2), and you also will not be able to update your code with a
simple `git pull` command (in contrast to Option 1).
4. **Download a binary package specific to your OS.**
While we don't recommend this as the first choice for most users, we provide
links to community members who generously maintain BLIS packages for various
Linux distributions such as Debian Unstable and EPEL/Fedora. Please see the
[External Packages](#external-packages) section below for more information.
Getting Started
---------------
*NOTE: This section assumes you've either cloned a BLIS source code repository
via `git`, downloaded the latest source code via a zip file, or downloaded the
source code for a tagged version release---Options 1, 2, or 3, respectively,
as discussed in [the previous section](#how-to-download-blis).*
If you just want to build a sequential (not parallelized) version of BLIS
in a hurry and come back and explore other topics later, you can configure
and build BLIS as follows:
```
$ ./configure auto
$ make [-j]
```
You can then verify your build by running BLAS- and BLIS-specific test
drivers via `make check`:
```
$ make check [-j]
```
And if you would like to install BLIS to the directory specified to `configure`
via the `--prefix` option, run the `install` target:
```
$ make install
```
Please read the output of `./configure --help` for a full list of configure-time
options.
If/when you have time, we *strongly* encourage you to read the detailed
walkthrough of the build system found in our [Build System](docs/BuildSystem.md)
guide.
Example Code
------------
The BLIS source distribution provides example code in the `examples` directory.
Example code focuses on using BLIS APIs (not BLAS or CBLAS), and resides in
two subdirectories: [examples/oapi](examples/oapi) (which demonstrates the
[object API](docs/BLISObjectAPI.md)) and [examples/tapi](examples/tapi) (which
demonstrates the [typed API](docs/BLISTypedAPI.md)).
Either directory contains several files, each containing various pieces of
code that exercise core functionality of the BLIS API in question (object or
typed). These example files should be thought of collectively like a tutorial,
and therefore it is recommended to start from the beginning (the file that
starts in `00`).
You can build all of the examples by simply running `make` from either example
subdirectory (`examples/oapi` or `examples/tapi`). (You can also run
`make clean`.) The local `Makefile` assumes that you've already configured and
built (but not necessarily installed) BLIS two directories up, in `../..`. If
you have already installed BLIS to some permanent directory, you may refer to
that installation by setting the environment variable `BLIS_INSTALL_PATH` prior
to running make:
```
export BLIS_INSTALL_PATH=/usr/local; make
```
or by setting the same variable as part of the make command:
```
make BLIS_INSTALL_PATH=/usr/local
```
**Once the executable files have been built, we recommend reading the code and
the corresponding executable output side by side. This will help you see the
effects of each section of code.**
This tutorial is not exhaustive or complete; several object API functions were
omitted (mostly for brevity's sake) and thus more examples could be written.
<div id = "Contact"></div>
## Contact Us
AOCL BLIS is developed and maintained by AMD. You can contact us by email at <b>[aoclsupport@amd.com](mailto:aoclsupport@amd.com)</b>.

BIN
docs/styling/AMD_Logo.png Normal file

Binary file not shown.


File diff suppressed because it is too large

43
docs/styling/footer.html Normal file

@@ -0,0 +1,43 @@
<!--
Copyright (C) 2023, Advanced Micro Devices. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<style>
.footer {
position: relative;
left: 0;
bottom: 0;
width: 100%;
background-color: rgba(22, 22, 22, 0);
text-align: center;
padding: 50px 0px 25px 0px;
}
</style>
<body>
<div class = "footer"> &nbsp; Copyright (C) 2023, Advanced Micro Devices. All rights reserved. </div>
</body>
</html>

87
docs/styling/header.html Normal file

@@ -0,0 +1,87 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen $doxygenversion"/>
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<!-- BEGIN opengraph metadata -->
<meta property="og:title" content="Doxygen Awesome" />
<meta property="og:image" content="https://repository-images.githubusercontent.com/348492097/4f16df80-88fb-11eb-9d31-4015ff22c452" />
<meta property="og:description" content="Custom CSS theme for doxygen html-documentation with lots of customization parameters." />
<meta property="og:url" content="https://jothepro.github.io/doxygen-awesome-css/" />
<!-- END opengraph metadata -->
<!-- BEGIN twitter metadata -->
<meta name="twitter:image:src" content="https://repository-images.githubusercontent.com/348492097/4f16df80-88fb-11eb-9d31-4015ff22c452" />
<meta name="twitter:title" content="Doxygen Awesome" />
<meta name="twitter:description" content="Custom CSS theme for doxygen html-documentation with lots of customization parameters." />
<!-- END twitter metadata -->
<!--BEGIN PROJECT_NAME--><title>$projectname: $title</title><!--END PROJECT_NAME-->
<!--BEGIN !PROJECT_NAME--><title>$title</title><!--END !PROJECT_NAME-->
<link href="$relpath^tabs.css" rel="stylesheet" type="text/css"/>
<link rel="icon" type="image/svg+xml" href="logo.drawio.svg"/>
<script type="text/javascript" src="$relpath^jquery.js"></script>
<script type="text/javascript" src="$relpath^dynsections.js"></script>
<script type="text/javascript" src="$relpath^doxygen-darkmode-toggle.js"></script>
<script type="text/javascript" src="$relpath^doxygen-fragment-copy-button.js"></script>
<!-- <script type="text/javascript" src="$relpath^doxygen-awesome-paragraph-link.js"></script> -->
<script type="text/javascript" src="$relpath^doxygen-interactive-toc.js"></script>
<!-- <script type="text/javascript" src="$relpath^toggle-alternative-theme.js"></script> -->
<script type="text/javascript">
DoxygenAwesomeFragmentCopyButton.init()
DoxygenAwesomeDarkModeToggle.init()
// DoxygenAwesomeParagraphLink.init() is disabled because doxygen-awesome-paragraph-link.js is commented out above
DoxygenAwesomeInteractiveToc.init()
</script>
$treeview
$search
$mathjax
<link href="$relpath^$stylesheet" rel="stylesheet" type="text/css" />
$extrastylesheet
</head>
<body>
<!-- https://tholman.com/github-corners/ -->
<a href="https://github.com/jothepro/doxygen-awesome-css" class="github-corner" title="View source on GitHub" target="_blank">
<path d="M0,0 L115,115 L130,115 L142,142 L250,250 L250,0 Z"></path><path d="M128.3,109.0 C113.8,99.7 119.0,89.6 119.0,89.6 C122.0,82.7 120.5,78.6 120.5,78.6 C119.2,72.0 123.4,76.3 123.4,76.3 C127.3,80.9 125.5,87.3 125.5,87.3 C122.9,97.6 130.6,101.9 134.4,103.2" fill="currentColor" style="transform-origin: 130px 106px;" class="octo-arm"></path><path d="M115.0,115.0 C114.9,115.1 118.7,116.5 119.8,115.4 L133.7,101.6 C136.9,99.2 139.9,98.4 142.2,98.6 C133.8,88.0 127.5,74.4 143.8,58.0 C148.5,53.4 154.0,51.2 159.7,51.0 C160.3,49.4 163.2,43.6 171.4,40.1 C171.4,40.1 176.1,42.5 178.8,56.2 C183.1,58.6 187.2,61.8 190.9,65.4 C194.5,69.0 197.7,73.2 200.1,77.6 C213.8,80.2 216.3,84.9 216.3,84.9 C212.7,93.1 206.9,96.0 205.4,96.6 C205.1,102.4 203.0,107.8 198.3,112.5 C181.9,128.9 168.3,122.5 157.7,114.1 C157.9,116.9 156.7,120.9 152.7,124.9 L141.0,136.5 C139.8,137.7 141.6,141.9 141.8,141.8 Z" fill="currentColor" class="octo-body"></path></svg></a><style>.github-corner:hover .octo-arm{animation:octocat-wave 560ms ease-in-out}@keyframes octocat-wave{0%,100%{transform:rotate(0)}20%,60%{transform:rotate(-25deg)}40%,80%{transform:rotate(10deg)}}@media (max-width:500px){.github-corner:hover .octo-arm{animation:none}.github-corner .octo-arm{animation:octocat-wave 560ms ease-in-out}}</style>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<!--BEGIN TITLEAREA-->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<!--BEGIN PROJECT_LOGO-->
<td id="projectlogo"><img alt="Logo" src="$relpath^$projectlogo"/></td>
<!--END PROJECT_LOGO-->
<!--BEGIN PROJECT_NAME-->
<td id="projectalign" style="padding-left: 0.5em;">
<div id="projectname">$projectname
<!--BEGIN PROJECT_NUMBER-->&#160;<span id="projectnumber">$projectnumber</span><!--END PROJECT_NUMBER-->
</div>
<!--BEGIN PROJECT_BRIEF--><div id="projectbrief">$projectbrief</div><!--END PROJECT_BRIEF-->
</td>
<!--END PROJECT_NAME-->
<!--BEGIN !PROJECT_NAME-->
<!--BEGIN PROJECT_BRIEF-->
<td style="padding-left: 0.5em;">
<div id="projectbrief">$projectbrief</div>
</td>
<!--END PROJECT_BRIEF-->
<!--END !PROJECT_NAME-->
<!--BEGIN DISABLE_INDEX-->
<!--BEGIN SEARCHENGINE-->
<td>$searchbox</td>
<!--END SEARCHENGINE-->
<!--END DISABLE_INDEX-->
</tr>
</tbody>
</table>
</div>
<!--END TITLEAREA-->
<!-- end header part -->


@@ -490,12 +490,57 @@ void BLIS_EXPORT_BLAS cblas_strsm(enum CBLAS_ORDER Order, enum CBLAS_SIDE Side,
enum CBLAS_DIAG Diag, f77_int M, f77_int N,
float alpha, const float *A, f77_int lda,
float *B, f77_int ldb);
/** \addtogroup APIS BLIS Extension API
* @{
*/
/** \addtogroup INTERFACE CBLAS INTERFACE
* \ingroup APIS
* @{
*/
/**
* sgemmt computes a scalar-matrix-matrix product with general matrices and adds the result to the upper or lower triangular part of a scalar-matrix product.
* Only a triangular part of the square result matrix is accessed and updated.
* The operation is defined as
* C := alpha*Mat(A) * Mat(B) + beta*C,
* where:
* Mat(X) is one of Mat(X) = X, or Mat(X) = \f$X^T\f$, or Mat(X) = \f$X^H\f$,
* alpha and beta are scalars,
* A, B and C are matrices:
* Mat(A) is an nxk matrix,
* Mat(B) is a kxn matrix,
* C is an nxn upper or lower triangular matrix.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] Uplo Specifies whether the upper or lower triangular part of the matrix C is used. CblasUpper or CblasLower
* @param[in] TransA Specifies the form of Mat(A) used in the matrix multiplication:
* if TransA = CblasNoTrans, then Mat(A) = A;
* if TransA = CblasTrans, then Mat(A) = \f$A^T\f$;
* if TransA = CblasConjTrans, then Mat(A) = \f$A^H\f$.
* @param[in] TransB Specifies the form of Mat(B) used in the matrix multiplication:
* if TransB = CblasNoTrans, then Mat(B) = B;
* if TransB = CblasTrans, then Mat(B) = \f$B^T\f$;
* if TransB = CblasConjTrans, then Mat(B) = \f$B^H\f$.
* @param[in] N Specifies the order of the matrix C.
* @param[in] K Specifies the number of columns of the matrix Mat(A) and the number of rows of the matrix Mat(B).
* @param[in] alpha Specifies the scalar alpha.
* @param[in] A Array holding the float matrix A.
* @param[in] lda Specifies the leading dimension of A.
* @param[in] B Array holding the float matrix B.
* @param[in] ldb Specifies the leading dimension of B.
* @param[in] beta Specifies the scalar beta.
* @param[in,out] C Array holding the float matrix C.
* @param[in] ldc Specifies the leading dimension of C.
* @return None
*/
void BLIS_EXPORT_BLAS cblas_sgemmt(enum CBLAS_ORDER Order, enum CBLAS_UPLO Uplo,
enum CBLAS_TRANSPOSE TransA, enum CBLAS_TRANSPOSE TransB,
f77_int N, f77_int K, float alpha, const float *A,
f77_int lda, const float *B, f77_int ldb,
float beta, float *C, f77_int ldc);
/** @}*/
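Reviewer note: the GEMMT operation documented above can be illustrated with a small reference routine. This is only a sketch under stated assumptions: `ref_dgemmt` and `ref_uplo_t` are hypothetical names (not BLIS or CBLAS symbols), storage is column-major, and no transposition is applied. It shows that only the selected triangle of C is read and written.

```c
#include <assert.h>

/* Hypothetical names for this sketch; they are not BLIS or CBLAS symbols. */
typedef enum { REF_UPPER, REF_LOWER } ref_uplo_t;

/*
 * Reference GEMMT, column-major storage, no transposition:
 *   C := alpha*A*B + beta*C, restricted to the selected triangle of C.
 * A is n x k, B is k x n, C is n x n.
 */
void ref_dgemmt(ref_uplo_t uplo, int n, int k,
                double alpha, const double *a, int lda,
                const double *b, int ldb,
                double beta, double *c, int ldc)
{
    assert(n >= 0 && k >= 0);
    assert(lda >= n && ldb >= k && ldc >= n);

    for (int j = 0; j < n; ++j) {
        /* Rows of column j that lie inside the selected triangle. */
        int i_begin = (uplo == REF_UPPER) ? 0 : j;
        int i_end   = (uplo == REF_UPPER) ? j + 1 : n;

        for (int i = i_begin; i < i_end; ++i) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p)
                acc += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

With A = [1 2; 3 4], B = [5 6; 7 8], alpha = 1, beta = 0, C initialised to ones and the lower triangle selected, the routine writes 19, 43 and 50 into the lower triangle and leaves the strictly upper entry C(0,1) at 1.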
void BLIS_EXPORT_BLAS cblas_dgemm(enum CBLAS_ORDER Order, enum CBLAS_TRANSPOSE TransA,
enum CBLAS_TRANSPOSE TransB, f77_int M, f77_int N,
f77_int K, double alpha, const double *A,
@@ -525,12 +570,51 @@ void BLIS_EXPORT_BLAS cblas_dtrsm(enum CBLAS_ORDER Order, enum CBLAS_SIDE Side,
enum CBLAS_DIAG Diag, f77_int M, f77_int N,
double alpha, const double *A, f77_int lda,
double *B, f77_int ldb);
/** \addtogroup INTERFACE CBLAS INTERFACE
* @{
*/
/**
* dgemmt computes a scalar-matrix-matrix product with general matrices and adds the result to the upper or lower triangular part of a scalar-matrix product.
* Only a triangular part of the square result matrix is accessed and updated.
* The operation is defined as
* C := alpha*Mat(A) * Mat(B) + beta*C,
* where:
* Mat(X) is one of Mat(X) = X, or Mat(X) = \f$X^T\f$, or Mat(X) = \f$X^H\f$,
* alpha and beta are scalars,
* A, B and C are matrices:
* Mat(A) is an nxk matrix,
* Mat(B) is a kxn matrix,
* C is an nxn upper or lower triangular matrix.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] Uplo Specifies whether the upper or lower triangular part of the matrix C is used. CblasUpper or CblasLower
* @param[in] TransA Specifies the form of Mat(A) used in the matrix multiplication:
* if TransA = CblasNoTrans, then Mat(A) = A;
* if TransA = CblasTrans, then Mat(A) = \f$A^T\f$;
* if TransA = CblasConjTrans, then Mat(A) = \f$A^H\f$.
* @param[in] TransB Specifies the form of Mat(B) used in the matrix multiplication:
* if TransB = CblasNoTrans, then Mat(B) = B;
* if TransB = CblasTrans, then Mat(B) = \f$B^T\f$;
* if TransB = CblasConjTrans, then Mat(B) = \f$B^H\f$.
* @param[in] N Specifies the order of the matrix C.
* @param[in] K Specifies the number of columns of the matrix Mat(A) and the number of rows of the matrix Mat(B).
* @param[in] alpha Specifies the scalar alpha.
* @param[in] A Array holding the double matrix A.
* @param[in] lda Specifies the leading dimension of A.
* @param[in] B Array holding the double matrix B.
* @param[in] ldb Specifies the leading dimension of B.
* @param[in] beta Specifies the scalar beta.
* @param[in,out] C Array holding the double matrix C.
* @param[in] ldc Specifies the leading dimension of C.
* @return None
*/
void BLIS_EXPORT_BLAS cblas_dgemmt(enum CBLAS_ORDER Order, enum CBLAS_UPLO Uplo,
enum CBLAS_TRANSPOSE TransA, enum CBLAS_TRANSPOSE TransB,
f77_int N, f77_int K, double alpha, const double *A,
f77_int lda, const double *B, f77_int ldb,
double beta, double *C, f77_int ldc);
/** @}*/
void BLIS_EXPORT_BLAS cblas_cgemm(enum CBLAS_ORDER Order, enum CBLAS_TRANSPOSE TransA,
enum CBLAS_TRANSPOSE TransB, f77_int M, f77_int N,
f77_int K, const void *alpha, const void *A,
@@ -560,12 +644,51 @@ void BLIS_EXPORT_BLAS cblas_ctrsm(enum CBLAS_ORDER Order, enum CBLAS_SIDE Side,
enum CBLAS_DIAG Diag, f77_int M, f77_int N,
const void *alpha, const void *A, f77_int lda,
void *B, f77_int ldb);
/** \addtogroup INTERFACE CBLAS INTERFACE
* @{
*/
/**
* cgemmt computes a scalar-matrix-matrix product with general matrices and adds the result to the upper or lower triangular part of a scalar-matrix product.
* Only a triangular part of the square result matrix is accessed and updated.
* The operation is defined as
* C := alpha*Mat(A) * Mat(B) + beta*C,
* where:
* Mat(X) is one of Mat(X) = X, or Mat(X) = \f$X^T\f$, or Mat(X) = \f$X^H\f$,
* alpha and beta are scalars,
* A, B and C are matrices:
* Mat(A) is an nxk matrix,
* Mat(B) is a kxn matrix,
* C is an nxn upper or lower triangular matrix.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] Uplo Specifies whether the upper or lower triangular part of the matrix C is used. CblasUpper or CblasLower
* @param[in] TransA Specifies the form of Mat(A) used in the matrix multiplication:
* if TransA = CblasNoTrans, then Mat(A) = A;
* if TransA = CblasTrans, then Mat(A) = \f$A^T\f$;
* if TransA = CblasConjTrans, then Mat(A) = \f$A^H\f$.
* @param[in] TransB Specifies the form of Mat(B) used in the matrix multiplication:
* if TransB = CblasNoTrans, then Mat(B) = B;
* if TransB = CblasTrans, then Mat(B) = \f$B^T\f$;
* if TransB = CblasConjTrans, then Mat(B) = \f$B^H\f$.
* @param[in] N Specifies the order of the matrix C.
* @param[in] K Specifies the number of columns of the matrix Mat(A) and the number of rows of the matrix Mat(B).
* @param[in] alpha Specifies the scalar alpha.
* @param[in] A Array holding the scomplex matrix A.
* @param[in] lda Specifies the leading dimension of A.
* @param[in] B Array holding the scomplex matrix B.
* @param[in] ldb Specifies the leading dimension of B.
* @param[in] beta Specifies the scalar beta.
* @param[in,out] C Array holding the scomplex matrix C.
* @param[in] ldc Specifies the leading dimension of C.
* @return None
*/
void BLIS_EXPORT_BLAS cblas_cgemmt(enum CBLAS_ORDER Order, enum CBLAS_UPLO Uplo,
enum CBLAS_TRANSPOSE TransA, enum CBLAS_TRANSPOSE TransB,
f77_int N, f77_int K, const void *alpha, const void *A,
f77_int lda, const void *B, f77_int ldb,
const void *beta, void *C, f77_int ldc);
/** @}*/
void BLIS_EXPORT_BLAS cblas_zgemm(enum CBLAS_ORDER Order, enum CBLAS_TRANSPOSE TransA,
enum CBLAS_TRANSPOSE TransB, f77_int M, f77_int N,
f77_int K, const void *alpha, const void *A,
@@ -595,12 +718,51 @@ void BLIS_EXPORT_BLAS cblas_ztrsm(enum CBLAS_ORDER Order, enum CBLAS_SIDE Side,
enum CBLAS_DIAG Diag, f77_int M, f77_int N,
const void *alpha, const void *A, f77_int lda,
void *B, f77_int ldb);
/** \addtogroup INTERFACE CBLAS INTERFACE
* @{
*/
/**
* zgemmt computes a scalar-matrix-matrix product with general matrices and adds the result to the upper or lower triangular part of a scalar-matrix product.
* Only a triangular part of the square result matrix is accessed and updated.
* The operation is defined as
* C := alpha*Mat(A) * Mat(B) + beta*C,
* where:
* Mat(X) is one of Mat(X) = X, or Mat(X) = \f$X^T\f$, or Mat(X) = \f$X^H\f$,
* alpha and beta are scalars,
* A, B and C are matrices:
* Mat(A) is an nxk matrix,
* Mat(B) is a kxn matrix,
* C is an nxn upper or lower triangular matrix.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] Uplo Specifies whether the upper or lower triangular part of the matrix C is used. CblasUpper or CblasLower
* @param[in] TransA Specifies the form of Mat(A) used in the matrix multiplication:
* if TransA = CblasNoTrans, then Mat(A) = A;
* if TransA = CblasTrans, then Mat(A) = \f$A^T\f$;
* if TransA = CblasConjTrans, then Mat(A) = \f$A^H\f$.
* @param[in] TransB Specifies the form of Mat(B) used in the matrix multiplication:
* if TransB = CblasNoTrans, then Mat(B) = B;
* if TransB = CblasTrans, then Mat(B) = \f$B^T\f$;
* if TransB = CblasConjTrans, then Mat(B) = \f$B^H\f$.
* @param[in] N Specifies the order of the matrix C.
* @param[in] K Specifies the number of columns of the matrix Mat(A) and the number of rows of the matrix Mat(B).
* @param[in] alpha Specifies the scalar alpha.
* @param[in] A Array holding the dcomplex matrix A.
* @param[in] lda Specifies the leading dimension of A.
* @param[in] B Array holding the dcomplex matrix B.
* @param[in] ldb Specifies the leading dimension of B.
* @param[in] beta Specifies the scalar beta.
* @param[in,out] C Array holding the dcomplex matrix C.
* @param[in] ldc Specifies the leading dimension of C.
* @return None
*/
void BLIS_EXPORT_BLAS cblas_zgemmt(enum CBLAS_ORDER Order, enum CBLAS_UPLO Uplo,
enum CBLAS_TRANSPOSE TransA, enum CBLAS_TRANSPOSE TransB,
f77_int N, f77_int K, const void *alpha, const void *A,
f77_int lda, const void *B, f77_int ldb,
const void *beta, void *C, f77_int ldc);
/** @}*/
/*
* Routines with prefixes C and Z only
@@ -654,6 +816,40 @@ BLIS_EXPORT_BLAS double cblas_dcabs1( const void *z);
*/
// -- Batch APIs -------
/** \addtogroup INTERFACE CBLAS INTERFACE
* @{
*/
/**
* The cblas_sgemm_batch interface resembles the GEMM interface.
* Arguments are arrays of pointers to matrices and arrays of parameters.
* It batches multiple independent small GEMM operations of fixed or variable sizes into groups
* and then spawns multiple threads for the different GEMM instances within each group.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] TransA_array Array of dimension (group_count). Each element specifies the form of Mat( A ) used in the matrix multiplications of one group, as one of:
* Mat( A ) = A
* Mat( A ) = \f$A^T\f$
* Mat( A ) = \f$A^H\f$
* @param[in] TransB_array Array of dimension (group_count). Each element specifies the form of Mat( B ) used in the matrix multiplications of one group, as one of:
* Mat( B ) = B
* Mat( B ) = \f$B^T\f$
* Mat( B ) = \f$B^H\f$
* @param[in] M_array Array of dimension (group_count). Each element is the number of rows of the matrices A and C in one group.
* @param[in] N_array Array of dimension (group_count). Each element is the number of columns of the matrices B and C in one group.
* @param[in] K_array Array of dimension (group_count). Each element is the number of columns of the matrices A and the number of rows of the matrices B in one group.
* @param[in] alpha_array Array of dimension (group_count). Each element is the scalar alpha for the GEMMs in one group.
* @param[in] A Array of pointers, dimension (group_count). Each is a matrix A of float datatype.
* @param[in] lda_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices A in one group.
* @param[in] B Array of pointers, dimension (group_count). Each is a matrix B of float datatype.
* @param[in] ldb_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices B in one group.
* @param[in] beta_array Array of dimension (group_count). Each element is the scalar beta for the GEMMs in one group.
* @param[in,out] C Array of pointers, dimension (group_count). Each is a matrix C of float datatype.
* @param[in] ldc_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices C in one group.
* @param[in] group_count Specifies the total number of groups. Groups allow a batch to contain GEMMs of different sizes, where each group batches GEMMs of one fixed size.
* @param[in] group_size Array of f77_int, dimension (group_count). Each element is the number of GEMMs to be performed in one group (batch).
* @return None
*/
void BLIS_EXPORT_BLAS cblas_sgemm_batch(enum CBLAS_ORDER Order,
enum CBLAS_TRANSPOSE *TransA_array,
enum CBLAS_TRANSPOSE *TransB_array,
@@ -662,6 +858,37 @@ void BLIS_EXPORT_BLAS cblas_sgemm_batch(enum CBLAS_ORDER Order,
f77_int *lda_array, const float **B, f77_int *ldb_array,
const float *beta_array, float **C, f77_int *ldc_array,
f77_int group_count, f77_int *group_size);
/**
* The cblas_dgemm_batch interface resembles the GEMM interface.
* Arguments are arrays of pointers to matrices and arrays of parameters.
* It batches multiple independent small GEMM operations of fixed or variable sizes into groups
* and then spawns multiple threads for the different GEMM instances within each group.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] TransA_array Array of dimension (group_count). Each element specifies the form of Mat( A ) used in the matrix multiplications of one group, as one of:
* Mat( A ) = A
* Mat( A ) = \f$A^T\f$
* Mat( A ) = \f$A^H\f$
* @param[in] TransB_array Array of dimension (group_count). Each element specifies the form of Mat( B ) used in the matrix multiplications of one group, as one of:
* Mat( B ) = B
* Mat( B ) = \f$B^T\f$
* Mat( B ) = \f$B^H\f$
* @param[in] M_array Array of dimension (group_count). Each element is the number of rows of the matrices A and C in one group.
* @param[in] N_array Array of dimension (group_count). Each element is the number of columns of the matrices B and C in one group.
* @param[in] K_array Array of dimension (group_count). Each element is the number of columns of the matrices A and the number of rows of the matrices B in one group.
* @param[in] alpha_array Array of dimension (group_count). Each element is the scalar alpha for the GEMMs in one group.
* @param[in] A Array of pointers, dimension (group_count). Each is a matrix A of double datatype.
* @param[in] lda_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices A in one group.
* @param[in] B Array of pointers, dimension (group_count). Each is a matrix B of double datatype.
* @param[in] ldb_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices B in one group.
* @param[in] beta_array Array of dimension (group_count). Each element is the scalar beta for the GEMMs in one group.
* @param[in,out] C Array of pointers, dimension (group_count). Each is a matrix C of double datatype.
* @param[in] ldc_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices C in one group.
* @param[in] group_count Specifies the total number of groups. Groups allow a batch to contain GEMMs of different sizes, where each group batches GEMMs of one fixed size.
* @param[in] group_size Array of f77_int, dimension (group_count). Each element is the number of GEMMs to be performed in one group (batch).
* @return None
*/
void BLIS_EXPORT_BLAS cblas_dgemm_batch(enum CBLAS_ORDER Order,
enum CBLAS_TRANSPOSE *TransA_array,
enum CBLAS_TRANSPOSE *TransB_array,
@@ -671,6 +898,38 @@ void BLIS_EXPORT_BLAS cblas_dgemm_batch(enum CBLAS_ORDER Order,
const double **B, f77_int *ldb_array,
const double *beta_array, double **C, f77_int *ldc_array,
f77_int group_count, f77_int *group_size);
/**
* The cblas_cgemm_batch interface resembles the GEMM interface.
* Arguments are arrays of pointers to matrices and arrays of parameters.
* It batches multiple independent small GEMM operations of fixed or variable sizes into groups
* and then spawns multiple threads for the different GEMM instances within each group.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] TransA_array Array of dimension (group_count). Each element specifies the form of Mat( A ) used in the matrix multiplications of one group, as one of:
* Mat( A ) = A
* Mat( A ) = \f$A^T\f$
* Mat( A ) = \f$A^H\f$
* @param[in] TransB_array Array of dimension (group_count). Each element specifies the form of Mat( B ) used in the matrix multiplications of one group, as one of:
* Mat( B ) = B
* Mat( B ) = \f$B^T\f$
* Mat( B ) = \f$B^H\f$
* @param[in] M_array Array of dimension (group_count). Each element is the number of rows of the matrices A and C in one group.
* @param[in] N_array Array of dimension (group_count). Each element is the number of columns of the matrices B and C in one group.
* @param[in] K_array Array of dimension (group_count). Each element is the number of columns of the matrices A and the number of rows of the matrices B in one group.
* @param[in] alpha_array Array of dimension (group_count). Each element is the scalar alpha for the GEMMs in one group.
* @param[in] A Array of pointers, dimension (group_count). Each is a matrix A of scomplex datatype.
* @param[in] lda_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices A in one group.
* @param[in] B Array of pointers, dimension (group_count). Each is a matrix B of scomplex datatype.
* @param[in] ldb_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices B in one group.
* @param[in] beta_array Array of dimension (group_count). Each element is the scalar beta for the GEMMs in one group.
* @param[in,out] C Array of pointers, dimension (group_count). Each is a matrix C of scomplex datatype.
* @param[in] ldc_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices C in one group.
* @param[in] group_count Specifies the total number of groups. Groups allow a batch to contain GEMMs of different sizes, where each group batches GEMMs of one fixed size.
* @param[in] group_size Array of f77_int, dimension (group_count). Each element is the number of GEMMs to be performed in one group (batch).
* @return None
*/
void BLIS_EXPORT_BLAS cblas_cgemm_batch(enum CBLAS_ORDER Order,
enum CBLAS_TRANSPOSE *TransA_array,
enum CBLAS_TRANSPOSE *TransB_array,
@@ -679,6 +938,37 @@ void BLIS_EXPORT_BLAS cblas_cgemm_batch(enum CBLAS_ORDER Order,
f77_int *lda_array, const void **B, f77_int *ldb_array,
const void *beta_array, void **C, f77_int *ldc_array,
f77_int group_count, f77_int *group_size);
/**
* The cblas_zgemm_batch interface resembles the GEMM interface.
* Arguments are arrays of pointers to matrices and arrays of parameters.
* It batches multiple independent small GEMM operations of fixed or variable sizes into groups
* and then spawns multiple threads for the different GEMM instances within each group.
*
* @param[in] Order Storage scheme of matrices. CblasRowMajor or CblasColMajor
* @param[in] TransA_array Array of dimension (group_count). Each element specifies the form of Mat( A ) used in the matrix multiplications of one group, as one of:
* Mat( A ) = A
* Mat( A ) = \f$A^T\f$
* Mat( A ) = \f$A^H\f$
* @param[in] TransB_array Array of dimension (group_count). Each element specifies the form of Mat( B ) used in the matrix multiplications of one group, as one of:
* Mat( B ) = B
* Mat( B ) = \f$B^T\f$
* Mat( B ) = \f$B^H\f$
* @param[in] M_array Array of dimension (group_count). Each element is the number of rows of the matrices A and C in one group.
* @param[in] N_array Array of dimension (group_count). Each element is the number of columns of the matrices B and C in one group.
* @param[in] K_array Array of dimension (group_count). Each element is the number of columns of the matrices A and the number of rows of the matrices B in one group.
* @param[in] alpha_array Array of dimension (group_count). Each element is the scalar alpha for the GEMMs in one group.
* @param[in] A Array of pointers, dimension (group_count). Each is a matrix A of dcomplex datatype.
* @param[in] lda_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices A in one group.
* @param[in] B Array of pointers, dimension (group_count). Each is a matrix B of dcomplex datatype.
* @param[in] ldb_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices B in one group.
* @param[in] beta_array Array of dimension (group_count). Each element is the scalar beta for the GEMMs in one group.
* @param[in,out] C Array of pointers, dimension (group_count). Each is a matrix C of dcomplex datatype.
* @param[in] ldc_array Array of f77_int, dimension (group_count). Each element specifies the leading dimension of the matrices C in one group.
* @param[in] group_count Specifies the total number of groups. Groups allow a batch to contain GEMMs of different sizes, where each group batches GEMMs of one fixed size.
* @param[in] group_size Array of f77_int, dimension (group_count). Each element is the number of GEMMs to be performed in one group (batch).
* @return None
*/
void BLIS_EXPORT_BLAS cblas_zgemm_batch(enum CBLAS_ORDER Order,
enum CBLAS_TRANSPOSE *TransA_array,
enum CBLAS_TRANSPOSE *TransB_array,
@@ -687,6 +977,7 @@ void BLIS_EXPORT_BLAS cblas_zgemm_batch(enum CBLAS_ORDER Order,
f77_int *lda_array, const void **B, f77_int *ldb_array,
const void *beta_array, void **C, f77_int *ldc_array,
f77_int group_count, f77_int *group_size);
/** @}*/
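Reviewer note: the grouping described for the batch routines can be sketched as two nested loops, one over groups and one over the GEMMs inside each group. The routine below is a hypothetical, sequential reference (`ref_dgemm_batch` is not a BLIS symbol), limited to column-major storage without transposition. It assumes that the pointer arrays `a`, `b` and `c` hold one entry per GEMM in group order, as in other grouped batch interfaces; the parameter descriptions above state dimension (group_count) for them, so treat that layout as an assumption. Threading is omitted.

```c
#include <assert.h>

/*
 * Hypothetical sequential reference for a grouped batch GEMM
 * (column-major, no transposition). Scalars, sizes and leading
 * dimensions are indexed by group; the pointer arrays are assumed
 * to hold one entry per GEMM, in group order.
 */
void ref_dgemm_batch(const int *m_array, const int *n_array, const int *k_array,
                     const double *alpha_array,
                     const double **a, const int *lda_array,
                     const double **b, const int *ldb_array,
                     const double *beta_array,
                     double **c, const int *ldc_array,
                     int group_count, const int *group_size)
{
    assert(group_count >= 0);

    int idx = 0; /* running index into a, b and c across all groups */
    for (int g = 0; g < group_count; ++g) {
        const int m = m_array[g], n = n_array[g], k = k_array[g];
        const int lda = lda_array[g], ldb = ldb_array[g], ldc = ldc_array[g];

        /* Every GEMM in group g shares the parameters read above. */
        for (int s = 0; s < group_size[g]; ++s, ++idx) {
            for (int j = 0; j < n; ++j) {
                for (int i = 0; i < m; ++i) {
                    double acc = 0.0;
                    for (int p = 0; p < k; ++p)
                        acc += a[idx][i + p * lda] * b[idx][p + j * ldb];
                    c[idx][i + j * ldc] =
                        alpha_array[g] * acc + beta_array[g] * c[idx][i + j * ldc];
                }
            }
        }
    }
}
```

Because the per-group parameters are read once per group, a batch with variable sizes is expressed as several groups, each containing GEMMs of one fixed size.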
void BLIS_EXPORT_BLAS cblas_cgemm3m(enum CBLAS_ORDER Order, enum CBLAS_TRANSPOSE TransA,
enum CBLAS_TRANSPOSE TransB, f77_int M, f77_int N,
f77_int K, const void *alpha, const void *A,