From df673028969b69f8a6f5762d965f6eddfaeeee48 Mon Sep 17 00:00:00 2001
From: "Field G. Van Zee"
Date: Wed, 5 Jun 2019 11:43:55 -0500
Subject: [PATCH] Tweaked language in README.md related to sup/AMD.

---
 README.md | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index ffcf9dfcc..c69a13730 100644
--- a/README.md
+++ b/README.md
@@ -79,16 +79,16 @@ and [other educational projects](http://www.ulaff.net/) (such as MOOCs).
 What's New
 ----------
 
- * **Small/skinny matrix support for dgemm now available!** Thanks to funding
-from AMD, we have dramatically accelerated `gemm` for double-precision real
-matrix problems where one or two dimensions is exceedingly small. A natural
-byproduct of this optimization is that the traditional case of small _m = n = k_
-(i.e. square matrices) is also accelerated, even though it was not targeted
-specifically. And though only `dgemm` was optimized for now, support for other
-datatypes, other operations, and/or multithreading may be implemented in the
-future. We've also added a new [PerformanceSmall](docs/PerformanceSmall.md)
-document to showcase the improvement in performance when some matrix dimensions
-are small.
+ * **Small/skinny matrix support for dgemm now available!** Thanks to
+contributions made possible by our partnership with AMD, we have dramatically
+accelerated `gemm` for double-precision real matrix problems where one or two
+dimensions is exceedingly small. A natural byproduct of this optimization is
+that the traditional case of small _m = n = k_ (i.e. square matrices) is also
+accelerated, even though it was not targeted specifically. And though only
+`dgemm` was optimized for now, support for other datatypes, other operations,
+and/or multithreading may be implemented in the future. We've also added a new
+[PerformanceSmall](docs/PerformanceSmall.md) document to showcase the
+improvement in performance when some matrix dimensions are small.
 
* **Performance comparisons now available!** We recently measured the performance of various level-3 operations on a variety of hardware architectures,