mirror of
https://github.com/amd/blis.git
synced 2026-05-11 17:50:00 +00:00
Fixed bugs in cblas_sdsdot(), sdsdot_().
Details:
- Fixed a bug in sdsdot_sub() that redundantly added the "alpha" scalar, named 'sb'. This value was already being added by the underlying sdsdot_() function. Thus, we no longer add 'sb' within sdsdot_sub(). Thanks to Simon Lukas Märtens for reporting this bug via #367.
- Fixed a second bug in the order of typecasting intermediate products in sdsdot_(). Previously, the "alpha" scalar was being added after the "outer" typecast to float. However, the operation is supposed to first add the dot product to the (promoted) scalar and THEN downcast the sum to float. Thanks to Devin Matthews for catching this bug.
committed by
Devrajegowda, Kiran
parent
afee36b251
commit
d988a5bbd7
5
CREDITS
@@ -51,6 +51,7 @@ but many others have contributed code and feedback, including
  Ye Luo @ye-luo (Argonne National Laboratory)
  Ricardo Magana @magania (Hewlett Packard Enterprise)
  Bryan Marker @bamarker (The University of Texas at Austin)
+ Simon Lukas Märtens @ACSimon33 (RWTH Aachen University)
  Devin Matthews @devinamatthews (The University of Texas at Austin)
  Stefanos Mavros @smavros
  Nisanth Padinharepatt (AMD)
@@ -60,7 +61,7 @@ but many others have contributed code and feedback, including
  Ilya Polkovnichenko
  Jack Poulson @poulson (Stanford)
  Mathieu Poumeyrol @kali
- Christos Psarras @ChrisPsa (RWTH-Aachen)
+ Christos Psarras @ChrisPsa (RWTH Aachen University)
  @qnerd
  Michael Rader @mrader1248
  Pradeep Rao @pradeeptrgit (AMD)
@@ -74,7 +75,7 @@ but many others have contributed code and feedback, including
  Nathaniel Smith @njsmith
  Shaden Smith @ShadenSmith
  Tyler Smith @tlrmchlsmth (The University of Texas at Austin)
- Paul Springer @springer13 (RWTH-Aachen)
+ Paul Springer @springer13 (RWTH Aachen University)
  Adam J. Stewart @adamjstewart (University of Illinois at Urbana-Champaign)
  Vladimir Sukarev
  Santanu Thangaraj (AMD)
@@ -264,10 +264,16 @@ float PASTEF77(sd,sdot)
       const float* y, const f77_int* incy
     )
 {
-    float r = ( float )PASTEF77(d,sdot)( n,
-                                         x, incx,
-                                         y, incy );
-    return r + *sb;
+    return ( float )
+    (
+      ( double )(*sb) +
+      PASTEF77(d,sdot)
+      (
+        n,
+        x, incx,
+        y, incy
+      )
+    );
 }
 
 // Input vectors stored in single precision, computed in double precision,
@@ -75,7 +75,7 @@ void PASTEF772(sds,dot,sub)
       float* rval
     )
 {
-    *rval = *sb + PASTEF77(sds,dot)
+    *rval = PASTEF77(sds,dot)
     (
       n,
       sb,