Fixed bugs in cblas_sdsdot(), sdsdot_().

Details:
- Fixed a bug in sdsdot_sub() that redundantly added the "alpha" scalar,
  named 'sb'. This value was already being added by the underlying
  sdsdot_() function. Thus, we no longer add 'sb' within sdsdot_sub().
  Thanks to Simon Lukas Märtens for reporting this bug via #367.
- Fixed a second bug in the order of typecasting intermediate products in
  sdsdot_(). Previously, the "alpha" scalar was being added after the
  "outer" typecast to float. However, the operation is supposed to first
  add the dot product to the (promoted) scalar and THEN downcast the sum
  to float. Thanks to Devin Matthews for catching this bug.
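
The two fixes above can be illustrated with a minimal, standalone sketch.
It is not the BLIS source: dsdot_ref(), sdsdot_old(), sdsdot_new(),
sdsdot_sub_old() and sdsdot_sub_new() are hypothetical names chosen for
this example, unit strides are assumed, and the double-precision
accumulation loop is only a stand-in for the underlying dsdot_() routine.

```c
#include <stddef.h>

/* Hypothetical stand-in for dsdot_(): float inputs, double accumulation. */
double dsdot_ref(size_t n, const float *x, const float *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += (double)x[i] * (double)y[i];
    return acc;
}

/* Old order: downcast the dot product first, then add sb in float.
   This rounds twice, so low-order bits can be lost before sb is added. */
float sdsdot_old(size_t n, float sb, const float *x, const float *y)
{
    float r = (float)dsdot_ref(n, x, y);
    float s = r + sb;
    return s;
}

/* New order: promote sb, add in double, then downcast the sum once. */
float sdsdot_new(size_t n, float sb, const float *x, const float *y)
{
    return (float)((double)sb + dsdot_ref(n, x, y));
}

/* Old wrapper behavior: sb is added again on top of a result that
   already includes it, so the scalar is counted twice. */
void sdsdot_sub_old(size_t n, float sb, const float *x, const float *y,
                    float *rval)
{
    *rval = sb + sdsdot_new(n, sb, x, y);
}

/* Fixed wrapper behavior: simply forward the result. */
void sdsdot_sub_new(size_t n, float sb, const float *x, const float *y,
                    float *rval)
{
    *rval = sdsdot_new(n, sb, x, y);
}
```

With x = y = { 4096, 1 } and sb = 1, the double-precision dot product is
16777217 (2^24 + 1). The old order rounds it to 16777216 before adding sb
and returns 16777216, while the new order returns the exact sum 16777218.
With x = { 1, 2 }, y = { 3, 4 } and sb = 0.5, the old wrapper yields 12.0
where the fixed wrapper yields 11.5.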
This commit is contained in:
Field G. Van Zee
2019-12-16 16:30:26 -06:00
committed by Devrajegowda, Kiran
parent afee36b251
commit d988a5bbd7
3 changed files with 14 additions and 7 deletions


@@ -51,6 +51,7 @@ but many others have contributed code and feedback, including
Ye Luo @ye-luo (Argonne National Laboratory)
Ricardo Magana @magania (Hewlett Packard Enterprise)
Bryan Marker @bamarker (The University of Texas at Austin)
+Simon Lukas Märtens @ACSimon33 (RWTH Aachen University)
Devin Matthews @devinamatthews (The University of Texas at Austin)
Stefanos Mavros @smavros
Nisanth Padinharepatt (AMD)
@@ -60,7 +61,7 @@ but many others have contributed code and feedback, including
Ilya Polkovnichenko
Jack Poulson @poulson (Stanford)
Mathieu Poumeyrol @kali
-Christos Psarras @ChrisPsa (RWTH-Aachen)
+Christos Psarras @ChrisPsa (RWTH Aachen University)
@qnerd
Michael Rader @mrader1248
Pradeep Rao @pradeeptrgit (AMD)
@@ -74,7 +75,7 @@ but many others have contributed code and feedback, including
Nathaniel Smith @njsmith
Shaden Smith @ShadenSmith
Tyler Smith @tlrmchlsmth (The University of Texas at Austin)
-Paul Springer @springer13 (RWTH-Aachen)
+Paul Springer @springer13 (RWTH Aachen University)
Adam J. Stewart @adamjstewart (University of Illinois at Urbana-Champaign)
Vladimir Sukarev
Santanu Thangaraj (AMD)


@@ -264,10 +264,16 @@ float PASTEF77(sd,sdot)
const float* y, const f77_int* incy
)
{
-float r = ( float )PASTEF77(d,sdot)( n,
-x, incx,
-y, incy );
-return r + *sb;
+return ( float )
+(
+( double )(*sb) +
+PASTEF77(d,sdot)
+(
+n,
+x, incx,
+y, incy
+)
+);
}
// Input vectors stored in single precision, computed in double precision,


@@ -75,7 +75,7 @@ void PASTEF772(sds,dot,sub)
float* rval
)
{
-*rval = *sb + PASTEF77(sds,dot)
+*rval = PASTEF77(sds,dot)
(
n,
sb,