Fixed bugs in cblas_sdsdot(), sdsdot_().

Details:
- Fixed a bug in sdsdot_sub() that redundantly added the "alpha" scalar, named 'sb'. This value was already being added by the underlying sdsdot_() function. Thus, we no longer add 'sb' within sdsdot_sub(). Thanks to Simon Lukas Märtens for reporting this bug via #367.
- Fixed a second bug in order of typecasting intermediate products in sdsdot_(). Previously, the "alpha" scalar was being added after the "outer" typecast to float. However, the operation is supposed to first add the dot product to the (promoted) scalar and THEN downcast the sum to float. Thanks to Devin Matthews for catching this bug.
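
The effect of the typecast order can be seen in a small standalone sketch. It is not the BLIS code: dsdot_ref(), sdsdot_old(), and sdsdot_new() are hypothetical names, the vectors are assumed to have unit stride, and the f77 calling convention is dropped for brevity. It only contrasts "downcast, then add" with "add in double, then downcast".

```c
#include <stddef.h>

/* Hypothetical stand-in for dsdot_(): float inputs, double accumulation.
   Assumes unit stride; the real routine takes incx/incy. */
static double dsdot_ref(size_t n, const float *x, const float *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += (double)x[i] * (double)y[i];
    return acc;
}

/* Old order: downcast the dot product to float, then add sb in float.
   This rounds twice, so low-order bits of the sum can be lost. */
static float sdsdot_old(size_t n, float sb, const float *x, const float *y)
{
    float r = (float)dsdot_ref(n, x, y);
    return r + sb;
}

/* New order: promote sb, add in double, then downcast the sum once. */
static float sdsdot_new(size_t n, float sb, const float *x, const float *y)
{
    return (float)((double)sb + dsdot_ref(n, x, y));
}
```

As an illustration, assuming IEEE 754 arithmetic with round-to-nearest-even: for x = y = { 1, 2^-12 } and sb = 2^-24, the double-precision dot product is 1 + 2^-24. The old order rounds that to 1.0f, and adding sb in float rounds to 1.0f again. The new order forms 1 + 2^-23 in double, which is exactly representable as a float, so the two results differ in the last bit.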
2026-05-11 17:50:00 +00:00 · 2019-12-16 16:30:26 -06:00
parent afee36b251
commit d988a5bbd7
3 changed files with 14 additions and 7 deletions
--- a/5
+++ b/5
@@ -51,6 +51,7 @@ but many others have contributed code and feedback, including
  Ye Luo                   @ye-luo             (Argonne National Laboratory)
  Ricardo Magana           @magania            (Hewlett Packard Enterprise)
  Bryan Marker             @bamarker           (The University of Texas at Austin)
+  Simon Lukas Märtens      @ACSimon33          (RWTH Aachen University)
  Devin Matthews           @devinamatthews     (The University of Texas at Austin)
  Stefanos Mavros          @smavros
  Nisanth Padinharepatt                        (AMD)
@@ -60,7 +61,7 @@ but many others have contributed code and feedback, including
  Ilya Polkovnichenko
  Jack Poulson             @poulson            (Stanford)
  Mathieu Poumeyrol        @kali
-  Christos Psarras         @ChrisPsa           (RWTH-Aachen)
+  Christos Psarras         @ChrisPsa           (RWTH Aachen University)
                           @qnerd
  Michael Rader            @mrader1248
  Pradeep Rao              @pradeeptrgit       (AMD)
@@ -74,7 +75,7 @@ but many others have contributed code and feedback, including
  Nathaniel Smith          @njsmith
  Shaden Smith             @ShadenSmith
  Tyler Smith              @tlrmchlsmth        (The University of Texas at Austin)
-  Paul Springer            @springer13         (RWTH-Aachen)
+  Paul Springer            @springer13         (RWTH Aachen University)
  Adam J. Stewart          @adamjstewart       (University of Illinois at Urbana-Champaign)
  Vladimir Sukarev
  Santanu Thangaraj                            (AMD)
--- a/frame/compat/bla_dot.c
+++ b/frame/compat/bla_dot.c
@@ -264,10 +264,16 @@ float PASTEF77(sd,sdot)
       const float*   y, const f77_int* incy
     )
 {
-	float r = ( float )PASTEF77(d,sdot)( n,
-	                                     x, incx,
-	                                     y, incy );
-	return r + *sb;
+	return ( float )
+	       (
+	         ( double )(*sb) +
+	         PASTEF77(d,sdot)
+	         (
+	           n,
+	           x, incx,
+	           y, incy
+	         )
+	       );
 }

 // Input vectors stored in single precision, computed in double precision,
--- a/frame/compat/cblas/f77_sub/f77_dot_sub.c
+++ b/frame/compat/cblas/f77_sub/f77_dot_sub.c
@@ -75,7 +75,7 @@ void PASTEF772(sds,dot,sub)
             float*   rval
     )
 {
-	*rval = *sb + PASTEF77(sds,dot)
+	*rval = PASTEF77(sds,dot)
 	(
 	  n,
 	  sb,