Details:
ymm registers, which hold 8 float values, were being used where only 4
float values are needed. Changed those registers from ymm to xmm in the
required places. The problem is exposed only when the leading dimension
is greater than the actual dimension.
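
Illustration only, not the kernel code from this change: a minimal plain-C
sketch of one way an 8-wide access can go wrong when the leading dimension
exceeds the actual dimension. The names (store_col, clobbered_padding, M,
LDA, N, PAD) and the sizes are hypothetical, and memcpy stands in for the
vector store (8 floats modelling ymm, 4 floats modelling xmm). It assumes
the symptom is a store that spills into the padding rows between columns;
the real failure mode may differ.

```c
#include <string.h>

#define M   4        /* actual number of rows per column (assumed)      */
#define LDA 8        /* leading dimension > M: rows M..LDA-1 are padding */
#define N   2        /* number of columns (assumed)                      */
#define PAD (-1.0f)  /* sentinel written into every element beforehand   */

/* Stand-in for a vector store: copies `width` consecutive floats.
   width == 8 models a ymm store, width == 4 models an xmm store. */
static void store_col(float *dst, const float *src, int width)
{
    memcpy(dst, src, (size_t)width * sizeof(float));
}

/* Stores N columns into a column-major buffer with leading dimension LDA
   and returns how many padding elements were overwritten. */
static int clobbered_padding(int width)
{
    float c[LDA * N];
    /* Only the first M lanes carry valid results. */
    const float result[8] = {1.0f, 2.0f, 3.0f, 4.0f, 0.0f, 0.0f, 0.0f, 0.0f};
    int bad = 0;

    for (int i = 0; i < LDA * N; i++)
        c[i] = PAD;

    for (int j = 0; j < N; j++)
        store_col(&c[j * LDA], result, width);

    /* Count padding rows that no longer hold the sentinel. */
    for (int j = 0; j < N; j++)
        for (int i = M; i < LDA; i++)
            if (c[j * LDA + i] != PAD)
                bad++;

    return bad;
}
```

Under these assumptions, clobbered_padding(8) reports overwritten padding
elements while clobbered_padding(4) reports none, which is the kind of
difference that stays hidden when the columns are stored back to back.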
Change-Id: I39f04eac18c4fa3a8c93048c977d6a83aa92b800