
VectorMath: add ScalarMax and ScaleAdd functions

Karl Wette requested to merge ANU-CGA/lalsuite:vectorops-funcs into master

Description

This MR adds 2 sets of functions to the VectorMath module in LAL:

  • ScalarMax: finds the (scalar) maximum M of a vector x_i over all array elements: M = \max_i x_i
  • ScaleAdd: performs a multiply-add over vectors x_i, y_i, z_i with scale factor a: z_i = a x_i + y_i
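For reference, the two operations can be sketched as plain scalar loops in C. The names `scalar_max_ref` and `scale_add_ref`, their signatures, and the use of `float` are assumptions made for illustration only; they are not the functions added by this MR, which are vectorised.

```c
#include <stddef.h>

/* Hypothetical reference routines: names and signatures are illustrative
   only and are not the LAL VectorMath API. */

/* Return the maximum element M = max_i x[i]; n must be at least 1. */
float scalar_max_ref(const float *x, size_t n)
{
    float m = x[0];
    for (size_t i = 1; i < n; ++i) {
        if (x[i] > m) {
            m = x[i];
        }
    }
    return m;
}

/* Compute z[i] = a * x[i] + y[i] for every i in 0..n-1. */
void scale_add_ref(float *z, float a, const float *x, const float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        z[i] = a * x[i] + y[i];
    }
}
```

A loop like this is mainly useful as a baseline to compare a vectorised implementation against.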

Note: despite the name, the ScaleAdd functions do not use the AVX FMA instruction set. Because FMA instructions round only once, at the end, rather than also rounding the intermediate product, the values they compute can differ significantly from those computed using the usual IEEE multiply-then-add arithmetic (I found that differences at the ~0.1% level were easy to produce). There also wasn't much of a performance improvement over using the equivalent SSE/AVX intrinsics for multiplication and addition.
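The rounding difference can be illustrated with a small C sketch. It is not code from this MR: both function names are hypothetical, and the single-rounding path is emulated in double precision (where the product of two floats is exact) rather than using a hardware FMA instruction, so it only mimics fused behaviour for inputs like the ones shown.

```c
/* Hypothetical helpers for illustration; not part of LAL. */

/* Two roundings: the product is rounded to float, then the sum is rounded. */
float mul_then_add(float a, float x, float y)
{
    /* volatile forces the product to be stored as a float before the add */
    volatile float p = a * x;
    return p + y;
}

/* One rounding at the end, emulated in double precision. */
float fused_emulated(float a, float x, float y)
{
    /* Exact: two 24-bit significands multiply into at most 48 bits,
       which fits in the 53-bit significand of a double. */
    double exact = (double)a * (double)x;
    return (float)(exact + (double)y);
}
```

With a = x = 1 + 2^-12 and y = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. Rounding it to single precision discards the 2^-24 term, so `mul_then_add` returns 0, while `fused_emulated` returns 2^-24. The gap is largest when a x_i nearly cancels y_i, as it does here.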

Also includes miscellaneous other fixes.

API Changes and Justification

Backwards Compatible Changes

  • This change does not modify any class/function/struct/type definitions in a public C header file or any Python class/function definitions
  • This change adds new classes/functions/structs/types to a public C header file or Python module

Backwards Incompatible Changes

  • This change modifies an existing class/function/struct/type definition in a public C header file or Python module
  • This change removes an existing class/function/struct/type from a public C header file or Python module

Review Status

n/a
