16-bit scalar API
- group scalar_s16_api
Functions
-
int32_t s16_to_s32(exponent_t *a_exp, const int16_t b, const exponent_t b_exp, const unsigned remove_hr)
Convert a 16-bit floating-point scalar to a 32-bit floating-point scalar.
Converts a 16-bit floating-point scalar, represented by the 16-bit mantissa b and exponent b_exp, into a 32-bit floating-point scalar, represented by the 32-bit returned mantissa and output exponent a_exp.
remove_hr, if nonzero, indicates that the output mantissa should have no headroom. Otherwise, the output mantissa will be the same as the input mantissa.
- Parameters:
a_exp – [out] Output exponent
b – [in] 16-bit input mantissa
b_exp – [in] Input exponent
remove_hr – [in] Whether to remove headroom in output
- Returns:
32-bit output mantissa
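The conversion described above can be illustrated with a small reference model. This is only a sketch under stated assumptions, not the library's implementation: the name `sketch_s16_to_s32` is hypothetical, `exponent_t` is assumed to be a plain `int`, and "headroom" is taken to mean the number of times the mantissa can be doubled without overflowing 32 bits. The sketch leaves a zero mantissa unchanged; the library may treat that case differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int exponent_t; /* assumption: the exponent type is a plain int */

/* Hypothetical reference model, not the library function itself. */
static int32_t sketch_s16_to_s32(exponent_t *a_exp, int16_t b,
                                 exponent_t b_exp, unsigned remove_hr)
{
    assert(a_exp != NULL);

    int32_t a = b;        /* widening keeps the value unchanged */
    exponent_t e = b_exp;

    if (remove_hr && a != 0) {
        /* Double the mantissa while the doubled value still fits in int32_t,
           lowering the exponent each time so a * 2^e stays constant. */
        while (a >= -(INT32_C(1) << 30) && a < (INT32_C(1) << 30)) {
            a *= 2;
            e -= 1;
        }
    }

    *a_exp = e;
    return a;
}
```

In this model, an input of `b = 1`, `b_exp = 0` with `remove_hr` set yields the mantissa `0x40000000` and exponent `-30`, which still represents the value 1.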
-
int16_t s16_inverse(exponent_t *a_exp, const int16_t b)
Compute the inverse of a 16-bit integer.
b represents the integer \(b\). a and a_exp together represent the result \(a \cdot 2^{a\_exp}\).
- Operation Performed
- \[\begin{aligned} a \cdot 2^{a\_exp} \leftarrow \frac{1}{b} \end{aligned}\]
- Parameters:
a_exp – [out] Output exponent \(a\_exp\)
b – [in] Input integer \(b\)
- Returns:
Output mantissa \(a\)
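One way to picture the operation \(a \cdot 2^{a\_exp} \leftarrow 1/b\) is a fixed-point division that picks the largest scale whose quotient still fits in 16 bits. The following is a minimal sketch under assumptions: `sketch_s16_inverse` is a hypothetical name, `exponent_t` is assumed to be `int`, the quotient is truncated rather than rounded, and a zero input is rejected. The library's actual rounding, exponent choice, and zero handling are not stated in this section and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int exponent_t; /* assumption: the exponent type is a plain int */

/* Hypothetical reference model, not the library function itself. */
static int16_t sketch_s16_inverse(exponent_t *a_exp, int16_t b)
{
    assert(a_exp != NULL);
    assert(b != 0); /* 1/0 is undefined; this sketch simply rejects it */

    int64_t mag = (b < 0) ? -(int64_t)b : (int64_t)b;

    /* 2^14 / mag always fits in int16_t because mag >= 1.
       Grow the scale while the next larger quotient still fits. */
    int s = 14;
    while (((INT64_C(1) << (s + 1)) / mag) <= INT16_MAX) {
        s++;
    }

    int64_t q = (INT64_C(1) << s) / mag; /* truncating division */

    *a_exp = -s; /* result is q * 2^(-s), approximately 1/|b| */
    return (int16_t)((b < 0) ? -q : q);
}
```

For `b = 3` this model returns the mantissa 21845 with exponent -16, that is \(21845 \cdot 2^{-16} \approx 0.33333\).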
-
int16_t s16_mul(exponent_t *a_exp, const int16_t b, const int16_t c, const exponent_t b_exp, const exponent_t c_exp)
Compute the product of two 16-bit floating-point scalars.
a and a_exp together represent the result \(a \cdot 2^{a\_exp}\).
b and b_exp together represent the first input \(b \cdot 2^{b\_exp}\).
c and c_exp together represent the second input \(c \cdot 2^{c\_exp}\).
- Operation Performed
- \[\begin{aligned} a \cdot 2^{a\_exp} \leftarrow \left( b\cdot 2^{b\_exp} \right) \cdot \left( c\cdot 2^{c\_exp} \right) \end{aligned}\]
- Parameters:
a_exp – [out] Output exponent \(a\_exp\)
b – [in] First input mantissa \(b\)
c – [in] Second input mantissa \(c\)
b_exp – [in] First input exponent \(b\_exp\)
c_exp – [in] Second input exponent \(c\_exp\)
- Returns:
Output mantissa \(a\)
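The product above follows from multiplying the mantissas and adding the exponents: \(\left(b \cdot 2^{b\_exp}\right) \cdot \left(c \cdot 2^{c\_exp}\right) = (b \cdot c) \cdot 2^{b\_exp + c\_exp}\). Because \(b \cdot c\) can need up to 31 bits, it has to be scaled back into 16 bits. The sketch below shows one way to do that. It is an illustration only: `sketch_s16_mul` is a hypothetical name, `exponent_t` is assumed to be `int`, and the shift-until-it-fits policy with truncation is an assumption rather than the library's documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int exponent_t; /* assumption: the exponent type is a plain int */

/* Hypothetical reference model, not the library function itself. */
static int16_t sketch_s16_mul(exponent_t *a_exp, int16_t b, int16_t c,
                              exponent_t b_exp, exponent_t c_exp)
{
    assert(a_exp != NULL);

    /* The exact product of two int16_t values always fits in int32_t. */
    int32_t p = (int32_t)b * (int32_t)c;
    exponent_t e = b_exp + c_exp;

    /* Halve the mantissa until it fits in 16 bits, raising the exponent to
       compensate. Division truncates toward zero; a real implementation
       might round instead. */
    while (p > INT16_MAX || p < INT16_MIN) {
        p /= 2;
        e += 1;
    }

    *a_exp = e;
    return (int16_t)p;
}
```

For example, multiplying \(-300 \cdot 2^{0}\) by \(300 \cdot 2^{0}\) in this model gives the mantissa -22500 with exponent 2, which equals -90000 exactly. Products whose low bits are shifted out lose precision.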