32-Bit Vector Chunk (8-Element) API
- group chunk32_api
Functions
-
int32_t chunk_s32_dot(const int32_t b[VPU_INT32_EPV], const q2_30 c[VPU_INT32_EPV])
Compute the inner product between two vector chunks.
This function computes the inner product of two vector chunks, \(\bar b\) and \(\bar c\).
Conceptually, elements of \(\bar b\) may have any number of fractional bits (int, fixed-point, mantissas of a BFP vector) so long as they’re all the same. Elements of \(\bar c\) are Q2.30 fixed-point values. Given that, the returned value \(a\) will have the same number of fractional bits as \(\bar b\).
Only the lowest 32 bits of the sum \(a\) are returned.
- Operation Performed
- \[\begin{aligned} & a \leftarrow \sum_{k=0}^{\mathtt{VPU\_INT32\_EPV}-1} \left( round\left( \frac{b_k\cdot{}c_k}{2^{30}} \right) \right) \end{aligned}\]
- Parameters:
b – [in] Input chunk \(\bar b\)
c – [in] Input chunk \(\bar c\)
- Returns:
\(a\)
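The operation above can be sketched in portable C as a reference model. This is only an illustration and not the library's implementation: `CHUNK_EPV` is a local stand-in for `VPU_INT32_EPV`, `ref_chunk_s32_dot` is a hypothetical name, and the rounding mode (ties rounded toward positive infinity) is an assumption, since the text only says "round".

```c
#include <stdint.h>

#define CHUNK_EPV 8  /* local stand-in for VPU_INT32_EPV (8 elements per chunk) */

/* Hypothetical reference model of the documented operation.
 * c[] holds Q2.30 values stored as int32_t.
 * Assumes >> on a negative int64_t is an arithmetic shift, which holds on
 * common compilers but is implementation-defined in ISO C. */
static int32_t ref_chunk_s32_dot(const int32_t b[CHUNK_EPV],
                                 const int32_t c[CHUNK_EPV])
{
    int64_t acc = 0;
    for (int k = 0; k < CHUNK_EPV; k++) {
        int64_t p = (int64_t)b[k] * (int64_t)c[k];
        acc += (p + (1LL << 29)) >> 30;   /* round(b_k * c_k / 2^30) */
    }
    /* Only the lowest 32 bits of the sum are kept. */
    return (int32_t)(uint32_t)acc;
}
```

With every element of `c` set to `1 << 30` (1.0 in Q2.30), the result is simply the sum of the elements of `b`, in the same fixed-point format as `b`.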
-
void chunk_s32_log(q8_24 a[VPU_INT32_EPV], const int32_t b[VPU_INT32_EPV], const exponent_t b_exp)
Compute the natural log of a vector chunk of 32-bit values.
This function computes the natural logarithm of each of the 8 elements in vector chunk \(\bar b\). The result is returned as an 8-element chunk \(\bar a\) of Q8.24 values.
`b_exp` is the exponent associated with the elements of \(\bar b\).
Any input \(b_k \le 0\) will result in a corresponding output \(a_k = \mathtt{INT32\_MIN}\).
- Operation Performed
- \[\begin{split}\begin{aligned} & a_k \leftarrow \ \begin{cases} log(b_k\cdot{}2^{\mathtt{b\_exp}}) & b_k > 0 \\ \mathtt{INT32\_MIN} & \text{otherwise} \\ \end{cases} \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}\]
- Parameters:
a – [out] Output vector chunk \(\bar a\)
b – [in] Input vector chunk \(\bar b\)
b_exp – [in] Exponent associated with \(\bar b\)
- Throws ET_LOAD_STORE:
Raised if `b` or `a` is not double word-aligned (See Note: Vector Alignment)
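A caller usually needs to turn the Q8.24 outputs back into real values and to notice the `INT32_MIN` sentinel. The helper below is a minimal sketch of that decoding step; `q8_24_decode` is a hypothetical name and not part of the library.

```c
#include <stdint.h>

/* Hypothetical helper: decode one Q8.24 result element.
 * Returns 0 if the element is the INT32_MIN sentinel (the input was <= 0),
 * otherwise stores a_k * 2^-24 in *out and returns 1. */
static int q8_24_decode(int32_t a_k, double *out)
{
    if (a_k == INT32_MIN)
        return 0;
    *out = (double)a_k / 16777216.0;   /* divide by 2^24 */
    return 1;
}
```

On the input side, each element represents the value \(b_k\cdot{}2^{\mathtt{b\_exp}}\), so for example `b[k] = 1 << 30` with `b_exp = -30` represents 1.0, whose natural logarithm is 0.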
-
void chunk_float_s32_log(q8_24 a[VPU_INT32_EPV], const float_s32_t b[VPU_INT32_EPV])
Compute the natural log of a vector chunk of `float_s32_t`.
This function computes the natural logarithm of each of the `VPU_INT32_EPV` elements in vector chunk \(\bar b\). The result is returned as an 8-element chunk \(\bar a\) of Q8.24 values.
Any input \(b_k \le 0\) will result in a corresponding output \(a_k = \mathtt{INT32\_MIN}\).
- Operation Performed
- \[\begin{split}\begin{aligned} & a_k \leftarrow \ \begin{cases} log(b_k) & b_k > 0 \\ \mathtt{INT32\_MIN} & \text{otherwise} \\ \end{cases} \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}\]
- Parameters:
a – [out] Output vector chunk \(\bar a\)
b – [in] Input vector chunk \(\bar b\)
- Throws ET_LOAD_STORE:
Raised if `b` or `a` is not double word-aligned (See Note: Vector Alignment)
-
void chunk_q30_power_series(int32_t a[VPU_INT32_EPV], const q2_30 b[VPU_INT32_EPV], const int32_t c[], const unsigned term_count)
Compute a power series on a vector chunk of Q2.30 values.
This function is used to compute a power series summation on a vector chunk (`VPU_INT32_EPV`-element vector) \(\bar b\). \(\bar b\) contains Q2.30 values. \(\bar c\) is a vector containing coefficients to be multiplied by powers of \(\bar b\), and may have any associated exponent. The output is vector chunk \(\bar a\) and has the same exponent as \(\bar c\).
`c[]` is an array with shape `(term_count, VPU_INT32_EPV)`, where the second axis contains the same value replicated across all `VPU_INT32_EPV` elements. That is, `c[k][i] = c[k][j]` for `i` and `j` in `0..(VPU_INT32_EPV-1)`. This is for performance reasons. (For the purpose of this explanation, \(\bar c\) is considered to be single-dimensional, without redundancy.)
- Operation Performed
- \[\begin{split}\begin{aligned} & b_{k,0} = 2^{30} \\ & b_{k,i} = round\left(\frac{b_{k,i-1}\cdot{}b_k}{2^{30}}\right) \\ & \qquad\text{for }i \in {1..(N-1)} \\ & a_k \leftarrow \sum_{i=0}^{N-1} round\left( \frac{b_{k,i}\cdot c_i}{2^{30}} \right) \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}\]
- Parameters:
a – [out] Output vector chunk \(\bar a\)
b – [in] Input vector chunk \(\bar b\)
c – [in] Coefficient vector \(\bar c\)
term_count – [in] Number of power series terms, \(N\)
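The replicated coefficient layout and the recurrence above can be sketched in portable C. This is a reference model under stated assumptions, not the library's implementation: `CHUNK_EPV` stands in for `VPU_INT32_EPV`, the function names are hypothetical, ties are rounded toward positive infinity, and no saturation is modelled, so inputs are assumed to satisfy \(|b_k| \le 2^{30}\).

```c
#include <stdint.h>

#define CHUNK_EPV 8  /* local stand-in for VPU_INT32_EPV */

/* round(x / 2^30); assumes arithmetic right shift of negative values. */
static int64_t round_shr30(int64_t x)
{
    return (x + (1LL << 29)) >> 30;
}

/* Build the (term_count x CHUNK_EPV) table expected for c[]:
 * row i holds coefficient i replicated across all elements. */
static void replicate_coefs(int32_t *table, const int32_t *coef,
                            unsigned term_count)
{
    for (unsigned i = 0; i < term_count; i++)
        for (unsigned k = 0; k < CHUNK_EPV; k++)
            table[i * CHUNK_EPV + k] = coef[i];
}

/* Hypothetical reference model of the documented operation. */
static void ref_q30_power_series(int32_t a[CHUNK_EPV],
                                 const int32_t b[CHUNK_EPV],
                                 const int32_t c[], unsigned term_count)
{
    for (unsigned k = 0; k < CHUNK_EPV; k++) {
        int64_t bpow = 1LL << 30;          /* b_{k,0} = 2^30 */
        int64_t acc = 0;
        for (unsigned i = 0; i < term_count; i++) {
            acc += round_shr30(bpow * c[i * CHUNK_EPV + k]);
            bpow = round_shr30(bpow * b[k]);   /* next power of b_k */
        }
        a[k] = (int32_t)acc;
    }
}
```

Because the output shares the exponent of the coefficients, the caller chooses that exponent when quantising the coefficients and carries it along with the result.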
-
void chunk_q30_exp_small(q2_30 a[VPU_INT32_EPV], const q2_30 b[VPU_INT32_EPV])
Compute \(e^b\) on a vector chunk of Q2.30 values.
This function computes \(e^{b_k}\) for each element of a vector chunk (`VPU_INT32_EPV`-element vector) \(\bar b\) of Q2.30 values near \(0\). The result is computed using the power series approximation of \(e^x\) near zero. It is recommended that this function only be used for \( -0.5 \le b_k\cdot{}2^{-30} \le 0.5\).
The output vector chunk \(\bar a\) is also in a Q2.30 format.
- Operation Performed
- \[\begin{split}\begin{aligned} & a_k \leftarrow e^{b_k\cdot{}2^{-30}} \\ & \qquad\text{for }k \in {0..\mathtt{VPU\_INT32\_EPV}-1} \end{aligned}\end{split}\]
- Parameters:
a – [out] Output vector chunk \(\bar a\)
b – [in] Input vector chunk \(\bar b\)
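Since the recommended input range is \(-0.5 \le b_k\cdot{}2^{-30} \le 0.5\), a caller may want to check a chunk before using this function. The check below is a minimal sketch; `q30_chunk_in_exp_small_range` is a hypothetical helper and `CHUNK_EPV` is a local stand-in for `VPU_INT32_EPV`.

```c
#include <stdint.h>

#define CHUNK_EPV 8  /* local stand-in for VPU_INT32_EPV */

/* Hypothetical helper: returns 1 if every Q2.30 element lies in
 * [-0.5, 0.5], i.e. in [-2^29, 2^29] as a raw integer, else 0. */
static int q30_chunk_in_exp_small_range(const int32_t b[CHUNK_EPV])
{
    const int32_t half = 1 << 29;   /* 0.5 in Q2.30 */
    for (int k = 0; k < CHUNK_EPV; k++) {
        if (b[k] < -half || b[k] > half)
            return 0;
    }
    return 1;
}
```

Inputs outside this range would need some form of range reduction before the call; how that is done is left to the caller.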
-