BLAS functions
Overview
A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.
// global functions

status dnnl::sgemm(
    char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha, const float* A, dnnl_dim_t lda, const float* B, dnnl_dim_t ldb,
    float beta, float* C, dnnl_dim_t ldc
    );

status dnnl::gemm_u8s8s32(
    char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha, const uint8_t* A, dnnl_dim_t lda, uint8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co
    );

status dnnl::gemm_s8s8s32(
    char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha, const int8_t* A, dnnl_dim_t lda, int8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co
    );

dnnl_status_t DNNL_API dnnl_sgemm(
    char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha, const float* A, dnnl_dim_t lda, const float* B, dnnl_dim_t ldb,
    float beta, float* C, dnnl_dim_t ldc
    );

dnnl_status_t DNNL_API dnnl_gemm_u8s8s32(
    char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha, const uint8_t* A, dnnl_dim_t lda, uint8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co
    );

dnnl_status_t DNNL_API dnnl_gemm_s8s8s32(
    char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K,
    float alpha, const int8_t* A, dnnl_dim_t lda, int8_t ao,
    const int8_t* B, dnnl_dim_t ldb, int8_t bo,
    float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co
    );
Detailed Documentation
A subset of Basic Linear Algebra Subprograms (BLAS) functions that perform matrix-matrix multiplication.
Global Functions
status dnnl::sgemm( char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float* A, dnnl_dim_t lda, const float* B, dnnl_dim_t ldb, float beta, float* C, dnnl_dim_t ldc )
Performs single-precision matrix-matrix multiply.
The operation is defined as:
C := alpha * op( A ) * op( B ) + beta * C
where
op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
op( A ) is an MxK matrix,
op( B ) is a KxN matrix,
C is an MxN matrix.
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Parameters:
transa 
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed. 
transb 
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed. 
M 
The M dimension. 
N 
The N dimension. 
K 
The K dimension. 
alpha 
The alpha parameter that is used to scale the product of matrices A and B. 
A 
A pointer to the A matrix data. 
lda 
The leading dimension for the matrix A. 
B 
A pointer to the B matrix data. 
ldb 
The leading dimension for the matrix B. 
beta 
The beta parameter that is used to scale the matrix C. 
C 
A pointer to the C matrix data. 
ldc 
The leading dimension for the matrix C. 
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
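To make the definition above concrete, here is a minimal reference sketch of the same operation in plain C++. It does not call oneDNN and is not the library's implementation: `ref_sgemm`, `op_at`, and the `dim_t` alias (a stand-in for `dnnl_dim_t`) are hypothetical names introduced for illustration. It assumes row-major storage as described above and validates nothing beyond non-negative dimensions.

```cpp
#include <cassert>
#include <cstdint>

using dim_t = std::int64_t;  // assumption: stand-in for dnnl_dim_t

// Hypothetical helper: element (i, j) of op(X), where X is stored row-major
// with leading dimension ld. For a transposed X the indices are swapped.
inline float op_at(const float *X, dim_t ld, char trans, dim_t i, dim_t j) {
    const bool transposed = (trans == 'T' || trans == 't');
    return transposed ? X[j * ld + i] : X[i * ld + j];
}

// Naive triple loop for C := alpha * op(A) * op(B) + beta * C.
void ref_sgemm(char transa, char transb, dim_t M, dim_t N, dim_t K,
               float alpha, const float *A, dim_t lda,
               const float *B, dim_t ldb,
               float beta, float *C, dim_t ldc) {
    assert(M >= 0 && N >= 0 && K >= 0);
    for (dim_t i = 0; i < M; ++i) {
        for (dim_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (dim_t k = 0; k < K; ++k)
                acc += op_at(A, lda, transa, i, k) * op_at(B, ldb, transb, k, j);
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}
```

For a 2x3 matrix A and a 3x2 matrix B stored contiguously, lda = 3 and ldb = ldc = 2. Passing transa = 'T' with A stored as a 3x2 array (lda = 2) describes the same op( A ) and yields the same product.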
status dnnl::gemm_u8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t* A, dnnl_dim_t lda, uint8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
op( A ) is an MxK matrix,
op( B ) is a KxN matrix,
C is an MxN matrix.
A_offset is an MxK matrix with every element equal to the ao value,
B_offset is a KxN matrix with every element equal to the bo value,
C_offset is an MxN matrix defined by the co array of size len:
if offsetc = F: len must be at least 1,
if offsetc = C: len must be at least max(1, M),
if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa 
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed. 
transb 
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed. 
offsetc 
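Flag specifying how offsets should be applied to matrix C:
‘F’ means that a single fixed offset is applied to every element of C (co holds at least 1 value),
‘C’ means that a column of offsets is used (co holds at least max(1, M) values),
‘R’ means that a row of offsets is used (co holds at least max(1, N) values).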
M 
The M dimension. 
N 
The N dimension. 
K 
The K dimension. 
alpha 
The alpha parameter that is used to scale the product of matrices A and B. 
A 
A pointer to the A matrix data. 
lda 
The leading dimension for the matrix A. 
ao 
The offset value for the matrix A. 
B 
A pointer to the B matrix data. 
ldb 
The leading dimension for the matrix B. 
bo 
The offset value for the matrix B. 
beta 
The beta parameter that is used to scale the matrix C. 
C 
A pointer to the C matrix data. 
ldc 
The leading dimension for the matrix C. 
co 
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
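The offset arithmetic in the definition above can be sketched as a plain C++ reference loop. This is an illustration only, not the library's code: `ref_gemm_u8s8s32_fixed` and `dim_t` are hypothetical names, the sketch handles only the untransposed case with offsetc = F, and the conversion of the scaled result to int32 via `std::nearbyint` is an assumption, since the text above does not state how rounding is done.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

using dim_t = std::int64_t;  // assumption: stand-in for dnnl_dim_t

// Reference for C := alpha * (A - ao) * (B - bo) + beta * C + co[0],
// with A (MxK, u8), B (KxN, s8), C (MxN, s32), all row-major, no transposition.
void ref_gemm_u8s8s32_fixed(dim_t M, dim_t N, dim_t K, float alpha,
                            const std::uint8_t *A, dim_t lda, std::uint8_t ao,
                            const std::int8_t *B, dim_t ldb, std::int8_t bo,
                            float beta, std::int32_t *C, dim_t ldc,
                            const std::int32_t *co) {
    assert(lda >= K && ldb >= N && ldc >= N);
    for (dim_t i = 0; i < M; ++i) {
        for (dim_t j = 0; j < N; ++j) {
            // Subtract the offsets in 32-bit, then accumulate in 32-bit.
            std::int32_t acc = 0;
            for (dim_t k = 0; k < K; ++k) {
                const std::int32_t a = std::int32_t(A[i * lda + k]) - std::int32_t(ao);
                const std::int32_t b = std::int32_t(B[k * ldb + j]) - std::int32_t(bo);
                acc += a * b;
            }
            // Assumption: scale in floating point, then round to nearest.
            const double v = double(alpha) * acc + double(beta) * C[i * ldc + j] + co[0];
            C[i * ldc + j] = static_cast<std::int32_t>(std::nearbyint(v));
        }
    }
}
```

With ao = 0 and bo = 0 the loop reduces to an ordinary integer matrix product, which is a convenient sanity check when comparing against another implementation.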
status dnnl::gemm_s8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t* A, dnnl_dim_t lda, int8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
op( A ) is an MxK matrix,
op( B ) is a KxN matrix,
C is an MxN matrix.
A_offset is an MxK matrix with every element equal to the ao value,
B_offset is a KxN matrix with every element equal to the bo value,
C_offset is an MxN matrix defined by the co array of size len:
if offsetc = F: len must be at least 1,
if offsetc = C: len must be at least max(1, M),
if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa 
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed. 
transb 
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed. 
offsetc 
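Flag specifying how offsets should be applied to matrix C:
‘F’ means that a single fixed offset is applied to every element of C (co holds at least 1 value),
‘C’ means that a column of offsets is used (co holds at least max(1, M) values),
‘R’ means that a row of offsets is used (co holds at least max(1, N) values).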
M 
The M dimension. 
N 
The N dimension. 
K 
The K dimension. 
alpha 
The alpha parameter that is used to scale the product of matrices A and B. 
A 
A pointer to the A matrix data. 
lda 
The leading dimension for the matrix A. 
ao 
The offset value for the matrix A. 
B 
A pointer to the B matrix data. 
ldb 
The leading dimension for the matrix B. 
bo 
The offset value for the matrix B. 
beta 
The beta parameter that is used to scale the matrix C. 
C 
A pointer to the C matrix data. 
ldc 
The leading dimension for the matrix C. 
co 
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
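The size rules for the co array can be captured in a small helper, sketched below. `min_co_length` and `c_offset_at` are hypothetical names, not oneDNN functions. The lengths come directly from the description above. The element mapping in `c_offset_at` (index by row for C, by column for R) is an assumption inferred from those lengths, and only the uppercase flag values shown above are handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using dim_t = std::int64_t;  // assumption: stand-in for dnnl_dim_t

// Minimum number of elements required in co for a given offsetc flag.
// Returns -1 for a flag value not listed in the description.
dim_t min_co_length(char offsetc, dim_t M, dim_t N) {
    switch (offsetc) {
        case 'F': return 1;
        case 'C': return std::max<dim_t>(1, M);
        case 'R': return std::max<dim_t>(1, N);
        default: return -1;
    }
}

// Assumed mapping from co to element (i, j) of the MxN matrix C_offset:
// 'F' uses co[0] everywhere, 'C' uses one value per row index i
// (a column vector of length M), 'R' uses one value per column index j.
std::int32_t c_offset_at(char offsetc, const std::int32_t *co, dim_t i, dim_t j) {
    assert(offsetc == 'F' || offsetc == 'C' || offsetc == 'R');
    if (offsetc == 'C') return co[i];
    if (offsetc == 'R') return co[j];
    return co[0];
}
```

A caller could use `min_co_length` to size the co buffer before invoking the integer GEMM, for example allocating max(1, N) values when one offset per output column is wanted.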
dnnl_status_t DNNL_API dnnl_sgemm( char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float* A, dnnl_dim_t lda, const float* B, dnnl_dim_t ldb, float beta, float* C, dnnl_dim_t ldc )
Performs single-precision matrix-matrix multiply.
The operation is defined as:
C := alpha * op( A ) * op( B ) + beta * C
where
op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
op( A ) is an MxK matrix,
op( B ) is a KxN matrix,
C is an MxN matrix.
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Parameters:
transa 
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed. 
transb 
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed. 
M 
The M dimension. 
N 
The N dimension. 
K 
The K dimension. 
alpha 
The alpha parameter that is used to scale the product of matrices A and B. 
A 
A pointer to the A matrix data. 
lda 
The leading dimension for the matrix A. 
B 
A pointer to the B matrix data. 
ldb 
The leading dimension for the matrix B. 
beta 
The beta parameter that is used to scale the matrix C. 
C 
A pointer to the C matrix data. 
ldc 
The leading dimension for the matrix C. 
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
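The parameter list above names lda, ldb, and ldc without giving their bounds. The sketch below derives lower bounds from the row-major layout alone: a row of the stored matrix must hold at least as many elements as the matrix has columns. `min_leading_dim` and `sgemm_leading_dims_ok` are hypothetical helpers, the `max(1, ...)` floor is an assumption borrowed from common BLAS practice, and the library's own validation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using dim_t = std::int64_t;  // assumption: stand-in for dnnl_dim_t

inline bool is_trans(char t) { return t == 'T' || t == 't'; }

// Smallest leading dimension for a row-major X such that op(X) is rows x cols.
// Untransposed, X is stored as rows x cols, so each stored row has cols elements.
// Transposed, X is stored as cols x rows, so each stored row has rows elements.
dim_t min_leading_dim(char trans, dim_t rows, dim_t cols) {
    assert(rows >= 0 && cols >= 0);
    return std::max<dim_t>(1, is_trans(trans) ? rows : cols);
}

// Checks lda, ldb, ldc against the bounds above for op(A): MxK, op(B): KxN, C: MxN.
bool sgemm_leading_dims_ok(char transa, char transb, dim_t M, dim_t N, dim_t K,
                           dim_t lda, dim_t ldb, dim_t ldc) {
    return lda >= min_leading_dim(transa, M, K)
        && ldb >= min_leading_dim(transb, K, N)
        && ldc >= std::max<dim_t>(1, N);
}
```

A leading dimension larger than the minimum describes a sub-matrix embedded in a wider buffer, which is the usual reason the parameter exists.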
dnnl_status_t DNNL_API dnnl_gemm_u8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t* A, dnnl_dim_t lda, uint8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
op( A ) is an MxK matrix,
op( B ) is a KxN matrix,
C is an MxN matrix.
A_offset is an MxK matrix with every element equal to the ao value,
B_offset is a KxN matrix with every element equal to the bo value,
C_offset is an MxN matrix defined by the co array of size len:
if offsetc = F: len must be at least 1,
if offsetc = C: len must be at least max(1, M),
if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa 
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed. 
transb 
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed. 
offsetc 
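Flag specifying how offsets should be applied to matrix C:
‘F’ means that a single fixed offset is applied to every element of C (co holds at least 1 value),
‘C’ means that a column of offsets is used (co holds at least max(1, M) values),
‘R’ means that a row of offsets is used (co holds at least max(1, N) values).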
M 
The M dimension. 
N 
The N dimension. 
K 
The K dimension. 
alpha 
The alpha parameter that is used to scale the product of matrices A and B. 
A 
A pointer to the A matrix data. 
lda 
The leading dimension for the matrix A. 
ao 
The offset value for the matrix A. 
B 
A pointer to the B matrix data. 
ldb 
The leading dimension for the matrix B. 
bo 
The offset value for the matrix B. 
beta 
The beta parameter that is used to scale the matrix C. 
C 
A pointer to the C matrix data. 
ldc 
The leading dimension for the matrix C. 
co 
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.
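The warning above about intermediate saturation can be illustrated with a small model. One way such saturation can arise is when two adjacent u8 x s8 products are summed into a signed 16-bit intermediate that clamps instead of wrapping. The sketch below models only that arithmetic in plain C++. `pair_dot_exact` and `pair_dot_saturating16` are hypothetical names, and whether a given oneDNN build or CPU takes such a path is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Exact 32-bit value of a0*b0 + a1*b1 for unsigned 8-bit a and signed 8-bit b.
std::int32_t pair_dot_exact(std::uint8_t a0, std::int8_t b0,
                            std::uint8_t a1, std::int8_t b1) {
    return std::int32_t(a0) * std::int32_t(b0) + std::int32_t(a1) * std::int32_t(b1);
}

// The same sum clamped to the int16 range, modeling a saturating
// 16-bit intermediate accumulator.
std::int16_t pair_dot_saturating16(std::uint8_t a0, std::int8_t b0,
                                   std::uint8_t a1, std::int8_t b1) {
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    const std::int32_t clamped = std::min(hi, std::max(lo, pair_dot_exact(a0, b0, a1, b1)));
    assert(clamped >= lo && clamped <= hi);  // result fits in int16
    return static_cast<std::int16_t>(clamped);
}
```

With a0 = a1 = 255 and b0 = b1 = 127 the exact sum is 64770, while the clamped intermediate is 32767, so the final int32 result would differ from the mathematical one. Keeping the magnitudes of A or B smaller is one way to stay inside the 16-bit range in this model.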
dnnl_status_t DNNL_API dnnl_gemm_s8s8s32( char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t* A, dnnl_dim_t lda, int8_t ao, const int8_t* B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t* C, dnnl_dim_t ldc, const int32_t* co )
Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
op( X ) = X or op( X ) = X**T,
alpha and beta are scalars, and
A, B, and C are matrices:
op( A ) is an MxK matrix,
op( B ) is a KxN matrix,
C is an MxN matrix.
A_offset is an MxK matrix with every element equal to the ao value,
B_offset is a KxN matrix with every element equal to the bo value,
C_offset is an MxN matrix defined by the co array of size len:
if offsetc = F: len must be at least 1,
if offsetc = C: len must be at least max(1, M),
if offsetc = R: len must be at least max(1, N).
The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
Note
This API does not support XERBLA. Instead, unlike the standard BLAS functions, this one returns a dnnl_status_t value to allow error handling.
Warning
On some architectures saturation may happen during intermediate computations, which would lead to unexpected results. For more details, refer to Nuances of int8 Computations.
Parameters:
transa 
Transposition flag for matrix A: ‘N’ or ‘n’ means A is not transposed, and ‘T’ or ‘t’ means that A is transposed. 
transb 
Transposition flag for matrix B: ‘N’ or ‘n’ means B is not transposed, and ‘T’ or ‘t’ means that B is transposed. 
offsetc 
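Flag specifying how offsets should be applied to matrix C:
‘F’ means that a single fixed offset is applied to every element of C (co holds at least 1 value),
‘C’ means that a column of offsets is used (co holds at least max(1, M) values),
‘R’ means that a row of offsets is used (co holds at least max(1, N) values).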
M 
The M dimension. 
N 
The N dimension. 
K 
The K dimension. 
alpha 
The alpha parameter that is used to scale the product of matrices A and B. 
A 
A pointer to the A matrix data. 
lda 
The leading dimension for the matrix A. 
ao 
The offset value for the matrix A. 
B 
A pointer to the B matrix data. 
ldb 
The leading dimension for the matrix B. 
bo 
The offset value for the matrix B. 
beta 
The beta parameter that is used to scale the matrix C. 
C 
A pointer to the C matrix data. 
ldc 
The leading dimension for the matrix C. 
co 
An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc.
Returns:
dnnl_success / dnnl::status::success on success and a status describing the error otherwise.