VUSDOT (vector)
Dot Product vector form with mixed-sign integers
Dot Product vector form with mixed-sign integers. This instruction performs the dot product of the four unsigned 8-bit integer values in each 32-bit element of the first source register with the four signed 8-bit integer values in the corresponding 32-bit element of the second source register, accumulating the result into the corresponding 32-bit element of the destination register.
From Armv8.2, this is an optional instruction. ID_ISAR6.I8MM indicates whether this instruction is supported in the T32 and A32 instruction sets.
It has encodings from the following instruction sets:
A32 (
A1
)
and
T32 (
T1
)
.
1
1
1
1
1
1
0
0
1
1
0
1
1
0
1
0
0
VUSDOT{<q>}.S8 <Dd>, <Dn>, <Dm>
1
VUSDOT{<q>}.S8 <Qd>, <Qn>, <Qm>
if !HaveAArch32Int8MatMulExt() then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
integer d = UInt(D:Vd);
integer n = UInt(N:Vn);
integer m = UInt(M:Vm);
integer regs = if Q == '1' then 2 else 1;
1
1
1
1
1
1
0
0
1
1
0
1
1
0
1
0
0
VUSDOT{<q>}.S8 <Dd>, <Dn>, <Dm>
1
VUSDOT{<q>}.S8 <Qd>, <Qn>, <Qm>
if InITBlock() then UNPREDICTABLE;
if !HaveAArch32Int8MatMulExt() then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED;
integer d = UInt(D:Vd);
integer n = UInt(N:Vn);
integer m = UInt(M:Vm);
integer regs = if Q == '1' then 2 else 1;
<q>
See Standard assembler syntax fields.
<Qd>
Is the 128-bit name of the SIMD&FP third source and destination register, encoded in the "D:Vd" field as <Qd>*2.
<Qn>
Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.
<Qm>
Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2.
<Dd>
Is the 64-bit name of the SIMD&FP third source and destination register, encoded in the "D:Vd" field.
<Dn>
Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm>
Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.
CheckAdvSIMDEnabled();
bits(64) operand1;
bits(64) operand2;
bits(64) result;
for r = 0 to regs-1
operand1 = Din[n+r];
operand2 = Din[m+r];
result = Din[d+r];
for e = 0 to 1
bits(32) res = Elem[result, e, 32];
for b = 0 to 3
element1 = UInt(Elem[operand1, 4 * e + b, 8]);
element2 = SInt(Elem[operand2, 4 * e + b, 8]);
res = res + element1 * element2;
Elem[result, e, 32] = res;
D[d+r] = result;