VMLAL (by scalar)
Vector Multiply Accumulate Long (by scalar)
Vector Multiply Accumulate Long multiplies elements of a vector by a scalar, and adds the products to corresponding elements of the destination vector. The destination vector elements are twice as long as the elements that are multiplied.
For more information about scalars see Advanced SIMD scalars.
Depending on settings in the CPACR, NSACR, and HCPTR registers, and the Security state and PE mode in which the instruction is executed, an attempt to execute the instruction might be undefined, or trapped to Hyp mode. For more information see Enabling Advanced SIMD and floating-point support.
Related encodings: See Advanced SIMD data-processing for the T32 instruction set, or Advanced SIMD data-processing for the A32 instruction set.
If CPSR.DIT is 1 and this instruction passes its condition execution check:
The execution time of this instruction is independent of:The values of the data supplied in any of its registers.The values of the NZCV flags.
The response of this instruction to asynchronous exceptions does not vary based on:The values of the data supplied in any of its registers.The values of the NZCV flags.
It has encodings from the following instruction sets:
A32 (
A1
)
and
T32 (
T1
)
.
1
1
1
1
0
0
1
1
!= 11
0
0
1
0
1
0
VMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]>
if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); add = (op == '0'); floating_point = FALSE; long_destination = TRUE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1;
integer esize;
integer elements;
integer m;
integer index;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
1
1
1
1
1
1
1
1
!= 11
0
0
1
0
1
0
VMLAL{<c>}{<q>}.<dt> <Qd>, <Dn>, <Dm[x]>
if size == '11' then SEE "Related encodings";
if size == '00' || Vd<0> == '1' then UNDEFINED;
unsigned = (U == '1'); add = (op == '0'); floating_point = FALSE; long_destination = TRUE;
d = UInt(D:Vd); n = UInt(N:Vn); regs = 1;
integer esize;
integer elements;
integer m;
integer index;
if size == '01' then esize = 16; elements = 4; m = UInt(Vm<2:0>); index = UInt(M:Vm<3>);
if size == '10' then esize = 32; elements = 2; m = UInt(Vm); index = UInt(M);
<c>
For encoding A1: see Standard assembler syntax fields. This encoding must be unconditional.
<c>
For encoding T1: see Standard assembler syntax fields.
<q>
See Standard assembler syntax fields.
<dt>
Is the data type for the scalar and the elements of the operand vector,
U
size
<dt>
0
01
S16
0
10
S32
1
01
U16
1
10
U32
<Qd>
Is the 128-bit name of the SIMD&FP register holding the accumulate vector, encoded in the "D:Vd" field as <Qd>*2.
<Dn>
Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.
<Dm[x]>
Is the 64-bit name of the second SIMD&FP source register holding the scalar. If <dt> is S16 or U16, Dm is restricted to D0-D7. Dm is encoded in "Vm<2:0>", and x is encoded in "M:Vm<3>". If <dt> is S32 or U32, Dm is restricted to D0-D15. Dm is encoded in "Vm", and x is encoded in "M".
if ConditionPassed() then
EncodingSpecificOperations(); CheckAdvSIMDEnabled();
op2 = Elem[Din[m],index,esize]; op2val = Int(op2, unsigned);
for r = 0 to regs-1
for e = 0 to elements-1
op1 = Elem[Din[n+r],e,esize]; op1val = Int(op1, unsigned);
if floating_point then
fp_addend = (if add then FPMul(op1,op2,StandardFPSCRValue())
else FPNeg(FPMul(op1,op2,StandardFPSCRValue())));
Elem[D[d+r],e,esize] = FPAdd(Elem[Din[d+r],e,esize], fp_addend,
StandardFPSCRValue());
else
addend = if add then op1val*op2val else -op1val*op2val;
if long_destination then
Elem[Q[d>>1],e,2*esize] = Elem[Qin[d>>1],e,2*esize] + addend;
else
Elem[D[d+r],e,esize] = Elem[Din[d+r],e,esize] + addend;