3.59 128-bit Multiply
★★
The following code computes the 128-bit product of two 64-bit signed values x and y and stores the result in memory:
typedef __int128 int128_t;
void store_prod(int128_t *dest, int64_t x, int64_t y) {
*dest = x * (int128_t) y;
}
GCC generates the following assembly code implementing the computation:
store_prod:
movq %rdx, %rax
cqto
movq %rsi, %rcx
sarq $63, %rcx
imulq %rax, %rcx
imulq %rsi, %rdx
addq %rdx, %rcx
mulq %rsi
addq %rcx, %rdx
movq %rax, (%rdi)
movq %rdx, 8(%rdi)
ret
This code uses three multiplications for the multiprecision arithmetic required to implement 128-bit arithmetic on a 64-bit machine. Describe the algorithm used to compute the product, and annotate the assembly code to show how it realizes your algorithm. Hint: When extending arguments of x and y to 128 bits, they can be rewritten as and , where , , , and are 64-bit values. Similary, the 128-bit product can be written as , where , and are 64-bit values. Show how the code computes the values of and in terms of , , and .
Solve:
assume:
then:
overflows and don't care. So:
store_prod:
; dest in %rdi, x in %rsi, y in %rdx
movq %rdx, %rax ; %rax = y
cqto ; Singend Extend %rax=>%rdx:%rax, so %rdx = y_h, -1 if y_63=1, and 0 if y_63=0
movq %rsi, %rcx ; %rcx = x
sarq $63, %rcx ; %rcx = x >> 63, so %rcx = x_h, -1 if x_63=1, and 0 if x_63=0
imulq %rax, %rcx ; %rcx = %rax * %rcx = y * -x_63
imulq %rsi, %rdx ; %rdx = %rsi * %rdx = x * -y_63
addq %rdx, %rcx ; %rcx = %rdx + %rcx = x * -y_63 + y * -x_63
mulq %rsi ; %rdx:%rax = %rax * %rsi = uy * ux, because mulq is unsigned full multiply
addq %rcx, %rdx ; %rdx = %rcx + %rdx = uy * ux + x * -y_63 + y * -x_63
movq %rax, (%rdi) ; set lower 64 bits
movq %rdx, 8(%rdi) ; set higher 64 bits
ret
Last updated
Was this helpful?