# 3.59 128-bit Multiply

The following code computes the 128-bit product of two 64-bit signed values x and y and stores the result in memory:

```c
typedef __int128 int128_t;
void store_prod(int128_t *dest, int64_t x, int64_t y) {
    *dest = x * (int128_t) y;
}
```

GCC generates the following assembly code implementing the computation:

```
store_prod:
    movq     %rdx, %rax
    cqto
    movq     %rsi, %rcx
    sarq     $63, %rcx
    imulq     %rax, %rcx
    imulq     %rsi, %rdx
    addq     %rdx, %rcx
    mulq     %rsi
    addq     %rcx, %rdx
    movq     %rax, (%rdi)
    movq     %rdx, 8(%rdi)
    ret
```

This code uses three multiplications for the multiprecision arithmetic required to implement 128-bit arithmetic on a 64-bit machine. Describe the algorithm used to compute the product, and annotate the assembly code to show how it realizes your algorithm. *Hint:* When extending arguments of x and y to 128 bits, they can be rewritten as $$x=2^{64}\cdot x\_h+x\_l$$and $$y=2^{64}\cdot y\_h+y\_l$$ , where $$x\_h$$ , $$x\_l$$ , $$y\_h$$ , and $$y\_l$$ are 64-bit values. Similary, the 128-bit product can be written as $$p=2^{64}\cdot p\_h+p\_l$$ , where $$p\_h$$ , and $$p\_l$$ are 64-bit values. Show how the code computes the values of $$p\_h$$ and $$p\_l$$ in terms of $$x\_h$$ , $$x\_l$$ , $$y\_h$$ and $$y\_l$$ .

Solve:

assume:

$$
ux=x+x\_{63}\cdot 2^{64}\\
uy=y+y\_{63}\cdot 2^{64}
$$

then:

$$
ux\cdot uy=(x+x\_{63}\cdot 2^{64})\cdot (y+y\_{63}\cdot 2^{64})\\
\= x\cdot y+x\cdot y\_{63}\cdot 2^{64}+y\cdot x\_{63}\cdot 2^{64}+x\_{63}\cdot y\_{63}\cdot 2^{128}
$$

$$2^{128}$$ overflows and don't care. So:

$$
x\cdot y=ux\cdot uy-(x\cdot y\_{63}\cdot 2^{64}+y\cdot x\_{63}\cdot 2^{64})
$$

```
store_prod:
    ; dest in %rdi, x in %rsi, y in %rdx
    movq     %rdx, %rax         ; %rax = y
    cqto                        ; Singend Extend %rax=>%rdx:%rax, so %rdx = y_h, -1 if y_63=1, and 0 if y_63=0
    movq     %rsi, %rcx         ; %rcx = x
    sarq     $63, %rcx          ; %rcx = x >> 63, so %rcx = x_h, -1 if x_63=1, and 0 if x_63=0
    imulq    %rax, %rcx         ; %rcx = %rax * %rcx = y * -x_63
    imulq    %rsi, %rdx         ; %rdx = %rsi * %rdx = x * -y_63
    addq     %rdx, %rcx         ; %rcx = %rdx + %rcx = x * -y_63 + y * -x_63
    mulq     %rsi               ; %rdx:%rax = %rax * %rsi = uy * ux, because mulq is unsigned full multiply
    addq     %rcx, %rdx         ; %rdx = %rcx + %rdx = uy * ux + x * -y_63 + y * -x_63
    movq     %rax, (%rdi)       ; set lower 64 bits
    movq     %rdx, 8(%rdi)      ; set higher 64 bits
    ret
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://valineliu.gitbook.io/deuterium-wiki/reading/cs-jing-dian-shu-ji/csapp-3e-homework-solution/3.-machine-level-representation-of-programs/3.59-128-bit-multiply.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
