gcc actually produces assembler and assembles it using the as assembler. Not all compilers do this - the MS compilers produce object code directly, though you can make them generate assembler output. Translating assembler to object code is a pretty simple process, at least compared with compilation.
Some compilers produce other high-level language code as their output - for example, cfront, the first C++ compiler, produced C as its output, which was then compiled by a C compiler.
Note that neither direct compilation nor assembly actually produces an executable. That is done by the linker, which takes the various object code files produced by compilation/assembly, resolves all the names they contain and produces the final executable binary.
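To make the stages above concrete, here is a minimal sketch in C. The file names greet.c and main.o and the function greet are made up for illustration, and the commands in the comment assume a typical GNU toolchain on Linux. The point is that the call to puts stays an unresolved name in the object file until the linker resolves it against libc.

```c
/* greet.c (hypothetical file name, for illustration only)
 *
 * Assumed GNU toolchain commands, one per stage:
 *   gcc -S greet.c                 compile only: writes assembly to greet.s
 *   as greet.s -o greet.o          assemble: writes object code to greet.o
 *   gcc greet.o main.o -o greet    link: main.o is a hypothetical object
 *                                  file that defines main
 */
#include <stdio.h>

/* greet is a hypothetical function. In greet.o the reference to puts is
 * still an undefined symbol; the linker resolves it against libc. */
int greet(void)
{
    return puts("hello");   /* puts returns a non-negative value on success */
}
```

Running nm on the assumed greet.o should list puts as undefined (U), which is the kind of name the linker has to resolve.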
As you feared, movq %rcx, %rsi is not correct. You need to pass a pointer to memory. Registers are not part of the memory address space and thus you can't have pointers to them. You need to allocate storage either globally or locally. Incidentally, you should not put your data (especially writable) into the default .text section, as that is intended for code and is typically read-only. Also, calling convention usually mandates 16 byte stack pointer alignment, so you should take care of that too.
.globl main
main:
push %rbp # keep stack aligned
mov $0, %eax # clear AL (zero FP args in XMM registers)
leaq f(%rip), %rdi # load format string
leaq x(%rip), %rsi # set storage to address of x
call scanf
pop %rbp
ret
.data
f: .string "%d" # could be in .rodata instead
x: .long 0
(If your environment expects a leading underscore on symbols, then use _main, and probably _scanf.)
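For comparison, here is the same idea in C: scanf-style functions write through a pointer, so the destination has to live in memory, either in static storage (like x in .data) or on the stack. This is only a sketch. The names global_x, read_global, read_local and get_global are invented here, and sscanf on a caller-supplied string stands in for scanf so the example does not depend on stdin.

```c
#include <stdio.h>

/* All names below are hypothetical, for illustration only. */
static int global_x;   /* static storage, comparable to "x: .long 0" in .data */

/* Parse an int into static storage; returns the number of items converted. */
int read_global(const char *input)
{
    return sscanf(input, "%d", &global_x);   /* pass the address of the storage */
}

/* Parse an int into a local (stack) variable and return it.
 * If nothing can be converted, x keeps its initial value of 0. */
int read_local(const char *input)
{
    int x = 0;                     /* lives in the function's stack frame */
    sscanf(input, "%d", &x);       /* again an address, never a register */
    return x;
}

/* Accessor so callers can inspect the static variable. */
int get_global(void)
{
    return global_x;
}
```

In both functions the second argument is an address, which is what leaq x(%rip), %rsi (static storage) or movq %rsp, %rsi (stack storage) puts into the argument register in the assembly versions.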
There are actually three choices for putting addresses of symbols or labels into registers. RIP-relative LEA is the standard way on x86-64. See also the related question "How to load address of function or label into register in GNU Assembler".
As an optimization, if your variables are in the lower 4GiB of the address space, e.g. in a Linux non-PIE (position-dependent) executable, you can use 32-bit absolute immediates:
mov $f, %edi # load format string
mov $x, %esi # set storage to address of x
movq $f, %rdi would use a 32-bit sign-extended immediate (instead of implicit zero-extension into RDI from writing EDI), but has the same code-size as a RIP-relative LEA.
You can also load the full 64 bit absolute address using the mnemonic movabsq. But don't do that because a 10-byte instruction is bad for code-size, and still needs a runtime fixup because it's not position-independent.
movabsq $f, %rdi # load format string
movabsq $x, %rsi # set storage to address of x
Upon request: using a local variable for the output could look like:
subq $8, %rsp # allocate 8 bytes from stack
xor %eax, %eax # clear AL (and RAX)
leaq f(%rip), %rdi # load format string
movq %rsp, %rsi # set storage to local variable
call scanf
addq $8, %rsp # restore stack
ret
Best Answer
There are at least three things that you need to do to successfully use libc with dynamic linking:
1. Link /usr/lib/crt1.o, which contains _start, which will be the entry point for the ELF binary;
2. Link /usr/lib/crti.o (before libc) and /usr/lib/crtn.o (after), which provide some initialisation and finalisation code;
3. Tell the linker that the binary will use the dynamic linker, /lib/ld-linux.so.
For example: