#include<stdio.h>

int main()

{

printf("hello,world\n");

return 0;

}

 

x86

Using the MSVC compiler

cl 1.cpp /Fa1.asm

The /Fa option makes the compiler generate an assembly listing file and specifies that the listing file is named 1.asm.

 

The contents of 1.asm are as follows:

CONST SEGMENT

$SG3830 DB 'hello,world',0AH,00H

CONST ENDS

PUBLIC _main

EXTRN _printf:PROC

; Function compile flags: /Odtp

_TEXT SEGMENT

_main PROC

push ebp

mov ebp,esp

push OFFSET $SG3830

call _printf

add esp,4

xor eax,eax

pop ebp

ret 0

_main ENDP

_TEXT ENDS

After generating 1.asm, the compiler produces 1.obj and then links it into the executable 1.exe.

CONST: data segment

_TEXT: code segment

 

The listing above is equivalent to the following source code:

#include <stdio.h>

const char $SG3830[] = "hello,world\n";

int main()

{

printf($SG3830);

return 0;

}

Notice that the compiler appended a zero byte, 00H, to the end of the string constant. This byte is the terminator that marks the end of the string.

With the PUSH instruction, the program pushes a pointer to the string onto the stack. This way, printf() can read the pointer from the stack, that is, the address of the string "hello,world\n".

After printf() returns, control flows back to main(). At that point the string address is still on the stack, so the stack pointer in the ESP register has to be adjusted to release it.

ADD ESP,4 adds 4 to the value in the ESP register.

Why 4? Because on the x86 platform a memory address is described with 32 bits, that is, 4 bytes. By the same reasoning, releasing a pointer on an x64 system adds 8 to the stack pointer (RSP on that platform).

This instruction can therefore be understood as a POP into some register. The difference is that the instruction in this example simply discards the data on the stack, whereas POP would also store that value in the given register.

After printf() returns, main() returns 0; that is, the result of main() is 0.

This return value is produced by the instruction XOR EAX,EAX.

 

Generating the hello world program with GCC

gcc 1.c -o 1

Assembly listing:

main proc near

var_10 = dword ptr -10h

push ebp

mov ebp,esp

and esp,0FFFFFFF0h

sub esp,10h

mov eax,offset aHelloWorld ; "hello,world\n"

mov [esp+10h+var_10],eax

call _printf

mov eax,0

leave

retn

main endp

The AND ESP,0FFFFFFF0h instruction aligns the stack address in ESP down to a 16-byte boundary, making it a multiple of 16; it is part of the function's setup code. If an address is not aligned, the CPU may need to access memory twice to fetch the data on the stack. Although 8-byte alignment would satisfy both 32-bit x86 CPUs and 64-bit x64 CPUs, the conventions followed by mainstream compilers require the stack to be aligned on a 16-byte boundary.

SUB ESP,10h allocates 0x10 bytes, that is, 16 bytes, on the stack. This program only uses 4 of them, but because the compiler keeps the stack address in ESP aligned to 16 bytes, space is allocated 16 bytes at a time.

Next, the program writes the string address directly onto the stack. Here var_10 is a local variable that is used to pass the argument to the printf() call that follows.

The final LEAVE instruction is equivalent to the two instructions MOV ESP,EBP and POP EBP.

Other features of GCC

#include<stdio.h>

int f1()

{

printf("world\n");

}

int f2()

{

printf("hello world\n");

}

int main()

{

f1();

f2();

}

Assembly listing:

f1 proc near

s = dword ptr -1Ch

sub esp,1Ch

mov [esp+1Ch+s],offset s; "world\n"

call _puts

add esp,1Ch

retn

f1 endp

f2 proc near

s = dword ptr -1Ch

sub esp,1Ch

mov [esp+1Ch+s],offset aHello ; "hello "

call _puts

add esp,1Ch

retn

f2 endp

aHello db 'hello '

s db 'world',0xa,0

 

When the string "hello world" is printed, the two words actually sit at adjacent addresses. The puts() call that prints it has no idea that the string is stored in two parts. In fact, as the assembly listing shows, the two parts are not separated at all.

When f1() calls puts(), it prints the string "world" up to the terminator, because puts() does not know that this string can be joined with the bytes in front of it to form a longer string. GCC makes full use of this technique to save memory.

 

ARM

ARM mode with optimization disabled

armcc.exe --arm --c90 -O0 1.c

main

STMFD SP!, {R4,LR}

ADR R0,aHelloWorld ; "hello,world"

BL __2printf

MOV R0,#0

LDMFD SP!, {R4,PC}

 

aHelloWorld DCB "hello,world",0

STMFD SP!, {R4,LR} corresponds to the PUSH instruction on x86. It stores the values of the R4 register and the link register LR on the stack. The instruction first decrements SP, allocating new space on the stack, and then stores the values of R4 and LR there.

 

ADR R0,aHelloWorld takes the value of PC, adds the offset of the string "hello,world" to it, and stores the result in R0.

BL __2printf calls the printf() function. BL performs the following steps:

1) It writes the address of the next instruction, that is, the address 0xC of MOV R0,#0, into the LR register.

2) It then writes the address of the printf() function into the PC register, so that execution continues in that function.

 

MOV R0,#0 sets the R0 register to 0.

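LDMFD SP!, {R4,PC} is the counterpart of STMFD. It takes the saved values off the stack, loads them into R4 and PC, and adjusts the stack pointer SP. Because the value that was saved from LR ends up in PC, this instruction also returns from main().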

 

 
