
Compiler division optimization

2022-07-20 08:01:00 【Sili who can't write code】

As we know, division costs far more clock cycles than most other arithmetic on a modern CPU.
The figure below compares add and div:
[Figure: latency comparison of add and div]

In the figure, div has a latency of 60 to 80 cycles while add has a latency of only 1, so in practice we should avoid emitting the division instruction wherever we can.

Solutions:

Optimizing division by a power of 2

When the divisor is a power of 2, the division can be done with a shift, as shown below:

int x = 58;
// x = x / 8
x >>= 3;

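For negative numbers, however, the shift alone gives the wrong answer: an arithmetic right shift rounds toward negative infinity, while integer division in C truncates toward zero. For example, -58 >> 3 is -8, but -58 / 8 is -7.
[Figure: shift result for a negative dividend]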
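To make the failure concrete, here is a minimal C sketch. The helper name shift_div8 is hypothetical, and the sketch assumes the compiler implements >> on a negative int as an arithmetic shift, which the C standard leaves implementation-defined.

```c
/* Hypothetical helper: "divide" by 8 using nothing but a right shift. */
int shift_div8(int x)
{
    /* Assumed to be an arithmetic shift, so it rounds toward negative infinity. */
    return x >> 3;
}
```

Under that assumption, shift_div8(58) agrees with 58 / 8, but shift_div8(-58) yields -8 while -58 / 8 yields -7.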

For a negative dividend a and a positive divisor b, we can convert with the following formula:

ceil(a / b) = floor((a + b - 1) / b)

For instance:
a = -58
b = 8

The exact quotient is -7.25. Because integer division truncates toward zero, the expected result is -7, so for a negative dividend we must round up rather than down. Applying the formula:
floor((a + b - 1) / b) = floor((-58 + 8 - 1) / 8) = floor(-6.375) = -7
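The identity can be checked with a small C sketch. The helper names floor_div, ceil_div and ceil_via_floor are hypothetical, and the divisor b is assumed to be positive.

```c
/* Floor division for a positive divisor b (C's / truncates toward zero). */
int floor_div(int a, int b)
{
    int q = a / b;
    if (a % b != 0 && a < 0) {
        q -= 1;              /* truncation rounded up, so step down once */
    }
    return q;
}

/* Ceiling division for a positive divisor b. */
int ceil_div(int a, int b)
{
    int q = a / b;
    if (a % b != 0 && a > 0) {
        q += 1;              /* truncation rounded down, so step up once */
    }
    return q;
}

/* The identity from the text: ceil(a / b) == floor((a + b - 1) / b). */
int ceil_via_floor(int a, int b)
{
    return floor_div(a + b - 1, b);
}
```

For a = -58 and b = 8, ceil_via_floor returns -7, the same value as the truncating division -58 / 8.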

So the division can be eliminated like this:

	int x = -58;
	// If the dividend is negative, use the formula to eliminate the division
	if (x < 0)
	{
		// x = (x + 8 - 1) / 8;  =>  (x + 7) >> 3;
		// An arithmetic right shift rounds down (toward negative infinity)
		x = (x + 7) >> 3;
	}
	else {
		x >>= 3;
	}

Now let's look at the following code:

#include<stdio.h>

int main(int argc,char* args[])
{
	printf("%d\r\n",argc/8);

	return 0;
}

After compilation it becomes the following assembly instructions:

[Figure: disassembly of argc / 8]
mov eax,dword ptr [argc]   Load argc into eax.
cdq                        Sign-extend eax to 64 bits in edx:eax, so edx is filled with the sign bit. This is what handles positive and negative dividends.
and edx,7                  For a non-negative dividend edx is 0; for a negative dividend every bit of edx is 1. After the and, edx is therefore 0 for a non-negative dividend and 7 for a negative one.
add eax,edx                This matches our (x + 7), except that the 7 becomes 0 when the dividend is non-negative.
sar eax,3                  Arithmetic right shift of eax by 3 bits, which preserves the sign.
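The five instructions map almost one-to-one onto C. The sketch below uses a hypothetical function name, div8_branchless, and assumes that >> on a negative int32_t is an arithmetic shift (true for the common x86 compilers, but implementation-defined in the C standard).

```c
#include <stdint.h>

/* Branch-free signed division by 8, mirroring cdq / and / add / sar. */
int32_t div8_branchless(int32_t x)
{
    int32_t sign = x >> 31;   /* cdq: 0 for non-negative x, -1 (all ones) otherwise */
    int32_t bias = sign & 7;  /* and edx,7: 0 or 7 */
    return (x + bias) >> 3;   /* add eax,edx then sar eax,3 */
}
```

Compared with the if/else version, this form has no branch, which is presumably why the compiler prefers it.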

Optimizing division by a non-power of 2

We can use the following formula, where a is the dividend and c is the constant divisor:

a / c = a * (2^n / c) / 2^n = (a * M) >> n,  where M = 2^n / c

The inverse formula recovers the divisor from M:

c = 2^n / M

Tip: 2^n / c is never an integer, because 2^n is a power of 2 and c is not, so M is a rounded value and the c recovered from the inverse formula has to be rounded to the nearest integer as well. Take extra care with this rounding when negative numbers are involved.

Since M is a constant, the compiler can compute it at compile time. n is at least 32 (the larger n is, the more accurate the result).
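As a rough sketch of the two formulas, the C helpers below compute M from c and n and recover c from M. The names magic_for and divisor_from_magic are hypothetical. Rounding M up is an assumption on my part: the text does not state the rounding direction, but rounding up reproduces the constant 0AAAAAAABh used in the example for c = 3 and n = 33.

```c
#include <stdint.h>

/* M = ceil(2^n / c), for 1 <= n <= 62 and c >= 1 (rounding up is an assumption). */
uint64_t magic_for(uint32_t c, unsigned n)
{
    return ((UINT64_C(1) << n) + c - 1) / c;
}

/* Inverse formula: c = 2^n / M, rounded to the nearest integer. */
uint32_t divisor_from_magic(uint64_t m, unsigned n)
{
    uint64_t two_n = UINT64_C(1) << n;
    return (uint32_t)((two_n + m / 2) / m);
}
```

For example, magic_for(3, 33) gives 0xAAAAAAAB, and divisor_from_magic(0xAAAAAAAB, 33) gives back 3.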

Examples are as follows ：

Unsigned division

#include<stdio.h>
int main(unsigned int argc, char* args[])
{
	// Note: argc is declared unsigned here, so this is an unsigned division with an unsigned result
	printf("y ===>>> %u\r\n", argc / 3);
	return 0;
}

The corresponding assembly:
[Figure: disassembly of argc / 3 (unsigned)]

Here 0AAAAAAABh is the M in our formula.

; eax = M
mov eax,0AAAAAAABh
; edx holds the high 32 bits of the product and eax the low 32 bits. Call the full product ret; with a = argc:
; (edx,eax) = ret = eax * argc = a * M
mul eax,dword ptr [argc]
; edx is the high 32 bits, so shifting it right by 1 is the same as shifting ret right by 33: edx = ret >> 33 = ret / 2^33
shr edx,1
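The same three steps can be written in C with a 64-bit intermediate product. This is only a sketch of what the instructions compute; the function name udiv3 is hypothetical.

```c
#include <stdint.h>

/* Unsigned division by 3 without a div instruction. */
uint32_t udiv3(uint32_t a)
{
    uint64_t ret = (uint64_t)a * UINT64_C(0xAAAAAAAB); /* mul: edx:eax = a * M */
    uint32_t edx = (uint32_t)(ret >> 32);              /* keep the high 32 bits */
    return edx >> 1;                                   /* shr edx,1, i.e. ret >> 33 */
}
```

Comparing udiv3(a) against a / 3 over a range of inputs, including the largest 32-bit value, is a quick way to convince yourself that the constant is precise enough.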

Applying the inverse formula gives back the divisor:
c = 2^33 / 0AAAAAAABh = 8589934592 / 2863311531 = 2.9999999997 (approximately), which rounds to 3

As you can see, the whole sequence above contains no division instruction.
Incidentally, the powerful IDA Pro can recognize this pattern for us quickly:
[Figure: IDA Pro output]

Signed Division

#include<stdio.h>
int main( int argc, char* args[])
{
	// Note: argc is signed
	printf("y ===>>> %d\r\n", argc / 3);
	return 0;
}

[Figure: disassembly of argc / 3 (signed)]
Compared with unsigned division, the signed version has these extra instructions:

0040104B  mov         eax,edx  
; 1Fh is 31 in decimal
0040104D  shr         eax,1Fh  
00401050  add         eax,edx

These extra instructions compensate for the fact that the shifted product is rounded down (toward negative infinity), while signed division must truncate toward zero. So when the result is negative, 1 has to be added (see the power-of-2 case above).

; copy the high 32 bits of the product a*M into eax
0040104B  mov         eax,edx  
; 1Fh is 31 in decimal
; after shifting right by 31 only the sign bit is left, i.e. 1 or 0
; 1 for a negative number, 0 for a non-negative one
0040104D  shr         eax,1Fh  
; if the number is negative, eax is 1, so this adds 1
00401050  add         eax,edx
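A C sketch of the whole signed sequence follows. The multiplier 55555556h with n = 32 is an assumption, since the screenshot with the actual constant is not reproduced here; it is the constant compilers commonly emit for a signed division by 3. The function name sdiv3 is hypothetical, and >> on negative values is assumed to be an arithmetic shift.

```c
#include <stdint.h>

/* Signed division by 3: multiply, take the high half, then add the sign bit. */
int32_t sdiv3(int32_t a)
{
    int64_t prod = (int64_t)a * INT64_C(0x55555556); /* imul: edx:eax = a * M (assumed M) */
    int32_t edx = (int32_t)(prod >> 32);             /* high 32 bits, rounded down */
    int32_t eax = (int32_t)((uint32_t)edx >> 31);    /* shr eax,1Fh: 1 if edx < 0, else 0 */
    return eax + edx;                                /* add eax,edx: round toward zero */
}
```

Without the final add, sdiv3(-7) would return -3; with it, the result is -2, matching -7 / 3 in C.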

Variant: M wider than 32 bits

#include<stdio.h>

int main(unsigned int argc, char* args[])
{
	// Note: argc is unsigned
	printf("y ===>>> %u\r\n", argc / 7);
	return 0;
}

[Figure: disassembly of argc / 7 (unsigned)]

Converted into a mathematical formula, the assembly above is:
a / 7 = a * (2^32 + M) / 2^35
Here the constant in the code is M = 24924925h, but the real multiplier in our original formula is 2^32 + M. That value obviously does not fit in a 32-bit register, so the compiler splits the computation up; the net effect is still a division by 2^35.
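One way to evaluate a * (2^32 + M) / 2^35 with 32-bit registers is sketched below. The exact instruction sequence in the screenshot is not reproduced here, so the split used in this code (subtract, halve, add back, shift by 2) is an assumption based on what compilers commonly emit for this case; the function name udiv7 is hypothetical.

```c
#include <stdint.h>

/* Unsigned division by 7 with a 33-bit multiplier (2^32 + 0x24924925). */
uint32_t udiv7(uint32_t a)
{
    /* t = high 32 bits of a * M, the part that fits in a register */
    uint32_t t = (uint32_t)(((uint64_t)a * UINT64_C(0x24924925)) >> 32);
    /* (a + t) / 2 computed as ((a - t) / 2) + t so the sum cannot overflow 32 bits */
    uint32_t q = ((a - t) >> 1) + t;
    /* shifts so far: 32 + 1; two more give the total of 35 */
    return q >> 2;
}
```

The a + t term is where the extra 2^32 in the multiplier comes from: a * 2^32 shifted right by 32 is simply a.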

Multiplication optimization

Suppose we only have a 16-bit multiplier and need to perform the following calculation:

Compute A * 8086h (8086h is stored in ax, A is stored in cx).

Here 8086h is an unsigned number, and A is a word-sized signed value entered by the user.

If we use mul cx directly, cx is treated as an unsigned number, but the number entered by the user may be negative.
If we use imul cx, ax is treated as a negative number, but ax actually holds an unsigned number.

Let's first look at the following identities:

; x is a word-sized value
x + ~x = ffffh
x + ~x + 1 = 10000h
; ~x + 1 is the two's complement of x
~x + 1 = 10000h - x
x = 10000h - (~x + 1)

; Therefore:
; imul reads 8086h as a signed 16-bit number, i.e. as -32634. Its magnitude is (~8086h + 1) = 10000h - 8086h = 7F7Ah = 32634.
; The minus sign in front of (10000h - 8086h) expresses the value imul actually multiplies by (because 8086h is negative for a 16-bit imul).
dx.ax              = A * -(10000h - 8086h)
dx.ax              = A * (8086h - 10000h)
dx.ax              = 8086h*A - 10000h*A
dx.ax + 10000h*A   = 8086h*A
(dx + A).ax        = 8086h*A

From the formula above we get the following assembly code for the problem:

mov ax,8086h
mov cx,A
imul cx;
add dx,cx
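The effect of these four instructions can be simulated in C with explicit 16-bit halves. This is a sketch only: the names to_int32 and mul_8086h are hypothetical, and the dx and ax variables merely model the registers.

```c
#include <stdint.h>

/* Reinterpret 32 bits as a signed value without relying on
   implementation-defined conversions. */
static int32_t to_int32(uint32_t bits)
{
    return bits <= (uint32_t)INT32_MAX ? (int32_t)bits
                                       : -(int32_t)(~bits) - 1;
}

/* A * 8086h using a signed 16 x 16 multiply plus a correction on dx. */
int32_t mul_8086h(int16_t a)
{
    int32_t as_signed = 0x8086 - 0x10000;              /* how imul reads ax: -32634 */
    uint32_t dxax = (uint32_t)((int32_t)a * as_signed); /* imul cx */
    uint16_t ax = (uint16_t)(dxax & 0xFFFFu);
    uint16_t dx = (uint16_t)(dxax >> 16);
    dx = (uint16_t)(dx + (uint16_t)a);                 /* add dx,cx (wraps at 16 bits) */
    return to_int32(((uint32_t)dx << 16) | ax);        /* dx:ax as a signed 32-bit value */
}
```

Calling mul_8086h with both positive and negative inputs and comparing against a plain 32-bit multiplication by 0x8086 shows how the correction behaves for each sign of A.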

The same code also works when A is negative: cx then holds a negative two's complement value, so add dx,cx already subtracts its magnitude and no separate sub dx,cx is needed.

Now suppose we want to calculate A * -8086h, again using only 16-bit registers. -8086h does not fit in 16 bits.

Because
-x = (~x + 1) - 10000h

we have -8086h = 7F7Ah - 10000h, where 7F7Ah = ~8086h + 1 is the value that goes into ax. Therefore:

dx.ax              = A * 7F7Ah
dx.ax - 10000h*A   = A * (7F7Ah - 10000h)
(dx - A).ax        = -8086h * A


mov ax,7F7Ah
mov cx,A
imul cx;
sub dx,cx
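The negative-constant case can be simulated the same way. In this sketch ax is loaded with 7F7Ah, which is ~8086h + 1, the low word of -8086h; that choice of constant follows from the identity -x = (~x + 1) - 10000h and is my reading of the derivation. The names to_int32_bits and mul_neg_8086h are hypothetical.

```c
#include <stdint.h>

/* Reinterpret 32 bits as a signed value without relying on
   implementation-defined conversions. */
static int32_t to_int32_bits(uint32_t bits)
{
    return bits <= (uint32_t)INT32_MAX ? (int32_t)bits
                                       : -(int32_t)(~bits) - 1;
}

/* A * -8086h: multiply by the low word 7F7Ah, then subtract A from dx. */
int32_t mul_neg_8086h(int16_t a)
{
    uint32_t dxax = (uint32_t)((int32_t)a * 0x7F7A);   /* imul cx with ax = 7F7Ah */
    uint16_t ax = (uint16_t)(dxax & 0xFFFFu);
    uint16_t dx = (uint16_t)(dxax >> 16);
    dx = (uint16_t)(dx - (uint16_t)a);                 /* sub dx,cx (wraps at 16 bits) */
    return to_int32_bits(((uint32_t)dx << 16) | ax);   /* dx:ax as a signed 32-bit value */
}
```

Subtracting A from dx is what turns the 16-bit multiplier 7F7Ah into the 17-bit value 7F7Ah - 10000h = -8086h.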

Optimizing division by a negative constant

int main( int argc, char* args[])
{
	printf("y ===>>> %d\r\n", argc / -5);
	return 0;
}

[Figure: disassembly of argc / -5]
Let's first review the division optimization formula above:
a / c = (a * M) >> n,  where M = 2^n / c

When c is negative, M is negative as well.

So in this example 99999999h is a negative number, stored in two's complement.
The corresponding decimal value is -1717986919, so to recover the divisor we plug that decimal value into the inverse formula:

c = 2^33 / -1717986919 = -4.999999998, which rounds to -5
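The division by -5 can be sketched in C with the values given above (M = 99999999h read as a signed number, n = 33). The instruction order in the missing screenshot is not known, so the sequence below (multiply, arithmetic shift, sign-bit correction) is an assumption modeled on the signed division by 3. The function name sdiv_neg5 is hypothetical, and >> on negative values is assumed to be arithmetic.

```c
#include <stdint.h>

/* Signed division by -5 using a negative multiplier. */
int32_t sdiv_neg5(int32_t a)
{
    int64_t m = INT64_C(0x99999999) - INT64_C(0x100000000); /* -1717986919 */
    int64_t prod = (int64_t)a * m;                 /* imul: edx:eax = a * M */
    int32_t edx = (int32_t)(prod >> 32);           /* high 32 bits */
    edx >>= 1;                                     /* sar edx,1: total shift of 33 */
    return edx + (int32_t)((uint32_t)edx >> 31);   /* add 1 when the result is negative */
}
```

Because M itself is negative, a positive dividend produces a negative product, and the sign-bit correction then rounds it toward zero.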

Optimizing division by a negative constant, part 2

int main( int argc, char* args[])
{
	printf("y ===>>> %d\r\n", argc / -7);

	return 0;
}

[Figure: disassembly of argc / -7]
The M that the compiler really means here is not 6DB6DB6Dh. The real M is a negative number whose magnitude is outside the range of a 4-byte signed integer, which is why the assembly above looks the way it does.

Let Y = 6DB6DB6Dh and let A be the dividend.

Because
-x = (~x + 1) - 100000000h

we get:
; M is less than 0, hence the minus sign
edx.eax        = A * Y
(edx - A).eax  = A * (Y - 100000000h)
so M = Y - 100000000h = -(100000000h - Y)
M = -(100000000h - 6DB6DB6Dh) = -92492493h

Once M is known, the divisor can be recovered: c = 2^34 / -92492493h = -6.999999998, which rounds to -7
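A C sketch of this case follows. Only the constant 6DB6DB6Dh and the total shift of 34 come from the text; the precise instruction sequence (multiply, subtract the dividend, arithmetic shift by 2, sign-bit correction) is an assumption, because the screenshot is not reproduced here. The function name sdiv_neg7 is hypothetical, and >> on negative values is assumed to be arithmetic.

```c
#include <stdint.h>

/* Signed division by -7 with real multiplier M = 6DB6DB6Dh - 2^32. */
int32_t sdiv_neg7(int32_t a)
{
    int64_t prod = (int64_t)a * INT64_C(0x6DB6DB6D); /* imul: edx:eax = a * Y */
    int32_t edx = (int32_t)(prod >> 32);             /* high 32 bits */
    edx -= a;                                        /* subtract a: multiplier becomes Y - 2^32 */
    edx >>= 2;                                       /* total shift: 32 + 2 = 34 */
    return edx + (int32_t)((uint32_t)edx >> 31);     /* add 1 when the result is negative */
}
```

The subtraction plays the same role as sub dx,cx in the 16-bit multiplication example: it supplies the part of the multiplier that does not fit in the register.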
