### Author Topic: Reverse-Polish (Postfix) Notaton and X86 Assembler  (Read 13067 times)

0 Members and 1 Guest are viewing this topic.

#### Charles Pegge

• Global Moderator
• Hero Member
• Posts: 670
##### Reverse-Polish (Postfix) Notaton and X86 Assembler
« on: August 04, 2007, 01:19:26 PM »

This shows how easily it translates.

FreeBasic
Code: [Select]
`' POSTFIX or REVERSE-POLISH NOTATION AND x86 ASSEMBLER' Charles E V Pegge' 04 Aug 2007' FREEBASIC' FLOATING POINT EXAMPLEdim as double ans, a=1, b=2, c=3, d=4' ans=(a+b)*(c+d)      ' conventional notation                       '' a b + c d + * to ans ' postfix notation'======================'asm fld   qword ptr [a]   ' push A onto FPU stack fadd qword ptr  [b]   ' add  B fld   qword ptr [c]   ' push C onto FPU stack fadd  qword ptr [d]   ' add D fmulp           st(1) ' multiply the 2 results and pop the stack fstp qword ptr  [ans] ' store the result and pop the stackend asm                ''======================'print ans`

#### Petr Schreiber

• Full Member
• Posts: 183
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #1 on: August 04, 2007, 03:04:42 PM »
Hi Charles,

nice samples. RPN is very nice.
Here my two tries:

First for PB/WIN 8.03:
Code: [Select]
`#COMPILE EXE#DIM ALLFUNCTION PBMAIN () AS LONG  #REGISTER NONE  LOCAL a, b, c, d AS DOUBLE  LOCAL ans AS DOUBLE    a = 1  b = 2  c = 3  d = 4    ' ans=(a+b)*(c+d)      ' conventional notation  ' a b + c d + * to ans ' postfix notation  '======================'  ! fld a                ; push A onto FPU stack  ! fadd b               ; add  B  ! fld c                ; push C onto FPU stack  ! fadd d               ; add D  ! fmulp st(1), st(0)   ; multiply the 2 results and pop the stack  ! fstp ans             ; store the result and pop the stack  '======================'  MSGBOX STR\$(ans)END FUNCTION`
Second for HP-42S
Code: [Select]
`RCL "A"RCL+ "B"RCL "B"RCL+ "D"*STO "ANS"`

Thanks,
Petr
« Last Edit: August 04, 2007, 06:27:26 PM by Petr Schreiber »
AMD Sempron 3400+ | 1GB RAM @ 533MHz | GeForce 6200 / GeForce 9500GT | 32bit Windows XP SP3

psch.thinbasic.com

#### Charles Pegge

• Global Moderator
• Hero Member
• Posts: 670
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #2 on: August 04, 2007, 06:09:08 PM »
Thanks Petr,

Here is the same thing using long integers on the CPU.

The first block shows a very literal translation of RPN into x86 assembler. This is not as futile as it looks because this strategy may be applied to expressions of almost unlimited complexity on just about any CPU that can add, multiply and has a stack. But when there are general purpose registers available then optimising the code can make a big difference as you can see in the second block.

FreeBasic

Code: [Select]
`' LONG INTEGER EXAMPLEdim as long ans, a=1, b=2, c=3, d=4' ans=(a+b)*(c+d)      ' conventional notation                       '' a b + c d + * to ans ' postfix notation''========================='' RIGID INTERPRETATION' no regs used for storing' only for doing operations'========================='asm                       ' push dword ptr [a]       ' load A onto stack push dword ptr [b]       ' load B onto stack'-------------------------' pop  ecx                 ' pop operand pop  eax                 ' pop accummulator add  eax,ecx             ' do op push eax                 ' leave result on stack'-------------------------' push dword ptr [c]       ' load C onto stack push dword ptr [d]       ' load D onto stack'-------------------------' pop  ecx                 ' pop operand pop  eax                 ' pop accummulator add  eax,ecx             ' do op push eax                 ' leave result on stack'-------------------------' pop  ecx                 ' pop operand pop  eax                 ' pop accummulator mul  ecx                 ' do op push eax                 ' leave result on stack'-------------------------' pop  eax                 ' pop result mov [ans],eax            ' store itend asm                   ''========================='print ans' OPTIMISED INTERPRETATION'========================'asm                      ' mov  ecx, [a]           ' load A into ecx accummulator add  ecx, [b]           ' add  B to it mov  eax, [c]           ' load C into into eax accummulator add  eax, [d]           ' add  D to it mul  ecx                ' multiply the two accumulators (result in eax) mov  [ans],eax          ' save resultend asm                  ''========================'print ans`

#### Petr Schreiber

• Full Member
• Posts: 183
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #3 on: August 04, 2007, 06:27:01 PM »
Hi,

here comes again PB/WIN version
Code: [Select]
`#COMPILE EXE#DIM ALLDECLARE FUNCTION GetTickCount LIB "KERNEL32.DLL" ALIAS "GetTickCount" () AS DWORDFUNCTION PBMAIN () AS LONG    #REGISTER NONE  LOCAL t1, t2 AS DWORD  LOCAL i AS LONG  LOCAL a, b, c, d AS LONG  LOCAL ans AS LONG  a = 1  b = 2  c = 3  d = 4  ' ans=(a+b)*(c+d)      ' conventional notation                         '  ' a b + c d + * to ans ' postfix notation  '  '========================='  ' RIGID INTERPRETATION  ' no regs used for storing  ' only for doing operations  '========================='  t1 = GetTickCount  FOR i = 1 TO 1000000000      ! push a                   ' load A onto stack    ! push b                   ' load B onto stack    '-------------------------'    ! pop  ecx                 ' pop operand    ! pop  eax                 ' pop accummulator    ! add  eax,ecx             ' do op    ! push eax                 ' leave result on stack    '-------------------------'    ! push c                   ' load C onto stack    ! push d                   ' load D onto stack    '-------------------------'    ! pop  ecx                 ' pop operand    ! pop  eax                 ' pop accummulator    ! add  eax,ecx             ' do op    ! push eax                 ' leave result on stack    '-------------------------'    ! pop  ecx                 ' pop operand    ! pop  eax                 ' pop accummulator    ! mul  ecx                 ' do op    ! push eax                 ' leave result on stack    '-------------------------'    ! pop  eax                 ' pop result    ! mov  ans,eax             ' store it  NEXT  t2 = GetTickCount  MSGBOX "ASM General:"+STR\$((t2-t1)/1000)+"s, result:"+STR\$(ans)  ' OPTIMISED INTERPRETATION  '========================'                     '  t1 = GetTickCount  FOR i = 1 TO 1000000000                         ! mov  ecx, a             ' load A into ecx accummulator    ! add  ecx, b             ' add  B to it    ! mov  eax, c             ' load C into into eax accummulator    ! add  eax, d             ' add  D to it    ! mul  ecx                ' multiply the two accumulators (result in eax)    ! mov  ans, eax           ' save result  NEXT  t2 = GetTickCount  MSGBOX "ASM Optimized:"+STR\$((t2-t1)/1000)+"s, result:"+STR\$(ans)    PB without ASM "suggestions"  '========================'    t1 = GetTickCount  FOR i = 1 TO 1000000000    ans = (a+b)*(c+d)  NEXT  t2 = GetTickCount  MSGBOX "PB Classic:"+STR\$((t2-t1)/1000)+"s, result:"+STR\$(ans)END FUNCTION`
I presume I cannot use EBX as PB uses it already ?
I have added #REGISTER NONE to not get in troubles too

It is interesting to watch the speed difference.
General approach takes ~12.7 seconds for 1 000 000 000 iterations, while optimized just 3.5 seconds, which is quite the same as PB does by optimizing high-level expression.

Thanks,
Petr
« Last Edit: August 04, 2007, 06:36:38 PM by Petr Schreiber »
AMD Sempron 3400+ | 1GB RAM @ 533MHz | GeForce 6200 / GeForce 9500GT | 32bit Windows XP SP3

psch.thinbasic.com

#### Theo Gottwald

• Hero Member
• Posts: 897
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #4 on: August 04, 2007, 07:18:33 PM »
Have a Hint on EBX for you. Not really from me, but directly from Bob Zale, copied from the PB Forum:

Quote
Actually, the register preservation requirements for EBX/ESI/EDI only apply to assembler code which precedes BASIC statements within a sub/function. PowerBASIC assumes that it can use these 3 registers to store data from one statement to the next. If there are no more statements, it's of no consequence. PowerBASIC must preserve registers between sub/function calls, and does so automatically. Of course, ESP and EBP must always be preserved.
One note... never change a segment register. Win32 in unforgiving.

Source: PB-Forum

#### Petr Schreiber

• Full Member
• Posts: 183
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #5 on: August 04, 2007, 07:20:26 PM »
Thanks for the info!,

Petr
AMD Sempron 3400+ | 1GB RAM @ 533MHz | GeForce 6200 / GeForce 9500GT | 32bit Windows XP SP3

psch.thinbasic.com

#### Charles Pegge

• Global Moderator
• Hero Member
• Posts: 670
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #6 on: August 04, 2007, 08:13:21 PM »
Thanks for taking those measurements Petr. It shows the cost of generalised methods without  optimisation.  Of course once you start using interpreter techniques, the time penalty is much more extreme.

I am not sure when PB requires the EBX and ESI registers to be left unaltered in between lines, or for referencing global and static variables, arrays etc.
Pushing them only takes 1 clock each as does popping them.

One potential GOTCHA, Theo, is in loading one of these variables into a reserved register which is normally used to index them, causing a GPF when you try to access the next variable.

There is a simple and cheap method to protect against this, costing 3 clocks per variable loaded instead of 1

Code: [Select]
`dim a as global long, b as global long, c as global long'.... ! push a     ' push contents of all the variables to be used ! push b     ' ! push c     ' ! pop edi    ' now pop then into the target registers ! pop esi    ' ! pop ebx   ''....         '`
« Last Edit: August 04, 2007, 08:16:44 PM by Charles Pegge »

#### Kent Sarikaya

• Full Member
• Posts: 174
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #7 on: August 04, 2007, 10:53:13 PM »
This is way beyond me, but in looking through the code I can see sort of how it is going... I don't understand how it knows which operand to use? It is never pushed or put into memory as the variable values are? How does it know to add instead of subtract?

#### Theo Gottwald

• Hero Member
• Posts: 897
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #8 on: August 04, 2007, 11:27:24 PM »
Code: [Select]
`I am not sure when PB requires the EBX and ESI registers to be left unaltered in between lines,`
I can help you on this Charles. Reading my own post from before, I can tell you this:

PowerBasic uses these registers:

EBX - internal usage
ESI/EDI - for REGISTER variables, if you do not use #REGISTER NONE

Therefore they need to be preserved and restored if you have BASIC CODE and ASM Code intermixed.

As we see in the Disassembly, the EBP-Register is used to adress local datastructures and variables.

Example:
MOV DWORD PTR [EBP+FFFFFF6C], ESI

Would load a local variable with a LONG or DWORD Value.
« Last Edit: August 04, 2007, 11:35:08 PM by Theo Gottwald »

#### Charles Pegge

• Global Moderator
• Hero Member
• Posts: 670
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #9 on: August 05, 2007, 12:13:29 AM »
Thanks Theo, we have to be aware of shifting sands - different compilers.

Kent,
As Push'n Pop is Last-in First-out  buffering, so the above snippet stacks the values of the variables a,b,c which we want to use in the registers . EDI ESI and EBX, may be used to index these variables, so we must not overwrite them while obtaining the values of a,b and c.

Once their values are on the stack, we can safely pop them into these registers.
The stack is indexed by means of the stack pointer ESP. which is automatically decremented before PUSHing and incremented after POPping. but this stack pointer register is seldom altered by us directly.

In this example, the values of a, b and c end up in ebx, esi and edi respectively.

Of course, the three registers we have just overwritten cannot be used
to access further BASIC variables unless we have taken steps to preserve them with another shell of pushes and pops around our assembler code.

However, it is safe to assume that LOCAL variables, including function parameters are indexed only by the EBP register, which is set up for us by PB at the beginning of the function, and it is not normally used for anything else unless you are desperate for a spare reg to make your algorithm zing.

#### Kent Sarikaya

• Full Member
• Posts: 174
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #10 on: August 05, 2007, 06:54:27 AM »
Thanks for the explanation Charles and Theo. Amazing how you guys keep all of which compiler does what etc. The ASM files look different from compiler to compiler too.

#### Donald Darden

• Sr. Member
• Posts: 364
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #11 on: August 06, 2007, 08:04:57 AM »
It's not usually pointed out, but RPN notation works well with ASM because it reduces complex instructions to their simplest form, wirking first with values (or variables), then performing the specified operation.  If you took a typical ASM operation, like ADD eax,a, and wrote it this way:  eax,a ADD, it would easily show the similarity in structure.

More specifically, the use of RPN favors a stack structure, but if we can limit or
confine our efforts to the general purpose registers, we will benefit because the registers are more quickly accessed.  Aside from the fact that some instructions are specific to certain registers, the use of the stack involves RAM memory, which has to be referenced via the stack pointer, which in term has to incremented or decremented, and this makes it quite a bit slower overall.

Another consideration with RPN is that it favors the retention and reuse of
intermediate results.  You can store them somewhere temporarily, then reintroduce them when needed, which can save computational time and effort.
One example of this capability is when multiplying a number by ten by using the
shift-and-add capabilities instead.  Let me describe the general principle and you can then play with it.

Begin with some integer and call it x.  Copy the original x into EAX.    Then copy it from EAX into a second register.  Then shift EAX left two places, which
effectively multiplies your x value by four.  Add the value of x from the temporary register back into EAX, so that the total is now equivalent to five times x.  Then shift EAX left one more place, and your 5x will become 10x.  This
will work for any integer value of x, as long as the value when multipled does not result in a register overflow.  You can rearrange the shifts and adds slightly and get the same result.  For instance, start with a left shift of 1 bit, save the
temporary result, shift left two more places to the left, and add in the temporary result again.  There you are.

Unfortunately, there is no clever method available for dividing by ten, which
would be extremely useful for convirting base 2 numbers to base 10 (decimal).

#### Charles Pegge

• Global Moderator
• Hero Member
• Posts: 670
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #12 on: August 06, 2007, 11:23:27 AM »
division

Yes the only way to divide a number (unless its a simple power of 2) is by
successive approximation/ This makes both CPU and FPU work very hard, taking more than 40 clocks, about 4 times longer than multiplying.

So whenever possible replace dividing factors with multiplying factors eg:
a*001 instead of a/1000.

multiplying EAX by ten
Code: [Select]
` ! shl eax,1 ! mov ecx,eax ! shl eax,2 ! add eax,ecx `
This takes 4 clocks on the CPU instead of 10 using MUL, however floating point multiplications on the FPU are about 3 clocks!

RPN notation

A SHL DUP SHL SHL +

pretending that the EAX and ECX registers are part of a stack.

PS: the new Intel chips do things on half clocks so take my figures as being relative.
« Last Edit: August 06, 2007, 11:42:42 AM by Charles Pegge »

#### Donald Darden

• Sr. Member
• Posts: 364
##### Re: Reverse-Polish (Postfix) Notaton and X86 Assembler
« Reply #13 on: August 06, 2007, 09:20:17 PM »
While the FPU is highly optimized, there is still a time penalty for conversion of numbers to floating point form and back.  Since the PB compilers are optimized for
use with Longs (and will generally revert to floating point form for other numeric
types), your best bet for fastest code is to attempt to use Longs wherever possible.