Full Version : Fast SquareRoot Routine (PIC ASM)
avr >>MATH ROUTINES >>Fast SquareRoot Routine (PIC ASM)


Admin3- 04-18-2006
Eric Smith and Martin Nilsson independently suggested a way to squeeze one word out of the total program memory count. However, the routine was still a dog compared to Andy's screamer. So I dropped the whole square root thing until I serendipitously came across the book "The Logic of Computer Arithmetic", by Ivan Flores. He discusses two methods one can use to find the square root of a number. His first method called the "non-restoring method" is the technique I used in the program above. However, his explanation gave me another idea. So after a half dozen attempts or so I came up with a new routine:

30 bytes long
requires only 3 bytes of RAM - Including the 2 for the 16 bit input ( a savings of one byte)
Executes in 104 cycles best case, 109 average, 122 worst case.

Here's the routine along with some code illustrating its use. Just for grins, I've thrown in the first square root slug I wrote (it's called SQRT_A, while the faster one is called SQRT):

CODE

       list    p=16C64,t=ON,c=132,n=80,st=off
       radix   dec


       include "P16C64.INC"

s1      equ     0x20
s2      equ     0x21
N_hi    equ     0x22
N_lo    equ     0x23
mask    equ     0x24
xl      equ     0x25
xh      equ     0x26
xstep   equ     0x27


       ORG     0              ;Reset Vector

       GOTO    Main

       ORG     4              ;Interrupt Vector


Main

       BCF     STATUS,RP0     ;Point to BANK 0

       CLRF    xl
       CLRF    xh
       MOVLW   1
       MOVWF   xstep

xxx
       MOVF    xh,W
       MOVWF   N_hi
       MOVF    xl,W
       MOVWF   N_lo

       NOP
       CALL    sqrt
       NOP

       MOVF    xstep,W
       ADDWF   xl,F
       SKPNC
         INCF  xh,F

       GOTO    xxx



;---------------------------------------------------------------
;sqrt
;
; The purpose of this routine is take the square root of a 16 bit
;unsigned integer.
;Inputs:  N_hi - High byte of the 16 bit number to be square rooted
;         N_lo - Low byte     "                  "             "
;Outputs: s1   - Temporary variable
;         s2   - 8 bit result returned in here
;
;27 words, 145 cycles best case

sqrt__A
       MOVLW   0x40
       MOVWF   s1
       CLRF    s2

L2      ADDWF   s2,W
       SUBWF   N_hi,W         ;W = N_hi-s1-s2

       SKPNC                  ;if N_hi > (s1+s2)
         goto  L3             ;  then replace N_hi with N_hi-s1-s2

       BTFSS   s1,6           ;If this is the first time through the loop
         BTFSS   N_lo,0       ;or If MS bit is zero, then N < s1+s2
           goto  L1

L3
  ;N is greater than s1+s2, so replace N_hi with N_hi-s1-s2
       MOVWF   N_hi

       RLF     s1,W
       RLF     s1,W
       ADDWF   s2,F           ;s2 = s2 | (s1<<1)
L1
       BTFSC   s1,7           ;If s1 has rolled all the way around, we're done.
         return

  ;Next, roll N left one bit position. I.e. N = (N<<1) | (N>>15). This puts the
  ;MS bit into the LS bit position.
       RLF     N_hi,W         ;C = MS bit of N
       RLF     N_lo,F
       RLF     N_hi,F

       CLRC
       RRF     s1,W           ;Roll s1 right one bit: s1 = ((s1>>1) | (s1<<7)) & 0xff
       RRF     s1,F           ;

       BTFSS   s1,7           ;If this is the last time through the loop, we need to make
         goto  L2             ;a tiny exception.

  ;We only get here if this is the last time through the loop. This special exception needs
  ;to handle a 10 bit addition. Right now, the value of s1 that got shifted into W is zero.
  ;It should be 1 >> 1, ie a fraction of 1/2. Or, another way to look at it is the value of
  ;s1 to be subtracted from N is 0x0080. Now, if N_lo is less than 0x80, then the subtraction
  ;will cause a borrow that must be propagated to N_hi, i.e. N_hi must be decremented. Rather
  ;than subtracting 1 from N_hi now, we'll instead add 1 to s2 and do the subtraction above.

       BTFSS   N_lo,7
         MOVLW 1

       goto    L2

;----------------------------------------------------------
;sqrt
;
; The purpose of this routine is take the square root of a 16 bit
;unsigned integer.
;Inputs:  N_hi - High byte of the 16 bit number to be square rooted
;         N_lo - Low byte     "                  "             "
;Outputs: W register returned with the 8 bit result
;
;Memory used:
;  mask
;  30 Instructions
;  104 Cycles best case, 109 average, 122 worst case

sqrt
       MOVLW   0xc0           ;Initialize value for mask
       MOVWF   mask
       MOVLW   0x40           ;Initial value for the root

sq1     SUBWF   N_hi,F         ;Subtract the root developed so far
       SKPC                   ;If it is larger than N_hi
         goto  sq5            ;  then go restore the subtraction

sq2     IORWF   mask,W         ;Set the current bit
sq3     RLF     N_lo,F         ;Shift N left one position
       RLF     N_hi,F
       RRF     mask,F         ;Shift the mask right, and pick up msb of N
       BTFSC   mask,7         ;If msb of N is set, then unconditionally
         goto  sq6            ;set the next bit (because subtracting the
                              ;root is guranteed not to cause a borrow)

       XORWF   mask,W         ;Append "01" to the root developed so far

       SKPC                   ;If the lsb of mask was shifted into the carry
         goto  sq1            ;then we're done. Otherwise, loop again.

 ;We are almost done. In the last iteration, we have 7 bits of the root. When
 ;"01" is appended to it, we will have a 9-bit number that must be subtracted
 ;from N.

       SUBWF   N_hi,F         ;
       SKPC                   ;If the upper 7 bits cause a borrow, then
         RETURN               ;the appended "01" will as well: We're done.

       SKPNZ                  ;If the result of the subtraction is zero
         BTFSC N_lo,7         ;AND the msb of N_lo is set then the LSB of the
                              ;root is zero.
        XORLW  1              ;Otherwise, it is one.

       RETURN                 ;

sq6     SKPNC                  ;Need to unconditionally set the current bit of the root.
         RETURN               ;However, if we're through iterating, then leave. Note,
                              ;C is set by shifting the mask right and the LSB of root
                              ;was set by IORWF at sq2.

       BCF     mask,7         ;Clear the MSB of N that got shifted into the mask.
       XORWF   mask,W         ;Append "01" to the root developed so far.
       SUBWF   N_hi,F         ;Subtract
       goto    sq2            ;Go unconditionally set the current bit.

sq5     ADDWF   N_hi,F         ;Restore N_hi by reversing the SUBWF with a ADDWF
       goto    sq3            ;Don't set the current bit


       END




Forumer™ is Voted #1 Free Forum Hosting provider
Build your own community today with the largest message board hosting company.