Full Version : THE INCREDIBLE SHRINKING OSCCAL ROUTINE


AVR_Admin- 04-13-2006
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

A SMALL DEVIATION: TOWARDS A SUPER-FAST CALIBRATION ROUTINE

Checking the size of the oscillator's "error", rather than using the traditional method of just seeing whether the count falls between an upper and a lower boundary and then incrementing or decrementing the OSCCAL register, might be useful to some.

If you wanted a super-fast calibration, you could measure the size of the error and adjust the OSCCAL register by a proportional amount, rather than in small increments/decrements of one, which is the method most routines seem to use.
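As a rough illustration of adjusting by the size of the error, here is a minimal C sketch. The helper name `osccal_correction` is hypothetical, and both constants are assumptions: 6103 is the ideal count for a 200-tick sample at 1MHz, and 75 counts per OSCCAL step is the approximate figure reported later in this thread, not something measured here.

```c
#include <stdint.h>

/* Assumed figures: the ideal count for a 200-tick sample at 1 MHz, and
 * roughly how many counts one OSCCAL step is worth (about 75 is reported
 * later in this thread; a real part may differ). */
#define IDEAL_COUNT      6103
#define COUNTS_PER_STEP  75

/* Hypothetical helper: signed OSCCAL correction for a measured count,
 * rounded to the nearest whole step. A positive error means the clock
 * runs fast, so the correction is negative. */
int osccal_correction(uint16_t measured)
{
    int error = (int)measured - IDEAL_COUNT;
    int half  = COUNTS_PER_STEP / 2;

    if (error >= 0)
        return -((error + half) / COUNTS_PER_STEP);
    return (-error + half) / COUNTS_PER_STEP;
}
```

One measurement then yields a jump of several steps at once, for example `OSCCAL += osccal_correction(count)`, followed by a normal range check to confirm the result.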

Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle."

Okay, I think I've been a "deviant" long enough, time to get back to the main topic of this thread...the shrinking of the OSCCAL Routine.

AVR_Admin- 04-13-2006
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

I went back and looked at the actual typical values of the Upper and Lower Limits.

Typical Lower-Limit:
CODE

LOWER = IDEALTIME - 50uS
LOWER = 6103 - 50
LOWER = 6053 = $17:A5

Typical Upper-Limit:
CODE

UPPER = IDEAL + 50uS
UPPER = 6103 + 50
UPPER = 6153 = $18:09


Notice that the high bytes are only out by one, and that if I reduced the upper value by 10, then the high byte would drop to 17 and both high bytes would be the same:

My Lower-Limit:
CODE

LOWER = IDEALTIME - 40uS
LOWER = 6103 - 40
LOWER = 6063 = $17:AF

My Upper-Limit:
CODE

UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF


This means that if we use +/- 40uS instead of +/- 50uS we can simplify part of our routine by simply checking if the high byte equals $17. You'll learn in the next post why it is important that I use exactly +/- 40uS.
CODE

TSTOSC: DEC TMP        ;SET-UP TIMERS
       STS  OSCCAL,TMP
       OUT  TIFR1,FF
       OUT  TIFR2,FF
W6103:  SBIS TIFR2,TOV2;WAIT
        RJMP W6103
       LDS  XL,TCNT1L ;READ TIMER
       LDS  XH,TCNT1H
       STS  TCNT1H,ZERO
       STS  TCNT1L,ZERO
       CPI  XH,23 ;<===== CHECK IF HIGH BYTE IS $17
        BRNE TSTOSC
       (INCOMPLETE AS YET)
         RET
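The high-byte observation can be checked on a PC with a few lines of C. This is only a sketch; the function names are made up for illustration, and the ideal count of 6103 is the figure used in the posts above.

```c
#include <stdint.h>

/* Split a 16-bit timer count into the two bytes the AVR reads
 * from TCNT1H and TCNT1L. */
uint8_t high_byte(uint16_t v) { return (uint8_t)(v >> 8); }
uint8_t low_byte(uint16_t v)  { return (uint8_t)(v & 0xFF); }

/* Returns 1 when ideal - margin and ideal + margin share the same high
 * byte, which is what allows a single CPI on XH to cover both limits. */
int limits_share_high_byte(uint16_t ideal, uint16_t margin)
{
    return high_byte((uint16_t)(ideal - margin)) ==
           high_byte((uint16_t)(ideal + margin));
}
```

With an ideal of 6103, a margin of 50 gives limits in two different pages ($17 and $18), while a margin of 40 gives 6063 = $17:AF and 6143 = $17:FF, both in page $17.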

AVR_Admin- 04-13-2006
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Now that we've "taken-care-of" the high byte, all we need to do is check whether the lower byte is above or below our selected range. That amounts to just two single-byte compares.

However, if we can "set-things-up" so that our "range" begins or ends at ZERO or 255 then we need only check in "one direction." That means a single compare statement instead of two.

You might have missed it first-time-around so look again at the Lower-Byte value of the Upper-Limit in hex notation, it's $FF:

My Upper-Limit:
CODE

UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF


I chose to use +40uS so that the upper end of the range for our lower-byte would fall on the byte's upper "boundary" at $FF=255. This means that all we need to do is compare our lower-byte against our lower-boundary value, and if we fall below it then our oscillator is out-of-range.

Let me expand on this "trick" in case some readers don't get it, because it is a little hard to follow...

Remember earlier I said that single unsigned bytes are actually like little circles and not little rulers. Well, suppose we compare our timer against our range, and the range just happens to end at $FF=255. If we go over 255, the low byte "wraps-around" to ZERO and now we're actually UNDER, so that counts as a failure. (The carry also bumps the high byte to $18, so the high-byte check rejects the reading as well.)

Also, if we compare against our range and we actually do fall under, then that's a failure too. So we've magically combined two tests, one for over and another for under, into a single test for being under, because the over will "wrap" to being under.

So our final program looks like this:
CODE

TSTOSC: DEC TMP        ;SET-UP TIMERS
       STS  OSCCAL,TMP
       OUT  TIFR1,FF
       OUT  TIFR2,FF
W6103:  SBIS TIFR2,TOV2;WAIT
        RJMP W6103
       LDS  XL,TCNT1L ;READ TIMER
       LDS  XH,TCNT1H
       STS  TCNT1H,ZERO
       STS  TCNT1L,ZERO
       CPI  XH,23 ;<===== CHECK IF HIGH BYTE IS $17
        BRNE TSTOSC
       CPI XL,175;<===== CHECK IF UNDER $AF
        BRLO TSTOSC;<=== ALSO SNEAKY TEST IF OVER $FF
         RET
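To convince yourself that the two compares above really are equivalent to a full two-sided range check, the logic can be modelled in C and run over every possible 16-bit count. This is only a PC-side sketch with invented function names; the constants 23 and 175 are the ones used in the routine, and 6063 to 6143 is the +/-40uS window around 6103.

```c
#include <stdint.h>

/* Conventional check: is the count inside 6103 +/- 40? */
int in_window_two_sided(uint16_t count)
{
    return count >= 6063 && count <= 6143;
}

/* Model of the assembly: one equality test on the high byte and one
 * unsigned "lower than" test on the low byte. */
int in_window_asm_style(uint16_t count)
{
    uint8_t xh = (uint8_t)(count >> 8);
    uint8_t xl = (uint8_t)(count & 0xFF);

    if (xh != 23)       /* CPI XH,23  / BRNE TSTOSC */
        return 0;
    if (xl < 175)       /* CPI XL,175 / BRLO TSTOSC */
        return 0;
    return 1;
}

/* Count how many of the 65536 possible readings the two checks
 * disagree on. */
unsigned long count_mismatches(void)
{
    unsigned long mismatches = 0;
    for (uint32_t c = 0; c <= 0xFFFF; c++) {
        if (in_window_two_sided((uint16_t)c) != in_window_asm_style((uint16_t)c))
            mismatches++;
    }
    return mismatches;
}
```

The trick only works because the upper limit sits exactly on a $FF boundary; with any other upper limit a second low-byte compare would be needed.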

AVR_Admin- 04-13-2006
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

So after we add a few lines of set-up the final routine looks like the below, which should be fairly close to the routine I posted in Giorgos' thread:

CODE

       LDS TMP,OSCCAL
TSTOSC: DEC TMP        ;SET-UP TIMERS
       STS  OSCCAL,TMP
       OUT  TIFR1,FF
       OUT  TIFR2,FF
       STS TCNT1H,ZERO
       STS TCNT1L,ZERO
       STS TCCR1B,ONE
       STS TCNT2,ZERO
W6103:  SBIS TIFR2,1  ;WAIT
        RJMP W6103
       LDS  XL,TCNT1L;READ TIMER
       LDS  XH,TCNT1H
       CPI  XH,23;<=== HIGH BYTE = $17?
        BRNE TSTOSC
       CPI XL,175;<=== LOW BETWEEN $AF-$FF?
        BRLO TSTOSC
         RET

IN CONCLUSION:

Well, I hope I've answered all your questions about my condensed OSCCAL Routine: how it works, why it works, and how it got to its present form.

I certainly hope you found the trip entertaining as well as informative.

I've used this routine now for hundreds of uploads to Butterflies and have not experienced a single problem. There have been hundreds of downloads of my Bootloaders which use this OSCCAL Routine, and I have yet to receive any reports of problems.


CONTINUING EDUCATION:

To learn more about the AVR Butterfly in general, you can visit the Butterfly & Beginner's Web Site at: http://retrodan.tripod.com/


REQUEST FOR FEEDBACK:

If you found this tutorial discussion interesting and/or entertaining and would like to see more like it, please let the moderator(s) know.

Thank you for your time and consideration.

ADDENDUM:

Here is the [CRICKET] program; the Butterfly Replacement Bootloader. Time between "Chirps" is time for OSCCAL to be set, then waits about 10 secs for an upload then a final "Chirp" before bed-time. Try it, you'll love it!

AVR_Admin- 04-13-2006
One in a thousand...Hmm, not too bad, however, the real probability is much, much lower than this.

QUOTE ("koshchi")
The probability of this is, in fact, 0 (assuming that the crystal attached to the TOSC lines is really 32.768kHz and the cpu clock prescaler is set to 8 ).

Consider this. 6103 comes from this:

N = (F_CPU / 32.768kHz) * 200

Where the nominal cpu frequency is 1MHz, and 200 is the number of ticks of the 32.768kHz crystal that we wait for in the sample. To find the actual cpu frequency we turn it around:

F_CPU = N * 32.768kHz / 200

If timer 1 overflowed then the count value would be at least 65536. Putting that into the equation we find that the actual F_CPU would have to be almost 11MHz. Since OSCCAL at maximum can achieve at most twice the nominal frequency, this will be impossible. And if you forgot to set the clock prescaler to 8 it may have gotten this high with OSCCAL at its highest. But even with the lowest OSCCAL, it would still result in a count of at least 32768, so it would not be possible to adjust it to within range. So if timer1 overflows, this is necessarily an error condition.

Both your routine and Atmel's original code are not fault tolerant. If any condition occurs that doesn't allow the count to get into range, it will loop forever (which in real time is a significant amount ). You've got to be careful about the acceptable range you use. If you get too small, you might not be able to get within that range. Calculating it I find that an adjustment of 1 to OSCCAL results in a change of about 18 in the count. So your range of 80 should work without a problem, but making it any narrower or lowering the number of ticks per sample could be dangerous.
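The two formulas in the quote are easy to play with in C. The sketch below is only a restatement of that arithmetic with hypothetical function names; it assumes a 32768Hz crystal and uses 64-bit intermediates so the multiplication cannot overflow.

```c
#include <stdint.h>

#define CRYSTAL_HZ 32768UL   /* watch crystal on the TOSC pins (assumed) */

/* F_CPU = N * 32768 / ticks: the CPU frequency implied by a timer1 count
 * taken over a given number of crystal ticks. */
uint32_t cpu_hz_from_count(uint32_t count, uint32_t ticks)
{
    return (uint32_t)(((uint64_t)count * CRYSTAL_HZ) / ticks);
}

/* N = F_CPU * ticks / 32768: the count expected for a given CPU frequency. */
uint32_t count_from_cpu_hz(uint32_t cpu_hz, uint32_t ticks)
{
    return (uint32_t)(((uint64_t)cpu_hz * ticks) / CRYSTAL_HZ);
}
```

For a 200-tick sample, 1MHz gives a count of 6103, and a count of 65536 (timer1 just overflowing) corresponds to roughly 10.7MHz, which is the "almost 11MHz" figure in the quote.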



The one with it missing was slightly slower, but without the audible clues, no one would ever notice. So on a human scale there's not much difference.

QUOTE ("koshchi")
If the OSCCAL setting is within range on the first pass, then the wait time is virtually zero. But if the setting is just out of range on the low side, the delay will be 1.5 secs. Whether or not this is acceptable would depend on the circumstances. If the calibration is being done upon waking from sleep in order to service a signal on the UART, this could be entirely unacceptable. Your suggestion of adjusting according to the size of the error is a good one for this type of thing. You could also use successive approximation.



Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle."

QUOTE ("koshchi")
This is not a good thing since if OSCCAL happens to be set so that it is within range in the first couple of passes, then the oscillator will drift out of acceptable range after the routine is run. You must let the oscillator settle before you run the calibration routine.

AVR_Admin- 04-13-2006
QUOTE ("Koshchi")
Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle." 

This is not a good thing since if OSCCAL happens to be set so that it is within range in the first couple of passes, then the oscillator will drift out of acceptable range after the routine is run. You must let the oscillator settle before you run the calibration routine.


Excellent point Steve, you are 100% correct of course. One of the pit-falls of listing code fragments is that people have no idea what is going on before or after the routine is called. Typically there is a wait period before the OSCCAL routine is called so we are not relying on this one "delay" as our only way to allow the Oscillator to settle after power-up. The point I was trying to make is that the "extra" time spent trying to calibrate the oscillator is not a bad thing and might actually be beneficial for this particular application.

Thank you ever-so-much for verifying my "gut feeling" about the timer over-flow. I knew the probability of a "false positive" would be infinitely small, but I'm over-joyed to hear it's actually Zero. How did you get so smart?

If the objective was not to optimize for minimal size, perhaps slowly expanding the tolerance range might be one way to prevent an infinite loop, rather than a direct time-out. A less-than-perfect calibration might be better than something that is way off. I use +/- 40uS but I think it should work for up to +/-100uS as well. I haven't done the math; has anyone worked backwards from BAUD Rate tolerances to the OSCCAL tolerances for 19,200 to get some exact figures?
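This does not answer the baud-rate question, but as a starting point here is a small C sketch that converts between a count margin and a percentage clock error, assuming the ideal count of 6103. The function names are hypothetical, and the acceptable percentage for a given UART setup is something you would have to take from the datasheet; it is not supplied here.

```c
/* Ideal timer1 count for a 200-tick sample at 1 MHz (assumed). */
#define IDEAL_COUNT 6103.0

/* Clock error, in percent, represented by a margin of +/- N counts. */
double margin_to_percent(double margin_counts)
{
    return 100.0 * margin_counts / IDEAL_COUNT;
}

/* Count margin that corresponds to a given percentage clock error. */
double percent_to_margin(double percent)
{
    return IDEAL_COUNT * percent / 100.0;
}
```

By this arithmetic +/-40 counts is about 0.66% and +/-100 counts is about 1.64%.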

Thank you very much for the excellent feed-back in this, and many of my other threads.

Have a great day, Steve!

AVR_Admin- 04-13-2006
QUOTE ("koshchi")
I got to thinking about successive approximation. Here's the algorithm:
Code:
Mask = 0b10000000; // start with the highest bit
Value = Mask;
for(8 times)  // for 8 bits of accuracy
{
  OSCCAL = Value;
  Wait for timeout;
  if (CountIsTooHigh)
  {   
      //OSCCAL needs to be lower, so remove the bit
      Value = Value - Mask;
  }
  Mask >>= 1;  //shift the mask one bit
  Value = Value | Mask; //add the bit into the value
}

This is pretty simple. You only need to compare the actual count with the desired count. No need to test against a range since we are guaranteed to be accurate to the nearest value. This gets the count to within around 20, which is even better than Dan's routine. It is also much quicker. The average time for Dan's is around 0.75 sec. For this it is about 0.05 sec. Furthermore, the time is constant (within microseconds). It also has the advantage that it is more fault tolerant. The only way it could get stuck is if there is no crystal at all on the TOSC pins.
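To see the successive approximation idea run without hardware, here is a small C simulation. Everything about the oscillator is an assumption made for illustration: `model_measure` is a made-up linear model (a base count of 3000 plus 75 counts per OSCCAL step), not real chip behaviour. The search clears the trial bit when the count comes out too high, which matches the BRLT branch in the assembly version.

```c
#include <stdint.h>

/* Hypothetical oscillator model: the count rises linearly with OSCCAL.
 * Both figures are invented for the simulation. */
#define MODEL_BASE_COUNT       3000
#define MODEL_COUNTS_PER_STEP  75

uint16_t model_measure(uint8_t osccal)
{
    return (uint16_t)(MODEL_BASE_COUNT + MODEL_COUNTS_PER_STEP * osccal);
}

/* Successive approximation: try each bit from top_bit downwards and keep
 * it only if the resulting count does not exceed the target. */
uint8_t sar_calibrate(uint16_t target, uint8_t top_bit)
{
    uint8_t value = 0;

    for (uint8_t mask = top_bit; mask != 0; mask >>= 1) {
        value |= mask;                        /* try this bit */
        if (model_measure(value) > target)    /* clock too fast */
            value &= (uint8_t)~mask;          /* remove the bit again */
    }
    return value;
}
```

Calling `sar_calibrate(6103, 0x40)` performs exactly seven measurements and settles on the largest setting whose count does not exceed the target, so the run time is the same regardless of the starting point.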

Using Dan's routine as a base here is my solution (CAVEAT: I have not had the time to check it yet, but I believe it is correct):
Code:
      CLR  TMP
      LDI  MASK, 0X40
TSTOSC: OR  TMP, MASK
      STS  OSCCAL, TMP
      OUT  TIFR1, FF
      OUT  TIFR2, FF
      STS TCNT1H, ZERO
      STS TCNT1L, ZERO
      STS TCCR1B, ONE
      STS TCNT2, ZERO
W6103: SBIS TIFR2,1    ;WAIT
      RJMP W6103
      LDS  XL, TCNT1L
      LDS  XH, TCNT1H
      CPI  XH, 26
      BRLT NEXT
      SUB  TMP, MASK
NEXT: LSR  MASK
      BRCC TSTOSC
      RET

You'll notice that I changed the check of the high byte of TCNT1 to 26. What I did was increase the ticks per sample to 218 instead of 200. This makes the target count 6652, which is just 4 short of 0x1a00 (6656), which is close enough as to not matter. This eliminates the need for checking the low byte. Also I am ending the loop by seeing if the bit has been shifted into the carry, eliminating the need for a counter. This routine is only one line longer than Dan's in the loop.
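The choice of 218 ticks can be checked with a couple of lines of C. This is a sketch with hypothetical function names; it assumes a 1MHz CPU clock and a 32768Hz crystal, and it truncates the expected count to a whole number.

```c
#include <stdint.h>

/* Expected timer1 count (1 MHz CPU clock) after a given number of
 * 32768 Hz crystal ticks, truncated to a whole count. */
uint32_t expected_count(uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 1000000u) / 32768u);
}

/* Distance from a count up to the next multiple of 256, i.e. the next
 * point where the high byte changes. */
uint32_t gap_to_next_page(uint32_t count)
{
    return 256u - (count % 256u);
}
```

With 218 ticks the expected count is 6652, only 4 below $1A00, so comparing the high byte against 26 is nearly the same as comparing the full count against the target. With 200 ticks the count of 6103 sits 41 below the next boundary.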

Edit: Corrected code
Changed BRGE NEXT to BRLT NEXT.
Changed LDI MASK, 0x80 to LDI MASK, 0x40

AVR_Admin- 04-13-2006
You bring up a few excellent points Steve.

I have a confession to make: I'm actually even more of a Hack than I have already told you. However I'm a little gun-shy about posting my stuff because people seem so eager to tear-it-apart even though it works.

The ACTUAL routine that I use at HOME is only 15 lines long. I know there'll be questions about it, so I might as well just post it now and get-it-over-with. I was going to post in the future as an update.

So far I have not had a single failed connection; the tolerances are such that "Mister 19200 BAUD" never has a problem with it.

Here's the actual Code I'm using:

CODE

;------------------------------------;
; RETRO DAN's 15 LINE OSCCAL ROUTINE;
;------------------------------------;
TSTOSC:  DEC    TMP       ;ADJUST OSCCAL  
        STS    OSCCAL,TMP
        OUT TIFR1,FF   ;RESET
        OUT TIFR2,FF   ;OSC COUNTER
        STS TCNT1H,ZERO;START IT
        STS TCNT1L,ZERO;FROM ZERO
        STS TCCR1B,ONE ;READY...GO!
        STS    TCNT2,ZERO;CLEAR TIMER
W6103:   SBIS TIFR2,1   ;CHECK TIMER
         RJMP W6103    ;6103uS PASSED?
        LDS    XL,TCNT1L ;READ COUNTER
        LDS XH,TCNT1H  ;CHECK ACCURACY
        CPI    XH,23     ;CHECK HIGH BYTE
         BRNE  TSTOSC    ;OUT-OF-RANGE
          RET


Wildman that I am, what I've done to save 2 program steps (4 bytes) is to remove any check on the lower-byte. Yeeha!

You hinted at a similar short-cut in your last post. As long as the "ball-is-in-the-park" it doesn't have a problem putting it into play! It calibrates extremely quickly and I've never experienced a failed connection.

If I'd offered Giorgos this version, he'd probably have accused me of trying to sabotage his work. Ha ha ha!

AVR_Admin- 04-13-2006
QUOTE ("koshchi")

And throw your accuracy out the window.

With successive approximation, you are guaranteed to get better accuracy with every bit, which is why I always do all 8 iterations and don't try to bail early. My routine will always result in a count within about 20 of 6656, making the accuracy better than 0.3%.

Your routine needs a range. With your routine only checking the high byte, your range is now 5888 to 6143. At 5888 you are now 215 off of the target count of 6103. That makes the clock 3.5% low.

AVR_Admin- 04-13-2006
That's why I only use it at home, Steve; it's nowhere near as accurate, and I would not have offered this version to Giorgos.

However, if we are to be true to our initial goal of reducing size - all other considerations such as accuracy go out the window as long as the routine continues to perform as good or better than the original. This one hasn't failed yet.

Don't get me wrong however, I highly value your input, and another version of OSCCAL that can calibrate to a much higher accuracy in a fraction of the time must have its uses as well.

As proof that I value your routine, I actually tested it. I remembered that the branch after the test needed to be pointed to the top of the routine. However, I must have done something else wrong because it failed on me. I set OCR2A=218 (not 200). Is there something else I overlooked? I'd love to see this routine work. It's still much smaller than the original 30-plus line version and should be super-fast and super-accurate.

AVR_Admin- 04-13-2006
TITLE: COMPLETE SUPER-CONDENSED OSCCAL ROUTINE IN ASM

For those of you that want to put your programs on a diet, here is the complete usable routine including set-up. It's about 24 lines. Most OSCCALs I've seen run 50 to 60 lines. If you decide to use it, please remember to mention me.

CODE

;------------------------------------------;
; CALIBRATE INTERNAL RC OSCILLATOR: OSCCAL;
;------------------------------------------;
RETRO_CAL: STS  CLKPR,V128 ;DROP CLOCK TO 1MHz
          STS  CLKPR,THREE;
          STS  TIMSK1,ZERO;SETUP TIMER1
          STS  TCCR1B,ONE ;TC1 = 1MHz = 1uS
          STS  TIMSK2,ZERO;SETUP TIMER2
          STS  ASSR,EIGHT ;CLOCK FROM 32KHz OSCILLATOR
          LDI  TMP1,200   ;OCR2A = 200
          STS  OCR2A,TMP1 ;200 / 32768Hz ~= 6103uS
          STS  TCCR2A,ONE ;TC2 = 32768Hz ~= 30.5176uS
          LDS  TMP,OSCCAL ;READ OSCCAL
TSTOSC:    STS  OSCCAL,TMP ;WRITE OSCCAL
          DEC  TMP        ;ADJUST OSCCAL
          OUT  TIFR2,FF   ;RESET FLAG
          STS  TCNT1H,ZERO;STARTEM @ZERO
          STS  TCNT1L,ZERO;
          STS  TCNT2,ZERO ;
          STS  TCCR1B,ONE ;READY...GO!
W6103:     SBIS TIFR2,1    ;CHECK TIMER
           RJMP W6103     ;6103uS PASSED?
          LDS  XL,TCNT1L  ;READ COUNTER
          LDS  XH,TCNT1H  ;
          CPI  XH,23      ;CHECK HIGH BYTE
           BRNE TSTOSC    ;WAY OUT-OF-RANGE
            RET           ;ITS MILLER TIME!

For the less adventurous among you that would like to maintain that nice "tight" +/-40uS calibration, simply add these two lines between the BRNE TSTOSC and final RET.

CODE

        CPI XL,175    ;CHECK LOW BYTE
         BRLO TSTOSC  ;OUT-OF-RANGE


If nothing else, perhaps this routine can form the basis for your very own OSCCAL routine, like Steve's super-fast, super-accurate version. Happy Programming!

AVR_Admin- 04-13-2006
QUOTE ("koshchi")
I have corrected an error in my successive approximation code above. The line:

BRGE NEXT

has been changed to:

BRLT NEXT

Also, in looking at the datasheet, I noticed that OSCCAL is only 7 bits, not 8, so I could change my initial mask value to 0x40 instead of 0x80.

A couple things I noticed in testing the code.

First, it is possible that the second to last approximation could actually be closer to the target value than the last approximation, so the value could be one off. However, the maximum error will still be about 1.1%.

Second, the count value between two successive OSCCAL values is much greater than I had expected. For my routine (with 218 ticks) it is about 80. For 200 ticks it's about 75. So Dan, narrowing your range to 80 is getting pretty dangerous.

Both of these things affect my accuracy. It is about four times what I had calculated. I'm still thinking about how to get my extra bit back. If I can I'll get half my accuracy back.

One more thing, and this applies to both our routines. The line:

STS TCCR1B,ONE

to enable timer1 doesn't need to be there in the loop since we are never disabling it. This line can be moved to before the loop.

AVR_Admin- 04-13-2006
This line can go also:


OUT TIFR1,FF ;RESET



Thereby saving another program step.

AVR_Admin- 04-13-2006
QUOTE ("C_oflynn")
Hey,

Wow - that's an in depth tutorial! You weren't kidding. . .

Again thanks for your tutorial, it is a great asset!. . .

Warm Regards,

-Colin O'Flynn

AVR_Admin- 04-13-2006
QUOTE ("koshchi")
I was looking at the C code and noticed a couple of things. After setting the registers for timer2, the code waits for the update of the registers to be complete (with timer2 in asynchronous mode the update is not immediate). However when timer2 is cleared within the loop, this wait is not done. I wondered if this would make a difference in the count, so I tested it. I am getting values that are consistently lower by about 32 when I wait for the update busy flag to clear than when I don't. (By the way, I moved the clearing of timer2 to just above the clearing of timer1 so that timer1 is not busy counting while I'm waiting for timer2.) I also noticed that the C code is centering on a count value of 6185 instead of 6103. This may have been done to compensate for the low readings that I observed. They may have even used an oscilloscope to calibrate the calibration routine. I don't have a scope. Maybe someone with a scope can check and see what actual frequency we are getting with our routines.

