26 Jul 2021 - tsp
Last update 26 Jul 2021
The following blog post describes the implementation and inner workings of a simple
module for AVR microcontrollers that keeps track of elapsed time and provides
simple implementations of functions such as delay() and micros() that
are commonly known from various libraries and runtimes (such as the C standard library,
Arduino, etc.). It shows how one can implement such functions using
timer 0 of the AVR, together with the simple equations that are required to keep track
of time.
In my module timer 0 is configured to run off the system frequency F_CPU,
for example $16 MHz$ or $16000000 Hz$. Since the period is inversely proportional
to the frequency, one can directly calculate the duration of a single clock cycle:
\[ \Delta t_{cpu} = \frac{1}{f_{cpu}} = \frac{1}{16 MHz} = 62.5 ns \]
As one can see, each clock pulse takes $62.5$ nanoseconds. This would be way too fast for an ISR to trigger (the ISR would only have a single instruction to finish), so one applies a prescaler. The larger the prescaler, the lower the resolution of the clock, but the less time is wasted on the timekeeping task. The period of a single timer tick scales linearly with the prescaler - a prescaler of 64, for example, yields a period of $\Delta t = 62.5 ns * 64 = 4000 ns$ and thus $\Delta t = 4 \mu s$. An ISR triggered on every tick would have only 64 clock cycles to handle its task before the next interrupt is triggered. This would still be way too much work for the microcontroller - and even if it were possible, a huge proportion of the available time would be spent on timekeeping without providing any more useful functionality.
\[ \Delta t_{tick} = \frac{1}{f_{cpu}} * n_{prescaler} \\ \Delta t_{tick} = \Delta t_{cpu} * n_{prescaler} \\ \Delta t_{tick} = 62.5 ns * 64 = 4000 ns = 4 \mu s \]
To divide the clock even further I’m only using the overflow interrupt of timer 0. Since timer 0 is an 8 bit timer it overflows on every 256th tick - the overflow interrupt is thus only triggered once per
\[ \Delta t_{overflow} = \frac{1}{f_{cpu}} * n_{prescaler} * 256 \\ \Delta t_{overflow} = 4 \mu s * 256 = 1024 \mu s \]
A period of $1024$ microseconds sounds way less problematic - this is a single overflow
interrupt every 16384 clock cycles. This is the first definition that’s calculated
in sysclk.h:
#define SYSCLK_TIMER_OVERFLOW_MICROS (64L * ((256L * 1000000L) / F_CPU))
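To see what this definition evaluates to for different clocks, the following host-side sketch mirrors the macro in a small function. The names `timer_overflow_micros` and `timer_overflow_micros_exact` are hypothetical helpers that are not part of the module; the second one uses 64 bit arithmetic only to show how much the integer division inside the macro truncates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: same bracket order as SYSCLK_TIMER_OVERFLOW_MICROS,
   but with clock and prescaler passed as arguments. */
static uint32_t timer_overflow_micros(uint32_t f_cpu_hz, uint32_t prescaler) {
    assert(f_cpu_hz > 0);
    /* 256 * 1000000 still fits into 32 bits; multiplying by the prescaler
       first (64 * 256 * 1000000) would not. */
    return prescaler * ((UINT32_C(256) * UINT32_C(1000000)) / f_cpu_hz);
}

/* Hypothetical reference: multiply first in 64 bits, divide last. */
static uint32_t timer_overflow_micros_exact(uint32_t f_cpu_hz, uint32_t prescaler) {
    assert(f_cpu_hz > 0);
    return (uint32_t)(((uint64_t)prescaler * 256u * 1000000u) / f_cpu_hz);
}
```

For 16 MHz and 8 MHz both variants agree (1024 and 2048 microseconds). For 20 MHz the macro style calculation yields 768 while the exact period is about 819 microseconds, so the bracket order trades precision for overflow safety on clocks that do not divide 256000000 evenly.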
The brackets are placed so that no intermediate result overflows during the calculation: multiplying 64 * 256 * 1000000 first would exceed the range of a 32 bit long, while the integer division only truncates the fractional part. From this value one can calculate the milliseconds per timer overflow, rounded down to the nearest integer millisecond:
#define SYSCLK_MILLI_INCREMENT (SYSCLK_TIMER_OVERFLOW_MICROS / 1000L)
To reduce the drift due to rounding errors and to allow a micros() function
with sub millisecond resolution later on, the application also keeps track
of the elapsed microseconds. This is done by adding the remaining microseconds
whenever the timer overflows. As soon as the microseconds sum up to a full millisecond
the millisecond counter advances by one more. Unfortunately the remainder of a
division by $1000$ would require 10 bits of storage. To still fit into a single
byte and allow useful addition, only the most significant 7 bits of the remainder
are stored. The threshold for a full millisecond thus has to be shifted
by the same number of bits (shifted right by 3 bits, i.e. divided by 8):
#define SYSCLK_MILLIFRACT_INCREMENT ((SYSCLK_TIMER_OVERFLOW_MICROS % 1000L) >> 3)
#define SYSCLK_MILLIFRACT_MAXIMUM (1000 >> 3)
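The split into whole milliseconds and a fractional part in units of 8 microseconds can be checked on a host machine with the sketch below. `struct overflow_step`, `split_overflow` and `rejoin_overflow` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FRACT_MAXIMUM (1000 >> 3)  /* 125 units of 8 us make up one millisecond */

/* Hypothetical container for the two per-overflow increments. */
struct overflow_step {
    uint32_t milli_increment;  /* whole milliseconds per overflow */
    uint8_t  fract_increment;  /* remainder in units of 8 us, always below 125 */
};

static struct overflow_step split_overflow(uint32_t overflow_micros) {
    struct overflow_step s;
    s.milli_increment = overflow_micros / 1000u;
    s.fract_increment = (uint8_t)((overflow_micros % 1000u) >> 3);
    return s;
}

/* Reverse operation, used to check how much the shift discarded. */
static uint32_t rejoin_overflow(struct overflow_step s) {
    return s.milli_increment * 1000u + ((uint32_t)s.fract_increment << 3);
}
```

Because the overflow period is calculated as 64 times an integer and 1000 is divisible by 8, the remainder is always a multiple of 8. With the /64 prescaler the shift therefore discards nothing: 1024 microseconds become 1 millisecond plus 3 fractional units, and 3 * 8 = 24 microseconds is exactly the remainder.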
The ISR that’s then called on every timer overflow performs a pretty simple job:
- The millisecond counter is advanced by the whole milliseconds contained in SYSCLK_TIMER_OVERFLOW_MICROS (that is, by SYSCLK_MILLI_INCREMENT).
- The fractional counter is advanced by SYSCLK_MILLIFRACT_INCREMENT.
- Whenever the fractional counter reaches SYSCLK_MILLIFRACT_MAXIMUM the internal millisecond counter will be incremented by one and the used microseconds will be removed from the internal microsecond counter.
- systemMonotonicOverflowCnt will be incremented on every timer invocation.

volatile unsigned long int systemMillis = 0;
volatile unsigned long int systemMilliFractional = 0;
volatile unsigned long int systemMonotonicOverflowCnt = 0;
ISR(TIMER0_OVF_vect) {
    /* Work on local copies so each volatile counter is read and written only once */
    unsigned long int m, f;
    m = systemMillis;
    f = systemMilliFractional;
    m = m + SYSCLK_MILLI_INCREMENT;
    f = f + SYSCLK_MILLIFRACT_INCREMENT;
    /* Carry a full millisecond out of the fractional counter */
    if(f >= SYSCLK_MILLIFRACT_MAXIMUM) {
        f = f - SYSCLK_MILLIFRACT_MAXIMUM;
        m = m + 1;
    }
    systemMonotonicOverflowCnt = systemMonotonicOverflowCnt + 1;
    systemMillis = m;
    systemMilliFractional = f;
}
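The bookkeeping of the ISR can be tried out without any hardware. The following sketch is a host-side simulation that assumes a 16 MHz clock (an overflow every 1024 microseconds); `sim_timer_overflow` and `millis_after_overflows` are hypothetical names and the constants are local copies of the module's macros.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the module's constants, assuming F_CPU = 16 MHz */
#define OVERFLOW_MICROS  UINT32_C(1024)
#define MILLI_INCREMENT  (OVERFLOW_MICROS / 1000u)         /* 1 */
#define FRACT_INCREMENT  ((OVERFLOW_MICROS % 1000u) >> 3)  /* 3 */
#define FRACT_MAXIMUM    (1000u >> 3)                      /* 125 */

static uint32_t sim_millis;
static uint32_t sim_fract;

/* Same arithmetic as the overflow ISR, applied to plain variables. */
static void sim_timer_overflow(void) {
    uint32_t m = sim_millis + MILLI_INCREMENT;
    uint32_t f = sim_fract + FRACT_INCREMENT;
    if (f >= FRACT_MAXIMUM) {
        f -= FRACT_MAXIMUM;
        m += 1;
    }
    sim_millis = m;
    sim_fract = f;
}

/* Resets the simulated counters and applies the given number of overflows. */
static uint32_t millis_after_overflows(uint32_t overflows) {
    sim_millis = 0;
    sim_fract = 0;
    for (uint32_t i = 0; i < overflows; i++) {
        sim_timer_overflow();
    }
    return sim_millis;
}
```

After 1000 simulated overflows (1.024 seconds of real time) the counter reads 1024 milliseconds, so the fractional carry keeps the millisecond counter from drifting. The first carry happens at the 42nd overflow, where the counter jumps from 41 to 43.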
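The initialization is pretty simple:
- Set TCCR0A to 0x00 (normal counting mode)
- Set TCCR0B to 0x03 (clock select for the /64 prescaler)
- Set TOIE0 in TIMSK0 (enable the overflow interrupt)
- Clear PRTIM0 in PRR (make sure timer 0 is not switched off by power reduction)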
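void systickInit() {
    uint8_t sregOld = SREG;      /* Remember the current interrupt state */
    cli();
    TCCR0A = 0x00;
    TCCR0B = 0x03;               /* /64 prescaler */
    TIMSK0 = 0x01;               /* Enable overflow interrupt */
    PRR = PRR & (~0x20);         /* Clear PRTIM0 */
    SREG = sregOld;              /* Restore the previous interrupt state */
}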
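The magic numbers in the initialization can be derived from bit names. The sketch below is host-compilable and therefore redefines the bit positions with a hypothetical `SK_` prefix; the positions are the ones given in the ATmega328P datasheet, and on the target `avr/io.h` provides them as CS00, CS01, CS02, TOIE0 and PRTIM0. The helper function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions per the ATmega328P datasheet (SK_ prefix is hypothetical) */
enum { SK_CS00 = 0, SK_CS01 = 1, SK_CS02 = 2, SK_TOIE0 = 0, SK_PRTIM0 = 5 };

/* Clock select value for TCCR0B; 0 (timer stopped) for unsupported prescalers. */
static uint8_t tccr0b_for_prescaler(uint16_t prescaler) {
    switch (prescaler) {
        case 1:    return (uint8_t)(1u << SK_CS00);
        case 8:    return (uint8_t)(1u << SK_CS01);
        case 64:   return (uint8_t)((1u << SK_CS01) | (1u << SK_CS00));
        case 256:  return (uint8_t)(1u << SK_CS02);
        case 1024: return (uint8_t)((1u << SK_CS02) | (1u << SK_CS00));
        default:   return 0;
    }
}

/* TIMSK0 value that enables only the overflow interrupt. */
static uint8_t timsk0_overflow_enable(void) {
    return (uint8_t)(1u << SK_TOIE0);
}

/* Returns the PRR value with the timer 0 power reduction bit cleared. */
static uint8_t prr_with_timer0_enabled(uint8_t prr) {
    return (uint8_t)(prr & ~(1u << SK_PRTIM0));
}
```

With these helpers the values used in systickInit fall out directly: the /64 prescaler gives 0x03, the overflow interrupt enable gives 0x01, and clearing PRTIM0 corresponds to masking with ~0x20.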
The millis function just returns the current time in milliseconds modulo
an implementation specific word size. Since we’re already counting milliseconds in the systemMillis
variable this is pretty easy - interrupts are disabled during the read (and restored
to their previous state afterwards) to prevent partial reads of the multi byte variable:
unsigned long int millis() {
    unsigned long int m;
    uint8_t srOld = SREG;
    cli();
    m = systemMillis;   /* Four byte read, must not be interrupted by the ISR */
    SREG = srOld;
    return m;
}
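Why the read has to be protected can be illustrated with a small model. An 8 bit CPU reads a 32 bit variable byte by byte; if the overflow ISR updates the variable in between, the result mixes old and new bytes. The function `torn_read` below is a hypothetical host-side model of that situation, not code that runs on the AVR.

```c
#include <assert.h>
#include <stdint.h>

/* Models a byte-wise 32 bit read that is interrupted after
   bytes_read_before_irq bytes (lowest byte first). Bytes read before the
   interrupt come from the old value, the rest from the updated value. */
static uint32_t torn_read(uint32_t before, uint32_t after,
                          unsigned bytes_read_before_irq) {
    assert(bytes_read_before_irq <= 4);
    uint32_t result = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint32_t source = (i < bytes_read_before_irq) ? before : after;
        result |= source & (UINT32_C(0xFF) << (8 * i));
    }
    return result;
}
```

If the counter advances from 0x000000FF to 0x00000100 after the first byte has been read, the caller sees 0x000001FF, a value the counter never held and that is almost twice the real one. Disabling interrupts around the read rules this out.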
The micros() function, which should deliver the current time in microseconds
modulo an implementation specific word size, is a little bit more challenging.
To implement that function I’m counting the number of overflows that have
occurred (one on every 256th timer tick). In addition one can use the current value
of the TCNT0 register - i.e. the number of timer ticks elapsed since the last overflow.
Using the duration of a single tick in microseconds one can now calculate the elapsed time:
\[ n_{tickstotal} = n_{TCNT0} + n_{overflow} * 256 \\ t_{\mu s} = n_{tickstotal} * \Delta t_{tick} \\ t_{\mu s} = (n_{TCNT0} + n_{overflow} * 256) * \frac{n_{prescaler} * 1000000}{f_{cpu}} \]
There is one drawback though: the timer may have overflown just before the registers
are read, so that the overflow interrupt has not been handled yet and the overflow
counter is still one short. To handle that situation one checks whether the timer has a pending
overflow by testing the TOV0 bit in the TIFR0 register:
unsigned long int micros() {
    uint8_t srOld = SREG;
    unsigned long int overflowCounter;
    unsigned long int timerCounter;
    cli();
    overflowCounter = systemMonotonicOverflowCnt;
    timerCounter = TCNT0;
    /* Pending overflow (TOV0 set) that the ISR has not counted yet */
    if(((TIFR0 & 0x01) != 0) && (timerCounter < 255)) {
        overflowCounter = overflowCounter + 1;
    }
    SREG = srOld;
    /* Total ticks times microseconds per tick */
    return ((overflowCounter << 8) + timerCounter) * (64L / (F_CPU / 1000000L));
}
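The calculation inside micros() can be separated from the hardware registers to make it testable. In the sketch below `micros_from_state` is a hypothetical pure function that takes the register values as arguments; the supported clock range in the assertion is an assumption that keeps the integer divisions from producing zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pure version of the micros() arithmetic.
   overflow_count: value of the overflow counter
   tcnt0:          value read from TCNT0
   tov0_pending:   non-zero if the TOV0 flag was set while reading */
static uint32_t micros_from_state(uint32_t overflow_count, uint8_t tcnt0,
                                  int tov0_pending, uint32_t f_cpu_hz) {
    assert(f_cpu_hz >= 1000000UL && f_cpu_hz <= 64000000UL);
    /* A pending overflow means the ISR has not counted this wrap yet. A
       reading of 255 is not corrected: it is taken to stem from before the
       wrap, with the flag raised between the two register reads. */
    if (tov0_pending && tcnt0 < 255) {
        overflow_count = overflow_count + 1;
    }
    uint32_t micros_per_tick = 64u / (f_cpu_hz / 1000000u);
    return ((overflow_count << 8) + tcnt0) * micros_per_tick;
}
```

At 16 MHz every tick counts 4 microseconds. With a pending overflow and TCNT0 = 2 the function reports 1032 microseconds instead of 8, which is the jump backwards in time that the TOV0 check prevents.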
Again brackets in the calculation have been chosen in a way to prevent rounding errors and truncation due to overflow.
Based on the monotonic clock and the micros() function one can also
implement a simple delay function. It would also be possible to use millis()
but that would result in an error of around $\pm 1 ms$. Using micros() reduces
the possible error well into the sub millisecond range. In case this precision is not
necessary, millis() might be more interesting. Keep in mind that busy-waiting
is usually considered a bad idea anyway - consider interrupt driven designs in
these cases and only use busy-waiting when it’s really acceptable.
The easiest implementation of the delay first queries the current time
in microseconds. It then checks whether the difference between the current and the previous
microsecond timestamp is equal to or larger than $1000 \mu s$, which equals $1 ms$.
If this is the case an internal counter that contains the time to wait in milliseconds
gets decremented and $1000$ is added to the stored timestamp.
The timestamp is not reset to the current value so that no drift accumulates
from the execution of instructions during the micros() calculation and inside
the loop of the delay function - the reference is always the timestamp
queried initially and the error does not accumulate. Overflow is accounted for
automatically by the wrap around of the unsigned addition and subtraction.
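void delay(unsigned long millisecs) {
    /* Only differences of about 1000 us are compared, so the lower bits of the timestamp suffice */
    unsigned int lastMicro;
    lastMicro = (unsigned int)micros();
    while(millisecs > 0) {
        unsigned int curMicro = (unsigned int)micros();
        /* Unsigned subtraction yields the elapsed time even across a wrap around */
        if(curMicro - lastMicro >= 1000) {
            lastMicro = lastMicro + 1000;
            millisecs = millisecs - 1;
        }
    }
    return;
}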
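The wrap around behaviour that delay() relies on can be checked in isolation. The sketch below uses `uint16_t`, which matches the width of `unsigned int` on the AVR; `millisecond_elapsed` and `advance_reference` are hypothetical helper names. On a host with a wider `int` the explicit casts are needed to reproduce the 16 bit truncation that happens implicitly on the AVR.

```c
#include <assert.h>
#include <stdint.h>

/* Returns non-zero once at least 1000 us passed since the reference
   timestamp, evaluated modulo 2^16 like unsigned int arithmetic on the AVR. */
static int millisecond_elapsed(uint16_t current, uint16_t reference) {
    return (uint16_t)(current - reference) >= 1000;
}

/* Moves the reference forward by exactly one millisecond (may wrap). */
static uint16_t advance_reference(uint16_t reference) {
    return (uint16_t)(reference + 1000);
}
```

A reference of 65036 and a current timestamp of 500 lie on different sides of the 16 bit wrap, yet the difference still evaluates to 1000. The comparison only stays meaningful while less than 65536 microseconds pass between two polls, which the tight loop in delay() ensures.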
Implementing a delayMicros(period) function is way more challenging if
one wants to be able to specify durations precisely. This function contains
some arcane magic in the form of inline assembly that burns a specific number of CPU
cycles to round the function invocation up to full microseconds. It also relies on knowledge
about the code generated by the compiler - so it is not portable and works only
for a specific compiler with specific optimization settings. In case one discovers
that the routines don’t work any more, one has to inspect the assembly output and readjust
the constants.
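void delayMicros(unsigned int microDelay) {
#if F_CPU == 20000000L
    /* Two extra cycles pad the call overhead to a full microsecond */
    __asm__ __volatile__ (
        "nop\n"
        "nop\n"
    );
    if((microDelay = microDelay - 1) == 0) {
        return;
    }
    /* Five loop iterations per microsecond */
    microDelay = (microDelay << 2) + microDelay;
#elif F_CPU == 16000000L
    if((microDelay = microDelay - 1) == 0) {
        return;
    }
    /* Four loop iterations per microsecond, minus the overhead */
    microDelay = (microDelay << 2) - 2;
#elif F_CPU == 8000000L
    if((microDelay = microDelay - 1) == 0) {
        return;
    }
    if((microDelay = microDelay - 1) == 0) {
        return;
    }
    /* Two loop iterations per microsecond, minus the overhead */
    microDelay = (microDelay << 1) - 1;
#else
    #error No known delay loop calibration available for this F_CPU
#endif
    /* Busy loop of four cycles per iteration. A numeric local label is used
       so that the assembler does not see a duplicate symbol if the compiler
       inlines this function more than once. */
    __asm__ __volatile__ (
        "1: sbiw %0, 1\n"
        "   brne 1b"
        : "=w" (microDelay)
        : "0" (microDelay)
    );
    return;
}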
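The calibration constants become easier to follow when the iteration count is calculated separately. Each pass of the sbiw/brne loop takes four clock cycles (two for sbiw, two for the taken branch), so one microsecond corresponds to 5, 4 or 2 iterations at 20, 16 or 8 MHz. The sketch below reproduces the arithmetic of delayMicros on the host; `delay_loop_iterations` is a hypothetical name, and a return value of 0 stands for the early return without entering the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Number of busy loop iterations delayMicros would execute for the given
   delay and clock; 0 means the function returns before the loop. */
static uint16_t delay_loop_iterations(uint16_t micro_delay, uint32_t f_cpu_hz) {
    assert(f_cpu_hz == 20000000UL || f_cpu_hz == 16000000UL ||
           f_cpu_hz == 8000000UL);
    uint16_t d = (uint16_t)(micro_delay - 1);
    if (d == 0) {
        return 0;
    }
    if (f_cpu_hz == 20000000UL) {
        return (uint16_t)((d << 2) + d);   /* 5 iterations per microsecond */
    }
    if (f_cpu_hz == 16000000UL) {
        return (uint16_t)((d << 2) - 2);   /* 4 per microsecond, minus overhead */
    }
    /* 8 MHz: the call overhead alone already takes about two microseconds */
    d = (uint16_t)(d - 1);
    if (d == 0) {
        return 0;
    }
    return (uint16_t)((d << 1) - 1);       /* 2 per microsecond, minus overhead */
}
```

A request for 10 microseconds at 16 MHz results in 34 iterations, that is 136 cycles or 8.5 microseconds in the loop, with the remainder attributed to call and setup overhead.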
Since this is a common mistake a short note about clock sources is useful: If you’re
running the code above and measure that a delay of $1000 ms$ takes around 15 to 16
seconds (varying in exact time), this might be due to the fact that AVRs are usually
shipped with the internal RC oscillator selected (CKSEL = 0010b) that oscillates
at around 8 MHz. In addition the CKDIV8 fuse is also programmed (0) so the clock
is divided further by a factor of 8, resulting in a 1 MHz master clock (side note:
the startup delay is also maximized by setting SUT = 10b, which gives the
oscillator the longest time available to stabilize before executing the reset handler).
Since it’s an RC oscillator it’s not that stable, so timing might not be reliable.
The specified 8 MHz is the nominal frequency at 25 degrees Celsius and a stable 5V
operating voltage - the same instability also has to be taken into account for the other
available internal oscillator, which runs at 128 kHz and is intended for special
low power applications.
In case one wants to use an external full swing crystal oscillator (for example
at 16 MHz) one has to set a proper clock selection in CKSEL and disable
the clock division. The CKDIV8 bit can be changed by writing the low
fuse byte - but it’s also possible to change the prescaler bits in CLKPR
at runtime:
CLKPR = 0x80; /* Set CLKPCE to unlock the prescaler bits */
CLKPR = 0;    /* Division factor 1, has to follow within four clock cycles */
Changing the clock source away from the internal RC oscillator is not possible
at runtime and requires programming the CKSEL and SUT bits in
the low fuse byte beforehand. When one uses an external 16 MHz quartz oscillator, for example,
one should set CKSEL=1111 and SUT=11. In case one also wants to
disable clock division by 8 on startup the CKDIV8 fuse should be set to 1.
The last feature that is controlled via the low fuse byte is CKOUT. If this
fuse is programmed the divided system clock will be output on PB0 independent
of any other settings - which is usually not desired. To disable this feature
one should set CKOUT=1. In case one wants to use the mentioned external 16 MHz
quartz it’s required to program the low fuse to 0xFF. This is done by adding
the argument -U lfuse:w:0xff:m to the avrdude command.
For example:
avrdude -v -p atmega328p -c avrisp -P /dev/ttyU0 -b 57600 -U lfuse:w:0xff:m -U flash:w:example.hex:i
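To revert the fuses to factory defaults (internal RC oscillator with CKSEL = 0010b, SUT = 10b and CKDIV8 programmed, which results in a low fuse of 0x62) one might use
avrdude -v -p atmega328p -c avrisp -P /dev/ttyU0 -b 57600 -U lfuse:w:0x62:m -U hfuse:w:0xd9:m -U efuse:w:0xff:m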
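The low fuse values follow directly from the individual fields mentioned in the text. The sketch below assembles the byte assuming the ATmega328P layout (CKDIV8 in bit 7, CKOUT in bit 6, SUT in bits 5 to 4, CKSEL in bits 3 to 0); `low_fuse` is a hypothetical helper, and as usual for AVR fuses a value of 0 means programmed.

```c
#include <assert.h>
#include <stdint.h>

/* Assembles the low fuse byte from its fields (ATmega328P layout assumed).
   Remember that 0 means "programmed" for fuse bits. */
static uint8_t low_fuse(unsigned ckdiv8, unsigned ckout,
                        unsigned sut, unsigned cksel) {
    assert(ckdiv8 <= 1 && ckout <= 1 && sut <= 3 && cksel <= 15);
    return (uint8_t)((ckdiv8 << 7) | (ckout << 6) | (sut << 4) | cksel);
}
```

The settings for the external quartz (CKDIV8=1, CKOUT=1, SUT=11, CKSEL=1111) give 0xFF, and the factory configuration described earlier (CKDIV8=0, CKOUT=1, SUT=10, CKSEL=0010) gives 0x62. A value of 0x7F would select the quartz but keep the division by 8 active.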
A last word of caution before one plays around with fuses: If one disables SPI
program download in the hfuse (it is enabled by default), the usual
cheap or self built programmers for AVRs (or an Arduino used as an ISP) won’t
work any more. In this case one requires a programmer that supports high voltage
programming mode, which applies 12V to the AVR’s reset pin and then uploads the program
using a parallel programming interface instead of the serial one used by SPI
based programmers.
The source code for the whole simple sysclock module is available as a GitHub GIST
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)