The Perfect Pulse: generating precise one-shots on AVR8

[Animated GIF: oscilloscope capture of the demo output]

It is possible to generate one-shot pulses on an AVR that…

  1. Are as narrow as a single clock cycle (63 nanoseconds!)
  2. Are precise to a single clock cycle
  3. Will run to correct completion no matter what else the processor is doing [1]
  4. Do not require you to turn off interrupts at all (!)
  5. Do not require any assembly code

These pulses are generated in pure hardware. They require a couple of instructions of interruptible code to fire. Once fired, they are completely autonomous and depend only on the system clock to run to completion.

Sound cool? Read on!

Demo

Here is a handy demo program…

https://github.com/bigjosh/TimerShot/blob/master/TimerShot.ino

This demo is written for an Arduino to make it easy to try, but this technique can work on any AVR8 with a timer module.

Download and run the demo and your Arduino will start outputting one pulse per second on digital pin 3.

The 1st pulse is 0 cycles long (no pulse),
The 2nd pulse is 1 cycle long (~63ns),
The 3rd pulse is 2 cycles long (~126ns),

…up to a total of 20 pulses, and then will start over.

The output should look like the above animated GIF.  The yellow trace is the pulse.

Note that there is no cli() or ATOMIC() anywhere in the code. Interrupts are on the whole time, yet you will never see a stretched pulse because an interrupt happened to come along at just the wrong time. You will also never see jitter because the firing was delayed.

It is like magic!

Usage

TL;DR

To make a pulse, call…

OSP_SET_WIDTH(cycles);

…where cycles is the number of clock cycles wide you want the pulse to be. For example…

OSP_SET_WIDTH( 1000 );

…will output a pulse that is 1000 cycles wide.

More options

Here are the functions and macros you can use to add precise one-shot pulses in your own code…

osp_setup()
osp_setup(cycles)
Set up the timer to generate one-shot signals. Must be called before any of the other functions. Can be called multiple times. You can optionally specify the width of subsequent shots in cycles.

OSP_SET_WIDTH(cycles)
Set the width of any subsequent shots to the specified number of cycles.

OSP_FIRE()
Fire off a shot using the most recently set width.

OSP_SET_AND_FIRE(cycles)
Set the width for this and any subsequent shots to cycles, and then fire a shot. Slightly faster than calling OSP_SET_WIDTH() followed by OSP_FIRE().

OSP_INPROGRESS()
Returns true if there is currently a shot in progress.

Width

At any moment, there is a width set and the next shot fired will have that width. Firing a shot does not change the current width setting.

  • You can set the width each time you fire a shot using OSP_SET_AND_FIRE().
  • If you are going to be firing several shots of the same width in a row, it is slightly faster to set the width with OSP_SET_WIDTH() once and then call OSP_FIRE() for each shot.
  • If you know the width of the first shot (and possibly subsequent shots), it is slightly faster to specify the initial width when calling osp_setup() and then call OSP_FIRE().

Testing for a pulse in progress

You can check if there is currently a pulse being generated with OSP_INPROGRESS(). If it returns 0, then the most recently fired shot has completed. Because of the overhead of executing the OSP_INPROGRESS() macro and code that uses the result, there will always be a few cycles of downtime between fires.

Because they run independently, the end of a one shot can occur while the CPU is in the middle of an instruction. This can cause some jitter when testing for the end of pulses, but you will never have to worry about erroneously seeing the pulse finished while it is still in progress. Check out the animated GIF above to get a feel for how the end-of-pulse test interacts with different pulse lengths (the blue trace goes low when the code detects and acts on OSP_INPROGRESS() going low).

Computing cycles

You can use the F_CPU macro to figure out how many cycles a given duration takes at the defined clock speed. F_CPU is defined by the build environment to be the number of clock cycles per second, so you could use…

OSP_SET_AND_FIRE( F_CPU / 1000000 );

…to generate a 1us pulse (1us = 1/1000000 of a second).

Note that F_CPU assumes you are using a normal clock setup. If you are messing with fuses or otherwise changing the CPU clock settings, then you’ll have to figure out what your new F_CPU should be and redefine it in your code.
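To make the conversion concrete, here is a minimal sketch of two helpers that turn a duration into a cycle count. The names ns_to_cycles() and us_to_cycles() are hypothetical (they are not part of the TimerShot code), and the clock rate is passed in explicitly so the arithmetic can be checked on any machine.

```c
#include <stdint.h>

/* Hypothetical helper (not part of TimerShot): convert nanoseconds to
   clock cycles, rounding to the nearest whole cycle. The 64-bit
   intermediate keeps f_cpu_hz * ns from overflowing 32 bits. */
static inline uint32_t ns_to_cycles(uint32_t f_cpu_hz, uint32_t ns)
{
    uint64_t scaled = (uint64_t)f_cpu_hz * ns;
    return (uint32_t)((scaled + 500000000ULL) / 1000000000ULL);
}

/* Hypothetical helper: convert whole microseconds to clock cycles.
   The result is exact whenever f_cpu_hz is a multiple of 1 MHz. */
static inline uint32_t us_to_cycles(uint32_t f_cpu_hz, uint32_t us)
{
    return (uint32_t)(((uint64_t)f_cpu_hz * us) / 1000000ULL);
}
```

On a 16MHz board, OSP_SET_AND_FIRE( us_to_cycles( F_CPU , 10 ) ) would ask for 160 cycles, which still fits in the 8-bit timer. Because both arguments are constants there, the compiler should be able to fold the whole computation at build time.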

Examples


osp_setup(1);
OSP_FIRE();

Fires a single shot that is 1 cycle long (about 63ns with a 16MHz clock). Note that you do not need to test if a 1 cycle shot completed because just executing the code to check would take many cycles.



osp_setup();
OSP_SET_AND_FIRE(10);
while (OSP_INPROGRESS());
OSP_SET_AND_FIRE(20);
while (OSP_INPROGRESS());
OSP_SET_AND_FIRE(30);

Fires three shots, of width 10, 20, and 30 cycles respectively. There will be at least 1 cycle of space between consecutive shots (in practice it will be at least 10 cycles because that is how long it takes the CPU to evaluate and exit the while).



osp_setup();
for (uint8_t i = 1; i <= 10; i++) {     // widths 1 through 10 cycles
  OSP_SET_WIDTH(i);
  for (uint8_t j = 0; j < i; j++) {     // fire i shots of width i
    OSP_FIRE();
    while (OSP_INPROGRESS());           // wait for the shot to finish
  }
}

Fires 1 shot that is 1 cycle long, then 2 shots that are each 2 cycles long, then 3 shots that are each 3 cycles long, and so on up to 10 shots that are each 10 cycles long. There will be at least 1 cycle of space between all the shots.



FAQ

Why not just use a tight loop to bit bang the pulse out?

  1. You would need to turn off interrupts for the length of the pulse, or risk being interrupted in the middle of a pulse and stretching it unpredictably. Turning interrupts off for the full duration of the pulse could disrupt any interrupt dependent tasks that might need to run. With this one shot technique, firing a shot happens within a single atomic instruction, so you never need to turn off interrupts.
  2. You are just burning cycles the whole time you are locked in that delay loop. With this technique, once the shot is fired, the processor is free to go off and do whatever it likes. The shot will continue to run in the hardware and end at exactly the right time no matter what code the processor happens to be executing at that time.
  3. The shortest pulse you can generate with bit banging is 2 cycles. [2] With this one-shot technique you can reliably generate a pulse as short as exactly 1 cycle.
  4. There is no reliable [3] way to do calculated timing loops finer than microseconds in C, so you have to drop to assembly.
  5. There is no clean and efficient [4] way to generate cycle-denominated loops in assembly.

What is the longest pulse you can generate with this technique?

The code here can generate a pulse up to 254 cycles wide, which is about 16 microseconds on a 16MHz Arduino.
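Asking for more than 254 cycles does not fail loudly. The width is simply squeezed into 8 bits, which is the 320 & 0xff = 64 arithmetic worked through in the comments below. A minimal sketch of a guard you could run before setting a width follows. The names MAX_ONE_SHOT_CYCLES, width_fits(), and wrapped_width() are hypothetical and not part of the TimerShot code.

```c
#include <stdbool.h>
#include <stdint.h>

/* The 254-cycle limit stated in the article for the 8-bit Timer2.
   The macro name is an assumption made for this sketch. */
#define MAX_ONE_SHOT_CYCLES 254u

/* Hypothetical guard: true if the requested width can be used as-is. */
static inline bool width_fits(uint32_t cycles)
{
    return cycles <= MAX_ONE_SHOT_CYCLES;
}

/* What is left of an oversized width once only the low 8 bits
   survive (for example, 320 becomes 64). */
static inline uint8_t wrapped_width(uint32_t cycles)
{
    return (uint8_t)(cycles & 0xffu);
}
```

Checking width_fits() first turns a silently short pulse into a case your code can handle deliberately, for example by switching to a larger prescaler.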

If you need a longer timeout, you can increase the prescaler using the Clock Select bits (CS22, CS21, and CS20) in the TCCR2B register.


Changing the line

TCCR2B = _BV(WGM22)| _BV(CS20);

to

TCCR2B = _BV(WGM22) | _BV(CS22)|_BV(CS21)|_BV(CS20);

will change the prescaler to 1024 (all three Clock Select bits set), which means that each tick of the timer will take 1024 clock cycles, or 64 microseconds at 16MHz. So, OSP_SET_AND_FIRE(1) will generate a pulse that is about 64 microseconds wide and OSP_SET_AND_FIRE(254) will generate a pulse about 16 milliseconds wide.
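The prescaler arithmetic can be sketched in a few lines of plain C. This assumes the ATmega328P's Timer2, whose selectable prescalers are 1, 8, 32, 64, 128, 256, and 1024. The names pulse_ns() and pick_prescaler() are hypothetical helpers for illustration, not part of the TimerShot code.

```c
#include <stdint.h>

/* Timer2 prescaler options on the ATmega328P (assumed target),
   in the order of Clock Select values 1 through 7. */
static const uint16_t timer2_prescalers[] = { 1, 8, 32, 64, 128, 256, 1024 };

/* Hypothetical helper: width in nanoseconds (truncated) of a pulse
   that lasts `ticks` timer ticks at the given prescaler. */
static inline uint64_t pulse_ns(uint32_t f_cpu_hz, uint16_t prescaler, uint32_t ticks)
{
    return ((uint64_t)ticks * prescaler * 1000000000ULL) / f_cpu_hz;
}

/* Hypothetical helper: smallest prescaler at which `cycles` CPU cycles
   fit in `max_ticks` timer ticks, or 0 if none is large enough.
   Ticks are rounded up, so the pulse may run slightly long. */
static inline uint16_t pick_prescaler(uint32_t cycles, uint32_t max_ticks)
{
    const unsigned n = sizeof timer2_prescalers / sizeof timer2_prescalers[0];
    for (unsigned i = 0; i < n; i++) {
        uint32_t p = timer2_prescalers[i];
        uint32_t ticks = (cycles + p - 1) / p;   /* ceiling division */
        if (ticks <= max_ticks) {
            return (uint16_t)p;
        }
    }
    return 0;
}
```

For example, pick_prescaler(320, 254) picks 8, because 320 cycles become 40 ticks. Choosing the smallest prescaler that fits keeps the step size, and therefore the rounding error, as small as possible.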

If you want any longer, you could use the 16-bit Timer1 (the code here uses Timer2, which is only 8 bits). That could give you a pulse width range of about 0-4 seconds, in 64 microsecond steps. Keep in mind that these 4 second long pulses are still accurate to a single clock tick (~62.5ns). Pretty impressive! One complication is that access to the 16-bit counter is not atomic, so you would need to either (1) disable interrupts for about 200ns to fire the shot, or (2) stop the timer, set the counter, and then restart the timer to fire the shot. Maybe I’ll do a full article on this if people want it.

Wait! How the heck does this actually work? I thought the AVR timers were free running!

Stay tuned next week for a full explanation!


  1. The clock feeding the timer must still be running, so you can’t shut down and you can’t change the clock speed or you’ll mess up the timer.
  2. …or at least the fastest I can figure is a sbi followed by a cbi, each of which takes 2 cycles. Is there a faster way I don’t know about? 
  3. You could try the _delay_cycles() built-in, but it is not universally supported (doesn’t work on the OSX version of the Arduino IDE, for example). No matter what, you can never be sure that your code won’t get re-ordered on you and mess up your carefully computed timing. 
  4.  There are some complicated tricks to insert some combination of macros based on bits, but these are definitely not clean. You can use the REPT macro to insert a straight series of NOPs, but this is not memory efficient. 

10 comments

  1. gdstevens2015

    Thanks Josh.

    I am probably doing it wrong, but if I use:
    /*************/
    void setup()
    {
    DDRB |= _BV(4); // Set Digital Pin 12 to output for ossiliscope trigger
    osp_setup(320);
    }
    void loop()
    {
    OSP_FIRE();
    while (OSP_INPROGRESS());
    }
    /*************/

    I get 4 microsecond long pulses. IF I did my arithmetic correctly, they ought to be 20 microseconds. What am I doing wrong?

    In any case thanks for your work.

    • bigjosh2

      The code here can generate a pulse up to 254 cycles because this is only an 8-bit timer.

      320 & 0xff = 64

      64 ticks * ~63 ns/tick = ~4us, so what you are seeing is expected.

      If you want to generate a pulse that is 320 cycles long, you could try using a 32 step prescaler like this…

      /*************/
      void setup()
      {
      DDRB |= _BV(4); // Set Digital Pin 12 to output for ossiliscope trigger
      osp_setup(10); // 10 clock ticks * 32 prescaler = 320 cycles
      TCCR2B = _BV(WGM22) | _BV(CS21)|_BV(CS20); // Prescaler = /32
      }
      void loop()
      {
      OSP_FIRE();
      while (OSP_INPROGRESS());
      }
      /*************/

      All a prescaler does is divide the clock signal going into the timer, so the timer sees one tick for each 32 ticks of the input clock.
      I haven’t tested this, but it should work. LMK!

  2. Charlie Myers

    Is the reason the 16 bit timer is not atomic with respect to interrupts because it takes two separate reads or writes to access the 16 bit timer registers?
    Would you be so kind as to post the code for using a 16 bit timer on an Arduino Mega 2560?

    Thanks for a great hack!

    • bigjosh2

      It actually looks like access to the 16-bit counter is atomic on the AVR, so everything should work just fine. You will need to transpose all the registers and make everything 16-bit aware. You’ll also need to cross reference the bits to make sure you are selecting the equivalent modes in the 16-bit timer. Finally, if you are using something like Arduino, you’ll need to make sure it is not also trying to use the 16-bit timer (I think it does) and conflict with you. Share your code when you get it working!

      • jagdish mevada

        hey josh,
        I want to generate pulse width ranging from 1ms to 2 second, in a step of 1ms.
        can i use ur code directly to generate required pulse. also u said arduino may be using 16bit timer2 and conflict with generated pulse. How to overcome this.

        • bigjosh2

          The supplied code only uses an 8-bit timer, so the maximum pulse length you could generate with 1ms resolution would be 254ms. You could recode to use the 16-bit timer, which would give you a maximum length of 65.535s. To make the changes, you need to look at the datasheet and follow the same strategy for the registers on the 16-bit timer.

  3. Darren

    Neat trick, is it possible to use this method to create pulses of width 1500 as well as 1501 microseconds? Using the prescalers results in pulse widths ~8 microseconds apart at that scale. Cheers.

    • bigjosh2

      I think this is easy. If you used the 8MHz RC system clock (or, say, a 16MHz xtal and /2 system clock prescaler) and a clkio/8 timer prescaler, then each timer tick would be 1 microsecond and you could directly set the pulse width in microseconds. If you used the 16-bit timer, then just set 1500 for 1500us and 1501 for 1501us.

    • bigjosh2

      No go for one-shots on Digital Pin 8. This technique depends on the chip’s hardware timers, so it can only work on pins that are attached to the outputs of the hardware timers. On an Arduino, these pins have a little squiggly next to them which indicates that these pins can do PWM output. Since the PWM uses the same timers, this means any squiggly pin should also be able to do one-shot, although you’d have to edit my program since I’ve hard-coded in pin #3.
