January 5, 2016

AVR Coding Challenge: How many ways can you twiddle this bit?

The canonical way to set the interrupt bit on AVR is the sei (enable interrupt) instruction, but there are many creative and devious ways to set a bit.

Before you click more, see if you can think of every possible way to set the I bit in SREG, then read on to see if I missed any!

The Issue

From the AVR datasheets…

The instruction following SEI will be executed before any pending interrupts.

All the sei instruction does is set the interrupt enable bit in the status register. As far as I can tell, there is no mention of this delayed interrupt handling for any other method of setting the I bit. Whenever the documentation is ambiguous, we are left to experiment to find the answer!

We will need to test every case, but what are the cases?

So Many Ways to Set a Bit

Here are the five cases I could think up (with a little help from my friends)…

SEI (base case)

Super simple. By the book.

[code]
sei
[/code]

OUT

Obvious. Even your grandma got this one. No points awarded.

[code]
in r16,0x3f // Get SREG
ori r16,128 // Set I bit
out 0x3f,r16 // Save back to SREG
[/code]

STS

Did you know that you can map IO registers to data space by adding 0x20 to the address? You get a point!

[code]
lds r16, 0x5f // Get SREG
ori r16,128 // Set I bit
sts 0x5f, r16 // Put SREG in through the back door
[/code]

(hat tip to Nerd Ralph for this one!)
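
As a quick sanity check on the addresses used in the OUT and STS listings, here is a minimal host-side C sketch of the 0x20 offset rule. The helper names (`io_to_data_addr`, `sreg_i_mask`) are made up for illustration and are not part of any AVR toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between an I/O-space address (used by IN/OUT) and the
   data-space address of the same register (used by LDS/STS/LD/ST). */
#define IO_TO_DATA_OFFSET 0x20u

/* Hypothetical helper: data-space address for an I/O-space address. */
static uint16_t io_to_data_addr(uint8_t io_addr)
{
    return (uint16_t)(io_addr + IO_TO_DATA_OFFSET);
}

/* Hypothetical helper: mask with only the I bit (bit 7 of SREG) set. */
static uint8_t sreg_i_mask(void)
{
    return (uint8_t)(1u << 7);
}

/* Self-check against the constants used in the listings above. */
static void io_map_selfcheck(void)
{
    assert(io_to_data_addr(0x3F) == 0x5F); /* SREG: out 0x3f vs. sts 0x5f */
    assert(sreg_i_mask() == 128);          /* the constant in ori r16,128 */
}
```

So `0x3f` for IN/OUT and `0x5f` for LDS/STS name the same SREG register, and the `128` in the `ori` is just bit 7.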

ST

Did you remember indirect addressing? Give yourself another point!

[code]
clr r29 // Clear Y pointer high byte
ldi r28,0x5f // Set Y low byte to point to SREG (0x3f+0x20)
ld r16, Y // Get SREG
ori r16,128 // Set I bit
st Y,r16 // Put SREG in the Doug Henning way
[/code]

RETI

Did you remember that the I bit is always set upon return from an interrupt? From datasheet section 7.7…

The I-bit is automatically set when a Return from Interrupt instruction – RETI – is executed.

So here is possibly the slowest and most indirect way to execute an sei. If you get paid by the KLOC, feel free to use this in all your code!

[code]

ldi r28,low(retitarget) // push the high and low bytes of the target address
push r28
ldi r28,high(retitarget) // on stack so return will end up there
push r28
reti // jump off…

retitarget: // …and land here! (and implicitly set I bit)

[/code]

(Hat tip to Zevv for this one!)
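
If the push order in that listing looks backwards, here is a toy host-side C model of it. It is only a sketch: `ToyAvr` and the `toy_*` functions are invented for illustration, and it assumes the usual AVR stack behaviour (PUSH stores then decrements SP, and the return address comes back off the stack high byte first).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the pieces of AVR state the RETI trick touches.
   Names and sizes are illustrative only. */
typedef struct {
    uint8_t  ram[256];  /* tiny stand-in for data memory */
    uint8_t  sp;        /* stack pointer, indexes into ram */
    uint16_t pc;        /* program counter (word address) */
    uint8_t  sreg;      /* status register, I bit is bit 7 */
} ToyAvr;

/* PUSH: store at SP, then decrement SP (the stack grows downward). */
static void toy_push(ToyAvr *cpu, uint8_t value)
{
    cpu->ram[cpu->sp] = value;
    cpu->sp--;
}

/* POP: increment SP, then load from SP. */
static uint8_t toy_pop(ToyAvr *cpu)
{
    cpu->sp++;
    return cpu->ram[cpu->sp];
}

/* RETI: pop the return address (the high byte comes off first because
   it was pushed last) and set the I bit. */
static void toy_reti(ToyAvr *cpu)
{
    uint8_t high = toy_pop(cpu);
    uint8_t low  = toy_pop(cpu);
    cpu->pc = (uint16_t)((high << 8) | low);
    cpu->sreg |= 0x80;
}

/* Mirrors the listing: push low byte, push high byte, then reti. */
static void toy_reti_trick(ToyAvr *cpu, uint16_t target)
{
    toy_push(cpu, (uint8_t)(target & 0xFF));
    toy_push(cpu, (uint8_t)(target >> 8));
    toy_reti(cpu);
}

/* Self-check: we land on the target, I is set, and SP is balanced. */
static void toy_reti_selfcheck(void)
{
    ToyAvr cpu = { {0}, 0xFF, 0, 0 };
    toy_reti_trick(&cpu, 0x0123);
    assert(cpu.pc == 0x0123);
    assert((cpu.sreg & 0x80) != 0);
    assert(cpu.sp == 0xFF);
}
```

The model says nothing about interrupt timing; it only shows why low-then-high is the right push order and that the stack ends up where it started.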

Results

For all of the above cases, it seems that the processor always allows the instruction following the setting of the I bit to run, even if there is a pending interrupt. Not a surprising result, but good to know for sure!

Sweet Failure!

Here are some cases (some suggested by others) that turn out to not be alternate ways to flip the bit. Some even downright crash & burn!

STD

STD is really just ST in disguise. Let’s look at the op-codes…

10q0 qq1r rrrr 1qqq std Y+q (r=register, q=offset)
1000 001r rrrr 1000 st Y (r=register)

See the resemblance? st is just std with a 0 offset.
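
To make the comparison concrete, here is a small host-side C sketch that builds both encodings. It assumes the Y-pointer forms as I read them from the AVR instruction set manual (`std Y+q` is `10q0 qq1r rrrr 1qqq`, `st Y` is `1000 001r rrrr 1000`); the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "std Y+q, Rr" using the bit layout 10q0 qq1r rrrr 1qqq.
   rr is the source register (0..31), q is the displacement (0..63). */
static uint16_t encode_std_y(uint8_t rr, uint8_t q)
{
    uint16_t op = 0x8208;                       /* fixed bits: 1000 0010 0000 1000 */
    op |= (uint16_t)((rr & 0x1F) << 4);         /* rrrrr  -> bits 8..4 */
    op |= (uint16_t)(q & 0x07);                 /* q2..q0 -> bits 2..0 */
    op |= (uint16_t)(((q >> 3) & 0x03) << 10);  /* q4..q3 -> bits 11..10 */
    op |= (uint16_t)(((q >> 5) & 0x01) << 13);  /* q5     -> bit 13 */
    return op;
}

/* Encode "st Y, Rr" using the bit layout 1000 001r rrrr 1000. */
static uint16_t encode_st_y(uint8_t rr)
{
    return (uint16_t)(0x8208 | ((rr & 0x1F) << 4));
}

/* Self-check: "st Y, r16" and "std Y+0, r16" are the same word. */
static void std_st_selfcheck(void)
{
    assert(encode_st_y(16) == 0x8308);
    assert(encode_std_y(16, 0) == encode_st_y(16));
}
```

With a zero displacement every `q` bit is zero, so the two encoders produce the same 16-bit word for any register.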

BSET

It turns out that sei is bset! They are one and the same! Take a look at the op-codes…

1001 0100 0sss 1000 bset (s=bit in SREG)
1001 0100 0111 1000 sei

The interrupt enable I bit is bit 7 (111 in binary) in SREG.

So sei is just an easier to read and remember way to say “bset 7”.
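
The same check in a few lines of host-side C, as a sketch only: `encode_bset` and `OPCODE_SEI` are names I made up, and the constants come straight from the two bit patterns above.

```c
#include <assert.h>
#include <stdint.h>

/* sei as a 16-bit word: 1001 0100 0111 1000 */
#define OPCODE_SEI 0x9478u

/* Encode "bset s" using the bit layout 1001 0100 0sss 1000,
   where s (0..7) selects the SREG bit to set. */
static uint16_t encode_bset(uint8_t s)
{
    return (uint16_t)(0x9408u | ((uint16_t)(s & 0x07) << 4));
}

/* Self-check: bset 7 assembles to exactly the sei opcode. */
static void bset_sei_selfcheck(void)
{
    assert(encode_bset(7) == OPCODE_SEI);
}
```

The other seven values of `s` give the remaining flag-setting mnemonics the same way, for example `bset 0` is `sec`.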

PUSH

This one is a weaselly hack. We set the stack pointer to the memory mapped address of the SREG register, and then we push the value we want to end up in SREG onto the stack.

If you thought of this one then you are an AVR zen master (and a sneaky bastard).

[code]
clr r29 // ZERO
out 0x3e, r29 // Stack high
ldi r28,0x5f // pointer to memory mapped SREG
out 0x3d, r28 // Stack now points to SREG!

in r16, 0x3f // Get SREG value
ori r16,128 // Set I bit
push r16 // Store back to SREG. What? Yep, that just happened.
[/code]

Unfortunately, it just doesn’t work. The I bit never gets set. As far as I can tell the pushed byte does not get pushed anywhere. Not into the SREG register, not into memory, nowhere. It just disappears forever in apparent violation of the laws of god and science (ok, not really).

I even tried pointing the stack at a General Purpose I/O Register (GPIOR1). This register is in the same IO space as SREG, but is completely free and unencumbered, avoiding any complications that might be related to trying to update SREG. Even pushing to GPIOR1 seems to have no effect at all other than updating the stack pointer.

What is going on here? My guess is that the microcode for stack operations does not go through the memory mapping translator path, so everything melts. This is fair, since no one should ever really be trying to do what we are trying to do, so why add extra hardware just to make it work correctly?

CALL

So here is the (psychotic) thinking behind this hot mess of a bit flip…

The CALL instruction works like a PUSH except that instead of pushing the contents of a register onto the stack, it pushes a return address onto the stack.

So… if we can contrive our code so that the address of the CALL in program memory is such that the return address has a bit set in the right place, when the CALL is executed then it will push the return address with the set bit into SREG. Follow?

I know, I know – it is just so crazy that it might work.

Here is the crazy-ass code…

[code]
clr r28 // ZERO
out 0x3e, r28 // Stack high
ldi r28,0x5f // pointer to memory mapped SREG (so when we CALL, the low byte of the PC will end up in 0x5f)
out 0x3d, r28 // Stack now points to SREG!

jmp addresswithIbit

.org 0b10000000-3 // This nasty org is really an address designed to have the top bit set
// so when it gets pushed to the stack (which will point to SREG), then
// the I bit in SREG will get set. Get it?

addresswithIbit:
call calltarget // push PC (which has I bit set) onto stack (which points to SREG).

rjmp loop // Lock up, but we really should never end up here

calltarget: // When we get here, the address with the I bit set has just been pushed into SREG, enabling interrupts
sbi PORTB,5 // LED on
loop:
jmp loop // Infinite loop
[/code]

To my horror and relief, this does not work at all. The processor does jump to the call target and SP is decremented, but the return address is lost forever. When the ret is executed, we fly off into the la-la land of non-existent flash addresses. Don’t try this at home.

This almost certainly doesn’t work for the same reason that push doesn’t work – the hardware apparently does not support dereferencing a stack pointer into memory mapped IO register space.

XCH

According to the datasheet entry on XCH…

Memory access is limited to the current data segment of 64KB.

…which seems to imply that it will not work on memory mapped IO registers. But there is only one way to find out: test on actual hardware.

But wait! I cannot find a single chip where this instruction is actually implemented in the silicon! (and I have a big collection of AVR chips!) It is definitely not supported on any of the normal chips like the ATTINYs or ATMEGAs. Looks like it is probably on some XMEGAs, but not all versions, and it varies from rev to rev.

I can’t even find any Atmel documentation listing which chips support these instructions. Can you?

If you can find a chip that can do it, please test and let me know what happens!

(Same goes for LAS & LAT)

DebugWire for extra credit

(Again, suggested by Nerd Ralph)

Ignore this part unless you are a pathological AVR trivia buff since it can never ever be of any practical use. (or could it?)

The idea here is to blow the DebugWire fuse on our AVR, and then use the DebugWire connection to discreetly reach directly into its mind and set the I bit without using any code on the chip whatsoever. No instructions are executed in this setting of the I bit.

Will the AVR still dutifully put off a pending interrupt and execute the next instruction even if the I is set when no one is looking? Heck, will it even process the interrupt if the I is set without an instruction taking place?

This was a hard test to do and involved breaking out a Dragon board and wiring it up to our poor little chip.

After lots of A/B testing and single stepping, I am 90% sure that the instruction that is pointed to by PC when you manually set the I bit over DebugWire is, in fact, always executed even if there is a pending interrupt.

I also tested to see if the current instruction even sees the I bit as being set after it is changed via DebugWire. I could imagine some buffer somewhere that holds the new value and does not notice the change until the clock ticks. To do this, I stopped the processor right before an instruction that read SREG into a normal register, then I later checked that register and it did, indeed, have the I bit set.

FAQ

Q: Why?
A: This issue actually came up on a project long ago where I wanted to save a cycle by enabling interrupts and clearing a flag at the same time. I only tested the case I needed and it worked, so I forgot about it. When I saw the same question pop up on Stack Exchange today, I figured it was a good excuse to finally answer the question conclusively!

Q: What happens if I think of a case that you missed?
A: Then you win an all expense paid lunch at any restaurant you want in NYC! (transportation and lodging not included)

Q: Ha! You idiot! You forgot SBI!
A: Not so fast, the SBI instruction only works on IO registers 0-31 and SREG is all the way up at slot 63. Nice try, though!

Q: Code?
A: Here.

18 comments

January 6, 2016 - 1:45 am Ralph Doncaster (Nerd Ralph)

STS and STD are technically different ways than ST.

- January 6, 2016 - 10:21 am bigjosh2
  
  I am going to push back on `STD`/`ST` since (IMHO) they are variants of the same microcode. The emitted op-code for `ST` is really just an `STD` with the offset bits (`q`) set to zero. Am I missing something?
  
  - January 6, 2016 - 10:53 am Ralph Doncaster (Nerd Ralph)
    
    I never thought to look at the instruction opcode, so I agree ST and STD are the same.
    
- January 6, 2016 - 10:25 am bigjosh2
  
  `STS` is definitely a winner. It is clearly completely different microcode than `ST`/`STD`. I can’t believe I missed that one! I owe you lunch, but I’d want to buy you lunch next time you are in NYC anyway!
  
January 6, 2016 - 2:03 am Ralph Doncaster (Nerd Ralph)

You might say this is cheating, but you can do it with debugWire.

- January 6, 2016 - 10:30 am bigjosh2
  
  I was afraid something like this would happen. This is off-the-rails, out-of-the-box out there. A software test that requires physically burning fuses and adding (expensive!) external hardware to carry out. I love it! I’ll dig out my old Dragon now and see what we find! (unless of course you have some inside information about the undocumented magic DebugWire register you’d like to share with us all… Eh?)
  
  - January 6, 2016 - 10:55 am Ralph Doncaster (Nerd Ralph)
    
    DebugWire can probably be done with a standard UART with Tx and Rx tied together.
    http://www.ruemohr.org/docs/debugwire.html
    
    - January 9, 2016 - 8:15 pm bigjosh2
      
      DebugWire tested… and it works! Details above. I would have never thought of that one. Thanks!
      
January 6, 2016 - 3:17 am Zevv

Does using CALL earn me lunch? CALL writes PC+2 to SP, so with some easily crafted code this can be used to set the I flag as well.

- January 6, 2016 - 10:40 am bigjosh2
  
  I think my mind sub-consciously suppressed this possibility because I am far too lazy to start calculating the codebase origins to make this actually work. So much for that plan, along with all the stuff I was going to get done today. :)
  This is totally legit (although totally reckless and depraved), so I can’t withhold that which is rightfully yours – the fame and glory of your very own credited `I` setting scheme (and a free lunch)!
  
  - January 6, 2016 - 11:19 am Zevv
    
    Yay, fame and glory it is!
    
    I’ll have to skip lunch though, as I’m a 9h/$500 flight away from NYC. The offer is much appreciated though!
    
    - January 6, 2016 - 5:36 pm bigjosh2
      
      I congratulated too soon! I just tested and CALL resoundingly does not work! It anti-works! It is a travesty against humanity! See new *FAILURES* section above for gory details!
      
      - January 7, 2016 - 3:25 am Zevv
        
        Well, that’s kind of unexpected. Not sure if this kind of behavior deserves an errata in the data sheet, this is clearly a situation where Atmel would simply say “don’t do that, it’s stupid” :)
        
January 6, 2016 - 12:44 pm Zevv

Just looking at the AVR opcode table, I suspect there are a few more you missed:

XCH(Z, Rd): (Z) <- Rd | Rd <- (Z)
LAS(Z, Rd): (Z) <- Rd or (Z)
LAT(Z, rd): (Z) <- Rd xor (Z)
BSET(s): SREG(s) <- 1
SBI(A,b): I/O(A,b) <- 1

- January 7, 2016 - 11:35 pm bigjosh2
  
  BSET and SBI covered above. Will shoot down XCH, LAS, & LAT tomorrow. Keep them coming!
  
January 10, 2016 - 4:15 am Zevv

Ok, one last try:

RETI: “Returns from interrupt. The return address is loaded from the Stack and the global interrupt flag is set.”.

I can not find any documentation saying that RETI is invalid outside of interrupt context, so doing a CALL/RETI pair might just work.

January 10, 2016 - 2:05 pm bigjosh2

Hmmm… I don’t think this one qualifies for experiment since it is a documented case. “7.7 When the AVR exits from an interrupt, it will always return to the main program and execute one more instruction before any pending interrupt is served.” But I guess it is technically a way to set the I bit, so I’ll add it to the list!

July 22, 2024 - 7:22 pm esot.eric

I’ve done enough of this rabbithole to finally stumble on your highly-informative article after pages of irrelevant search results, and thereafter dive into many related links and theirs, to decide with some certainty that there is no official documentation on the matter, and that every piece of data findable on the matter is purely speculation or empirical results from testing [amazing as it is] like yours.

Which is to say that the reason I dove down this rabbithole is due to avr-gcc’s assembly-output for a completely unrelated experiment…

avr-gcc 5.4.0, attiny85, -O0
upon passing a large struct (22bytes) as an argument to a function, it loads the stack-pointer into r25:24; sbiw r24, 22; then writes the new address to sp via:
in r0,0x3f (r0=sreg)
cli
out 0x3e, r25 (sph=newAddrH)
out 0x3f, r0 (sreg=r0, restored to pre-cli value)
out 0x3d, r24 (spl=newAddrL)

It uses this same technique for allocating the original struct, too… So, it’s obviously deeply-ingrained in a lot of systems’ firmwares!

Since it seems you are the foremost expert on the matter (thank you for your hard work, and humor!), I thought you might like to know of its being in use in products. (Shock! Terror! what about the 2025 run of next-version attiny85’s?)

Maybe whoever’s responsible at the avr-gcc crew will reach out to point us to some real documentation on the matter!


josh.com