Electronic – USIBR loses the MSB of the received byte in ATtiny45/85 of AVR family

avr

Update: Dec.06.2017

I've just received an answer from Microchip to this bug report. They were also
able to reproduce this symptom, and they will inform me about their
further investigation.


Original question:

I often use the PC parallel port (D-Sub DB-25 female) to communicate
with an MCU or a processor card, using my own software and interface circuit.

Now it looks like I've found a functional bug in the USI hardware of the AVR ATtiny45/85 MCU.

I wanted to read the received bytes from the USIBR register instead of USIDR, to
take advantage of what is described in the Atmel (now Microchip) data sheet.
Here is a quote from the 2586Q–AVR–08/2013 data sheet of the ATtiny25/45/85:

" Instead of reading data from the USI Data Register the USI Buffer
Register can be used. This makes controlling the USI less time
critical and gives the CPU more time to handle other program tasks.
USI flags as set similarly as when reading the USIDR register. The
content of the USI Data Register is loaded to the USI Buffer Register
when the transfer has been completed. "

I would like to show with my experiment that USIBR is unusable in the ATtiny45/85 chips,
because it loses the MSB of the received byte.
E.g.: if the received byte is 3, then 7 is read from USIBR;
if the received byte is 4, then 8 is read from USIBR.

In general: if the received byte is n, then ((n<<1) + (n&1)) & 0xff is read from USIBR.
(It is like a "backward arithmetic shift".)
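The formula can be written as a small C helper for the PC side. This is only a sketch of the observed pattern; the function names are made up for illustration and the formula is empirical, not taken from the data sheet.

```c
#include <assert.h>
#include <stdint.h>

/* Value read from USIBR for a received byte n, following the pattern
 * observed in this experiment (an empirical formula, not a data sheet
 * statement): n is shifted left once more, bit 0 repeats the old bit 0,
 * and the original MSB falls off the top. */
uint8_t usibr_observed(uint8_t n)
{
    return (uint8_t)((((unsigned)n << 1) + (n & 1u)) & 0xffu);
}

/* Spot-check against (sent, received) pairs from the failing run. */
void usibr_observed_selfcheck(void)
{
    assert(usibr_observed(0x01) == 0x03);
    assert(usibr_observed(0x02) == 0x04);
    assert(usibr_observed(0x83) == 0x07);
    assert(usibr_observed(0xf7) == 0xef);
}
```

Because the MSB is discarded, two different sent bytes (for example 0x03 and 0x83) give the same echo, so the original byte cannot be recovered from USIBR in this setup.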

In my demonstration I used the USI in three-wire slave mode. I connected the chip
to the parallel port of my PC via 1 kOhm resistors and powered it with 5 V from a USB plug.
I used the internal calibrated oscillator in its default mode (8 MHz with the CKDIV8 fuse), so
the speed was 1 µs per instruction (for single-cycle instructions).
The exact pin assignment between the PC and the ATtiny45 ("the interface circuit")
is given in a comment at the top of my C program.

The C program on the PC acts as a three-wire master under Linux.
It uses a very slow clock: 1 ms per half period, with a 10 ms wait between bytes.
The master sends a series of one-byte integers to the slave.
The slave assembly program in the chip simply echoes back the received bytes,
and the master reads this echo and prints it on the screen.
Because the echo arrives one transfer later, each received byte is paired with the previously sent byte in the printed line.

The master and slave programs work correctly
when I use the line "IN R16,USIDR" in the slave. But if I change this
one line to "IN R16,USIBR", then the echo from the slave to the master
follows the strange formula above, which means the MSB has been lost
in USIBR. I think we can conclude that USIBR is unusable.

Please look at my sources. I tried to keep the source code as short as possible
in both C and assembly. I have also included the two Intel HEX outputs for clarity.
They show that only one instruction word differs in the slave program (B10F is the good case, B300 is the failing one; the data part of Intel HEX is little-endian).
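To see what the two differing words mean, they can be decoded on the PC. The sketch below assumes the standard AVR encoding of "IN Rd, A" (1011 0AAd dddd AAAA) and the ATtiny25/45/85 I/O addresses 0x0F for USIDR and 0x10 for USIBR; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Intel HEX stores each 16-bit instruction word low byte first,
 * so the data bytes "0F B1" form the word 0xB10F. */
uint16_t word_from_hex_bytes(uint8_t first, uint8_t second)
{
    return (uint16_t)(first | ((unsigned)second << 8));
}

/* Decode "IN Rd, A" (1011 0AAd dddd AAAA).
 * Returns 1 and fills *rd and *io_addr if the word is an IN, else 0. */
int avr_decode_in(uint16_t word, unsigned *rd, unsigned *io_addr)
{
    assert(rd != NULL && io_addr != NULL);
    if ((word & 0xF800u) != 0xB000u)
        return 0;                              /* not an IN instruction */
    *rd      = (word >> 4) & 0x1Fu;            /* destination register  */
    *io_addr = ((word >> 5) & 0x30u) | (word & 0x0Fu);
    return 1;
}
```

Under these assumptions 0xB10F decodes to "IN r16, 0x0F" and 0xB300 to "IN r16, 0x10", which matches the single-line change between the two builds.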

I'm going to send a bug report to Microchip about this, because the
problem isn't in the errata of the latest 2586Q–AVR–08/2013 data sheet.

Can anyone confirm or refute this problem with USIBR?

Here is the echo program (three-wire slave):

; This is a slave on 3 wire
; and echo back the received bytes.

        .EQU   OK = 1       ;If OK=1 then "in a,usidr",
                            ;If OK=0 then "in a,usibr" will be compiled 
        .nolist
        .include   "/usr/share/avra/tn45def.inc"
        .list
;       .device ATtiny45
        .def    a =  r16
        .cseg
        .org    0x0000       ;interrupt vectors not used
        bclr    sreg_i       ;disable interrupts forever
        sbi     ddrb, ddb1   ;PB1 (MISO) is output, all others are inputs

        ldi     a, 0x18     ;SIE:0 OIE:0 WM1:0 WM0:1 CS1:1 CS0:0 CLK:0 TC:0
        out     usicr, a    ;set USI to three wire as slave

cyc0:   sbi     usisr, usioif   ;clr USIOIF
cyc1:   sbis    usisr, usioif   ;skip if received a byte
        rjmp    cyc1            ;wait for a byte from master

        .IF     OK
        in      a, usidr        ;get the byte from usidr
        .ELSE
        in      a, usibr        ;get the byte from usibr: this does NOT work
        .ENDIF

        out     usidr, a        ;echo back the received byte
        rjmp    cyc0            ; do forever

;  If .EQU OK = 1, the Intel HEX output is the following, and USIDR works correctly:
;  :020000020000FC
;  :10000000F894B99A08E10DB9769A769BFECF0FB1B4
;  :040010000FB9FACF5B
;  :00000001FF

;  If .EQU OK = 0, the Intel HEX output is the following, and USIBR gives the FAULT:
;  :020000020000FC
;  :10000000F894B99A08E10DB9769A769BFECF00B3C1
;  :040010000FB9FACF5B
;  :00000001FF

With .EQU OK = 1 in the above source, part of the screen output of master.c is:

Received: 00 == Previously sent: 00
Received: 01 == Previously sent: 01
Received: 02 == Previously sent: 02
Received: 03 == Previously sent: 03
Received: 04 == Previously sent: 04

With .EQU OK = 0 in the above source, part of the screen output of master.c is:

Received: 00 == Previously sent: 00
Received: 03 != Previously sent: 01
Received: 04 != Previously sent: 02
Received: 07 != Previously sent: 03
....
Received: 00 != Previously sent: 80
Received: 03 != Previously sent: 81
Received: 04 != Previously sent: 82
Received: 07 != Previously sent: 83
....
Received: e8 != Previously sent: f4
Received: eb != Previously sent: f5
Received: ec != Previously sent: f6
Received: ef != Previously sent: f7

Finally, here is my C program, which acts as a three-wire master under Linux
via the PC parallel port.

/*
Test of three wire between PC & ATtiny45

  This prg acts as the 3wire master.
  It sends the series of bytes 0, 1, 2 ...
  and prints each received byte next to
  the byte that was sent before it.

  Pin connection between PC and ATtiny45:
  PC Parallel port
  D-Sub DB-25 Female              ATtiny45
  PORT.D0 (pin2) --> 1kOhm --> PB0 (pin5) (MOSI)
  PORT.D1 (pin3) --> 1kOhm --> PB2 (pin7) (SCK)
  (PORT+1).D6 (pin10) <-- 1kOhm <-- PB1 (pin6) MISO
  PORT.D2 (pin4) --> 1kOhm --> PB5 (pin1) ~RST
  GND (pin25) -------------------GND (pin4)

  Power is 5V (from a USB plug).
  And inside calibrated oscillator was used
  in default mode (8MHz and div8) so
  1usec/one instruction (if it is one cycle).

gcc -O0 three_wire_test.c -o three_wire_test -lrt

If you want to use this prg by a normal user:
chown root:laci three_wire_test && chmod +s three_wire_test
*/

#include <stdio.h>
#include <stdlib.h>
#include <sys/io.h>
#include <sys/time.h>

#define PORT    0x378      //Others: 0x278 0x3bc

//very slow clock
#define W100    100000     // 100ms reset
#define W1      1000       // 1ms half ck
#define W10     10000      // 10ms after a byte transfer

void wait_u(int);
int snd_rec_byte(int);

int main(){
int i, snt, snt_1, rec;

// Get permission for direct I/O under Linux:
if(ioperm(PORT,2,1)){
    printf("Couldn't open parallel port 0x%x\n", PORT); exit(1);}

//One reset pulse 
outb(4, PORT); wait_u(W100); outb(0, PORT); wait_u(W100);
outb(4, PORT); wait_u(W100); // waiting for startup of chip

// send, receive and print bytes
for(snt_1=-1, snt=0; ;snt_1=snt, snt++, snt&=0xff){
    rec=snd_rec_byte(snt);
    if(snt_1!=-1)printf("Received: %02x %s Previously sent: %02x\n",\
                         rec, rec==snt_1?"==":"!=", snt_1);}
} //End of main

int snd_rec_byte(int s){                 // One byte send & receive
int r, i;
for(r=i=0 ; i<8; i++, s<<=1){            // 8 bit shifting
    outb(s&0x80?5:4, PORT); wait_u(W1);  // hold SCK=L,MOSI=MSBofs; wait 1ms
    outb(s&0x80?7:6, PORT); wait_u(W1);  // rise SCK=H,hold MOSI; wait 1ms
    r<<=1; if(inb(PORT+1)&0x40)r|=1;     // MISO shift into r,outputs unchanged
    outb(s&0x80?5:4, PORT); wait_u(W1);} // fall SCK=L,HOLD MOSI, wait 1ms
wait_u(W10); return(r);
} // End of snd_rec_byte

void wait_u(int c){                      // Delay minimum c usec
struct timeval req, req2;
gettimeofday (&req, NULL);
for(;;){
    gettimeofday (&req2, NULL);
    if((req2.tv_sec-req.tv_sec)*1000000+req2.tv_usec-req.tv_usec >= c)return;}
} // End of wait_u

Best Answer

I continued my experiments with the USI hardware of the ATtiny45/85 and found a way to use USIBR correctly. (I have to note that the Microchip help desk's answer was basically wrong and incomplete.)

My answer consists of two parts. In the first part I show my analysis of the USI hardware; in the second part I give four equivalent examples of a three-wire (3w) slave program that uses USIBR to receive bytes correctly.

The purpose of the analysis is to discover the true behaviour of USIBR, because it isn't documented in the MCU's data sheet.

The physical setup was unchanged: the 3w master program in C runs on a PC under Linux and communicates with the ATtiny45/85 via the PC's parallel port. (See the detailed pin connections between the D-Sub 25 connector and the four pins of the chip, including the resistors, in the comment of the master's C source.)

The analysis uses two programs: the master MASTER3W_DUMP.C on the PC and the slave USI_CHK_AGENT.ASM on the MCU. The master's screen output (DUMP_OUT) gives us the result.

First, MASTER3W_DUMP.C sends two bytes (0xca, 0x75) to the slave agent as test data, and the agent saves the state of the USI hardware as a triplet of bytes in the MCU's SRAM after every edge of CK (both falling and rising).

A triplet holds the following data:

    1. byte: (CK=PB2, DI=MOSI=PB0, DO=MISO=PB1, USIOIF, USICNT[3:0])
    2. byte: USIDR
    3. byte: USIBR

After the 32 edges have finished (both test bytes have been sent), the agent sends the recorded triplets (99 bytes) back from SRAM to the master, and the master prints them on the screen in a human-readable format (see DUMP_OUT).
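The bit layout of the first byte of a triplet can be made explicit with a small pack/unpack pair. This is an illustrative sketch; the struct and function names are hypothetical and do not appear in the programs of this post.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the first byte of a triplet, as laid out by the agent:
 * b7=CK(PB2) b6=DI(PB0) b5=DO(PB1) b4=USIOIF b[3:0]=USICNT[3:0]. */
struct usi_status {
    unsigned ck, di, dout, oif, cnt;
};

struct usi_status usi_status_unpack(uint8_t s)
{
    struct usi_status f;
    f.ck   = (s >> 7) & 1u;
    f.di   = (s >> 6) & 1u;
    f.dout = (s >> 5) & 1u;
    f.oif  = (s >> 4) & 1u;
    f.cnt  = s & 0x0fu;
    return f;
}

uint8_t usi_status_pack(struct usi_status f)
{
    assert(f.ck <= 1 && f.di <= 1 && f.dout <= 1 && f.oif <= 1 && f.cnt <= 15);
    return (uint8_t)((f.ck << 7) | (f.di << 6) | (f.dout << 5) |
                     (f.oif << 4) | f.cnt);
}
```

For example, a first byte of 0xD0 unpacks to CK=1, DI=1, DO=0, USIOIF=1, USICNT=0.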

Before you start to interpret these two programs and their output, you should recall some facts from the USI chapter of the 2586Q–AVR–08/2013 data sheet of the ATtiny25/45/85. The key facts to keep in mind:

  • USIDR is copied into USIBR only when USICNT[3:0] overflows.
  • USICNT[3:0] is incremented on every edge of the master's CK (both falling and rising).
  • USIDR is shifted on only one kind of CK edge, the one selected in USICR (either falling or rising).

This explains why I reload USICNT[3:0] with 0xd after every overflow: I wanted to see more samplings of USIDR into USIBR during the 32 edges, and I wanted to see how the sampling behaves when USICNT[3:0] overflows on different kinds of CK edge (that is why an odd number was chosen: with 0xd the counter overflows every third edge).
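The effect of the 0xd reload can be checked with a few lines of C. This is a minimal model of the 4-bit counter only, assuming it counts every edge and is reloaded right after each overflow; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of USICNT[3:0]: it counts every CK edge and is reloaded with
 * `reload` after each overflow. ovf[e] is set to 1 if the counter
 * overflows on edge e (1-based); ovf must hold n_edges + 1 entries. */
void usicnt_overflow_edges(unsigned start, unsigned reload,
                           unsigned n_edges, uint8_t *ovf)
{
    assert(ovf != NULL);
    unsigned cnt = start & 0x0fu;
    ovf[0] = 0;
    for (unsigned e = 1; e <= n_edges; e++) {
        cnt = (cnt + 1u) & 0x0fu;
        ovf[e] = (uint8_t)(cnt == 0);
        if (cnt == 0)
            cnt = reload & 0x0fu;   /* reload after the overflow */
    }
}
```

With start = reload = 0xd the overflows fall on edges 3, 6, 9, and so on. When CK idles low, odd edges are rising and even edges are falling, so the overflow alternates between the two kinds of edge.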

MASTER3W_DUMP.C

/*
MASTER3W_DUMP.C  LSz 2018.Jan.05 

  This prg acts as 3wire master for a slave prg in
  ATtiny45 mcu via parallel port of PC.

  First it sends two bytes (0xca, 0x75) to the slave,
  and the slave saves all hardware values of the USI as a
  triplet of bytes in SRAM after each of the 32=2*16
  edges of CK (both falling and rising).

  Structure of this triplet:
  1. byte: (CK=PB2, DI=MOSI=PB0, DO=MISO=PB1, USIOIF, USICNT[3:0])
  2. byte: USIDR
  3. byte: USIBR

  After this the master receives 1+32 triplets and   
  prints them on the screen in a human format.

  Pin connection between PC and ATtiny45:
  PC Parallel port
  D-Sub DB-25 Female              ATtiny45
  PORT.D0 (pin2) --> 1kOhm --> PB0 (pin5) (MOSI)
  PORT.D1 (pin3) --> 1kOhm --> PB2 (pin7) (SCK)
  (PORT+1).D6 (pin10) <-- 1kOhm <-- PB1 (pin6) MISO
  PORT.D2 (pin4) --> 1kOhm --> PB5 (pin1) ~RST
  GND (pin25) -------------------GND (pin4)

  All fuse remained in their original default mode.
  Power is 5V (from a USB plug).
  And inside calibrated oscillator was used
  in default mode (8MHz and div8) so
  1usec/one instruction (if it is one cycle).


gcc -O0 master3w_dump.c -o master3w_dump -lrt

If you want to use this prg as a normal user:
chown root:laci master3w_dump && chmod +s master3w_dump
*/

#include <stdio.h>
#include <stdlib.h>
#include <sys/io.h>
#include <sys/time.h>

#define PORT    0x378      // Others: 0x278 0x3bc

// Constants for very slow clock
#define W100    100000     // 100ms reset
#define W1      1000       // 1ms half ck
#define W10     10000      // 10ms after a byte transfer
// Constants for test
#define DS  99         // Dump size
#define B1  0xca       // First test byte
#define B2  0x75       // Second test byte

void wait_u(int);
int snd_rec_byte(int);

int main(){
int i, rec, rec2, rec3;

// Get permission for direct I/O under Linux:
if(ioperm(PORT,2,1)){printf("Couldn't open parallel port 0x%x\n", PORT); exit(1);}

// Reset pulse on the ~RST pin of the ATtiny45 while holding MOSI and SCK low,
// then wait for the slave prg in the mcu to start before
// sending bytes to the slave

outb(0, PORT); wait_u(W100);  //~RST=L, MOSI=L, SCK=L Tiny45 in reset state
outb(4, PORT);                //~RST=H, MOSI=L, SCK=L Tiny45 starts up
wait_u(W100);                 //The master waits for the Tiny45 to finish startup

printf("Send and receive two bytes\n");
printf("Rec:%02x Snt:%02x\n", snd_rec_byte(B1), B1);
printf("Rec:%02x Snt:%02x\n", snd_rec_byte(B2), B2);

printf("Dump %d triplets from SRAM:\n", DS/3);
printf("Edge CK(PB2)  DI(PB0)  DO(PB1) usiOIF   usiCNT  usiDR     usiBR\n");
for(i=0; i<DS/3; i++){
    rec=snd_rec_byte(0)&0xff; rec2=snd_rec_byte(0)&0xff; rec3=snd_rec_byte(0)&0xff;
    printf("%2d.   %01d%s%01d%s%01d%s%01d%s%01x%s%02x%s%02x %s\n", i,\
        rec&0x80?1:0, rec&0x80?" - - -  ":"        ",\
        rec&0x40?1:0, rec&0x80?" - - -  ":"        ",\
        rec&0x20?1:0, rec&0x80?" - - -  ":"        ",\
        rec&0x10?1:0, rec&0x80?" - - -  ":"        ",\
        rec&0xf, rec&0x80?" - - -  ":"        ",\
        rec2, rec&0x80?" - - -  ":"        ", rec3, rec&0x10?(rec2==rec3?"Good":"Wrong"):"");}
}  //End of main

// One byte send & receive               
int snd_rec_byte(int s){                 // PORT.D2=~RST  PORT.D1=SCK  PORT.D0=MOSI  (PORT+1).D6=MISO
int r, i;
for(r=i=0 ; i<8; i++, s<<=1){              // 8 bit shifting while ~RST pin holding in H
    outb((s&0x80?5:4), PORT); wait_u(W1);  // holding SCK=L, MOSI=MSB of s; wait 1ms
    outb((s&0x80?7:6), PORT); wait_u(W1);  // rise SCK=H, holding MOSI; wait 1ms
    r<<=1; if(inb(PORT+1)&0x40)r|=1;       // MISO shifted into r, outputs unchanged
    outb((s&0x80?5:4), PORT); wait_u(W1);} // fall SCK=L, HOLDING MOSI, wait 1ms
wait_u(W10); return(r);                    // waiting 10ms after the byte transfer
}  //End of fnc

void wait_u(int c){                      // Delay minimum c usec
struct timeval req, req2;
gettimeofday (&req, NULL);
for(;;){
    gettimeofday (&req2, NULL);
    if((req2.tv_sec-req.tv_sec)*1000000+req2.tv_usec-req.tv_usec >= c)return;}
}  //End of func

USI_CHK_AGENT.ASM

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; USI_CHK_AGENT.ASM by Laszlo SZILAGYI  2018.Jan.05 
;
; USI is in three wire and slave mode with ext CK.
;
; I want to see exactly how registers of USI behave
; after every edge of CK mainly of USIBR.
;
; Therefore the goal of this prg is to display 
; USIDR, USIBR, CK=PB2, DI=MOSI=PB0, DO=MISO=PB1,
; USICNT[3:0] after every edge of CK
; (both falling and rising).
;   
; This prg puts a triplet into the 'dmp' array
; at every rising and falling edge of CK.
; The second byte is USIDR and third is USIBR
; and the bits of first byte (from MSB to LSB) is:
; CK, DI, DO, USIOIF, USICNT[3:0] in the triplet.
;
; After the USICNT[3:0] overflowed it always
; puts 0xd into the USICNT[3:0] to force USIOIF
; to strobe USIBR at every third edge.
; USIBR gets new value only when the USICNT
; has been overflowed.
;
; I have chosen 3-edge sampling for the following reasons.
; We can observe many samplings of USIDR into
; USIBR during 32 edges. And because 3 is an odd number,
; the overflow of USICNT will occur on both types of CK edge.
; When the overflow edge of CK also shifts USIDR
; (here the rising edge was chosen in USICR), the sampling
; will be good.
; In the other case, when the overflow edge does not shift
; USIDR (here the falling edge), the sampling will
; be wrong.
;
; The communication is logically half duplex between
; master and slave. First the master sends a
; two-byte pattern (e.g.: 0xca, 0x75) to the slave,
; and the slave puts the 1+32 triplets into the dmp array,
; one per edge of CK. After 32 edges the size of 'dmp'
; will be 3+32*3=99 bytes.
;
; In the second part of the transfer the slave sends back
; these 99 bytes from SRAM to the master, and the master
; prints them on the screen in a human-readable format.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 

                .nolist
                .include   "/usr/share/avra/tn45def.inc"
                .list
;               .device ATtiny45

                .equ    EDGE = 32    ; 16 rising + 16 falling edge for shift in 2 bytes
                .equ    CNT0 = 0xd   ; pre init of USICNT[3:0]
                .equ    CNT1 = 0xd   ; init of USICNT[3:0] at every overflow of USICNT
                .equ    UDR0 = 0x95  ; initial number in USIDR
                .equ    Z = sreg_z
                .equ    I = sreg_i

                .dseg
                .org    sram_start
dmp:            .byte   3+3*EDGE         ; 3+32*3=99 bytes : three bytes for every edge
                                         ; dmp is an array of 3 bytes:
                                         ; 1/ b7=CK=PB2, b6=DI=PB0, b5=DO=PB1, b4=USIOIF, b[3:0]=USICNT[3:0]
                                         ; 2/ usidr
                                         ; 3/ usibr
                                         ; and x-> dmp

                .def    a =   r16        ;tmp1
                .def    b =   r17        ;tmp2
                .def    prd = r18        ;num of edge, [32:1]

                .MACRO  wldi              ; e.g.: wldi  x, dmp
                ldi     @0l, low(@1)      ;       ldi  xl, low(dmp)
                ldi     @0h, high(@1)     ;       ldi  xh, high(dmp)
                .ENDM

                .MACRO  outi              ; e.g.: outi  usidr, 0x00  ; using "a=r16"
                ldi     a, @1             ; ldi   a, 0x00
                out     @0, a             ; out   usidr, a
                .ENDM

                .MACRO  btmv             ; e.g.: btmv  b, 6, pinb0  ; using "a=r16"
                bst     a, @2            ;       bst   a, pinb0 
                bld     @0, @1           ;       bld   b, 6
                .ENDM

                .cseg
                .org    0x0000             ;IT vectors not used
                bclr    I                  ;disable IT
                outi    spl, low(ramend)   ;sp=ramend
                outi    sph, high(ramend)

                outi    usicr, 1<<usiwm0 | 1<<usics1    ;set USI to three wire as slave, rising ck, ext ck
                outi    usidr, UDR0                     ;init USIDR
                outi    ddrb, 1<<ddb1                   ;PB1(MISO) is output, all other is input, no pull-up

                wldi    x, dmp           ; x-> dmp
                ldi     prd, EDGE        ; num of edge [32:1]
                outi    usisr, CNT0      ; pre init USICNT[3:0]
                rcall   dump             ; dump of first triplet 

cyc0:                                    ; Start cycle of dumping +32 triplet
                in      b, pinb          ; read old_pinb2
wait_alt:                                ; Wait for alter of pinb2
                in      a, pinb          ; read new_pinb2 again
                eor     a, b             ; new_pinb2^=old_pinb2
                andi    a, 1<<pinb2      ; mask pinb2
                brbs    Z, wait_alt      ; if(new_pinb2==old_pinb2)goto wait_alt
                rcall   dump             ; dump of next triplet after any edge
                dec     prd              ; dec num of edge
                brbc    Z, cyc0          ; do while prd>0

                                         ; End dumping cycle and Start sending 99 bytes to master
                outi    usisr, 0x40      ; clear USIOIF and USICNT
                wldi    x, dmp           ; reinit x-> dmp
                ldi     b, 3+3*EDGE      ; b=99 send 99=3+3*32 byte to the master
snd:                                     ; to display them on the screen
                ld      a, x+
                out     usidr, a         ; send [x+]
                sbi     usisr, usioif    ; clear usioif
wtfb:                
                sbis    usisr, usioif
                rjmp    wtfb             ; wait for USIOIF
                dec     b                ; dec byte counter
                brbc    Z, snd           ; do while b>0
end_main:       
                rjmp    end_main         ; do nothing
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; end main

dump:      ;;;; DUMP SUBROUTINE ;;;;     ; x-->dmp
                                         ; [x+]= b7=CK=PB2, b6=DI=PB0, b5=DO=PB1, b4=USIOIF, b[3:0]=USICNT[3:0]
                                         ; [x+]= usidr
                                         ; [x+]= usibr
                in      a, usisr         ; read USISR
                ldi     b, CNT1          ; prepare b with CNT1
                sbrc    a, usioif        ; skip if no ovf
                out     usisr, b         ; init USICNT[3:0] again after ovf
                eor     b, b             ; clr b for collecting bits

                btmv    b, 4, usioif     ; b4=USIOIF
                andi    a, 0x0f          ; clr high nibble
                or      b, a             ; b[3:0]=USICNT[3:0]

                in      a, pinb          ; check pins
                btmv    b, 5, pinb1      ; b5=pinb1=DO=MISO
                btmv    b, 6, pinb0      ; b6=pinb0=DI=MOSI        
                btmv    b, 7, pinb2      ; b7=pinb2=CK
                st      x+, b            ; [x+]=(CK, DI, DO, USIOIF, USICNT[3:0])
                in      a, usidr
                st      x+, a            ; [x+]=usidr
                in      a, usibr
                st      x+, a            ; [x+]=usibr
                ret
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; end dump
               .exit

By observing DUMP_OUT and tracking the values of DI, DO, OIF, CNT, USIDR and USIBR while the two test bytes are shifted into USIDR, we can establish these 3 facts:

1/ All data are as expected except USIBR. It is sometimes good, i.e. equal to USIDR, and sometimes wrong, i.e. about double USIDR at the sampling moment. (At edges where USICNT[3:0] did not overflow, USIBR was unchanged, as expected.)

2/ If the overflow of USICNT[3:0] happened on a rising edge of CK, USIBR was equal to USIDR; the master program marks these lines as "Good". In the opposite case, when the overflow happened on a falling edge, the value of USIBR was wrong; these lines are marked as "Wrong".

3/ USIDR was shifted on every rising edge, because the rising edge was selected in USICR for shifting.

DUMP_OUT

$ ./master3w_dump
Send and receive two bytes
Rec:95 Snt:ca
Rec:ca Snt:75
Dump 33 triplets from SRAM:
Edge CK(PB2)  DI(PB0)  DO(PB1) usiOIF   usiCNT  usiDR     usiBR
 0.   0        0        1        0        d        95        00 
 1.   1 - - -  1 - - -  1 - - -  0 - - -  e - - -  2b - - -  00 
 2.   0        1        0        0        f        2b        00 
 3.   1 - - -  1 - - -  0 - - -  1 - - -  0 - - -  57 - - -  57 Good
 4.   0        1        0        0        e        57        57 
 5.   1 - - -  0 - - -  0 - - -  0 - - -  f - - -  ae - - -  57 
 6.   0        0        1        1        0        ae        5c Wrong
 7.   1 - - -  0 - - -  1 - - -  0 - - -  e - - -  5c - - -  5c 
 8.   0        0        0        0        f        5c        5c 
 9.   1 - - -  1 - - -  0 - - -  1 - - -  0 - - -  b9 - - -  b9 Good
10.   0        1        1        0        e        b9        b9 
11.   1 - - -  0 - - -  1 - - -  0 - - -  f - - -  72 - - -  b9 
12.   0        0        0        1        0        72        e4 Wrong
13.   1 - - -  1 - - -  0 - - -  0 - - -  e - - -  e5 - - -  e4 
14.   0        1        1        0        f        e5        e4 
15.   1 - - -  0 - - -  1 - - -  1 - - -  0 - - -  ca - - -  ca Good
16.   0        0        1        0        e        ca        ca 
17.   1 - - -  0 - - -  1 - - -  0 - - -  f - - -  94 - - -  ca 
18.   0        0        1        1        0        94        28 Wrong
19.   1 - - -  1 - - -  1 - - -  0 - - -  e - - -  29 - - -  28 
20.   0        1        0        0        f        29        28 
21.   1 - - -  1 - - -  0 - - -  1 - - -  0 - - -  53 - - -  53 Good
22.   0        1        0        0        e        53        53 
23.   1 - - -  1 - - -  0 - - -  0 - - -  f - - -  a7 - - -  53 
24.   0        1        1        1        0        a7        4f Wrong
25.   1 - - -  0 - - -  1 - - -  0 - - -  e - - -  4e - - -  4f 
26.   0        0        0        0        f        4e        4f 
27.   1 - - -  1 - - -  0 - - -  1 - - -  0 - - -  9d - - -  9d Good
28.   0        1        1        0        e        9d        9d 
29.   1 - - -  0 - - -  1 - - -  0 - - -  f - - -  3a - - -  9d 
30.   0        0        0        1        0        3a        74 Wrong
31.   1 - - -  1 - - -  0 - - -  0 - - -  e - - -  75 - - -  74 
32.   0        1        0        0        f        75        74 
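The dump can be reproduced on the PC with a small behavioural model. The sketch below is a hypothesis fitted to the rows above, not a statement about the silicon or the data sheet: it assumes that when the overflow falls on the non-shifting (falling) edge, USIBR receives (USIDR<<1)|DI instead of USIDR. The type and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_EDGES 32

struct usi_row { uint8_t ck, di, oif, cnt, usidr, usibr; };

/* Replays the test of USI_CHK_AGENT.ASM: CK idles low, USIDR shifts on
 * rising edges, USICNT counts both edges and is reloaded with 0xd.
 * ASSUMPTION (inferred from the dump): on a falling-edge overflow
 * USIBR is loaded with one extra shift, (USIDR << 1) | DI. */
void usi_dump_model(const uint8_t bytes[2], struct usi_row rows[N_EDGES + 1])
{
    assert(bytes != NULL && rows != NULL);
    uint8_t usidr = 0x95, usibr = 0x00, cnt = 0x0d;
    rows[0] = (struct usi_row){0, 0, 0, cnt, usidr, usibr};
    for (int e = 1; e <= N_EDGES; e++) {
        int bit = (e - 1) / 2;                    /* 0..15, MSB first */
        uint8_t di = (uint8_t)((bytes[bit / 8] >> (7 - bit % 8)) & 1u);
        int rising = (e % 2 == 1);
        if (rising)
            usidr = (uint8_t)((usidr << 1) | di); /* normal shift */
        cnt = (uint8_t)((cnt + 1) & 0x0f);
        uint8_t oif = (uint8_t)(cnt == 0);
        if (oif)
            usibr = rising ? usidr : (uint8_t)((usidr << 1) | di);
        rows[e] = (struct usi_row){(uint8_t)rising, di, oif, cnt, usidr, usibr};
        if (oif)
            cnt = 0x0d;                           /* reload done by dump: */
    }
}
```

Run with the bytes 0xca and 0x75, this model gives the same USICNT, USIDR and USIBR columns as the dump, including 5c, e4, 28, 4f and 74 on the "Wrong" rows. If the hypothesis holds, it would also account for the (n<<1)+(n&1) pattern in the question, because DI still carries the last (least significant) bit on the final falling edge of an 8-bit transfer.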

Now the secret life of USIBR has been revealed, and we can see that it was entirely undocumented in the 2586Q–AVR–08/2013 data sheet of the ATtiny25/45/85.

From the above 3 experimental facts we can derive the basic rule for using USIBR.

USIDR is sampled correctly into USIBR if and only if the overflow of USICNT[3:0] and the shifting of USIDR happen on the same edge of the master's CK.

Unfortunately this information is missing from the USI chapter of the MCU's data sheet. The data sheet does not show any usage of USIBR (and, e.g., its SPI Master Operation Example on page 110 has a mistake, too).

This completes the analysis of the side effect of USIBR. Next I give four equivalent, concrete examples of using USIBR. They can be understood without following every detail of the analysis, and they also demonstrate the undocumented basic rule above.

Now let's focus on the next two sources.

Basically our 3w slave ECHO_3WSLV.ASM only reads the byte from USIBR and sends back its complement when USIOIF == 1.

MASTER3W.C is a very simple 3w master: it sends the series of bytes 0, 1, 2 ... to the 3w slave ECHO_3WSLV.ASM and compares the complement of each received byte with the byte that was sent previously.

We can start the master with the 'l' or 'h' parameter:

./master3w l

./master3w h

This parameter determines the inactive (idle) level of the master's CK.

Now consider the slave ECHO_3WSLV.ASM. This source can be assembled into four variants, depending on which MODE and PHASE symbols are uncommented. Please see the source.

So if we use the LOW level as the idle state of the master's CK:

./master3w l

we need to assemble the A2 or the B1 case:

A2 case: 3wire, slave, ext CK, shifting by rising edge, USISR=0x41

.equ MODE = 1<<usiwm0 | 1<< usics1

.equ PHASE = 1<< usioif | 1<< usicnt0

or

B1 case: 3wire, slave, ext CK, shifting by falling edge, USISR=0x40

.equ MODE = 1<<usiwm0 | 1<< usics1 | 1<< usics0

.equ PHASE = 1<< usioif

This yields the symbol definitions that select the appropriate initial values of USICR and USISR.
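The resulting register values can be cross-checked in C. The sketch assumes the bit numbers from the ATtiny25/45/85 register description (USIWM0=4, USICS1=3, USICS0=2 in USICR; USIOIF=6, USICNT0=0 in USISR); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit numbers as given in the ATtiny25/45/85 register description. */
enum { USICS0 = 2, USICS1 = 3, USIWM0 = 4 };   /* in USICR */
enum { USICNT0 = 0, USIOIF = 6 };              /* in USISR */

/* MODE: three-wire, external clock; case A shifts on the rising edge,
 * case B (shift_on_falling != 0) on the falling edge. */
uint8_t usi_mode(int shift_on_falling)
{
    uint8_t m = (uint8_t)((1u << USIWM0) | (1u << USICS1));
    if (shift_on_falling)
        m |= (uint8_t)(1u << USICS0);
    return m;
}

/* PHASE: clear USIOIF; case 1 leaves USICNT at 0, case 2 presets it to 1. */
uint8_t usi_phase(int preset_one)
{
    uint8_t p = (uint8_t)(1u << USIOIF);
    if (preset_one)
        p |= (uint8_t)(1u << USICNT0);
    return p;
}

/* The values must match the LDI immediates in the four HEX listings. */
void usi_values_selfcheck(void)
{
    assert(usi_mode(0) == 0x18 && usi_mode(1) == 0x1c);
    assert(usi_phase(0) == 0x40 && usi_phase(1) == 0x41);
}
```

So A2 corresponds to USICR=0x18 with USISR=0x41, and B1 to USICR=0x1c with USISR=0x40.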

For the other two possibilities (A1 and B2, with ./master3w h), please see the comments at the end of the source ECHO_3WSLV.ASM, where you can also find the Intel HEX results.

MASTER3W.C

/*
MASTER3W.C  LSz 2018.Jan.05

 Test of three wire between PC & ATtiny45
 This prg acts as 3wire master for a slave prg in
 ATtiny45 mcu via parallel port of PC.

  It sends the series of bytes 0, 1, 2 ...
  and prints the complement of each received byte
  next to the byte that was sent before it.

  The initial level of CK depends on the
  starting parameter (l or h) of this prg.

  Pin connection between PC and ATtiny45:
  PC Parallel port
  D-Sub DB-25 Female              ATtiny45
  PORT.D0 (pin2) --> 1kOhm --> PB0 (pin5) (MOSI)
  PORT.D1 (pin3) --> 1kOhm --> PB2 (pin7) (SCK)
  (PORT+1).D6 (pin10) <-- 1kOhm <-- PB1 (pin6) MISO
  PORT.D2 (pin4) --> 1kOhm --> PB5 (pin1) ~RST
  GND (pin25) -------------------GND (pin4)

  All fuse remained in their original default mode.
  Power is 5V (from a USB plug).
  And inside calibrated oscillator was used
  in default mode (8MHz and div8) so
  1usec/one instruction (if it is one cycle).


gcc -O0 master3w.c -o master3w -lrt

If you want to use this prg by a normal user:
chown root:laci master3w && chmod +s master3w
*/

#include <stdio.h>
#include <stdlib.h>
#include <sys/io.h>
#include <sys/time.h>

#define PORT    0x378      // Others: 0x278 0x3bc

// Constants for very slow clock
#define W100    100000     // 100ms reset
#define W1      1000       // 1ms half ck
#define W10     10000      // 10ms after a byte transfer

void wait_u(int);
int phase, snd_rec_byte(int); int main(int, char **);

int main(int n, char **p){
int i, snt, snt_1, rec;

if(n!=2){printf("Usage: %s {h|l}\n", p[0]); exit(1);}
phase=p[1][0]=='h'?2:0 ;

// Get permission for direct I/O under Linux:
if(ioperm(PORT,2,1)){printf("Couldn't open parallel port 0x%x\n", PORT); exit(1);}

// Reset pulse on the ~RST pin of the ATtiny45 while holding MOSI low and SCK
// at its idle level, then wait for the slave prg in the mcu to start
// before sending bytes to the slave

outb(0|phase, PORT); wait_u(W100);  //~RST=L, MOSI=L, SCK=L|H Tiny45 in reset state
outb(4|phase, PORT);                //~RST=H, MOSI=L, SCK=L|H Tiny45 starts up
wait_u(W100);                 // The master waits for the Tiny45 to finish startup


// This main cycle sends bytes and compares the complement of each received byte
// with the byte sent previously. The first received byte is dropped.

for(snt_1=-1, snt=0; ;snt_1=snt, snt++, snt&=0xff){
    rec=~snd_rec_byte(snt)&0xff; // send and receiv and get complement of received byte
    if(snt_1!=-1)printf("~Received: %02x %s Previously sent: %02x\n", rec, rec==snt_1?"==":"!=", snt_1);}
}  //End of main

// One byte send & receive               
int snd_rec_byte(int s){                 // PORT.D2=~RST  PORT.D1=SCK  PORT.D0=MOSI  (PORT+1).D6=MISO
int r, i;
for(r=i=0 ; i<8; i++, s<<=1){                    // 8 bit shifting while ~RST pin holding in H
    outb((s&0x80?5:4)|phase, PORT); wait_u(W1);  // holding SCK=L|H, MOSI=MSB of s; wait 1ms
    outb((s&0x80?7:6)^phase, PORT); wait_u(W1);  // change SCK=H|L, holding MOSI; wait 1ms
    r<<=1; if(inb(PORT+1)&0x40)r|=1;             // MISO shifted into r, outputs unchanged
    outb((s&0x80?5:4)|phase, PORT); wait_u(W1);} // change SCK=L|H, HOLDING MOSI, wait 1ms
wait_u(W10); return(r);                          // waiting 10ms after the byte transfer
}  //End of fnc

void wait_u(int c){                      // Delay minimum c usec
struct timeval req, req2;
gettimeofday (&req, NULL);
for(;;){
    gettimeofday (&req2, NULL);
    if((req2.tv_sec-req.tv_sec)*1000000+req2.tv_usec-req.tv_usec >= c)return;}
}  //End of func

ECHO_3WSLV.ASM

        .include   "/usr/share/avra/tn45def.inc"
;       .device ATtiny45

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; ECHO_3WSLV.ASM  LSz 2018.Jan.05 
;
; USI is in three wire and slave mode with ext CK.
; It echoes back the complement of received bytes.
;
; Before assembling, uncomment exactly one of the MODE definitions and
; one of the PHASE definitions from the following 4 (equ) lines!
; Choose the initial level of the master's CK
; according to the following table when you start master3w.
;
; ./master3w l  --> A2 or B1
; ./master3w h  --> A1 or B2
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

                                                            ; 3wire, slave, ext CK and
;       .equ    MODE = 1<<usiwm0 | 1<< usics1               ; [case A] shifting by rising edge
;       .equ    MODE = 1<<usiwm0 | 1<< usics1 | 1<< usics0  ; [case B] shifting by falling edge

                                                            ; clr USIOIF in USISR and
;       .equ    PHASE = 1<< usioif                          ; [case 1] set USICNT[3:0]=0
;       .equ    PHASE = 1<< usioif | 1<< usicnt0            ; [case 2] set USICNT[3:0]=1

        .cseg
        .org    0x0000          ;IT vector not used
        bclr    sreg_i          ;disable IT forever

        sbi     ddrb, ddb1      ;PB1(MISO) is output, all others are inputs

        ldi     r16, MODE       ;3wire, slave, ext CK, shifting by rising or falling edge
        out     usicr, r16
        ldi     r16, PHASE      ;clr USIOIF, USICNT[3:0]= 0 or 1
        out     usisr, r16

cyc0:   sbi     usisr, usioif   ;clr USIOIF
cyc1:   sbis    usisr, usioif   ;skip if byte is received
        rjmp    cyc1            ;wait for a byte from master

        in      r16, USIBR      ;get the received byte from USIBR
        com     r16
        out     usidr, r16      ;echo back the complement
        rjmp    cyc0            ;do forever
        .exit

; The four assembler outputs and the matching start parameter of the master:

; echo_3wslv_A1.hex   (./master3w h) 
; :020000020000FC
; :10000000F894B99A08E10DB900E40EB9769A769B96
; :0A001000FECF00B300950FB9F9CF41
; :00000001FF

; echo_3wslv_A2.hex   (./master3w l)
; :020000020000FC
; :10000000F894B99A08E10DB901E40EB9769A769B95
; :0A001000FECF00B300950FB9F9CF41
; :00000001FF

; echo_3wslv_B1.hex   (./master3w l)
; :020000020000FC
; :10000000F894B99A0CE10DB900E40EB9769A769B92
; :0A001000FECF00B300950FB9F9CF41
; :00000001FF

; echo_3wslv_B2.hex   (./master3w h)
; :020000020000FC
; :10000000F894B99A0CE10DB901E40EB9769A769B91
; :0A001000FECF00B300950FB9F9CF41
; :00000001FF

Now we can briefly summarize what the appropriate usage of USIBR means in these cases.

When the idle level of the master's CK is LOW, we can use two kinds of appropriately preinitialized 3w slaves:

1/ (B1) select the falling edge for shifting and leave USICNT[3:0] at 0.

2/ (A2) select the rising edge for shifting and preset USICNT[3:0] to 1.

And we can choose the dual equivalent of the above:

When the idle level of the master's CK is HIGH, we can use two kinds of appropriately preinitialized 3w slaves:

1/ (A1) select the rising edge for shifting and leave USICNT[3:0] at 0.

2/ (B2) select the falling edge for shifting and preset USICNT[3:0] to 1.
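The four cases follow from simple edge counting, which can be sketched in C. The function below is a simplified model under the assumption that USICNT counts every CK edge from its preset value and that only presets 0 and 1 are of interest; its name is hypothetical.

```c
#include <assert.h>

/* Returns 1 if the first USICNT overflow lands on the same kind of CK
 * edge that shifts USIDR, i.e. if USIBR is expected to be sampled
 * correctly according to the rule derived above.
 *   ck_idle_high    : idle level of the master's CK (0 = LOW, 1 = HIGH)
 *   shift_on_rising : 1 for case A (rising), 0 for case B (falling)
 *   preset          : initial USICNT[3:0], 0 (case 1) or 1 (case 2) */
int usibr_alignment_ok(int ck_idle_high, int shift_on_rising, unsigned preset)
{
    assert(preset <= 1);
    unsigned ovf_edge = 16u - preset;   /* 1-based edge of the overflow */
    /* CK idle LOW: odd edges rise. CK idle HIGH: odd edges fall. */
    int ovf_is_rising = ck_idle_high ? (ovf_edge % 2u == 0u)
                                     : (ovf_edge % 2u == 1u);
    return ovf_is_rising == (shift_on_rising != 0);
}
```

The model returns 1 for A1 and B2 with CK idle HIGH and for A2 and B1 with CK idle LOW. It returns 0 for the setup of the original question (CK idle LOW, rising-edge shift, USICNT starting at 0).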

We must be aware that this special alignment between the preinitialization of the slave and the idle level of the master's CK isn't part of any 3w or 2w protocol. It is needed only because the overflow of USICNT[3:0] and the shifting of USIDR have to happen on the same kind of CK edge in the ATtiny45/85.

This is essential information that should be in the chip's data sheet; it could save many hours for the customers and developers who want to use the USI part of this MCU family. (Or the chip should not have this side effect with USIBR.)

I note that this knowledge is also important in 2w mode.

Why was the answer of Microchip support wrong? Because they wanted to initialize USICNT[3:0] with the same value (0x01) after every overflow of USICNT[3:0].

And why was their answer also incomplete? Because they did not explore the duality between preinitializing USICNT[3:0] with 1 and changing the idle level of the master's CK.
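The difference between a one-time preset and a reload after every overflow can be illustrated by counting edges over several bytes. This is a simplified model for case A2 (CK idle LOW, rising-edge shift, 16 edges per byte), assuming the reload happens before the next edge; the function name is hypothetical.

```c
#include <assert.h>

/* Counts the overflows that coincide with the 8th rising edge of a byte
 * (edge 15 of each group of 16) over n_bytes transferred bytes.
 * USICNT starts at 1; if reload_each_overflow is nonzero it is set to 1
 * again after every overflow, otherwise it keeps counting from 0. */
unsigned aligned_overflows(unsigned n_bytes, int reload_each_overflow)
{
    assert(n_bytes > 0);
    unsigned cnt = 1, good = 0;
    for (unsigned e = 1; e <= 16u * n_bytes; e++) {
        cnt = (cnt + 1u) & 0x0fu;
        if (cnt == 0) {                 /* USICNT overflow on edge e */
            if (e % 16u == 15u)         /* last shifting edge of a byte */
                good++;
            if (reload_each_overflow)
                cnt = 1;
        }
    }
    return good;
}
```

In this model the one-time preset keeps all overflows aligned (4 of 4 for four bytes), while reloading 1 after every overflow shortens the period to 15 edges, so only the first of four bytes stays aligned.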

Brgds