Electronic – Changing to relocatable code for PIC microcontrollers

assembly, pic, programming

I've come back to PIC programming after 10 years and I've been relearning everything. I'm looking at the section in the MPASM manual where it discusses relocatable code, and I'm puzzled as to why absolute code is used at all.

Take this case:

        processor pic16f88
        #include p16f88.inc

.data1  udata      0x20
var1    res        1
var2    res        1

.reset  code 0
        pagesel    Init
        goto       Init

        code
Init:
        ....

        end

For all intents and purposes, isn't that absolute code?

So, from what I can see you can change the udata line and remove the 0x20. Then the linker will place it where it wants to. But you can override that in the linker script and specify an exact position:

section name=.data1 ram=gpr0

I'm mentioning this on stackexchange because it's rarely mentioned and it's the only reason one would ever use absolute code over relocatable (again, in my humble opinion).

While typing this, Stackexchange suggested this link which is a superb response and one of the few places I've seen the linker mentioned.

I do have a few questions though:

  • Can I use a dot to lead a section name? Like ".reset" above. I've seen it used but I'm not sure of its validity. In my mind it's a way of keeping labels (initial caps), variable names (all lower) and section names (start with dot) in separate "namespaces".

  • I'm puzzled by the idata directive. If the data is initialised, who does so? Is there a code block I have to call to set the initial data? I'd love to use this instead of setting initial data in an init section.

  • I want to use a large block of data for a buffer. I'm using a PIC16F1829 so I can use the FSR in a flat memory mode to point cleanly across banks. My issue is: how do I tell the assembler and the linker that I'm using, say, banks 3 and 4 for this purpose? If I use the keyword "protected" in the linker, do I have to use udata in the assembler? Or can I just pick some memory, put it in FSR and start writing?

And: Why on earth do books and tutorials (Gooligum excepted) insist on using absolute code, and why do lecturers keep using it? It does seem totally bizarre to promote absolute over relocatable long past its due date.

I realise this is opinion and might not be within Stackexchange's guidelines either but it's terribly important to know that relocatable code exists!

Best Answer

There are two main reasons you still see absolute mode MPASM code out there: Originally that's all there was, and there are a lot of religious people out there with strongly held beliefs. They think absolute mode is "easier" somehow or that using the linker is hard to learn. Basically, that's what they know and don't want to bother learning a different way, even if it's a far better way, so they make excuses.

Yes, using relocatable mode is obvious. It is all I have ever used in well over 100 PIC projects. I did have the advantage of starting with PICs right after the linker was introduced (1998?), so I never had any investment in absolute code to protect. In any case, this is now ancient history, and there simply is no good reason to use absolute mode today. Note that absolute mode isn't even an option with any of the newer toolchains, like for the dsPIC.

Some substantial advantages of using relocatable code:

  1. It is possible to actually allocate RAM for variables. The RES directive, which is the only way to do this, is only available in relocatable mode.

    The common hack of using CBLOCK in absolute mode to define variables creates symbols with sequential values that only you know represent addresses of variables. Since the system doesn't know the memory locations are used for these variables, it can't detect and tell you about collisions or overflows.

  2. You get to use modules, meaning different parts of your code are separately built. This provides, among other things, a separate namespace for local symbols in each module. Separate modules can be written separately, each having a local variable called COUNT or a label called LOOP without conflict, for example.

  3. You can easily prevent code sections from crossing page boundaries.

    In absolute mode, the code just ends up where it ends up, with no detection or warning that different parts are on different pages, and therefore require PCLATH manipulation to jump or call between them. Worse yet, this can change every build as code is modified. It might be fine one build, then you get a subtle bug when a page boundary happens to end up between the start and end of a loop.

  4. Code is somewhat insulated from memory layout details. Some old PICs started user RAM at 20h. Code that used CBLOCK h'20' to define variable symbols (remember that CBLOCK doesn't really define variables) will break on many newer PICs without warning. Code that used UDATA and RES will be fine, or will get a linker error if the RAM region is overflowed.
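To make points 1 and 4 concrete, here is a minimal sketch of the two styles side by side. The symbol and section names (cnt_abs, cnt_rel, vars and so on) are made up for illustration and are not taken from any particular project.

```asm
; Absolute-mode habit: CBLOCK only gives names to numbers.
        cblock  0x20
cnt_abs                         ; cnt_abs = 0x20, nothing is reserved
tmp_abs                         ; tmp_abs = 0x21
        endc

; Relocatable mode: RES reserves storage in a section the linker places.
vars    udata                   ; no address given, the linker picks one
cnt_rel res     1               ; one byte, address assigned at link time
tmp_rel res     1
```

If the vars section outgrows the RAM region the linker assigns it to, the link fails with an error. The CBLOCK symbols simply keep counting upward, whatever actually lives at those addresses.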

There are other advantages, but these are so compelling as to make absolute code a blatantly stupid choice. There really is no excuse.

While the overall advantages are overwhelming, there are some issues that need to be considered in relocatable mode as opposed to absolute mode.

The main one is that when using a bare UDATA, you don't know the bank of variables at build time. This prevents bank-setting optimizations. I get around this by specifying the bank for local variables of a module, and usually a single bank for the limited global state. Local variables within a module are forced to a particular bank by something like .BANK2 UDATA, where .BANK2 is a section defined in the linker file that is forced to bank 2. That still lets the linker allocate variables within the bank, and you will get an error if you put too much stuff in any one bank. This scheme means you end up doing bank allocation per module, but that's still a lot better than the all-manual bank allocation without overflow detection that you get in absolute mode.
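A minimal sketch of that arrangement, assuming the linker script has a region called gpr2 that covers bank 2. The region name, the address range and the variable names are illustrative only, so check them against the stock linker script for your device.

```asm
; Linker script side (MPLINK syntax), shown here as comments.
; The address range is an assumed bank 2 range, not a verified one.
;   DATABANK NAME=gpr2   START=0x110 END=0x16F
;   SECTION  NAME=.BANK2 RAM=gpr2

; Assembler side: in MPASM the section name goes in the label field.
.BANK2  udata                   ; no address given, linker places it in gpr2
rxcount res     1               ; hypothetical module-local variables
rxbuf   res     16
```

The linker remains free to place rxcount and rxbuf anywhere inside gpr2, and it reports an error if the sections assigned to that region no longer fit.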

Since in my code the banks of most variables are known at build time, I can optimize bank setting. I have macros that set the bank and track the current bank setting. On a classic PIC 16, the DBANKIF (set direct bank if needed) macro emits 0, 1, or 2 BSF/BCF instructions on the bank bits in STATUS. A second redundant DBANKIF never emits any code. I can therefore use DBANKIF in front of most variable references, and only the minimum necessary bank setting instructions are actually included in the code. This results in nicely optimized code, with a single assembly constant to change if I want all the local variables in a different bank. The bank switching code will be automatically adjusted accordingly.

Since the DBANKIF and related macros track the bank state in source code order, you do have to pay some attention. For example, this system can't know that code from another place with a different live bank setting may jump into other code. For this reason, I have macros that either tell the build-time logic what the bank setting actually is, or tell it explicitly that it doesn't know. For example, most code labels have UNBANK following them. That tells the build-time bank tracking system to invalidate any assumptions. The next DBANKIF will explicitly set both bank bits, then the system starts tracking from there again.
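The real DBANKIF and UNBANK macros are not reproduced here. As a rough idea of how build-time bank tracking can work on a classic PIC16 with RP0 and RP1 in STATUS, here is a simplified sketch. The names bank_if, bank_unknown, cur_bank, LBANK and rxcount are hypothetical stand-ins, and this macro takes a bank number rather than an address.

```asm
; Assumes the device include file has defined STATUS, RP0 and RP1.

; Build-time state: 0..3 = bank known to be selected, -1 = unknown.
cur_bank set    -1

; Stand-in for UNBANK: discard any assumption about the bank.
bank_unknown macro
cur_bank set    -1
        endm

; Stand-in for DBANKIF: select bank newbank (a constant 0..3),
; emitting only the BSF/BCF instructions that are actually needed.
bank_if macro   newbank
        if (cur_bank == -1) || (((cur_bank ^ (newbank)) & 1) != 0)
          if ((newbank) & 1) != 0
        bsf     STATUS, RP0
          else
        bcf     STATUS, RP0
          endif
        endif
        if (cur_bank == -1) || (((cur_bank ^ (newbank)) & 2) != 0)
          if ((newbank) & 2) != 0
        bsf     STATUS, RP1
          else
        bcf     STATUS, RP1
          endif
        endif
cur_bank set    newbank         ; remember what is now selected
        endm

; Usage fragment. rxcount is assumed to live in bank LBANK.
LBANK   equ     2               ; assumed bank of this module's locals

Loop:                           ; may be reached from elsewhere,
        bank_unknown            ; so drop any bank assumption
        bank_if LBANK           ; bank unknown: emits BCF RP0 and BSF RP1
        incf    rxcount, f
        bank_if LBANK           ; bank already 2: emits nothing
        movf    rxcount, w
```

Because cur_bank is updated in source order, the tracking is only as good as the bank_unknown calls placed after labels that other code can reach, which is the caveat described above.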

The way I deal with pages is to use the convention that the upper two bits of PCLATH are always set to the page of the currently executing code, and that each page is defined as a separate memory region in the linker file. That guarantees that any one code section won't straddle a page boundary. Usually each module contains a single named code section, so effectively code within a module can use local GOTO and CALL without PCLATH manipulation. The flip side is that you have to assume code in any other section can be on another page, if your PIC has more than one page. This means before a CALL to a remote subroutine, you have to set PCLATH for the target page, then restore it after the return. I have GCALL (global call) and GJUMP (global jump) macros that do just that in a single line of source code.
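GCALL and GJUMP themselves are not shown here. On a classic PIC16, macros with the same intent can be sketched with MPASM's pagesel directive. The names far_call, far_jump and uart_put are hypothetical, and the sketch assumes the convention that PCLATH normally points at the page of the running code.

```asm
; Call a routine that may be on another page, then restore PCLATH.
far_call macro  target
        pagesel target          ; set the PCLATH page bits for the target
        call    target
        pagesel $               ; back to the page of the calling code
        endm

; Jump to code that may be on another page.
; Nothing is restored, because control does not come back here.
far_jump macro  target
        pagesel target
        goto    target
        endm

; Usage fragment: uart_put is a hypothetical routine in another module.
        extern  uart_put
        far_call uart_put
```

Calls and jumps that stay inside one code section can keep using plain CALL and GOTO, since that section cannot span two pages when each page is its own linker region.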