Electronic – Reading char values, returning 16 and 8 bit values

c, microcontroller, pointers

I'm reading an internal memory block. I declare a tracking variable along with the start and end addresses of the block:

//Defined as globals.
#define _beginAddrr     0x80000000/*0x80070000*/
#define _endAddrr       0x8000FFFF/*0x80077FF3*/
static UINT32 _currentAddrr = _beginAddrr;

The following code is called every 100 ms (so we don't hang the processor) and reads 10 values per call.

void memory_map ( void )
{
    int i = 0;
    for (i = 0; i < 10; i++)
    {
        if( _currentAddrr <= (_endAddrr) )
        {
            PRINTF("0x%08X : 0x%X",_currentAddrr, (*(char*)_currentAddrr)); //Print Value
            _currentAddrr++; //Increment
        }
        else
        {
            //We finish, turn this false.
            set_memorymap_enabled(FALSE);
            return;
        }
    }
}

set_memorymap_enabled() clears the flag so that we stop reading memory. The memory_map function is therefore called like this:

if(get_memorymap_enabled() != 0)
{
    memory_map(); //PERFORM MEMORY MAP FUNCTION FOR JIRA
}
else
{
    //PRINTF("Restarting _currentAddrr");
    _currentAddrr = _beginAddrr;
}

I'm using an AVR32 UC3 microcontroller for this task. The output is the following:

[Image: listing of the PRINTF output]

My question is: if I'm reading a char value, why am I getting a leading 'FF' on some values? Does it mean that the value is signed?

Best Answer

I had started to add an answer earlier but I wanted to wait to see if I had anything to add to what was being written before bothering. I have a couple of added points that may help you in the future. So I'll write a little something here.

First off, I'd describe this as a "two's complement sign extension" issue. In the early days of C there was no standard except for the version of C that was maintained by Dennis Ritchie at Bell Labs. Later, when Brian Kernighan and Ritchie published "The C Programming Language" in 1978, that book became the standard (such as it was.) (That is when I learned C, 1978, while working on Unix v6 kernel source code.) Still later, when Sam Harbison and Guy Steele Jr published "C: A Reference Manual" it changed things still more, in part because it elaborated so many nuances that previously hadn't been put into easily accessible print before.

Since then, C has undergone standardization several times (and Harbison and Steele updated their book, with the 5th edition I think, to provide a post-standard edition). I stopped keeping track of the details around the mid-1990s, as I was no longer bothering with compilers then. So I missed out on much of the post-1999 changes (C99 and C11). I can only speak to the intimate details prior to C99 and C11, unfortunately.

Because of early choices made for the function activation frame in C, it could accommodate function calls with varying parameter lists. This was a remarkable (at the time) idea that paid dividends with functions such as printf() and scanf(), where the caller could determine how many parameters needed formatted I/O rather than the function, itself.

However, there is always a price to pay for every choice. The C compiler had no way to examine the called function to learn the type its programmer intended for each parameter; all it could see was a laundry list of calls with different numbers of expressions of differing types. So the compiler and/or function coder either had to support a nearly unbounded set of variable argument forms [such as with %f, %d and so on] or else make some "reasoned assumptions" about the physical size, layout, and implications of passed parameters in order to put a limit to things. They chose to tell the C compiler that it only had to place expression results into a small number of physical sizes and forms, and to inform function coders that they only had to worry about this "short list."

So the C compiler conspired together with coders so as to limit the type and layout allowed. It boiled down to "pointers" (which usually only had one specific size -- though Intel would later definitely make a huge mess of this idea with their x86 core), integers, and floating point. The floating point concept was reduced further, by fiat, to "double." So all expressions that would otherwise be some kind of floating point result would be promoted into 'double' and then passed as that size and layout. The integer concept was more of a struggle. They banned 'char', saying instead that all integer expressions of a size smaller than 'int' would be promoted to 'int' and laid out in that format, and that all integer expressions requiring (or specifying) a size larger than 'int' would be promoted to 'long' and laid out in that format.

It boiled down to some assumptions that everyone decided to share. This limited the sprawl of value sizes and types and made variable parameter lists tractable when writing real code.
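The promotion rules described above can be seen from the receiving side of a variadic call. The following is only a minimal sketch: sum_promoted and sum_doubles are hypothetical helpers invented for illustration, not part of the original code. They show that values passed as 'char' or 'float' have to be fetched as 'int' and 'double'.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` variadic arguments that the
   caller passed as char-sized values. Each one has already been
   promoted to int by the time it is passed, so it is fetched as int. */
int sum_promoted(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++)
    {
        /* Asking for va_arg(args, char) here would be undefined behaviour. */
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Hypothetical helper: same idea for floating point. A float argument
   is promoted to double, so it is fetched as double. */
double sum_doubles(int count, ...)
{
    va_list args;
    double total = 0.0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++)
    {
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}
```

A call such as sum_promoted(2, (signed char)-1, (unsigned char)200) passes two 'int' values, -1 and 200, even though neither expression had type 'int' at the call site.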

In your case, your expression is:

(*(char*)_currentAddrr)

This is taken to mean "read the value stored in _currentAddrr and assume it represents a pointer to a 'char' data type, then go read the value indicated by the pointer." This is, of course, of type 'char'. The C compiler then notices that this is smaller than (probably, but not always) an 'int' and follows the rules everyone has agreed to and promotes it to an 'int' before laying it out as a parameter value for the function call.

The problem here is "promotes." What does this mean? If the value to be promoted is a signed value, then the C standard requires that the larger physical format also be signed and that the resulting value matches in its meaning. So if negative as a 'char' then it must remain negative as an 'int' after the promotion. If positive as a 'char' then it must remain positive as an 'int' after the promotion.

In your case, any byte value of 0x80 or larger (that is, above 0x7F) must be seen as a negative value if the type is 'signed char.' [Whether a plain 'char' is signed is left to the implementation; many compilers take it to be 'signed char', but this wasn't a firm requirement in the standards of C I know well.] So a byte of 0xAA would become (two's complement) 0xFFAA with a 16-bit 'int', or 0xFFFFFFAA with a 32-bit 'int'. In short, the sign gets extended. Which is why I say this is a "sign extension" issue.
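To make the sign extension concrete, here is a small sketch that can be tried on a host compiler. The function name promote_as_signed is made up for illustration, and the code assumes a two's complement target, which covers every mainstream compiler I'm aware of.

```c
#include <stdint.h>

/* Hypothetical helper: treat a raw byte as 'signed char', let it be
   promoted to int, and return the bit pattern of that int. */
unsigned int promote_as_signed(uint8_t raw)
{
    /* For raw > 0x7F this conversion is implementation-defined in older
       standards; two's complement compilers keep the bit pattern, so
       0xAA becomes -86. */
    signed char c = (signed char)raw;

    /* The value -86 must stay -86 in the wider type, so the upper
       bits of the int are filled with ones. */
    int promoted = c;

    return (unsigned int)promoted;
}
```

With 0xAA as input, the low 16 bits of the result are 0xFFAA. With 0x7F as input the result is just 0x7F, because the sign bit was clear and there was nothing to extend.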

However, if the type were 'unsigned char' then the rule instead says that the positive nature of the value is to be preserved in the larger format. So when promoting an 'unsigned char' to an 'int' (which is really a 'signed int'), the C compiler does NOT perform the sign extension. Instead, the value is simply placed into the larger format with the upper byte(s) set to zero.

This is why it's rather easy to fix with:

( *(unsigned char*) _currentAddrr )

All this does is tell the C compiler something different about your pointer, explicitly saying that it is an unsigned byte. So the C compiler gets the clue and does NOT sign-extend the value when promoting it (as it must do, earlier by convention and later by explicit standard.)
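As a rough sketch of the fix, the read can be wrapped in a small helper. Both function names below are hypothetical and not from the original code. The first reads through an 'unsigned char' pointer; the second keeps the 'char' read and masks off the extended bits afterwards, which is an alternative I'm adding for comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read one byte through an unsigned pointer, so
   the promotion zero-extends instead of sign-extending. */
unsigned int read_byte(const void *addr)
{
    assert(addr != NULL);
    return *(const volatile unsigned char *)addr;
}

/* Alternative sketch: keep the plain 'char' read, then mask away any
   sign-extended upper bits. Gives the same result whether plain
   'char' is signed or unsigned on the target. */
unsigned int read_byte_masked(const void *addr)
{
    assert(addr != NULL);
    return (unsigned int)(*(const char *)addr) & 0xFFu;
}
```

In the loop from the question, the second PRINTF argument would then be something like read_byte((const void *)_currentAddrr), assuming the integer-to-pointer cast is acceptable on your target, as it already is in the original expression.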

Always keep in mind these implicit conversions. They also take place in "if" statements, for the condition tests. So they appear in a variety of places and can catch you, if you are unaware.
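Here is a minimal sketch of how the same promotion can bite in a condition test. The two functions are hypothetical, written only to illustrate the point: a byte held in a 'signed char' never compares equal to 0xAA, because it is promoted to the 'int' value -86 before the comparison, while 0xAA is the 'int' value 170.

```c
/* Hypothetical helper: hold the byte in a 'signed char' and compare.
   The char is promoted to int first, so a raw byte of 0xAA is
   compared as -86 (on a two's complement target). */
int matches_as_signed(unsigned char raw, int expected)
{
    signed char c = (signed char)raw;
    return c == expected;
}

/* Hypothetical helper: same comparison with an 'unsigned char'.
   The promotion zero-extends, so 0xAA is compared as 170. */
int matches_as_unsigned(unsigned char raw, int expected)
{
    unsigned char c = raw;
    return c == expected;
}
```

So matches_as_signed(0xAA, 0xAA) is false while matches_as_unsigned(0xAA, 0xAA) is true, even though the byte stored in memory is identical in both cases.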