Intel HEX files are always byte-addressed. This does not mean they can't handle information for other word sizes, only that there needs to be a convention about how those words are mapped to the bytes of the HEX file.
Just like with all the other non-byte addressed PICs (PIC 10, 12, and 16), the addresses are doubled in the HEX file. PIC programmer software knows this and interprets the HEX file addresses accordingly. This is of course all well documented in the programming spec for whatever part you want to program.
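To make the doubling concrete, here is a minimal sketch of the address conversion a programmer's host software has to do for program memory. The function names are made up for illustration, and the low-byte-first word layout is an assumption you should check against the programming spec for your part. Other memory regions may follow a different rule.

```python
def hex_addr_to_word_addr(hex_addr):
    """Map a byte address from the HEX file to a PIC program-word address."""
    if hex_addr % 2:
        raise ValueError("a program word cannot start at an odd HEX address")
    return hex_addr // 2


def word_addr_to_hex_addr(word_addr):
    """Map a PIC program-word address to the byte address used in the HEX file."""
    return word_addr * 2


def bytes_to_words(data):
    """Pair up HEX file bytes into program words (assumed low byte first)."""
    if len(data) % 2:
        raise ValueError("need an even number of bytes")
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data), 2)]


# Word address 33 (0x21) shows up in the HEX file as byte address 0x0042.
print(hex(word_addr_to_hex_addr(33)))
print(hex_addr_to_word_addr(0x0042))
```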
You say you want to make your own programmer. That's fine as long as you understand this will take way more time and frustration than just getting a known working one. If the point is the experience and learning of making your own, then fine, but otherwise go buy one.
If you really do want to make your own, you should look at the code for my PIC programmers. All the host code and firmware is open and available in the Development Software release at http://www.embedinc.com/picprg/sw.htm. By looking thru the host source code, you can see how there are flags indicating whether HEX file addresses are doubled for various parts of the PIC's memory.
If you make your programmer compatible with my PIC programmers' protocol, then you can make use of all my host-side tools. This could be very helpful when bringing up your system since you have known working code on the other side. The protocol spec may look intimidating at first, but look carefully and you will see much of it is optional, especially if you plan to only support a single PIC.
This is a standard-format Intel HEX file, as produced by many manufacturers' linkers.
The first character is always ':'
The next two hex digits are the byte count: the number of bytes (two hex digits each) in the data field of the line. So the first line has two bytes of data, the next eight bytes, and the last zero bytes.
The next four hex digits are the address, in bytes. Olin has already mentioned why the starting address of 33 (0x21) is displayed as 0042: the addresses are doubled in the HEX file.
The next two hex digits are the record type. 02 is an extended segment address record, which allows addresses to extend beyond the 64K limit of the original format. But the data field following the 02 in the first line is 0000, so there really isn't any extension in this case.
The last two hex digits are the checksum (see the Wikipedia article for the calculation).
In the second line, the record type 00 marks a data record: the address field (0042) precedes the record type, and the eight bytes of data follow it (00, C0, ... thru CF).
The last line, with the record type of 01, is an end-of-file record.
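The field layout described above can be sketched as a small parser. This is only an illustration, not code from any particular programmer: the function names are my own, and the middle record in the usage loop is a made-up two-byte data record at address 0042, not the line from the question. The checksum is the two's complement of the low byte of the sum of all the other bytes in the record.

```python
RECORD_TYPES = {
    0x00: "data",
    0x01: "end of file",
    0x02: "extended segment address",
}


def checksum(payload):
    """Two's complement of the low byte of the sum of the record bytes."""
    return (-sum(payload)) & 0xFF


def parse_record(line):
    """Split one Intel HEX line into its fields and verify its checksum."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    count = raw[0]
    # count + address (2) + type (1) + data (count) + checksum (1)
    if len(raw) != count + 5:
        raise ValueError("byte count does not match record length")
    if checksum(raw[:-1]) != raw[-1]:
        raise ValueError("bad checksum")
    return {
        "count": count,
        "address": int.from_bytes(raw[1:3], "big"),
        "type": raw[3],
        "data": raw[4:-1],
    }


# Extended segment address of 0000, a hypothetical data record, end of file.
for line in (":020000020000FC", ":02004200002894", ":00000001FF"):
    rec = parse_record(line)
    name = RECORD_TYPES.get(rec["type"], "other")
    print(name, "%04X" % rec["address"], rec["data"].hex())
```

A real reader would also have to apply the extended address records to the addresses of the data records that follow them; this sketch only decodes one line at a time.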
Best Answer
You can use avr-size to check the real size of your program. It works better with .elf files, since it can show you both how much RAM you need (data + bss) and how much flash will be used (text + data). With the .hex file it only shows you the second figure (labeling it data).
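The arithmetic behind those two figures, as a small sketch. The function name and the section sizes are made up for illustration; they are not output from a real build.

```python
def memory_usage(text, data, bss):
    """Combine section sizes (in bytes) into flash and RAM totals."""
    return {
        "flash": text + data,  # code plus the initial values of initialized variables
        "ram": data + bss,     # initialized plus zero-initialized variables
    }


# Hypothetical section sizes, in bytes.
print(memory_usage(text=1000, data=20, bss=50))
```

The data section counts twice because the initial values are stored in flash and copied to RAM at startup.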