Electronic – Why are data records in intel hex files often limited to 16 bytes, even for long contiguous blocks

compiler, hex file, linker

I'm not sure if this is true in general, but all Intel HEX files I have seen (from Atmel Studio, STM32CubeIDE and MPLAB) use data records with a length of 16 bytes. Even when the addresses written to are completely sequential, there are just a lot of 16-byte records. This results in a lot of redundancy, because the preamble with the memory address is transmitted again and again.

Why is it not common to use longer data blocks? Even if we wanted to stick to power-of-two data block lengths, we could easily fit 128 bytes in one record with the one-byte length field and reduce the total amount of transmitted data by roughly a quarter.
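
To put a number on that saving, here is a minimal sketch that counts the characters needed for the same contiguous data with 16-byte and with 128-byte records. It assumes a one-character line ending and ignores the end-of-file and extended-address records; the helper names and the 64 KiB image size are made up for illustration.

```python
def record_chars(payload_bytes: int, eol_chars: int = 1) -> int:
    """Characters in one data record: ':' + LL + AAAA + TT + data + CC + line ending."""
    return 1 + 2 + 4 + 2 + 2 * payload_bytes + 2 + eol_chars


def file_chars(total_bytes: int, payload_bytes: int, eol_chars: int = 1) -> int:
    """Characters needed for total_bytes of contiguous data (data records only)."""
    full, rest = divmod(total_bytes, payload_bytes)
    size = full * record_chars(payload_bytes, eol_chars)
    if rest:
        # A final, shorter record carries the leftover bytes.
        size += record_chars(rest, eol_chars)
    return size


if __name__ == "__main__":
    total = 64 * 1024  # hypothetical image size, chosen only for illustration
    short = file_chars(total, 16)
    long_ = file_chars(total, 128)
    print(f"16-byte records:  {short} characters")
    print(f"128-byte records: {long_} characters")
    print(f"saving: {100 * (1 - long_ / short):.1f} %")
```

Under these assumptions the saving comes out a little under 25%, since the two hex digits per data byte dominate the size either way.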

Sometimes I see even shorter blocks, still without skipping any memory addresses. In the following example I added "-" to make it easier to distinguish the separate fields (:LL-AAAA-TT-…) from each other.

:10-0160-00-3D0B00083D0B00083D0B000800000000-9F
:10-0170-00-3D0B00083D0B00083D0B000800000000-8F
:0C-0180-00-3D0B00083D0B00083D0B0008-83
:10-018C-00-10B5054C237833B9044B13B10448AFF3-C5
:10-019C-00-00800123237010BD0C00002000000000-23

There are a lot of data records with 16 data bytes, always increasing the address by 16. And then there is one record with only 12 bytes (length field 0x0C), with the next address increasing only by 12.
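
For anyone who wants to check the fields by hand, the sketch below decodes records written in the dashed notation used above and verifies the checksum. The `Record` type and `parse_record` function are hypothetical helpers, not part of any toolchain. Decoding the short record gives a length of 0x0C, i.e. 12 bytes, and 0x0180 + 12 = 0x018C matches the address of the following record.

```python
from typing import NamedTuple


class Record(NamedTuple):
    length: int
    address: int
    rtype: int
    data: bytes
    checksum: int


def parse_record(line: str) -> Record:
    """Parse one Intel HEX record; '-' separators (added for readability) are ignored."""
    text = line.strip().replace("-", "")
    if not text.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(text[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    length = raw[0]
    address = int.from_bytes(raw[1:3], "big")
    rtype = raw[3]
    data, checksum = raw[4:-1], raw[-1]
    if len(data) != length:
        raise ValueError("length field does not match payload size")
    # All bytes of the record, checksum included, must sum to 0 modulo 256.
    if sum(raw) & 0xFF != 0:
        raise ValueError("checksum mismatch")
    return Record(length, address, rtype, data, checksum)


if __name__ == "__main__":
    lines = [
        ":10-0170-00-3D0B00083D0B00083D0B000800000000-8F",
        ":0C-0180-00-3D0B00083D0B00083D0B0008-83",
        ":10-018C-00-10B5054C237833B9044B13B10448AFF3-C5",
    ]
    for rec in map(parse_record, lines):
        end = rec.address + rec.length
        print(f"addr=0x{rec.address:04X} len={rec.length} next=0x{end:04X}")
```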

Can somebody shed some light on this behaviour?
Because both gcc for STM and AVR controllers as well as the MPLAB XC Compiler for dsPIC show this same behaviour, I don't think that it is something architecture- or compiler-specific.

Is this just "evolved historically" to stay compatible with older parsers? Or does the checksum become too inefficient for longer records?
And for the even shorter records I can only imagine it has something to do with the linker? Are the shorter data records the end of some logical block and the linker is for some reason not filling up the record with data from the next block? Or is there some other reason?

Best Answer

The Intel Hexadecimal Object File Format Specification, rev. A from 1988, acknowledges that representing binary data as ASCII makes it possible to store binary files on a non-binary medium such as punch cards.

The maximum data record size is 255 bytes, but even a 32-byte data record, at 75 characters, would not fit on a 70-column punch card or data terminal line. The next smaller useful size is 16 bytes (43 characters), which could have become common for compatibility reasons and for easy readability, since the address then increments by 0x10 per record.
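
The line-width argument can be checked with a few lines of arithmetic. This is only a sketch: the 70-column limit is the figure used in this answer, the function names are made up, and line-ending characters are not counted.

```python
RECORD_OVERHEAD = 1 + 2 + 4 + 2 + 2  # ':' + LL + AAAA + TT + CC, in characters


def record_width(payload_bytes: int) -> int:
    """Printed width of one data record; every data byte takes two hex digits."""
    return RECORD_OVERHEAD + 2 * payload_bytes


def largest_pow2_payload(columns: int) -> int:
    """Largest power-of-two payload whose record fits in `columns` characters (0 if none)."""
    payload = 0
    candidate = 1
    while candidate <= 255 and record_width(candidate) <= columns:
        payload = candidate
        candidate *= 2
    return payload


if __name__ == "__main__":
    columns = 70  # line width assumed in the answer
    for size in (16, 32):
        print(f"{size}-byte record: {record_width(size)} characters")
    print(f"largest power-of-two payload for {columns} columns: "
          f"{largest_pow2_payload(columns)} bytes")
```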

So the format was not designed for speed or minimal overhead to begin with, but for maximum portability and compatibility between equipment for storage and transmission.

Shorter records appear where one block of the program ends and another begins, such as the end of the code area and the start of the data area. The linker sections are each processed on their own, and it is up to the user to determine which sections are to be included in a hex file. So the linked program, such as an .elf file, is not first converted to a raw binary that is then converted to a .hex file; the .elf file sections are processed one at a time, and the last record of a section is not filled up with data from the next section. If you want, you can make a raw binary from the .elf file first and convert that raw binary to a .hex file. The result would not contain any sign of the different sections, just the same continuous data as the raw binary.
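
The difference between the two approaches can be illustrated with a small sketch. This is not how any particular converter is implemented; the function names and the two sections (44 bytes at 0x0160, 32 bytes at 0x018C, all zeros) are invented to mirror the addresses in the question. It handles 16-bit addresses only and the flattened variant assumes the sections are contiguous.

```python
def data_record(address: int, payload: bytes) -> str:
    """Build one type-00 data record (16-bit address only)."""
    body = bytes([len(payload), (address >> 8) & 0xFF, address & 0xFF, 0x00]) + payload
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def emit_section(address: int, data: bytes, max_len: int = 16):
    """Yield records for one section; the last record may be shorter than max_len."""
    for offset in range(0, len(data), max_len):
        yield data_record(address + offset, data[offset:offset + max_len])


def emit_per_section(sections, max_len: int = 16):
    """Convert each section separately, as described above."""
    records = []
    for address, data in sections:
        records.extend(emit_section(address, data, max_len))
    return records


def emit_flattened(sections, max_len: int = 16):
    """Merge the sections into one image first, then split it into records."""
    ordered = sorted(sections)
    base = ordered[0][0]
    image = bytearray()
    for address, data in ordered:
        if address != base + len(image):
            raise ValueError("this sketch only handles contiguous sections")
        image += data
    return list(emit_section(base, bytes(image), max_len))


if __name__ == "__main__":
    # Hypothetical sections: 44 bytes at 0x0160 followed by 32 bytes at 0x018C.
    sections = [(0x0160, bytes(44)), (0x018C, bytes(32))]
    print("per section:", [int(r[1:3], 16) for r in emit_per_section(sections)])
    print("flattened:  ", [int(r[1:3], 16) for r in emit_flattened(sections)])
```

Converting per section leaves a 12-byte record at the end of the first section, while the flattened image only has a short record at the very end.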
