Debugging – How to Debug a Binary Format

I would like to be able to debug a binary builder that I am writing. Right now I basically print out the input data to the binary parser, then go deep into the code and print out the mapping of the input to the output, then take the output mapping (integers) and use that to locate the corresponding integers in the binary. This is pretty clunky, and it requires that I modify the source code deeply to get at the mapping between input and output.

It seems like you could view the binary in different ways (in my case I'd like to view it in 8-bit chunks as decimal numbers, because that's pretty close to the input). Actually, some numbers are 16-bit, some 8-bit, some 32-bit, and so on, so maybe there is a way to view the binary with each of these different numbers highlighted in some way.

The only way I can see that being possible is to build a visualizer specific to the actual binary format/layout, so that it knows where in the sequence the 32-bit numbers should be, where the 8-bit numbers should be, and so on. This is a lot of work and kind of tricky in some situations, so I am wondering whether there's a general way to do it.

I am also wondering what the general way of debugging this type of thing currently is, so maybe I can get some ideas on what to try from that.

Best Answer

For ad-hoc checks, just use a standard hexdump and learn to eyeball it.
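Standard tools such as `xxd`, `hexdump -C` or `od -t u1` cover most ad-hoc checks. If you want the decimal view the question asks about next to the hex, a few lines of Python will do it. This is only a minimal sketch; the function name `dump` and the column layout are my own choices, not a standard format.

```python
def dump(data: bytes, width: int = 8) -> str:
    """Render bytes as offset, hex, and decimal columns, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        dec_part = " ".join(f"{b:3d}" for b in chunk)
        # Pad the hex column so the decimal column lines up on a short last row.
        rows.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  | {dec_part}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Ten arbitrary bytes, just to show the two views side by side.
    print(dump(bytes([1, 0, 2, 255, 0, 0, 1, 0, 7, 42])))
```

Each row shows the offset, then the bytes in hex, then the same bytes as decimals, so you can match values against your input without converting in your head.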

If you want to tool up for a proper investigation, I usually write a separate decoder in something like Python - ideally this will be driven directly from a message spec document or IDL, and be as automated as possible (so there's no chance of manually introducing the same bug in both decoders).
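As a rough sketch of what "driven from a spec" can look like, the decoder below walks a field table and reports each field's offset, width and value. That gives you the per-field view (8-bit here, 16-bit there, 32-bit elsewhere) without hard-coding a visualizer. The `SPEC` table, the field names and the little-endian byte order are all hypothetical; in practice you would generate the table from your real message spec or IDL.

```python
import struct

# Hypothetical message layout, standing in for a real spec or IDL:
# (field name, struct format code). B = 8-bit, H = 16-bit, I = 32-bit unsigned.
SPEC = [
    ("version", "B"),
    ("flags", "B"),
    ("count", "H"),
    ("payload_len", "I"),
]


def annotate(data: bytes, spec=SPEC, byte_order: str = "<"):
    """Return (offset, size, name, value, raw_bytes) for each field in spec order."""
    fields = []
    offset = 0
    for name, code in spec:
        fmt = byte_order + code
        size = struct.calcsize(fmt)
        if offset + size > len(data):
            raise ValueError(
                f"truncated input: {name!r} needs {size} bytes at offset {offset}"
            )
        (value,) = struct.unpack_from(fmt, data, offset)
        fields.append((offset, size, name, value, data[offset:offset + size]))
        offset += size
    return fields


if __name__ == "__main__":
    sample = bytes([1, 0, 2, 0, 10, 0, 0, 0])
    for offset, size, name, value, raw in annotate(sample):
        raw_hex = " ".join(f"{b:02x}" for b in raw)
        print(f"{offset:4d}  {size * 8:2d}-bit  {name:<12} {value:>10}  [{raw_hex}]")
```

Because the only format-specific part is the table, the same loop can be pointed at a different layout by swapping `SPEC`, and it shares no code with the builder you are trying to debug.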

Lastly, don't forget you should be writing unit tests for your decoder, using known-correct canned input.
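A canned-input test might look roughly like this. `decode_header` is a hypothetical stand-in for whatever decoder you are testing, and the 4-byte layout is invented for the example; the point is that the expected values are worked out by hand, not produced by the code under test.

```python
import struct
import unittest


def decode_header(data: bytes) -> dict:
    """Stand-in for the decoder under test (hypothetical 4-byte header)."""
    version, flags, count = struct.unpack("<BBH", data[:4])
    return {"version": version, "flags": flags, "count": count}


# Canned input whose correct decoding was worked out by hand:
# version = 0x01, flags = 0x80, count = bytes 02 01 little-endian = 0x0102.
KNOWN_GOOD = bytes.fromhex("01 80 02 01")
EXPECTED = {"version": 1, "flags": 0x80, "count": 0x0102}


class DecodeHeaderTest(unittest.TestCase):
    def test_known_good_message(self):
        self.assertEqual(decode_header(KNOWN_GOOD), EXPECTED)

    def test_truncated_message_is_rejected(self):
        # Three bytes cannot hold the 4-byte header, so unpacking must fail.
        with self.assertRaises(struct.error):
            decode_header(KNOWN_GOOD[:3])
```

Saved as a module, this runs with `python -m unittest`. Capturing real messages from a known-good implementation and checking them in as fixtures works the same way.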

Update

I don't know what language you're working in, or even whether what I'm about to post will be of any use to you :-), but here goes anyway.

This is a TLV decoder I wrote back in 2008(ish) in C#. (Note that it has no encoding capabilities, but it should be easy to reverse.) It was designed for decoding BER-TLV packets coming off a smart card in a payment terminal, but it might serve as a starting point for you to hack into a more useful shape.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Card_Analyzer
{
  public class tlv
  {
    public int tag = 0;
    public int length = 0;
    public byte tagClass = 0;
    public byte constructed = 0;
    public List<byte> data = new List<byte>();
  }

  public class tlvparser
  {
    // List of found TLV structures
    public List<tlv> tlvList = new List<tlv>();

    // Constructor
    public tlvparser(byte[] data)
    {
      if (data != null)
      {
        this.doParse(data);
      }
    }

    // Main parsing function
    public void doParse(byte[] data)
    {
      int fulltag = 0;
      int tlvlen = 0;
      int dptr = 0;

      while (dptr < data.Length)
      {
        byte temp = data[dptr];
        int iclass = temp & 0xC0;
        int dobj = temp & 0x20;
        int tag = temp & 0x1F;

        if (tag == 0x1F) // Low five bits all set: the tag continues in a second byte (longer tags are not handled)
        {
          fulltag = (temp << 8) + data[dptr + 1];
          tlvlen = data[dptr + 2];
          dptr += 3;
        }
        else
        {
          fulltag = temp;
          tlvlen = data[dptr + 1];
          dptr += 2;
        }// End if tag 16 bit

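        if ((tlvlen & 0x80) != 0)
        {
          // BER long form: the low 7 bits give the number of length bytes that follow.
          // (0x80 alone, the indefinite form, is treated as length 0 here.)
          int lengthBytes = tlvlen & 0x7F;
          tlvlen = 0;
          for (int k = 0; k < lengthBytes && dptr < data.Length; k++)
          {
            tlvlen = (tlvlen << 8) | data[dptr++];
          }
        }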

        tlv myTlv = new tlv();
        myTlv.tag = fulltag;
        myTlv.length = tlvlen;
        myTlv.tagClass = Convert.ToByte(iclass >> 6);
        myTlv.constructed = Convert.ToByte(dobj >> 5);

        for (int i = 0; i < tlvlen; i++)
        {
          if(dptr < data.Length)
            myTlv.data.Add(data[dptr++]);
        }

        if (myTlv.constructed == 1)
          this.doParse(myTlv.data.ToArray());

        tlvList.Add(myTlv);

      }// End main while loop
    }// End doParse
  }// end class tlvparser
}// end namespace

If it's of no use, then feel free to just ignore it.
