Code Quality – How to Present Code in Academic Work?

code-quality, source code, writing

I'm writing my undergraduate thesis, which consists of analysing the BitTorrent algorithm and examining its implementation in the Transmission client as an example.

Reading through its code, which is written in C, you can see many layers of functions:

static const char*
tr_metainfoParseImpl (const tr_session  * session,
                      tr_info           * inf,
                      bool              * hasInfoDict,
                      int               * infoDictLength,
                      const tr_variant     * meta_in)
{
  int64_t i;
  size_t len;
  const char * str;
  const uint8_t * raw;
  tr_variant * d;
  tr_variant * infoDict = NULL;
  tr_variant * meta = (tr_variant *) meta_in;
  bool b;
  bool isMagnet = false;

  /* info_hash: urlencoded 20-byte SHA1 hash of the value of the info key
   * from the Metainfo file. Note that the value will be a bencoded
   * dictionary, given the definition of the info key above. */
  b = tr_variantDictFindDict (meta, TR_KEY_info, &infoDict);
  if (hasInfoDict != NULL)
    *hasInfoDict = b;

  if (!b)
    {
      /* no info dictionary... is this a magnet link? */
      if (tr_variantDictFindDict (meta, TR_KEY_magnet_info, &d))
        {
        (...)

tr_metainfoParseImpl() is the function called after we add a torrent, either from a .torrent file or from a magnet link. It calls tr_variantDictFindDict() to look up the "info" key in the metadata dictionary, in order to get information about that torrent.

Algorithmically, this lookup has no value to me, since I want to emphasize aspects of the BitTorrent algorithm other than string search, although I want to leave the calling line there just to illustrate that it is happening.

The function tr_variantDictFindDict() and its callee are:

bool // func1
tr_variantDictFindDict (tr_variant       * dict,
                        const tr_quark     key,
                        tr_variant      ** setme)
{
  return tr_variantDictFindType (dict, key, TR_VARIANT_TYPE_DICT, setme);
}

static bool // func2
tr_variantDictFindType (tr_variant      * dict,
                        const tr_quark    key,
                        int               type,
                        tr_variant     ** setme)
{
  return tr_variantIsType (*setme = tr_variantDictFind (dict, key), type);
}

As we can see, although this layering may exist for code engineering reasons, algorithmically it has no value.

So, I'm looking for reasonable, feasible, practical ways to avoid showing this kind of code in my work.

Using the code above as an example, I thought of some options:

  1. show the relevant part of func1's caller, including the line that calls func1, and then show func2 (the callee), which contains the other relevant part of the code

  2. put the code of func1's caller and func2 (the callee) "side by side", as if they were one big function

  3. literate programming from the beginning
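To make option 2 concrete, here is a minimal sketch of what a flattened presentation could look like: the two thin wrappers (func1 and func2) are collapsed into a single function that does the lookup and the type check in one place. Everything prefixed with sketch_ is a hypothetical stand-in that I made up for illustration; these are not Transmission's real types or functions, and the linear search is only an assumption about what tr_variantDictFind() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT Transmission's types. */
enum sketch_type { SKETCH_TYPE_STR, SKETCH_TYPE_DICT };

struct sketch_variant
{
  enum sketch_type type;
  const char * key;                 /* key this entry is stored under */
  struct sketch_variant * children; /* entries, when type is DICT */
  size_t child_count;
};

/* Plays the role of tr_variantDictFind(): assumed here to be a
 * plain linear search by key. Returns NULL when the key is absent. */
struct sketch_variant *
sketch_dict_find (struct sketch_variant * dict, const char * key)
{
  assert (dict != NULL && key != NULL);

  for (size_t i = 0; i < dict->child_count; i++)
    if (strcmp (dict->children[i].key, key) == 0)
      return &dict->children[i];

  return NULL;
}

/* Flattened presentation: func1 and func2 merged into one function.
 * Looks up "info", stores the result in *setme, and reports whether
 * the entry exists and is a dictionary. */
bool
sketch_find_info_dict (struct sketch_variant * meta,
                       struct sketch_variant ** setme)
{
  *setme = sketch_dict_find (meta, "info");
  return *setme != NULL && (*setme)->type == SKETCH_TYPE_DICT;
}
```

In the thesis text, a listing like this would need a note saying that it is a condensed view of several functions, with a pointer to the full source, so the reader does not mistake it for the original code.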

Please feel free to share your experiences with this situation. Also, please migrate the question to another SX site if needed, although I looked for the best SX site to ask this question and this one seemed the right fit.

Best Answer

For my graduate thesis, I had a similar challenge and spent some time reflecting on what I was trying to accomplish.

Putting a non-trivial amount of code in printed form brings up several challenges.

  • Code is best examined and manipulated in electronic form, so how do you provide that maneuverability within the printed page?

  • Quite a bit of the code is irrelevant to the problem at hand. File IO, memory manipulation, error handling, etc. are all examples of things that have to be in the code but don't support the thesis itself.

  • You want / need to provide all of the code so a future student can pick up your research and continue the work. In addition, the University expects all of the code so they can demonstrate you actually did the work and validate your results.


My thesis involved taking an existing algorithm, refining it for performance, and then extending the algorithm to a new set of use cases.

Within the body of my thesis, I placed only the relevant portions of the old and new routines side by side in order to provide a measure of comparison. Within the text, I then explained the differences between the functions and the measurable differences that I had found. I would then include appropriate charts / graphs / illustrations along with the explanation to help support the point I was making.

In some cases, I had refactored the existing code into new functions. Sometimes it made sense to include the refactored-out sections of code within the discourse and sometimes it didn't.

Think of your thesis as a narrative. Don't include anything that doesn't directly contribute to the point you need to be making within each section. Your thesis will be long enough as it is, and you don't want to overload the reader with unnecessary or irrelevant detail. This aspect was crucial for me as several of my advisers didn't care as much about the code and wanted to focus on the results.

The last aspect to consider is where to include the complete source listing. I placed my source in an appendix at the end of my thesis. I also included some explanatory text on compiling and running the program so others could validate what I had done. Since some test data sets were necessary to recreate what I had done, I included them in the appendix too.
