C Memory Usage – When to Use malloc and free

cmallocmemorymemory usageshell

Valgrind does not report a memory leak during my actual usage, only during my scripted test that I scripted with a shell script to test my own shell. I found that I didn't have to use malloc every time I did. For example, strdup could do it for me. Now I wonder if it is possible to formally prove that I really need malloc where I use it or can I make an analysis and either prove or disprove that malloc is actually needed?

If I can rewrite the program so that it doesn't use malloc then I'd be glad. I'd be even happier if I can prove for certain cases that I don't need malloc, since there were cases were I only had to free() and malloc size was done automatically. One error message that appears only during test is the following.

. 
==29846== Memcheck, a memory error detector
==29846== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==29846== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==29846== Command: ./shell
==29846== 
'PATH' is set to /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin.
stdin is a file or a pipe
==29846== 
==29846== HEAP SUMMARY:
==29846==     in use at exit: 83,052 bytes in 171 blocks
==29846==   total heap usage: 240 allocs, 69 frees, 101,754 bytes allocated
==29846== 
==29846== 12 bytes in 1 blocks are definitely lost in loss record 5 of 93
==29846==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29846==    by 0x50FCA59: strdup (strdup.c:42)
==29846==    by 0x4E57624: readline (in /usr/lib/x86_64-linux-gnu/libedit.so.2.0.53)
==29846==    by 0x40193B: main (main.c:599)
==29846== 
==29846== LEAK SUMMARY:
==29846==    definitely lost: 12 bytes in 1 blocks
==29846==    indirectly lost: 0 bytes in 0 blocks
==29846==      possibly lost: 0 bytes in 0 blocks
==29846==    still reachable: 83,040 bytes in 170 blocks
==29846==         suppressed: 0 bytes in 0 blocks
==29846== Reachable blocks (those to which a pointer was found) are not shown.
==29846== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==29846== 
==29846== For counts of detected and suppressed errors, rerun with: -v
==29846== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The test I scripted is:

#!/bin/sh
echo "-- Testing our implementation of OpenShell --"
echo ""
echo "- If you have any problem in passing a test read the corresponding"
echo "- source file to understand what the test is checking"
echo ""
printf "********************* PRESS ENTER TO RUN TESTS  ... "
read _

# Key pressed, do something
printf "********************* TEST WILDCARDS \n***** Press any key to listing all files in current directory... "
read _
valgrind --leak-check=full --leak-kinds=all./shell << EOF
ls -al *.*
EOF
printf "********************* TEST ALGORITHMS ... "
read _
echo "top -b -n1|head -8|tail -1" | ./shell

printf "********************* TEST ALGORITHMS Part II.  ... "
read _
valgrind --leak-check=full --leak-kinds=all ./shell << EOF
who|awk '{print \$4 ; print \$3}'|sort -n|wc -l
EOF

printf "********************* TEST CHECKENV.  ... "
read _
valgrind --leak-check=full --leak-kinds=all ./shell << EOF
checkenv
EOF
printf "********************* TEST DONE. YOU SHOULD SEE OUTPUT FROM TEST ABOVE ... "
read _

My offending code is here:

 while (1) {
        buf = "> ";
        if (!isatty(fileno(stdin))) {
            printf("stdin is a file or a pipe\n");
            if (buf)
                command(readline(buf));
            exit(0);
        }
        cwd = malloc(sizeof(char *) * 100);
        if (cwd != NULL && getcwd(cwd, 99) == cwd) {
            printf("%s: ", cwd);
            char *input = readline(buf);
            add_history(input);
            command(input);
            free(input);
            free(cwd);

        } else {
            printf("%s: ", getenv("USER"));
            char *input = readline(buf);
            add_history(input);
            if (input) {
                command(input);
                free(input);
            }

        }
    }

Best Answer

I'm not going to debug your code, there's not enough context to do this anyway, but I'm going to show you an idiom that you will probably find easier to use correctly. As a bonus, it will also be faster.

Have a look at your loop body. You are allocating memory during each iteration and free it under certain circumstances depending on the overall control flow. This puts very high mental load on anybody (including yourself) reading the code and trying to verify that each malloc is paired with a corresponding free on each possible path of execution.

What if you instead took the memory allocation and deallocation outside the loop?

const size_t maxcwd = PATH_MAX;
char * cwd = malloc(maxcwd);
if (cwd == NULL)
  {
    fprintf(stderr, "error: cannot start shell: out of memory\n");
    return EXIT_FAILURE;
  }
while (true)
  {
    if (getcwd(cwd, maxcwd))
      {
        // Good, use it…
      }
    else
      {
        // Bad, do something else…
      }
  }
free(cwd);

Note how there is only a single malloc and a single free and it is fairly easy to see that they pair up correctly. The code will also be faster because there are much fewer allocations and deallocations performed.

However, the code is not ideal. Why allocate a huge array upfront that might never be needed and why fail if it is still too small? It would be better to dynamically resize the array as needed.

size_t cwdsize = 0;
char * cwd = NULL;
while (true)
  {
    while ((cwd == NULL) || (getcwd(cwd, cwdsize) == NULL))
      {
        size_t newsize = (cwdsize > 0) ? (cwdsize << 1) : 128;
        char * temp = realloc(cwd, newsize);
        if (temp == NULL)
          {
            fprintf(stderr, "error: out of memory\n");
            break;
          }
        cwdsize = newsize;
        cwd = temp;
      }
    // cwd is now always good to use…
  }
free(cwd);

Of course, the code is a little verbose and you might decide to factor it out into a function. This is also the pattern used by the POSIX getline function which I find very elegant.

It is too bad that GNU readline does not support the same idiom. It always allocates new memory and you have to remember to free it after every call. Even worse, there is no way for it to report failure to allocate memory.

bool keep_going = true;
while (keep_going)
  {
    char * line = NULL;
    // Other stuff that needs to be done…
    line = readline("$ ");
    if (line == NULL)
      {
        // Probably EOF
        keep_going = false;
        goto finally;
      }
  finally:
    free(line);
  }

If you also have other error paths, you can safely say goto finally; multiple times in your loop. This is because free(NULL) is a safe no-op. If you're dealing with resources where the deallocation function must only be called with valid resources, your cleanup code would have to check itself whether it actually has a resource to free. This disciplined use of goto basically mimics deterministic destruction in C++ and is a very useful idiom in C.

Related Topic