Linux (non-transparent) per-process hugepage accounting

Tags: java, linux, linux-kernel

I've recently converted some Java apps to run with manually configured hugepages on Linux, as described here. I point out "manually configured" because they are not transparent hugepages, which gave us some performance issues.

So now I've got about 10 Tomcats running on a system, and I am interested in knowing how much memory each one is using.

I can get summary information out of /proc/meminfo as described in Linux Huge Pages Usage Accounting.
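For reference, the system-wide counters there look like this (the values below are made up for illustration):

$ grep ^Huge /proc/meminfo
HugePages_Total:    1024
HugePages_Free:      768
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB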

But I can't find any tools that tell me about the actual per-process hugepage usage.

I poked around in /proc/[pid]/numa_maps and found some interesting information that led me to this grossity:

function pshugepage () {
    local pid=$1
    local hugepagecount=0
    # pull the dirty= page count out of each hugepage mapping in numa_maps,
    # instead of assuming dirty= is always the sixth whitespace-separated field
    for num in $(sed -n '/anon_hugepage.*dirty=/ s/.*dirty=\([0-9]\+\).*/\1/p' "/proc/$pid/numa_maps"); do
        hugepagecount=$((hugepagecount + num))
    done
    echo "process $pid using $hugepagecount huge pages"
}
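For reference, the dirty= fields it sums come from numa_maps lines like this one (address and page counts made up for illustration):

2aaaaac00000 default file=/anon_hugepage\040(deleted) huge dirty=512 N0=512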

Or the same thing in Perl:

sub counthugepages {
    my ($pid) = @_;
    open(my $numa_maps, '<', "/proc/$pid/numa_maps")
        or die "can't open /proc/$pid/numa_maps: $!";
    my $hugepagecount = 0;
    while (my $line = <$numa_maps>) {
        # hugepage mappings carry both a " huge " flag and a dirty= page count
        next unless $line =~ m{ huge .*dirty=(\d+)};
        $hugepagecount += $1;
    }
    close($numa_maps);
    # dirty= counts pages; these are 2-megabyte hugepages, so pages * 2 = megabytes
    return $hugepagecount * 2;
}
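As a rough sanity check (my own addition, not from any documentation): summed across all processes, those dirty= counts should land near the number of hugepages actually faulted in system-wide, i.e., HugePages_Total minus HugePages_Free from /proc/meminfo:

awk '/^HugePages_Total/ {t=$2} /^HugePages_Free/ {f=$2}
     END {print t-f, "huge pages faulted in system-wide"}' /proc/meminfo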

The numbers it gives me are plausible, but I'm far from confident this method is correct.

Environment is a quad-CPU Dell, 64 GB RAM, RHEL 6.3, Oracle JDK 1.7.x (current as of 2013-07-28).

Best Answer

Update: Red Hat now recommends this method for process hugepage accounting on RHEL5/6:

grep -B 11 'KernelPageSize:     2048 kB' /proc/[PID]/smaps \
   | grep "^Size:" \
   | awk 'BEGIN{sum=0}{sum+=$2}END{print sum/1024}'
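One caveat: the -B 11 assumes Size: sits exactly 11 lines above KernelPageSize: in every smaps entry, and that spacing can shift between kernel versions. Since Size: appears earlier than KernelPageSize: within each entry, here's a variant of my own that keys on the field names instead of a fixed line offset:

awk '/^Size:/           {size = $2}
     /^KernelPageSize:/ {if ($2 == 2048) sum += size}
     END                {print sum/1024}' /proc/[PID]/smaps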

I asked this on the procps-ng developers' mailing list. I was told:

The hugepage support has been introduced in the procps-ng/pmap tool several months ago (switches -XX, -C, -c, -N, -n should allow you to configure and display any entries supported by the running kernel).

I experimented a bit with procps-3.3.8 on Fedora 19. I don't think it gave me any information I didn't get from the stuff I suggested in my question, but at least it has the aura of authority.

FWIW I ended up with the following:

A ~/.pmaprc file containing:

[Fields Display]
Size
Rss
Pss
Referenced
AnonHugePages
KernelPageSize
Mapping

[Mapping]
ShowPath

And then I used the following command to pull hugepage information:

pmap -c [process id here] | egrep 'Add|2048'

In the egrep, "Add" matches the header line, and "2048" grabs anything with a kernel page size of 2048, i.e., huge pages. It will also grab unrelated lines: anything else that happens to contain the string 2048.

Here's some sample output:

     Address    Size   Rss   Pss Referenced AnonHugePages KernelPageSize Mapping
    ed800000   22528     0     0          0             0           2048 /anon_hugepage (deleted)
    f7e00000   88064     0     0          0             0           2048 /anon_hugepage (deleted)
    fd400000   45056     0     0          0             0           2048 /anon_hugepage (deleted)
7f3753dff000    2052  2048  2048       2048          2048              4 [stack:1674]
7f3759000000    4096     0     0          0             0           2048 /anon_hugepage (deleted)
7f3762d68000    2048     0     0          0             0              4 /usr/lib64/libc-2.17.so
7f376339b000    2048     0     0          0             0              4 /usr/lib64/libpthread-2.17.so

We only care about the lines with KernelPageSize 2048.
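To keep exactly those lines (plus the header) and drop the false positives, you can key on the KernelPageSize column itself; with the field list above it is column 7 (my own variant, not something from the mailing list):

pmap -c [process id here] | awk 'NR == 1 || $7 == 2048'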

I think it's telling me that I've allocated 159744 kB (22528+88064+45056+4096) of RAM in huge pages. I told Java to use exactly 128M for its heap, and it has some other memory pools, so this is a plausible number. Rss & Referenced of 0 don't quite make sense; however, the test Java program is extremely simple, so it too is plausible.
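That addition can be automated the same way, summing the Size column (field 2) over the 2048-kB-page mappings:

pmap -c [process id here] | awk '$7 == 2048 {kb += $2} END {print kb, "kB in hugepage mappings"}'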

That pmap number doesn't agree with the one from my Perl snippet above, because the Perl is counting only "dirty" pages, i.e., ones that have actually been touched. And/or because the Perl is just wrong; I don't know.

I also tried procps-3.3.9 on an RHEL 6 machine with some active Tomcats using lots of hugepage memory. The Rss & Referenced columns were all 0. This may very well be the fault of the kernel rather than procps; I don't know.