123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392 |
- <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
- <HTML>
- <HEAD>
- <link rel="stylesheet" href="designstyle.css">
- <title>Gperftools Heap Profiler</title>
- </HEAD>
- <BODY>
- <p align=right>
- <i>Last modified
- <script type=text/javascript>
- var lm = new Date(document.lastModified);
- document.write(lm.toDateString());
- </script></i>
- </p>
- <p>This is the heap profiler we use at Google, to explore how C++
- programs manage memory. This facility can be useful for</p>
- <ul>
- <li> Figuring out what is in the program heap at any given time
- <li> Locating memory leaks
- <li> Finding places that do a lot of allocation
- </ul>
- <p>The profiling system instruments all allocations and frees. It
- keeps track of various pieces of information per allocation site. An
- allocation site is defined as the active stack trace at the call to
- <code>malloc</code>, <code>calloc</code>, <code>realloc</code>, or,
- <code>new</code>.</p>
- <p>There are three parts to using it: linking the library into an
- application, running the code, and analyzing the output.</p>
- <h1>Linking in the Library</h1>
- <p>To install the heap profiler into your executable, add
- <code>-ltcmalloc</code> to the link-time step for your executable.
- Also, while we don't necessarily recommend this form of usage, it's
- possible to add in the profiler at run-time using
- <code>LD_PRELOAD</code>:
- <pre>% env LD_PRELOAD="/usr/lib/libtcmalloc.so" <binary></pre>
- <p>This does <i>not</i> turn on heap profiling; it just inserts the
- code. For that reason, it's practical to just always link
- <code>-ltcmalloc</code> into a binary while developing; that's what we
- do at Google. (However, since any user can turn on the profiler by
- setting an environment variable, it's not necessarily recommended to
- install profiler-linked binaries into a production, running
- system.) Note that if you wish to use the heap profiler, you must
- also use the tcmalloc memory-allocation library. There is no way
- currently to use the heap profiler separate from tcmalloc.</p>
- <h1>Running the Code</h1>
- <p>There are several alternatives to actually turn on heap profiling
- for a given run of an executable:</p>
- <ol>
- <li> <p>Define the environment variable HEAPPROFILE to the filename
- to dump the profile to. For instance, to profile
- <code>/usr/local/bin/my_binary_compiled_with_tcmalloc</code>:</p>
- <pre>% env HEAPPROFILE=/tmp/mybin.hprof /usr/local/bin/my_binary_compiled_with_tcmalloc</pre>
- <li> <p>In your code, bracket the code you want profiled in calls to
- <code>HeapProfilerStart()</code> and <code>HeapProfilerStop()</code>.
- (These functions are declared in <code><gperftools/heap-profiler.h></code>.)
- <code>HeapProfilerStart()</code> will take the
- profile-filename-prefix as an argument. Then, as often as
- you'd like before calling <code>HeapProfilerStop()</code>, you
- can use <code>HeapProfilerDump()</code> or
- <code>GetHeapProfile()</code> to examine the profile. In case
- it's useful, <code>IsHeapProfilerRunning()</code> will tell you
- whether you've already called HeapProfilerStart() or not.</p>
- </ol>
- <p>For security reasons, heap profiling will not write to a file --
- and is thus not usable -- for setuid programs.</p>
- <H2>Modifying Runtime Behavior</H2>
- <p>You can more finely control the behavior of the heap profiler via
- environment variables.</p>
- <table frame=box rules=sides cellpadding=5 width=100%>
- <tr valign=top>
- <td><code>HEAP_PROFILE_ALLOCATION_INTERVAL</code></td>
- <td>default: 1073741824 (1 Gb)</td>
- <td>
- Dump heap profiling information each time the specified number of
- bytes has been allocated by the program.
- </td>
- </tr>
- <tr valign=top>
- <td><code>HEAP_PROFILE_INUSE_INTERVAL</code></td>
- <td>default: 104857600 (100 Mb)</td>
- <td>
- Dump heap profiling information whenever the high-water memory
- usage mark increases by the specified number of bytes.
- </td>
- </tr>
- <tr valign=top>
- <td><code>HEAP_PROFILE_TIME_INTERVAL</code></td>
- <td>default: 0</td>
- <td>
- Dump heap profiling information each time the specified
- number of seconds has elapsed.
- </td>
- </tr>
- <tr valign=top>
- <td><code>HEAPPROFILESIGNAL</code></td>
- <td>default: disabled</td>
- <td>
- Dump heap profiling information whenever the specified signal is sent to the
- process.
- </td>
- </tr>
- <tr valign=top>
- <td><code>HEAP_PROFILE_MMAP</code></td>
- <td>default: false</td>
- <td>
- Profile <code>mmap</code>, <code>mremap</code> and <code>sbrk</code>
- calls in addition
- to <code>malloc</code>, <code>calloc</code>, <code>realloc</code>,
- and <code>new</code>. <b>NOTE:</b> this causes the profiler to
- profile calls internal to tcmalloc, since tcmalloc and friends use
- mmap and sbrk internally for allocations. One partial solution is
- to filter these allocations out when running <code>pprof</code>,
- with something like
- <code>pprof --ignore='DoAllocWithArena|SbrkSysAllocator::Alloc|MmapSysAllocator::Alloc</code>.
- </td>
- </tr>
- <tr valign=top>
- <td><code>HEAP_PROFILE_ONLY_MMAP</code></td>
- <td>default: false</td>
- <td>
- Only profile <code>mmap</code>, <code>mremap</code>, and <code>sbrk</code>
- calls; do not profile
- <code>malloc</code>, <code>calloc</code>, <code>realloc</code>,
- or <code>new</code>.
- </td>
- </tr>
- <tr valign=top>
- <td><code>HEAP_PROFILE_MMAP_LOG</code></td>
- <td>default: false</td>
- <td>
- Log <code>mmap</code>/<code>munmap</code> calls.
- </td>
- </tr>
- </table>
- <H2>Checking for Leaks</H2>
- <p>You can use the heap profiler to manually check for leaks, for
- instance by reading the profiler output and looking for large
- allocations. However, for that task, it's easier to use the <A
- HREF="heap_checker.html">automatic heap-checking facility</A> built
- into tcmalloc.</p>
- <h1><a name="pprof">Analyzing the Output</a></h1>
- <p>If heap-profiling is turned on in a program, the program will
- periodically write profiles to the filesystem. The sequence of
- profiles will be named:</p>
- <pre>
- <prefix>.0000.heap
- <prefix>.0001.heap
- <prefix>.0002.heap
- ...
- </pre>
- <p>where <code><prefix></code> is the filename-prefix supplied
- when running the code (e.g. via the <code>HEAPPROFILE</code>
- environment variable). Note that if the supplied prefix
- does not start with a <code>/</code>, the profile files will be
- written to the program's working directory.</p>
- <p>The profile output can be viewed by passing it to the
- <code>pprof</code> tool -- the same tool that's used to analyze <A
- HREF="cpuprofile.html">CPU profiles</A>.
- <p>Here are some examples. These examples assume the binary is named
- <code>gfs_master</code>, and a sequence of heap profile files can be
- found in files named:</p>
- <pre>
- /tmp/profile.0001.heap
- /tmp/profile.0002.heap
- ...
- /tmp/profile.0100.heap
- </pre>
- <h3>Why is a process so big</h3>
- <pre>
- % pprof --gv gfs_master /tmp/profile.0100.heap
- </pre>
- <p>This command will pop-up a <code>gv</code> window that displays
- the profile information as a directed graph. Here is a portion
- of the resulting output:</p>
- <p><center>
- <img src="heap-example1.png">
- </center></p>
- A few explanations:
- <ul>
- <li> <code>GFS_MasterChunk::AddServer</code> accounts for 255.6 MB
- of the live memory, which is 25% of the total live memory.
- <li> <code>GFS_MasterChunkTable::UpdateState</code> is directly
- accountable for 176.2 MB of the live memory (i.e., it directly
- allocated 176.2 MB that has not been freed yet). Furthermore,
- it and its callees are responsible for 729.9 MB. The
- labels on the outgoing edges give a good indication of the
- amount allocated by each callee.
- </ul>
- <h3>Comparing Profiles</h3>
- <p>You often want to skip allocations during the initialization phase
- of a program so you can find gradual memory leaks. One simple way to
- do this is to compare two profiles -- both collected after the program
- has been running for a while. Specify the name of the first profile
- using the <code>--base</code> option. For example:</p>
- <pre>
- % pprof --base=/tmp/profile.0004.heap gfs_master /tmp/profile.0100.heap
- </pre>
- <p>The memory-usage in <code>/tmp/profile.0004.heap</code> will be
- subtracted from the memory-usage in
- <code>/tmp/profile.0100.heap</code> and the result will be
- displayed.</p>
- <h3>Text display</h3>
- <pre>
- % pprof --text gfs_master /tmp/profile.0100.heap
- 255.6 24.7% 24.7% 255.6 24.7% GFS_MasterChunk::AddServer
- 184.6 17.8% 42.5% 298.8 28.8% GFS_MasterChunkTable::Create
- 176.2 17.0% 59.5% 729.9 70.5% GFS_MasterChunkTable::UpdateState
- 169.8 16.4% 75.9% 169.8 16.4% PendingClone::PendingClone
- 76.3 7.4% 83.3% 76.3 7.4% __default_alloc_template::_S_chunk_alloc
- 49.5 4.8% 88.0% 49.5 4.8% hashtable::resize
- ...
- </pre>
- <p>
- <ul>
- <li> The first column contains the direct memory use in MB.
- <li> The fourth column contains memory use by the procedure
- and all of its callees.
- <li> The second and fifth columns are just percentage
- representations of the numbers in the first and fourth columns.
- <li> The third column is a cumulative sum of the second column
- (i.e., the <code>k</code>th entry in the third column is the
- sum of the first <code>k</code> entries in the second column.)
- </ul>
- <h3>Ignoring or focusing on specific regions</h3>
- <p>The following command will give a graphical display of a subset of
- the call-graph. Only paths in the call-graph that match the regular
- expression <code>DataBuffer</code> are included:</p>
- <pre>
- % pprof --gv --focus=DataBuffer gfs_master /tmp/profile.0100.heap
- </pre>
- <p>Similarly, the following command will omit all paths subset of the
- call-graph. All paths in the call-graph that match the regular
- expression <code>DataBuffer</code> are discarded:</p>
- <pre>
- % pprof --gv --ignore=DataBuffer gfs_master /tmp/profile.0100.heap
- </pre>
- <h3>Total allocations + object-level information</h3>
- <p>All of the previous examples have displayed the amount of in-use
- space. I.e., the number of bytes that have been allocated but not
- freed. You can also get other types of information by supplying a
- flag to <code>pprof</code>:</p>
- <center>
- <table frame=box rules=sides cellpadding=5 width=100%>
- <tr valign=top>
- <td><code>--inuse_space</code></td>
- <td>
- Display the number of in-use megabytes (i.e. space that has
- been allocated but not freed). This is the default.
- </td>
- </tr>
- <tr valign=top>
- <td><code>--inuse_objects</code></td>
- <td>
- Display the number of in-use objects (i.e. number of
- objects that have been allocated but not freed).
- </td>
- </tr>
- <tr valign=top>
- <td><code>--alloc_space</code></td>
- <td>
- Display the number of allocated megabytes. This includes
- the space that has since been de-allocated. Use this
- if you want to find the main allocation sites in the
- program.
- </td>
- </tr>
- <tr valign=top>
- <td><code>--alloc_objects</code></td>
- <td>
- Display the number of allocated objects. This includes
- the objects that have since been de-allocated. Use this
- if you want to find the main allocation sites in the
- program.
- </td>
- </table>
- </center>
- <h3>Interactive mode</a></h3>
- <p>By default -- if you don't specify any flags to the contrary --
- pprof runs in interactive mode. At the <code>(pprof)</code> prompt,
- you can run many of the commands described above. You can type
- <code>help</code> for a list of what commands are available in
- interactive mode.</p>
- <h1>Caveats</h1>
- <ul>
- <li> Heap profiling requires the use of libtcmalloc. This
- requirement may be removed in a future version of the heap
- profiler, and the heap profiler separated out into its own
- library.
-
- <li> If the program linked in a library that was not compiled
- with enough symbolic information, all samples associated
- with the library may be charged to the last symbol found
- in the program before the library. This will artificially
- inflate the count for that symbol.
- <li> If you run the program on one machine, and profile it on
- another, and the shared libraries are different on the two
- machines, the profiling output may be confusing: samples that
- fall within the shared libaries may be assigned to arbitrary
- procedures.
- <li> Several libraries, such as some STL implementations, do their
- own memory management. This may cause strange profiling
- results. We have code in libtcmalloc to cause STL to use
- tcmalloc for memory management (which in our tests is better
- than STL's internal management), though it only works for some
- STL implementations.
- <li> If your program forks, the children will also be profiled
- (since they inherit the same HEAPPROFILE setting). Each
- process is profiled separately; to distinguish the child
- profiles from the parent profile and from each other, all
- children will have their process-id attached to the HEAPPROFILE
- name.
-
- <li> Due to a hack we make to work around a possible gcc bug, your
- profiles may end up named strangely if the first character of
- your HEAPPROFILE variable has ascii value greater than 127.
- This should be exceedingly rare, but if you need to use such a
- name, just set prepend <code>./</code> to your filename:
- <code>HEAPPROFILE=./Ägypten</code>.
- </ul>
- <hr>
- <address>Sanjay Ghemawat
- <!-- Created: Tue Dec 19 10:43:14 PST 2000 -->
- </address>
- </body>
- </html>
|