From 35a201cc8ef0c3f5b2df88d2e528aabee1048348 Mon Sep 17 00:00:00 2001
From: Wojtek Kosior
Date: Fri, 30 Apr 2021 18:47:09 +0200
Subject: Initial/Final commit

---
 libxml2-2.9.10/doc/xmlmem.html | 122 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)
 create mode 100644 libxml2-2.9.10/doc/xmlmem.html

diff --git a/libxml2-2.9.10/doc/xmlmem.html b/libxml2-2.9.10/doc/xmlmem.html
new file mode 100644
index 0000000..10befd7
--- /dev/null
+++ b/libxml2-2.9.10/doc/xmlmem.html
@@ -0,0 +1,122 @@

The XML C parser and toolkit of Gnome

Memory Management

Table of Contents:

  1. General overview
  2. Setting libxml2 set of memory routines
  3. Cleaning up after using the library
  4. Debugging routines
  5. General memory requirements
  6. Returning memory to the kernel

General overview

The module xmlmemory.h provides the interfaces to the libxml2 memory system:

  • libxml2 does not use the libc memory allocator directly but calls xmlFree(), xmlMalloc() and xmlRealloc() instead
  • those routines can be redirected to a specific set of routines; by default they are the libc ones, i.e. free(), malloc() and realloc()
  • the xmlmemory.c module includes a set of debugging routines

Setting libxml2 set of memory routines

It is sometimes useful not to use the default memory allocator, whether for debugging, for analysis, or to implement a specific memory management behaviour (as on embedded systems). Two function calls are available to do so:

  • xmlMemGet(), which returns the current set of functions in use by the parser
  • xmlMemSetup(), which sets up a new set of memory allocation functions

Of course, a call to xmlMemSetup() should be made before calling any other libxml2 routine (unless you are sure your allocation routines are compatible), since a block obtained from one allocator must not be released through another.
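As a rough sketch of what such a replacement set might look like, the C fragment below defines four wrappers around the libc allocator that count live blocks. The counting_* names and the live_blocks counter are made up for this illustration; only the four function shapes (free, malloc, realloc, strdup) and the argument order of xmlMemSetup() are taken from xmlmemory.h. The registration call is shown in a comment so that the fragment compiles without libxml2.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter: blocks handed out and not yet freed. */
static long live_blocks = 0;

/* Same shape as xmlMallocFunc: void *(*)(size_t). */
void *counting_malloc(size_t size) {
    void *block = malloc(size);
    if (block != NULL)
        live_blocks++;
    return block;
}

/* Same shape as xmlFreeFunc: void (*)(void *). */
void counting_free(void *block) {
    if (block != NULL) {
        live_blocks--;
        free(block);
    }
}

/* Same shape as xmlReallocFunc: void *(*)(void *, size_t).
 * Resizing an existing block leaves the count unchanged, while
 * realloc(NULL, n) behaves like malloc(n) and adds one block.
 * A size of 0 is not handled specially in this sketch. */
void *counting_realloc(void *block, size_t size) {
    void *result = realloc(block, size);
    if (block == NULL && result != NULL)
        live_blocks++;
    return result;
}

/* Same shape as xmlStrdupFunc: char *(*)(const char *). */
char *counting_strdup(const char *str) {
    size_t length = strlen(str) + 1;
    char *copy = counting_malloc(length);
    if (copy != NULL)
        memcpy(copy, str, length);
    return copy;
}

/* Number of blocks currently outstanding. */
long counting_live_blocks(void) {
    return live_blocks;
}

/* Registration, to be done once before any other libxml2 call
 * (requires #include <libxml/xmlmemory.h>):
 *
 *     xmlMemSetup(counting_free, counting_malloc,
 *                 counting_realloc, counting_strdup);
 */
```

Once registered this way, a non-zero value from counting_live_blocks() after all documents have been freed would suggest that some block was never released.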

Cleaning up after using the library

Libxml2 is not stateless: a few memory structures need to be allocated before the parser is fully functional (some encoding structures, for example). This also means that once parsing is finished there is a tiny amount of memory (a few hundred bytes) which can be reclaimed if you don't reuse the library or any document built with it:

  • xmlCleanupParser() is a centralized routine to free the library state and data. Note that it won't deallocate any tree that was produced (use xmlFreeDoc() and related routines for this). It should be called only when the library is no longer in use.
  • xmlInitParser() is the dual routine, which preallocates the parsing state. This can be useful, for example, to avoid initialization reentrancy problems when using libxml2 in multithreaded applications.

Generally, xmlCleanupParser() is safe assuming no parsing is ongoing and no document is still in use. If needed, the state will be rebuilt at the next invocation of parser routines (or by xmlInitParser()), but be careful of the consequences in multithreaded applications.

Debugging routines

When configured using the --with-mem-debug flag (off by default), libxml2 uses a set of memory allocation debugging routines that keep track of all allocated blocks and of the location in the code where the routine was called. A couple of other debugging routines make it possible to dump information about the allocated memory to a file, or to call a specific routine when a given block number is allocated.

When developing libxml2, memory debugging is enabled: the test programs call xmlMemoryDump() and the "make test" regression tests check for any memory leak during the full regression test sequence. This helps a lot in ensuring that libxml2 does not leak memory and in bullet-proofing its use of memory allocation (some libc implementations are known to be far too permissive, resulting in major portability problems!).

If the .memdump file reports a leak, it displays the allocation function and also tries to give some information about the content and structure of the allocated blocks left. This is sufficient in most cases to find the culprit, but not always. Assuming the allocation problem is reproducible, it is possible to track it down more easily:

  1. write down the block number xxxx that was not deallocated
  2. export the environment variable XML_MEM_BREAKPOINT=xxxx; the easiest way when using GDB is to simply give the command

    set environment XML_MEM_BREAKPOINT xxxx

    before running the program.
  3. run the program under a debugger and set a breakpoint on xmlMallocBreakpoint(), a specific function called when this precise block is allocated
  4. when the breakpoint is reached you can then do a fine analysis of the allocation and step through the code to see the condition resulting in the missing deallocation.
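To make the idea behind these steps more concrete, here is a simplified illustration in C of a block-numbering allocator with a breakpoint hook. It is not the libxml2 implementation: every demo_* name and the DEMO_MEM_BREAKPOINT variable are hypothetical, chosen only to mirror the roles of xmlMallocBreakpoint() and XML_MEM_BREAKPOINT described above.

```c
#include <stdlib.h>

static unsigned long block_counter = 0;   /* number of the last block allocated */
static unsigned long break_on_block = 0;  /* 0 means no breakpoint requested */
static unsigned long breakpoint_hits = 0; /* how often the hook was called */

/* Stand-in for xmlMallocBreakpoint(): it exists mainly so that a
 * debugger breakpoint can be set on it. */
void demo_malloc_breakpoint(void) {
    breakpoint_hits++;
}

/* Read the block number to stop on from a (hypothetical) environment
 * variable, mirroring the role of XML_MEM_BREAKPOINT. */
void demo_read_breakpoint_from_env(void) {
    const char *value = getenv("DEMO_MEM_BREAKPOINT");
    break_on_block = (value != NULL) ? strtoul(value, NULL, 10) : 0;
}

/* Set the block number directly, without going through the environment. */
void demo_set_breakpoint(unsigned long block_number) {
    break_on_block = block_number;
}

/* Allocate a block, give it the next sequence number, and call the
 * hook when that number matches the requested one. */
void *demo_malloc(size_t size) {
    void *block = malloc(size);
    if (block == NULL)
        return NULL;
    block_counter++;
    if (block_counter == break_on_block)
        demo_malloc_breakpoint();
    return block;
}

/* Number of times the breakpoint hook has been reached. */
unsigned long demo_breakpoint_hits(void) {
    return breakpoint_hits;
}
```

Because block numbers are assigned in allocation order, this kind of scheme only helps when the program allocates in the same order on every run, which is why the leak has to be reproducible.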

I used to use a commercial tool to debug libxml2 memory problems, but after noticing that it was not detecting memory leaks, this simple mechanism was used instead and has proved extremely effective until now. Lately I have also used valgrind with quite some success; at the time of writing it was tied to the i386 architecture, since it works by emulating the processor and instruction set. It is slow but extremely effective, i.e. it spots memory usage errors in a very precise way.

General memory requirements

How much memory does libxml2 require? It is hard to give an average figure, as it depends on a number of things:

  • the parser itself should work in a fixed amount of memory, except for information maintained about the stacks of names and entity locations. The I/O and encoding handlers will probably account for a few KBytes. This is true for both the XML and HTML parsers (though the HTML parser needs more state).
  • If you are generating the DOM tree, then memory requirements will grow nearly linearly with the size of the data. In general, for a balanced textual document the internal memory requirement is about 4 times the size of the UTF8 serialization of this document (for example, the XML-1.0 recommendation is a bit more than 150KBytes and takes 650KBytes of main memory when parsed). Validation will add an amount of memory required for maintaining the external Dtd state, which should be linear with the complexity of the content model defined by the Dtd.
  • If you need to work with fixed memory requirements or don't need the full DOM tree, then using the xmlReader interface is probably the best way to proceed; it still allows you to validate or operate on a subset of the tree if needed.
  • If you don't care about the advanced features of libxml2 like validation, DOM, XPath or XPointer, don't use entities, need to work with fixed memory requirements, and want the fastest parsing possible, then the SAX interface should be used, but it has known restrictions.

Returning memory to the kernel

You may find that the memory usage of your process using libxml2 is not reduced even though you freed the trees. This is because libxml2 allocates memory in a number of small chunks. When one of those chunks is freed, the allocator may decide that giving this little memory back to the kernel would cause too much overhead, and delay the operation. As all chunks are this small, they do get freed but are not returned to the kernel. On systems using glibc, there is a function "malloc_trim", declared in malloc.h, which performs this missing operation (note that it is allowed to fail). Thus, after freeing your tree you may simply try "malloc_trim(0);" to really get the memory back. If your system does not provide malloc_trim, try searching for a similar function.
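A minimal sketch of such a call in C follows, assuming a glibc system. The wrapper name release_free_heap_memory is invented for this example, and the fallback value of -1 on other C libraries is a convention of the sketch only, not something malloc_trim itself defines.

```c
#include <stdlib.h>
#if defined(__GLIBC__)
#include <malloc.h>   /* declares malloc_trim() on glibc */
#endif

/* Ask the allocator to hand unused heap memory back to the system.
 * Returns 1 if some memory was released, 0 if nothing could be
 * released, and -1 when malloc_trim() is not available here. */
int release_free_heap_memory(void) {
#if defined(__GLIBC__)
    return malloc_trim(0);
#else
    return -1;
#endif
}
```

A natural place to call it would be right after xmlFreeDoc() on a large tree; a return value of 0 is not an error, it only means nothing could be released at that point.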

Daniel Veillard
