From 35a201cc8ef0c3f5b2df88d2e528aabee1048348 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Fri, 30 Apr 2021 18:47:09 +0200 Subject: Initial/Final commit --- libxml2-2.9.10/doc/xmlio.html | 141 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+) create mode 100644 libxml2-2.9.10/doc/xmlio.html (limited to 'libxml2-2.9.10/doc/xmlio.html') diff --git a/libxml2-2.9.10/doc/xmlio.html b/libxml2-2.9.10/doc/xmlio.html new file mode 100644 index 0000000..eb210a8 --- /dev/null +++ b/libxml2-2.9.10/doc/xmlio.html @@ -0,0 +1,141 @@ + + +I/O Interfaces

The XML C parser and toolkit of Gnome

I/O Interfaces

Table of Contents:

  1. General overview
  2. The basic buffer type
  3. Input I/O handlers
  4. Output I/O handlers
  5. The entities loader
  6. Example of customized I/O

General overview

The module xmlIO.h provides the interfaces to the libxml2 I/O system. It consists of four main parts:

  • Entities loader: a routine which tries to fetch the entities (files) based on their PUBLIC and SYSTEM identifiers. The default loader doesn't look at the public identifier, since libxml2 does not maintain a catalog. You can redefine your own entity loader by using xmlGetExternalEntityLoader() and xmlSetExternalEntityLoader(). Check the example.
  • Input I/O buffers: a convenience structure used by the parser input layer to fetch the data that feeds the parser. It provides buffering and is also the place where the encoding converters to UTF-8 are attached.
  • Output I/O buffers: similar to the input ones, they fulfill a similar task when generating a serialization from a tree.
  • A mechanism to register sets of I/O callbacks and associate them with specific naming schemes, such as the protocol part of URIs. This affects the default I/O operations and allows specific I/O handlers to be used for certain names.

The general mechanism used when loading http://rpmfind.net/xml.html, for example in the HTML parser, is the following:

  1. The default entity loader calls xmlNewInputFromFile() with the parsing context and the URI string.
  2. The URI string is checked against the registered handlers using their match() callback function. If the HTTP module was compiled in, it is registered and its match() function will succeed.
  3. The open() function of the handler is called and, if successful, returns an I/O input buffer.
  4. The parser then starts reading from this buffer and progressively fetches information from the resource, calling the read() function of the handler until the resource is exhausted.
  5. If an encoding change is detected, it is installed on the input buffer, providing buffering and efficient use of the conversion routines.
  6. Once the parser has finished, the close() function of the handler is called once, and the input buffer and associated resources are deallocated.

The user-defined callbacks are checked first, which allows overriding the default libxml2 I/O routines.
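
The matching step can be pictured with a small stand-alone model. This is only a sketch, not libxml2 code: the io_handler type, register_handler(), find_handler() and the three match functions are hypothetical names, and the newest-first search order is an assumption chosen so that handlers registered last (the user-defined ones) are tried before the defaults.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified handler record (not the libxml2 structure). */
typedef struct {
    const char *name;                 /* label used only for illustration */
    int (*match)(const char *uri);    /* nonzero if the handler accepts the URI */
} io_handler;

#define MAX_HANDLERS 8
static io_handler handlers[MAX_HANDLERS];
static int nb_handlers = 0;

/* Append a handler; returns its index, or -1 if the table is full. */
int register_handler(const char *name, int (*match)(const char *uri)) {
    assert(match != NULL);
    if (nb_handlers >= MAX_HANDLERS)
        return -1;
    handlers[nb_handlers].name = name;
    handlers[nb_handlers].match = match;
    return nb_handlers++;
}

/* Walk the table from the newest entry to the oldest, so that handlers
 * registered later are tried before the ones registered earlier. */
const char *find_handler(const char *uri) {
    int i;
    for (i = nb_handlers - 1; i >= 0; i--) {
        if (handlers[i].match(uri))
            return handlers[i].name;
    }
    return NULL;
}

/* Catch-all standing in for a plain file handler. */
int match_file(const char *uri) { (void)uri; return 1; }
/* Accepts anything that starts with the http:// scheme. */
int match_http(const char *uri) { return strncmp(uri, "http://", 7) == 0; }
/* User handler that claims one specific host. */
int match_user(const char *uri) { return strstr(uri, "rpmfind.net") != NULL; }

/* Register two "default" handlers, then one user handler, and report
 * which handler would be selected for the given URI. */
const char *demo_lookup(const char *uri) {
    nb_handlers = 0;
    register_handler("file", match_file);
    register_handler("http", match_http);
    register_handler("user", match_user);
    return find_handler(uri);
}
```

With this table, http://rpmfind.net/xml.html is claimed by the user handler even though the http handler would also match, while other http:// URIs still reach the http handler and everything else falls through to the file handler.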

The basic buffer type

All buffer manipulation is done using the xmlBuffer type defined in tree.h, which is a resizable memory buffer. The buffer allocation strategy can be either best-fit or exponential doubling (a CPU vs. memory use trade-off). The values are XML_BUFFER_ALLOC_EXACT and XML_BUFFER_ALLOC_DOUBLEIT; they can be set for an individual buffer using xmlBufferSetAllocationScheme() or on a system-wide basis using xmlSetBufferAllocationScheme(). A number of functions, with names starting with the xmlBuffer... prefix, allow manipulating buffers.
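
The trade-off between the two strategies can be seen in a toy growable buffer. This is a sketch under stated assumptions, not the xmlBuffer implementation: toy_buffer, GROW_EXACT, GROW_DOUBLE and the initial size of 16 bytes are all made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names standing in for the two allocation schemes. */
typedef enum { GROW_EXACT, GROW_DOUBLE } grow_scheme;

typedef struct {
    char *data;
    size_t used;         /* bytes currently stored */
    size_t size;         /* bytes currently allocated */
    grow_scheme scheme;
    unsigned reallocs;   /* number of times the storage was resized */
} toy_buffer;

/* Append len bytes, growing the storage according to the scheme.
 * Returns 0 on success, -1 if memory could not be obtained. */
int toy_buffer_add(toy_buffer *buf, const char *src, size_t len) {
    size_t needed;

    assert(buf != NULL && src != NULL);
    needed = buf->used + len;
    if (needed > buf->size) {
        size_t new_size = needed;          /* best fit: exactly what is needed */
        char *tmp;

        if (buf->scheme == GROW_DOUBLE) {  /* doubling: round up to a power step */
            new_size = buf->size ? buf->size : 16;
            while (new_size < needed)
                new_size *= 2;
        }
        tmp = realloc(buf->data, new_size);
        if (tmp == NULL)
            return -1;
        buf->data = tmp;
        buf->size = new_size;
        buf->reallocs++;
    }
    memcpy(buf->data + buf->used, src, len);
    buf->used += len;
    return 0;
}

/* Append count single bytes and report how many resizes were needed. */
unsigned count_reallocs(grow_scheme scheme, size_t count) {
    toy_buffer buf = { NULL, 0, 0, scheme, 0 };
    unsigned n;
    size_t i;

    for (i = 0; i < count; i++) {
        if (toy_buffer_add(&buf, "x", 1) != 0)
            break;
    }
    n = buf.reallocs;
    free(buf.data);
    return n;
}
```

Appending 1000 single bytes resizes the best-fit buffer 1000 times but the doubling buffer only 7 times (16, 32, ... 1024 bytes), at the cost of 24 unused bytes at the end.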

Input I/O handlers

An Input I/O handler is a simple structure, xmlParserInputBuffer, containing a context associated with the resource (file descriptor, or pointer to a protocol handler), the read() and close() callbacks to use, and an xmlBuffer. An extra xmlBuffer and a charset encoding handler are also present to support charset conversion when needed.
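
The following stand-alone sketch mirrors that layout: a context, read and close callbacks, and a buffer that is filled chunk by chunk until the resource is exhausted. It is a simplified model rather than the real xmlParserInputBuffer: toy_input, mem_source and the fixed 256-byte array are hypothetical, and the second buffer and the encoding handler are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an input handler. */
typedef struct {
    void *context;                                        /* resource handle */
    int (*readcallback)(void *context, char *buf, int len);
    int (*closecallback)(void *context);
    char buffer[256];                                     /* stands in for the xmlBuffer */
    size_t used;
} toy_input;

/* An in-memory "resource" used as the context. */
typedef struct {
    const char *data;
    size_t pos;
    int closed;          /* how many times close was called */
} mem_source;

/* Copy up to len bytes from the source; returns 0 once it is exhausted. */
int mem_read(void *context, char *buf, int len) {
    mem_source *src = context;
    size_t left = strlen(src->data) - src->pos;
    size_t n = left < (size_t)len ? left : (size_t)len;

    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return (int)n;
}

int mem_close(void *context) {
    mem_source *src = context;
    src->closed++;
    return 0;
}

/* Call read() in chunks until it returns 0, then call close() once.
 * Returns the number of bytes buffered, or -1 on a read error or if
 * the fixed-size buffer fills up. */
int toy_input_drain(toy_input *in, int chunk) {
    int result;

    assert(in != NULL && in->readcallback != NULL);
    for (;;) {
        size_t room = sizeof(in->buffer) - in->used;
        int want = room < (size_t)chunk ? (int)room : chunk;
        int n;

        if (want <= 0) { result = -1; break; }
        n = in->readcallback(in->context, in->buffer + in->used, want);
        if (n < 0) { result = -1; break; }
        if (n == 0) { result = (int)in->used; break; }
        in->used += (size_t)n;
    }
    if (in->closecallback != NULL)
        in->closecallback(in->context);
    return result;
}

/* Drain a string through the handler; reports the close count via closed. */
int demo_drain(const char *text, int chunk, int *closed) {
    mem_source src = { text, 0, 0 };
    toy_input in = { &src, mem_read, mem_close, {0}, 0 };
    int total = toy_input_drain(&in, chunk);

    *closed = src.closed;
    return total;
}
```

Draining the six-byte string "<doc/>" in chunks of four takes two successful reads and a final empty one, after which the close callback runs exactly once.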

Output I/O handlers

An Output handler, xmlOutputBuffer, is very similar to an Input one, except that the callbacks are write() and close().

The entities loader

The entity loader resolves requests for new entities and creates inputs for the parser. Creating an input from a filename or a URI string is done through the xmlNewInputFromFile() routine. The default entity loader does not handle the PUBLIC identifier associated with an entity (if any), so it just calls xmlNewInputFromFile() with the SYSTEM identifier (which is mandatory in XML).

If you want to hook up a catalog mechanism, you simply need to override the default entity loader. Here is an example:

#include <libxml/parser.h>
#include <libxml/parserInternals.h>  /* declares xmlNewInputFromFile() */
#include <libxml/xmlIO.h>

/* Loader that was installed before ours; used as a fallback. */
xmlExternalEntityLoader defaultLoader = NULL;

xmlParserInputPtr
xmlMyExternalEntityLoader(const char *URL, const char *ID,
                          xmlParserCtxtPtr ctxt) {
    xmlParserInputPtr ret = NULL;
    const char *fileID = NULL;
    /* look up fileID depending on ID */

    /* Try our own resolution first, but only if the lookup found something. */
    if (fileID != NULL)
        ret = xmlNewInputFromFile(ctxt, fileID);
    if (ret != NULL)
        return(ret);
    /* Otherwise fall back to the previously installed loader. */
    if (defaultLoader != NULL)
        ret = defaultLoader(URL, ID, ctxt);
    return(ret);
}

int main(void) {
    /* ... */

    /*
     * Install our own entity loader
     */
    defaultLoader = xmlGetExternalEntityLoader();
    xmlSetExternalEntityLoader(xmlMyExternalEntityLoader);

    /* ... */
    return(0);
}

Example of customized I/O

This example comes from a real use case: xmlDocDump() closes the FILE * passed by the application, and this was a problem. The solution was to define a new output handler with the closing call deactivated:

  1. First define a new I/O output allocator where the output doesn't close the file:

    xmlOutputBufferPtr
    xmlOutputBufferCreateOwn(FILE *file, xmlCharEncodingHandlerPtr encoder) {
        xmlOutputBufferPtr ret;

        if (xmlOutputCallbackInitialized == 0)
            xmlRegisterDefaultOutputCallbacks();

        if (file == NULL) return(NULL);
        ret = xmlAllocOutputBuffer(encoder);
        if (ret != NULL) {
            ret->context = file;
            ret->writecallback = xmlFileWrite;
            ret->closecallback = NULL;  /* No close callback */
        }
        return(ret);
    }

  2. Then use it to save the document:

    FILE *f;
    xmlOutputBufferPtr output;
    xmlDocPtr doc;
    int res;

    f = ...
    doc = ....

    output = xmlOutputBufferCreateOwn(f, NULL);
    res = xmlSaveFileTo(output, doc, NULL);
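
The effect of leaving the close callback unset can be tried without libxml2 using the model below. It is only a sketch: toy_output, file_write(), toy_output_close() and demo_keep_open() are hypothetical names, and the temporary file comes from the standard tmpfile() call. In application code, the public xmlOutputBufferCreateIO() function, which takes the write callback, the close callback, the context and the encoder, is probably the more portable way to build such a buffer, since xmlOutputCallbackInitialized and xmlFileWrite appear to be private to xmlIO.c in the 2.9 sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an output handler. */
typedef struct {
    void *context;
    int (*writecallback)(void *context, const char *buf, int len);
    int (*closecallback)(void *context);   /* may be NULL */
} toy_output;

/* Write callback that forwards to a FILE * stored in the context. */
int file_write(void *context, const char *buf, int len) {
    return (int)fwrite(buf, 1, (size_t)len, (FILE *)context);
}

/* Release the output object. The underlying resource is closed only
 * if a close callback was installed. Returns the callback's result,
 * or 0 when there is none. */
int toy_output_close(toy_output *out) {
    int rc = 0;

    assert(out != NULL);
    if (out->closecallback != NULL)
        rc = out->closecallback(out->context);
    out->context = NULL;
    return rc;
}

/* Write through a non-closing output, then keep using the same FILE *.
 * Copies what ended up in the file into result; returns 0 on success. */
int demo_keep_open(char *result, size_t size) {
    toy_output out;
    FILE *f;
    size_t n;

    if (result == NULL || size == 0)
        return -1;
    f = tmpfile();
    if (f == NULL)
        return -1;

    out.context = f;
    out.writecallback = file_write;
    out.closecallback = NULL;               /* no close callback */

    out.writecallback(out.context, "<doc/>", 6);
    toy_output_close(&out);                 /* does not fclose(f) */

    fputs("\n<!-- still open -->", f);      /* the application still owns f */
    rewind(f);
    n = fread(result, 1, size - 1, f);
    result[n] = '\0';
    fclose(f);                              /* closed by the application */
    return 0;
}
```

Because the output object never closes the stream, the application can keep writing to the same FILE * afterwards and remains responsible for the final fclose().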

Daniel Veillard
