From 35a201cc8ef0c3f5b2df88d2e528aabee1048348 Mon Sep 17 00:00:00 2001
From: Wojtek Kosior
Date: Fri, 30 Apr 2021 18:47:09 +0200
Subject: Initial/Final commit

---
 libxml2-2.9.10/doc/python.html | 254 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 254 insertions(+)
 create mode 100644 libxml2-2.9.10/doc/python.html

diff --git a/libxml2-2.9.10/doc/python.html b/libxml2-2.9.10/doc/python.html
new file mode 100644
index 0000000..fd52966
--- /dev/null
+++ b/libxml2-2.9.10/doc/python.html
@@ -0,0 +1,254 @@

Python and bindings

The XML C parser and toolkit of Gnome

Python and bindings

There are a number of language bindings and wrappers available for libxml2; the list below is not exhaustive. Please contact the xml-bindings@gnome.org mailing list (archives) to get updates to this list or to discuss the specific topic of libxml2 or libxslt wrappers or bindings:

  • Libxml++ seems to be the most up-to-date C++ binding for libxml2; check the documentation and the examples.
  • There is another C++ wrapper based on the gdome2 bindings, maintained by Tobias Peters.
  • There is a third C++ wrapper by Peter Jones <pjones@pmade.org>.
    Website: http://pmade.org/pjones/software/xmlwrapp/
  • XML::LibXML Perl bindings are available on CPAN, as are the XML::LibXSLT Perl libxslt bindings.
  • If you're interested in scripting XML processing, have a look at XSH, an XML editing shell based on the Libxml2 Perl bindings.
  • Dave Kuhlman provides an earlier version of the libxml/libxslt wrappers for Python.
  • Gopal.V and Peter Minten develop libxml#, a set of C# libxml2 bindings.
  • Petr Kozelka provides Pascal units to glue libxml2 with Kylix, Delphi and other Pascal compilers.
  • Uwe Fechner also provides idom2, a DOM2 implementation for Kylix2/D5/D6 from Borland.
  • There are bindings for Ruby, and libxml2 bindings are also available in Ruby through the libgdome-ruby module maintained by Tobias Peters.
  • Steve Ball and contributors maintain libxml2 and libxslt bindings for Tcl.
  • libxml2 and libxslt are the default XML libraries for PHP5.
  • LibxmlJ is an effort to create a 100% JAXP-compatible Java wrapper for libxml2 and libxslt as part of the GNU ClasspathX project.
  • Patrick McPhee provides Rexx bindings for libxml2 and libxslt; look for RexxXML.
  • Satimage provides XMLLib osax. This is an osax for Mac OS X with a set of commands to implement the XML DOM, XPath and XSLT in AppleScript. It also includes commands for property lists (Apple's fast lookup table XML format).
  • Francesco Montorsi developed wxXml2 wrappers that interface libxml2, allowing wxWidgets applications to load/save/edit XML instances.

The distribution includes a set of Python bindings, which are guaranteed to be maintained as part of the library in the future, though the Python interface has not yet reached the completeness of the C API.

Note that some Python purists dislike the default set of Python bindings; rather than complaining, I suggest they have a look at lxml, the more pythonic bindings for libxml2 and libxslt, and check the mailing-list.

Stéphane Bidoul maintains a Windows port of the Python bindings.

Note to people interested in building bindings: the API is formalized as an XML API description file, which makes it possible to automate a large part of the Python bindings. This includes function descriptions, enums, structures, typedefs, etc. The Python script used to build the bindings is python/generator.py in the source distribution.

To install the Python bindings there are 2 options:

  • If you use an RPM-based distribution, simply install the libxml2-python RPM (and, if needed, the libxslt-python RPM).
  • Otherwise use the libxml2-python module distribution corresponding to your installed version of libxml2 and libxslt. Note that to install it you will need both libxml2 and libxslt installed, and you must run "python setup.py build install" in the module tree.

The distribution includes a set of examples and regression tests for the Python bindings in the python/tests directory. Here are some excerpts from those tests:

tst.py:

This is a basic test of the file interface and DOM navigation:

import libxml2, sys

# Parse the file into a document tree.
doc = libxml2.parseFile("tst.xml")
if doc.name != "tst.xml":
    print("doc.name failed")
    sys.exit(1)
# Walk from the document to the root element, then to its first child.
root = doc.children
if root.name != "doc":
    print("root.name failed")
    sys.exit(1)
child = root.children
if child.name != "foo":
    print("child.name failed")
    sys.exit(1)
# Documents must be freed explicitly.
doc.freeDoc()

The Python module is called libxml2; parseFile is the equivalent of xmlParseFile (most of the bindings are automatically generated: the xml prefix is removed and the casing convention is kept). All nodes seen at the binding level share the same subset of accessors:

  • name: returns the node name
  • type: returns a string indicating the node type
  • content: returns the content of the node; it is based on xmlNodeGetContent() and hence is recursive.
  • parent, children, last, next, prev, doc, properties: point to the associated element in the tree; these may return None if no such link exists.
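
The naming rule described above (drop the xml prefix, keep the casing) can be sketched as a small helper. This is only an illustration: python_name is a hypothetical function, not part of the bindings, and the real python/generator.py handles many more cases than this one rule.

```python
def python_name(c_name):
    """Hypothetical helper: map a C function name to a binding-style name.

    It only illustrates the rule stated in the text; it is not the
    logic used by python/generator.py.
    """
    prefix = "xml"
    if c_name.startswith(prefix) and len(c_name) > len(prefix):
        rest = c_name[len(prefix):]
        # Lower-case the first letter so the camelCase convention is kept.
        return rest[0].lower() + rest[1:]
    # Names without the prefix are returned unchanged in this sketch.
    return c_name


print(python_name("xmlParseFile"))         # parseFile
print(python_name("xmlCreatePushParser"))  # createPushParser
```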

Also note the need to explicitly deallocate documents with freeDoc(). Reference counting for libxml2 trees would need quite a lot of work to function properly, and rather than risk memory leaks if it were not implemented correctly, it seems safer to have an explicit function to free a tree. The wrapper Python objects like doc, root or child are themselves automatically garbage collected.
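
Because freeDoc() has to be called explicitly, one possible way to avoid forgetting it is a small context manager. The sketch below is an assumption, not something shipped with the bindings: freeing is a hypothetical helper, and FakeDoc is a stand-in object so that the example runs without libxml2 installed.

```python
from contextlib import contextmanager


@contextmanager
def freeing(doc):
    """Hypothetical helper: yield doc, then call doc.freeDoc(),
    even if the body of the with-block raises."""
    try:
        yield doc
    finally:
        doc.freeDoc()


class FakeDoc:
    """Stand-in for a parsed document so the sketch runs without libxml2."""

    def __init__(self):
        self.freed = False

    def freeDoc(self):
        self.freed = True


fake = FakeDoc()
with freeing(fake) as doc:
    pass  # work with doc here
print(fake.freed)  # True
```

With the real bindings, passing the result of libxml2.parseFile("tst.xml") to freeing() should behave the same way, assuming the returned object exposes freeDoc() as in the excerpt above.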

validate.py:

This test checks the validation interfaces and the redirection of error messages:

import libxml2

# Deactivate error messages from the validation.
def noerr(ctx, msg):
    pass

libxml2.registerErrorHandler(noerr, None)

# Create a parser context and turn validation on before parsing.
ctxt = libxml2.createFileParserCtxt("invalid.xml")
ctxt.validate(1)
ctxt.parseDocument()
doc = ctxt.doc()
valid = ctxt.isValid()
doc.freeDoc()
# The document is invalid, so isValid() is expected to return 0.
if valid != 0:
    print("validity check failed")

The first thing to notice is the call to registerErrorHandler(); it defines a new error handler global to the library. It is used to avoid seeing the error messages when trying to validate the invalid document.

The main interest of that test is the creation of a parser context with createFileParserCtxt() and how the behaviour can be changed before calling parseDocument(). Similarly, the information resulting from the parsing phase is also available using context methods.

Contexts, like nodes, are defined as classes, and the libxml2 wrappers map the C function interfaces to object methods as much as possible. The best way to get a complete view of what methods are supported is to look at the libxml2.py module containing all the wrappers.

push.py:

This test shows how to activate the push parser interface:

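import libxml2

# Create the push parser with the first 4 bytes of data.
ctxt = libxml2.createPushParser(None, "<foo", 4, "test.xml")
# Push the remaining 2 bytes; the final 1 marks the end of the data.
ctxt.parseChunk("/>", 2, 1)
doc = ctxt.doc()

doc.freeDoc()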

The context is created with a special call based on xmlCreatePushParser() from the C library. The first argument is an optional SAX callback object, followed by the initial set of data, its length, and the name of the resource in case URI-References need to be computed by the parser.

Then the data are pushed using the parseChunk() method, with the last call setting the third argument, terminate, to 1.
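
For readers without the libxml2 bindings at hand, the same push idea (feed the data in pieces, then signal the end) can be tried with Python's standard library. Note that this sketch uses xml.etree.ElementTree, not libxml2, so the method names differ from parseChunk():

```python
import xml.etree.ElementTree as ET

parser = ET.XMLParser()
parser.feed("<foo")    # first chunk: an incomplete start tag
parser.feed("/>")      # second chunk completes the element
root = parser.close()  # end of data, comparable to terminate set to 1
print(root.tag)        # foo
```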

pushSAX.py:

This test shows the use of the event-based parsing interfaces. In this case the parser does not build a document, but provides callback information as it makes progress analyzing the data being provided:

import libxml2
log = ""

# SAX handler: each callback appends a trace of the event to the log.
class callback:
    def startDocument(self):
        global log
        log = log + "startDocument:"

    def endDocument(self):
        global log
        log = log + "endDocument:"

    def startElement(self, tag, attrs):
        global log
        log = log + "startElement %s %s:" % (tag, attrs)

    def endElement(self, tag):
        global log
        log = log + "endElement %s:" % (tag)

    def characters(self, data):
        global log
        log = log + "characters: %s:" % (data)

    def warning(self, msg):
        global log
        log = log + "warning: %s:" % (msg)

    def error(self, msg):
        global log
        log = log + "error: %s:" % (msg)

    def fatalError(self, msg):
        global log
        log = log + "fatalError: %s:" % (msg)

handler = callback()

# Feed the document in three pieces; the last call terminates the parse.
ctxt = libxml2.createPushParser(handler, "<foo", 4, "test.xml")
chunk = " url='tst'>b"
ctxt.parseChunk(chunk, len(chunk), 0)
chunk = "ar</foo>"
ctxt.parseChunk(chunk, len(chunk), 1)

reference = ("startDocument:startElement foo {'url': 'tst'}:"
             "characters: bar:endElement foo:endDocument:")
if log != reference:
    print("Error got: %s" % log)
    print("Expected: %s" % reference)

The key object in that test is the handler; it provides a number of entry points which can be called by the parser as it makes progress, to indicate the information set obtained. The full set of callbacks is larger than what the callback class in that specific example implements (see the SAX definition for a complete list). The wrapper will only call those supplied by the object when activated. The startElement callback receives the name of the element and a dictionary containing the attributes carried by this element.

Also note that the reference string generated from the callback shows a single call to characters even though the string "bar" is passed to the parser in 2 different calls to parseChunk().
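
The same event-driven pattern can be tried with Python's standard xml.sax module, used here as a stand-in rather than the libxml2 bindings. Recorder is a name made up for this sketch. Whether the text arrives in one characters call or several depends on the parser, so the sketch collects the pieces and joins them instead of assuming a single call:

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Collects element events and text pieces as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.text = []

    def startElement(self, name, attrs):
        # Copy the attributes into a plain dictionary.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # May be called more than once for one run of text.
        self.text.append(content)


handler = Recorder()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
# Same three pieces of data as in the pushSAX.py excerpt.
for chunk in ("<foo", " url='tst'>b", "ar</foo>"):
    parser.feed(chunk)
parser.close()

print(handler.events)
print("".join(handler.text))
```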

xpath.py:

This is a basic test of the XPath wrapper support:

import libxml2
import sys

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
# Select every element in the document.
res = ctxt.xpathEval("//*")
if len(res) != 2:
    print("xpath query: wrong node set size")
    sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
    print("xpath query: wrong node set value")
    sys.exit(1)
# Free the document and the context only after the result has been used.
doc.freeDoc()
ctxt.xpathFreeContext()

This test parses a file, then creates an XPath context to evaluate XPath expressions on it. The xpathEval() method executes an XPath query and returns the result mapped in a Python way. Strings and numbers are natively converted, and node sets are returned as a tuple of libxml2 Python node wrappers. Like the document, the XPath context needs to be freed explicitly. Also note that the result of the XPath query may point back to the document tree, and hence the document must be freed only after the result of the query has been used.

xpathext.py:

This test shows how to extend the XPath engine with functions written in Python:

import libxml2

# Extension function: receives the evaluation context and one argument.
def foo(ctx, x):
    return x + 1

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
# Make foo() callable from XPath expressions evaluated in this context.
libxml2.registerXPathFunction(ctxt._o, "foo", None, foo)
res = ctxt.xpathEval("foo(1)")
if res != 2:
    print("xpath extension failure")
doc.freeDoc()
ctxt.xpathFreeContext()

Note how the extension function is registered with the context (but that part is not yet finalized; this may change slightly in the future).

tstxpath.py:

This test is similar to the previous one but shows how the extension function can access the XPath evaluation context:

def foo(ctx, x):
    global called

    #
    # test access to the XPath evaluation context
    #
    pctxt = libxml2.xpathParserContext(_obj=ctx)
    ctxt = pctxt.context()
    called = ctxt.function()
    return x + 1

The interfaces around the XPath parser (or rather evaluation) context are not all finalized, but they should be sufficient to do contextual work at the evaluation point.

Memory debugging:

Last but not least, all tests start with the following prologue:

# memory debug specific
libxml2.debugMemory(1)

and end with the following epilogue:

# memory debug specific
libxml2.cleanupParser()
if libxml2.debugMemory(1) == 0:
    print("OK")
else:
    print("Memory leak %d bytes" % (libxml2.debugMemory(1)))
    libxml2.dumpMemory()

These activate the memory debugging interface of libxml2, where all allocated blocks in the library are tracked. The epilogue then cleans up the library state and checks that all allocated memory has been freed. If not, it calls dumpMemory(), which saves that list in a .memdump file.
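
The prologue and epilogue can also be wrapped around a test body in one place. The following is a minimal sketch under stated assumptions: leak_check is a hypothetical helper, it takes the module as a parameter, and FakeLib is a stand-in that never leaks so the example runs without libxml2. Only the three calls shown in the excerpts above (debugMemory, cleanupParser, dumpMemory) are used.

```python
def leak_check(lib, test):
    """Hypothetical helper: run test() between the prologue and epilogue.

    lib is expected to offer debugMemory(), cleanupParser() and
    dumpMemory(), as the libxml2 module does in the excerpts.
    Returns the number of bytes still allocated afterwards.
    """
    lib.debugMemory(1)           # prologue: start tracking allocations
    test()
    lib.cleanupParser()          # epilogue: release library-level state
    leaked = lib.debugMemory(1)  # bytes still allocated
    if leaked != 0:
        lib.dumpMemory()         # record the outstanding blocks
    return leaked


class FakeLib:
    """Stand-in so the sketch runs without libxml2; it never leaks."""

    def __init__(self):
        self.calls = []

    def debugMemory(self, activate):
        self.calls.append("debugMemory")
        return 0

    def cleanupParser(self):
        self.calls.append("cleanupParser")

    def dumpMemory(self):
        self.calls.append("dumpMemory")


fake = FakeLib()
print(leak_check(fake, lambda: None))  # 0
```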

Daniel Veillard

-- cgit v1.2.3