Processing XML with ElementTree

The xml package is part of the Python standard library. It contains several subpackages and modules for parsing and manipulating markup documents.

The xml.etree.ElementTree module handles XML documents and provides classes and functions for building, parsing, and serializing them.

Let's see how we can create the previously mentioned example XML document by using ElementTree. Open a Python interpreter and run the following commands:

>>> import xml.etree.ElementTree as ET
>>> root = ET.Element('root')
>>> ET.dump(root)
<root />

We start by creating the root element, that is, the outermost element of the document. The <root /> representation is XML shorthand for <root></root>. It denotes an empty element, that is, an element with no text and no child elements.
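
To check that the two spellings of an empty element are equivalent, here is a minimal sketch. It assumes Python 3.4 or later, where ET.tostring() accepts the short_empty_elements keyword; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.Element('root')

# By default, an empty element is serialized in the short form.
short_form = ET.tostring(root, encoding='unicode')

# short_empty_elements=False asks for explicit start and end tags instead.
long_form = ET.tostring(root, encoding='unicode', short_empty_elements=False)

print(short_form)
print(long_form)

# Both strings parse back to an element with no text and no children.
for xml_text in (short_form, long_form):
    parsed = ET.fromstring(xml_text)
    print(parsed.tag, parsed.text, len(parsed))
```

Either form can be fed back to a parser; the short form is simply more compact.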

We create the <root> element by creating a new ElementTree.Element object. You'll notice that the argument we give to Element() is the name of the tag that is created. Our <root> element is empty at the moment, so let's put something in it:

>>> book = ET.Element('book')
>>> root.append(book)
>>> ET.dump(root)
<root><book /></root>

Now we have an element called <book> in our <root> element. When an element is directly nested inside another, the nested element is called a child of the outer element, and the outer element is called the parent. Similarly, elements that share the same parent are called siblings.
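
These relationships can be explored from code. The following sketch assumes a hypothetical second <book> element, added only to illustrate siblings. An Element behaves like a sequence of its direct children, but it keeps no reference to its parent, so the sketch builds a child-to-parent mapping by walking the tree.

```python
import xml.etree.ElementTree as ET

root = ET.Element('root')
first = ET.Element('book')
second = ET.Element('book')  # hypothetical second book, a sibling of the first
root.append(first)
root.append(second)

# An element can be measured, indexed, and iterated like a list of children.
print(len(root))
print(root[0] is first)
tags = [child.tag for child in root]
print(tags)

# Elements do not store their parent, so build a lookup table if one is needed.
parent_of = {child: parent for parent in root.iter() for child in parent}
print(parent_of[first].tag)

# Siblings are the other children of the same parent.
siblings = [e for e in parent_of[first] if e is not first]
print(siblings[0] is second)
```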

Let's add another element, and this time let's give it some content. Add the following commands:

>>> name = ET.SubElement(book, 'name')
>>> name.text = 'Book1'
>>> ET.dump(root)
<root><book><name>Book1</name></book></root>

Now our document is starting to take shape. We do two new things here: first, we use the shortcut function ET.SubElement() to create the new <name> element and insert it into the tree as a child of <book> in a single operation. Second, we give it some content by assigning a string to the element's text attribute.
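
As a rough comparison, the sketch below builds the same <book> fragment twice, once with Element() followed by append() and once with SubElement(), and checks that both serialize identically. The _a and _b suffixes are just illustrative names.

```python
import xml.etree.ElementTree as ET

# Two-step version: create the element, then attach it to its parent.
book_a = ET.Element('book')
name_a = ET.Element('name')
name_a.text = 'Book1'
book_a.append(name_a)

# One-step version: SubElement() creates the child and attaches it at once.
book_b = ET.Element('book')
name_b = ET.SubElement(book_b, 'name')
name_b.text = 'Book1'

xml_a = ET.tostring(book_a, encoding='unicode')
xml_b = ET.tostring(book_b, encoding='unicode')
print(xml_a)
print(xml_a == xml_b)
```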

We can remove elements by using the remove() method on the parent element, as shown in the following commands:

>>> temp = ET.SubElement(root, 'temp')
>>> ET.dump(root)
<root><book><name>Book1</name></book><temp /></root>
>>> root.remove(temp)
>>> ET.dump(root)
<root><book><name>Book1</name></book></root>