Deferred Query Execution, Node Removal, and the Halloween Problem

This section is a warning about a few goblins to be wary of. First up is deferred query execution. Never forget that many of the LINQ operators defer query execution until the results are actually enumerated. Any change made to the underlying data between declaring a query and enumerating it shows up in the results, and this can cause unexpected side effects.
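To make the deferred execution warning concrete, here is a minimal sketch. The sample tree and the variable names are illustrative assumptions, and the code is written as a top-level program (C# 9 or later); in an older project, the statements would go inside Main. It declares a query, changes the tree, and only then enumerates, so the deferred query sees the change while a cached copy does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample tree; the names are illustrative only.
XElement root = new XElement("BookParticipants",
  new XElement("BookParticipant", "Joe"));

// Declaring the query does not run it; nothing has been enumerated yet.
IEnumerable<XElement> deferred = root.Elements("BookParticipant");

// ToArray enumerates immediately and stores the results.
XElement[] cached = root.Elements("BookParticipant").ToArray();

// Change the tree after the query was declared.
root.Add(new XElement("BookParticipant", "Ewan"));

int deferredCount = deferred.Count();  // enumerates now, so it sees both elements
int cachedCount = cached.Length;       // snapshot taken before the Add

Console.WriteLine("Deferred query sees {0} element(s)", deferredCount);
Console.WriteLine("Cached array holds {0} element(s)", cachedCount);
```

The deferred query reports two elements and the cached array reports one, because only the array was filled before the second element was added.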

Another problem to be on the lookout for is the Halloween problem. It earned its name because it was first openly discussed among a small group of experts on Halloween. Broadly, it is any problem that occurs when changing the data being iterated over affects the iteration itself.

The problem was first detected by database engineers working on a database optimizer. Their test query retrieved records by way of an index created over one of the table's columns, and the same query changed the value in that column. Because that column determined each record's position in the index, an updated record reappeared farther down the list of records, where the same query retrieved and processed it again. The result was an endless loop: every time the record was retrieved, it was updated and moved farther down the record set, only to be picked up and processed the same way again, indefinitely.

You may have seen the Halloween problem yourself even though you may not have known the name for it. Have you ever iterated through some sort of collection and deleted an item, only to have the iteration break or misbehave? I saw this recently while working with a major suite of ASP.NET server controls. The suite has a DataGrid server control, and I needed to remove selected records from it. I iterated through the records from start to finish, deleting the ones I needed to, but doing so corrupted the pointers being used for the iteration. The result was that some records that should not have been deleted were deleted, and some that should have been deleted were not. I called the vendor for support, and its solution was to iterate through the records backward. This resolved the problem.
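The backward iteration workaround can be sketched with an ordinary List<int> standing in for the grid's records. The data and the "remove every even number" rule are hypothetical, and the code is written as a top-level program (C# 9 or later). Walking forward by index would skip the item that slides into the slot just vacated by a removal, whereas walking backward only shifts items that have already been visited.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data: the even numbers stand in for the records to delete.
List<int> records = new List<int> { 1, 2, 4, 5, 6, 8, 9 };

// Iterate from the last index to the first so that RemoveAt never
// shifts an item that has not been examined yet.
for (int i = records.Count - 1; i >= 0; i--)
{
  if (records[i] % 2 == 0)
  {
    records.RemoveAt(i);
  }
}

string remaining = string.Join(", ", records);
Console.WriteLine(remaining);
```

Only the odd values 1, 5, and 9 remain. Note that List<T> guards against the forward foreach version of this mistake by throwing an InvalidOperationException when the list is modified during enumeration; not every collection is that helpful.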

With LINQ to XML, you will most likely run into this problem when removing nodes from an XML tree, although it can occur at other times, so keep it in mind when you are coding. Let's examine the example in Listing 7-13.

Example. Intentionally Exposing the Halloween Problem
// Requires: using System; using System.Collections.Generic; using System.Xml.Linq;
XDocument xDocument = new XDocument(
  new XElement("BookParticipants",
    new XElement("BookParticipant",
      new XAttribute("type", "Author"),
      new XElement("FirstName", "Joe"),
      new XElement("LastName", "Rattz")),
    new XElement("BookParticipant",
      new XAttribute("type", "Editor"),
      new XElement("FirstName", "Ewan"),
      new XElement("LastName", "Buckingham"))));

IEnumerable<XElement> elements =
  xDocument.Element("BookParticipants").Elements("BookParticipant");

foreach (XElement element in elements)
{
  Console.WriteLine("Source element: {0} : value = {1}",
    element.Name, element.Value);
}

foreach (XElement element in elements)
{
  Console.WriteLine("Removing {0} = {1} ...", element.Name, element.Value);
  element.Remove();
}

Console.WriteLine(xDocument);

In the previous code, I first build my XML document. Next, I build a sequence of the BookParticipant elements. This is the sequence I will enumerate through, removing elements. Next, I display each element in my sequence so you can see that I do indeed have two BookParticipant elements. I then enumerate through the sequence again, displaying a message that I am removing the element, and I remove the BookParticipant element. I then display the resulting XML document.

If the Halloween problem does not manifest itself, you should see the "Removing ..." message twice; and when the XML document is displayed at the end, you should have an empty BookParticipants element. Here are the results:

Source element: BookParticipant : value = JoeRattz
Source element: BookParticipant : value = EwanBuckingham
Removing BookParticipant = JoeRattz ...
<BookParticipants>
  <BookParticipant type="Editor">
    <FirstName>Ewan</FirstName>
    <LastName>Buckingham</LastName>
  </BookParticipant>
</BookParticipants>

Just as I anticipated, there are two source BookParticipant elements in the sequence to remove. You can see the first one, Joe Rattz, gets removed. However, you never see the second one get removed, and when I display the resulting XML document, the last BookParticipant element is still there. The enumeration misbehaved; the Halloween problem got me. Keep in mind that the Halloween problem does not always manifest itself in the same way. Sometimes enumerations may terminate sooner than they should; sometimes they throw exceptions. Their behavior varies depending on exactly what is happening.

I know you are wondering what the solution is. For this case, the solution is to cache the elements and enumerate through the cache instead of through the deferred sequence, whose enumeration relies on internal pointers that are corrupted by the removal or modification of elements. For this example, I will cache the sequence with the ToArray operator, one of the Standard Query Operators that executes the query immediately and so prevents deferred query execution problems. Listing 7-14 shows the same code as before, except I call the ToArray operator and enumerate over the array it returns.

Example. Preventing the Halloween Problem
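// Requires: using System; using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
XDocument xDocument = new XDocument(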
  new XElement("BookParticipants",
    new XElement("BookParticipant",
      new XAttribute("type", "Author"),
      new XElement("FirstName", "Joe"),
      new XElement("LastName", "Rattz")),
    new XElement("BookParticipant",
      new XAttribute("type", "Editor"),
      new XElement("FirstName", "Ewan"),
      new XElement("LastName", "Buckingham"))));

IEnumerable<XElement> elements =
  xDocument.Element("BookParticipants").Elements("BookParticipant");

foreach (XElement element in elements)
{
  Console.WriteLine("Source element: {0} : value = {1}",
    element.Name, element.Value);
}

foreach (XElement element in elements.ToArray())
{
  Console.WriteLine("Removing {0} = {1} ...", element.Name, element.Value);
  element.Remove();
}

Console.WriteLine(xDocument);

This code is identical to the previous example except I call the ToArray operator in the final enumeration where I remove the elements. Here are the results:

Source element: BookParticipant : value = JoeRattz
Source element: BookParticipant : value = EwanBuckingham
Removing BookParticipant = JoeRattz ...
Removing BookParticipant = EwanBuckingham ...
<BookParticipants />

Notice that this time I got two messages informing me that a BookParticipant element was being removed. Also, when I display the XML document after the removal, I do have an empty BookParticipants element because all the child elements have been removed. The Halloween problem has been foiled!
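When the only goal is removal, another option worth knowing is the Remove extension method that LINQ to XML provides for a sequence of nodes. Its documentation describes it as copying the nodes to a list before disconnecting them from their parents, so it should not need an explicit ToArray call. The following is a minimal sketch under that assumption, using the same document as the listings above and written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

XDocument xDocument = new XDocument(
  new XElement("BookParticipants",
    new XElement("BookParticipant",
      new XAttribute("type", "Author"),
      new XElement("FirstName", "Joe"),
      new XElement("LastName", "Rattz")),
    new XElement("BookParticipant",
      new XAttribute("type", "Editor"),
      new XElement("FirstName", "Ewan"),
      new XElement("LastName", "Buckingham"))));

IEnumerable<XElement> elements =
  xDocument.Element("BookParticipants").Elements("BookParticipant");

// Extensions.Remove takes its own snapshot of the sequence and then
// removes every node in that snapshot from its parent.
elements.Remove();

Console.WriteLine(xDocument);
```

This variant gives up the per-element "Removing ..." messages, so the ToArray approach remains the better fit whenever each element needs individual processing before it is removed.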
