When you have to process large XML files, it might not be feasible to load the whole XML tree into memory. This article shows how to stream fragments using an XmlReader in C# and Visual Basic.
One of the most effective ways to use an XmlReader to read XElement objects is to write your own custom axis method. An axis method typically returns a collection such as IEnumerable<T> of XElement, as shown in the example in this article. In the custom axis method, after you create each XML fragment by calling the ReadFrom method, return it to the caller using yield return. Because the method is then an iterator, no XML is read until the caller starts enumerating the result, which gives your custom axis method deferred execution semantics.
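The deferred execution described above can be observed directly. The following is a minimal sketch, not part of the original sample: the helper name StreamElements, its elementName parameter, the Item/Id markup, and the trace message are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper (not from the article): streams every element called elementName.
static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
{
    using XmlReader reader = XmlReader.Create(input);
    reader.MoveToContent();
    while (true)
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
        {
            // ReadFrom builds the fragment and leaves the reader just past it.
            if (XNode.ReadFrom(reader) is XElement el)
            {
                // Trace line added only to make the evaluation order visible.
                Console.WriteLine($"materialized {(string)el.Attribute("Id")}");
                yield return el;
            }
        }
        else if (!reader.Read())
            break;
    }
}

string xml = "<Items><Item Id=\"1\"/><Item Id=\"2\"/><Item Id=\"3\"/></Items>";

// Nothing is parsed here: the iterator body has not started running yet.
IEnumerable<XElement> items = StreamElements(new StringReader(xml), "Item");
Console.WriteLine("query created");

// First() pulls a single element and then disposes the iterator, so the
// second and third Item elements are never turned into XElement objects.
XElement first = items.First();
Console.WriteLine($"first = {(string)first.Attribute("Id")}");
```

In this sketch, "query created" is written before "materialized 1", and no trace line appears for the remaining items, because the iterator only runs as far as the consumer asks it to.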
When you create an XML tree from an XmlReader object, the XmlReader must be positioned on an element. The ReadFrom method doesn't return until it has read the end tag of the element, which leaves the reader positioned on the node that follows that element.
If you want to create a partial tree, you can instantiate an XmlReader, position the reader on the node that you want to convert to an XElement tree, and then create the XElement object.
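As a rough illustration of building a partial tree, the sketch below positions the reader with ReadToFollowing and then hands it to XNode.ReadFrom. The document shape and the element names (Root, Header, Body, Row) are assumptions chosen for the example.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Illustrative markup; only the Body subtree is wanted as an XElement.
string xml = "<Root><Header><Title>Report</Title></Header>"
           + "<Body><Row>1</Row><Row>2</Row></Body></Root>";

using XmlReader reader = XmlReader.Create(new StringReader(xml));

// Position the reader on the element that becomes the root of the partial tree.
if (!reader.ReadToFollowing("Body"))
    throw new InvalidOperationException("Body element not found.");

// The reader is on <Body>; ReadFrom consumes everything through </Body>.
XElement body = (XElement)XNode.ReadFrom(reader);
Console.WriteLine(body);

// After ReadFrom returns, the reader sits on the node that follows </Body>,
// which in this particular document is the end tag of Root.
Console.WriteLine($"{reader.NodeType} {reader.Name}");
```

The Header element is skipped by the reader and never materialized as an XElement. Only the Body subtree is held in memory.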
The article How to stream XML fragments with access to header information contains information on streaming a more complex document.
The article How to perform streaming transform of large XML documents contains an example of using LINQ to XML to transform extremely large XML documents while maintaining a small memory footprint.
Example: Create a custom axis method
This example creates a custom axis method. You can query it by using a LINQ query. The custom axis method StreamRootChildDoc can read a document that has a repeating Child element.
using System.Xml;
using System.Xml.Linq;
static IEnumerable<XElement> StreamRootChildDoc(StringReader stringReader)
{
    using XmlReader reader = XmlReader.Create(stringReader);
    reader.MoveToContent();
    // Walk the document and yield each Child element as an XElement.
    while (true)
    {
        // If the current node is an element and named "Child"
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
        {
            // ReadFrom builds the element and advances the reader past its end tag,
            // so Read() must not be called again on this pass through the loop.
            if (XNode.ReadFrom(reader) is XElement el)
                yield return el;
        }
        else if (!reader.Read())
            break;
    }
}
string markup = """
                <Root>
                  <Child Key="01">
                    <GrandChild>aaa</GrandChild>
                  </Child>
                  <Child Key="02">
                    <GrandChild>bbb</GrandChild>
                  </Child>
                  <Child Key="03">
                    <GrandChild>ccc</GrandChild>
                  </Child>
                </Root>
                """;
IEnumerable<string> grandChildData =
    from el in StreamRootChildDoc(new StringReader(markup))
    where (int)el.Attribute("Key") > 1
    select (string)el.Element("GrandChild");
foreach (string str in grandChildData)
    Console.WriteLine(str);
Imports System.Xml
Module Module1
    Public Iterator Function StreamRootChildDoc(stringReader As IO.StringReader) As IEnumerable(Of XElement)
        Using reader As XmlReader = XmlReader.Create(stringReader)
            reader.MoveToContent()
            ' Walk the document and yield each Child element as an XElement.
            While True
                ' If the current node is an element and named "Child"
                If reader.NodeType = XmlNodeType.Element AndAlso reader.Name = "Child" Then
                    ' ReadFrom builds the element and advances the reader past its end tag,
                    ' so Read() must not be called again on this pass through the loop.
                    Dim el As XElement = TryCast(XNode.ReadFrom(reader), XElement)
                    If (el IsNot Nothing) Then
                        Yield el
                    End If
                ElseIf Not reader.Read() Then
                    Exit While
                End If
            End While
        End Using
    End Function
    Sub Main()
        Dim markup = "<Root>
                       <Child Key=""01"">
                         <GrandChild>aaa</GrandChild>
                       </Child>
                       <Child Key=""02"">
                         <GrandChild>bbb</GrandChild>
                       </Child>
                       <Child Key=""03"">
                         <GrandChild>ccc</GrandChild>
                       </Child>
                     </Root>"
        Dim grandChildData =
             From el In StreamRootChildDoc(New IO.StringReader(markup))
             Where CInt(el.@Key) > 1
             Select el.<GrandChild>.Value
        For Each s In grandChildData
            Console.WriteLine(s)
        Next
    End Sub
End Module
This example produces the following output:
bbb
ccc
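Because only one Child element is materialized at a time, and each one becomes eligible for garbage collection once the query moves on, the technique used in this example maintains a small memory footprint even for millions of Child elements.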
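To make the memory claim more concrete, here is a sketch that applies the same loop to a file instead of a string. The method name StreamChildren, the temporary file name, and the element count are assumptions made for the example, and the file is generated with XmlWriter only so that the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical file-based variant of the axis method shown earlier.
static IEnumerable<XElement> StreamChildren(string path)
{
    using XmlReader reader = XmlReader.Create(path);
    reader.MoveToContent();
    while (true)
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
        {
            if (XNode.ReadFrom(reader) is XElement el)
                yield return el;
        }
        else if (!reader.Read())
            break;
    }
}

// Write a test document element by element, without building it in memory.
string path = Path.Combine(Path.GetTempPath(), "stream-children-demo.xml");
const int childCount = 100_000;
using (XmlWriter writer = XmlWriter.Create(path))
{
    writer.WriteStartElement("Root");
    for (int i = 1; i <= childCount; i++)
    {
        writer.WriteStartElement("Child");
        writer.WriteAttributeString("Key", i.ToString());
        writer.WriteElementString("GrandChild", "value" + i);
        writer.WriteEndElement();
    }
    writer.WriteEndElement();
}

// Aggregate over the stream. Each XElement becomes unreachable as soon as
// the next one is requested, so it can be garbage collected.
long keySum = StreamChildren(path).Sum(el => (long)el.Attribute("Key"));
Console.WriteLine(keySum); // 5000050000 for childCount = 100_000

File.Delete(path);
```

The small footprint depends on the consumer as well as the axis method. Operators that buffer their input, such as ToList, OrderBy, or GroupBy, hold every streamed element in memory and give up the benefit.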