Traversing the Document

This article describes the techniques you can use to navigate through a Rich Edit document in order to retrieve information from its elements or perform actions on them.

RichEditControl provides means to navigate through a Document Model and Document Layout. Depending on the structure traversed, you can retrieve different information about the encountered elements. The Document Layout allows you to get only the location of the document elements, while navigating through the Document Model provides access to the properties and content of the entities.

The traversing techniques are based on the Visitor and Iterator objects. The Iterator moves through the document and obtains every element. The Visitor is used to perform operations when visiting an element of a specific type. With this Visitor-Iterator pattern, you can access all elements and perform actions on them using a single class. For details, refer to the Visitor pattern article.
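The interplay between the two objects can be illustrated with a minimal, self-contained sketch. All types in it (IElement, IVisitor, TextRun, ParagraphEnd, MarkupVisitor, PatternDemo) are hypothetical stand-ins invented for this illustration and are not part of the RichEditControl API; a standard .NET enumerator plays the role of the Iterator.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in types; they are NOT part of the RichEditControl API.
interface IElement { void Accept(IVisitor visitor); }

interface IVisitor
{
    void Visit(TextRun run);
    void Visit(ParagraphEnd end);
}

sealed class TextRun : IElement
{
    public readonly string Text;
    public readonly bool Bold;
    public TextRun(string text, bool bold) { Text = text; Bold = bold; }
    // Double dispatch: the element selects the Visit overload for its own type.
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

sealed class ParagraphEnd : IElement
{
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

// The Visitor holds the per-type operations and the accumulated result.
sealed class MarkupVisitor : IVisitor
{
    readonly StringBuilder buffer = new StringBuilder();
    public string Text { get { return buffer.ToString(); } }
    public void Visit(TextRun run) { buffer.Append(run.Bold ? "*" + run.Text + "*" : run.Text); }
    public void Visit(ParagraphEnd end) { buffer.Append('\n'); }
}

static class PatternDemo
{
    public static string Run()
    {
        var elements = new List<IElement>
        {
            new TextRun("plain ", false),
            new TextRun("bold", true),
            new ParagraphEnd()
        };
        var visitor = new MarkupVisitor();
        // The Iterator only walks the sequence; the Visitor decides what to do.
        IEnumerator<IElement> iterator = elements.GetEnumerator();
        while (iterator.MoveNext())
            iterator.Current.Accept(visitor);
        return visitor.Text;
    }
}
```

Keeping traversal in the iterator and the per-type logic in the visitor means a new operation only requires a new visitor class; the element types and the loop stay unchanged.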

Suppose you want to process the document text and display it in another text editing control (a MemoEdit in this example) so that bold text is enclosed in asterisks and the rest of the text is shown without any formatting.

To accomplish this, do the following:

  1. Declare an abstract class that implements the IDocumentVisitor interface. This interface provides the necessary Visit methods that take all possible document element types as parameters. Also, use the constructor of this class to initialize a new instance of the System.Text.StringBuilder class, which will store the text processed by the Visitor.

    Tip

    The DocumentVisitorBase class implements the IDocumentVisitor interface by default. You can use this class instead of creating your own.

  2. Create a descendant of the newly created class. To perform the required actions when visiting a DocumentText element, override the corresponding IDocumentVisitor.Visit method, as shown in the code sample below.

    MyVisitor is a custom class that derives from DocumentVisitorBase, and therefore implements the IDocumentVisitor interface. Its Visit overrides process DocumentText and DocumentParagraphEnd elements as follows:

    • Enclose the bold text in asterisks
    • Return all characters without formatting
    • Replace the paragraph ends with newline symbols

    Other document elements are skipped.

    using System.Text;
    using DevExpress.XtraRichEdit.API.Native;

    public class MyVisitor : DocumentVisitorBase
    {
        // Accumulates the processed text.
        readonly StringBuilder buffer;
        public MyVisitor() { this.buffer = new StringBuilder(); }
        protected StringBuilder Buffer { get { return buffer; } }
        public string Text { get { return Buffer.ToString(); } }

        // Wrap bold runs in asterisks; append all other text unchanged.
        public override void Visit(DocumentText text) {
            string marker = text.TextProperties.FontBold ? "**" : "";
            Buffer.Append(marker);
            Buffer.Append(text.Text);
            Buffer.Append(marker);
        }
        // Replace each paragraph end with a newline.
        public override void Visit(DocumentParagraphEnd paragraphEnd) {
            Buffer.AppendLine();
        }
    }
    

    Important

    You can only browse and modify the existing content of the visited elements. Adding or removing the content is not permitted.

  3. Create a new object of the Visitor class.

    
    MyVisitor visitor = new MyVisitor();
    
  4. Create an Iterator object, represented by the DocumentIterator class.

    
    DocumentIterator iterator = new DocumentIterator(richEditControl1.Document);
    
  5. Call the DocumentIterator.MoveNext method in a loop until it returns false, that is, until the iterator reaches the end of the document. Within the loop, give the Visitor access to each document entity so that it can perform the declared operations: call the IDocumentElement.Accept method on the current document element (the DocumentIterator.Current property). As a result, the Visitor executes the related Visit method for every iterated element.

    
    while (iterator.MoveNext())
        iterator.Current.Accept(visitor);
    

    Tip

    You can also iterate over elements of a specific type only. To do that, pass the element type as a parameter to the DocumentIterator.MoveNext method.

  6. Write the processed text stored in the Text property to the Memo Edit control.

    
    memoEdit1.Text = visitor.Text;
    

As a result, the text will be converted as illustrated below.

(Image: DocumentIterator example)

When obtaining an element, the Iterator returns the information about it as read-only sub-properties of an object returned by the DocumentIterator.Current property.

By default, the Iterator also returns text elements that contain hidden characters. The ReadOnlyTextPropertiesBase.Hidden property indicates whether the iterated entity contains them. To make the Iterator skip hidden text, pass true to the Iterator constructor as the visibleTextOnly parameter. Non-textual elements are always returned, regardless of their visibility.
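A minimal sketch of skipping hidden text is shown below. It reuses the MyVisitor class from the steps above; the VisibleTextDemo helper and its GetVisibleText method are hypothetical names introduced for illustration, and the position of visibleTextOnly as the second constructor argument is an assumption to verify against the API reference.

```csharp
using DevExpress.XtraRichEdit;
using DevExpress.XtraRichEdit.API.Native;

// Hypothetical helper; not part of the RichEditControl API.
public static class VisibleTextDemo
{
    public static string GetVisibleText(RichEditControl richEditControl)
    {
        // Assumption: visibleTextOnly is passed as the second constructor argument.
        DocumentIterator iterator = new DocumentIterator(richEditControl.Document, true);

        // MyVisitor is the class declared earlier in this article.
        MyVisitor visitor = new MyVisitor();
        while (iterator.MoveNext())
            iterator.Current.Accept(visitor);

        // Hidden text runs are not visited, so they never reach the buffer.
        return visitor.Text;
    }
}
```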

Some elements, such as DocumentTextBox or DocumentCommentElement, can contain other elements. To get the content of these entities, create a new visitor and iterator instance in the corresponding Visit method. Use the DocumentTextBox.GetIterator or DocumentCommentElement.GetIterator method to give the new iterator access to the element content. To skip hidden content inside these elements, pass true as the method’s visibleTextOnly parameter.


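public override void Visit(DocumentTextBox textBox)
{
    // Iterate over the text box content, skipping hidden text.
    DocumentIterator textBoxIterator = textBox.GetIterator(true);
    MyVisitor textBoxVisitor = new MyVisitor();
    while (textBoxIterator.MoveNext())
        textBoxIterator.Current.Accept(textBoxVisitor);
    // Assumption: this override lives in MyVisitor, so the nested text is merged into its Buffer.
    Buffer.Append(textBoxVisitor.Text);
}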

Note

The iterator obtains the elements that are located in the main body of the document model. The header or footer entities are not accessible to it.

The document layout can be traversed with the use of the Visitor and Iterator as well, but here, these two instances operate separately.

Traversing the document layout using the Visitor (a LayoutVisitor descendant) allows you to perform required actions while visiting a specific element. It follows the same algorithm as described above, with one difference: in response to a call to the visitor’s LayoutVisitor.Visit method, the LayoutElement itself calls the Accept method, not the Iterator. For detailed information about Visitor implementation, refer to the Layout API article.
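The sketch below shows one way such a layout Visitor might look. CountingLayoutVisitor and LayoutVisitorDemo are hypothetical names; the protected VisitPage and VisitPlainTextBox overrides and the DocumentLayout.GetPageCount and GetPage calls are assumptions about the layout API that should be checked against the Layout API article before use.

```csharp
using DevExpress.XtraRichEdit;
using DevExpress.XtraRichEdit.API.Layout;

// Hypothetical visitor that counts pages and plain-text boxes.
public class CountingLayoutVisitor : LayoutVisitor
{
    public int PageCount { get; private set; }
    public int TextBoxCount { get; private set; }

    // Assumption: LayoutVisitor exposes a protected virtual method per element type.
    protected override void VisitPage(LayoutPage page)
    {
        PageCount++;
        base.VisitPage(page); // the base call continues into child elements
    }

    protected override void VisitPlainTextBox(PlainTextBox plainTextBox)
    {
        TextBoxCount++;
        base.VisitPlainTextBox(plainTextBox);
    }
}

public static class LayoutVisitorDemo
{
    public static CountingLayoutVisitor CountElements(RichEditControl richEditControl)
    {
        CountingLayoutVisitor visitor = new CountingLayoutVisitor();
        // Assumption: pages are retrieved by index from the control's DocumentLayout.
        int pageCount = richEditControl.DocumentLayout.GetPageCount();
        for (int i = 0; i < pageCount; i++)
            visitor.Visit(richEditControl.DocumentLayout.GetPage(i)); // no Iterator involved
        return visitor;
    }
}
```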

Navigating the document using the Iterator (a LayoutIterator object) allows you to step through every LayoutElement and obtain its properties. The Iterator can move both backwards and forwards by means of the LayoutIterator.MovePrevious and LayoutIterator.MoveNext methods. All characteristics of the obtained element, such as LayoutElement.Type, LayoutElement.Bounds and LayoutElement.Parent, are accessible through the LayoutIterator.Current property. Because the LayoutIterator provides no Visit methods, you cannot define an operation to run when an element is visited.
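As a rough illustration, a forward pass over the layout could be sketched as follows. LayoutIteratorDemo and DescribeLayout are hypothetical names, and constructing the LayoutIterator from the control's DocumentLayout is an assumption rather than something stated in this article.

```csharp
using System.Text;
using DevExpress.XtraRichEdit;
using DevExpress.XtraRichEdit.API.Layout;

// Hypothetical helper; not part of the RichEditControl API.
public static class LayoutIteratorDemo
{
    public static string DescribeLayout(RichEditControl richEditControl)
    {
        // Assumption: the iterator is created from the control's DocumentLayout.
        LayoutIterator iterator = new LayoutIterator(richEditControl.DocumentLayout);
        StringBuilder report = new StringBuilder();

        // Move forward until MoveNext reports that no further element is available.
        while (iterator.MoveNext())
        {
            LayoutElement element = iterator.Current;
            // Only read the element's properties; there is no Visit callback here.
            report.AppendLine(element.Type + ": " + element.Bounds);
        }
        return report.ToString();
    }
}
```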

Unlike the LayoutVisitor, which traverses the elements directly, the LayoutIterator navigates through the levels of the DocumentLayout object, from LayoutLevel.Page to LayoutLevel.Box, and gets the elements related to the current level. You can traverse only a specific level by passing the required LayoutLevel value as the LayoutIterator.MoveNext method parameter. Passing true as the method’s allowTreeTraversal parameter makes the Iterator start from the specified level and move further.

The code snippet at the end of this section calls different LayoutIterator.MoveNext overloads to illustrate different techniques of tree navigation. Checking the LayoutIterator.IsLayoutValid value is required because the user can modify the document, and therefore its layout, between calls.

The LayoutIterator.IsStart and LayoutIterator.IsEnd properties indicate whether the iterator reached the start or end of the range for which it has been created, respectively.

bool result = false;
string s = string.Empty;

// Create a new iterator if the document has been changed and the layout is updated.
// CreateNewIterator, barEditItemRgLevel and cmbLayoutLevel are defined elsewhere in the application.
if (!layoutIterator.IsLayoutValid) CreateNewIterator();

switch (barEditItemRgLevel.EditValue.ToString())
{
    case "Any":
        // Move to the next element, whatever its level.
        result = layoutIterator.MoveNext();
        break;
    case "Level":
        // Move to the next element of the selected level.
        result = layoutIterator.MoveNext((LayoutLevel)cmbLayoutLevel.EditValue);
        break;
    case "LevelWithinParent":
        // Same level, with allowTreeTraversal set to false.
        result = layoutIterator.MoveNext((LayoutLevel)cmbLayoutLevel.EditValue, false);
        break;
}

// Report why the iterator could not move.
if (!result)
{
    s = "Cannot move.";
    if (layoutIterator.IsStart) s += "\nStart is reached";
    else if (layoutIterator.IsEnd) s += "\nEnd is reached";
    MessageBox.Show(s);
}