Text Content (Text Buffer)
The basis of a sub-document is its text content. This content is stored in the sub-document's text buffer, which can be accessed on the client through the SubDocument.text property. The buffer is a separate logical stream of text that contains all textual parts of the sub-document, such as paragraphs, lists, and tables. All sub-document parts are bound to specific positions within the text buffer. The text buffer uses a plain one-dimensional coordinate system to identify character locations: a character's position is its zero-based offset from the first character in the buffer.
All entities of sub-document parts have their own symbol marks. In particular, these are:
- Simple text - for a continuous set of characters;
- End marks - for paragraphs;
- Break marks - for lines, columns, tables, pages, sections;
- Specific marks - for spaces, tabulations, inline pictures, fields, etc.
The text buffer contains contiguous pieces of text (text characters and specific symbol marks) with no additional text markup. Some examples:
- If RichEdit's main sub-document displays a sentence of two words with different font formatting ("bold text") that ends with a paragraph mark, then within the sub-document's text buffer this content is represented as a string of 10 characters (the text buffer length is 10). This string consists of the following four elements:
- the "bold" plain text with specific bold formatting (the first 4 characters),
- an interpunct symbol (·) that marks a word-separating whitespace (1 character),
- the "text" plain text (4 characters),
- the paragraph end mark "¶" (1 character).
Within the text buffer, the first text run starts at position 0, the whitespace mark is at position 4, the second text run starts at position 5, and the paragraph mark occupies the last position, 9.
- If a sub-document contains only an inline picture and a paragraph mark, the text buffer does not contain any real text. It contains just two specific mark symbols: one for the inline picture and one for the paragraph end. The text buffer's length is 2.
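The position arithmetic in the two examples above can be sketched as follows. This is a minimal illustrative model, not the RichEdit API: the `Run` and `PlacedRun` types and the `placeRuns` and `bufferLength` helpers are hypothetical names introduced here, and the model assumes that every mark occupies exactly one character, as the examples state.

```typescript
// Hypothetical model of a text buffer as a sequence of runs; not part of RichEdit.
type RunKind = "text" | "space" | "paragraphEnd" | "inlinePicture";

interface Run {
  kind: RunKind;
  length: number; // number of characters the run occupies in the buffer
}

interface PlacedRun extends Run {
  start: number; // zero-based position of the run's first character
}

// Give each run its start position by accumulating the lengths of the runs before it.
function placeRuns(runs: Run[]): PlacedRun[] {
  let position = 0;
  return runs.map((run) => {
    const placed: PlacedRun = { kind: run.kind, length: run.length, start: position };
    position += run.length;
    return placed;
  });
}

// The buffer length is the sum of all run lengths.
function bufferLength(runs: Run[]): number {
  return runs.reduce((total, run) => total + run.length, 0);
}

// First example: "bold", a space mark, "text", and a paragraph end mark.
const sentence: Run[] = [
  { kind: "text", length: 4 },
  { kind: "space", length: 1 },
  { kind: "text", length: 4 },
  { kind: "paragraphEnd", length: 1 },
];

// Second example: an inline picture mark and a paragraph end mark.
const pictureOnly: Run[] = [
  { kind: "inlinePicture", length: 1 },
  { kind: "paragraphEnd", length: 1 },
];

console.log(placeRuns(sentence).map((run) => run.start)); // start positions 0, 4, 5, 9
console.log(bufferLength(sentence)); // 10
console.log(bufferLength(pictureOnly)); // 2
```

Because formatting is not stored in the buffer, the model carries no font information: the bold run and the regular run differ only in their positions.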
Show/Hide Mark Symbols
RichEdit has two modes for displaying formatting marks on the screen.
Hide formatting marks
By default, formatting marks (paragraph marks, dots for spaces, arrows for tabulations, text formatted as hidden, and so on) are not displayed in the document text layout. In this mode, hidden text is not part of the document layout at all. Spaces and tabulations remain in the layout, but they are rendered as blank spaces or offsets of the required width rather than as marks.
Show formatting marks
Formatting marks can be toggled on to display every non-printing character, which helps when proofreading the document layout. In this mode, the document text layout contains the text (including text formatted as hidden) and specific marks for all spaces, tabulations, element breaks (such as section breaks or page breaks), and the like.
The display mode of formatting marks can be changed by pressing the "Ctrl+Shift+8" key combination.
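The difference between the two display modes can be illustrated with a small sketch. It is a simplified model under stated assumptions, not RichEdit's layout engine: the `Piece` type, the `MARKS` table, and the `renderLayout` function are hypothetical, and the characters used to store spaces, tabulations, and paragraph ends (`" "`, `"\t"`, `"\n"`) are chosen only for illustration.

```typescript
// Hypothetical layout model; the stored characters are assumptions, not RichEdit internals.
interface Piece {
  text: string;
  hidden?: boolean; // true for text formatted as hidden
}

// Mark symbols shown in place of non-printing characters when marks are visible.
const MARKS: Record<string, string> = {
  " ": "·", // space mark
  "\t": "→", // tabulation mark
  "\n": "¶", // paragraph end mark
};

function renderLayout(pieces: Piece[], showMarks: boolean): string {
  let out = "";
  for (const piece of pieces) {
    // With marks hidden, text formatted as hidden is absent from the layout.
    if (piece.hidden && !showMarks) continue;
    for (const ch of piece.text.split("")) {
      const mark = MARKS[ch];
      // With marks visible, non-printing characters are replaced by their marks.
      out += showMarks && mark !== undefined ? mark : ch;
    }
  }
  return out;
}

const sample: Piece[] = [
  { text: "bold text" },
  { text: " secret", hidden: true },
  { text: "\n" },
];

console.log(JSON.stringify(renderLayout(sample, false))); // "bold text\n"
console.log(renderLayout(sample, true)); // bold·text·secret¶
```

The underlying pieces are identical in both calls. Only the rendering changes, which matches the description above: toggling the marks affects the layout, not the stored content.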
Client commands provide a set of methods that can be used to insert and delete text and special symbols in the following notation:
To toggle the visibility of formatting marks programmatically, use the RichEditCommands.showHiddenSymbols command in the following notation:
A sub-document object (SubDocument) exposes the following text buffer related properties, which can be used in the notation given below:
| Property | Member | Description |
| text | SubDocument.text | Gets the document's textual representation. |
| length | SubDocument.length | Gets the character length of the document. |
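As a rough illustration of how the two properties might be used together, the sketch below reads an interval of characters by buffer position. The `SubDocumentLike` interface is a hypothetical stand-in that mirrors only the two properties listed above, the `readInterval` helper is not part of RichEdit, and the stub's content (including the use of `"\n"` for the paragraph mark) is an assumption made for the example.

```typescript
// Structural stand-in for the two documented properties; not the real SubDocument type.
interface SubDocumentLike {
  text: string;
  length: number;
}

// Return `count` characters starting at a zero-based buffer position.
// Throws a RangeError when the interval falls outside the buffer.
function readInterval(doc: SubDocumentLike, start: number, count: number): string {
  if (start < 0 || count < 0 || start + count > doc.length) {
    throw new RangeError(
      `Interval [${start}, ${start + count}) is outside the buffer 0..${doc.length}`
    );
  }
  return doc.text.slice(start, start + count);
}

// Stub for the "bold text" example: 9 visible characters plus one paragraph mark.
const stub: SubDocumentLike = { text: "bold text\n", length: 10 };

console.log(readInterval(stub, 0, 4)); // bold
console.log(readInterval(stub, 5, 4)); // text
```

In a page that hosts the control, an object exposing the same two properties could be passed instead of the stub. Checking the interval against `length` first keeps a bad position from silently returning a truncated string.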