
PdfDocumentProcessor.GetText(PdfDocumentPosition, PdfDocumentPosition, PdfTextExtractionOptions) Method

Retrieves the text located between the specified document positions, using the specified extraction options.

Namespace: DevExpress.Pdf

Assembly: DevExpress.Docs.v21.1.dll

Declaration

public string GetText(
    PdfDocumentPosition startPosition,
    PdfDocumentPosition endPosition,
    PdfTextExtractionOptions options
)

Parameters

Name            Type                        Description
startPosition   PdfDocumentPosition         The area’s start position.
endPosition     PdfDocumentPosition         The area’s end position.
options         PdfTextExtractionOptions    A PdfTextExtractionOptions object that contains extraction options.

Returns

Type      Description
String    The text obtained from the specified area.

Remarks

The GetText method uses the page coordinate system. Refer to the following help topic for more details: Coordinate Systems.
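Positions are often measured from the top-left corner of a page, so a small conversion helper can be useful when building the points passed to GetText. The sketch below assumes that the page coordinate system follows the standard PDF convention (origin at the bottom-left corner, y axis pointing up, units of 1/72 inch); verify this against the Coordinate Systems topic. It also ignores page rotation and any crop box offset. The PageCoordinates class and its FromTopLeftInches method are hypothetical names introduced for illustration and are not part of the DevExpress API.

```csharp
using System;

public static class PageCoordinates
{
    // Assumption: one page unit equals 1/72 inch (standard PDF user space).
    public const double UnitsPerInch = 72.0;

    // Converts a point measured in inches from the page's top-left corner
    // (y pointing down) to page coordinates (origin at bottom-left, y pointing up).
    // pageHeight is expressed in page units, for example 792 for a US Letter page.
    public static (double X, double Y) FromTopLeftInches(
        double xInches, double yInches, double pageHeight)
    {
        double x = xInches * UnitsPerInch;
        double y = pageHeight - yInches * UnitsPerInch;
        return (x, y);
    }
}
```

For example, a point one inch from the left edge and one inch from the top edge of a 792-unit-tall page maps to (72, 720). The resulting values could then be passed to new PdfPoint(x, y) when constructing a PdfDocumentPosition.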

If there is no text between the specified positions, this method returns text that is nearest to these positions.

The code sample below retrieves the content located between two positions on the first page:

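using System;
using DevExpress.Pdf;
// ...

using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
{
    // Load the source document.
    processor.LoadDocument("TextExtraction.pdf");

    // Define the start and end positions on page 1, in page coordinates.
    PdfDocumentPosition startPosition = new PdfDocumentPosition(1, new PdfPoint(0, 0));
    PdfDocumentPosition endPosition = new PdfDocumentPosition(1, new PdfPoint(500, 500));

    // Extract the text between the two positions; do not clip to the page's crop box.
    PdfTextExtractionOptions options = new PdfTextExtractionOptions { ClipToCropBox = false };
    string pageText = processor.GetText(startPosition, endPosition, options);
    Console.WriteLine(pageText);
}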