
PdfDocumentProcessor.GetText(PdfDocumentArea, PdfTextExtractionOptions) Method

Retrieves text from the specified document area, using the specified extraction options.

Namespace: DevExpress.Pdf

Assembly: DevExpress.Docs.v21.1.dll

Declaration

public string GetText(
    PdfDocumentArea area,
    PdfTextExtractionOptions options
)

Parameters

Name     Type                      Description
area     PdfDocumentArea           The document area from which the content should be extracted.
options  PdfTextExtractionOptions  A PdfTextExtractionOptions object that contains extraction options.

Returns

Type    Description
String  The text obtained from the specified area.

Remarks

The GetText method uses the page coordinate system. Refer to the following help topic for more details: Coordinate Systems.

Set the PdfTextExtractionOptions.ClipToCropBox property to false to extract content without clipping it to the page's crop box.

The code sample below retrieves document content from the specified area:

using System;
using DevExpress.Pdf;

using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
{
    // Load the document and get its first page.
    processor.LoadDocument("TextExtraction.pdf");
    PdfPage page = processor.Document.Pages[0];

    // Define the area in page coordinates: left, bottom, right, top.
    PdfRectangle pdfRectangle = new PdfRectangle(page.CropBox.Left / 3, page.CropBox.Bottom, page.CropBox.Right / 3, page.CropBox.Top);
    // Create a document area on page 1 (the page obtained above).
    PdfDocumentArea pageArea = new PdfDocumentArea(1, pdfRectangle);

    // Extract text from the area without clipping it to the crop box.
    string pageText = processor.GetText(pageArea, new PdfTextExtractionOptions { ClipToCropBox = false });
    Console.WriteLine(pageText);
}
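
To see what ClipToCropBox changes, it can help to run the same extraction twice over an area larger than the crop box. The following is a minimal sketch, not a definitive implementation. It assumes that PdfPage exposes a MediaBox property alongside CropBox, that the page coordinate system has its origin at the bottom-left corner with the y-axis pointing up, and that "TextExtraction.pdf" (the file name from the sample above) contains text outside the crop box of its first page. If the media box and crop box of the page are identical, both calls are expected to return the same text.

```csharp
using System;
using DevExpress.Pdf;

class ClipToCropBoxComparison
{
    static void Main()
    {
        using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
        {
            // File name reused from the sample above; replace it with your own document.
            processor.LoadDocument("TextExtraction.pdf");
            PdfPage page = processor.Document.Pages[0];

            // Assumption: MediaBox is available on PdfPage and is at least as large as CropBox.
            PdfRectangle mediaBox = page.MediaBox;

            // Build the area from the full media box, in left, bottom, right, top order.
            PdfRectangle fullPage = new PdfRectangle(mediaBox.Left, mediaBox.Bottom, mediaBox.Right, mediaBox.Top);
            PdfDocumentArea area = new PdfDocumentArea(1, fullPage);

            // Same area, two option sets: clipped to the crop box and unclipped.
            string clipped = processor.GetText(area, new PdfTextExtractionOptions { ClipToCropBox = true });
            string unclipped = processor.GetText(area, new PdfTextExtractionOptions { ClipToCropBox = false });

            // Compare lengths to check whether any text lies outside the crop box.
            Console.WriteLine("Clipped length:   " + clipped.Length);
            Console.WriteLine("Unclipped length: " + unclipped.Length);
        }
    }
}
```

A larger unclipped length suggests that the page carries text outside its crop box, which is the case in which setting ClipToCropBox to false matters.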