.NET Framework 4.5.2+
.NET Standard 2.0+

PdfDocumentProcessor.GetText(PdfDocumentArea) Method

Retrieves the text found in the specified document area.

Namespace: DevExpress.Pdf

Assembly: DevExpress.Docs.v21.1.dll

Declaration

public string GetText(
    PdfDocumentArea area
)

Parameters

Name Type Description
area PdfDocumentArea A PdfDocumentArea object that specifies the page number and the page rectangle to extract text from.

Returns

Type Description
String The text retrieved from the specified area.

Remarks

The overloaded GetText method uses the page coordinate system. Refer to the following help topic for more details: Coordinate Systems.

To extract text without clipping the content to the crop box, call a GetText overload that accepts a PdfTextExtractionOptions object.
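The following is a minimal sketch of that scenario, not an authoritative sample. It assumes that a GetText(PdfDocumentArea, PdfTextExtractionOptions) overload is available and that PdfTextExtractionOptions exposes a ClipToCropBox property; verify both against the API reference for your version. The file name and the rectangle are placeholders.

```csharp
using System;
using DevExpress.Pdf;

class Program
{
    static void Main()
    {
        using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
        {
            // Placeholder file name; replace it with your own document.
            processor.LoadDocument("TextExtraction.pdf");
            PdfPage page = processor.Document.Pages[0];

            // Rectangle in page coordinates: left, bottom, right, top.
            PdfRectangle rectangle = new PdfRectangle(
                page.CropBox.Left, page.CropBox.Bottom,
                page.CropBox.Right, page.CropBox.Top);

            // The first argument is the page number (1 is the first page).
            PdfDocumentArea area = new PdfDocumentArea(1, rectangle);

            // Assumption: ClipToCropBox = false disables clipping to the crop box.
            PdfTextExtractionOptions options = new PdfTextExtractionOptions
            {
                ClipToCropBox = false
            };

            // Assumption: this overload takes the area and the options.
            string text = processor.GetText(area, options);
            Console.WriteLine(text);
        }
    }
}
```

With clipping disabled, the result may include text that lies outside the visible page area, so compare it with the output of the single-parameter overload before relying on it.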

The code sample below retrieves text from a specific part of the document.

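using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
{
    // Load the document and get its first page.
    processor.LoadDocument("TextExtraction.pdf");
    PdfPage page = processor.Document.Pages[0];

    // Define the target rectangle in page coordinates (left, bottom, right, top).
    PdfRectangle pdfRectangle = new PdfRectangle(page.CropBox.Left / 3, page.CropBox.Bottom, page.CropBox.Right / 3, page.CropBox.Top);
    // Create a document area on page 1 (the first page) bounded by the rectangle.
    PdfDocumentArea pageArea = new PdfDocumentArea(1, pdfRectangle);

    // Extract the text located in the area and print it.
    string pageText = processor.GetText(pageArea);
    Console.WriteLine(pageText);
}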