Search Text In PDF Using HiQPdf Library

With the HiQPdf Library for .NET you can search for text in a PDF document using the SearchText() method of the PdfTextExtract class. The method's parameters let you choose whether the search matches case and whether it matches whole words only.
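To make the two options concrete, here is a minimal sketch of what "match case" and "match whole word" usually mean, applied to a plain string rather than a PDF. It does not use HiQPdf and is not the library's implementation; CountMatches is a hypothetical helper built on .NET regular expressions, and treating "whole word" as a regex word boundary is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class SearchOptionsDemo
{
    // Hypothetical helper, not part of HiQPdf: counts the occurrences of
    // 'term' in 'text' under the two search options.
    public static int CountMatches(string text, string term,
        bool matchCase, bool matchWholeWord)
    {
        // escape the term so it is matched literally, not as a pattern
        string pattern = Regex.Escape(term);

        // assumption: "whole word" means the match sits between word boundaries
        // (this works as expected only for terms that start and end with a
        // letter, digit or underscore)
        if (matchWholeWord)
            pattern = @"\b" + pattern + @"\b";

        // ignore case unless the caller asked for a case-sensitive match
        RegexOptions options = matchCase ? RegexOptions.None : RegexOptions.IgnoreCase;

        return Regex.Matches(text, pattern, options).Count;
    }
}
```

For the text "The PDF file lists pdf tools and PDFs." and the term "pdf", this sketch counts 3 matches with both options off, 1 with match case only, and 2 with match whole word only, because "PDFs" is no longer accepted.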

The C# code sample below, taken from an ASP.NET page, shows how to search for text in an existing PDF document. Each match is then highlighted with a semi-transparent yellow rectangle, and the modified document is sent to the browser.

C# Code Sample to Search and Highlight Text in PDF

// get the PDF file
string pdfFile = Server.MapPath("~") + @"\DemoFiles\Pdf\InputPdf.pdf";

// get the text to search
string textToSearch = textBoxTextToSearch.Text;

// create the PDF text extractor
PdfTextExtract pdfTextExtract = new PdfTextExtract();

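// get the range of pages to search; 0 is passed as the end page
// when the "to page" box is left empty
int fromPdfPageNumber = int.Parse(textBoxFromPage.Text);
int toPdfPageNumber = textBoxToPage.Text.Length > 0 ? int.Parse(textBoxToPage.Text) : 0;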

// search the text in PDF document
PdfTextSearchItem[] searchTextInstances = pdfTextExtract.SearchText(pdfFile, textToSearch,
            fromPdfPageNumber, toPdfPageNumber, checkBoxMatchCase.Checked, checkBoxMatchWholeWord.Checked);

// load the PDF file to highlight the searched text
PdfDocument pdfDocument = PdfDocument.FromFile(pdfFile);

// highlight the searched text in PDF document
foreach (PdfTextSearchItem searchTextInstance in searchTextInstances)
{
    PdfRectangle pdfRectangle = new PdfRectangle(searchTextInstance.BoundingRectangle);

    // set rectangle color and opacity
    pdfRectangle.BackColor = Color.Yellow;
    pdfRectangle.Opacity = 30;

    // highlight the text
    pdfDocument.Pages[searchTextInstance.PdfPageNumber - 1].Layout(pdfRectangle);
}

// write the modified PDF document
try
{
    // write the PDF document to a memory buffer
    byte[] pdfBuffer = pdfDocument.WriteToMemory();

    // inform the browser about the binary data format
    HttpContext.Current.Response.AddHeader("Content-Type", "application/pdf");

    // let the browser know how to open the PDF document and the file name
    HttpContext.Current.Response.AddHeader("Content-Disposition", String.Format("attachment; filename=SearchText.pdf; size={0}",
                pdfBuffer.Length.ToString()));

    // write the PDF buffer to HTTP response
    HttpContext.Current.Response.BinaryWrite(pdfBuffer);

    // call End() method of HTTP response to stop ASP.NET page processing
    HttpContext.Current.Response.End();
}
finally
{
    pdfDocument.Close();
}
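The sample reads the page range with int.Parse, which throws if a text box holds something that is not a number. A minimal sketch of more defensive parsing is shown here. PageRangeInput.TryParse is a hypothetical helper, not part of HiQPdf, and it assumes that page numbers are 1-based and that an empty "to page" box should produce 0, as the sample does.

```csharp
using System;

static class PageRangeInput
{
    // Hypothetical helper, not part of HiQPdf: parses the two text box values
    // and reports failure instead of throwing on invalid input.
    public static bool TryParse(string fromText, string toText,
        out int fromPage, out int toPage)
    {
        // 0 mirrors the value the sample passes when no end page is given
        toPage = 0;

        // the first page is required and assumed to be 1-based
        if (!int.TryParse(fromText, out fromPage) || fromPage < 1)
            return false;

        // the end page is optional; an empty box keeps toPage at 0
        if (string.IsNullOrWhiteSpace(toText))
            return true;

        // when given, the end page must be a number not below the first page
        return int.TryParse(toText, out toPage) && toPage >= fromPage;
    }
}
```

With such a helper the page could show a validation message when TryParse returns false and call SearchText() only when it returns true.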

You can find a live demo of searching and highlighting text in a PDF on the product website.
