.NET API to Extract Document Data

Extract images, raw or formatted text and metadata from documents, spreadsheets, presentations, emails & archives from within .NET apps.



GroupDocs.Parser for .NET is a text, metadata, and image extraction API for business applications built with C#, ASP.NET, and other .NET technologies. It extracts raw, formatted, and structured text, as well as metadata, from files in supported formats. Your applications can also use it to parse password-protected documents in popular formats, such as Word processing documents, Excel spreadsheets, PowerPoint presentations, OneNote notebooks, PDF files, and ZIP archives.

GroupDocs.Parser for .NET Features

Extracting Text from a Document

Extracting text from a document with the GroupDocs.Parser for .NET API takes only a few lines of code:

using System;
using System.IO;
using GroupDocs.Parser;

// Create an instance of the Parser class for the document
using (Parser parser = new Parser("sample.docx"))
{
  // Extract the document text into a reader
  using (TextReader reader = parser.GetText())
  {
    // Print the text from the document
    // If text extraction isn't supported, reader is null
    Console.WriteLine(reader == null ? "Text extraction isn't supported." : reader.ReadToEnd());
  }
}
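
The overview also mentions password-protected documents and metadata extraction. The following is a minimal sketch of how the same pattern might extend to both. It assumes that the library's LoadOptions class accepts a password in its constructor, that Parser.GetMetadata() returns a collection of MetadataItem objects with Name and Value properties (or null when metadata extraction isn't supported), and that a wrong password raises InvalidPasswordException. The file name and password are hypothetical; check the API reference for your installed version before relying on these calls.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Exceptions;
using GroupDocs.Parser.Options;

class ProtectedDocumentExample
{
    static void Main()
    {
        // Hypothetical file name and password, used for illustration only
        const string filePath = "protected.docx";
        const string password = "123456";

        try
        {
            // Assumption: LoadOptions takes the document password as a constructor argument
            using (Parser parser = new Parser(filePath, new LoadOptions(password)))
            {
                // Extract raw text; reader is null if text extraction isn't supported
                using (TextReader reader = parser.GetText())
                {
                    Console.WriteLine(reader == null
                        ? "Text extraction isn't supported."
                        : reader.ReadToEnd());
                }

                // Assumption: GetMetadata() returns null if metadata extraction isn't supported
                IEnumerable<MetadataItem> metadata = parser.GetMetadata();
                if (metadata == null)
                {
                    Console.WriteLine("Metadata extraction isn't supported.");
                    return;
                }

                // Print each metadata item as "name: value"
                foreach (MetadataItem item in metadata)
                {
                    Console.WriteLine("{0}: {1}", item.Name, item.Value);
                }
            }
        }
        catch (InvalidPasswordException)
        {
            // Assumption: thrown when the password is missing or incorrect
            Console.WriteLine("Invalid password.");
        }
    }
}
```

Both extraction calls check for null instead of assuming that the format supports the feature, which mirrors the null check in the text extraction example above.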

Support and Learning Resources

GroupDocs.Parser offers document parsing APIs for other popular development environments
