.NET APIs to extract text from documents

Extract raw and formatted text from a range of formats, including emails, ZIP archives and legal documents that carry metadata, within any .NET application.

.NET Text extraction API

GroupDocs.Text for .NET


GroupDocs.Text for .NET is a document text extraction API. It extracts text and metadata from Microsoft Word, Excel and PowerPoint documents, email messages, container files that hold other files (such as ZIP archives), plain text files and HTML, without requiring any of the corresponding document readers to be installed. The API is designed for accurate and fast extraction. It also provides tools to detect encodings such as UTF32 LE, UTF32 BE, UTF16 LE, UTF16 BE and more.


Advanced Document Text Extraction API Features



Extracts raw and formatted text


Extracts metadata


Extracts structured text


Extracts highlighted text


Searches text in documents


Fetches text from containers that hold other files, such as ZIP archives


Gets formatted text from TXT, Markdown and HTML files


Support for encoding detection


Support for media type detectors

Text and Metadata Extractors

GroupDocs.Text for .NET provides a range of text extractor classes. The API also includes utility classes, such as encoding and media type detectors, for different files, for example:

  • EmailTextExtractor and EmailFormattedTextExtractor classes to extract text from email messages.
  • ExtractorFactory class for creating text extractors, formatted text extractors and containers.
  • EncodingDetector class for detecting the encoding of a file.
  • MediaTypeDetector abstract class, the base for custom media type detector classes that detect the media type of the corresponding file.

Similarly, the API provides classes for extracting metadata from various document types.
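
To give a feel for what encoding detection involves, here is a minimal sketch that guesses an encoding from the byte order mark (BOM) at the start of a file. It does not use GroupDocs.Text and is not how EncodingDetector is implemented; the class name BomSniffer and its method are hypothetical, chosen only for illustration. A real detector would also need a strategy for files that have no BOM.

```csharp
using System;

// Hypothetical helper, NOT part of GroupDocs.Text: guesses an encoding
// from the byte order mark (BOM) at the start of a buffer.
public static class BomSniffer
{
    // Returns a label such as "UTF-32 LE", or null when no BOM is present.
    public static string DetectFromBom(byte[] data)
    {
        if (data == null) throw new ArgumentNullException("data");

        // UTF-32 is tested before UTF-16 because the UTF-32 LE BOM
        // (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
        if (StartsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32 LE";
        if (StartsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32 BE";
        if (StartsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (StartsWith(data, 0xFF, 0xFE)) return "UTF-16 LE";
        if (StartsWith(data, 0xFE, 0xFF)) return "UTF-16 BE";
        return null; // no BOM found
    }

    // True when the buffer begins with the given byte sequence.
    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Calling BomSniffer.DetectFromBom on the first few bytes of a file is enough, since no BOM is longer than four bytes. Note that a UTF-16 LE file whose first character is NUL is indistinguishable from UTF-32 LE by BOM alone; this sketch resolves the ambiguity in favour of UTF-32 LE.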

Container Text Extractor

A container works with files that hold other documents, such as ZIP archives. The API can also be used to extract messages from mail containers such as OST files.
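
The container idea can be illustrated with the standard library alone. The sketch below walks a ZIP archive and reads each entry as text; it uses System.IO.Compression (available from .NET Framework 4.5, so newer than the minimum version this page lists) rather than the GroupDocs.Text container classes, and the class name ZipTextWalker is hypothetical. It also assumes, as a simplification, that every entry is UTF-8 text, whereas a real extractor would pick a suitable extractor per entry type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, NOT part of GroupDocs.Text: treats a ZIP archive
// as a container and returns the text of each file entry.
public static class ZipTextWalker
{
    // Maps each entry path to its content, decoded as UTF-8 (assumption).
    public static Dictionary<string, string> ReadAllText(Stream zipStream)
    {
        var result = new Dictionary<string, string>();

        // leaveOpen: true, so the caller keeps ownership of the stream.
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries end with '/' and carry no content.
                if (entry.FullName.EndsWith("/", StringComparison.Ordinal)) continue;

                using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
                {
                    result[entry.FullName] = reader.ReadToEnd();
                }
            }
        }
        return result;
    }
}
```

Passing a FileStream opened on an archive to ZipTextWalker.ReadAllText returns one dictionary item per file in the archive, keyed by the entry's path inside the ZIP.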

Installation and Usage

The API can be used on .NET Framework 2.0 and later and on Mono Framework 1.2 and later. The API files can be installed or downloaded in the following ways.

Support and Learning Resources