Developer‑friendly Document Parser SDK for extracting text, images, barcodes, metadata and tables from 50+ document and image formats.
Integrate high‑performance document parsing into your .NET, Java and Python applications with minimal coding effort.
Use flexible templates and advanced APIs to customize parsing rules and deliver clean, structured data outputs.
Powerful Document Parser SDK for extracting structured and unstructured data from PDFs, Office documents, images, emails and archives.
Extract textual information from various file formats
Retrieve visual content from diverse sources
Create custom templates and utilize them to parse specific information
PDF Forms are digital documents featuring fillable fields for user interaction
Typical GroupDocs.Parser use cases in C#, Java and Python
using System;
using GroupDocs.Parser;

// Create an instance of the Parser class, passing the file to parse
using (var parser = new Parser("source.pdf"))
{
    // Extract the text; the reader is null when text extraction is not supported
    using (var textReader = parser.GetText())
    {
        // Process the extracted text
        Console.WriteLine(textReader?.ReadToEnd());
    }
}
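import com.groupdocs.parser.Parser;
import com.groupdocs.parser.data.TextReader;

// Create an instance of the Parser class, passing the file to parse
try (Parser parser = new Parser("source.pdf"))
{
    // Extract the text; the reader is null when text extraction is not supported
    try (TextReader reader = parser.getText())
    {
        // Process the extracted text
        System.out.println(reader == null ? "" : reader.readToEnd());
    }
}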
from groupdocs.parser import Parser

# Create an instance of the Parser class, passing the file to parse
with Parser("source.pdf") as parser:
    # Extract the text
    text = parser.get_text()
    # Process the extracted text
    print(text)
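The C# and Java samples guard against a missing reader, while the Python sample prints the result directly. The following is a minimal sketch of the same guard in Python. It assumes that `get_text()` returns the text itself, as the sample suggests, and that it may return `None` when a format has no text to extract; neither is confirmed here. `extract_text` and `StubParser` are hypothetical names introduced only for illustration, and the stub stands in for `Parser` so the sketch runs without the SDK installed.

```python
from typing import Any, Callable


def extract_text(path: str, parser_factory: Callable[[str], Any]) -> str:
    """Return the text extracted from `path`, or "" when none is available."""
    with parser_factory(path) as parser:
        text = parser.get_text()
    # Assumption: get_text() may yield None for formats without text support,
    # mirroring the null checks in the C# and Java samples.
    return "" if text is None else str(text)


class StubParser:
    """Hypothetical stand-in for Parser so the sketch runs without the SDK."""

    def __init__(self, path: str) -> None:
        self.path = path

    def __enter__(self) -> "StubParser":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Returning None lets any exception propagate to the caller
        return None

    def get_text(self):
        # Pretend that only .pdf files expose text
        return "Hello from " + self.path if self.path.endswith(".pdf") else None


print(extract_text("source.pdf", StubParser))
print(repr(extract_text("photo.bin", StubParser)))
```

With the SDK installed, the same helper could presumably be called as `extract_text("source.pdf", Parser)` using the `Parser` class imported in the sample above. Taking the parser class as a parameter keeps the guard testable without real documents.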
GroupDocs.Parser Document Parser SDK enables parsing operations across Office documents, PDFs, images, emails, archives and more.
Discover the Key Metrics of Our Library’s Accomplishments
GroupDocs.Parser supports operations with more than 50 popular file formats.
The GroupDocs.Parser for .NET NuGet package has been downloaded more than 1,600,000 times.
GroupDocs.Parser for Java, with its powerful parsing features, has 18,000 downloads on Maven.
Well-known companies and individual developers alike choose GroupDocs products to build innovative solutions.
GroupDocs libraries are used by globally renowned brands.
GroupDocs.Parser library supports the following operating systems and frameworks:
Explore documentation, code samples, and community support to enhance your experience.
Answers to the most commonly asked questions.
Incorporate document parsing capabilities into any application using our cloud-based REST API and SDKs.
cURL commands for the RESTful document parser Cloud API to parse documents across a wide range of popular supported file formats.
Extract images, text and document information, or parse any document with a user-defined template, in your Microsoft .NET applications.
Cloud SDK for Java developers to parse documents and extract document information and data within Java-based applications.
Web-based document parser apps that let you extract data from more than 50 popular file formats directly in your browser.
Free online app to parse Word, Excel, PowerPoint, PDF & 50+ more document types.
Parse Word documents directly from your web browser to extract images, text or metadata.
Free PDF parsing app that works on any platform or device without any limitations.