Document Parsing Solution

Robust API for data extraction from various file formats.

Parse documents with minimal coding effort.

Customize parsing results.

GroupDocs.Parser at a glance

API for data parsing across PDF, Word, Excel and more

  • Extract text

    Extract textual information from various file formats

  • Extract images

    Retrieve visual content from diverse sources

  • Parse data by templates

    Create custom templates and use them to parse specific information

  • Parse PDF Forms

    Extract field values from PDF Forms, digital documents featuring fillable fields for user interaction

GroupDocs.Parser code samples

Typical GroupDocs.Parser use cases in C# and Java

How to extract text from PDF documents

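The GroupDocs.Parser API lets you extract text from a document in a few steps: create a Parser for the file, request a text reader, and read the text from it.

C#

using System;
using GroupDocs.Parser;

// Create an instance of the Parser class, passing the path of the file to parse
using (var parser = new Parser("source.pdf"))
{
    // Extract the text from the document
    using (var textReader = parser.GetText())
    {
        // Process the extracted text (the null-conditional call guards against a missing reader)
        Console.WriteLine(textReader?.ReadToEnd());
    }
}

Java

import com.groupdocs.parser.Parser;
import com.groupdocs.parser.data.TextReader;

// Create an instance of the Parser class, passing the path of the file to parse
try (Parser parser = new Parser("source.pdf"))
{
    // Extract the text from the document
    try (TextReader reader = parser.getText())
    {
        // Process the extracted text (print an empty string if no reader is returned)
        System.out.println(reader == null
                ? ""
                : reader.readToEnd());
    }
}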
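
The samples above leave "Process the extracted text" open. As one possible illustration of such a step, the sketch below takes text that has already been extracted and pulls labelled values out of it with `java.util.regex`. It is plain Java and not part of the GroupDocs.Parser API: the class `ExtractedTextFields`, the method `findField`, and the invoice-style sample text are all hypothetical, and a hard-coded string stands in for the result of `reader.readToEnd()`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for post-processing extracted text.
// Assumption: this is NOT a GroupDocs.Parser class; it uses only the Java standard library.
final class ExtractedTextFields {

    // Returns the value that follows "label:" on the same line, if there is one.
    static Optional<String> findField(String text, String label) {
        if (text == null) {
            return Optional.empty();
        }
        // Pattern.quote keeps special characters in the label literal.
        // [ \t]* (rather than \s*) keeps the match from running onto the next line.
        Pattern pattern = Pattern.compile(
                "^[ \\t]*" + Pattern.quote(label) + "[ \\t]*:[ \\t]*(.+?)[ \\t]*$",
                Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by reader.readToEnd() (sample data, invented).
        String text = "Invoice No: INV-1042\nDate: 2024-03-15\nTotal: 99.50\n";

        System.out.println(findField(text, "Invoice No").orElse("(not found)")); // INV-1042
        System.out.println(findField(text, "Total").orElse("(not found)"));      // 99.50
        System.out.println(findField(text, "Customer").orElse("(not found)"));   // (not found)
    }
}
```

Returning `Optional<String>` keeps the "field is missing" case explicit, which mirrors the null check on the reader in the Java sample. For documents with a fixed layout, the library's own template parsing feature listed earlier may be a better fit than regular expressions.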

50+ file formats supported

GroupDocs.Parser enables parsing operations across various format families

GroupDocs.Parser achievements

Discover the Key Metrics of Our Library’s Accomplishments

  • 50+

    Supported formats

    GroupDocs.Parser supports operations with more than 50 popular file formats.

  • 1.6M+

    NuGet downloads

    The GroupDocs.Parser for .NET NuGet package has been downloaded more than 1,600,000 times.

  • 18k

    Maven downloads

    GroupDocs.Parser has 18,000 downloads on Maven.

  • 140+

    Happy customers

    Famous companies and individual developers alike choose GroupDocs products to build innovative solutions.

Our happy customers

GroupDocs libraries are used by renowned brands around the world.

Platform Independence

The GroupDocs.Parser library supports the following operating systems, frameworks, and development environments:

.NET

.NET Framework 4.6.2 or higher
.NET Core 2.0 or higher
.NET 6.0 or higher
Windows
Linux
macOS
Microsoft Visual Studio
JetBrains Rider
Microsoft Visual Studio Code
50+ file formats

Java

Java 8 or higher
Kotlin
Windows
Linux
macOS
IntelliJ IDEA
Eclipse
NetBeans
50+ file formats

Ready to get started?

Try GroupDocs.Parser features for free on your platform

Useful resources

Explore documentation, code samples, and community support to enhance your experience.

Frequently asked questions

Answers to the most commonly asked questions.

  • Does the GroupDocs.Parser library need any third-party software to process documents?
    No. GroupDocs.Parser does not require any external software, such as Adobe Acrobat or Microsoft Office, to be installed.
  • Can I try the GroupDocs.Parser library before purchasing it?
    Yes, you can try GroupDocs.Parser without buying a license. When installed without a license, the library works in trial mode: trial badges are added to the resulting document, and it is trimmed to the first 3 pages. To test GroupDocs.Parser without the trial limitations, you can request a 30-day temporary license.
  • What licenses do you have?
    We offer several license types to fit the needs of individual developers and companies. License types depend on the number of developers, the number of developer site locations, and whether you need to deliver our SDK/API to your end customers. Alternatively, you can choose a Metered license, which is billed on monthly usage of the product.