Java Parser API to Extract Data

Java API to parse & extract images and text with metadata from documents, presentations, archives & emails.


Download Free Trial

GroupDocs.Parser for Java is a text, image and metadata extractor API, supporting more than 50 popular document types to help building business applications with features of parsing raw, structured & formatted text. It also supports parsing documents using predefined templates and allows extracting complex data from invoices and other typical documents with speed and accuracy. GroupDocs.Parser for Java enables you to extract text and metadata from password protected files of all popular formats including Word processing documents, Excel spreadsheets, PowerPoint presentations, OneNote, PDF files and ZIP archives.

GroupDocs.Parser for Java Features

Get Text with Plain Text or HTML Formatters

With GroupDocs.Parser for Java, you can apply various formatters to the Text and HTML. You can pull text with Plain Text Formatter for both Simple and ASCII. You can also get Text with HTML Formatter and apply formatting to paragraph, hyperlink, font, headings, lists and tables.

Support and Learning Resources

GroupDocs.Parser offers document viewing APIs for other popular development environments

Back to top
 English