Search & Index Documents via Java API
Build Java Applications to perform Text Search Manipulation in All Popular Document Formats.
GroupDocs.Search for Java
GroupDocs.Search for Java lets you build business applications that give end users powerful search capabilities. The Java API covers basic to advanced text search: create and merge multiple indexes, and use Simple, Boolean, Regular Expression (Regex), Fuzzy, and other query types to search through indexes quickly. Because GroupDocs.Search for Java supports all popular file formats, you can retrieve the information you need from files, documents, emails, and archives.
At A Glance
An overview of the Java Search API for document text retrieval.
- Create & Manage Indexes
- Merge Multiple Indexes
- Multi-Threading Async Indexing
- Compact Indexing
- Archived Files Indexing
- Fuzzy Search
- Synonym Search
- Email Search
- Handling of Homophonic Terms
- Searching Protected Files
- Wild Card
- Regular Expression (Regex)
- Faceted & Boolean
- Case Sensitive
Supported Operating Systems and Frameworks
- Java: Version 6 (1.6) and above
- Windows, Desktops and Servers
- Mac OS
Search API Supported File Formats
GroupDocs.Search for Java supports the following document file formats:
- Word: DOC, DOCX, DOCM, DOT, DOTX, DOTM, RTF, TXT
- Excel: XLS, XLSX, XLSM, XLT, XLTX, XLTM, XLSB, XLA, XLAM, CSV, TSV
- PowerPoint: PPT, PPTX, POT, POTX, PPS, PPSX, PPTM, PPSM, POTM
- Project: MPP
- Diagram: VSD, VSS
- Microsoft Compiled HTML: CHM
- OneNote: ONE
- Portable Document Format: PDF
- OpenDocument: ODT, OTT, OTS, ODS, ODP
- Email: PST, OST, MSG, EML, EMLX
- Web File Formats: XML, HTM, HTML, XHTML, MHT, MHTML, MD
- Audio: MP3, WAV
- Video: AVI, MOV, QT, FLV, ASF
- Images: BMP, GIF, JP2, PNG, WEBP, TIFF, EMF, WMF, JPG, PSD
- Other document formats: TORRENT, ZIP, DCM, DJVU, EPUB, FB2, DICOM
GroupDocs.Search for Java Features
Build Index on Disk or in Memory with Async Multithreading
View Index Creation & Update Progress
Selectively Skip Indexing for Specific Files & Skip Specific Words to Index Faster
Perform Import or Use List to Modify Characters during Indexing & Export to a File
Reload Index in Case of an Indexing Error & Alert User to Contradictory Settings
Index Status Notification regarding Latest Processed Files
Index Zipped Archives inside other ZIP Archives & Get List of Indexed Files in an Archive
Save Space with Compact Indexing & Index Password-Protected Documents
Document Text Extraction from Index or Source File
HTML Formatted Text Extraction to a File & Produce URL to Navigate Search Results in HTML
Add Arbitrary Additional Fields to each Document during Indexing
Configure Similarity Level for Fuzzy Search & Show Best Results
Smart Management of Typos through Fuzzy Search
Use Faceted & Boolean Search Simultaneously
Configure & Perform Synonyms Search & Smartly Deal with Homophonic Terms
Use Date Range & Case Sensitivity as Search Parameters
Make Index to Search & Browse Email Messages via Aspose.Email API
Use Search Phrases with Spell Check and Wild Cards & Skip Special Characters in Queries
Make Single Object Tree by Combining Multiple Queries
Divide Search in Smaller Chunks to Rapidly Search Huge Indexes
Index Documents from Streams and Data Structures
Set up Document Filtering in Search Results
Add English Synonyms to Default Synonym Dictionary
Enable Exact Number of Occurrences for each Found Word to Offer Alternative Word Suggestions in case of Misspelling
Add Text Attributes to Indexed Documents without Re-indexing
Perform Indexing and Searching Operations Based on Characters
Index Metadata of Non-Textual Document Formats
Indexing and Search Operation
GroupDocs.Search for Java uses indexing to collect, parse, and store document data so that searches are accurate and efficient. Searches are then run against these indexes.
- Create Index: Create an Index folder and add documents to it for indexing.
- Load Index: Load an existing Index.
- Add Documents to Index: Add documents to an existing Index asynchronously.
- Update Index: Update an existing Index whenever a document is modified, added, or deleted. This keeps search results up to date.
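To make the idea behind these operations concrete, here is a minimal conceptual sketch of an inverted index in plain Java. It is not the GroupDocs.Search API: the class `TinyIndex` and its methods are hypothetical names chosen for illustration, and the index lives in memory only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch only: a tiny in-memory inverted index, not the GroupDocs.Search API.
class TinyIndex {
    // term -> names of the documents that contain that term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Add a document, or re-index it if it is already present.
    void addOrUpdate(String docName, String text) {
        remove(docName); // drop stale postings so results stay up to date
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Remove every posting that points at a deleted document.
    void remove(String docName) {
        postings.values().forEach(docs -> docs.remove(docName));
        postings.values().removeIf(Set::isEmpty);
    }

    // Return the documents that contain a single term (case-insensitive).
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.addOrUpdate("a.txt", "The principal effect");
        index.addOrUpdate("b.txt", "No effect at all");
        index.addOrUpdate("a.txt", "Rewritten content"); // document was modified
        System.out.println(index.search("effect"));      // prints [b.txt]
    }
}
```

Re-indexing a modified document first removes its old postings, which is why the update step keeps search results current.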
Perform Wild Card Search - Java
// Creating index
Index index = new Index("c:\\MyIndex");

// Adding documents to index
index.addToIndex("c:\\MyDocuments");

// Searching for words 'affect' or 'effect' in a document with 'principal', 'principle', 'principles', or 'principally'
SearchResults results = index.search("?ffect & princip?(2~4)");
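The wildcard notation in the query above can be read as follows: `?` stands for exactly one arbitrary character, and `?(2~4)` stands for two to four arbitrary characters. The sketch below illustrates that reading by translating a single wildcard term into a `java.util.regex` pattern. It is an illustration only, not how GroupDocs.Search evaluates queries; `WildcardSketch`, `toRegex`, and `matches` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Conceptual sketch only: maps the wildcard notation from the sample query to a regex.
class WildcardSketch {
    // A bare "?" is one character; "?(m~n)" is m to n characters.
    private static final Pattern WILDCARD = Pattern.compile("\\?(?:\\((\\d+)~(\\d+)\\))?");

    static Pattern toRegex(String term) {
        StringBuilder regex = new StringBuilder();
        Matcher m = WILDCARD.matcher(term);
        int last = 0;
        while (m.find()) {
            // Quote the literal text between wildcards so it is matched verbatim.
            if (m.start() > last) {
                regex.append(Pattern.quote(term.substring(last, m.start())));
            }
            regex.append(m.group(1) == null
                    ? "."
                    : ".{" + m.group(1) + "," + m.group(2) + "}");
            last = m.end();
        }
        if (last < term.length()) {
            regex.append(Pattern.quote(term.substring(last)));
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True when the whole word matches the wildcard term.
    static boolean matches(String term, String word) {
        return toRegex(term).matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("?ffect", "affect"));                // true
        System.out.println(matches("princip?(2~4)", "principles"));     // true
        System.out.println(matches("princip?(2~4)", "principality"));   // false
    }
}
```

The last call returns false because "principality" has five characters after "princip", which is outside the 2 to 4 range.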
Merge Multiple Indices to Improve Search Efficiency
The GroupDocs.Search for Java API can merge multiple indexes into one common index. An index that is modified frequently accumulates several delta indexes, and searching across all of them slows performance. GroupDocs.Search for Java removes this bottleneck by merging the delta indexes into one common index that contains all of their information. The delta indexes themselves remain unchanged, while search efficiency improves noticeably. Several settings are available to tune the merge process.
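The merge described above can be pictured with a small plain-Java sketch. It assumes, purely for illustration, that each delta index is a map from a term to the set of documents containing it; `MergeSketch` and `merge` are hypothetical names and do not reflect the GroupDocs.Search API or its on-disk format.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch only: merging several delta indexes into one common index.
class MergeSketch {
    // Builds a new combined index; the delta maps are only read, never modified.
    static Map<String, Set<String>> merge(List<Map<String, Set<String>>> deltas) {
        Map<String, Set<String>> merged = new HashMap<>();
        for (Map<String, Set<String>> delta : deltas) {
            for (Map.Entry<String, Set<String>> e : delta.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> delta1 = new HashMap<>();
        delta1.put("effect", new TreeSet<>(Arrays.asList("a.txt")));

        Map<String, Set<String>> delta2 = new HashMap<>();
        delta2.put("effect", new TreeSet<>(Arrays.asList("b.txt")));
        delta2.put("principle", new TreeSet<>(Arrays.asList("b.txt")));

        Map<String, Set<String>> common = merge(Arrays.asList(delta1, delta2));
        System.out.println(common.get("effect")); // prints [a.txt, b.txt]
    }
}
```

A search then needs one lookup in the common index instead of one lookup per delta index.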
Recognize Search Queries of Different Keyboard Layout
GroupDocs.Search for Java recognizes search queries that were typed with the wrong keyboard layout active. It currently recognizes 88 languages and 164 different keyboard layouts.
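As a rough illustration of the idea, the sketch below repairs a query typed on QWERTY key positions while the Dvorak layout was active, by mapping each character back to the key that produced it. It covers a single layout pair chosen as an assumption for the example and says nothing about how GroupDocs.Search detects layouts; `LayoutSketch` and `dvorakToQwerty` are hypothetical names.

```java
// Conceptual sketch only: undoing a keyboard-layout mismatch for one layout pair.
class LayoutSketch {
    // The same physical keys, row by row, as labeled on each layout.
    private static final String QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,./";
    private static final String DVORAK = "',.pyfgcrl/=aoeuidhtns-;qjkxbmwvz";

    // Map each character produced under Dvorak back to the QWERTY key that was pressed.
    static String dvorakToQwerty(String typed) {
        StringBuilder out = new StringBuilder(typed.length());
        for (char c : typed.toCharArray()) {
            int key = DVORAK.indexOf(c);
            out.append(key >= 0 ? QWERTY.charAt(key) : c); // keep unmapped characters as-is
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Pressing the QWERTY keys s-e-a-r-c-h with Dvorak active yields "o.apjd".
        System.out.println(dvorakToQwerty("o.apjd")); // prints search
    }
}
```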
Search Using Morphological Word Form
With GroupDocs.Search for Java, you can search for different forms of a word. You can search for the singular and plural forms of a noun, or for all forms of a verb, including the root, third-person singular, simple past, and others. For non-English languages, you can configure custom word forms.
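One simple way to picture word-form search is a lookup table that expands a query term into every registered form before searching. The sketch below assumes such a table; `WordFormsSketch`, `addForms`, and `expand` are hypothetical names, and the example is not the GroupDocs.Search word-form mechanism.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Conceptual sketch only: expanding a query term to all of its configured word forms.
class WordFormsSketch {
    // Every registered form points at the full group it belongs to.
    private final Map<String, Set<String>> groups = new HashMap<>();

    // Register a group of related forms, for example the forms of one verb.
    void addForms(String... forms) {
        Set<String> group = new LinkedHashSet<>(Arrays.asList(forms));
        for (String form : forms) {
            groups.put(form, group);
        }
    }

    // Return all forms to search for; unknown terms expand to themselves only.
    Set<String> expand(String term) {
        return groups.getOrDefault(term, Collections.singleton(term));
    }

    public static void main(String[] args) {
        WordFormsSketch forms = new WordFormsSketch();
        forms.addForms("go", "goes", "went", "gone", "going");
        forms.addForms("mouse", "mice");
        System.out.println(forms.expand("went")); // prints [go, goes, went, gone, going]
        System.out.println(forms.expand("mice")); // prints [mouse, mice]
    }
}
```

Each expanded form can then be looked up in the index, so a query for "went" also finds documents that contain "goes" or "gone".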