Class AbstractSearchParser

java.lang.Object
io.goobi.viewer.model.iiif.search.parser.AbstractSearchParser
Direct Known Subclasses:
AltoSearchParser, SolrSearchParser

public abstract class AbstractSearchParser extends Object
Abstract base class for IIIF Search API parsers that extract text matches from different content sources.
Author:
Florian Alpers
  • Constructor Details

    • AbstractSearchParser

      public AbstractSearchParser()
  • Method Details

    • getPrecedingText

      public static String getPrecedingText(String text, int hitStartIndex, int maxLength)
      Returns the text immediately preceding a hit, limited to maxLength characters.
      Parameters:
      text - full text to extract context from
      hitStartIndex - character index where the hit begins
      maxLength - maximum number of characters to return
      Returns:
      the text excerpt preceding the hit position, up to maxLength characters
    • getSucceedingText

      public static String getSucceedingText(String text, int hitEndIndex, int maxLength)
      Returns the text immediately following a hit, limited to maxLength characters.
      Parameters:
      text - full text to extract context from
      hitEndIndex - character index immediately after the hit ends
      maxLength - maximum number of characters to return
      Returns:
      the text excerpt following the hit position, up to maxLength characters
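The two context helpers above can be illustrated with a minimal, self-contained sketch. It is not the viewer's implementation: the class HitContextSketch and its methods are hypothetical stand-ins that follow only the documented contract (at most maxLength characters before hitStartIndex or after hitEndIndex). Whether the real methods additionally trim to word boundaries or normalize whitespace is not stated here, so the sketch assumes plain character slicing.

```java
// Hypothetical stand-ins for getPrecedingText / getSucceedingText.
// Assumption: plain character slicing, clamped to the bounds of the text.
final class HitContextSketch {

    private HitContextSketch() {
    }

    // Up to maxLength characters ending right before hitStartIndex.
    static String precedingText(String text, int hitStartIndex, int maxLength) {
        int end = Math.max(0, Math.min(hitStartIndex, text.length()));
        int length = Math.min(Math.max(0, maxLength), end);
        return text.substring(end - length, end);
    }

    // Up to maxLength characters starting at hitEndIndex
    // (the index immediately after the hit).
    static String succeedingText(String text, int hitEndIndex, int maxLength) {
        int start = Math.max(0, Math.min(hitEndIndex, text.length()));
        int length = Math.min(Math.max(0, maxLength), text.length() - start);
        return text.substring(start, start + length);
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps";
        int hitStart = text.indexOf("brown");      // 10
        int hitEnd = hitStart + "brown".length();  // 15

        System.out.println("[" + precedingText(text, hitStart, 6) + "]");  // prints "[quick ]"
        System.out.println("[" + succeedingText(text, hitEnd, 4) + "]");   // prints "[ fox]"
    }
}
```

Clamping the indices keeps the sketch from throwing when a hit sits at the very start or end of the text, in which case the corresponding context is simply shorter than maxLength or empty.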
    • getSingleWordRegex

      public static String getSingleWordRegex(String query)
      Builds a case-insensitive regular expression that matches a single word matching the given query.
      Parameters:
      query - search term or regex to match as a whole word
      Returns:
      a case-insensitive regex that matches a single word matching the given query regex
    • getContainedWordRegex

      public static String getContainedWordRegex(String query)
      Builds a regular expression that matches any text containing the given query as a single word.
      Parameters:
      query - search term or regex to find as a whole word
      Returns:
      a regex matching any text that contains the given query regex as a single word
    • getQueryRegex

      public static String getQueryRegex(String query)
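      Builds a case-insensitive regular expression for the given query, with '*' treated as a wildcard for any number of word characters.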
      Parameters:
      query - search query with optional '*' wildcard tokens
      Returns:
      a case-insensitive regex matching any word or sequence of words of the given query, where '*' matches any number of word characters
    • getAutoSuggestRegex

      public static String getAutoSuggestRegex(String query)
      Creates a regular expression that matches any text starting with the given query followed by an arbitrary number of word characters, ignoring case.
      Parameters:
      query - prefix string to match at the start of words
      Returns:
      the regular expression (?i){query}[\w\d-]*
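The documented pattern (?i){query}[\w\d-]* can be tried out with plain java.util.regex. The sketch below is an illustration only: the class AutoSuggestRegexSketch and its methods are hypothetical, and it assumes the query is inserted into the pattern as-is. Whether the real method escapes regex metacharacters in the query is not documented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical re-creation of the documented pattern (?i){query}[\w\d-]*.
// Assumption: the query is concatenated into the pattern without escaping.
final class AutoSuggestRegexSketch {

    private AutoSuggestRegexSketch() {
    }

    static String autoSuggestRegex(String query) {
        return "(?i)" + query + "[\\w\\d-]*";
    }

    // Collects every match of the auto-suggest pattern in the given text.
    static List<String> suggestions(String text, String query) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(autoSuggestRegex(query)).matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Goethe wrote to Goethe-Haus about gothic art";
        System.out.println(suggestions(text, "goe")); // prints "[Goethe, Goethe-Haus]"
    }
}
```

The hyphen in the character class is what lets "Goethe-Haus" come back as a single suggestion. As written, the pattern has no leading word boundary, so a query that occurs in the middle of a word would also produce a match starting at that position.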