Package io.goobi.viewer.controller
Class StringTools
java.lang.Object
io.goobi.viewer.controller.StringTools
StringTools class.
-
Field Summary
Modifier and type, field, and constant value:
static final String BACKSLASH_REPLACEMENT = "U005C"
static final String DEFAULT_ENCODING = "UTF-8"
static final String PERCENT_REPLACEMENT = "U0025"
static final String PIPE_REPLACEMENT = "U007C"
static final String PLUS_REPLACEMENT = "U0025"
static final String QUESTION_MARK_REPLACEMENT = "U003F"
static final String REGEX_BRACES = "\\{(\\w+)\\}"
static final String REGEX_PARENTESES_DATES = "\\([\\w|\\s|\\-|\\.|\\?]+\\)"
static final String REGEX_PARENTHESES = "\\([^()]*\\)"
static final String REGEX_QUOTATION_MARKS = "\"[^()]*?\""
static final String REGEX_WORDS = "[a-zäáàâöóòôüúùûëéèêßñ0123456789]+"
static final String SLASH_REPLACEMENT = "U002F"
-
Method Summary
Modifier and type, method, and description (where the source gives one). Entries without a parameter list or return type lost that part of the signature in this listing:
static String appendTrailingSlash(String path)
static boolean checkValueEmptyOrInverted(String value)
static String cleanUserGeneratedData(String data)
  Cleans a String of any malicious content such as script tags, line breaks and backtracking file paths.
static String convertStringEncoding(String string, String from, String to)
  Converts a String from one given encoding to the other.
static String convertToSingleWord(String text, int maxLength, String whitespaceReplacement)
static String decodeUrl
static String encodeUrl
  Escapes a URL for submitted form data.
static String encodeUrl
static String escapeCriticalUrlChracters(String value, boolean escapePercentCharacters)
static String escapeHtml(String text)
  Escapes the given string.
static String escapeHtmlChars(String str)
  Escapes special HTML characters in the given string.
static String escapeHtmlLtGt(String str)
  Escapes <> in the given string.
static String escapeQuotes
filterStringsViaRegex(List<String> values, String regex)
static String findBestMatch(String s, List<String> candidates, String language)
findFirstMatch(String text, String regex, int group)
  Finds the first String matching a regex within another string and returns it as an Optional.
static String generateHash(String myString)
  Creates a hash of the given String using SHA-256.
static String getCharset(String input)
getHierarchyForCollection(String collection, String split)
static int[] getIntegerRange(String inRange)
static int getLength
  Returns the length of the given string, or 0 if the string is null.
static String getMatch
static String intern
static boolean isImageUrl(String url)
static boolean isStringUrlEncoded(String s, String charset)
  Checks whether the given string already contains URL-encoded characters.
static String normalizeWebAnnotationCoordinates
  Normalizes WebAnnotation coordinates for rectangle rendering (x,y,w,h -> minX,minY,maxX,maxY).
parseInt
  Tries to parse the given string as an integer.
static String removeDiacriticalMarks
  Removes diacritical marks from each letter in the given String.
static String removeLineBreaks(String s, String replaceWith)
  Removes regular and HTML line breaks from the given String.
static String removeTrailingSlashes(String path)
static String renameIncompatibleCSSClasses
  Renames CSS classes that start with digits in the given HTML code, because Chrome ignores such classes.
static String replaceAllMatches
static String replaceCharacterVariants
static int sortByList(String v1, String v2, List<String> sorting)
static String stripJS
static String stripPatternBreakingChars
  Use this method to log user-controlled variables that may contain pattern-breaking characters such as line breaks and tabs.
static String unescapeCriticalUrlChracters
-
Field Details
-
REGEX_QUOTATION_MARKS
Constant REGEX_QUOTATION_MARKS="\"[^()]*?\"".
-
REGEX_PARENTHESES
Constant REGEX_PARENTHESES="\\([^()]*\\)".
-
REGEX_PARENTESES_DATES
Constant REGEX_PARENTESES_DATES="\\([\\w|\\s|\\-|\\.|\\?]+\\)".
-
REGEX_BRACES
Constant REGEX_BRACES="\\{(\\w+)\\}".
-
REGEX_WORDS
Constant REGEX_WORDS="[a-zäáàâöóòôüúùûëéèêßñ0123456789]+".
-
DEFAULT_ENCODING
Constant DEFAULT_ENCODING="UTF-8".
-
SLASH_REPLACEMENT
Constant SLASH_REPLACEMENT="U002F".
-
BACKSLASH_REPLACEMENT
Constant BACKSLASH_REPLACEMENT="U005C".
-
PIPE_REPLACEMENT
Constant PIPE_REPLACEMENT="U007C".
-
QUESTION_MARK_REPLACEMENT
Constant QUESTION_MARK_REPLACEMENT="U003F".
-
PERCENT_REPLACEMENT
Constant PERCENT_REPLACEMENT="U0025".
-
PLUS_REPLACEMENT
Constant PLUS_REPLACEMENT="U0025".
-
Method Details
-
encodeUrl
Escapes a URL for submitted form data. A space is encoded as '+'.
Parameters:
string - String to encode
Returns:
URL-encoded string
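The form-style encoding described here, with a space becoming '+', is also what the JDK's `URLEncoder` produces. The following is a minimal sketch under that assumption. The names `EncodeUrlSketch` and `encodeForm` are hypothetical, and this is not the StringTools implementation.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class EncodeUrlSketch {
    // Hypothetical stand-in for encodeUrl(String): form-style URL encoding in UTF-8.
    static String encodeForm(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The space becomes '+', the slash becomes a percent escape.
        System.out.println(encodeForm("a b/c")); // prints a+b%2Fc
    }
}
```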
-
encodeUrl
encodeUrl.
Parameters:
string - String to encode
escapeCriticalUrlCharacters - If true, slashes etc. will be manually escaped prior to URL encoding
Returns:
URL-encoded string
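One way to picture the "manual escaping prior to URL encoding" step is to swap each critical character for the placeholder constant documented in the field list, then encode the result. The sketch below assumes that mapping and that order. The class and method names are hypothetical, and the actual set of characters handled by StringTools may differ.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class EscapeThenEncodeSketch {
    // Values copied from the constants documented for StringTools.
    static final String SLASH_REPLACEMENT = "U002F";
    static final String BACKSLASH_REPLACEMENT = "U005C";
    static final String PIPE_REPLACEMENT = "U007C";
    static final String QUESTION_MARK_REPLACEMENT = "U003F";

    // Assumption: each critical character is replaced by its placeholder.
    static String escapeCritical(String value) {
        return value.replace("/", SLASH_REPLACEMENT)
                .replace("\\", BACKSLASH_REPLACEMENT)
                .replace("|", PIPE_REPLACEMENT)
                .replace("?", QUESTION_MARK_REPLACEMENT);
    }

    // Hypothetical stand-in for the two-argument encodeUrl overload.
    static String encode(String s, boolean escapeCriticalUrlCharacters) {
        String prepared = escapeCriticalUrlCharacters ? escapeCritical(s) : s;
        return URLEncoder.encode(prepared, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("a/b c?", true));  // prints aU002Fb+cU003F
        System.out.println(encode("a/b c?", false)); // prints a%2Fb+c%3F
    }
}
```

With the flag set, the placeholders survive URL encoding unchanged because they consist only of letters and digits.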
-
decodeUrl
decodeUrl.
-
findFirstMatch
Finds the first String matching a regex within another string and returns it as an Optional.
Parameters:
text - The String in which to search
regex - The regex to search for
group - an int.
Returns:
An Optional containing the first String within text matched by regex, or an empty Optional if no match was found
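The behaviour described above can be approximated with `java.util.regex`. This sketch assumes that `group` is the index of the capture group to return, which the source does not state. `FindFirstMatchSketch` and `firstMatch` are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindFirstMatchSketch {
    // Returns the requested group of the first match, or an empty Optional.
    static Optional<String> firstMatch(String text, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? Optional.ofNullable(m.group(group)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Same pattern as the documented REGEX_BRACES constant.
        String braces = "\\{(\\w+)\\}";
        System.out.println(firstMatch("Title {key} and {other}", braces, 1)); // prints Optional[key]
        System.out.println(firstMatch("no braces here", braces, 1));          // prints Optional.empty
    }
}
```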
-
findBestMatch
-
escapeHtmlChars
Escapes special HTML characters in the given string.
-
escapeHtmlLtGt
Escapes <> in the given string.
-
removeDiacriticalMarks
Removes diacritical marks from each letter in the given String.
Parameters:
s - a String object.
Returns:
String without diacritical marks
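A common JDK-only way to strip diacritical marks is to decompose the text with `java.text.Normalizer` and drop the combining marks. The sketch below uses that technique as an assumption. It is not confirmed to be how StringTools does it, and `DiacriticsSketch` and `stripMarks` are hypothetical names.

```java
import java.text.Normalizer;

final class DiacriticsSketch {
    // Decompose to NFD, then remove all combining marks (Unicode category M).
    static String stripMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripMarks("caf\u00e9")); // prints cafe
    }
}
```

Letters without a canonical decomposition, such as ß, pass through this approach unchanged.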
-
replaceCharacterVariants
-
removeLineBreaks
Removes regular and HTML line breaks from the given String.
-
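As an illustration of removeLineBreaks, the sketch below replaces both kinds of line break with a single regular expression. Which HTML variants count as a line break is an assumption here (`<br>`, `<br/>`, `<br />`, case-insensitive), and `LineBreakSketch` and `removeBreaks` are hypothetical names.

```java
import java.util.regex.Matcher;

final class LineBreakSketch {
    // Replaces \r\n, \r, \n and <br> variants with the given replacement.
    static String removeBreaks(String s, String replaceWith) {
        if (s == null) {
            return null; // null handling is an assumption
        }
        // quoteReplacement keeps '$' and '\' in replaceWith literal.
        return s.replaceAll("(?i)\\r\\n|\\r|\\n|<br\\s*/?>", Matcher.quoteReplacement(replaceWith));
    }

    public static void main(String[] args) {
        System.out.println(removeBreaks("a\r\nb<br/>c<BR>d", " ")); // prints a b c d
    }
}
```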
stripJS
stripJS.
Parameters:
s -
Returns:
String sans any script-tag blocks
-
stripPatternBreakingChars
Use this method to log user-controlled variables that may contain pattern-breaking characters such as line breaks and tabs.
Parameters:
s - String to clean
Returns:
String sans any logger pattern-breaking characters
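To make the purpose concrete: a value containing a line break could otherwise forge an extra log line. The sketch below neutralises line breaks and tabs before logging. The replacement character `_` and the names `LogSanitizeSketch` and `sanitize` are assumptions, not taken from the source.

```java
final class LogSanitizeSketch {
    // Replaces carriage returns, line feeds and tabs with an underscore.
    static String sanitize(String s) {
        if (s == null) {
            return null; // null handling is an assumption
        }
        return s.replaceAll("[\\r\\n\\t]", "_");
    }

    public static void main(String[] args) {
        String userInput = "admin\nINFO forged entry";
        // The sanitized value stays on a single log line.
        System.out.println("Login attempt for user: " + sanitize(userInput));
    }
}
```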
-
getLength
Returns the length of the given string, or 0 if the string is null.
Parameters:
s - a String object.
Returns:
the length of the string if it exists, 0 otherwise
-
escapeHtml
Escapes the given string. Uses StringEscapeUtils.escapeHtml4(String) and additionally converts all line breaks (\r\n, \r, \n) to HTML line breaks (<br/>).
Parameters:
text - the text to escape
Returns:
the escaped string
-
escapeQuotes
escapeQuotes.
-
getCharset
Parameters:
input -
Returns:
Charset of the given input
-
convertStringEncoding
Converts a String from one given encoding to the other.
Parameters:
string - The string to convert.
from - Source encoding.
to - Destination encoding.
Returns:
The converted string.
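A typical use of such a conversion is repairing text that was decoded with the wrong charset. The sketch below shows one plausible reading: re-encode the string to bytes using `from`, then decode those bytes using `to`. That interpretation, and the names `ConvertEncodingSketch` and `convert`, are assumptions.

```java
import java.nio.charset.Charset;

final class ConvertEncodingSketch {
    // Re-encodes the string with the source charset, then decodes with the target charset.
    static String convert(String s, String from, String to) {
        byte[] bytes = s.getBytes(Charset.forName(from));
        return new String(bytes, Charset.forName(to));
    }

    public static void main(String[] args) {
        // "\u00c3\u00a9" is what the UTF-8 bytes of "\u00e9" look like when misread as ISO-8859-1.
        String garbled = "\u00c3\u00a9";
        String repaired = convert(garbled, "ISO-8859-1", "UTF-8");
        System.out.println(repaired.equals("\u00e9")); // prints true
    }
}
```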
-
isStringUrlEncoded
public static boolean isStringUrlEncoded(String s, String charset) throws UnsupportedEncodingException
Checks whether the given string already contains URL-encoded characters.
Parameters:
s - String to check
charset - Charset for URL decoding
Returns:
true if decoded string differs from original; false otherwise
Throws:
UnsupportedEncodingException
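The documented return value ("true if decoded string differs from original") suggests a decode-and-compare check. The sketch below implements that reading with the JDK's `URLDecoder`. `UrlEncodedCheckSketch` and `looksUrlEncoded` are hypothetical names, and the real method may handle edge cases differently.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UrlEncodedCheckSketch {
    // A string counts as URL-encoded if decoding changes it.
    static boolean looksUrlEncoded(String s, String charset) throws UnsupportedEncodingException {
        return !URLDecoder.decode(s, charset).equals(s);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(looksUrlEncoded("a%20b", "UTF-8")); // prints true
        System.out.println(looksUrlEncoded("abc", "UTF-8"));   // prints false
    }
}
```

Under this approach a plain '+' also counts as encoded, because `URLDecoder` turns it into a space.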
-
escapeCriticalUrlChracters
escapeCriticalUrlChracters.
-
unescapeCriticalUrlChracters
unescapeCriticalUrlChracters.
-
isImageUrl
isImageUrl.
Parameters:
url - a String object.
Returns:
true if this is an image URL; false otherwise
-
renameIncompatibleCSSClasses
Renames CSS classes that start with digits in the given HTML code, because Chrome ignores such classes.
Parameters:
html - The HTML to fix
Returns:
Same HTML document but with Chrome-compatible CSS class names
-
getHierarchyForCollection
getHierarchyForCollection.
-
normalizeWebAnnotationCoordinates
Normalizes WebAnnotation coordinates for rectangle rendering (x,y,w,h -> minX,minY,maxX,maxY).
Parameters:
coords - a String object.
Returns:
Legacy format coordinates
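The arithmetic behind the x,y,w,h -> minX,minY,maxX,maxY conversion is maxX = x + w and maxY = y + h. The sketch below shows it for a plain comma-separated input. The exact input and output string formats used by StringTools are not given in the source, so both are assumptions, as are the names `RectCoordsSketch` and `toMinMax`.

```java
final class RectCoordsSketch {
    // Converts "x,y,w,h" into "minX,minY,maxX,maxY" (string formats are assumed).
    static String toMinMax(String xywh) {
        String[] parts = xywh.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected four comma-separated values: " + xywh);
        }
        int x = Integer.parseInt(parts[0].trim());
        int y = Integer.parseInt(parts[1].trim());
        int w = Integer.parseInt(parts[2].trim());
        int h = Integer.parseInt(parts[3].trim());
        return x + "," + y + "," + (x + w) + "," + (y + h);
    }

    public static void main(String[] args) {
        System.out.println(toMinMax("10,20,30,40")); // prints 10,20,40,60
    }
}
```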
-
getMatch
getMatch.
-
intern
intern.
-
generateHash
Creates a hash of the given String using SHA-256.
Parameters:
myString - a String object.
Returns:
generated hash
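For reference, a SHA-256 hash of a string can be computed with the JDK's `MessageDigest`. The sketch below assumes UTF-8 input bytes and lowercase hexadecimal output. The source does not say which output format generateHash uses, and `Sha256Sketch` and `sha256Hex` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha256Sketch {
    // Hashes the UTF-8 bytes of the input and renders the digest as lowercase hex.
    static String sha256Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```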
-
appendTrailingSlash
Parameters:
path -
Returns:
Given path with a trailing slash, if not yet present
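The documented contract ("with a trailing slash, if not yet present") can be sketched in a few lines. Returning null for a null path is an assumption, and `TrailingSlashSketch` and `withTrailingSlash` are hypothetical names.

```java
final class TrailingSlashSketch {
    // Appends '/' unless the path already ends with one.
    static String withTrailingSlash(String path) {
        if (path == null || path.endsWith("/")) {
            return path;
        }
        return path + "/";
    }

    public static void main(String[] args) {
        System.out.println(withTrailingSlash("/opt/data"));  // prints /opt/data/
        System.out.println(withTrailingSlash("/opt/data/")); // prints /opt/data/
    }
}
```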
-
removeTrailingSlashes
-
checkValueEmptyOrInverted
Parameters:
value -
Returns:
true if value is null, empty or starts with 0x1; false otherwise
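The return condition reads as three checks in sequence. The sketch below mirrors them literally and interprets "starts with 0x1" as the first character being the control character U+0001, which is an assumption. `EmptyOrInvertedSketch` and `emptyOrInverted` are hypothetical names.

```java
final class EmptyOrInvertedSketch {
    // True for null, for the empty string, and for strings whose first char is U+0001.
    static boolean emptyOrInverted(String value) {
        return value == null || value.isEmpty() || value.charAt(0) == 0x1;
    }

    public static void main(String[] args) {
        System.out.println(emptyOrInverted(""));            // prints true
        System.out.println(emptyOrInverted("\u0001abc"));   // prints true
        System.out.println(emptyOrInverted("abc"));         // prints false
    }
}
```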
-
filterStringsViaRegex
Parameters:
values - All values to check
regex -
Returns:
List of values that match regex
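Filtering a list by a regular expression can be sketched with a stream. Whether "match" means a full match or a partial one is not stated in the source. This sketch assumes a full match via `Matcher.matches()`, and `RegexFilterSketch` and `filter` are hypothetical names.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class RegexFilterSketch {
    // Keeps only the values that match the regex in full (assumption).
    static List<String> filter(List<String> values, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return values.stream()
                .filter(v -> pattern.matcher(v).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(filter(List.of("a1", "b", "c22"), "[a-z]\\d+")); // prints [a1, c22]
    }
}
```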
-
parseInt
Tries to parse the given string as an integer.
Parameters:
s - the string to parse
Returns:
An Optional containing the parsed int. If the string is blank or cannot be parsed to an integer, an empty Optional is returned
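The contract above (empty Optional for blank or unparsable input) maps directly onto `Integer.parseInt` with a guarded catch. This is a minimal sketch of that contract, not the StringTools code. `ParseIntSketch` and `tryParse` are hypothetical names, and treating null like a blank string is an assumption.

```java
import java.util.Optional;

final class ParseIntSketch {
    // Returns the parsed int, or an empty Optional for null, blank or non-numeric input.
    static Optional<Integer> tryParse(String s) {
        if (s == null || s.isBlank()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("42"));  // prints Optional[42]
        System.out.println(tryParse("  "));  // prints Optional.empty
        System.out.println(tryParse("4x2")); // prints Optional.empty
    }
}
```

Callers can then supply a fallback without a try/catch of their own, for example `tryParse(input).orElse(0)`.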
-
getIntegerRange
-
cleanUserGeneratedData
Cleans a String of any malicious content such as script tags, line breaks and backtracking file paths.
Parameters:
data -
Returns:
a cleaned-up string which can be safely used
-
sortByList
-
convertToSingleWord
-
replaceAllMatches
-