Package io.goobi.viewer.controller
Class StringTools
java.lang.Object
io.goobi.viewer.controller.StringTools
StringTools class.
-
Field Summary
Fields
Modifier and Type, Field, Description
static final String BACKSLASH_REPLACEMENT - Constant BACKSLASH_REPLACEMENT="U005C".
static final String DEFAULT_ENCODING - Constant DEFAULT_ENCODING="UTF-8".
static final String PERCENT_REPLACEMENT - Constant PERCENT_REPLACEMENT="U0025".
static final String PIPE_REPLACEMENT - Constant PIPE_REPLACEMENT="U007C".
static final String PLUS_REPLACEMENT - Constant PLUS_REPLACEMENT="U0025".
static final String QUESTION_MARK_REPLACEMENT - Constant QUESTION_MARK_REPLACEMENT="U003F".
static final String REGEX_BRACES - Constant REGEX_BRACES="\\{(\\w+)\\}".
static final String REGEX_PARENTESES_DATES - Constant REGEX_PARENTESES_DATES="\\([\\w|\\s|\\-|\\.|\\?]+\\)".
static final String REGEX_PARENTHESES - Constant REGEX_PARENTHESES="\\([^()]*\\)".
static final String REGEX_QUOTATION_MARKS - Constant REGEX_QUOTATION_MARKS="\"[^()]*?\"".
static final String REGEX_WORDS - Constant REGEX_WORDS="[a-zäáàâöóòôüúùûëéèêßñ0123456789]+".
static final String SLASH_REPLACEMENT - Constant SLASH_REPLACEMENT="U002F".
-
Method Summary
Modifier and Type, Method, Description
static String appendTrailingSlash(String path)
static boolean checkValueEmptyOrInverted(String value)
static String cleanUserGeneratedData(String data) - Clean a String from any malicious content like script tags, line breaks and backtracking filepaths.
static String convertStringEncoding(String string, String from, String to) - Converts a String from one given encoding to the other.
static String convertToSingleWord(String text, int maxLength, String whitespaceReplacement)
static String decodeUrl - decodeUrl.
static String encodeUrl - Escapes a URL for submitted form data.
static String encodeUrl - encodeUrl.
static String escapeCriticalUrlChracters(String value, boolean escapePercentCharacters) - escapeCriticalUrlChracters.
static String escapeHtml(String text) - Escapes the given string.
static String escapeHtmlChars(String str) - Escapes special HTML characters in the given string.
static String escapeHtmlLtGt(String str) - Escapes <> in the given string.
static String escapeQuotes - escapeQuotes.
filterStringsViaRegex(List<String> values, String regex)
static String findBestMatch(String s, List<String> candidates, String language)
findFirstMatch(String text, String regex, int group) - Finds the first String matching a regex within another string and returns it as an Optional.
static String generateHash(String myString) - Creates a hash of the given String using SHA-256.
static String getCharset(String input)
getHierarchyForCollection(String collection, String split) - getHierarchyForCollection.
static int[] getIntegerRange(String inRange)
static int getLength - Returns the length of the given string, or 0 if the string is null.
static String getMatch - getMatch.
static String intern - intern.
static boolean isImageUrl(String url) - isImageUrl.
static boolean isInteger
static boolean isStringUrlEncoded(String s, String charset) - Checks whether the given string already contains URL-encoded characters.
static String normalizeWebAnnotationCoordinates - Normalizes WebAnnotation coordinates for rectangle rendering (x,y,w,h -> minX,minY,maxX,maxY).
parseInt - Tries to parse the given string as an integer.
static String removeDiacriticalMarks - Removes diacritical marks from each letter in the given String.
static String removeLineBreaks(String s, String replaceWith) - Removes regular and HTML line breaks from the given String.
static String removeQuotations
static String removeTrailingSlashes(String path)
static String renameIncompatibleCSSClasses - Renames CSS classes that start with digits in the given HTML code, because Chrome ignores such classes.
static String replaceAllMatches
static String replaceCharacterVariants
static int sortByList(String v1, String v2, List<String> sorting)
static String stripJS - stripJS.
static String stripPatternBreakingChars - Use this method to log user-controlled variables that may contain pattern-breaking characters such as line breaks and tabs.
static String truncateText(String text, int maxLength) - Returns a truncated version of the given text which is no longer than maxLength.
static String unescapeCriticalUrlChracters - unescapeCriticalUrlChracters.
-
Field Details
-
REGEX_QUOTATION_MARKS
Constant REGEX_QUOTATION_MARKS="\"[^()]*?\"".
-
REGEX_PARENTHESES
Constant REGEX_PARENTHESES="\\([^()]*\\)".
-
REGEX_PARENTESES_DATES
Constant REGEX_PARENTESES_DATES="\\([\\w|\\s|\\-|\\.|\\?]+\\)".
-
REGEX_BRACES
Constant REGEX_BRACES="\\{(\\w+)\\}".
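To illustrate how a pattern like REGEX_BRACES can be applied, the following sketch collects the names inside {placeholder} tokens using java.util.regex. The class and method names (BracesSketch, placeholderNames) are hypothetical and not part of StringTools; only the pattern string is taken from the constant value shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of StringTools.
final class BracesSketch {

    // Same value as REGEX_BRACES: a run of word characters in curly braces, captured as group 1.
    static final Pattern BRACES = Pattern.compile("\\{(\\w+)\\}");

    /** Returns the names found inside {name} tokens, in order of appearance. */
    static List<String> placeholderNames(String template) {
        List<String> names = new ArrayList<>();
        Matcher m = BRACES.matcher(template);
        while (m.find()) {
            names.add(m.group(1)); // group 1 is the text between the braces
        }
        return names;
    }
}
```

Because the pattern uses \w+, tokens containing spaces or hyphens, such as {first name}, are not matched.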
-
REGEX_WORDS
Constant REGEX_WORDS="[a-zäáàâöóòôüúùûëéèêßñ0123456789]+".
-
DEFAULT_ENCODING
Constant DEFAULT_ENCODING="UTF-8".
-
SLASH_REPLACEMENT
Constant SLASH_REPLACEMENT="U002F".
-
BACKSLASH_REPLACEMENT
Constant BACKSLASH_REPLACEMENT="U005C".
-
PIPE_REPLACEMENT
Constant PIPE_REPLACEMENT="U007C".
-
QUESTION_MARK_REPLACEMENT
Constant QUESTION_MARK_REPLACEMENT="U003F".
-
PERCENT_REPLACEMENT
Constant PERCENT_REPLACEMENT="U0025".
-
PLUS_REPLACEMENT
Constant PLUS_REPLACEMENT="U0025".
-
-
Method Details
-
encodeUrl
Escapes a URL for submitted form data. A space is encoded as '+'.
- Parameters:
string - String to encode
- Returns:
URL-encoded string
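The "space becomes '+'" behaviour matches the application/x-www-form-urlencoded scheme implemented by java.net.URLEncoder. The sketch below shows that scheme with the JDK alone (Java 10 or later for the Charset overload). Whether encodeUrl delegates to URLEncoder is an assumption, and UrlEncodeSketch.encodeForm is a hypothetical name.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of StringTools.
final class UrlEncodeSketch {

    /** Form-encodes a string as UTF-8; spaces become '+', reserved characters become %XX. */
    static String encodeForm(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }
}
```

With this scheme, "a b" becomes "a+b" and "a/b" becomes "a%2Fb".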
-
encodeUrl
encodeUrl.
- Parameters:
string - String to encode
escapeCriticalUrlCharacters - If true, slashes etc. will be manually escaped prior to URL encoding
- Returns:
URL-encoded string
-
decodeUrl
decodeUrl.
-
findFirstMatch
Finds the first String matching a regex within another string and returns it as an Optional.
- Parameters:
text - The String in which to search
regex - The regex to search for
group - an int.
- Returns:
An Optional containing the first String within text matched by regex, or an empty Optional if no match was found
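A minimal sketch of the first-match-as-Optional idea, using only java.util.regex. It assumes that group is a capture group index (0 for the whole match), which the documentation does not state. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of StringTools.
final class FirstMatchSketch {

    /** Returns the given capture group of the first match, or an empty Optional. */
    static Optional<String> firstMatch(String text, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (m.find()) {
            // ofNullable: an optional group that did not participate yields null
            return Optional.ofNullable(m.group(group));
        }
        return Optional.empty();
    }
}
```

For example, firstMatch("id=42;id=7", "id=(\\d+)", 1) yields an Optional containing "42", and a text without digits yields an empty Optional.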
-
findBestMatch
-
escapeHtmlChars
Escapes special HTML characters in the given string.
-
escapeHtmlLtGt
Escapes <> in the given string.
-
removeDiacriticalMarks
Removes diacritical marks from each letter in the given String.
- Parameters:
s - a String object.
- Returns:
String without diacritical marks
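One common way to remove diacritical marks in Java is to decompose the text with java.text.Normalizer and then drop the combining marks. The sketch below shows that technique. It is not confirmed to be what removeDiacriticalMarks does internally, and the names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration; not part of StringTools.
final class DiacriticsSketch {

    /** Decomposes each letter (NFD) and removes all Unicode combining marks. */
    static String stripMarks(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

Letters that have no canonical decomposition, such as ß, pass through unchanged with this approach.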
-
replaceCharacterVariants
-
removeLineBreaks
Removes regular and HTML line breaks from the given String.
-
stripJS
stripJS.
- Parameters:
s -
- Returns:
String sans any script-tag blocks
-
stripPatternBreakingChars
Use this method to log user-controlled variables that may contain pattern-breaking characters such as line breaks and tabs.
- Parameters:
s - String to clean
- Returns:
String sans any logger pattern-breaking characters
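The concern here is log forging: a line break in user input can make one log entry look like two. A minimal sketch of the idea follows. The replacement character '_' and the names are assumptions, since the documentation does not say what the pattern-breaking characters are replaced with.

```java
// Hypothetical helper for illustration; not part of StringTools.
final class LogSafeSketch {

    /** Replaces carriage returns, line feeds and tabs with '_' (replacement is an assumption). */
    static String forLog(String s) {
        if (s == null) {
            return null;
        }
        return s.replaceAll("[\\r\\n\\t]", "_");
    }
}
```

A value such as "user\nINFO forged\tline" then ends up on a single log line as "user_INFO forged_line".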
-
getLength
Returns the length of the given string, or 0 if the string is null.
- Parameters:
s - a String object.
- Returns:
the length of the string if it exists, 0 otherwise
-
escapeHtml
Escapes the given string. Uses StringEscapeUtils.escapeHtml4(String) and additionally converts all line breaks (\r\n, \r, \n) to HTML line breaks (<br/>).
- Parameters:
text - the text to escape
- Returns:
the escaped string
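The order of the two steps matters: escaping has to happen before the line-break conversion, otherwise the inserted <br/> tags would themselves be escaped. The sketch below illustrates this with a deliberately small, hand-rolled stand-in for StringEscapeUtils.escapeHtml4 that covers only &, <, > and the double quote, so it needs no third-party dependency. All names are hypothetical.

```java
// Hypothetical helper for illustration; not part of StringTools.
final class HtmlBreakSketch {

    /** Escapes four basic HTML characters, then turns \r\n, \r and \n into <br/>. */
    static String escapeWithBreaks(String text) {
        if (text == null) {
            return null;
        }
        // Stand-in for escapeHtml4: '&' must be handled first to avoid double escaping.
        String escaped = text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
        // "\r\n" is listed first so a Windows line break yields a single <br/>.
        return escaped.replaceAll("\\r\\n|\\r|\\n", "<br/>");
    }
}
```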
-
escapeQuotes
escapeQuotes.
-
getCharset
- Parameters:
input -
- Returns:
Charset of the given input
-
convertStringEncoding
Converts a String from one given encoding to the other.
- Parameters:
string - The string to convert.
from - Source encoding.
to - Destination encoding.
- Returns:
The converted string.
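A typical use of such a conversion is repairing text whose bytes were decoded with the wrong charset. The sketch below shows one way to do that with the JDK: re-encode with the source charset, then decode with the destination charset. That convertStringEncoding works exactly like this is an assumption, and the names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of StringTools.
final class EncodingSketch {

    /** Encodes the string to bytes using 'from', then decodes those bytes using 'to'. */
    static String reinterpret(String s, String from, String to) {
        byte[] bytes = s.getBytes(Charset.forName(from));
        return new String(bytes, Charset.forName(to));
    }
}
```

For example, the two characters U+00C3 U+00A9 (a UTF-8 encoded é that was misread as ISO-8859-1) become the single character é when converted from "ISO-8859-1" to "UTF-8".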
-
isStringUrlEncoded
public static boolean isStringUrlEncoded(String s, String charset) throws UnsupportedEncodingException
Checks whether the given string already contains URL-encoded characters.
- Parameters:
s - String to check
charset - Charset for URL decoding
- Returns:
true if decoded string differs from original; false otherwise
- Throws:
UnsupportedEncodingException
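The documented return value suggests a decode-and-compare check. A minimal sketch of that check with java.net.URLDecoder follows. That the method uses URLDecoder is an assumption, and the names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration; not part of StringTools.
final class UrlEncodedSketch {

    /** Returns true if URL-decoding changes the string, i.e. it contained %XX escapes or '+'. */
    static boolean looksUrlEncoded(String s, String charset) throws UnsupportedEncodingException {
        // URLDecoder throws IllegalArgumentException on malformed escapes such as a lone '%'.
        return !URLDecoder.decode(s, charset).equals(s);
    }
}
```

One consequence of this approach is that a literal '+' also counts as encoded, because it decodes to a space.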
-
escapeCriticalUrlChracters
escapeCriticalUrlChracters.
-
unescapeCriticalUrlChracters
unescapeCriticalUrlChracters.
-
isImageUrl
isImageUrl.
- Parameters:
url - a String object.
- Returns:
true if this is an image URL; false otherwise
-
renameIncompatibleCSSClasses
Renames CSS classes that start with digits in the given HTML code, because Chrome ignores such classes.
- Parameters:
html - The HTML to fix
- Returns:
Same HTML document but with Chrome-compatible CSS class names
-
getHierarchyForCollection
getHierarchyForCollection.
-
normalizeWebAnnotationCoordinates
Normalizes WebAnnotation coordinates for rectangle rendering (x,y,w,h -> minX,minY,maxX,maxY).
- Parameters:
coords - a String object.
- Returns:
Legacy format coordinates
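The x,y,w,h -> minX,minY,maxX,maxY conversion is simple arithmetic: maxX = x + w and maxY = y + h. The sketch below works it through for a plain comma-separated string of four integers. The exact input format accepted by normalizeWebAnnotationCoordinates (for example an "xywh=" prefix) is not documented here, so the format and the names are assumptions.

```java
// Hypothetical helper for illustration; not part of StringTools.
final class CoordsSketch {

    /** Converts "x,y,w,h" to "minX,minY,maxX,maxY" (input format is an assumption). */
    static String toMinMax(String xywh) {
        String[] parts = xywh.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected four comma-separated values: " + xywh);
        }
        int x = Integer.parseInt(parts[0].trim());
        int y = Integer.parseInt(parts[1].trim());
        int w = Integer.parseInt(parts[2].trim());
        int h = Integer.parseInt(parts[3].trim());
        // The far corner is the origin plus the width and height.
        return x + "," + y + "," + (x + w) + "," + (y + h);
    }
}
```

For instance, "10,20,30,40" becomes "10,20,40,60".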
-
getMatch
getMatch.
-
intern
intern.
-
generateHash
Creates a hash of the given String using SHA-256.
- Parameters:
myString - a String object.
- Returns:
generated hash
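For reference, a SHA-256 hash can be computed with java.security.MessageDigest as sketched below. The output format of generateHash is not documented here, so the lowercase hex encoding, the UTF-8 input encoding and the names are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of StringTools.
final class HashSketch {

    /** Returns the SHA-256 digest of the UTF-8 bytes of the input as 64 lowercase hex digits. */
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff)); // mask to treat the byte as unsigned
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```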
-
appendTrailingSlash
- Parameters:
path -
- Returns:
Given path with a trailing slash, if not yet present
-
removeTrailingSlashes
-
removeQuotations
- Parameters:
s -
- Returns:
Given string without quotation marks, or same string if not in quotation marks
-
checkValueEmptyOrInverted
- Parameters:
value -
- Returns:
true if value is null, empty or starts with 0x1; false otherwise
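The documented condition can be written as a single boolean expression. The sketch below restates it directly. It is derived from the return description only, not from the actual implementation, and the names are hypothetical.

```java
// Hypothetical helper for illustration; not part of StringTools.
final class EmptyOrInvertedSketch {

    /** True if the value is null, empty, or begins with the control character U+0001. */
    static boolean emptyOrInverted(String value) {
        return value == null || value.isEmpty() || value.charAt(0) == 0x1;
    }
}
```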
-
filterStringsViaRegex
- Parameters:
values - All values to check
regex -
- Returns:
List of values that match regex
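A regex filter over a list can be sketched with streams as below. The documentation does not say whether "match" means a full match or a partial one; this sketch uses Matcher.find() (partial match), which is an assumption, as are the List<String> return type and the names.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of StringTools.
final class RegexFilterSketch {

    /** Keeps the values in which the regex is found at least once, preserving order. */
    static List<String> filter(List<String> values, String regex) {
        Pattern p = Pattern.compile(regex); // compile once, reuse for every value
        return values.stream()
                .filter(v -> p.matcher(v).find())
                .collect(Collectors.toList());
    }
}
```

Anchoring the pattern with ^ and $ turns the partial match into a full match where that is needed.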
-
parseInt
Tries to parse the given string as an integer.
- Parameters:
s - the string to parse
- Returns:
An Optional containing the parsed int. If the string is blank or cannot be parsed to an integer, an empty Optional is returned
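The documented contract (blank or unparsable input gives an empty Optional rather than an exception) can be sketched as follows. The Optional<Integer> return type, the handling of null and the names are assumptions.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of StringTools.
final class ParseIntSketch {

    /** Returns the parsed int, or an empty Optional for null, blank or non-numeric input. */
    static Optional<Integer> tryParseInt(String s) {
        if (s == null || s.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}
```

Callers can then supply a fallback without a try/catch, for example tryParseInt(input).orElse(0).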
-
getIntegerRange
-
cleanUserGeneratedData
Clean a String from any malicious content like script tags, line breaks and backtracking filepaths. TODO InvalidPathException in Windows
- Parameters:
data -
- Returns:
a cleaned up string which can be safely used
-
sortByList
-
convertToSingleWord
-
replaceAllMatches
-
truncateText
Returns a truncated version of the given text which is no longer than maxLength. If possible, the truncated text ends between words. If it is shorter than the original text, '...' is appended at the end (these count towards the maxLength, i.e. the actual text has at most maxLength-3 characters).
- Parameters:
text - the text to truncate
maxLength - maximal length of the returned text
- Returns:
the truncated text or null if the input was null
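The rules described above (null passes through, the ellipsis counts towards maxLength, cut between words where possible) can be sketched as follows. This is an independent illustration of those rules, not the actual truncateText code. It assumes maxLength is at least 3, and the names are hypothetical.

```java
// Hypothetical helper for illustration; not part of StringTools.
final class TruncateSketch {

    /** Truncates to at most maxLength characters including a trailing "...". */
    static String truncate(String text, int maxLength) {
        if (text == null) {
            return null;
        }
        if (text.length() <= maxLength) {
            return text; // nothing to cut, no ellipsis
        }
        int budget = Math.max(0, maxLength - 3); // characters left after reserving "..."
        String cut = text.substring(0, budget);
        // If the cut lands inside a word, back up to the previous space when there is one.
        if (budget < text.length() && !Character.isWhitespace(text.charAt(budget))) {
            int lastSpace = cut.lastIndexOf(' ');
            if (lastSpace > 0) {
                cut = cut.substring(0, lastSpace);
            }
        }
        return cut.trim() + "...";
    }
}
```

With the input "The quick brown fox", a maxLength of 12 gives "The quick..." and a maxLength of 11 gives "The...", because the second word no longer fits in the remaining 8 characters.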
-
isInteger
-