Class TextResourceBuilder
java.lang.Object
io.goobi.viewer.api.rest.resourcebuilders.TextResourceBuilder
Author: Florian Alpers
Constructor Summary
TextResourceBuilder()
    Zero-arg constructor.
Method Summary
Method and Description
getAltoAsZip(String pi, HttpServletRequest request)
getAltoDocument(String pi, HttpServletRequest request)
    Aggregate ALTO endpoint: returns concatenated ALTO XML for all pages of a record.
getAltoDocument(String pi, String fileName)
getCmdiDocument(String pi, String langCode)
getCMDIFile(String pi, String langCode)
getCMDIFiles(String pi)
getContentAsText(String contentFolder, String pi, String fileName)
getFulltext(String pi, HttpServletRequest request)
    Aggregate plain-text endpoint: concatenates the plain-text representation of every page of a record.
getFulltext(String pi, String fileName)
getFulltextAsTEI(String pi, String filename)
getFulltextAsZip(String pi, HttpServletRequest request)
getFulltextMap(String pi, HttpServletRequest request)
    Collects full-text file paths and content in a map.
getTeiAsZip(String pi, String langCode, HttpServletRequest request)
getTeiDocument(String pi, String langCode, HttpServletRequest request)
getTEIFiles(String pi)
getTEIFiles(String pi, String langCode)
-
Constructor Details
-
TextResourceBuilder
public TextResourceBuilder()
Zero-arg constructor.
-
-
Method Details
-
getFulltext
public String getFulltext(String pi, HttpServletRequest request) throws IOException, PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.IllegalRequestException
Aggregate plain-text endpoint: concatenates the plain-text representation of every page of a record. Guarded against OOM via Configuration.getMaxAggregateFulltextSize().
- Throws:
IOException
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.IllegalRequestException
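A size guard of this kind can be sketched as follows. This is a minimal illustration, not the viewer's implementation: it assumes the guard sums the page-file sizes and compares the total with the configured cap before any content is read. The class `AggregateSizeGuardSketch`, the method `checkAggregateSize`, and the use of `IllegalArgumentException` as a stand-in for `IllegalRequestException` are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Goobi viewer code base.
final class AggregateSizeGuardSketch {

    /**
     * Sums the given page-file sizes (in bytes) and fails as soon as the
     * running total exceeds maxBytes. Returns the total if it fits.
     */
    static long checkAggregateSize(List<Long> pageFileSizes, long maxBytes) {
        long total = 0;
        for (long size : pageFileSizes) {
            total += size;
            if (total > maxBytes) {
                // Stand-in for IllegalRequestException (assumption).
                throw new IllegalArgumentException(
                        "Aggregate size exceeds the cap of " + maxBytes + " bytes");
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two small pages fit under a 1000-byte cap.
        System.out.println(checkAggregateSize(Arrays.asList(100L, 200L), 1_000L));
        // Two larger pages do not.
        try {
            checkAggregateSize(Arrays.asList(600L, 600L), 1_000L);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking sizes up front means an oversized record is rejected before any page content is loaded into memory.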
-
getFulltextAsZip
public StreamingOutput getFulltextAsZip(String pi, HttpServletRequest request) throws IOException, PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
- Parameters:
pi - persistent identifier of the work
request - HttpServletRequest
- Returns:
StreamingOutput
- Throws:
IOException
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
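The general technique of streaming per-page texts as a ZIP archive can be sketched with the JDK alone. This is an assumption-laden illustration rather than the actual method body: `FulltextZipSketch`, `writeZip`, and `entryNames` are hypothetical names, and a plain `OutputStream` stands in for the stream that a JAX-RS `StreamingOutput` would receive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch; not taken from the Goobi viewer code base.
final class FulltextZipSketch {

    /** Writes one ZIP entry per page (file name -> text) to the given stream. */
    static void writeZip(Map<String, String> pages, OutputStream out) {
        // try-with-resources finishes the archive and closes the underlying stream.
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, String> page : pages.entrySet()) {
                zip.putNextEntry(new ZipEntry(page.getKey()));
                zip.write(page.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the entry names back, in archive order. */
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("00000001.txt", "first page");
        pages.put("00000002.txt", "second page");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeZip(pages, buffer);
        System.out.println(entryNames(buffer.toByteArray()));
    }
}
```

Writing entries one at a time keeps memory use bounded by the largest single page rather than by the whole record.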
-
getAltoAsZip
public StreamingOutput getAltoAsZip(String pi, HttpServletRequest request) throws IOException, PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
- Throws:
IOException
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
-
getAltoDocument
public String getAltoDocument(String pi, HttpServletRequest request) throws IOException, PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.IllegalRequestException
Aggregate ALTO endpoint: returns concatenated ALTO XML for all pages of a record.
Note: the result is NOT well-formed XML (multiple XML declarations and root <alto> elements). Use getAltoAsZip(String, HttpServletRequest) for parsable output.
Guarded against OOM via Configuration.getMaxAggregateAltoSize(): rejects requests whose summed page-file size exceeds the configured cap with HTTP 400 (via IllegalRequestException).
- Throws:
IOException
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.IllegalRequestException
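The well-formedness caveat can be reproduced with the JDK's XML parser. The sketch below uses two tiny stand-in documents, not real ALTO files; `AltoConcatSketch` and `isWellFormed` are hypothetical names introduced only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; the page strings are minimal stand-ins for ALTO files.
final class AltoConcatSketch {

    /** Returns true if the string parses as a single well-formed XML document. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String page1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><alto><Layout/></alto>";
        String page2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><alto><Layout/></alto>";

        // A single page parses on its own.
        System.out.println("single page: " + isWellFormed(page1));
        // The concatenation has a second XML declaration and a second root element.
        System.out.println("concatenated: " + isWellFormed(page1 + page2));
    }
}
```

A client that needs to parse the pages should therefore request them individually or as a ZIP archive and parse each entry separately.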
-
getAltoDocument
public StringPair getAltoDocument(String pi, String fileName) throws PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException
- Parameters:
pi - persistent identifier of the work
fileName - file name of the ALTO document to retrieve
- Returns:
StringPair(ALTO, charset)
- Throws:
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException
-
getFulltextAsTEI
public String getFulltextAsTEI(String pi, String filename) throws PresentationException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException, IndexUnreachableException
- Parameters:
pi - persistent identifier of the work
filename - file name of the plain-text page to convert
- Returns:
Plain text extracted from TEI
- Throws:
PresentationException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
IndexUnreachableException
-
getTeiDocument
public String getTeiDocument(String pi, String langCode, HttpServletRequest request) throws PresentationException, IndexUnreachableException, IOException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
- Parameters:
pi - persistent identifier of the work
langCode - ISO language code for the requested TEI version
request - HttpServletRequest
- Returns:
TEI document as String
- Throws:
PresentationException
IndexUnreachableException
IOException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
-
getTeiAsZip
public StreamingOutput getTeiAsZip(String pi, String langCode, HttpServletRequest request) throws PresentationException, IndexUnreachableException, IOException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
- Parameters:
pi - persistent identifier of the work
langCode - ISO language code for the requested TEI version
request - HttpServletRequest
- Returns:
StreamingOutput
- Throws:
PresentationException
IndexUnreachableException
IOException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentLibException
-
getCmdiDocument
public String getCmdiDocument(String pi, String langCode) throws PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException, IOException
- Parameters:
pi - persistent identifier of the work
langCode - ISO language code for the requested CMDI version
- Returns:
CMDI document as String
- Throws:
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException
IOException
-
getContentAsText
public String getContentAsText(String contentFolder, String pi, String fileName) throws PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException
- Throws:
PresentationException
IndexUnreachableException
de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException
-
getFulltext
public String getFulltext(String pi, String fileName) throws PresentationException, IndexUnreachableException, de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException
- Parameters:
pi - persistent identifier of the work
fileName - file name of the plain-text or ALTO page to retrieve
- Returns:
plain text content of the requested page
- Throws:
PresentationException - if any.
IndexUnreachableException - if any.
de.unigoettingen.sub.commons.contentlib.exceptions.ContentNotFoundException - if any.
DAOException
de.unigoettingen.sub.commons.contentlib.exceptions.ServiceNotAllowedException
-
getFulltextMap
public Map<Path,String> getFulltextMap(String pi, HttpServletRequest request) throws IOException, PresentationException, IndexUnreachableException
Collects full-text file paths and content in a map. Files from plain-text resources take priority; pages without a plain-text file are filled in with text converted from ALTO.
- Parameters:
pi - persistent identifier of the work
request - current HTTP servlet request for access checking
- Returns:
map of page file paths to their full-text content
- Throws:
IOException - if any.
PresentationException - if any.
IndexUnreachableException - if any.
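The priority rule described above (plain text first, ALTO-derived text only for pages that lack it) can be sketched with two maps. This is an illustrative assumption about the merge step only: the real method returns a `Map<Path,String>`, whereas this sketch keys both inputs by a page base name, and `FulltextMergeSketch` and `merge` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; keys are page base names, which is an assumption.
final class FulltextMergeSketch {

    /**
     * Returns a sorted map that contains every plain-text page, plus the
     * ALTO-derived text for pages that have no plain-text entry.
     */
    static Map<String, String> merge(Map<String, String> plaintextByPage,
                                     Map<String, String> altoTextByPage) {
        Map<String, String> merged = new TreeMap<>(plaintextByPage);
        for (Map.Entry<String, String> alto : altoTextByPage.entrySet()) {
            // putIfAbsent keeps the plain-text version when both exist.
            merged.putIfAbsent(alto.getKey(), alto.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        plain.put("00000001", "plain text of page 1");

        Map<String, String> alto = new HashMap<>();
        alto.put("00000001", "ALTO text of page 1");
        alto.put("00000002", "ALTO text of page 2");

        // Page 1 keeps its plain text; page 2 falls back to the ALTO text.
        merge(plain, alto).forEach((page, text) -> System.out.println(page + " -> " + text));
    }
}
```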
-
getTEIFiles
- Parameters:
pi - persistent identifier used to locate the TEI folder
- Returns:
list of all language-specific TEI file paths for the record
-
getTEIFiles
- Parameters:
pi - persistent identifier used to locate the TEI folder
langCode - ISO language code to filter TEI files by
- Returns:
list of TEI file paths matching the requested language
-
getCMDIFile
public Path getCMDIFile(String pi, String langCode) throws IOException, PresentationException, IndexUnreachableException
- Parameters:
pi - persistent identifier used to locate the CMDI folder
langCode - ISO language code to select the matching CMDI file
- Returns:
path to the language-specific CMDI file, or null if not found
- Throws:
IOException - if any.
PresentationException - if any.
IndexUnreachableException - if any.
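Because the method may return null, callers need an explicit check. The sketch below illustrates one way a language-specific lookup with a null result could behave. The file naming convention (`<name>_<langCode>.xml`) is purely an assumption made for this example, and `LanguageFileSketch` and `selectByLanguage` are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; the "_<langCode>.xml" naming scheme is an assumption.
final class LanguageFileSketch {

    /** Returns the first file whose name ends in "_<langCode>.xml", or null if none matches. */
    static Path selectByLanguage(List<Path> files, String langCode) {
        String suffix = "_" + langCode.toLowerCase(Locale.ROOT) + ".xml";
        for (Path file : files) {
            String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
            if (name.endsWith(suffix)) {
                return file;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Path> files = Arrays.asList(
                Paths.get("cmdi", "PPN123_de.xml"),
                Paths.get("cmdi", "PPN123_en.xml"));

        Path english = selectByLanguage(files, "en");
        Path french = selectByLanguage(files, "fr");

        // Always check for null before using the result.
        System.out.println(english != null ? "found " + english : "no English file");
        System.out.println(french != null ? "found " + french : "no French file");
    }
}
```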
-
getCMDIFiles
- Parameters:
pi - persistent identifier used to locate the CMDI folder
- Returns:
list of all language-specific CMDI file paths for the record
-