A collection of typical questions and problems that occur during digitization projects, together with possible answers.
If you choose to outsource the job of scanning to an external provider, you should ideally request a series of interim deliveries rather than one large final delivery. This will help you to ensure that the outsourced work is consistently of the required quality and to keep a check on your overall project schedule. Ideally, the contractor should upload the data directly to your system rather than supplying a hard disk.
It is worth checking carefully which digitization workflow tasks really need to be performed manually. Many recurring jobs can now be automated or at least partly automated using appropriate software. There are numerous tools from different providers covering image conversion, validation, image processing, metadata enrichment, OCR and many other tasks. Integrating these tools into your workflow as automated tasks not only saves time but also ensures that your results are consistent in terms of quality, data formats and naming systems.
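As an illustration of the kind of check that is easy to automate, the following minimal sketch tests a batch of filenames against a naming convention. The convention itself (two lowercase letters, five digits, an underscore, a four-digit page number and the extension `.tif`) and the function name `find_invalid_names` are assumptions made for this example only; substitute whatever rules your project has agreed on.

```python
import re

# Hypothetical naming convention for this sketch:
# <object id>_<four-digit page number>.tif, e.g. "bk00123_0001.tif"
NAME_PATTERN = re.compile(r"[a-z]{2}\d{5}_\d{4}\.tif")


def find_invalid_names(filenames):
    """Return the filenames that do not match the naming convention."""
    return [name for name in filenames if not NAME_PATTERN.fullmatch(name)]


batch = ["bk00123_0001.tif", "bk00123_0002.tif", "BK00123_3.tiff"]
print(find_invalid_names(batch))  # prints ['BK00123_3.tiff']
```

A check like this could run automatically on every incoming batch, so that naming errors are caught before the files move on to the next step.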
Workflow tasks that need to be performed for every object (e.g. for every book) should be broken down into smaller steps. Each step should be planned in such a way that it can ideally be completed relatively quickly by appropriately trained staff in the designated user groups. Rather than:
it would be better to break down the workflow as follows:
Using this approach, the status of each object is clear and verifiable throughout the workflow. It also greatly increases productivity, as project staff repeatedly perform the same task for multiple objects without having to familiarize themselves with a different role.
Many other institutions will already have come across the same or similar problems. Establishing links with such institutions can help you to make the right decisions on procurement and working methods, exchange experiences, and learn from each other.
While of course it makes sense to plan your project thoroughly and cover all eventualities, there comes a point when it makes equal sense to just get started. When taking that decision, you should consider the following points.
Although staff who have spent a great deal of time working on other projects may be reluctant to adopt new methods, their knowledge and experience are tremendously valuable when it comes to redesigning workflows or planning new ones. You will need to harness that experience so that you can design an effective new workflow that covers every single task from start to finish. It follows that the workflow should never be imposed “from above” by management without input from project staff. This participatory approach to workflow planning also promotes greater acceptance of new methods and tools.
Project managers should work through each step of the workflow from start to finish at least once a year to ensure that they fully understand it, and if necessary ask the corresponding member of the project team to explain what is involved. This helps to disseminate and document knowledge across the entire project. It is also a good way to identify potential improvements and address typical pitfalls (e.g. “We’ve always done it that way”).
It is a good idea to review your working methods from time to time and ask yourself whether they could be more effective. Even small misunderstandings and failures to learn from previous mistakes can have a major impact on hundreds of objects (e.g. incorrect resolution, wrong data formats, incorrect storage method, inefficient use of hardware and software).
There are some very good software tools that can only run on certain operating systems (e.g. Windows or Mac). Ideally, you should plan your workflows from the start in such a way that individual steps can also be performed on other operating systems. This gives you maximum flexibility over your future working methods and avoids the risk of “vendor lock-in”.
Digitization projects involve numerous technical components. For this reason, at least one technician or a technically competent member of staff should be available to actively support the project, not only resolving problems as they arise but also maintaining computers and servers and acting as a point of contact for communication with hardware manufacturers, software producers, scanning service providers, and computer centers.
In most cases, the output of a digitization project should remain usable for a long time. Furthermore, since digitization projects often extend over a long period, all workflows should be organized in such a way that the plans can still be followed much later. Directory structures, filenames and formats should be as clear and simple as possible, ideally self-explanatory. Future project staff should always be able to locate documentation describing and explaining each step of the workflow to ensure a smooth transition in the event of personnel changes.
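One way to keep directory structures and filenames self-explanatory is to generate them from a single, documented rule instead of typing them by hand. The sketch below assumes a hypothetical layout of `project/object/object_pagenumber.extension`; the function name `page_path` and the example identifiers are illustrative only.

```python
from pathlib import PurePosixPath


def page_path(project, object_id, page, extension="tif"):
    """Build a predictable path such as project/object/object_0007.tif."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Zero-padded page numbers keep files in the right order when sorted by name.
    filename = f"{object_id}_{page:04d}.{extension}"
    return PurePosixPath(project) / object_id / filename


print(page_path("maps-2024", "bk00123", 7))
# prints maps-2024/bk00123/bk00123_0007.tif
```

Because the rule lives in one place, it can be documented once and followed by future project staff without guesswork.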
Reliable methods of long-term archiving can be expensive. In the ideal scenario, it may be possible to cooperate with other institutions and share the cost. If this is not an option, you will need to find effective alternatives. Archiving systems should be easy to maintain, and it must be possible to retrieve the archived data at any time. When choosing the most appropriate system, you should consider whether to store the data in two geographically separate locations.
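If you do keep copies in two locations, it helps to verify regularly that both copies are still identical. The following is a minimal sketch of such a check using SHA-256 checksums from the Python standard library. The directory names, file contents and the `demo` function are invented for the example, and a production archive would normally rely on a dedicated archiving system rather than a script like this.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(primary_dir, secondary_dir):
    """Return relative paths that are missing or differ in the secondary copy."""
    primary_dir, secondary_dir = Path(primary_dir), Path(secondary_dir)
    problems = []
    for source in sorted(primary_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary_dir)
        copy = secondary_dir / relative
        if not copy.is_file() or sha256_of(source) != sha256_of(copy):
            problems.append(relative.as_posix())
    return problems


def demo():
    """Create two small copies, damage one file, and report the difference."""
    with tempfile.TemporaryDirectory() as root:
        site_a = Path(root, "site_a")
        site_b = Path(root, "site_b")
        site_a.mkdir()
        site_b.mkdir()
        (site_a / "page_0001.tif").write_bytes(b"scan one")
        (site_a / "page_0002.tif").write_bytes(b"scan two")
        (site_b / "page_0001.tif").write_bytes(b"scan one")
        (site_b / "page_0002.tif").write_bytes(b"damaged")
        return compare_copies(site_a, site_b)


print(demo())  # prints ['page_0002.tif']
```

The check only reports files that are missing or altered in the second copy; deciding which copy is the correct one still requires stored reference checksums or manual inspection.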
You should consider as early as possible what is to be done with your digital output. Even during the actual workflow, you can make preparations for subsequent online publication, transfer to another technical system, or ease of access for researchers. Validation, data conversion and metadata enrichment can all make a crucial difference in terms of the way the material is used.
Ideally, the tasks performed for each digitized object should be precisely verifiable so that you can maintain a clear overview of progress and any hold-ups and monitor any recurring errors and the circumstances in which they arise. Microsoft Excel and a central network drive with write access for each member of the project team are not usually adequate for this purpose. For this reason, you should consider whether to adopt a professional workflow management tool. Such tools are available under an open-source license and have a large community of users.
Workflows should always be designed to run sequentially. Wherever possible, you should avoid repetition, branching and other complex procedures entirely. Especially if you are dealing with a large volume of data, simple and sequential workflows ensure that each step is easily traceable, making the task of identifying errors and maintaining your systems and workflows much easier.
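The idea of a strictly sequential workflow can be sketched in a few lines: every object passes through the same ordered list of steps, and processing stops at the first failure so that the error is easy to locate. The step names and the placeholder functions `scan`, `validate` and `export` are hypothetical and stand in for whatever your own workflow requires.

```python
def run_workflow(object_id, steps):
    """Run the steps in order, stop at the first failure, and return a log."""
    log = []
    for name, step in steps:
        try:
            step(object_id)
        except Exception as error:
            log.append((name, f"failed: {error}"))
            break  # no branching or retries: later steps are simply not run
        log.append((name, "done"))
    return log


# Hypothetical placeholder steps; real steps would call your actual tools.
def scan(object_id):
    pass


def validate(object_id):
    if not object_id:
        raise ValueError("missing object id")


def export(object_id):
    pass


STEPS = [("scan", scan), ("validate", validate), ("export", export)]

print(run_workflow("bk00123", STEPS))
# prints [('scan', 'done'), ('validate', 'done'), ('export', 'done')]
```

Because the log records exactly one entry per step in a fixed order, the status of each object can be read off directly, and a failed object shows where it stopped.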
For every new digitization project, you should consider whether you really need to carry out the project in-house. Depending on the type of materials involved and the difficulties of working with them, it may make more sense to outsource the digitization process to an external scanning service and have the results delivered to you. This also avoids the need to buy expensive new hardware and to incur additional personnel costs. As well as the scanning process you can of course outsource other parts of the digitization workflow such
Barcodes, dockets, notes and other information should only be stuck onto old source material in exceptional cases. The chemicals used in the glue can seriously damage valuable objects. Ideally, any additional information should be provided on inserts or cover sheets made of acid-free paper.