Dos and Don'ts for digitisation workflows

A collection of typical questions and problems that occur during digitisation projects, together with possible answers.

Workflow and planning

Do: ask your scanning service provider to make interim deliveries.

If you choose to outsource the job of scanning to an external provider, you should ideally request a series of interim deliveries rather than one large final delivery. This will help you to ensure that the outsourced work is consistently of the required quality and to keep a check on your overall project schedule. Ideally, the contractor should upload the data directly to your system rather than supplying a hard disk.
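One way to check an interim delivery is to compare the delivered filenames with the list agreed for that batch. The following is only a minimal sketch: the function name `check_delivery`, the `DeliveryReport` type and the example filenames are assumptions made for illustration, not part of any particular provider's process.

```python
from typing import Iterable, NamedTuple, Set


class DeliveryReport(NamedTuple):
    missing: Set[str]      # expected but not delivered
    unexpected: Set[str]   # delivered but not on the expected list


def check_delivery(expected: Iterable[str], delivered: Iterable[str]) -> DeliveryReport:
    """Compare the filenames in one interim delivery with the agreed list."""
    expected_set = set(expected)
    delivered_set = set(delivered)
    return DeliveryReport(
        missing=expected_set - delivered_set,
        unexpected=delivered_set - expected_set,
    )


# Hypothetical batch: one page is missing and one file was not expected.
report = check_delivery(
    expected=["book01_0001.tif", "book01_0002.tif", "book01_0003.tif"],
    delivered=["book01_0001.tif", "book01_0003.tif", "book01_0003b.tif"],
)
print(sorted(report.missing))     # ['book01_0002.tif']
print(sorted(report.unexpected))  # ['book01_0003b.tif']
```

Running such a check on every interim delivery means that gaps are noticed while the provider still has the source material to hand.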

Do: avoid as much manual processing as possible.

It is worth checking carefully which digitization workflow tasks really need to be performed manually. Many recurring jobs can now be automated or at least partly automated using appropriate software. There are numerous tools from different providers covering image conversion, validation, image processing, metadata enrichment, OCR and many other tasks. Integrating these tools into your workflow as automated tasks not only saves time but also ensures that your results are consistent in terms of quality, data formats and naming systems.
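As an illustration of a check that is easy to automate, the sketch below looks at the first four bytes of every `*.tif` file in a directory and lists the files that do not start with a classic TIFF signature. It assumes that master images are stored as classic TIFF files (BigTIFF uses a different signature and would be reported as suspect); the function names are hypothetical, and a real project would more likely rely on a dedicated validation tool.

```python
from pathlib import Path
from typing import List

# Signatures of classic TIFF files: little-endian ("II") and big-endian ("MM"),
# each followed by the number 42 in the matching byte order.
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the first four bytes match a classic TIFF signature."""
    return header[:4] in TIFF_SIGNATURES


def find_suspect_files(directory: Path) -> List[Path]:
    """List *.tif files in `directory` whose header is not a TIFF signature."""
    suspects = []
    for path in sorted(directory.glob("*.tif")):
        with path.open("rb") as handle:
            if not looks_like_tiff(handle.read(4)):
                suspects.append(path)
    return suspects
```

A check like this only catches files that are obviously not TIFFs; it says nothing about resolution, colour profile or image content.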

Do: break down the digitization workflow into small and logical steps.

Workflow tasks that need to be performed for every object (e.g. for every book) should be broken down into smaller steps. Each step should be planned in such a way that it can ideally be completed relatively quickly by appropriately trained staff in the designated user groups. Rather than:

it would be better to break down the workflow as follows:

Using this approach, the status of each object is clear and verifiable throughout the workflow. It also greatly increases productivity, as project staff repeatedly perform the same task for multiple objects without having to familiarize themselves with a different role.

Many other institutions will already have come across the same or similar problems. Establishing links with such institutions can help you to make the right decisions on procurement and working methods, exchange experiences, and learn from each other.

Do: get started!

While of course it makes sense to plan your project thoroughly and cover all eventualities, there comes a point when it makes equal sense to just get started. When taking that decision, you should consider the following points.

Do: harness the experience of project staff in your planning.

Although staff who have spent a great deal of time working on other projects may be reluctant to adopt new methods, their knowledge and experience are tremendously valuable when it comes to redesigning workflows or planning new ones. You will need to harness that experience so that you can design an effective new workflow that covers every single task from start to finish. It follows that the workflow should never be imposed “from above” by management without input from project staff. This participatory approach to workflow planning also promotes greater acceptance of new methods and tools.

Do: keep a hands-on approach to managing your project.

Project managers should work through each step of the workflow from start to finish at least once a year to ensure that they fully understand it, and if necessary ask the corresponding member of the project team to explain what is involved. This helps to disseminate and document knowledge across the entire project. It is also a good way to identify potential improvements and address typical pitfalls (e.g. “We’ve always done it that way”).

Do: keep your workflow system under regular observation.

It is a good idea to review your working methods from time to time and ask yourself whether they could be more effective. Even small misunderstandings and failures to learn from previous mistakes can have a major impact on hundreds of objects (e.g. incorrect resolution, wrong data formats, incorrect storage method, inefficient use of hardware and software).

Do: plan in advance for platform-independent digitization workflows.

There are some very good software tools that can only run on certain operating systems (e.g. Windows or Mac). Ideally, you should plan your workflows from the start in such a way that individual steps can also be performed on other operating systems. This gives you maximum flexibility over your future working methods and avoids the risk of “vendor lock-in”.
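One small, concrete aspect of platform independence is how file locations are recorded. The sketch below stores each location relative to a project root and with forward slashes, so that a record written on one operating system can be resolved on another. The function names, drive letters and directory names are assumptions chosen for illustration.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def to_portable(path: PurePath, root: PurePath) -> str:
    """Return `path` relative to `root`, written with forward slashes."""
    return path.relative_to(root).as_posix()


def from_portable(portable: str, root: PurePath) -> PurePath:
    """Rebuild a path under `root` using that root's own conventions."""
    return root.joinpath(*PurePosixPath(portable).parts)


# A file recorded on a (hypothetical) Windows scanning station ...
stored = to_portable(
    PureWindowsPath(r"D:\scans\book01\book01_0001.tif"),
    PureWindowsPath(r"D:\scans"),
)
print(stored)  # book01/book01_0001.tif

# ... can be located again on a Linux server with a different root.
print(from_portable(stored, PurePosixPath("/mnt/archive/scans")))
# /mnt/archive/scans/book01/book01_0001.tif
```

The pure path classes are used here so that the example behaves the same on any operating system; in a real script, `pathlib.Path` picks the right flavour for the machine it runs on.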

Do: prepare yourself for the technical challenges involved in all digitization projects.

Digitization projects involve numerous technical components. For this reason, at least one technician or a technically competent member of staff should be available to actively support the project, not only resolving problems as they arise but also maintaining computers and servers and acting as a point of contact for communication with hardware manufacturers, software producers, scanning service providers, and computer centers.

Do: remember that your output should be designed to last.

In most cases, the output of a digitization project should remain usable for a long time. Furthermore, since digitization projects often extend over a long period, all workflows should be organized in such a way that the plans can still be followed much later. Directory structures, filenames and formats should be as clear and simple as possible, ideally self-explanatory. Future project staff should always be able to locate documentation describing and explaining each step of the workflow to ensure a smooth transition in the event of personnel changes.
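A simple, self-explanatory naming scheme can also be checked by a script. The convention used below (`<object id>_<four-digit page number>.tif`) is purely hypothetical; the point of the sketch is that one small module both builds and parses names, so that every tool in the workflow agrees on the same rule.

```python
import re
from typing import Optional, Tuple

# Hypothetical convention: <object id>_<four-digit page number>.tif
NAME_PATTERN = re.compile(r"(?P<object_id>[a-z0-9]+)_(?P<page>\d{4})\.tif")


def build_name(object_id: str, page: int) -> str:
    """Build a filename; pages outside 1..9999 do not fit the convention."""
    if not 1 <= page <= 9999:
        raise ValueError(f"page number out of range: {page}")
    return f"{object_id}_{page:04d}.tif"


def parse_name(filename: str) -> Optional[Tuple[str, int]]:
    """Return (object_id, page) or None if the name breaks the convention."""
    match = NAME_PATTERN.fullmatch(filename)
    if match is None:
        return None
    return match.group("object_id"), int(match.group("page"))


print(build_name("book01", 7))                # book01_0007.tif
print(parse_name("book01_0007.tif"))          # ('book01', 7)
print(parse_name("Book 01 page7 FINAL.tif"))  # None
```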

Do: think about effective (long-term) archiving.

Reliable methods of long-term archiving can be expensive. In the ideal scenario, it may be possible to cooperate with other institutions and share the cost. If this is not an option, you will need to find effective alternatives. Archiving systems should be easy to maintain, and it must be possible to retrieve the archived data at any time. When choosing the most appropriate system, you should consider whether to store the data in two geographically separate locations.
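If data is kept in two locations, it is worth confirming from time to time that both copies are still identical. The sketch below compares SHA-256 checksums of two files; the function names are hypothetical, and the choice of SHA-256 is an assumption rather than a requirement of any particular archiving system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large master images do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(primary: Path, replica: Path) -> bool:
    """Return True if both copies of a file have the same checksum."""
    return sha256_of(primary) == sha256_of(replica)
```

Storing the checksum alongside each file at the time of archiving also makes it possible to detect later corruption in a single copy, not only differences between two copies.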

Do: think about the intended results well in advance.

You should consider as early as possible what is to be done with your digital output. Even during the actual workflow, you can make preparations for subsequent online publication, transfer to another technical system, or ease of access for researchers. Validation, data conversion and metadata enrichment can all make a crucial difference in terms of the way the material is used.

Do: use a central workflow tool to coordinate your project.

Ideally, the tasks performed for each digitized object should be precisely verifiable so that you can maintain a clear overview of progress and any hold-ups and monitor any recurring errors and the circumstances in which they arise. Microsoft Excel and a central network drive with write access for each member of the project team are not usually adequate for this purpose. For this reason, you should consider whether to adopt a professional workflow management tool. Such tools are available under open-source licenses and are backed by large user communities.
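To make the idea of verifiable tasks more concrete, the sketch below records one row per object and step in a small SQLite database and then counts errors per step. It is not a substitute for a workflow management tool; the table layout, the step names and the function names are all assumptions made for illustration.

```python
import sqlite3
from typing import List, Tuple


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the event log; the default is an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               object_id TEXT NOT NULL,
               step      TEXT NOT NULL,
               status    TEXT NOT NULL,   -- 'done' or 'error'
               note      TEXT NOT NULL DEFAULT ''
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, object_id: str, step: str,
           status: str, note: str = "") -> None:
    """Store the outcome of one step for one object."""
    conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
                 (object_id, step, status, note))
    conn.commit()


def errors_per_step(conn: sqlite3.Connection) -> List[Tuple[str, int]]:
    """Return (step, error count) pairs, most error-prone step first."""
    return conn.execute(
        """SELECT step, COUNT(*) FROM events
           WHERE status = 'error'
           GROUP BY step
           ORDER BY COUNT(*) DESC, step"""
    ).fetchall()
```

A query such as `errors_per_step` is the kind of overview that is hard to obtain from a shared spreadsheet, which is one reason a central tool is worth considering.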

Don’t: create complex workflows with many side branches and loops.

Workflows should always be designed to run sequentially. Wherever possible, you should avoid repetition, branching and other complex procedures entirely. Especially if you are dealing with a large volume of data, simple and sequential workflows ensure that each step is easily traceable, making the task of identifying errors and maintaining your systems and workflows much easier.
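A strictly sequential workflow can be expressed as an ordered list of steps that is run from top to bottom and stops at the first failure. The sketch below is one possible way to write this; the step names and the rule that makes `book02` fail are invented for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A step is a name plus a function that returns True on success.
Step = Tuple[str, Callable[[str], bool]]


def run_sequentially(object_id: str,
                     steps: Sequence[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order; return (completed steps, failed step or None)."""
    completed: List[str] = []
    for name, action in steps:
        if not action(object_id):
            return completed, name   # stop here: no branches, no loops
        completed.append(name)
    return completed, None


# Hypothetical steps; the quality check is made to fail for "book02".
steps: List[Step] = [
    ("scan", lambda object_id: True),
    ("quality_check", lambda object_id: object_id != "book02"),
    ("export", lambda object_id: True),
]
print(run_sequentially("book01", steps))  # (['scan', 'quality_check', 'export'], None)
print(run_sequentially("book02", steps))  # (['scan'], 'quality_check')
```

Because there is only one path through the steps, the return value is enough to say exactly where an object stands and where it failed.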

Don’t: feel you have to do all the work in-house.

For every new digitization project, you should consider whether you really need to carry out the project in-house. Depending on the type of materials involved and the difficulties of working with them, it may make more sense to outsource the digitization process to an external scanning service and have the results delivered to you. This also avoids the need to buy expensive new hardware and to incur additional personnel costs. As well as the scanning process, you can of course outsource other parts of the digitization workflow such

Don’t: stick notes or labels onto old source material.

Barcodes, dockets, notes and other information should only be stuck onto old source material in exceptional cases. The chemicals used in the glue can seriously damage valuable objects. Ideally, any additional information should be provided on inserts or cover sheets made of acid-free paper.