Format: online event in 3 sessions
In recent years, there have been multiple successful attempts to tackle document processing problems separately by designing task-specific, hand-tuned strategies. Yet the diversity of historical document processing tasks makes solving them one at a time impractical and shows the need for generic approaches that can handle the variability of historical series. In this Academy, we introduce and demonstrate an open-source implementation of a CNN-based pixel-wise predictor, coupled with task-dependent post-processing blocks. We show that a single CNN architecture can be used across tasks with competitive results. Moreover, most of the task-specific post-processing steps can be decomposed into a small number of simple, standard, reusable operations, adding to the flexibility of our approach.
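To make the idea of "simple, standard, reusable operations" concrete, here is a minimal sketch of one possible post-processing chain applied to a pixel-wise probability map: thresholding, connected-component grouping, and bounding-box extraction. It is an illustration only and does not use the dhSegment API; the function names, the toy probability map, and the `threshold` and `min_area` parameters are all assumptions made for this example.

```python
from collections import deque


def binarize(prob_map, threshold=0.5):
    """Turn per-pixel probabilities into a 0/1 mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def connected_components(mask):
    """Group 4-connected foreground pixels; return one pixel list per region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_boxes(components, min_area=1):
    """Return (x_min, y_min, x_max, y_max) for each region of at least min_area pixels."""
    boxes = []
    for pixels in components:
        if len(pixels) < min_area:
            continue  # discard small regions, which are likely noise
        ys = [y for y, _ in pixels]
        xs = [x for _, x in pixels]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


def extract_regions(prob_map, threshold=0.5, min_area=2):
    """Chain the three reusable steps into one task-specific block."""
    mask = binarize(prob_map, threshold)
    return bounding_boxes(connected_components(mask), min_area)


# Hypothetical 4x4 probability map standing in for a network's output.
toy_prob_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.2, 0.0],
    [0.1, 0.0, 0.1, 0.6],
    [0.0, 0.1, 0.2, 0.1],
]
regions = extract_regions(toy_prob_map)
print(regions)
```

Each step is independent of the task, so a different task could reuse the same building blocks with other parameters or in a different order, for example keeping the binary mask and skipping the bounding-box step.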
Thursday, 20 May, 2021 – 14:00 to 16:00 CEST | Introduction to semantic segmentation and annotation
Theoretical part: What is semantic segmentation? How do I make a document digitally intelligible? What steps in my project are covered by dhSegment? How do I define my problem and annotate effectively?
Practical part: Demonstration of the CVAT annotation software.
Thursday, 27 May, 2021 – 14:00 to 16:00 CEST | Training an artificial neural network with dhSegment
Theoretical part: What is an artificial neural network and how does it work? What are the applications of deep learning? How can I train a neural network with dhSegment to speed up my research and process larger corpora?
Practical part: Introduction to notebooks and Google Colab. Installation, setup and parameterization of dhSegment for training.
Thursday, 3 June, 2021 – 14:00 to 16:00 CEST | Validation and inference for concrete implementation
Theoretical part: How do I verify the training process in dhSegment? How do I quantify its performance? How do I run inference and extract the results for the next steps of my project?
Practical part: Presentation of TensorBoard. Inference and extraction of results from dhSegment.
Please visit this page to register for the event. Participation is free and open to TMO members and professionals from the broader GLAM and Digital Humanities sector.