ELAN builder

Introduction

The ELAN .eaf builder generates .eaf files ready to be annotated with the ELAN software, based on either the ACLEW DAS templates or custom templates.

The original code was written by Sarah Mac Ewan.

This pipeline can pre-fill the .eaf files by importing speech segments (i.e. diarization annotations) from existing annotations, such as those generated by LENA or the VTC, or manual annotations.

Usage

$ child-project eaf-builder --help
usage: child-project eaf-builder [-h] [--destination DESTINATION] --segments
                                 SEGMENTS --eaf-type
                                 {random,periodic,high-volubility,energy-detection}
                                 --template TEMPLATE
                                 [--context-onset CONTEXT_ONSET]
                                 [--context-offset CONTEXT_OFFSET]
                                 [--path PATH]
                                 [--import-speech-from IMPORT_SPEECH_FROM]

generate .eaf templates based on intervals to code.

optional arguments:
  -h, --help            show this help message and exit
  --destination DESTINATION
                        eaf destination
  --segments SEGMENTS   path to the input segments dataframe
  --eaf-type {random,periodic,high-volubility,energy-detection}
                        eaf-type
  --template TEMPLATE   Which ACLEW templates (basic, native or non-native);
                        otherwise, the path to the etf et pfsx templates,
                        without the extension.
  --context-onset CONTEXT_ONSET
                        context onset and segment offset difference in
                        milliseconds, 0 for no introductory context
  --context-offset CONTEXT_OFFSET
                        context offset and segment offset difference in
                        milliseconds, 0 for no outro context
  --path PATH           path to the input dataset. Required together with
                        --import-speech-from for pre-filling the .eaf
  --import-speech-from IMPORT_SPEECH_FROM
                        set of annotations from which speech segments should
                        be imported in order to pre-fill the annotations.

Pre-filled annotations

The .eaf files can be pre-filled from previous annotations by using the --path and --import-speech-from options together. The --path option specifies the path to the dataset (i.e. the source project). The --import-speech-from option specifies the set of annotations to import speech segments from. For each annotation of that set, the pipeline creates and populates a tier named after the value of speaker_id or, alternatively, speaker_type.
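As an illustration, a pre-filling run might look like the sketch below. It uses only flags listed in the help output above. The paths, the set name vtc, and the choices of periodic and basic are placeholder assumptions, not values prescribed by the pipeline; replace them with those of your own dataset.

```shell
# Hypothetical values: adjust them to your own dataset.
dataset="path/to/dataset"          # source project, passed to --path
segments="path/to/segments.csv"    # input segments dataframe
destination="path/to/output"       # where the .eaf files are written
annotation_set="vtc"               # assumed name of the set to import speech from

# Wrap the call in a function so the options can be reviewed in one place.
prefill_eaf() {
    child-project eaf-builder \
        --destination "$destination" \
        --segments "$segments" \
        --eaf-type periodic \
        --template basic \
        --path "$dataset" \
        --import-speech-from "$annotation_set"
}

# Run the builder only if the child-project command is installed.
if command -v child-project >/dev/null 2>&1; then
    prefill_eaf
else
    echo "child-project not found; install ChildProject first" >&2
fi
```

Omitting --path and --import-speech-from from the same command would produce .eaf files without pre-filled tiers.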

More resources