ELAN builder
Introduction
The ELAN .eaf builder exports .eaf files ready to be annotated with the ELAN software, based on either the ACLEW DAS templates or custom templates.
The original code was written by Sarah Mac Ewan.
This pipeline can pre-fill the .eaf annotations by importing speech segments (i.e. diarization annotations) from previous annotations, e.g. those generated by the LENA or the VTC, or even manual annotations.
Usage
$ child-project eaf-builder --help
usage: child-project eaf-builder [-h] [--destination DESTINATION] --segments
                                 SEGMENTS --eaf-type
                                 {random,periodic,high-volubility,energy-detection}
                                 --template TEMPLATE
                                 [--context-onset CONTEXT_ONSET]
                                 [--context-offset CONTEXT_OFFSET]
                                 [--path PATH]
                                 [--import-speech-from IMPORT_SPEECH_FROM]

generate .eaf templates based on intervals to code.

optional arguments:
  -h, --help            show this help message and exit
  --destination DESTINATION
                        eaf destination
  --segments SEGMENTS   path to the input segments dataframe
  --eaf-type {random,periodic,high-volubility,energy-detection}
                        eaf-type
  --template TEMPLATE   Which ACLEW templates (basic, native or non-native);
                        otherwise, the path to the etf et pfsx templates,
                        without the extension.
  --context-onset CONTEXT_ONSET
                        context onset and segment offset difference in
                        milliseconds, 0 for no introductory context
  --context-offset CONTEXT_OFFSET
                        context offset and segment offset difference in
                        milliseconds, 0 for no outro context
  --path PATH           path to the input dataset. Required together with
                        --import-speech-from for pre-filling the .eaf
  --import-speech-from IMPORT_SPEECH_FROM
                        set of annotations from which speech segments should
                        be imported in order to pre-fill the annotations.
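To illustrate how these options fit together, here is a minimal Python sketch that assembles an eaf-builder command line from the flags listed in the help text. The helper build_command, the file names (segments.csv, eaf-output) and the 60000 ms context values are hypothetical and chosen only for the example. The check that --import-speech-from needs --path is one reading of the "Required together with" note in the help text. The command is executed only if child-project is installed and the segments file exists.

```python
import os
import shutil
import subprocess

# Choices copied from the --eaf-type entry of the help text.
EAF_TYPES = ("random", "periodic", "high-volubility", "energy-detection")


def build_command(segments, eaf_type, template, destination=None,
                  context_onset=None, context_offset=None,
                  path=None, import_speech_from=None):
    """Return the argument list for `child-project eaf-builder` (hypothetical helper)."""
    if eaf_type not in EAF_TYPES:
        raise ValueError(f"unsupported eaf-type: {eaf_type}")
    # Assumption: the help text says --path is required together with
    # --import-speech-from, so reject the latter without the former.
    if import_speech_from is not None and path is None:
        raise ValueError("--import-speech-from requires --path")

    # --segments, --eaf-type and --template are shown as mandatory in the usage line.
    command = ["child-project", "eaf-builder",
               "--segments", str(segments),
               "--eaf-type", eaf_type,
               "--template", str(template)]
    optional = {
        "--destination": destination,
        "--context-onset": context_onset,
        "--context-offset": context_offset,
        "--path": path,
        "--import-speech-from": import_speech_from,
    }
    # Append only the optional flags that were actually provided.
    for flag, value in optional.items():
        if value is not None:
            command += [flag, str(value)]
    return command


# Hypothetical file names and values, for illustration only.
command = build_command(
    segments="segments.csv",
    eaf_type="periodic",
    template="basic",
    destination="eaf-output",
    context_onset=60000,   # milliseconds
    context_offset=60000,  # milliseconds
)
print(" ".join(command))

# Run the command only when the tool is installed and the input file exists.
if shutil.which("child-project") and os.path.exists("segments.csv"):
    subprocess.run(command, check=True)
```

Building the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.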
Pre-filled annotations
The .eaf files can be pre-filled according to previous annotations by using the --path and --import-speech-from switches.
The --path option specifies the path to the dataset (i.e. the source project).
The --import-speech-from option specifies the set of annotations to import speech from.
The pipeline will create and populate tiers named according to either the value of speaker_id or, alternatively, speaker_type for each annotation of that set.
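The tier naming described above can be sketched in Python as follows. This is not the pipeline's actual code: it assumes that speaker_id takes precedence when it is set and that speaker_type is the fallback, and it assumes each annotation carries segment_onset and segment_offset fields in milliseconds. The functions tier_name and group_by_tier and the sample annotations are made up for illustration.

```python
def tier_name(annotation):
    """Pick a tier name: speaker_id if set, otherwise speaker_type (assumed order)."""
    for key in ("speaker_id", "speaker_type"):
        value = annotation.get(key)
        if value is not None and str(value).strip() != "":
            return str(value)
    return None


def group_by_tier(annotations):
    """Group (onset, offset) pairs under the tier each annotation would fall into."""
    tiers = {}
    for annotation in annotations:
        name = tier_name(annotation)
        if name is None:
            # No speaker information at all: skipped in this sketch.
            continue
        tiers.setdefault(name, []).append(
            (annotation["segment_onset"], annotation["segment_offset"])
        )
    return tiers


# Made-up annotations; field names and values are assumptions.
annotations = [
    {"segment_onset": 0, "segment_offset": 1200,
     "speaker_id": "CHI1", "speaker_type": "CHI"},
    {"segment_onset": 1500, "segment_offset": 2300,
     "speaker_id": None, "speaker_type": "FEM"},
    {"segment_onset": 2500, "segment_offset": 3100,
     "speaker_id": None, "speaker_type": "FEM"},
]

tiers = group_by_tier(annotations)
for name, segments in tiers.items():
    print(name, segments)
```

Under these assumptions, the first annotation lands in a tier named after its speaker_id, while the other two share a tier named after their speaker_type.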
More resources
Bergelson, E., Warlaumont, A., Cristia, A., Soderstrom, M. & Vandam, M. (2017). A New Workflow for Semi-automatized Annotations: Tests with Long-Form Naturalistic Recordings of Children’s Language Environments. In Proceedings of Interspeech. doi:10.21437/Interspeech.2017-1418