Audio processors
Overview
The package provides several tools for processing the recordings.
$ child-project process --help
usage: child-project process [-h] [--threads THREADS]
[--input-profile INPUT_PROFILE]
path {basic,vetting,channel-mapping,standard} ...
positional arguments:
path path to the dataset
{basic,vetting,channel-mapping,standard}
processor
basic basic audio conversion
vetting vetting
channel-mapping channel mapping
standard standard audio conversion
optional arguments:
-h, --help show this help message and exit
--threads THREADS amount of threads running conversions in parallel (0 =
uses all available cores)
--input-profile INPUT_PROFILE
profile of input recordings (process raw recordings by
default)
Basic audio conversion
Converts all recordings in a dataset to a given encoding. The converted audio files are stored in recordings/converted/<profile-name>.
$ child-project process /path/to/dataset basic test --help
usage: child-project process path basic [-h] --format FORMAT --codec CODEC
--sampling SAMPLING [--skip-existing]
[--recordings RECORDINGS [RECORDINGS ...]]
name
positional arguments:
name name of the export profile
optional arguments:
-h, --help show this help message and exit
--format FORMAT audio format (e.g. wav)
--codec CODEC audio codec (e.g. pcm_s16le)
--sampling SAMPLING sampling frequency (e.g. 16000)
--skip-existing
--recordings RECORDINGS [RECORDINGS ...]
list of recordings to process, separated by
whitespaces; only values of 'recording_filename'
present in the metadata are supported.
Example:
child-project process /path/to/dataset basic standard --format=wav --sampling=16000 --codec=pcm_s16le
Processing can be restricted to a whitelist of recordings with the --recordings option:
child-project process /path/to/dataset basic standard --format=wav --sampling=16000 --codec=pcm_s16le --recordings audio1.wav audio2.wav
Values passed to this option must match existing recording_filename values in metadata/recordings.csv.
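Before launching a long conversion, it can help to check the intended --recordings values against the metadata. The sketch below is a standalone illustration and not part of the package: the helper name missing_recordings is hypothetical, the example rows are made up, and the only assumption about metadata/recordings.csv is that it has a recording_filename column.

```python
import csv
import io


def missing_recordings(metadata_file, requested):
    """Return the requested filenames that are absent from the metadata.

    Hypothetical helper: `metadata_file` is an open text file with a
    'recording_filename' column (e.g. metadata/recordings.csv).
    """
    reader = csv.DictReader(metadata_file)
    known = {row["recording_filename"] for row in reader}
    return [name for name in requested if name not in known]


# Stand-in for the contents of metadata/recordings.csv (made-up rows).
example_metadata = io.StringIO(
    "recording_filename,child_id\n"
    "audio1.wav,c1\n"
    "audio2.wav,c2\n"
)
print(missing_recordings(example_metadata, ["audio1.wav", "audio3.wav"]))
# prints ['audio3.wav']
```

With a real dataset, the same function could be called on open("metadata/recordings.csv", newline="") from the dataset root.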
The --skip-existing switch can be used to skip previously processed files.
Standard audio conversion
Same as the basic processor but using standard parameters for the conversion:
- single-channel (first channel is kept)
- 16 kHz sampling rate
- codec pcm_s16le
- wav format
Audio files are exported to recordings/converted/standard.
$ child-project process /path/to/dataset standard --help
usage: child-project process path standard [-h] [--skip-existing]
[--recordings RECORDINGS [RECORDINGS ...]]
optional arguments:
-h, --help show this help message and exit
--skip-existing
--recordings RECORDINGS [RECORDINGS ...]
list of recordings to process, separated by
whitespaces; only values of 'recording_filename'
present in the metadata are supported.
Example:
child-project process . standard
Values passed to the --recordings option must match existing recording_filename values in metadata/recordings.csv.
The --skip-existing switch can be used to skip previously processed files.
Multi-core audio conversion with slurm on a cluster
If you have access to a cluster with slurm, you can use a command like the one below to batch-convert your recordings. Note that you may need to adjust some details for your cluster (e.g. the number of CPUs per task). If needed, refer to the slurm user guide.
sbatch --mem=64G --time=5:00:00 --cpus-per-task=4 --ntasks=1 -o namibia.txt child-project process --threads 4 /path/to/dataset basic standard --format=wav --sampling=16000 --codec=pcm_s16le
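In the command above, --cpus-per-task and --threads carry the same value, and they are easy to let drift apart when adapting the command. The sketch below builds the same argument list from a single cpus setting. It is only an illustration: build_sbatch_command is a hypothetical helper, the default log file name is a placeholder, and every flag is copied from the command above rather than checked against a particular cluster.

```python
import shlex


def build_sbatch_command(dataset, profile, cpus=4, mem="64G",
                         time="5:00:00", log="conversion.txt"):
    """Build the sbatch argument list, reusing `cpus` for --threads.

    Hypothetical helper: flags mirror the example command in the text.
    """
    return [
        "sbatch", f"--mem={mem}", f"--time={time}",
        f"--cpus-per-task={cpus}", "--ntasks=1", "-o", log,
        "child-project", "process", "--threads", str(cpus),
        dataset, "basic", profile,
        "--format=wav", "--sampling=16000", "--codec=pcm_s16le",
    ]


# Print a shell-ready command line for inspection before submitting it.
print(shlex.join(build_sbatch_command("/path/to/dataset", "standard")))
```

The resulting list could be passed to subprocess.run on a machine where sbatch is available.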
Vetting
The vetting pipeline mutes user-specified segments of the recordings while preserving the duration of the audio files. This technique can be used to remove speech that might contain confidential information before releasing the audio.
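To illustrate what muting while preserving duration means, here is a minimal sketch that zeroes samples inside the given time ranges and leaves the number of samples unchanged. It is not the package's implementation: mute_segments is a hypothetical function that works on a plain list of samples from a single channel, with timestamps in milliseconds.

```python
def mute_segments(samples, sampling_rate, segments_ms):
    """Return a copy of `samples` with each (onset, offset) range zeroed.

    Timestamps are in milliseconds; the output has the same length as
    the input, so the duration of the audio is preserved.
    """
    muted = list(samples)
    for onset_ms, offset_ms in segments_ms:
        # Convert milliseconds to sample indices, clamped to the signal.
        start = max(0, onset_ms * sampling_rate // 1000)
        stop = min(len(muted), offset_ms * sampling_rate // 1000)
        for i in range(start, stop):
            muted[i] = 0
    return muted


signal = [1] * 10  # 10 samples at 1000 Hz, i.e. 10 ms of audio
print(mute_segments(signal, 1000, [(2, 5)]))
# prints [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
```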
The input needs to be a CSV dataframe with the following columns: recording_filename, segment_onset, segment_offset.
The timestamps need to be expressed in milliseconds.
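A segments file of this shape can be produced with the standard csv module. The sketch below is an illustration under assumptions: write_segments is a hypothetical helper, the example rows are made up, and the end-of-segment column is assumed to be named segment_offset. Times are given in seconds and converted to milliseconds on writing.

```python
import csv
import io

FIELDNAMES = ["recording_filename", "segment_onset", "segment_offset"]


def write_segments(out_file, segments):
    """Write (filename, onset_s, offset_s) tuples as a segments CSV.

    Onsets and offsets are given in seconds and stored in milliseconds.
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    for filename, onset_s, offset_s in segments:
        writer.writerow({
            "recording_filename": filename,
            "segment_onset": int(round(onset_s * 1000)),
            "segment_offset": int(round(offset_s * 1000)),
        })


# Made-up example: mute 1.5 s to 2.25 s of audio1.wav.
buffer = io.StringIO()
write_segments(buffer, [("audio1.wav", 1.5, 2.25)])
print(buffer.getvalue())
```

To write to disk instead, open the target with open(path, "w", newline="") and pass that path to --segments-path.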
$ child-project process /path/to/dataset vetting test --help
usage: child-project process path vetting [-h] --segments-path SEGMENTS_PATH
[--recordings RECORDINGS [RECORDINGS ...]]
name
positional arguments:
name name of the export profile
optional arguments:
-h, --help show this help message and exit
--segments-path SEGMENTS_PATH
path to the CSV dataframe containing the segments to
be vetted
--recordings RECORDINGS [RECORDINGS ...]
list of recordings to process, separated by commas;
only values of 'recording_filename' present in the
metadata are supported.
Channel mapping
The channel mapping pipeline is meant to be used with multi-channel audio recordings, such as those produced by the BabyLogger. It lets you filter or combine channels from the original recordings as needed.
$ child-project process /path/to/dataset channel-mapping test --help
usage: child-project process path channel-mapping [-h] --channels CHANNELS
[CHANNELS ...]
[--recordings RECORDINGS [RECORDINGS ...]]
name
positional arguments:
name name of the export profile
optional arguments:
-h, --help show this help message and exit
--channels CHANNELS [CHANNELS ...]
lists of weigths for each channel
--recordings RECORDINGS [RECORDINGS ...]
list of recordings to process, separated by commas;
only values of 'recording_filename' present in the
metadata are supported.
In mathematical terms, assume the input recordings have \(n\) channels with signals \(s_{j}(t)\). If the output recordings should have \(m\) channels, the user defines a matrix of weights \(w_{ij}\) with \(m\) rows and \(n\) columns, such that the signal of each output channel \(s'_{i}(t)\) is:
\[ s'_{i}(t) = \sum_{j=1}^{n} w_{ij} \, s_{j}(t) \]
The weights matrix is defined through the --channels parameter.
The weights for each output channel are separated by blanks. For a given output channel, the weights of each input channel should be separated by commas.
For instance, if one would like to use the following weight matrix (which transforms 4-channel recordings into 2-channel audio):
\[ \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0.5 & 0.5 & 0 & 0 \end{pmatrix} \]
Then the correct value for the --channels parameter should be:
--channels 0,0,1,1 0.5,0.5,0,0
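The mapping from --channels values to output signals can be sketched in a few lines of Python. This is an illustration of the weighted sum described above, not the package's code: parse_channels and mix are hypothetical helpers, and mix handles a single frame (one sample per input channel) rather than a whole file.

```python
def parse_channels(args):
    """Turn ['0,0,1,1', '0.5,0.5,0,0'] into rows of float weights.

    Each string is one output channel; commas separate the weights of
    the input channels.
    """
    return [[float(w) for w in row.split(",")] for row in args]


def mix(weights, frame):
    """Compute each output sample as sum over j of w[i][j] * frame[j]."""
    if any(len(row) != len(frame) for row in weights):
        raise ValueError("each row needs one weight per input channel")
    return [sum(w * s for w, s in zip(row, frame)) for row in weights]


# Same values as the --channels example above, split on blanks.
weights = parse_channels("0,0,1,1 0.5,0.5,0,0".split())
print(mix(weights, [2.0, 4.0, 1.0, 3.0]))
# prints [4.0, 3.0]
```

Here the first output channel is the sum of input channels 3 and 4, and the second is the average of input channels 1 and 2.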
To make things clear, we provide a couple of examples below.
Muting all channels except for the first
Let’s assume that the original recordings have 4 channels. The following command will extract the first channel from the recordings:
child-project process /path/to/dataset channel-mapping channel1 --channels 1,0,0,0
Swapping the channels of a stereo signal
Let’s assume that the original recordings are stereo signals, i.e. they have two channels. The command below will flip the two channels:
child-project process /path/to/dataset channel-mapping channel1 --channels 0,1 1,0