Getting some data
You may have data of your own that you would like to use the package on, or you may know of datasets already in this format that you would like to reuse.
It may be easier to start with an existing dataset. The datasets we know of are listed below. Please note that the large majority of these data are NOT public, so if you cannot retrieve a dataset, you will need to get in touch with its data managers.
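As a rough illustration of what retrieving a dataset can look like, here is a minimal sketch that assumes the dataset is distributed as a DataLad dataset and that the `datalad` command-line tool is installed. The URL, the target directory and the `metadata` sub-path are hypothetical placeholders, not actual locations of the datasets listed on this page. By default the function only prints the commands it would run.

```python
import subprocess


def build_retrieval_commands(source_url, target_dir, content_path="."):
    """Build the two DataLad commands: clone the dataset, then fetch content."""
    # `datalad install` clones the dataset structure, without large file content.
    install_cmd = ["datalad", "install", "--source", source_url, target_dir]
    # `datalad get` downloads the actual content of the requested path;
    # it is meant to be run from inside the installed dataset.
    get_cmd = ["datalad", "get", content_path]
    return install_cmd, get_cmd


def retrieve_dataset(source_url, target_dir, content_path=".", dry_run=True):
    """Print (dry_run=True) or execute (dry_run=False) the retrieval commands."""
    install_cmd, get_cmd = build_retrieval_commands(
        source_url, target_dir, content_path
    )
    if dry_run:
        print(" ".join(install_cmd))
        print(f"(in {target_dir}) " + " ".join(get_cmd))
        return install_cmd, get_cmd
    subprocess.run(install_cmd, check=True)
    subprocess.run(get_cmd, cwd=target_dir, check=True)
    return install_cmd, get_cmd


if __name__ == "__main__":
    # Hypothetical placeholder values: replace them with the location
    # provided by the data managers of the dataset you want.
    retrieve_dataset(
        "https://example.org/path/to/dataset.git",
        "my-dataset",
        content_path="metadata",
    )
```

For a private dataset, the install step is expected to fail until the data managers have granted you access, which is the situation described above.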
Public data sets
We have prepared a public dataset for testing purposes, based on the VanDam Public Daylong HomeBank Corpus. Reference: VanDam, Mark (2018). VanDam Public Daylong HomeBank Corpus. doi:10.21415/T5388S.
From the LAAC team
Name | Authors | Location | Recordings | Duration (h) |
---|---|---|---|---|
Namibia | Gandhi | | 113 | 1449 |
Solomon | Sarah | | 388 | 5954 |
Tsimane 2017 | | | 41 | 556 |
png 2019 | | | 51 | 760 |
Vanuatu | | unavailable | 53 | 289 |
EL1000
The EL1000 dataset contains several corpora accessible upon request.
Other private datasets
We know of no other private datasets at present, but we hope one day to be able to use datalad’s search feature to find such datasets.