A template repository for publishing an ecoacoustics or bioacoustics recognizer

ecoacoustics/recognizer-template

recognizer-template


This template is an attempt to set up a standard layout for publishing recognizers.

You should fork (make a copy of) this repository. Forking gives you your own copy, owned by you, that you can change.

You can also start a new repository using this template by clicking the button that says Use this template and then selecting Create a new repository.

Getting started

Q: I'm not ready to publish my recognizer

After forking this repository you can make your copy private. See setting repository visibility.

Q: What rights do I have after I publish my recognizer?

That's up to you. If you make your repository private, only you have access.

When it is time to publish your recognizer, you'll need to choose an appropriate license. This repository uses the Apache 2.0 license by default, but there are a number of suitable licenses available. See choosing a license.

Other good choices are:

Q: What is this CITATION.cff file?

The citation file is a new standard recognized by GitHub and other tools as the place to put citation information. You should modify the citation file in your forked repository so that the information is correct.

See help about CITATION files.
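To give a feel for what goes in the file, here is a minimal sketch that renders the four top-level keys a CITATION.cff needs (cff-version, message, title, authors). The function name `render_citation` and the example title and author are hypothetical placeholders, not part of this template; the CITATION.cff already in your fork may contain more fields than this.

```python
import json


def render_citation(title, authors,
                    message="If you use this recognizer, please cite it as below."):
    """Return minimal CITATION.cff text.

    `authors` is a list of (given_names, family_names) tuples.
    json.dumps is used only to produce safely quoted strings,
    which are also valid YAML double-quoted strings.
    """
    lines = [
        "cff-version: 1.2.0",
        f"message: {json.dumps(message)}",
        f"title: {json.dumps(title)}",
        "authors:",
    ]
    for given, family in authors:
        lines.append(f"  - given-names: {json.dumps(given)}")
        lines.append(f"    family-names: {json.dumps(family)}")
    return "\n".join(lines) + "\n"


# Hypothetical example values; replace them with your own details.
print(render_citation("My example recognizer", [("Ada", "Lovelace")]))
```

You can paste the printed text into CITATION.cff, or simply edit the existing file by hand.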

Q: How do I work with my repository on my computer?

You can clone your repository. See cloning a repository.

We recommend using GitHub Desktop.

Q: Will this FAQ be in my published recognizer's readme?

Yeah, but you can delete it! Open the README.md file now and delete this sentence.

Q: This template seems incomplete...

It is! This template is a work in progress. You can help by adding more documentation or by suggesting improvements.

Q: Where can I get help?

Head on over to the discussions tab and ask us a question!

Directory structure

.
├── LICENSE.md      - the license for this recognizer
├── README.md       - the first page people see when they visit the repository
├── CITATION.cff    - citation information for this repository
├── src             - [optional] if you want to publish code with your recognizer,
│                     put it in this folder
├── artifacts       - [optional] if you have a trained model or other artifacts
│                     produced while developing your recognizer, put them in this folder
├── data            - contains or describes your data set
│   ├── training
│   │   ├── xxx     - the name of the species or target you are training on
│   │   ├── yyy     - [optional] further folders containing training samples
│   │   └── zzz     
│   ├── test
│   │   ├── xxx     - the name of the species or target you are evaluating your recognizer against
│   │   ├── yyy     - [optional] further folders containing testing samples
│   │   └── zzz     
│   └── README.md   - information on the included datasets or on how to obtain them
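
If you want to check that your fork still follows this layout, a small script along these lines could help. It is only a sketch: the function name `check_layout` is made up for illustration, and the choice of which entries count as required is an assumption based on the tree above (entries marked [optional] are not checked).

```python
from pathlib import Path

# Entries taken from the directory tree above. Treating these as
# "required" is an assumption of this sketch, not a rule of the template.
REQUIRED_FILES = ["LICENSE.md", "README.md", "CITATION.cff", "data/README.md"]
REQUIRED_DIRS = ["data/training", "data/test"]


def check_layout(repo_root):
    """Return a list of human-readable problems found under repo_root."""
    root = Path(repo_root)
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing folder: {name}")
    return problems


if __name__ == "__main__":
    # Run from the top of your repository.
    for problem in check_layout("."):
        print(problem)
```

An empty result means every expected file and folder was found.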

The data folder

Storing data in a repository is not always the right choice. See the Tips for audio data section below.

In each folder where it is relevant you should include:

  1. Small sets of audio samples
  2. A README.md containing
     • provenance of any data included
     • instructions on how to obtain more data
  3. Any scripts needed to download data from remote repositories
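
As an illustration of the last item, a download script can be quite small. The sketch below uses only the Python standard library; the function name `download_samples`, the file names, and the example URL are hypothetical placeholders that you would replace with the real locations of your samples.

```python
import urllib.request
from pathlib import Path

# Hypothetical placeholder: maps a path inside the data folder to the
# URL the sample should be fetched from. Replace with your own entries.
SAMPLES = {
    "training/xxx/sample_001.wav": "https://example.org/audio/sample_001.wav",
}


def download_samples(samples, data_dir="data"):
    """Download each URL to its relative path under data_dir.

    Files that already exist are skipped, so the script can be re-run.
    Returns the list of paths that were downloaded on this run.
    """
    downloaded = []
    for relative_path, url in samples.items():
        target = Path(data_dir) / relative_path
        if target.exists():
            continue  # already fetched on an earlier run
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))
        downloaded.append(target)
    return downloaded


# Example call (not run here, because the URL above is a placeholder):
# download_samples(SAMPLES)
```

Keeping the list of samples in one dictionary makes it easy to document the provenance of each file next to its URL.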

Tips for audio data

You don't have to store your audio data in your repository. If you don't, you need to ensure that your data is accessible and stored in an appropriate place.

You can store small datasets in this repository.

Don't add audio files directly to Git. Instead, use Git-LFS, which makes it look like you are adding files directly to Git. GitHub offers 1GB of bandwidth per month per user for free. I'd suggest keeping no more than 100MiB of data directly in the repository.
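
To see how close you are to that suggested 100MiB, you could total the size of the audio files in your data folder. This is a rough sketch: the function names and the list of audio file extensions are assumptions, so adjust them to the formats you actually use.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to match your own files.
AUDIO_SUFFIXES = {".wav", ".flac", ".mp3", ".ogg"}
LIMIT_BYTES = 100 * 1024 * 1024  # the 100MiB suggested above


def audio_bytes(data_dir):
    """Total size in bytes of audio files found anywhere under data_dir."""
    return sum(
        path.stat().st_size
        for path in Path(data_dir).rglob("*")
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES
    )


def within_limit(data_dir, limit=LIMIT_BYTES):
    """True if the audio under data_dir fits within the given byte limit."""
    return audio_bytes(data_dir) <= limit
```

For example, `within_limit("data")` returns False once the audio under the data folder exceeds the suggested size, which is a hint to move the bulk of it to one of the storage options listed below.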

For larger datasets you can use:

  • An ecoacoustics repository
    • like Ecosounds, the A2O, or ...others...
  • A bioacoustics repository
    • like Xeno Canto
  • Cloud storage options like Dropbox, OneDrive, etc.
  • Commercial services like Amazon S3, Google Cloud Storage, etc.

Egret, included in this template, can download samples from the internet for you.
