This guide provides steps on how to contribute new videos to the platform. If you wish to make a contribution, please submit a Pull Request (PR) with the necessary information detailed below.
There are a few scripts available to help you build those data files by scraping the Youtube API. To use them, you must first create a Youtube API Key and add it to your .env file. Here are the guidlines to get a key https://developers.google.com/youtube/registering_an_application
YOUTUBE_API_KEY=some_key
The proposed workflow is to create the data files in the /data_preparation
folder using the scripts. Once you have validated those files and eventually cleaned a few errors, you can copy them to /data
and open a PR with that content.
Everything starts with an organization. An organization is the entity organizing the events.
Add the following information to the data_preparation/organisations.yml
file:
- name: Railsconf
website: https://railsconf.org/
twitter: railsconf
youtube_channel_name: confreaks
kind: conference # Choose either 'conference' or 'meetup'
frequency: yearly # Specify if it's 'yearly' or 'monthly'; if you need something else, open a PR with this new frequency
language: english # Default language of the talks from this conference
default_country_code: AU # default country to be assigned to the associated events
Then run this script:
rails runner scripts/prepare_organisations.rb
This will update your data_preparation/organisations.yml
file with the youtube_channel_id information.
This workflow assumes the Youtube channel is organized by playlist with 1 event equating to 1 playlist. Run the following script to create the playlist file:
rails runner scripts/create_playlists.rb
You will end up with a data structure like this:
data/
├── organisations.yml
├── railsconf
└── playlists.yml
At this point, go through the playlists.yml
and perform a bit of verification and editing:
- Add missing descriptions.
- Ensure all playlists are relevant.
- Ensure the year is correct.
Multi-Events Channels
Some YouTube channels will host multiple conferences. For example, RubyCentral hosts Rubyconf and RailsConf. To cope with that, you can specify in the organization a regex to filter the playlists of this channel. The regex is case insensitive.
Here is an example for RailsConf/RubyConf:
- name: RailsConf
youtube_channel_name: confreaks
playlist_matcher: rails # will only select the playlist where there title match rails
youtube_channel_id: UCWnPjmqvljcafA0z2U1fwKQ
...
- name: RubyConf
youtube_channel_name: confreaks
playlist_matcher: ruby # will only select the playlist where there title match ruby
youtube_channel_id: UCWnPjmqvljcafA0z2U1fwKQ
...
Once your playlists are currated, you can run the next script to extract the video information. It will iterate the playlist and extract all videos.
rails runner scripts/extract_videos.rb
At this point you have this structure
data/
├── organisations.yml
├── railsconf
│ ├── railsconf-2021
│ │ └── videos.yml
│ ├── railsconf-2022
│ │ └── videos.yml
│ ├── playlists.yml
├── speakers.yml
To extract a maximum of information from the Youbute metadata, the raw video information is parsed by a class Youtube::VideoMetadata
. This class will try to extract speakers from the title. This is the default parser but sometime the speakers ar you can create a new class and specify it in the playlists.yml
file.
- id: PL9_jjLrTYxc2uUcqG2wjZ1ppt-TkFG-gm
title: RubyConf AU 2015
description: ""
published_at: "2017-05-20"
channel_id: UCr38SHAvOKMDyX3-8lhvJHA
year: "2015"
videos_count: 21
slug: rubyconf-au-2015
metadata_parser: "Youtube::VideoMetadata::RubyConfAu" # custom parser
Once the data is prepared you can move it to the main /data
folder