GSoC 2020
sktime will be applying as a mentoring organization for Google Summer of Code 2020.
This is our Ideas Page. Join the sktime team for a summer full of coding, learning and fun. Be part of our diverse community and join our efforts to advance machine learning and time series analysis capabilities!
We explicitly encourage female students to apply.
Time series data arises in many applications. Examples include sensor readings from industrial processes, spectroscopy wavelength data from chemical samples, and bedside monitor data from patients. Developing advanced time series analysis capabilities for researchers and practitioners is one of the major challenges of contemporary machine learning.
sktime is a new Python toolbox for machine learning with time series and, to the best of our knowledge, the first unified toolbox for time series. Our ambition is to provide for time series what scikit-learn provides for tabular data. This involves extending scikit-learn to the different time series learning tasks, such as time series classification, clustering, forecasting and anomaly detection. To find out more, check out our paper published at the Workshop on Systems for ML at NeurIPS 2019.
- Read our how to get started guide,
- Try to solve one of the entrance tasks via a PR on GitHub. We will give preference to students who have at least tried to solve one of these tasks,
- Contact us informally to discuss applying, or apply by sending your CV and cover letter to [email protected].
We're actively looking for contributors and your help is extremely welcome. Therefore, if
- you are interested in time series, machine learning (ML), statistics, API design and software architecture,
- you like coding in Python,
- you are familiar with the basic data science ecosystem in Python, including numpy, pandas and scikit-learn,
- you enjoy working with a vibrant team of experienced ML scientists and software engineers,
- you always wanted to join an open-source community,
then GSoC with sktime is for you! You'll spend the summer working with our enthusiastic and open-minded team of developers who are creating one of the first comprehensive time series ML toolboxes out there.
GSoC is a marathon, not a sprint, and we expect good performance over the whole project. This means being in daily contact with your mentors and the wider community, and working full time on the project.
In addition to the individual project work, all students will be required to:
- peer-review a fellow student's work in the middle and at the end of GSoC,
- write weekly blog posts about their contribution and a final summary post at the end of the project,
- have a good time web-socializing with the other students.
Finally, our goal, apart from improving sktime, is to onboard new long-term developers and we would really like you to stay around after GSoC.
Below is a list of topics to help you get started, but please don't hesitate to propose your own topic to work on.
Title | Mentors | Short Description | Difficulty | What you need to know |
---|---|---|---|---|
Time series classification | @TonyBagnall | | beginner | classification with scikit-learn |
Time series regression | | (refactoring classification) | beginner | |
Time series clustering | | (time series distances, kernels, 2nd degree transformers) | beginner | |
Forecasting | @mloning | model selection, composition, reduction | medium | |
Develop a framework | @fkiraly | | hard | |
More project details will follow soon. In the meantime, check out our development roadmap and good first issues!
Name | GitHub | Website |
---|---|---|
Markus Löning | @mloning | |
Tony Bagnall | @TonyBagnall | website |
Jason Lines | @jasonlines | |
Aaron Bostrom | | |
Franz Király | @fkiraly | website |
George Oastler | @goastler | |
More details on mentors will follow shortly.