Google Summer of Code 2023 Ideas

yfprojects edited this page Apr 7, 2023 · 86 revisions

❗ The application deadline has passed. The accepted projects will be announced by Google on May 4th. ❗

Introduction

This idea page is for multiple Borg-related projects listed below. In short what each project does:

  • Borg is a Python-based file backup tool. It can also compress, encrypt/authenticate and deduplicate the data. Repository for Borg
  • Borgmatic is a CLI wrapper for it that keeps settings and runs pre- and post-backup tasks. It basically helps manage your Borg repos, settings and tasks that go with it. It also deals with monitoring. Repository for Borgmatic
  • Vorta is a desktop GUI for Borg. It sits in your task bar and runs backups in the background. You can also use it to view and restore different versions of your files. Vorta officially supports Linux and macOS, while the community is currently testing Vorta on Windows. Our latest release was downloaded over 9000 times via Flathub or the macOS installer. However, many users install Vorta from the package repository of their Linux distribution. Repository for Vorta

In addition we have an Ansible role and a Docker image to make installing Borg and Borgmatic easier on servers.

Technologies used: All projects use Python as main language. Vorta also relies on Qt for the GUI parts. The devops tasks use Ansible and Docker.

Join the Borg Collective for Google Summer of Code 2023

GSoC is a program that allows students and open source beginners to learn contributing to an open-source project while receiving a stipend from Google, and mentorship from open-source software developers. For details about this year's GSoC, please refer to this page.

The Borg Collective is part of the Python Software Foundation. Please read about our expectations.

Getting Started

  1. Get in touch on Github or IRC and tell us that you are interested in one of our projects. This doesn't represent any commitment to apply as a contributor. You can choose to apply for a different project at any time.
  2. Select an easy issue labelled good first issue and try solving it (view such issues in each project: Vorta, Borg, Borgmatic). This allows you to get to know the project and its source code, and helps you determine whether it suits you and whether you want to apply as a contributor. Solving an issue and opening a pull request (PR) for it is also required for applying.
  3. Before working on an issue, please comment and make sure it's assigned to you, so no duplicate work is done. Also outline how you plan on solving the issue, so you are on the right track from the start. We can give you some hints on how to go about it, but we also want to see how you manage on your own.
  4. Open a PR fixing the issue. We will then review it as usual. Don't feel discouraged if the reviewer requests changes to your PR. This doesn't question your skills. We just want the code to meet our (formal) standards.
  5. When you are sure that you want to apply as a contributor to one of our projects, it is time to discuss the project ideas that you are interested in with your prospective mentors on IRC/Matrix. We will set up a private room with the relevant members. Make sure to communicate your current knowledge and expertise so that your prospective mentors can guide you adequately. After this step you should have a clear understanding of the task you will apply for, including the expected results, the steps towards that goal and the time effort.
  6. Your prospective mentors will then help you with writing your application. It is very important to check in with them on your application before submitting. Otherwise your likelihood of being accepted shrinks drastically. To get quick feedback it's best to share your application using something like Google Docs with comments enabled.

During all steps communication is key. Talk to the mentors as well as other community members, and take their feedback and advice into account. You can find more information about the process alongside tips on the website of the PSF.

To start coding, see the existing contributor guides for each project and set up your development environment:

Feel free to ask questions. On your journey you might encounter unclear, insufficient or missing documentation. Don't hesitate to point out the problems, so we can help you out as well as improve the documentation of our projects.

Project Ideas

These are tasks you can work on. You can combine any number of tasks, so they add up to a full- or part-time project. You can also suggest your own tasks to supplement the ones below, especially if they are related. Keep in mind that this is not about picking the easiest tasks, but learning something that will be useful later. It's best to pick tasks from a single project, so you can work with the same mentor the whole time.

Vorta

Remove Paramiko dependency (Done ✅)

Difficulty: Easy
Length: 25 hours
Skills required: Python, Unix, OpenSSH
Description: It's easy to add a dependency but hard to remove it. We found that it's not really essential for our application to parse each SSH key a user has. So this task would remove Paramiko and just do a rudimentary check to see if a file is a private SSH key.
Task outline: Research and confirm the format of OpenSSH keys. Build a function to identify one, given a file path. Replace usage of Paramiko with this function. Also includes test cases for all steps.
Additional details: See this issue
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian
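
The rudimentary check described in this task could look roughly like the sketch below. It only inspects the first line of a file for a well-known private key header. The function name and the exact set of accepted headers are assumptions made for illustration, not what Vorta ended up shipping.

```python
# Header lines that commonly open a private key file. The OpenSSH and PEM
# markers are standard; which of them Vorta should accept is an assumption.
PRIVATE_KEY_MARKERS = (
    "-----BEGIN OPENSSH PRIVATE KEY-----",
    "-----BEGIN RSA PRIVATE KEY-----",
    "-----BEGIN EC PRIVATE KEY-----",
    "-----BEGIN DSA PRIVATE KEY-----",
    "-----BEGIN PRIVATE KEY-----",
    "-----BEGIN ENCRYPTED PRIVATE KEY-----",
)


def looks_like_private_key(path):
    """Return True if the first line of the file is a known private key header."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            first_line = handle.readline().strip()
    except OSError:
        # Missing paths, directories and unreadable files are not keys.
        return False
    return first_line in PRIVATE_KEY_MARKERS
```

Such a check never parses the key material itself, which is what makes dropping the Paramiko dependency possible.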

Enhance archive table and archive actions

Difficulty: Easy
Length: 50 hours
Skills required: Python, Qt
Description: Vorta comprises a table listing the archives in a borg repository. There also is a button for renaming the selected archive in another dialog. This task should implement the capability to edit the archive name inline (in the table cell) without having to open another dialog. This task should also add a column showing whether an archive was created by the user manually or by the scheduler. Beneath the archive table there is a compact button running borg compact. However this borg feature is only available since borg v1.2. The button must therefore be hidden from the GUI when using earlier borg versions. There is also a button for refreshing the selected archive data from the repository. This button should work when selecting multiple archives too. Vorta allows mounting a selected archive. Implement the option to copy the mount location to the clipboard, open the file manager at the mount location and a setting for doing that automatically after mounting. Currently the user has to select a folder to mount to. This can be very time consuming. This task should also add the possibility to 'quick mount' a repository into a temporary directory created by vorta.
Task outline: Add additional column to the archive table. Implement edit functionality. Hide compact button for borg versions <1.2. Implement refreshing multiple archives. Implement copying mount path to the clipboard. Implement opening the mount location in default file explorer. Implement automatically opening mount location (+ corresponding setting). Implement quick mount action.
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian
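
As an illustration of the version-dependent part of this task, here is a minimal sketch of deciding whether the compact button should be shown. It assumes the version is available as text such as the output of `borg --version`; the helper names are hypothetical and how Vorta actually stores the detected version may differ.

```python
import re


def parse_borg_version(version_output):
    """Extract (major, minor, patch) from text such as 'borg 1.2.4'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return tuple(int(part) for part in match.groups())


def supports_compact(version_output):
    """borg compact is only available since borg v1.2."""
    return parse_borg_version(version_output) >= (1, 2, 0)
```

In the GUI the result could then be passed to the button's `setVisible()` method.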

Search a file in diff and extract view

Difficulty: Easy
Length: 100 hours
Skills required: Python, Qt
Description: Vorta has a feature for comparing two archives (backup snapshots) and a feature for extracting specific files from an archive. The dialogs of these are very similar. Each consists of a list of files and some view mode options.
This task adds the option to filter the files shown through a search bar. At first one should be able to search for files containing the string entered but advanced search options like filtering by file size or other attributes could be implemented as well.
Task outline: Plan out how the search bar should work and how it would fit into the existing GUI. Implement the search functionality in the backend (FileItemModel). Implement the GUI addition. Write unit tests for the added code.
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian
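
The filtering logic behind such a search bar can be sketched independently of Qt. The example below assumes a simplified, hypothetical `FileEntry` record instead of Vorta's real FileItemModel items. It matches the query as a case-insensitive substring of the path and optionally applies a minimum file size.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    """Hypothetical stand-in for one row of the file list."""
    path: str
    size: int  # size in bytes


def filter_entries(entries, query, min_size=None):
    """Keep entries whose path contains the query and that are large enough."""
    needle = query.casefold()
    result = []
    for entry in entries:
        if needle and needle not in entry.path.casefold():
            continue
        if min_size is not None and entry.size < min_size:
            continue
        result.append(entry)
    return result
```

In a Qt view the same predicate would typically live in a proxy model such as QSortFilterProxyModel, though whether that fits the existing models is part of the planning step.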

Error handling and report dialog/wizard

Difficulty: Medium
Length: 100 hours
Skills required: Python, Qt, Unix
Description: There is already some code to catch errors and display a simple dialog. However, it isn't used consistently. This task would add a dialog that displays errors in a user-friendly way and a wizard for reporting an issue on GitHub. This task would also ensure we catch long Borg errors (maybe via return code).
Task outline: Get familiar with the kind of errors encountered in vorta and how they are handled. Plan out the use cases and the functionalities of the error dialog and report wizard. Draw a mockup of the new GUI. Implement the GUI as a Qt UI file. Do the coding needed to make the GUI functional. Adjust the existing vorta code to use/work with the new dialog.
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian

Improve Test Coverage

Difficulty: Medium
Length: 100 hours
Skills required: Python, Pytest
Description: It's fun to add new features, but the actual work is maintaining them over time, as the code around it changes. This task would aim to increase the coverage output by the coverage tool from ~65% to ~80% by cleaning up existing tests, using parameterization and adding more unit tests (as opposed to higher-level integration tests we use now).
Task outline: Look at each Vorta package and module to find corresponding existing tests. Analyse the test coverage and determine which additional tests are needed. Then use consistent file naming for existing tests and add missing tests (especially unit tests).
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian

Implement summary line for table in source view

Difficulty: Medium-hard
Length: 100 hours
Skills required: Python, Qt, (basic reading C++)
Description: The user can configure backup sources in vorta like files or directories. These sources are displayed in a table including information about their size and number of files. We would like the table to have a summary line showing the sum of all sources for these fields. This task will require you to dig deep into Qt documentation and possibly Qt source code to find a way of implementing a summary line since such a thing isn't supported by Qt out of the box.
Task outline: Research writing custom Qt widgets and extending (subclassing) existing ones. Research a way to extend QTableView for showing a summary line. Implement summary line. Use custom TableView in the GUI.
Additional details: See discussion #1231
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian

Run backups as root

Difficulty: Medium-hard
Length: 100 hours
Skills required: Python, Unix
Description: Vorta was first built for user backups. Some users also want to run it as root and back up the whole system. There are minor blocking issues to achieve this. This task would resolve those and write documentation for it. This is a large task that needs additional Linux knowledge.
Task outline: Research ways of running vorta as root on Linux and possible ways to only run borg subcommands as root (polkit, sudo, root install, ...). Weigh up the advantages and disadvantages of the options and decide on a way to implement. Implement support for root backups.
Additional details: See issues #801 and #1482
Possible mentors: @real-yfprojects, @m3nu

Test on live Borg binary

Difficulty: Medium
Length: 100-175 hours
Skills required: Python, Unix, Shell
Description: Currently we test on static mock files of Borg JSON output. That means our tests don't actually run Borg, but take some existing output. This is not optimal because we already support 3 major Borg versions and can only add mock files for one. This task would improve testing to run on multiple actual Borg versions/binaries.
Task outline: Research how to create multiple environments with different borg versions to run tests in. This might work differently in the CI (on the server) and local developer machines. Build a testing utility that can run the existing Vorta tests on multiple versions of Borg. (You can use a tool like Tox or write your own script.)
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian

Add a log viewer / analyser

Difficulty: Medium
Length: 125-150 hours
Skills required: Python, Qt
Description: Currently Vorta just shows 0, 1 or 2 to inform about borg's return code and users have to wonder about what that could mean. It needs to offer a borg log file view, so that users can find more information about problems and also about successful runs. The log file view could colour the log lines according to their log level (e.g. display ERROR in red). A more advanced log line highlighter would be even better. The log viewer should also allow filtering log lines by application session, borg command and other attributes.
Task outline: Plan out how the log file viewer would integrate into the existing GUI. Plan out how the log file viewer should work from a user perspective. Draw a GUI mockup. Implement the GUI as a Qt UI file. Implement the logic of the new dialog. Implement filtering logs. Implement log syntax highlighting. Write unit tests throughout the process.
Additional details: See issue #1483
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian
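
A minimal sketch of the colouring and filtering idea, kept separate from any GUI code. It assumes each log line contains one of Python's standard level names as a whole word. The colour choices and function names are illustrative only.

```python
import re

# Standard Python logging level names, ordered from least to most severe.
# The colours are illustrative assumptions.
LEVEL_COLOURS = {
    "DEBUG": "grey",
    "INFO": "black",
    "WARNING": "orange",
    "ERROR": "red",
    "CRITICAL": "red",
}
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def level_of(line):
    """Return the first level name found in the line, or None."""
    match = LEVEL_PATTERN.search(line)
    return match.group(1) if match else None


def colour_for(line, default="black"):
    """Pick a display colour for a log line based on its level."""
    return LEVEL_COLOURS.get(level_of(line), default)


def filter_by_level(lines, minimum="WARNING"):
    """Keep only lines whose level is at least as severe as `minimum`."""
    order = list(LEVEL_COLOURS)  # dicts preserve insertion order
    wanted = set(order[order.index(minimum):])
    return [line for line in lines if level_of(line) in wanted]
```

Filtering by session or borg command would need additional fields in the log format, which is part of what the planning steps have to settle.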

Implement profile sidebar

Difficulty: Medium
Length: 125-175 hours
Skills required: Python, Qt
Description: Vorta allows configuring profiles. A profile defines a number of sources, a backup destination, a schedule and some more settings. Currently there is a ComboBox for selecting a profile at the top of the window alongside buttons for adding or removing profiles. All profiles are active at the same time. However you can only edit the profile selected. This happens in the tabs below. The Misc tab is an exception. It contains the global settings that aren't associated to a profile.
In this task you will implement a sidebar next to the tab widget that replaces the current way of selecting a profile to edit. The sidebar should be comprised of a list with the configured profiles that can be selected and a visually distinct Misc item that gives access to the global settings. The old Misc tab will be removed in favour of this new user interface. The new Misc item now provides space for an about view and a way to manage profiles configured in Vorta.
Task outline: Implement the profile list that will be part of the sidebar. Add buttons for editing the profile list. Remove the old profile management GUI elements. Implement Misc item and the corresponding view. Move global settings to the new Misc view. Create an About tab and add it to the new Misc view. Implement a repository management tab and add it to the new Misc view. Write unittests for the new GUI parts.
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian

Advanced borg command hooks

Difficulty: Medium
Length: 150-175 hours
Skills required: Python, Unix, basic Qt
Description: Vorta already allows specifying a custom command before and after borg create is run. We would like to have a more versatile way to run custom commands for different hooks. For example, some users want to mount a drive before every borg command and unmount it afterwards. There should be hooks for these use cases. The custom commands could also be able to modify the behaviour of Vorta, e.g. through their output or by setting environment variables. The Vorta community has already thought about different ways of implementing such a feature.
Our favourite one includes a script the user can configure that is called before and after every command. Vorta provides the script with information about the profile, repository and the hook the script is called for. The user can freely configure which commands the script executes for a given hook or set of arguments.
Task outline: Read this thread. Inform yourself about the use cases this feature might be able to help with. Plan out the details of how this feature should work. Implement GUI widget to configure the script. Write code that makes vorta run the script for each hook and provides it with the correct arguments. Write unittests for the implemented feature(s). Write script template that allows users to get started quickly with configuring custom commands for common use cases. Write documentation for this feature describing how to use it and how it works from a user perspective.
Additional details: See issue #379
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian
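
To make the favoured design more concrete, here is a sketch of what a user-side hook script might look like. Everything about the interface is an assumption: the `--hook`, `--profile` and `--repo` arguments, the hook names and the mount point are hypothetical, since the way Vorta passes information to the script is exactly what this task has to define.

```python
import argparse
import subprocess

# Hypothetical hook names mapped to the commands a user wants to run.
COMMANDS = {
    "pre-command": ["mount", "/mnt/backup"],
    "post-command": ["umount", "/mnt/backup"],
}


def build_parser():
    """Arguments the script expects from Vorta (assumed interface)."""
    parser = argparse.ArgumentParser(description="Example hook script (sketch)")
    parser.add_argument("--hook", required=True)
    parser.add_argument("--profile", required=True)
    parser.add_argument("--repo", required=True)
    return parser


def command_for(argv):
    """Look up the command configured for the hook named in argv."""
    args = build_parser().parse_args(argv)
    return COMMANDS.get(args.hook)


def main(argv, run=subprocess.call):
    """Run the configured command and return its exit code."""
    command = command_for(argv)
    if command is None:
        return 0  # nothing configured for this hook
    return run(command)
```

A real script would call `main(sys.argv[1:])`. The `run` parameter exists only so the logic can be exercised without actually mounting anything.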

Implement exclude GUI

Difficulty: Medium-hard
Length: 175 hours
Skills required: Python, Qt, Unix desktop
Description: Currently users exclude files by adding text rules, like /tmp/cache/*.tmp. It is often confusing and we'd like to add a GUI, as well as pre-defined rules for it. E.g. one could choose to exclude common macOS cache files.
Task outline: Create a list of sensible default files to exclude. Possibly grouped, so users can choose to enable parts of them. Decide on way to store exclusions. Do mockup of GUI (partly done). Implement as Qt UI file. Implement parsing exclusion rules to Borg input.
Additional details: See this issue for discussions about the GUI part and here for suggested exclusion rules.
Possible mentors: @real-yfprojects, @m3nu, @Hofer-Julian
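
The step from grouped presets to Borg input could be sketched as follows. The preset groups and their patterns are made-up placeholders, since the actual defaults are an outcome of this task. The output uses the `--exclude PATTERN` option of `borg create`.

```python
# Hypothetical preset groups; real defaults would be researched in the task.
PRESETS = {
    "macOS caches": ["*/Library/Caches", "*/.DS_Store"],
    "Python bytecode": ["*/__pycache__", "*.pyc"],
}


def build_exclude_args(enabled_groups, custom_patterns=()):
    """Turn enabled preset groups and custom rules into borg arguments."""
    patterns = []
    for group in enabled_groups:
        patterns.extend(PRESETS[group])
    patterns.extend(custom_patterns)

    args = []
    seen = set()
    for pattern in patterns:
        if pattern in seen:
            continue  # skip duplicates while keeping the original order
        seen.add(pattern)
        args.extend(["--exclude", pattern])
    return args
```

Keeping presets as named groups lets the GUI offer a checkbox per group while users can still add text rules such as /tmp/cache/*.tmp.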

Other issues

You can also come up with your own ideas to implement or choose to solve any other existing issue. Discuss your ideas with your prospective mentors.

Borg

Add --format option to borg diff command

Difficulty: Easy
Length: 25 hours
Skills required: Python, Unix
Description: The similar borg list command already has a format option. This small task would add the same to borg diff.
Task outline: This is a great starter task to find your way around the Borg codebase and make small edits.
Additional details: See this issue
Possible mentors: @m3nu, backup mentor: @ThomasWaldmann
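
The general idea of a format option can be illustrated with Python's own string formatting. This is only a sketch: the `change` and `path` keys are hypothetical, and the placeholders `borg diff` would really offer are for the task to decide. `{NL}` mirrors the newline key that borg list format strings use.

```python
def format_diff_entry(template, entry):
    """Fill a user-supplied template with the fields of one diff entry."""
    return template.format(NL="\n", **entry)


# Hypothetical diff entry with made-up field names.
example = {"change": "modified", "path": "etc/hosts"}
line = format_diff_entry("{change} {path}{NL}", example)
```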

Update shell completions for Borg v2

Difficulty: Easy
Length: 25 hours
Skills required: Python, Shell
Description: For the next release of Borg, a few commands have changed subtly. This small task would update shell completions for popular shells, like Zsh.
Task outline: Review existing completions and adjust and test them for Borg v2
Additional details: See this issue
Possible mentors: @m3nu, @Hofer-Julian, backup mentor: @ThomasWaldmann

Update intro video

Difficulty: Easy
Length: 25 hours
Skills required: Shell, reStructuredText
Description: Borg's website shows a quick intro video that demonstrates how Borg works at a high level. There was already some work to automate the creation of this video, but it still needs an update to cover the upcoming v2 release.
Task outline: Review the existing script to create videos and adjust them for Borg v2
Additional details: See this issue
Possible mentors: @m3nu, backup mentor: @ThomasWaldmann

Comparison with other backup tools

Difficulty: Easy
Length: 175 hours
Skills required: Unix, Shell, Google research
Description: This would help users decide on the backup tool to use. Naturally there is no single best option for all use cases. This large task includes less programming and more research and qualitative work. You may need to write some scripts for benchmarking.
Task outline: First step would be to determine the features to use for comparing. Then do qualitative research to find them for each backup tool. Finally set up a benchmarking process for quantitative comparison.
Additional details: Some work on this was already done for our docs and a recent benchmark.
Possible mentors: @m3nu, backup mentor: @ThomasWaldmann

Difficulty: Medium
Length: 350 hours
Skills required: Python, Unix
Description: This is a large task, but doesn't require in-depth Borg knowledge. It would involve adding support for additional hardlink-based backup tools, so that their users can migrate their existing backups into Borg.
Task outline: Together with Borg maintainers you will decide on which backup tools to support for importing. Then research their format and add the import code and test cases.

Possible mentors: @m3nu, backup mentor: @ThomasWaldmann

Other issues tagged help wanted and good first issue

Borgmatic

Restore a database backup to a different database or server

Difficulty: Easy
Length: 30 hours
Skills required: Python, Linux, relational database administration basics (PostgreSQL and/or MySQL/MariaDB)
Description: Today, borgmatic supports creating database backups and restoring those backups to the same database. However, this doesn't support use cases like: 1. Running a test restore to another database or server without impacting the production database, or 2. In the event of a database loss, restoring a database to a newly created replacement server with a different hostname and/or credentials.
Task outline: Add command-line support to borgmatic's restore action to override the restored database name, hostname, port, username, password, etc. and plumb those values through to the restore commands. Additionally, manually test on all database types borgmatic supports, update the existing documentation, and add/update tests for these features.
Additional details: See the ticket
Possible mentors: @witten, @real-yfprojects (backup mentor)

Wrap all Borg sub-commands with borgmatic actions

Difficulty: Easy
Length: 90 hours
Skills required: Python, Linux
Description: borgmatic is effectively a wrapper around Borg backup, providing additional features like a configuration file, database integration, etc. But borgmatic only wraps a fraction of the sub-commands that Borg provides. And for those that it does wrap, it doesn't necessarily support all command-line flags as borgmatic options. Users can always drop back down to running Borg directly for those missing sub-commands (or use the borgmatic borg action), but that doesn't provide all the conveniences of borgmatic and its configuration file.
Task outline: Implement borgmatic actions for all Borg sub-commands that are not yet implemented. For each Borg flag within those sub-commands, decide whether it makes sense to add a new borgmatic configuration option for it—or whether it would be more appropriate as a borgmatic action command-line flag. Also as part of this work, consider implementing missing flags/options on existing borgmatic actions.
Additional details: Not all Borg sub-commands make sense to wrap. For instance, Borg invokes borg serve internally, and there's likely not a good use case for running it via borgmatic. Similarly, some Borg flags like --info and --debug shouldn't be exposed directly via borgmatic configuration options or command-line flags, because borgmatic uses them implicitly (e.g. via --verbosity) without exposing them to the end-user.
Possible mentors: @witten, @real-yfprojects (backup mentor)

Bootstrap a borgmatic restore from nothing

Difficulty: Medium
Length: 80 hours
Skills required: Python, Linux, relational database administration basics (PostgreSQL and/or MySQL/MariaDB)
Description: borgmatic is great for restoring the occasional accidentally deleted file or database, but it's not as streamlined for the use case of spinning up an entire replacement system from nothing in the case of catastrophic server loss. That's because in order to function, borgmatic needs a configuration file with a list of Borg repositories to restore from, databases to restore, credentials, etc.
Task outline: Design and implement a better, more streamlined workflow for bootstrapping the restore of backed up files and databases to a totally blank system. This could include approaches like: 1. Automatically storing borgmatic configuration in a canonical location in any backup archive such that it can be restored or used upon restore, 2. Automatically storing borgmatic database configuration along with each backed up database dump so as to facilitate restore, and/or 3. Implementing a new borgmatic action like borgmatic bootstrap to provide the user with an entry point to some of these features. Additionally, update the documentation and tests accordingly.
Additional details: See a related ticket
Possible mentors: @witten, @real-yfprojects (backup mentor)

Backup and restore SQLite databases (Done ✅)

Difficulty: Medium
Length: 125 hours
Skills required: Python, Linux, relational database administration basics (especially SQLite)
Description: borgmatic supports creating database backups and restoring those backups for a handful of database types: PostgreSQL, MySQL/MariaDB, and MongoDB. But many users run applications against SQLite databases which, while lighter weight and lacking a daemon process, still are subject to many of the same consistency concerns that a more full-fledged database system has.
Task outline: Implement a borgmatic SQLite database hook (modeled after one of the existing database hooks) so that users can dump and restore consistent SQLite database snapshots with borgmatic. Additionally, update the existing documentation and tests accordingly.
Additional details: See the ticket.
Possible mentors: @witten, @real-yfprojects (backup mentor)
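
To show why a plain file copy is not enough and what a consistent snapshot means here, the sketch below uses the online backup API from Python's standard `sqlite3` module. It is not borgmatic's hook implementation, and the function name is made up. An actual hook might equally shell out to the sqlite3 command-line tool.

```python
import sqlite3


def snapshot_sqlite(source_path, dump_path):
    """Copy a live SQLite database to dump_path as one consistent snapshot."""
    source = sqlite3.connect(source_path)
    try:
        target = sqlite3.connect(dump_path)
        try:
            # Connection.backup() copies the database page by page while
            # coordinating with other connections, unlike a raw file copy.
            source.backup(target)
        finally:
            target.close()
    finally:
        source.close()
```

The resulting file at `dump_path` can then be included in a backup archive and copied back in place on restore.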

Perform ZFS filesystem snapshots

Difficulty: Hard
Length: 150 hours
Skills required: Python, Linux, relational database administration basics (PostgreSQL and/or MySQL/MariaDB)
Description: borgmatic creates backup archives by reading a collection of files on a filesystem. However, this can result in inconsistent backups if the files change as borgmatic/Borg are reading them, which may pose problems for some users / use cases. A common solution today is to take advantage of the snapshotting capabilities of various filesystems like ZFS to produce a consistent view of the filesystem from a particular point in time, but those snapshots are not natively integrated with borgmatic.
Task outline: Implement native ZFS support in borgmatic such that a snapshot is created before backups occur and then the snapshot is removed / cleaned up afterwards. This should ideally be integrated with borgmatic via a new general-purpose filesystem snapshotting hook internal API (analogous to the existing database hook internal API already in place) so that subsequent projects can support other filesystems. Additionally, update the documentation and tests accordingly.
Additional details: See the ticket
Possible mentors: @witten, @real-yfprojects (backup mentor)
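
The create-then-clean-up flow from the task outline maps naturally onto a context manager. The sketch below builds on the `zfs snapshot` and `zfs destroy` commands. The function names, and the idea of injecting a `run` callable, are assumptions for illustration and say nothing about how borgmatic's hook API would be shaped.

```python
import contextlib
import subprocess


def run_checked(command):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


@contextlib.contextmanager
def zfs_snapshot(dataset, name, run=run_checked):
    """Create dataset@name, yield its full name, and always destroy it afterwards."""
    snapshot = f"{dataset}@{name}"
    run(["zfs", "snapshot", snapshot])
    try:
        yield snapshot
    finally:
        # Clean up even if the backup inside the with-block failed.
        run(["zfs", "destroy", snapshot])
```

A backup would run inside `with zfs_snapshot("tank/home", "borgmatic") as snapshot:` and read files from the snapshot instead of the live filesystem. The dataset and snapshot names here are examples.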

Also see good first issues.

DevOps and Packaging

For the Borg+Borgmatic Docker image and Ansible role

Remove antiquated msmtp and ntfy images

Difficulty: Easy
Length: 25 hours
Skills required: Docker, Linux, Shell
Description: This has been a point of confusion for users for many years. Recently we started including apprise, which fulfills the duty of both msmtp and ntfy in a simpler package.
Task outline: Remove those applications, update documentation and prepare a release to notify users about it.
Additional details: -
Possible mentor: @grantbevis, @m3nu (backup mentor)

Add Systemd timer in addition to Cron timer (DONE)

Difficulty: Easy
Length: 45 hours
Skills required: Ansible, Linux, Systemd
Description: Our role currently installs a cron file to run regular backup tasks. As more distributions move to Systemd, it makes sense to move to more flexible Systemd timers for scheduling backups and checking tasks.
Task outline: Write a Systemd timer file and service file. Test them on a live system. Then add Ansible tasks to install them and ensure the role passes its tests.
Additional details: See this issue
Possible mentors: @m3nu, @Hofer-Julian, @grantbevis (backup mentor)

Add support for reduced NICE priority (DONE)

Difficulty: Medium
Length: 45 hours
Skills required: Ansible, Linux, Systemd
Description: This is a bit related to the "Add Systemd timer" task. As you may know, it's possible to set a NICE level for Linux processes to reduce their priority. In addition Systemd supports setting CPU and IO limits. This small to medium task would add support for running backups with a reduced priority to the Ansible role. The benefit would be that backups can run in the background without interfering with normal server operation.
Task outline: Research and test options to set NICE priority with Cron and Systemd. Then adjust Ansible playbooks and templates and add documentation for it.
Additional details: See this issue
Possible mentors: @m3nu, @grantbevis (backup mentor)

Add more validations to Molecule tests

Difficulty: Medium
Length: 75 hours
Skills required: Ansible, Molecule, Docker, Linux
Description: We already use Molecule to test this role, but only do a few basic validations. This task would add more validations and maybe run a local backup to make sure the role works well.
Task outline: Research how verification is done at similar projects. Then make a list of things we like to test. Then see which ones we can test in the context of a Docker test.
Additional details: See this issue for more.
Possible mentors: @m3nu, @grantbevis (backup mentor)

Create a minimal/extended docker image setup

Difficulty: Medium
Length: 90 hours
Skills required: Docker, Linux, Shell
Description: There was a recent request for a minimal no-frills base image plus an extended Docker image with more tools included. We would probably also add the ability to provide a variable containing a string of additional packages users would like installed. entry.sh would then process this variable and install the additional tools.
Task outline: Develop new multi-stage Docker file and add build workflow based on Github Actions
Additional details: See this issue
Possible mentor: @grantbevis, @m3nu (backup mentor)

Add CI for the docker images

Difficulty: Medium
Length: 90 hours
Skills required: Docker, Linux, Shell
Description: Add linting (hadolint) and test builds, as we currently only utilise CD to deploy the image to the GitHub/Docker Hub registries. It might be worth investigating whether we could run the borgmatic unit/integration tests using the docker-borgmatic image, as they don't run there currently.
Task outline: Research Docker linting and testing best-practices and see how similar projects are dealing with it. Then add the same to our release process, likely using Github Actions, but staying platform agnostic where possible.
Additional details: -
Possible mentor: @grantbevis, @m3nu (backup mentor)