Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build: allow to install packages with apt #8065

Merged
merged 12 commits into from
May 12, 2021
31 changes: 30 additions & 1 deletion docs/config-file/v2.rst
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ This is to avoid typos and provide feedback on invalid configurations.

.. contents::
:local:
:depth: 1
:depth: 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this will spam the TOC too much -- seems fine tho?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was between 2 or 3, but wanted to have build.apt_images and sphing.builder visible instead of just having sphinx and build, but no strong opinion.


version
~~~~~~~
Expand Down Expand Up @@ -303,6 +303,9 @@ Configuration for the documentation build process.

build:
image: latest
apt_packages:
- libclang
- cmake

python:
version: 3.7
Expand All @@ -323,6 +326,32 @@ as defined here:
* `stable <https://github.com/readthedocs/readthedocs-docker-images/tree/releases/5.x>`_: :buildpyversions:`stable`
* `latest <https://github.com/readthedocs/readthedocs-docker-images/tree/releases/6.x>`_: :buildpyversions:`latest`

build.apt_packages
``````````````````

List of `APT packages`_ to install.
stsewd marked this conversation as resolved.
Show resolved Hide resolved
Our build servers run Ubuntu 18.04, with the default set of package repositories installed.
stsewd marked this conversation as resolved.
Show resolved Hide resolved
We don't currently support PPA's or other custom repositories.

.. _APT packages: https://packages.ubuntu.com/

:Type: ``list``
:Default: ``[]``

.. code-block:: yaml

version: 2

build:
apt_packages:
- libclang
- cmake

.. note::

When possible avoid installing Python packages using apt (``python3-numpy`` for example),
:ref:`use pip or Conda instead <guides/reproducible-builds:pinning dependencies>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯


sphinx
~~~~~~

Expand Down
2 changes: 1 addition & 1 deletion docs/guides/reproducible-builds.rst
Original file line number Diff line number Diff line change
Expand Up @@ -173,6 +173,6 @@ or our Conda docs about :ref:`environment files <guides/conda:creating the \`\`e
.. tip::

Remember to update your docs' dependencies from time to time to get new improvements and fixes.
It also makes it easy to manage in case a version reaches it's end of support date.
It also makes it easy to manage in case a version reaches its end of support date.

.. TODO: link to the supported versions policy.
80 changes: 73 additions & 7 deletions readthedocs/config/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,7 @@
SUBMODULES_INVALID = 'submodules-invalid'
INVALID_KEYS_COMBINATION = 'invalid-keys-combination'
INVALID_KEY = 'invalid-key'
INVALID_NAME = 'invalid-name'

LATEST_CONFIGURATION_VERSION = 2

Expand Down Expand Up @@ -121,10 +122,18 @@ def __init__(self, key, code, error_message, source_file=None):
super().__init__(message, code=code)

def _get_display_key(self):
# Checks for patterns similar to `python.install.0.requirements`
# if matched change to `python.install[0].requirements` using backreference.
"""
Display keys in a more friendly format.

Indexes are displayed like ``n``,
but users may be more familiar with the ``[n]`` syntax.
For example ``python.install.0.requirements``
is changed to `python.install[0].requirements`.
"""
return re.sub(
r'^(python\.install)(\.)(\d+)(\.\w+)$', r'\1[\3]\4', self.key
r'^([a-zA-Z_.-]+)\.(\d+)([a-zA-Z_.-]*)$',
r'\1[\2]\3',
self.key
stsewd marked this conversation as resolved.
Show resolved Hide resolved
)


Expand Down Expand Up @@ -745,12 +754,69 @@ def validate_build(self):
),
)

# Allow to override specific project
config_image = self.defaults.get('build_image')
if config_image:
build['image'] = config_image
# Allow to override specific project
config_image = self.defaults.get('build_image')
if config_image:
build['image'] = config_image

with self.catch_validation_error('build.apt_packages'):
raw_packages = self._raw_config.get('build', {}).get('apt_packages', [])
validate_list(raw_packages)
# Transform to a dict, so is easy to validate individual entries.
self._raw_config.setdefault('build', {})['apt_packages'] = (
list_to_dict(raw_packages)
)

build['apt_packages'] = [
self.validate_apt_package(index)
for index in range(len(raw_packages))
]
if not raw_packages:
self.pop_config('build.apt_packages')

return build

def validate_apt_package(self, index):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we probably can be smarter about this. Can we not call apt after a --, which is the shell way of handling the end of options?

Unless otherwise noted, each builtin command documented as accepting options preceded by ‘-’ accepts ‘--’ to signify the end of the options. The :, true, false, and test/[ builtins do not accept options and do not treat ‘--’ specially. The exit, logout, return, break, continue, let, and shift builtins accept and process arguments beginning with ‘-’ without requiring ‘--’. Other builtins that accept arguments but are not specified as accepting options interpret arguments beginning with ‘-’ as invalid options and require ‘--’ to prevent this interpretation.

We should still raise a validation error here, but I'd like to see additional restrictions for this built in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point Eric.

Also, we need to take care of executing commands in the same line as well. Like,

apt-get install `touch /tmp/hello.txt`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe python's subprocess command will handle this, but we should definitely be sure and have tests for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, subprocess should handle this as long as you don't pass shell=True. However, it's worth a test to make sure we don't ever mess that up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we use subprocess, but the Docker client API (exec_create), tho.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use subprocess, we escape special chars in

def get_wrapped_command(self):
"""
Wrap command in a shell and optionally escape special bash characters.
In order to set the current working path inside a docker container, we
need to wrap the command in a shell call manually.
Some characters will be interpreted as shell characters without
escaping, such as: ``pip install requests<0.8``. When passing
``escape_command=True`` in the init method this escapes a good majority
of those characters.
"""
bash_escape_re = re.compile(
r"([\t\ \!\"\#\$\&\'\(\)\*\:\;\<\>\?\@"
r'\[\\\]\^\`\{\|\}\~])',
)
prefix = ''
if self.bin_path:
prefix += 'PATH={}:$PATH '.format(self.bin_path)
command = (
' '.join([
bash_escape_re.sub(r'\\\1', part) if self.escape_command else part
for part in self.command
])
)
return (
"/bin/sh -c 'cd {cwd} && {prefix}{cmd}'".format(
cwd=self.cwd,
prefix=prefix,
cmd=command,
)
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested defining packages in a malicious way? (e.g. trying to executed forbidden things). From this docstring it only happen under certain circumstances. Are we sure we are always calling this in that way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

escape_command defaults to true if that's what you mean.

"""
Validate the package name to avoid injections of extra options.

We validate that they aren't interpreted as an option or file.
See https://manpages.ubuntu.com/manpages/xenial/man8/apt-get.8.html
and https://www.debian.org/doc/manuals/debian-reference/ch02.en.html#_debian_package_file_names # noqa
for allowed chars in packages names.
"""
key = f'build.apt_packages.{index}'
package = self.pop_config(key)
with self.catch_validation_error(key):
validate_string(package)
package = package.strip()
invalid_starts = [
# Don't allow extra options.
'-',
# Don't allow to install from a path.
stsewd marked this conversation as resolved.
Show resolved Hide resolved
'/',
'.',
]
for start in invalid_starts:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this bc feels like it gives a nice error for usual mistakes, but isn't needed as the regex below already validates this.

if package.startswith(start):
self.error(
key=key,
message=(
'Invalid package name. '
f'Package can\'t start with {start}.',
),
code=INVALID_NAME,
)
# List of valid chars in packages names.
pattern = re.compile(r'^[a-zA-Z0-9][a-zA-Z0-9.+-]*$')
if not pattern.match(package):
self.error(
key=key,
message='Invalid package name.',
code=INVALID_NAME,
)
return package

def validate_python(self):
"""
Validates the python key.
Expand Down
6 changes: 5 additions & 1 deletion readthedocs/config/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,11 @@ def as_dict(self):

class Build(Base):

__slots__ = ('image',)
__slots__ = ('image', 'apt_packages')

def __init__(self, **kwargs):
kwargs.setdefault('apt_packages', [])
super().__init__(**kwargs)


class Python(Base):
Expand Down
59 changes: 59 additions & 0 deletions readthedocs/config/tests/test_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
CONFIG_REQUIRED,
CONFIG_SYNTAX_INVALID,
INVALID_KEY,
INVALID_NAME,
PYTHON_INVALID,
VERSION_INVALID,
)
Expand Down Expand Up @@ -748,6 +749,7 @@ def test_as_dict(tmpdir):
},
'build': {
'image': 'readthedocs/build:latest',
'apt_packages': [],
},
'conda': None,
'sphinx': {
Expand Down Expand Up @@ -935,6 +937,62 @@ def test_build_image_check_invalid_type(self, value):
build.validate()
assert excinfo.value.key == 'build.image'

@pytest.mark.parametrize(
'value',
[
[],
['cmatrix'],
['Mysql', 'cmatrix', 'postgresql-dev'],
],
)
def test_build_apt_packages_check_valid(self, value):
build = self.get_build_config({'build': {'apt_packages': value}})
build.validate()
assert build.build.apt_packages == value

@pytest.mark.parametrize(
'value',
[3, 'string', {}],
)
def test_build_apt_packages_invalid_type(self, value):
build = self.get_build_config({'build': {'apt_packages': value}})
with raises(InvalidConfig) as excinfo:
build.validate()
assert excinfo.value.key == 'build.apt_packages'

@pytest.mark.parametrize(
'error_index, value',
[
(0, ['/', 'cmatrix']),
(1, ['cmatrix', '-q']),
(1, ['cmatrix', ' -q']),
(1, ['cmatrix', '\\-q']),
(1, ['cmatrix', '--quiet']),
(1, ['cmatrix', ' --quiet']),
(2, ['cmatrix', 'quiet', './package.deb']),
(2, ['cmatrix', 'quiet', ' ./package.deb ']),
(2, ['cmatrix', 'quiet', '/home/user/package.deb']),
(2, ['cmatrix', 'quiet', ' /home/user/package.deb']),
(2, ['cmatrix', 'quiet', '../package.deb']),
(2, ['cmatrix', 'quiet', ' ../package.deb']),
(1, ['one', '$two']),
(1, ['one', 'non-ascíí']),
# We don't allow regex for now.
(1, ['mysql', 'cmatrix$']),
(0, ['^mysql-*', 'cmatrix$']),
# We don't allow specifying versions for now.
(0, ['postgresql=1.2.3']),
# We don't allow specifying distributions for now.
(0, ['cmatrix/bionic']),
],
)
def test_build_apt_packages_invalid_value(self, error_index, value):
build = self.get_build_config({'build': {'apt_packages': value}})
with raises(InvalidConfig) as excinfo:
build.validate()
assert excinfo.value.key == f'build.apt_packages.{error_index}'
assert excinfo.value.code == INVALID_NAME

@pytest.mark.parametrize('value', [3, [], 'invalid'])
def test_python_check_invalid_types(self, value):
build = self.get_build_config({'python': value})
Expand Down Expand Up @@ -2072,6 +2130,7 @@ def test_as_dict(self, tmpdir):
},
'build': {
'image': 'readthedocs/build:latest',
'apt_packages': [],
},
'conda': None,
'sphinx': {
Expand Down
30 changes: 29 additions & 1 deletion readthedocs/projects/tasks.py
Original file line number Diff line number Diff line change
Expand Up @@ -794,7 +794,7 @@ def run_build(self, record):
environment=self.build_env,
)
with self.project.repo_nonblockinglock(version=self.version):
self.setup_python_environment()
self.setup_build()

# TODO the build object should have an idea of these states,
# extend the model to include an idea of these outcomes
Expand Down Expand Up @@ -1152,6 +1152,10 @@ def update_app_instances(
search_ignore=self.config.search.ignore,
)

def setup_build(self):
self.install_system_dependencies()
self.setup_python_environment()

def setup_python_environment(self):
"""
Build the virtualenv and install the project into it.
Expand All @@ -1177,6 +1181,30 @@ def setup_python_environment(self):
if self.project.has_feature(Feature.LIST_PACKAGES_INSTALLED_ENV):
self.python_env.list_packages_installed()

def install_system_dependencies(self):
stsewd marked this conversation as resolved.
Show resolved Hide resolved
"""
Install apt packages from the config file.

We don't allow to pass custom options or install from a path.
The packages names are already validated when reading the config file.

.. note::

``--quiet`` won't suppress the output,
it would just remove the progress bar.
"""
packages = self.config.build.apt_packages
if packages:
self.build_env.run(
'apt-get', 'update', '--assume-yes', '--quiet',
user=settings.RTD_DOCKER_SUPER_USER,
)
# put ``--`` to end all command arguments.
self.build_env.run(
'apt-get', 'install', '--assume-yes', '--quiet', '--', *packages,
user=settings.RTD_DOCKER_SUPER_USER,
)

def build_docs(self):
"""
Wrapper to all build functions.
Expand Down
4 changes: 4 additions & 0 deletions readthedocs/rtd_tests/fixtures/spec/v2/schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,10 @@ build:
# Note: it can be overridden by a project
image: enum('stable', 'latest', required=False)

# List of packages to be installed with apt-get
# Default: []
apt_packages: list(str(), required=False)

python:
# The Python version (this depends on the build image)
# Default: '3'
Expand Down
Loading