#30: Replace shell script with Python #31

Open · wants to merge 17 commits into base: master
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
**.swp
**.coverage
**.env
**.venv
**__pycache__
9 changes: 9 additions & 0 deletions kubernetes/.pytest.ini
@@ -0,0 +1,9 @@
[pytest]
# https://docs.python.org/3/library/logging.html#levels
log_cli = true
log_cli_level = 20
filterwarnings =
# DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled
# for removal in a future version. Use timezone-aware objects to represent
# datetimes in UTC: datetime.datetime.now(datetime.UTC).
ignore:.*datetime.datetime.utcnow().*
15 changes: 6 additions & 9 deletions kubernetes/Dockerfile
@@ -1,12 +1,9 @@
-FROM alpine
+FROM docker.io/python:3.12

-ARG VAULT_VERSION=1.16.3
+COPY vault_snapshot /vault_snapshot
+COPY requirements.txt /vault_snapshot/

-COPY vault-snapshot.sh /
+WORKDIR /vault_snapshot
+RUN pip install -r requirements.txt

-RUN wget https://releases.hashicorp.com/vault/${VAULT_VERSION}/vault_${VAULT_VERSION}_linux_amd64.zip && \
-    unzip vault_${VAULT_VERSION}_linux_amd64.zip && \
-    mv vault /usr/local/bin && rm vault*zip && \
-    apk add s3cmd && chmod +x vault-snapshot.sh

-CMD ["/vault-snapshot.sh"]
+CMD ["./vault_snapshot.py"]
32 changes: 29 additions & 3 deletions kubernetes/README.md
@@ -11,13 +11,17 @@ After the snapshot is created in a temporary directory, `s3cmd` is used to sync
## Configuration over environment variables

* `VAULT_ADDR` - Vault address to access
-* `VAULT_ROLE` - Vault role to use to create the snapshot
-* `S3_URI` - S3 URI to use to upload (s3://xxx)
+* `VAULT_TOKEN` - Vault token (if provided, overrules `VAULT_ROLE`)
+* `VAULT_SKIP_VERIFY` - optional, set to any value to skip TLS verification
+* `VAULT_ROLE` - Vault role used to create the snapshot; required when no
+  `VAULT_TOKEN` is set
* `S3_BUCKET` - S3 bucket to point to
* `S3_HOST` - S3 endpoint
* `S3_EXPIRE_DAYS` - optional, delete snapshots older than this many days
* `AWS_ACCESS_KEY_ID` - Access key to use to access S3
* `AWS_SECRET_ACCESS_KEY` - Secret access key to use to access S3
+* `JWT_SECRET_PATH` - Path to the JWT used for Kubernetes authentication with
+  `VAULT_ROLE`; defaults to
+  `/var/run/secrets/kubernetes.io/serviceaccount/token`
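Every setting above is resolved with the same precedence: a constructor keyword argument wins over the environment variable, which wins over a default (required settings without a default raise `NameError`). A minimal sketch of that lookup order, using a hypothetical `resolve` helper that is not part of this PR:

```python
import os


def resolve(name, kwargs, default=None):
    # Hypothetical helper mirroring the lookup order in vault_snapshot.py:
    # keyword argument first, then environment variable, then default.
    # A setting without a default is treated as required.
    key = name.lower()
    if key in kwargs:
        return kwargs[key]
    if name in os.environ:
        return os.environ[name]
    if default is None:
        raise NameError(f"{name} undefined")
    return default
```

For example, `resolve("S3_EXPIRE_DAYS", {}, default=-1)` returns `-1` when neither the keyword argument nor the environment variable is set, which disables pruning.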

## Configuration of file retention (pruning)

@@ -33,7 +37,7 @@ to avoid any modification before `$S3_EXPIRE_DAYS`:

```bash
mc retention set --default GOVERNANCE "${S3_EXPIRE_DAYS}d" my-s3-remote/my-bucket
```

-On removal by the `vault-snapshot.sh` script, [`DEL` deletion marker
+On removal by the script, [`DEL` deletion marker
(tombstone)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-managing.html#object-lock-managing-delete-markers)
is set:

@@ -51,3 +55,25 @@

```bash
mc undo my-snapshots/vault-snapshots-2f848f/vault_2024-09-06-1739.snapshot
mc ls --versions my-snapshots/vault-snapshots-2f848f
[2024-09-06 19:39:49 CEST] 28KiB Standard 1031052557042383613 v1 PUT vault_2024-09-06-1739.snapshot
```

## Development and tests

Vault API requests are mocked with
[requests-mock](https://requests-mock.readthedocs.io).

To prepare the environment:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Run the tests without coverage:
```bash
pytest
```

Run the tests with coverage:
```bash
coverage run .venv/bin/pytest
```
8 changes: 3 additions & 5 deletions kubernetes/cronjob.yaml
@@ -20,7 +20,7 @@ spec:
app.kubernetes.io/name: vault-snapshot
app.kubernetes.io/version: v0.1.0
spec:
-restartPolicy: never
+restartPolicy: Never
automountServiceAccountToken: true
serviceAccountName: vault-raft-snapshot
containers:
@@ -30,11 +30,9 @@ spec:
value: s3.example.com
- name: S3_BUCKET
value: bucketname
-- name: S3_URI
-  value: s3://bucketname
-# leave empty to retain snapshot files (default)
 - name: S3_EXPIRE_DAYS
-  value:
+  # remove this variable to retain all snapshots (default)
+  value: "90"
- name: VAULT_ROLE
value: vault-snapshot
- name: VAULT_ADDR
7 changes: 7 additions & 0 deletions kubernetes/requirements.txt
@@ -0,0 +1,7 @@
hvac
boto3
moto[s3]
requests_mock
pytest
coverage
freezegun
28 changes: 0 additions & 28 deletions kubernetes/vault-snapshot.sh

This file was deleted.

1 change: 1 addition & 0 deletions kubernetes/vault_snapshot/fixtures/jwt
@@ -0,0 +1 @@
mock_jwt
173 changes: 173 additions & 0 deletions kubernetes/vault_snapshot/vault_snapshot.py
@@ -0,0 +1,173 @@
#!/usr/bin/env python

import logging
import os
from datetime import UTC, datetime, timedelta
from pathlib import Path

import boto3
import hvac
from botocore.exceptions import ClientError


class VaultSnapshot:
    """
    Create Vault snapshots on S3.
    """

    def __init__(self, **kwargs):
        """
        Init S3 and hvac clients
        """

        # setup logger
        self.logger = logging.getLogger(__name__)

        # read input keyword arguments, fall back to environment variables
        if "vault_addr" in kwargs:
            self.vault_addr = kwargs["vault_addr"]
        elif "VAULT_ADDR" in os.environ:
            self.vault_addr = os.environ["VAULT_ADDR"]
        else:
            raise NameError("VAULT_ADDR undefined")

        # skip TLS verification when requested via kwarg or environment
        if kwargs.get("vault_skip_verify") or "VAULT_SKIP_VERIFY" in os.environ:
            self.verify = False
        else:
            self.verify = True

        if "vault_token" in kwargs:
            self.vault_token = kwargs["vault_token"]
        elif "VAULT_TOKEN" in os.environ:
            self.vault_token = os.environ["VAULT_TOKEN"]
        elif "vault_role" in kwargs:
            self.vault_role = kwargs["vault_role"]
        elif "VAULT_ROLE" in os.environ:
            self.vault_role = os.environ["VAULT_ROLE"]
        else:
            raise NameError("No VAULT_TOKEN or VAULT_ROLE")

        if "s3_access_key_id" in kwargs:
            self.s3_access_key_id = kwargs["s3_access_key_id"]
        elif "AWS_ACCESS_KEY_ID" in os.environ:
            self.s3_access_key_id = os.environ["AWS_ACCESS_KEY_ID"]
        else:
            raise NameError("AWS_ACCESS_KEY_ID undefined")

        if "s3_secret_access_key" in kwargs:
            self.s3_secret_access_key = kwargs["s3_secret_access_key"]
        elif "AWS_SECRET_ACCESS_KEY" in os.environ:
            self.s3_secret_access_key = os.environ["AWS_SECRET_ACCESS_KEY"]
        else:
            raise NameError("AWS_SECRET_ACCESS_KEY undefined")

        if "s3_host" in kwargs:
            self.s3_host = kwargs["s3_host"]
        elif "S3_HOST" in os.environ:
            self.s3_host = os.environ["S3_HOST"]
        else:
            raise NameError("S3_HOST undefined")

        if "s3_bucket" in kwargs:
            self.s3_bucket = kwargs["s3_bucket"]
        elif "S3_BUCKET" in os.environ:
            self.s3_bucket = os.environ["S3_BUCKET"]
        else:
            raise NameError("S3_BUCKET undefined")

        if "s3_expire_days" in kwargs:
            self.s3_expire_days = kwargs["s3_expire_days"]
        elif "S3_EXPIRE_DAYS" in os.environ:
            self.s3_expire_days = os.environ["S3_EXPIRE_DAYS"]
        else:
            # negative value disables pruning
            self.s3_expire_days = -1

        if "jwt_secret_path" in kwargs:
            self.jwt_secret_path = kwargs["jwt_secret_path"]
        elif "JWT_SECRET_PATH" in os.environ:
            self.jwt_secret_path = os.environ["JWT_SECRET_PATH"]
        else:
            # default Kubernetes serviceaccount JWT secret path
            self.jwt_secret_path = "/var/run/secrets/kubernetes.io/serviceaccount/token"

        # Boto S3 client
        # * https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html
        self.s3_client = boto3.client(service_name="s3",
                                      endpoint_url=self.s3_host,
                                      aws_access_key_id=self.s3_access_key_id,
                                      aws_secret_access_key=self.s3_secret_access_key)

        self.logger.info(f"Connecting to Vault API {self.vault_addr}")
        self.hvac_client = hvac.Client(url=self.vault_addr,
                                       verify=self.verify)

        # use VAULT_TOKEN directly if provided
        if hasattr(self, "vault_token") and len(self.vault_token) > 0:
            self.hvac_client.token = self.vault_token
        elif Path(self.jwt_secret_path).exists():
            # Authenticate with the Kubernetes ServiceAccount JWT otherwise
            # https://hvac.readthedocs.io/en/stable/usage/auth_methods/kubernetes.html
            with open(self.jwt_secret_path) as f:
                login_resp = hvac.api.auth_methods.Kubernetes(
                    self.hvac_client.adapter).login(role=self.vault_role,
                                                    jwt=f.read())
            self.hvac_client.token = login_resp["auth"]["client_token"]
        else:
            raise Exception("Unable to authenticate with VAULT_TOKEN or JWT")

        assert self.hvac_client.is_authenticated()

    def snapshot(self):
        """Create Vault integrated storage (Raft) snapshot.

        The snapshot is returned as binary data and should be redirected to
        a file:
        * https://developer.hashicorp.com/vault/api-docs/system/storage/raft
        * https://hvac.readthedocs.io/en/stable/source/hvac_api_system_backend.html
        """

        with self.hvac_client.sys.take_raft_snapshot() as resp:
            assert resp.ok
            self.logger.info("Raft snapshot status code: %d", resp.status_code)

            date_str = datetime.now(UTC).strftime("%F-%H%M")
            file_name = "vault_%s.snapshot" % date_str
            self.logger.info(f"File name: {file_name}")

            # Upload the file
            # * https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object.html
            try:
                response = self.s3_client.put_object(
                    Body=resp.content,
                    Bucket=self.s3_bucket,
                    Key=file_name,
                )
                self.logger.info("s3 put_object response: %s", response)
            except ClientError as e:
                self.logger.error(e)

        # Iterate and remove expired snapshots:
        # https://boto3.amazonaws.com/v1/documentation/api/latest/guide/migrations3.html
        s3 = boto3.resource(service_name="s3",
                            endpoint_url=self.s3_host,
                            aws_access_key_id=self.s3_access_key_id,
                            aws_secret_access_key=self.s3_secret_access_key)
        # an empty bucket returns no "Contents" key
        objs = self.s3_client.list_objects(Bucket=self.s3_bucket).get("Contents", [])

        for o in objs:
            self.logger.info(f"LastModified: {o['LastModified']}")
            # expire keys when older than S3_EXPIRE_DAYS
            if int(self.s3_expire_days) >= 0:
                if o["LastModified"] <= datetime.now(UTC) - timedelta(days=int(self.s3_expire_days)):
                    self.logger.info(f"Deleting expired snapshot {o['Key']}")
                    s3.Object(self.s3_bucket, o["Key"]).delete()

        return file_name


if __name__ == "__main__":
    vault_snapshot = VaultSnapshot()
    vault_snapshot.snapshot()