Add Backblaze backup integration by hugo-vrijswijk · Pull Request #145508 · home-assistant/core

hugo-vrijswijk · 2025-05-23T12:18:31Z

Proposed change

Adds a new core integration for Backblaze B2. Can be used as a backup source for HA Backups.

Type of change

Dependency upgrade
Bugfix (non-breaking change which fixes an issue)
New integration (thank you!)
New feature (which adds functionality to an existing integration)
Deprecation (breaking change to happen in the future)
Breaking change (fix/feature causing existing functionality to break)
Code quality improvements to existing code or addition of tests

Additional information

Integration is mostly based on the aws_s3 source, with some inspiration from azure_storage. It has some prerequisites (existing bucket and key). Also allows using a prefix to store backups in a certain folder in Backblaze. The integration will also make some checks before setup to ensure the configuration works.

The library used (b2sdk) is completely sync, so many calls are wrapped in async_add_executor_job. b2sdk also uses the requests library, instead of the async http libraries used in HA. There is an open issue for async support, but it is noted this would be essentially a new library. Tests use the RawSimulator from b2sdk to test interactions with the SDK, rather than patching.
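As a rough illustration of the executor wrapping described above, here is a minimal standalone sketch. It uses plain asyncio's `run_in_executor` as a stand-in for Home Assistant's `async_add_executor_job`, and `blocking_list_files` is a hypothetical placeholder, not a b2sdk function.

```python
import asyncio
import time


def blocking_list_files(prefix: str) -> list[str]:
    """Hypothetical stand-in for a synchronous SDK call (not part of b2sdk)."""
    time.sleep(0.05)  # simulates blocking network I/O
    return [f"{prefix}backup-1.tar", f"{prefix}backup-2.tar"]


async def list_files(prefix: str) -> list[str]:
    """Run the blocking call in the default thread pool so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_list_files, prefix)


if __name__ == "__main__":
    print(asyncio.run(list_files("ha/")))
```

Inside an integration the same idea would presumably be written with `hass.async_add_executor_job(...)`, as the description says; the sketch only shows the shape of the pattern.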

This is my first time contributing to HA, so please verify I am following all conventions. I intend to make more improvements to raise the integration quality scale.

Also, I named the integration backblaze_b2, as that seemed in line with aws_s3. I'm not sure whether there would ever be another Backblaze integration for a different product. If needed, I can change the name to just backblaze.

~~This PR fixes or closes issue: fixes~~
This PR is related to issues: Cannot use S3 API compatible endpoints with 'AWS S3' integration (#144497); S3 on Backblaze: Unsupported header 'x-amz-sdk-checksum-algorithm' received for this API call (#143995); Add support for s3 compatible storage providers (#144474)
Link to documentation pull request: Add docs for Backblaze B2 integration home-assistant.io#39169
Link to brands pull request: Add logos and icons for core integration backblaze_b2 brands#7125
~~Link to developer documentation pull request:~~
~~Link to frontend pull request:~~

Checklist

The code change is tested and works locally.
Local tests pass. Your PR cannot be merged unless tests pass
There is no commented out code in this PR.
I have followed the development checklist
I have followed the perfect PR recommendations
The code has been formatted using Ruff (ruff format homeassistant tests)
Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

Documentation added/updated for www.home-assistant.io

If the code communicates with devices, web services, or third-party tools:

The manifest file has all fields filled out correctly.
Updated and included derived files by running: python3 -m script.hassfest.
New or updated dependencies have been added to requirements_all.txt.
Updated by running python3 -m script.gen_requirements_all.
For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.

To help with the load of incoming pull requests:

I have reviewed two other open pull requests in this repository.

hugo-vrijswijk · 2025-05-24T16:52:15Z

Didn't realize there was already another, older PR with similar features. I'll have a look at it tomorrow to merge the best of both worlds (at least doing it from scratch taught me how HA integrations work).

zweckj · 2025-05-24T19:54:18Z

setting this back to draft while you figure that out

hugo-vrijswijk · 2025-05-26T11:33:02Z

> setting this back to draft while you figure that out

I made some minor changes, but it should all be ready for review now. I tested with a slightly bigger backup (500MB) and I didn't notice any slowdowns or hanging in the rest of Home Assistant

jeroenleenarts · 2025-05-30T14:56:28Z

Just making sure, this does not load the entire backup file into memory before uploading?

That was one of the issues @frenck had to fix in his implementations. But he does use a different way of uploading the data, he uses upload_bytes I think.

And is there a convenient way to test this on a production setting? I could try and extract this implementation and run it on my production home assistant install as a test. But not sure what the steps required for that are.

(I did it once for the tado integration when they quite suddenly changed their login mechanism, breaking the production tado integrations. Ran the version in a pull request for a week or two that way.)

home-assistant · 2025-05-30T18:08:14Z

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

hugo-vrijswijk · 2025-06-02T11:18:37Z

> Just making sure, this does not load the entire backup file into memory before uploading?
>
> That was one of the issues @frenck had to fix in his implementations. But he does use a different way of uploading the data, he uses upload_bytes I think.

Yes, I think upload_bytes requires you to load the whole thing into memory. That was my initial implementation, but it should now stream without loading the whole backup into memory. The downside is that there's no way to indicate how big the stream will be, so Backblaze can't run its checks on the upload size.

> And is there a convenient way to test this on a production setting? I could try and extract this implementation and run it on my production home assistant install as a test. But not sure what the steps required for that are.
>
> (I did it once for the tado integration when they quite suddenly changed their login mechanism, breaking the production tado integrations. Ran the version in a pull request for a week or two that way.)

No idea. I'd be happy to test this on my own actual HA instance if you find a way

hugo-vrijswijk · 2025-06-05T10:01:32Z

I'm not exactly sure why tests are failing (only on Python 3.13). The object in the assertion diff is not one I'm creating directly.

joostlek · 2025-06-12T14:28:35Z

+        patch("b2sdk.v2.B2Api", return_value=sim) as mock_client,
+        patch("homeassistant.components.backblaze.B2Api", return_value=sim),


We should only patch libraries where we use them, so the first patch is incorrect, while the second one is correct. I think you want to change the first target to homeassistant.components.backblaze.config_flow.B2Api.
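To make the "patch where it is used" point concrete, here is a small self-contained sketch. The modules `fake_sdk` and `fake_consumer` are hypothetical and built in memory purely for illustration; they stand in for the library and for the module that does `from library import Name`.

```python
import sys
import types
from unittest.mock import patch

# A throwaway "library" module exposing a class.
lib = types.ModuleType("fake_sdk")
exec("class Api:\n    def name(self):\n        return 'real'\n", lib.__dict__)
sys.modules["fake_sdk"] = lib

# A throwaway "consumer" module that imports the class by name,
# so it holds its own reference to it.
consumer = types.ModuleType("fake_consumer")
sys.modules["fake_consumer"] = consumer
exec(
    "from fake_sdk import Api\n"
    "def make_client():\n"
    "    return Api().name()\n",
    consumer.__dict__,
)


def call_with_patch(target: str) -> str:
    """Patch the given target, then call the consumer code."""
    with patch(target) as mock_api:
        mock_api.return_value.name.return_value = "mocked"
        return consumer.make_client()


# Patching the library leaves the consumer's own reference untouched.
patched_at_source = call_with_patch("fake_sdk.Api")
# Patching the name inside the consumer module is what takes effect.
patched_at_use = call_with_patch("fake_consumer.Api")
```

Here `patched_at_source` is still `'real'` while `patched_at_use` is `'mocked'`, which is why the patch target should name the module that imported the class.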

joostlek · 2025-06-12T14:28:55Z

+@pytest.fixture(autouse=True)
+def b2_fixture():
+    """Create account and application keys."""
+    sim = RawSimulator()


What's this exactly?

joostlek · 2025-06-12T14:31:12Z

+        application_key: str = key["applicationKey"]
+
+        bucket = sim.create_bucket(
+            api_url=api_url,
+            account_id=account_id,
+            account_auth_token=auth_token,
+            bucket_name=USER_INPUT[CONF_BUCKET],
+            bucket_type="allPrivate",
+        )
+
+        # Create a test backup
+        test_backup_data = b"backup data"
+        upload_url = sim.get_upload_url(api_url, auth_token, bucket["bucketId"])
+        stream = io.BytesIO(test_backup_data)
+        stream.seek(0)
+
+        filename = TEST_BACKUP.name
+        sha1 = hashlib.sha1(test_backup_data).hexdigest()
+        file = sim.upload_file(
+            upload_url["uploadUrl"],
+            upload_url["authorizationToken"],
+            filename,
+            len(test_backup_data),
+            "application/octet-stream",
+            sha1,
+            BACKUP_METADATA,
+            stream,
+        )
+
+        def ls(
+            self,
+            prefix: str = "",
+        ) -> list[tuple[FileVersion, str]]:
+            """List files in the bucket."""
+            return [
+                (
+                    FileVersion(
+                        sim,
+                        file["fileId"],
+                        file["fileName"],
+                        file["contentLength"],
+                        "application/octet-stream",
+                        sha1,
+                        BACKUP_METADATA,
+                        file["uploadTimestamp"],
+                        file["accountId"],
+                        file["bucketId"],
+                        "action",
+                        None,
+                        None,
+                    ),
+                    file["fileName"],
+                )
+            ]
+
+        BucketSimulator.ls = ls
+
+        yield BackblazeFixture(application_key_id, application_key, bucket, sim, auth)


I have a little clue what happens here, but I am wondering why you keep using certain parts of the library. I personally always like to completely patch out the library and this way our code does not use library code (except for some dataclasses and mapping logic to create many objects).

I think by using more mocks you can remove more inline patches, which is a big improvement imo

joostlek · 2025-06-12T14:33:11Z

+async def test_form_invalid_auth(hass: HomeAssistant) -> None:
+    """Test config flow."""
+    result = await _async_start_flow(hass, "invalid", "invalid")
+    assert result.get("type") is FlowResultType.FORM
+    assert result.get("errors") == {"base": "invalid_credentials"}
+
+
+async def test_form_invalid_bucket_name(
+    hass: HomeAssistant,
+    b2_fixture: BackblazeFixture,
+) -> None:
+    """Test config flow."""
+    result = await _async_start_flow(
+        hass,
+        b2_fixture.key_id,
+        b2_fixture.application_key,
+        {
+            **USER_INPUT,
+            "bucket": "invalid-bucket-name",
+        },
+    )
+    assert result.get("type") is FlowResultType.FORM
+    assert result.get("errors") == {"bucket": "invalid_bucket_name"}
+
+
+async def test_form_cannot_connect(
+    hass: HomeAssistant,
+    b2_fixture: BackblazeFixture,
+) -> None:
+    """Test config flow."""
+    with patch(
+        "b2sdk.v2.RawSimulator.authorize_account",
+        side_effect=exception.ConnectionReset("test"),
+    ):
+        result = await _async_start_flow(
+            hass,
+            b2_fixture.key_id,
+            b2_fixture.application_key,
+            USER_INPUT,
+        )
+
+    assert result.get("type") is FlowResultType.FORM
+    assert result.get("errors") == {"base": "cannot_connect"}


Let's finish the config flow tests so we know the flow is able to recover after it has hit an error.

joostlek · 2025-06-12T14:33:24Z

+) -> None:
+    """Test config flow."""
+    with patch(
+        "b2sdk.v2.RawSimulator.get_bucket_by_name",


you could use the mock for this I think

zweckj · 2025-06-13T06:34:06Z

+        downloaded_file = await self._hass.async_add_executor_job(file.download)
+        response = downloaded_file.response


correct me if I'm wrong, but this sounds like you now have the entire file in memory?

No, this is the HTTP response object, not the body of the response, which we start streaming after this.

zweckj · 2025-06-13T06:39:19Z

+        r_fd, w_fd = os.pipe()
+
+        async def writer() -> None:
+            """Write async stream to the pipe."""
+            with os.fdopen(w_fd, "wb") as w:
+                async for chunk in stream:
+                    w.write(chunk)
+                w.close()
+
+        # Schedule the writer coroutine
+        writer_task = self.async_create_task(self._hass, writer())


this also sounds dangerous to me? Who guarantees we're not writing the entire file to the pipe, before the other task is able to upload (enough) chunks?

Yeah I think you might be right. I'm not sure how the streams work internally, but if they are push-based that is possible (but also an issue with any other implementation?)

iirc none of the other implementations are using a separate task for this, but do read and write for a single chunk together

zweckj · 2025-06-13T06:41:04Z

+                self._bucket.upload_unbound_stream(
+                    r,
+                    filename,
+                    file_info=file_info,


what's the maximum metadata size that backblaze allows?

From the docs:

> Each key is a UTF-8 string up to 50 bytes. There is an overall 7000-byte limit on the headers that are needed for file name and file information, unless the file is uploaded with server-side encryption in which case the limit is 2048 bytes.

But that'll be a problem then. Not 100% sure what a key is in that context, but aren't you dropping the entire serialized json into one key? In any case even 2kB are not safe enough, which means we'd need to use metadata files in those cases.

I think the docs might be wrong (or I misunderstood). The backup I have been uploading has 370 bytes of metadata and that's been working fine. I don't know how big the metadata can get

Potentially bigger than 2 kB. I think with anything bigger than 4 kB we should be safe, but 2 kB is not enough.

7 kB is the limit for files when server-side encryption is off (including the filename). I can add a note to the docs that server-side encryption is not recommended for larger instances because of this limit.

I'd rather switch to file-based metadata (as AWS S3 does) than encourage users to opt out of a security feature.
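For a sense of how the limits quoted above interact with serialized metadata, here is a rough sketch. The function name and the byte accounting (file name plus compact JSON) are assumptions for illustration only; the real header accounting may include more than this.

```python
import json

# Limits as quoted from the Backblaze docs in this thread.
HEADER_LIMIT_BYTES = 7000
HEADER_LIMIT_SSE_BYTES = 2048


def plan_metadata_storage(
    filename: str, metadata: dict, *, server_side_encryption: bool
) -> str:
    """Return "file_info" if name plus metadata fit the limit, else "sidecar".

    Hypothetical helper: it only estimates the size as file name bytes plus
    compact JSON bytes, which is an assumption, not the documented formula.
    """
    limit = HEADER_LIMIT_SSE_BYTES if server_side_encryption else HEADER_LIMIT_BYTES
    serialized = json.dumps(metadata, separators=(",", ":")).encode("utf-8")
    used = len(filename.encode("utf-8")) + len(serialized)
    return "file_info" if used <= limit else "sidecar"
```

A separate metadata file sidesteps the estimate entirely, which is the appeal of the file-based approach: the upload behaves the same whether or not server-side encryption is enabled.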

ElCruncharino · 2025-07-29T15:03:06Z

FYI: I've been following this conversation and submitted #149627 based on the feedback here. Thank you @hugo-vrijswijk. Happy to merge into your branch if that's the preference here, since I didn't want to duplicate your brand and documentation pull requests.

emontnemery · 2025-10-30T07:43:12Z

I'm closing this since #149627, which completed the work started in this PR, has been merged.
Thanks a lot @hugo-vrijswijk for initiating it 👍

Add Backblaze B2 integration

c7e30bf

home-assistant Bot added cla-signed has-tests new-integration integration: backblaze_b2 quality-scale labels May 23, 2025

This was referenced May 23, 2025

Add logos and icons for core integration backblaze_b2 home-assistant/brands#7125

Merged

Add docs for Backblaze B2 integration home-assistant/home-assistant.io#39169

Merged

hugo-vrijswijk added 3 commits May 23, 2025 16:32

Add tests for backup and init

045369d

Improve test coverage

7a8dfb2

Update log-when-unavailable quality_scale

ae127e3

zweckj marked this pull request as draft May 24, 2025 19:54

hugo-vrijswijk added 4 commits May 26, 2025 07:26

Rename backblaze_b2 to backblaze

1fbb2d8

Wrap in single executor job

b6aef5a

Extract AgentBackup creation to method

91601f9

Merge branch 'dev' into integration-backblaze-b2

d5cbc2a

hugo-vrijswijk marked this pull request as ready for review May 26, 2025 11:32

hugo-vrijswijk changed the title ~~Add Backblaze B2 integration~~ Add Backblaze integration May 26, 2025

joostlek requested changes May 30, 2025

home-assistant Bot marked this pull request as draft May 30, 2025 18:08

hugo-vrijswijk added 2 commits June 2, 2025 11:08

Remove info log

31f0377

Automatically add prefix slash

622747a

hugo-vrijswijk marked this pull request as ready for review June 2, 2025 11:18

home-assistant Bot requested a review from joostlek June 2, 2025 11:18

Merge branch 'dev' into integration-backblaze-b2

b1dd3e0

hugo-vrijswijk added 2 commits June 5, 2025 18:10

Merge branch 'dev' into integration-backblaze-b2

5514ee9

Remove cast

9acf1bd

joostlek requested changes Jun 12, 2025

home-assistant Bot marked this pull request as draft June 12, 2025 14:34

NoRi2909 reviewed Jun 12, 2025

Comment thread homeassistant/components/backblaze/strings.json Outdated

zweckj reviewed Jun 13, 2025

hugo-vrijswijk added 6 commits June 13, 2025 11:52

Merge branch 'home-assistant:dev' into integration-backblaze-b2

f3ab84c

Move some code outside try-catch

ebb9246

Update translation string for prefix directory path

caf3d01

Use unique entry_id

9e919d8

Update imports and assert step_id and errors

ef4a560

Fix test

bcf47f1

ElCruncharino mentioned this pull request Jul 29, 2025

Add backblaze b2 backup integration #149627

Merged

19 tasks

emontnemery changed the title ~~Add Backblaze integration~~ Add Backblaze backup integration Sep 17, 2025

emontnemery closed this Oct 30, 2025

github-actions Bot locked and limited conversation to collaborators Oct 31, 2025
