Commit e2f9821

feat: Allow base64 encoded credentials in URI (#410)
Fixes #409 🦕

To allow credential information to be included in the connection URL, for cases where no credentials file is available locally on the client, I propose the `credentials_base64` parameter. It requires the user to base64-encode their credentials JSON file with a tool such as `base64`, `openssl base64`, `python -m base64`, or www.base64encode.org.

I have used nox to run the unit and system tests for Python 3.6 - 3.9. I'm tracking down a separate issue with my computer that prevented the 3.10 tests from running.

- [x] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/python-bigquery-sqlalchemy/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
- [x] Ensure the tests and linter pass
- [x] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)
1 parent 7f36628 commit e2f9821

7 files changed: +133 -4 lines changed


README.rst

+27 -1
@@ -180,7 +180,7 @@ Connection String Parameters
 
 There are many situations where you can't call ``create_engine`` directly, such as when using tools like `Flask SQLAlchemy <http://flask-sqlalchemy.pocoo.org/2.3/>`_. For situations like these, or for situations where you want the ``Client`` to have a `default_query_job_config <https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client>`_, you can pass many arguments in the query of the connection string.
 
-The ``credentials_path``, ``credentials_info``, ``location``, ``arraysize`` and ``list_tables_page_size`` parameters are used by this library, and the rest are used to create a `QueryJobConfig <https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.QueryJobConfig.html#google.cloud.bigquery.job.QueryJobConfig>`_
+The ``credentials_path``, ``credentials_info``, ``credentials_base64``, ``location``, ``arraysize`` and ``list_tables_page_size`` parameters are used by this library, and the rest are used to create a `QueryJobConfig <https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.QueryJobConfig.html#google.cloud.bigquery.job.QueryJobConfig>`_
 
 Note that if you want to use query strings, it will be more reliable if you use three slashes, so ``'bigquery:///?a=b'`` will work reliably, but ``'bigquery://?a=b'`` might be interpreted as having a "database" of ``?a=b``, depending on the system being used to parse the connection string.
 
@@ -207,6 +207,32 @@ Here are examples of all the supported arguments. Any not present are either for
         'write_disposition=WRITE_APPEND'
     )
 
+In cases where you wish to include the full credentials in the connection URI, you can base64-encode the credentials JSON file and supply the encoded string to the ``credentials_base64`` parameter.
+
+.. code-block:: python
+
+    engine = create_engine(
+        'bigquery://some-project/some-dataset' '?'
+        'credentials_base64=eyJrZXkiOiJ2YWx1ZSJ9Cg==' '&'
+        'location=some-location' '&'
+        'arraysize=1000' '&'
+        'list_tables_page_size=100' '&'
+        'clustering_fields=a,b,c' '&'
+        'create_disposition=CREATE_IF_NEEDED' '&'
+        'destination=different-project.different-dataset.table' '&'
+        'destination_encryption_configuration=some-configuration' '&'
+        'dry_run=true' '&'
+        'labels=a:b,c:d' '&'
+        'maximum_bytes_billed=1000' '&'
+        'priority=INTERACTIVE' '&'
+        'schema_update_options=ALLOW_FIELD_ADDITION,ALLOW_FIELD_RELAXATION' '&'
+        'use_query_cache=true' '&'
+        'write_disposition=WRITE_APPEND'
+    )
+
+To create the base64 encoded string you can use the command line tool ``base64``, or ``openssl base64``, or ``python -m base64``.
+
+Alternatively, you can use an online generator like `www.base64encode.org <https://www.base64encode.org>`_ to paste your credentials JSON file to be encoded.
 
 Creating tables
 ^^^^^^^^^^^^^^^
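
The README text in this hunk suggests command-line tools for producing the encoded string; the same step can be sketched in Python. This is a minimal illustration using only the standard library, not code from this commit: `credentials_info` is a placeholder dict, and `encode_credentials` and `build_url` are hypothetical helper names. Percent-encoding the value is an extra precaution, on the assumption that the URL's query string is parsed with form-decoding rules, under which a literal `+` would be read as a space.

```python
import base64
import json
from urllib.parse import parse_qs, quote, urlsplit

# Placeholder standing in for the contents of a service account JSON file.
credentials_info = {"type": "service_account", "project_id": "some-project"}


def encode_credentials(info):
    """Serialize the credentials dict and return it as a base64 ASCII string."""
    return base64.b64encode(json.dumps(info).encode("utf-8")).decode("ascii")


def build_url(project, dataset, encoded):
    """Build a connection URL (hypothetical helper).

    quote(..., safe="") percent-encodes '+', '/' and '=' so the value
    survives query-string parsing unchanged.
    """
    value = quote(encoded, safe="")
    return f"bigquery://{project}/{dataset}?credentials_base64={value}"


encoded = encode_credentials(credentials_info)
url = build_url("some-project", "some-dataset", encoded)

# Round trip: parse the query string, then decode the value the same way
# the helper in this commit does (base64 decode, then json.loads).
value = parse_qs(urlsplit(url).query)["credentials_base64"][0]
decoded = json.loads(base64.b64decode(value))
```

If the round trip succeeds, `decoded` equals the original dict, which suggests the URL can carry the credentials without loss.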

sqlalchemy_bigquery/_helpers.py

+6
@@ -12,6 +12,8 @@
 from google.cloud import bigquery
 from google.oauth2 import service_account
 import sqlalchemy
+import base64
+import json
 
 
 USER_AGENT_TEMPLATE = "sqlalchemy/{}"
@@ -30,12 +32,16 @@ def google_client_info():
 def create_bigquery_client(
     credentials_info=None,
     credentials_path=None,
+    credentials_base64=None,
     default_query_job_config=None,
     location=None,
     project_id=None,
 ):
     default_project = None
 
+    if credentials_base64:
+        credentials_info = json.loads(base64.b64decode(credentials_base64))
+
     if credentials_path:
         credentials = service_account.Credentials.from_service_account_file(
             credentials_path
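
The decoding step added to `create_bigquery_client` can be exercised in isolation. The sketch below is a stricter variant written under stated assumptions, not the committed code: `decode_credentials` is a hypothetical name, and it passes `validate=True` so that characters outside the base64 alphabet raise an error instead of being discarded, which the plain `base64.b64decode(credentials_base64)` call in this hunk does not do.

```python
import base64
import json


def decode_credentials(credentials_base64):
    """Decode a base64 string into a credentials dict (hypothetical helper)."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(credentials_base64, validate=True)
        info = json.loads(raw)
    except ValueError as exc:
        # binascii.Error and json.JSONDecodeError both subclass ValueError.
        raise ValueError(
            "credentials_base64 is not valid base64-encoded JSON"
        ) from exc
    if not isinstance(info, dict):
        raise ValueError("credentials_base64 must decode to a JSON object")
    return info


# The sample value used in the README and unit tests of this commit.
sample = decode_credentials("eyJrZXkiOiJ2YWx1ZSJ9Cg==")
```

Whether the stricter check is worth adding to the library is a design choice; it mainly helps when a value was mangled in transit, for example by unescaped characters in the URL.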

sqlalchemy_bigquery/base.py

+5
@@ -753,6 +753,7 @@ def __init__(
         credentials_path=None,
         location=None,
         credentials_info=None,
+        credentials_base64=None,
         list_tables_page_size=1000,
         *args,
         **kwargs,
@@ -761,6 +762,7 @@ def __init__(
         self.arraysize = arraysize
         self.credentials_path = credentials_path
         self.credentials_info = credentials_info
+        self.credentials_base64 = credentials_base64
         self.location = location
         self.dataset_id = None
         self.list_tables_page_size = list_tables_page_size
@@ -791,6 +793,7 @@ def create_connect_args(self, url):
             dataset_id,
             arraysize,
             credentials_path,
+            credentials_base64,
             default_query_job_config,
             list_tables_page_size,
         ) = parse_url(url)
@@ -799,13 +802,15 @@ def create_connect_args(self, url):
         self.list_tables_page_size = list_tables_page_size or self.list_tables_page_size
         self.location = location or self.location
         self.credentials_path = credentials_path or self.credentials_path
+        self.credentials_base64 = credentials_base64 or self.credentials_base64
         self.dataset_id = dataset_id
         self._add_default_dataset_to_job_config(
             default_query_job_config, project_id, dataset_id
         )
         client = _helpers.create_bigquery_client(
             credentials_path=self.credentials_path,
             credentials_info=self.credentials_info,
+            credentials_base64=self.credentials_base64,
             project_id=project_id,
             location=self.location,
             default_query_job_config=default_query_job_config,

sqlalchemy_bigquery/parse_url.py

+8
@@ -68,6 +68,7 @@ def parse_url(url): # noqa: C901
     dataset_id = url.database or None
     arraysize = None
     credentials_path = None
+    credentials_base64 = None
     list_tables_page_size = None
 
     # location
@@ -78,6 +79,10 @@ def parse_url(url): # noqa: C901
     if "credentials_path" in query:
         credentials_path = query.pop("credentials_path")
 
+    # credentials_base64
+    if "credentials_base64" in query:
+        credentials_base64 = query.pop("credentials_base64")
+
     # arraysize
     if "arraysize" in query:
         str_arraysize = query.pop("arraysize")
@@ -107,6 +112,7 @@ def parse_url(url): # noqa: C901
                 dataset_id,
                 arraysize,
                 credentials_path,
+                credentials_base64,
                 QueryJobConfig(),
                 list_tables_page_size,
             )
@@ -117,6 +123,7 @@ def parse_url(url): # noqa: C901
                 dataset_id,
                 arraysize,
                 credentials_path,
+                credentials_base64,
                 None,
                 list_tables_page_size,
             )
@@ -265,6 +272,7 @@ def parse_url(url): # noqa: C901
         dataset_id,
         arraysize,
         credentials_path,
+        credentials_base64,
         job_config,
         list_tables_page_size,
     )

tests/system/test_helpers.py

+29
@@ -4,6 +4,7 @@
 # license that can be found in the LICENSE file or at
 # https://opensource.org/licenses/MIT.
 
+import base64
 import os
 import json
 
@@ -30,6 +31,12 @@ def credentials_info(credentials_path):
         return json.load(credentials_file)
 
 
+@pytest.fixture
+def credentials_base64(credentials_path):
+    with open(credentials_path) as credentials_file:
+        return base64.b64encode(credentials_file.read().encode()).decode()
+
+
 def test_create_bigquery_client_with_credentials_path(
     module_under_test, credentials_path, credentials_info
 ):
@@ -72,3 +79,25 @@ def test_create_bigquery_client_with_credentials_info_respects_project(
         credentials_info=credentials_info, project_id="connection-url-project",
     )
     assert bqclient.project == "connection-url-project"
+
+
+def test_create_bigquery_client_with_credentials_base64(
+    module_under_test, credentials_base64, credentials_info
+):
+    bqclient = module_under_test.create_bigquery_client(
+        credentials_base64=credentials_base64
+    )
+    assert bqclient.project == credentials_info["project_id"]
+
+
+def test_create_bigquery_client_with_credentials_base64_respects_project(
+    module_under_test, credentials_base64
+):
+    """Test that project_id is used, even when there is a default project.
+
+    https://github.com/googleapis/python-bigquery-sqlalchemy/issues/48
+    """
+    bqclient = module_under_test.create_bigquery_client(
+        credentials_base64=credentials_base64, project_id="connection-url-project",
+    )
+    assert bqclient.project == "connection-url-project"

tests/unit/test_helpers.py

+49 -1
@@ -4,12 +4,14 @@
 # license that can be found in the LICENSE file or at
 # https://opensource.org/licenses/MIT.
 
+import base64
+import json
 from unittest import mock
 
 import google.auth
 import google.auth.credentials
-from google.oauth2 import service_account
 import pytest
+from google.oauth2 import service_account
 
 
 class AnonymousCredentialsWithProject(google.auth.credentials.AnonymousCredentials):
@@ -105,6 +107,52 @@ def test_create_bigquery_client_with_credentials_info_respects_project(
     assert bqclient.project == "connection-url-project"
 
 
+def test_create_bigquery_client_with_credentials_base64(monkeypatch, module_under_test):
+    mock_service_account = mock.create_autospec(service_account.Credentials)
+    mock_service_account.from_service_account_info.return_value = AnonymousCredentialsWithProject(
+        "service-account-project"
+    )
+    monkeypatch.setattr(service_account, "Credentials", mock_service_account)
+
+    credentials_info = (
+        {"type": "service_account", "project_id": "service-account-project"},
+    )
+
+    credentials_base64 = base64.b64encode(json.dumps(credentials_info).encode())
+
+    bqclient = module_under_test.create_bigquery_client(
+        credentials_base64=credentials_base64
+    )
+
+    assert bqclient.project == "service-account-project"
+
+
+def test_create_bigquery_client_with_credentials_base64_respects_project(
+    monkeypatch, module_under_test
+):
+    """Test that project_id is used, even when there is a default project.
+
+    https://github.com/googleapis/python-bigquery-sqlalchemy/issues/48
+    """
+    mock_service_account = mock.create_autospec(service_account.Credentials)
+    mock_service_account.from_service_account_info.return_value = AnonymousCredentialsWithProject(
+        "service-account-project"
+    )
+    monkeypatch.setattr(service_account, "Credentials", mock_service_account)
+
+    credentials_info = (
+        {"type": "service_account", "project_id": "service-account-project"},
+    )
+
+    credentials_base64 = base64.b64encode(json.dumps(credentials_info).encode())
+
+    bqclient = module_under_test.create_bigquery_client(
+        credentials_base64=credentials_base64, project_id="connection-url-project",
+    )
+
+    assert bqclient.project == "connection-url-project"
+
+
 def test_create_bigquery_client_with_default_credentials(
     monkeypatch, module_under_test
 ):

tests/unit/test_parse_url.py

+9 -2
@@ -48,6 +48,7 @@ def url_with_everything():
     return make_url(
         "bigquery://some-project/some-dataset"
         "?credentials_path=/some/path/to.json"
+        "&credentials_base64=eyJrZXkiOiJ2YWx1ZSJ9Cg=="
         "&location=some-location"
         "&arraysize=1000"
         "&list_tables_page_size=5000"
@@ -72,6 +73,7 @@ def test_basic(url_with_everything):
         dataset_id,
         arraysize,
         credentials_path,
+        credentials_base64,
         job_config,
         list_tables_page_size,
     ) = parse_url(url_with_everything)
@@ -82,6 +84,7 @@ def test_basic(url_with_everything):
     assert arraysize == 1000
     assert list_tables_page_size == 5000
     assert credentials_path == "/some/path/to.json"
+    assert credentials_base64 == "eyJrZXkiOiJ2YWx1ZSJ9Cg=="
     assert isinstance(job_config, QueryJobConfig)
 
 
@@ -123,15 +126,15 @@ def test_all_values(url_with_everything, param, value, default):
     )
 
     for url in url_with_everything, url_with_this_one:
-        job_config = parse_url(url)[5]
+        job_config = parse_url(url)[6]
         config_value = getattr(job_config, param)
         if callable(value):
             assert value(config_value)
         else:
             assert config_value == value
 
     url_with_nothing = make_url("bigquery://some-project/some-dataset")
-    job_config = parse_url(url_with_nothing)[5]
+    job_config = parse_url(url_with_nothing)[6]
     assert getattr(job_config, param) == default
 
 
@@ -177,6 +180,7 @@ def test_empty_with_non_config():
         dataset_id,
         arraysize,
         credentials_path,
+        credentials_base64,
         job_config,
         list_tables_page_size,
     ) = url
@@ -186,6 +190,7 @@ def test_empty_with_non_config():
     assert dataset_id is None
     assert arraysize == 1000
     assert credentials_path == "/some/path/to.json"
+    assert credentials_base64 is None
     assert job_config is None
     assert list_tables_page_size is None
 
@@ -198,6 +203,7 @@ def test_only_dataset():
         dataset_id,
         arraysize,
         credentials_path,
+        credentials_base64,
         job_config,
         list_tables_page_size,
     ) = url
@@ -207,6 +213,7 @@ def test_only_dataset():
     assert dataset_id == "some-dataset"
     assert arraysize is None
     assert credentials_path is None
+    assert credentials_base64 is None
     assert list_tables_page_size is None
     assert isinstance(job_config, QueryJobConfig)
     # we can't actually test that the dataset is on the job_config,
