Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datasette is not compatible with SQLite's strict quoting compilation option #2001

Open
gwk opened this issue Jan 23, 2023 · 4 comments · May be fixed by #2004
Open

Datasette is not compatible with SQLite's strict quoting compilation option #2001

gwk opened this issue Jan 23, 2023 · 4 comments · May be fixed by #2004

Comments

@gwk
Copy link

gwk commented Jan 23, 2023

I have linked Python3.11 on macOS against recent SQLite that was compiled using -DSQLITE_DQS=0. This option disables interpretation of double-quoted identifiers as string literals, described in the SQLite docs as a "MySQL 3.x misfeature". See https://www.sqlite.org/quirks.html#dblquote for background.

Datasette uses the double-quote syntax in a number of key places, and is thus completely broken in this environment.

My experience was to pip install datasette, then run datasette serve -I my-data.db. When I visit http://127.0.0.1:8001 I get a 500 response.

The error: sqlite3.OperationalError: no such column: geometry_columns

The responsible SQL: 'select 1 from sqlite_master where tbl_name = "geometry_columns"'

I then installed datasette from GitHub master in development mode and changed the offending SQL to use correct quotes: "select 1 from sqlite_master where tbl_name = 'geometry_columns'".

With this change, I get a little further, but have the same problem with the first table name in my database (in my case, "Meta"):

OperationalError: no such column: Meta
Traceback (most recent call last):
  File "/Users/gwk/external/datasette/datasette/app.py", line 1522, in route_path
    response = await view(request, send)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/views/base.py", line 151, in view
    return await self.dispatch_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/views/base.py", line 105, in dispatch_request
    response = await handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/views/index.py", line 70, in get
    "fts_table": await db.fts_table(table),
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/database.py", line 363, in fts_table
    return await self.execute_fn(lambda conn: detect_fts(conn, table))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/database.py", line 213, in execute_fn
    return await asyncio.get_event_loop().run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/py/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/database.py", line 211, in in_thread
    return fn(conn)
           ^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/database.py", line 363, in <lambda>
    return await self.execute_fn(lambda conn: detect_fts(conn, table))
                                              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/gwk/external/datasette/datasette/utils/__init__.py", line 588, in detect_fts
    rows = conn.execute(detect_fts_sql(table)).fetchall()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: no such column: Meta
INFO:     127.0.0.1:50258 - "GET / HTTP/1.1" 500 Internal Server Error

I will try to continue playing with this, but I also hope that the datasette developers will enable this mode in a test environment as I am unlikely to be able to exercise all of the SQL in the codebase, or make a pull request very soon.

Note that the DQS setting compile-time option can be overridden at runtime with calls to the C API:

sqlite3_db_config(db, SQLITE_DBCONFIG_DQS_DDL, 0, (void*)0);
sqlite3_db_config(db, SQLITE_DBCONFIG_DQS_DML, 0, (void*)0);

As far as I can tell, sqlite3_db_config is not exposed in Python, but perhaps we could figure out how to invoke it using ctypes.

@cldellow
Copy link
Contributor

Your comment introduced me to this issue in sqlite and to the ctypes module - thanks!

I also hope that the datasette developers will enable this mode in a test environment [...]
perhaps we could figure out how to invoke it using ctypes

I'm not a Datasette developer, but I am curious to learn more about getting unholy access to the sqlite C APIs inside of Datasette. (Such access could also help #1293, and if done without grovelling inside of pysqlite's Connection object for the db handle, could even be relatively safe.)

I experimented a bit. I came up with https://gist.github.com/cldellow/85bba507c314b127f85563869cd94820

If you run python3 enable-strict-quoting-sqlite3.py, it seems to set those flags correctly -- SELECT "foo" fails where it would normally succeed.

But if you put it in a plugins/ dir and run datasette --plugins-dir plugins/, it segfaults when it tries to call sqlite3_db_config on the connections created by Datasette.

I am... confused. I'm pretty sure I'm using the same python and the same libsqlite3 in both scenarios, so I would expect it to work.

@gwk do you know anything that might help me debug the segfault? I gather that my approach of going grovelling inside of a PyObject is particularly dangerous, but I was thinking (a) it's necessary in order to test Datasette's use of the sqlite3 library and (b) even if it's not portable, it'd be good enough for running the tests on a single machine.

@gwk
Copy link
Author

gwk commented Jan 25, 2023

@cldellow glad to hear you tried it, as I got grossed out by my own suggestion ;) If you are on macOS I do have one trick for debugging segfaults using lldb.

@cldellow
Copy link
Contributor

cldellow commented Jan 25, 2023

I'm on Ubuntu, unfortunately. :( Would it still be relevant?

I think I've narrowed things down a bit more.

Even sqlite3_free(sqlite3_malloc(128)) segfaults -- this suggests to me that it's something about the sqlite3 library that was loaded, vs, say, getting the wrong db handle when I go spelunking in the Connection object.

@cldellow
Copy link
Contributor

Aha, it's user error on my part.

Adding

sqlite3_db_config.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_int, ctypes.c_int]

makes it work reliably both on the CLI and from datasette, and now I can reproduce the errors you mentioned in the issue description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants