Merged
Changes from 2 commits
47 commits
c70e1bc
explicit duckdb 1.29.0; self-host core extensions; document
Fil Oct 8, 2024
0029c8c
configure which extensions are self-hosted
Fil Oct 10, 2024
feeaad8
Merge branch 'main' into fil/duckdb-wasm-1.29
Fil Oct 10, 2024
33aa5cb
hash extensions
Fil Oct 10, 2024
543f823
better docs
Fil Oct 10, 2024
7475589
cleaner duckdb manifest — now works in scripts and embeds
Fil Oct 11, 2024
47b6bd0
restructure code, extensible manifest
Fil Oct 11, 2024
abd0380
test, documentation
Fil Oct 11, 2024
7ac5d1d
much nicer config
Fil Oct 11, 2024
0adcb36
document config
Fil Oct 11, 2024
5365371
add support for mvp, clean config & documentation
Fil Oct 11, 2024
1fdf717
parametrized the initial LOAD in DuckDBClient
Fil Oct 11, 2024
bc712c3
tests
Fil Oct 11, 2024
2fb2878
bake-in the extensions manifest
Fil Oct 11, 2024
bc49674
fix test
Fil Oct 11, 2024
9a13f2a
don't activate spatial on the documentation
Fil Oct 11, 2024
e2c8b6c
Merge branch 'main' into fil/duckdb-wasm-1.29
Fil Oct 14, 2024
4a5128d
refactor: hash individual extensions, include the list of platforms i…
Fil Oct 14, 2024
13f892c
don't copy extensions twice
Fil Oct 14, 2024
8bb2866
Merge branch 'main' into fil/duckdb-wasm-1.29
Fil Oct 18, 2024
43ef6eb
Merge branch 'main' into fil/duckdb-wasm-1.29
Fil Oct 19, 2024
6764969
Merge branch 'main' into fil/duckdb-wasm-1.29
mbostock Oct 20, 2024
d72f0c3
Update src/duckdb.ts
Fil Oct 20, 2024
d6fc020
remove DuckDBClientReport utility
Fil Oct 21, 2024
69f25a2
renames
Fil Oct 21, 2024
30788e3
p for platform
Fil Oct 21, 2024
710f36a
centralize DUCKDBWASMVERSION and DUCKDBVERSION
Fil Oct 21, 2024
4f58100
clearer
Fil Oct 21, 2024
a8cfdcd
better config; manifest.extensions now lists individual extensions on…
Fil Oct 21, 2024
490d969
validate extension names; centralize DUCKDBBUNDLES
Fil Oct 21, 2024
aaff8f8
fix tests
Fil Oct 21, 2024
bc39bbe
Merge branch 'main' into fil/duckdb-wasm-1.29
Fil Oct 30, 2024
8bd0972
copy edit
Fil Oct 30, 2024
b90c22a
support loading non-self-hosted extensions
Fil Oct 30, 2024
b37be07
test duckdb config normalization & defaults
Fil Oct 30, 2024
9abaf57
documentation
Fil Oct 30, 2024
ccc0073
typography
Fil Oct 30, 2024
26c7a6f
doc
Fil Oct 31, 2024
4416dd3
Merge branch 'main' into fil/duckdb-wasm-1.29
mbostock Nov 1, 2024
7704416
use view for <50MB
mbostock Nov 1, 2024
1dde616
docs, shorthand, etc.
mbostock Nov 1, 2024
0491966
annotate fixes
mbostock Nov 1, 2024
be26385
disable telemetry on annotate tests, too
mbostock Nov 1, 2024
a23d3e4
tidier duckdb manifest
mbostock Nov 1, 2024
c753728
Merge branch 'main' into fil/duckdb-wasm-1.29
mbostock Nov 1, 2024
6e828c9
remove todo
mbostock Nov 1, 2024
365dbe3
more robust duckdb: scheme
mbostock Nov 2, 2024
57 changes: 57 additions & 0 deletions docs/lib/duckdb.md
@@ -105,3 +105,60 @@ const sql = DuckDBClient.sql({quakes: `https://earthquake.usgs.gov/earthquakes/f
```sql echo
SELECT * FROM quakes ORDER BY updated DESC;
```

## Extensions

DuckDB has a flexible mechanism for loading extensions dynamically. Extensions can add support for additional file formats, introduce new types, and provide domain-specific functionality.

### Built-in extensions

Built-in extensions are statically linked into the default bundle, so they are available immediately, without an `INSTALL` or `LOAD` statement. The "httpfs" extension is one example.

### Installing extensions

In DuckDB-Wasm, installing an extension records a reference to the source file or extension repository that holds it. For example, to install an extension from the community repository:

```sql echo run=false
INSTALL h3 FROM community;
LOAD h3;
SELECT format('{:x}', h3_latlng_to_cell(37.77, -122.43, 9)) AS cell_id;
```

Beyond the official extension repositories (`core` extensions at `https://extensions.duckdb.org` and `community` extensions at `https://community-extensions.duckdb.org`), you can install an extension from an explicit URL:

```sql echo run=false
INSTALL custom FROM 'https://example.com/';
Review comment (Member):

I think we should discourage people from installing extensions from within SQL blocks: doing so globally changes the behavior of the DuckDBClient instance and can lead to race conditions/nondeterministic behavior across blocks, and also because we want to favor self-hosting of extensions rather than hotlinking to an external website.

The recommended way to install extensions should be via the front matter or the project config (or to do it in JavaScript by redefining the sql literal and awaiting the loading of the extensions).

```
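The JavaScript route suggested in the review comment above could look roughly like the following. This is a minimal sketch, not code from the PR: `loadExtensions` and the `QueryClient` interface are hypothetical names, and the only assumption about the client is that it exposes a `query(statement)` method that returns a promise.

```typescript
// Minimal shape of the client this sketch relies on (an assumption, not the full DuckDBClient API).
interface QueryClient {
  query(statement: string): Promise<unknown>;
}

// Hypothetical helper: load each extension in order and resolve once all of them are active.
async function loadExtensions<T extends QueryClient>(client: T, names: string[]): Promise<T> {
  for (const name of names) {
    // Names are interpolated into SQL, so reject anything that is not a plain identifier.
    if (!/^\w+$/.test(name)) throw new Error(`invalid extension name: ${name}`);
    await client.query(`LOAD ${name}`);
  }
  return client;
}
```

On a page, one might then write something like `const db = await loadExtensions(await DuckDBClient.of(), ["spatial"]);` followed by `const sql = db.sql.bind(db);`, so that every SQL block on the page waits for the extensions before running. Awaiting the loads up front is what avoids the race between blocks that the comment describes.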

### Loading extensions

To activate an extension in a DuckDB instance, you have to “load” it, for example with an explicit `LOAD` statement:

```sql echo run=false
LOAD spatial;
SELECT ST_Area('POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))'::GEOMETRY) as area;
```

Many of the core extensions, however, do not need an explicit `LOAD` statement: DuckDB autoloads them when it detects that they are needed. For example, the query below autoloads the "json" extension:

```sql echo run=false
SELECT bbox FROM read_json('https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.geojson');
```

Similarly, this query autoloads the "inet" extension:

```sql echo
SELECT '127.0.0.1'::INET AS ipv4, '2001:db8:3c4d::/48'::INET AS ipv6;
```

### Self-hosted extensions

Framework downloads the extensions of your choice and hosts them locally. By default, only "json" and "parquet" are self-hosted, but you can add more by listing them in the [config](../config). Self-hosted extensions are currently served from the `/_npm/` directory, so your app keeps working offline and loads extensions from a server you control.
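As a sketch of what such a config might look like, assuming the `duckdb.extensions` option has the shape accepted by `normalizeDuckDB` in this pull request (an array of extension names), which may change before release:

```typescript
// observablehq.config.ts (sketch; the option shape follows this PR and is an assumption)
export default {
  duckdb: {
    // Extensions to download at build time and serve from /_npm/.
    extensions: ["json", "parquet", "spatial"]
  }
};
```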

<div class="tip">

Note that if you `INSTALL` or `LOAD` an extension that is not self-hosted, DuckDB will load it from the core or community servers. Framework does not currently know which extensions your code uses, but you can inspect the network activity in your browser to check, and then add any such extensions to the self-hosted list. In the future, the preview server might warn when the list is incomplete. If you are interested in this feature, please upvote #issueTK.

</div>

These features are tied to version 1.29 of DuckDB-Wasm and depend strongly on its development cycle.
2 changes: 2 additions & 0 deletions docs/sql.md
@@ -205,3 +205,5 @@ Inputs.table(await sql([`SELECT * FROM gaia WHERE source_id IN (${[source_ids]})
When interpolating values into SQL queries, be careful to avoid [SQL injection](https://en.wikipedia.org/wiki/SQL_injection) by properly escaping or sanitizing user input. The example above is safe only because `source_ids` are known to be numeric.

</div>

For more information, see [DuckDB: extensions](./lib/duckdb#extensions).
16 changes: 15 additions & 1 deletion src/client/stdlib/recommendedLibraries.js
@@ -16,7 +16,21 @@ export const mermaid = () => import("observablehq:stdlib/mermaid").then((mermaid
export const Plot = () => import("npm:@observablehq/plot");
export const React = () => import("npm:react");
export const ReactDOM = () => import("npm:react-dom");
export const sql = () => import("observablehq:stdlib/duckdb").then((duckdb) => duckdb.sql);
export const sql = () =>
  import("observablehq:stdlib/duckdb").then(async (duckdb) => {
    const {sql} = duckdb;
    // The page embeds the list of self-hosted extensions as JSON: [[name, ref], ...].
    const extensions = JSON.parse(document.querySelector("#observablehq-duckdb-hosted-extensions").textContent);
    for (const [name, ref] of extensions) {
      // Install from the self-hosted repository, resolved relative to this module.
      const x = `INSTALL ${name} FROM '${new URL(`..${ref}`, import.meta.url).href}';`;
      console.warn(import.meta.url, x);
      await sql([x]);
      // Load the extension so that it is active before sql is returned.
      const y = `LOAD ${name};`;
      console.warn(import.meta.url, y);
      await sql([y]);
    }
    console.warn(extensions);
    return sql;
  });
export const SQLite = () => import("observablehq:stdlib/sqlite").then((sqlite) => sqlite.default);
export const SQLiteDatabaseClient = () => import("observablehq:stdlib/sqlite").then((sqlite) => sqlite.SQLiteDatabaseClient); // prettier-ignore
export const tex = () => import("observablehq:stdlib/tex").then((tex) => tex.default);
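To make the URL resolution in the new `sql` definition easier to follow, here is a standalone sketch of how the manifest entries turn into statements. `extensionStatements` and `HostedExtension` are hypothetical names, and the module URL passed in the example is an assumption about where the standard library is served, not something the diff states.

```typescript
// Each manifest entry pairs an extension name with a site-relative repository path.
type HostedExtension = [name: string, ref: string];

// Hypothetical helper: turn manifest entries into INSTALL and LOAD statements.
function extensionStatements(entries: HostedExtension[], moduleUrl: string): string[] {
  const statements: string[] = [];
  for (const [name, ref] of entries) {
    // Names are interpolated into SQL, so accept plain identifiers only.
    if (!/^\w+$/.test(name)) throw new Error(`invalid extension name: ${name}`);
    // "../" steps out of the module's directory, as `..${ref}` does in the diff.
    const repository = new URL(`..${ref}`, moduleUrl).href;
    statements.push(`INSTALL ${name} FROM '${repository}';`, `LOAD ${name};`);
  }
  return statements;
}

const statements = extensionStatements(
  [["json", "/_npm/extensions.duckdb.org"]],
  "https://example.com/_observablehq/stdlib.js" // assumed module location
);
console.log(statements.join("\n"));
```

Because the repository URL is resolved relative to the module rather than the page, the result depends on where the module is served. A module nested one directory deeper would resolve to a different repository path.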
26 changes: 25 additions & 1 deletion src/config.ts
@@ -76,6 +76,10 @@ export interface SearchConfigSpec {
index?: unknown;
}

export interface DuckDBConfig {
extensions: {[key: string]: string};
}

export interface Config {
root: string; // defaults to src
output: string; // defaults to dist
@@ -98,6 +102,7 @@ export interface Config {
normalizePath: (path: string) => string;
loaders: LoaderResolver;
watchPath?: string;
duckdb: DuckDBConfig;
}

export interface ConfigSpec {
@@ -125,6 +130,7 @@ export interface ConfigSpec {
quotes?: unknown;
cleanUrls?: unknown;
markdownIt?: unknown;
duckdb?: unknown;
}

interface ScriptSpec {
@@ -260,6 +266,7 @@ export function normalizeConfig(spec: ConfigSpec = {}, defaultRoot?: string, wat
const search = spec.search == null || spec.search === false ? null : normalizeSearch(spec.search as any);
const interpreters = normalizeInterpreters(spec.interpreters as any);
const normalizePath = getPathNormalizer(spec.cleanUrls);
const duckdb = normalizeDuckDB(spec.duckdb as any);

// If this path ends with a slash, then add an implicit /index to the
// end of the path. Otherwise, remove the .html extension (we use clean
@@ -310,7 +317,8 @@ export function normalizeConfig(spec: ConfigSpec = {}, defaultRoot?: string, wat
md,
normalizePath,
loaders: new LoaderResolver({root, interpreters}),
watchPath
watchPath,
duckdb
};
if (pages === undefined) Object.defineProperty(config, "pages", {get: () => readPages(root, md)});
if (sidebar === undefined) Object.defineProperty(config, "sidebar", {get: () => config.pages.length > 0});
@@ -488,3 +496,19 @@ export function mergeStyle(
export function stringOrNull(spec: unknown): string | null {
return spec == null || spec === false ? null : String(spec);
}

function normalizeDuckDB(spec: unknown): DuckDBConfig {
  // Accept either an array of extension names or an object mapping names to URLs (or true).
  const extensions = (spec as any)?.extensions ?? ["json", "parquet"];
  return {
    extensions: Object.fromEntries(
      Object.entries(
        Array.isArray(extensions)
          ? Object.fromEntries(extensions.map((name) => [name, true]))
          : (extensions as {[key: string]: string | true})
      ).map(([name, value]) => [
        name,
        // true selects the build from the core repository; any other value is used as an explicit URL.
        value === true ? `https://extensions.duckdb.org/v1.1.1/wasm_eh/${name}.duckdb_extension.wasm` : `${value}`
      ])
    )
  };
}
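To illustrate what this normalization is meant to produce, here is a simplified standalone sketch. `normalizeExtensions` and `ExtensionSpec` are hypothetical names. The default list and the `v1.1.1/wasm_eh` repository path are copied from the diff above, and the `example.com` URL is a placeholder.

```typescript
// An array of names is shorthand for an object whose values are all true.
type ExtensionSpec = string[] | {[name: string]: string | true};

const CORE = "https://extensions.duckdb.org/v1.1.1/wasm_eh"; // default repository, as in the diff

function normalizeExtensions(spec: ExtensionSpec = ["json", "parquet"]): {[name: string]: string} {
  const entries: [string, string | true][] = Array.isArray(spec)
    ? spec.map((name): [string, true] => [name, true])
    : Object.entries(spec);
  return Object.fromEntries(
    // true expands to the core repository URL; a string is kept as an explicit URL.
    entries.map(([name, value]) => [name, value === true ? `${CORE}/${name}.duckdb_extension.wasm` : String(value)])
  );
}

console.log(normalizeExtensions(["spatial"]));
console.log(normalizeExtensions({h3: "https://example.com/h3.duckdb_extension.wasm"}));
```

Normalizing both shorthands into a single name-to-URL map means downstream code such as `getImplicitDownloads` only has to iterate over URLs.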
5 changes: 4 additions & 1 deletion src/libraries.ts
@@ -1,3 +1,5 @@
import type {DuckDBConfig} from "./config.js";

export function getImplicitFileImports(methods: Iterable<string>): Set<string> {
const set = setof(methods);
const implicits = new Set<string>();
@@ -72,14 +74,15 @@ export function getImplicitStylesheets(imports: Iterable<string>): Set<string> {
* library used by FileAttachment) we manually enumerate the needed additional
* downloads here. TODO Support versioned imports, too, such as "npm:leaflet@1".
*/
export function getImplicitDownloads(imports: Iterable<string>): Set<string> {
export function getImplicitDownloads(imports: Iterable<string>, duckdb: DuckDBConfig): Set<string> {
const set = setof(imports);
const implicits = new Set<string>();
if (set.has("npm:@observablehq/duckdb")) {
implicits.add("npm:@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm");
implicits.add("npm:@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js");
implicits.add("npm:@duckdb/duckdb-wasm/dist/duckdb-eh.wasm");
implicits.add("npm:@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js");
for (const [, url] of Object.entries(duckdb.extensions)) implicits.add(url);
}
if (set.has("npm:@observablehq/sqlite")) {
implicits.add("npm:sql.js/dist/sql-wasm.js");
34 changes: 30 additions & 4 deletions src/npm.ts
@@ -162,7 +162,7 @@ export async function getDependencyResolver(
(name === "arquero" || name === "@uwdata/mosaic-core" || name === "@duckdb/duckdb-wasm") && depName === "apache-arrow" // prettier-ignore
? "latest" // force Arquero, Mosaic & DuckDB-Wasm to use the (same) latest version of Arrow
: name === "@uwdata/mosaic-core" && depName === "@duckdb/duckdb-wasm"
? "1.28.0" // force Mosaic to use the latest (stable) version of DuckDB-Wasm
? "1.29.0" // force Mosaic to use the latest (stable) version of DuckDB-Wasm
: pkg.dependencies?.[depName] ??
pkg.devDependencies?.[depName] ??
pkg.peerDependencies?.[depName] ??
@@ -248,9 +248,7 @@ async function resolveNpmVersion(root: string, {name, range}: NpmSpecifier): Pro
export async function resolveNpmImport(root: string, specifier: string): Promise<string> {
const {
name,
range = name === "@duckdb/duckdb-wasm"
? "1.28.0" // https://github.com/duckdb/duckdb-wasm/issues/1561
: undefined,
range = name === "@duckdb/duckdb-wasm" ? "1.29.0" : undefined,
path = name === "mermaid"
? "dist/mermaid.esm.min.mjs/+esm"
: name === "echarts"
@@ -316,3 +314,31 @@ export function fromJsDelivrPath(path: string): string {
const subpath = parts.slice(i).join("/"); // "+esm" or "lite/+esm" or "lite.js/+esm"
return `/_npm/${namever}/${subpath === "+esm" ? "_esm.js" : subpath.replace(/\/\+esm$/, "._esm.js")}`;
}

const downloadRequests = new Map<string, Promise<string>>();

/**
* Given a URL such as
* https://extensions.duckdb.org/v1.1.1/wasm_eh/parquet.duckdb_extension.wasm,
* returns the corresponding local path such as
* /_npm/extensions.duckdb.org/v1.1.1/wasm_eh/parquet.duckdb_extension.wasm
*/
export async function resolveDownload(root: string, href: string): Promise<string> {
if (!href.startsWith("https://")) throw new Error(`invalid download path: ${href}`);
const path = "/_npm/" + href.slice("https://".length);
const outputPath = join(root, ".observablehq", "cache", "_npm", href.slice("https://".length));
if (existsSync(outputPath)) return path;
let promise = downloadRequests.get(outputPath);
if (promise) return promise; // coalesce concurrent requests
promise = (async () => {
console.log(`download: ${href} ${faint("→")} ${outputPath}`);
const response = await fetch(href);
if (!response.ok) throw new Error(`unable to fetch: ${href}`);
await mkdir(dirname(outputPath), {recursive: true});
await writeFile(outputPath, Buffer.from(await response.arrayBuffer()));
return path;
})();
promise.catch(console.error).then(() => downloadRequests.delete(outputPath));
downloadRequests.set(outputPath, promise);
return promise;
}
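The two ideas in `resolveDownload`, mapping a remote URL to a served path and coalescing concurrent requests, can be sketched separately from the file system and network code. `toLocalPath` and `coalesce` are hypothetical names used only for illustration, and unlike the function above this sketch performs no download.

```typescript
// Map a remote https:// URL to the site-relative path it is served from.
function toLocalPath(href: string): string {
  if (!href.startsWith("https://")) throw new Error(`invalid download path: ${href}`);
  return "/_npm/" + href.slice("https://".length);
}

// In-flight tasks, keyed by output path.
const pending = new Map<string, Promise<string>>();

// Share one in-flight promise per key so that concurrent callers trigger a single task.
function coalesce(key: string, task: () => Promise<string>): Promise<string> {
  let promise = pending.get(key);
  if (promise) return promise; // reuse the request that is already running
  // Forget the entry once the task settles, whether it succeeded or failed.
  promise = task().finally(() => pending.delete(key));
  pending.set(key, promise);
  return promise;
}

console.log(toLocalPath("https://extensions.duckdb.org/v1.1.1/wasm_eh/parquet.duckdb_extension.wasm"));
```

Removing the entry once the task settles means a failed download can be retried on the next call instead of the rejection being cached.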
4 changes: 2 additions & 2 deletions src/preview.ts
@@ -390,9 +390,9 @@ function handleWatch(socket: WebSocket, req: IncomingMessage, configPromise: Pro
if (path.endsWith("/")) path += "index";
path = join(dirname(path), basename(path, ".html"));
config = await configPromise;
const {root, loaders, normalizePath} = config;
const {root, loaders, normalizePath, duckdb} = config;
const page = await loaders.loadPage(path, {path, ...config});
const resolvers = await getResolvers(page, {root, path, loaders, normalizePath});
const resolvers = await getResolvers(page, {root, path, loaders, normalizePath, duckdb});
if (resolvers.hash === initialHash) send({type: "welcome"});
else return void send({type: "reload"});
hash = resolvers.hash;
6 changes: 5 additions & 1 deletion src/render.ts
@@ -11,6 +11,7 @@ import {findModule} from "./javascript/module.js";
import type {TranspileModuleOptions} from "./javascript/transpile.js";
import {transpileJavaScript, transpileModule} from "./javascript/transpile.js";
import type {MarkdownPage} from "./markdown.js";
import {resolveDownload} from "./npm.js";
import type {PageLink} from "./pager.js";
import {findLink, normalizePath} from "./pager.js";
import {isAssetPath, resolvePath, resolveRelativePath} from "./path.js";
@@ -30,7 +31,7 @@ type RenderInternalOptions =

export async function renderPage(page: MarkdownPage, options: RenderOptions & RenderInternalOptions): Promise<string> {
const {data, params} = page;
const {base, path, title, preview} = options;
const {base, path, title, preview, duckdb} = options;
const {loaders, resolvers = await getResolvers(page, options)} = options;
const {draft = false, sidebar = options.sidebar} = data;
const toc = mergeToc(data.toc, options.toc);
@@ -57,6 +58,9 @@ if (location.pathname.endsWith("/")) {
</script>`)
: ""
}
<script type="application/json" id="observablehq-duckdb-hosted-extensions">${html.unsafe(
JSON.stringify(Object.entries(duckdb.extensions).map(([name]) => [name, "/_npm/extensions.duckdb.org"]))
)}</script>
<script type="module">${html.unsafe(`

import ${preview || page.code.length ? `{${preview ? "open, " : ""}define} from ` : ""}${JSON.stringify(
10 changes: 8 additions & 2 deletions src/resolvers.ts
@@ -1,5 +1,6 @@
import {createHash} from "node:crypto";
import {extname, join} from "node:path/posix";
import type {DuckDBConfig} from "./config.js";
import {findAssets} from "./html.js";
import {defaultGlobals} from "./javascript/globals.js";
import {isJavaScript} from "./javascript/imports.js";
@@ -12,6 +13,7 @@ import type {LoaderResolver} from "./loader.js";
import type {MarkdownPage} from "./markdown.js";
import {extractNodeSpecifier, resolveNodeImport, resolveNodeImports} from "./node.js";
import {extractNpmSpecifier, populateNpmCache, resolveNpmImport, resolveNpmImports} from "./npm.js";
import {resolveDownload} from "./npm.js";
import {isAssetPath, isPathImport, parseRelativeUrl, relativePath, resolveLocalPath, resolvePath} from "./path.js";

export interface Resolvers {
@@ -38,6 +40,7 @@ export interface ResolversConfig {
normalizePath: (path: string) => string;
globalStylesheets?: string[];
loaders: LoaderResolver;
duckdb: DuckDBConfig;
}

const defaultImports = [
@@ -202,7 +205,7 @@ async function resolveResolvers(
staticImports?: Iterable<string> | null;
stylesheets?: Iterable<string> | null;
},
{root, path, normalizePath, loaders}: ResolversConfig
{root, path, normalizePath, loaders, duckdb}: ResolversConfig
): Promise<Omit<Resolvers, "path" | "hash" | "assets" | "anchors" | "localLinks">> {
const files = new Set<string>(initialFiles);
const fileMethods = new Set<string>(initialFileMethods);
@@ -361,12 +364,15 @@ async function resolveResolvers(

// Add implicit downloads. (This should be maybe be stored separately rather
// than being tossed into global imports, but it works for now.)
for (const specifier of getImplicitDownloads(globalImports)) {
for (const specifier of getImplicitDownloads(globalImports, duckdb)) {
globalImports.add(specifier);
if (specifier.startsWith("npm:")) {
const path = await resolveNpmImport(root, specifier.slice("npm:".length));
resolutions.set(specifier, path);
await populateNpmCache(root, path);
} else if (specifier.startsWith("https://")) {
const path = await resolveDownload(root, specifier);
resolutions.set(specifier, path);
} else if (!specifier.startsWith("observablehq:")) {
throw new Error(`unhandled implicit download: ${specifier}`);
}
26 changes: 25 additions & 1 deletion test/libraries-test.ts
@@ -58,7 +58,31 @@ describe("getImplicitDownloads(imports)", () => {
"npm:@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm",
"npm:@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js",
"npm:@duckdb/duckdb-wasm/dist/duckdb-eh.wasm",
"npm:@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js"
"npm:@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/autocomplete.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/fts.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/icu.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/inet.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/json.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/parquet.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/spatial.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/sqlite_scanner.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/substrait.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/tpcds.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/tpch.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_eh/vss.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/autocomplete.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/fts.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/icu.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/inet.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/json.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/parquet.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/spatial.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/sqlite_scanner.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/substrait.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/tpcds.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/tpch.duckdb_extension.wasm",
"https://extensions.duckdb.org/v1.1.1/wasm_mvp/vss.duckdb_extension.wasm"
])
);
assert.deepStrictEqual(