diff --git a/.gitignore b/.gitignore
index dbd275aa..b119c17a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -49,8 +49,12 @@ StrykerOutput/
 
 # Upstream mirror state is regeneratable from
 # `references/reference-sources.json` via the sync script.
-# Do not commit it.
-references/upstreams/
+# Do not commit it. Sentinel pair (`.gitignore` + `README.md`)
+# is tracked so the directory exists on clone and contributors
+# see what it's for, parallel to `drop/` and `roms/`.
+references/upstreams/*
+!references/upstreams/.gitignore
+!references/upstreams/README.md
 
 # Lean 4 + Mathlib build artifacts. Generated by `lake build`.
 # Mathlib alone is ~6.8 GB of .olean; never commit.
diff --git a/references/upstreams/.gitignore b/references/upstreams/.gitignore
new file mode 100644
index 00000000..7c9d611b
--- /dev/null
+++ b/references/upstreams/.gitignore
@@ -0,0 +1,3 @@
+*
+!.gitignore
+!README.md
diff --git a/references/upstreams/README.md b/references/upstreams/README.md
new file mode 100644
index 00000000..8993d43d
--- /dev/null
+++ b/references/upstreams/README.md
@@ -0,0 +1,78 @@
+# `references/upstreams/` — gitignored upstream-source mirror
+
+This directory is the local checkout of every upstream source listed in
+[`references/reference-sources.json`](../reference-sources.json). It is
+**gitignored except for this README and `.gitignore`** — the contents
+are regenerated by the upstream-sync script and never committed.
+
+## Why nothing here is committed
+
+Upstream mirrors are bulky (multiple gigabytes of source trees from
+projects like Feldera, Arrow, Bond, Bonsai-Rx, BookKeeper, Capnproto,
+and dozens of others). Committing them would:
+
+- bloat the repo by orders of magnitude,
+- pin Zeta to a specific upstream snapshot (we want to track current
+  upstream main, not freeze it),
+- pollute `git log` with content the project doesn't author,
+- force every clone to download upstream history we already have via
+  the upstream's own remote.
+
+The git-ignored mirror lets contributors work locally against the
+upstream tree (read code, run benchmarks, copy patches) while keeping
+the repo itself lean.
+
+## How the mirror is regenerated
+
+`references/reference-sources.json` is the canonical list.
+[`tools/setup/common/sync-upstreams.sh`](../../tools/setup/common/sync-upstreams.sh)
+reads it and clones (or pulls) each entry under
+`references/upstreams//`. The sync script is invoked
+by `tools/setup/install.sh` and can also be run standalone. See
+[`references/README.md`](../README.md) for the broader references
+layout.
+
+## Why the sentinel pair
+
+This `.gitignore` plus `README.md` follow the same pattern as `drop/`
+(per-user staging area for incoming content) and `roms/` (gitignored
+emulator-test corpus): the sentinel preserves the directory in version
+control so contributors see it on clone, but the bulky contents stay
+local and regeneratable. Without the sentinel, an empty
+`references/upstreams/` directory either disappears at clone time or
+risks accidental commits of upstream source.
+
+Pattern documented at:
+
+- `drop/.gitignore` + `drop/README.md` (Otto-staging-zone)
+- `roms/.gitignore` + `roms/README.md` (Otto safe-ROM testbed)
+- this directory (Otto upstream-source mirror)
+
+## What does NOT live here
+
+- **Vendored upstream snapshots** that ARE committed (because the
+  project depends on them at a pinned version) live elsewhere — see
+  `references/tla-book/` for an example. Those are intentionally
+  tracked.
+- **Notes about upstream code** live under `references/notes/`, not
+  here. Notes are factory-authored prose; this directory is upstream-
+  authored source.
+- **Zeta's own artifacts** never land here. This is read-only mirror
+  territory.
+
+## How to add a new upstream
+
+1. Add an entry to `references/reference-sources.json` (license,
+   canonical URL, intended use).
+2. Run the sync script — your new upstream lands at
+   `references/upstreams//`, gitignored automatically by
+   the `*` rule above.
+3. Land the JSON change as a normal PR. The mirror clone happens on
+   each contributor's machine on first sync.
+
+## Why this README is committed
+
+Without committed prose explaining the directory's purpose, a
+new contributor seeing an empty `references/upstreams/` (after a fresh
+clone, before running the sync script) would have no signal that this
+is a real working directory. The README is the signal.
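
One way to sanity-check the sentinel rules in this change is to rebuild them in a throwaway repository and ask Git which paths it would ignore. The following is a minimal sketch, not part of the change itself: the ignore patterns are copied from the two `.gitignore` hunks above, while `example-upstream` and `main.c` are hypothetical stand-ins for whatever the sync script would clone. It assumes `git` and `mktemp` are available on the machine.

```shell
#!/bin/sh
# Sketch only: rebuilds the sentinel layout in a scratch repository.
# `example-upstream` and `main.c` are hypothetical mirror contents.
set -eu

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
cd "$scratch"
git init -q .

# Root-level rules, copied from the .gitignore hunk in this diff.
cat > .gitignore <<'EOF'
references/upstreams/*
!references/upstreams/.gitignore
!references/upstreams/README.md
EOF

# Directory-level rules, copied from references/upstreams/.gitignore.
mkdir -p references/upstreams/example-upstream
cat > references/upstreams/.gitignore <<'EOF'
*
!.gitignore
!README.md
EOF

echo '# placeholder' > references/upstreams/README.md
echo 'int main(void) { return 0; }' > references/upstreams/example-upstream/main.c

# `git check-ignore -q` exits 0 when the path is ignored, non-zero otherwise.
ignore_status() {
  if git check-ignore -q "$1"; then echo "ignored"; else echo "not ignored"; fi
}

mirror_status=$(ignore_status references/upstreams/example-upstream/main.c)
readme_status=$(ignore_status references/upstreams/README.md)
sentinel_status=$(ignore_status references/upstreams/.gitignore)

echo "mirror content: $mirror_status"
echo "README.md:      $readme_status"
echo ".gitignore:     $sentinel_status"
```

If the rules behave as the README describes, the mirrored file should be reported as ignored and the two sentinel files as not ignored. Running the same `git check-ignore` calls in a real checkout, after the sync script has populated the directory, would be the closer test.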