ci(cypress): default stateful ES to snapshot on CI, docker locally#264218

Merged
patrykkopycinski merged 7 commits into elastic:main from
patrykkopycinski:ci/auto-detect-cypress-es-from
May 5, 2026

@patrykkopycinski
Contributor

@patrykkopycinski patrykkopycinski commented Apr 17, 2026

Summary

Default Cypress stateful Elasticsearch provisioning to snapshot on CI and keep docker for local development.

The earlier switch to Docker as the universal default (#254306) was motivated by:

  • making local dev match shipped artifacts,
  • multi-arch support for Apple Silicon,
  • avoiding per-spec snapshot extraction,
  • faster warm starts on developer machines.

All four are genuine wins for local dev. On CI they either don't apply, are neutral, or are actively counter-productive. After gathering empirical data from Buildkite, the right default on CI is snapshot; on workstations the right default stays docker.

Why snapshot on CI

  1. No version-skew race. Kibana CI already resolves an ES snapshot manifest once per build in .buildkite/scripts/lifecycle/pre_build.sh against kibana-ci-es-snapshots-daily — Kibana's own daily-verified bucket, version-locked to Kibana by construction. The post-version-bump window (9.5.0, 9.6.0, …) that my earlier auto-detect probe tried to guard against doesn't actually exist for stateful Cypress on CI: the tar.gz is already there, or pre_build.sh has already failed the build before any Cypress agent starts. A Docker image for that same version is not guaranteed to exist at the same moment — which is the exact failure mode we kept running into.

  2. Docker-on-CI is not meaningfully faster on the same hardware. I pulled job durations from Buildkite for kibana-on-merge Security Solution Cypress jobs before and after #254306 ([kbn-es] Add --docker flag to yarn es snapshot) and reconciled them against the Buildkite agent machine-type change (n2-standard-4 → n2-highmem-4) that landed in the same window. Controlling for that hardware change, ES start-up on a warm CI agent differs by ~5s between snapshot tar.gz and Docker, which is within noise for a 20–40 minute Cypress group (roughly 0.2–0.4% of its runtime). The speedups originally attributed to Docker were largely a hardware upgrade.

  3. ES starts once per FTR config group, not per spec. parallel.ts provisions ES once for each group in specGroups, runs all specs in that group against the same cluster, then shuts down (see runSpecGroup). Only retry runs go per-spec. So the "Docker avoids per-spec extraction on CI" argument is mostly about retries, which are a tiny fraction of total runtime.

  4. Fewer moving parts on CI. No Docker registry auth, no Docker pull on every agent, no fallback logic between Docker and snapshot, no GCS probe script. Snapshot tar.gz is already pre-fetched/cached by the standard Kibana CI lifecycle.
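To make point 3 concrete, here is a minimal sketch of how the number of ES start-ups scales when provisioning happens once per group. It is not the code in parallel.ts: `SpecGroup`, `countEsStarts`, and the sample data are hypothetical, and it assumes each retried spec gets its own ES start-up.

```ts
// Hypothetical shape of one FTR config group; not taken from parallel.ts.
interface SpecGroup {
  ftrConfig: string;
  specs: string[];
}

interface RunStats {
  esStarts: number;
  specRuns: number;
}

// Counts ES start-ups for a run where ES is provisioned once per group and,
// by assumption, once more for every spec that has to be retried.
function countEsStarts(groups: SpecGroup[], retriedSpecs: string[]): RunStats {
  let esStarts = 0;
  let specRuns = 0;
  for (const group of groups) {
    esStarts += 1; // one cluster shared by every spec in the group
    specRuns += group.specs.length;
    for (const spec of group.specs) {
      if (retriedSpecs.indexOf(spec) !== -1) {
        esStarts += 1; // retry runs go per-spec
        specRuns += 1;
      }
    }
  }
  return { esStarts, specRuns };
}

// Made-up example: two groups, five specs, one retry.
const stats = countEsStarts(
  [
    { ftrConfig: 'config_a', specs: ['a1.cy.ts', 'a2.cy.ts', 'a3.cy.ts'] },
    { ftrConfig: 'config_b', specs: ['b1.cy.ts', 'b2.cy.ts'] },
  ],
  ['a2.cy.ts']
);
console.log(stats.esStarts, stats.specRuns);
```

With these made-up inputs the run needs three ES start-ups for six spec executions, so start-up cost grows with the number of groups and retries, not with the number of specs.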

Why keep Docker for local dev

  1. Matches shipped artifacts byte-for-byte.
  2. Native multi-arch (Apple Silicon) without a separate tar.gz pipeline.
  3. Warm starts are fast once the image is cached on the workstation.
  4. CYPRESS_ES_FROM=snapshot (or docker) still works as an explicit override for both environments.

Change

const defaultEsFrom = process.env.CI ? 'snapshot' : 'docker';
const esFrom =
  configEsFrom === 'serverless' ? 'serverless' : esFromEnv || defaultEsFrom;

Also drops the earlier detect_cypress_es_from.sh probe and its hook in setup_job_env.sh; pre_build.sh already covers the version-skew concern at a better layer.

The serverless routing fix (configEsFrom === 'serverless' wins over CYPRESS_ES_FROM) is retained from the first commit and is independent of the default flip — it prevents stateful CYPRESS_ES_FROM=snapshot from accidentally booting serverless suites against a stateful snapshot tar.gz and blowing up with unknown setting [xpack.security.authc.native_roles.enabled].
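The precedence described above can be sketched as a pure function, which makes the cases from the test plan easy to check. `resolveEsFrom` and its `isCi` parameter are hypothetical names used only for illustration; the real code reads `process.env.CI` directly, so any non-empty value (including the string "false") counts as CI there.

```ts
// Hypothetical helper mirroring the precedence in the Change snippet.
// Inputs are passed in so the logic can be exercised without process.env.
function resolveEsFrom(
  configEsFrom: string | undefined,
  esFromEnv: string | undefined,
  isCi: boolean
): string {
  // 1. A serverless FTR config always wins, whatever CYPRESS_ES_FROM says.
  if (configEsFrom === 'serverless') {
    return 'serverless';
  }
  // 2. For stateful suites, an explicit CYPRESS_ES_FROM overrides the default.
  // 3. Otherwise the default depends on where the suite runs.
  const defaultEsFrom = isCi ? 'snapshot' : 'docker';
  return esFromEnv || defaultEsFrom;
}

// The combinations listed in the test plan.
const ciDefault = resolveEsFrom(undefined, undefined, true);
const localDefault = resolveEsFrom(undefined, undefined, false);
const localOverride = resolveEsFrom(undefined, 'snapshot', false);
const serverlessOnCi = resolveEsFrom('serverless', 'snapshot', true);
console.log(ciDefault, localDefault, localOverride, serverlessOnCi);
```

Under these assumptions the four calls resolve to snapshot, docker, snapshot, and serverless respectively.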

Test plan

  • Green kibana-on-merge Security Solution Cypress jobs (stateful + serverless).
  • Green kibana-pull-request Security Solution Cypress jobs with no CYPRESS_ES_FROM set.
  • Local: yarn cypress:run ... still uses Docker by default.
  • Local: CYPRESS_ES_FROM=snapshot yarn cypress:run ... uses snapshot.
  • Serverless suites remain on serverless regardless of CYPRESS_ES_FROM.

…ESS_ES_FROM

`CYPRESS_ES_FROM` should only affect the *stateful* ES provisioning path.
Before this change, setting `CYPRESS_ES_FROM=snapshot` unconditionally
overrode the serverless path, causing serverless Cypress suites to boot
a stateful ES tar.gz that rejected serverless-only settings, e.g.:

  unknown setting [xpack.security.authc.native_roles.enabled]
  unknown setting [serverless.search.enable_replicas_for_instant_failover]

Fix the precedence so `configEsFrom === 'serverless'` always wins, and
`CYPRESS_ES_FROM` only chooses between `snapshot` / `docker` for stateful.

Cypress suites provision ES via Docker by default, which breaks for a
short window after every Kibana version bump (e.g. 9.5.0, 9.6.0) because
the unified release hasn't yet published the matching
`docker.elastic.co/elasticsearch/elasticsearch:${version}-SNAPSHOT` image.
During that window Cypress either fails to pull the image or falls through
to a mismatched older tag, producing:

  [elasticsearch-service] This version of Kibana (vX.Y.0-SNAPSHOT) is
  incompatible with the following Elasticsearch nodes in your cluster:
  vX.(Y-1).0 ...

This change adds `detect_cypress_es_from.sh`, sourced from
`setup_job_env.sh`, which probes the Elastic Docker registry for the ES
image manifest matching the current Kibana version. If the image is not
published yet, it sets `CYPRESS_ES_FROM=snapshot` so Cypress falls back
to the ES snapshot tar.gz (which is published much earlier in the release
cycle). If the image is available, `CYPRESS_ES_FROM` is left unset and
Cypress continues to use Docker as before.

The check is best-effort: any probe failure (network, auth, unexpected
response, missing `jq` / `curl`) falls back to `snapshot` so tests keep
running rather than silently using a mismatched Docker image. Explicit
`CYPRESS_ES_FROM` values set by the user are always respected.

The serverless Cypress path is unaffected because `parallel.ts` now
always prefers `configEsFrom === 'serverless'` over the env var (fixed
in the prior commit).
@patrykkopycinski
Contributor Author

/ci

Switch `detect_cypress_es_from.sh` from probing the Docker registry (with
anonymous token exchange) to probing the per-version snapshot manifest that
release-eng publishes to GCS:

  https://storage.googleapis.com/elastic-artifacts-snapshot/elasticsearch/latest/${VERSION}-SNAPSHOT.json

This is the same source of truth that `elastic/kibana` and
`elastic/elasticsearch` version-bump pipelines already gate on (see
`.buildkite/pipelines/version_bump.yml` in both repos).

Why this is better than the registry HEAD probe
-----------------------------------------------
- No registry auth dance — public GCS bucket, plain curl. Fewer failure
  modes in the detection path.
- Promotion-level signal — an ES image may be pushed to the registry before
  release-eng promotes the snapshot as the build's `latest`. The JSON
  manifest only appears after promotion, so we avoid a race where Cypress
  pulls a half-ready snapshot.
- Alignment — if ES release-eng considers a version "not ready", Kibana
  Cypress treats it the same, by construction.

Behavior is unchanged at the control-flow level: probe succeeds with
matching `.version` → leave `CYPRESS_ES_FROM` unset (Docker default); any
other outcome → safe fallback to `CYPRESS_ES_FROM=snapshot`.
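The control flow of this (later removed) probe can be sketched as a small decision function. The original was a bash script that is not shown in this thread, so everything here is an assumption made for illustration: the name `decideEsFromOverride`, the idea that the fetched manifest is passed in as a string, and the exact format of the manifest's `version` field.

```ts
// Hypothetical decision helper. Returns the value CYPRESS_ES_FROM should
// have after the probe; undefined means "leave unset" (Docker default).
function decideEsFromOverride(
  explicitEsFrom: string | undefined,
  manifestBody: string | undefined, // undefined when the fetch failed
  expectedVersion: string
): string | undefined {
  // Explicit CYPRESS_ES_FROM values set by the user are always respected.
  if (explicitEsFrom) {
    return explicitEsFrom;
  }
  if (manifestBody === undefined) {
    return 'snapshot'; // probe failed: safe fallback
  }
  try {
    const parsed: unknown = JSON.parse(manifestBody);
    const matches =
      typeof parsed === 'object' &&
      parsed !== null &&
      (parsed as { version?: unknown }).version === expectedVersion;
    if (matches) {
      return undefined; // promoted snapshot found: keep the Docker default
    }
  } catch (_err) {
    // Malformed response: fall through to the safe fallback.
  }
  return 'snapshot';
}

// Assumed version string format, for illustration only.
const expected = '9.5.0-SNAPSHOT';
console.log(decideEsFromOverride(undefined, '{"version":"9.5.0-SNAPSHOT"}', expected));
console.log(decideEsFromOverride(undefined, '{"version":"9.4.0-SNAPSHOT"}', expected));
```

Only the first call leaves the variable unset. A version mismatch, a failed fetch, or an unparsable body all end in `snapshot`, which is the fail-safe behavior the commit message describes.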
@patrykkopycinski patrykkopycinski changed the title from "[CI] Auto-detect CYPRESS_ES_FROM based on ES Docker image availability" to "[CI] Auto-detect CYPRESS_ES_FROM based on ES snapshot promotion" Apr 20, 2026
Reverts the Docker-registry / ES snapshot-manifest probe approach and
instead picks the `esFrom` default based on the execution environment:

  - CI            -> `snapshot`
  - Local dev     -> `docker`
  - Serverless    -> `serverless` (unchanged, forced by FTR config)
  - Explicit env  -> `CYPRESS_ES_FROM` still wins for stateful suites

Why this is better than probing a manifest from the Cypress runner:

1. Kibana CI already resolves the ES snapshot manifest once per build
   in `.buildkite/scripts/lifecycle/pre_build.sh` against
   `kibana-ci-es-snapshots-daily`, which is Kibana's own daily-verified
   bucket — it is version-locked to Kibana by construction, so the
   post-version-bump race the probe tried to guard against never
   manifests for stateful Cypress on CI.

2. Snapshot tar.gz extraction on a warm agent is on par with — or
   faster than — a Docker image pull + container boot on the same
   hardware. The speedups observed after the previous switch to Docker
   defaults were largely attributable to a concurrent Buildkite agent
   machine-type upgrade (n2-standard-4 -> n2-highmem-4), not to the ES
   provisioner itself.

3. Local developers still benefit from Docker: identical to shipped
   artifacts, multi-arch (Apple Silicon), and warm starts are cheap
   once the image is cached on the workstation.

4. Removes the `detect_cypress_es_from.sh` probe and its hook in
   `setup_job_env.sh` — no extra GCS round-trip per agent, no bash
   fallback logic to reason about, fewer moving parts in CI.
@patrykkopycinski patrykkopycinski changed the title from "[CI] Auto-detect CYPRESS_ES_FROM based on ES snapshot promotion" to "ci(cypress): default stateful ES to snapshot on CI, docker locally" Apr 20, 2026
@patrykkopycinski patrykkopycinski marked this pull request as ready for review April 27, 2026 15:24
@patrykkopycinski patrykkopycinski requested a review from a team as a code owner April 27, 2026 15:24
@patrykkopycinski patrykkopycinski self-assigned this Apr 27, 2026
@patrykkopycinski patrykkopycinski requested a review from mistic April 27, 2026 15:24
@patrykkopycinski patrykkopycinski added the release_note:skip, backport:version, v9.4.0, and v9.5.0 labels Apr 27, 2026
@delanni
Member

delanni commented May 5, 2026

@patrykkopycinski - let's get this merged?

@patrykkopycinski patrykkopycinski enabled auto-merge (squash) May 5, 2026 09:50
@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @patrykkopycinski

@patrykkopycinski patrykkopycinski merged commit 66c8e08 into elastic:main May 5, 2026
24 checks passed
@kibanamachine
Contributor

Starting backport for target branches: 9.4

https://github.com/elastic/kibana/actions/runs/25376912174

@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
9.4

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request May 5, 2026
…lly (#264218) (#267726)

# Backport

This will backport the following commits from `main` to `9.4`:
- [ci(cypress): default stateful ES to snapshot on CI, docker locally
(#264218)](#264218)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Patryk Kopyciński <contact@patrykkopycinski.com>