Presto: More robust handling of existing config by simoneves · Pull Request #109 · rapidsai/velox-testing

simoneves · 2025-10-27T15:54:17Z

This PR adds logic to the Presto start scripts to re-use any existing config files for the given variant in case they have been modified by the user.

Config files for the given variant will be automatically generated if absent.

A config tree is generated for each variant independently, so that the user is free to make manual edits in the generated version for one variant, but switch to another variant temporarily (e.g. to do an ANALYZE in CPU mode) without losing the edits.

Also added an --overwrite-config option to the start scripts to force regeneration of the config when deliberately switching variants or to revert an edited config back to default.

Configs generated or edited with previous versions of velox-testing will be ignored, and should ideally be removed.

simoneves · 2025-10-27T16:33:49Z

An alternative behavior (suggested by @patdevinwilson unless I'm misunderstanding him!) may be to allow the presence of independent generated (and editable) configs for all three variants, so that local edited dev versions for all three can exist in parallel, perhaps for comparison purposes.

This may be overkill. Discuss.

patdevinwilson · 2025-10-28T15:23:50Z

presto/scripts/generate_presto_config.sh

lsmem may not be available on all environments. Consider adding a fallback (e.g., /proc/meminfo or free -g) so the script works across more hosts.
Example fallback: if command -v lsmem is missing, parse /proc/meminfo or use free -g

@patdevinwilson do you have an example of a system where lsmem -b does NOT work?

patdevinwilson

LGTM with non-blocking suggestions. The change improves safety and UX. Addressing the fallback for memory detection and adding a simple config version would make this more robust, but they are not blockers for merging

patdevinwilson · 2025-10-28T15:36:09Z

An alternative behavior (suggested by @patdevinwilson unless I'm misunderstanding him!) may be to allow the presence of independent generated (and editable) configs for all three variants, so that local edited dev versions for all three can exist in parallel, perhaps for comparison purposes.

This may be overkill. Discuss.

Per-variant directories (recommended medium complexity)
Store generated configs under per-variant directories: generated/cpu/, generated/gpu/, generated/java/.
Each variant dir contains:
the generated configs
variant metadata file: variant_type.txt (or implicit from dir)
config_version or schema_version (e.g., config_version.txt)
Generation logic:
When VARIANT_TYPE is set, write new configs into generated/${VARIANT_TYPE}/.
If generated/${VARIANT_TYPE}/ already exists and --overwrite-config not set, reuse it (possibly warn if config_version mismatches current tool).
If generated/${VARIANT_TYPE}/ is missing, generate it.
Backwards compatibility:
On first run after deploy: if generated/ exists and is not per-variant (old layout), auto-migrate it to generated/<detected_variant>/ (move or copy) OR treat it as a “legacy” default but print a migration hint.
CLI UX:
Keep --overwrite-config (applies to current VARIANT_TYPE).
Add optional flags: --overwrite-config-all, --list-configs, --use-config= (explicit override).
Advantages: clean separation, simpler validation (dir name = variant), easy to list/compare.
Disadvantages: requires a migration step and a slightly larger script change.
Per-variant file suffixes or names (alternative)
Instead of subdirs, name files like config.cpu.json, config.gpu.json, config.java.json.
Simpler to implement in some shells, but when there are many files it’s cleaner to put them under a subdir.
Still need metadata/version files.

paul-aiyedun · 2025-10-29T20:27:27Z

presto/scripts/generate_presto_config.sh

-rm -rf generated
-mkdir -p generated
-cat > generated/config.json << EOF
+# generate only if no existing config or overwrite flag is set


What configurations are being modified locally? Can you please expand on where the request for the updates in the PR is coming from?

@paul-aiyedun Todd and Patrick were both iterating configs by editing their local files. Once the original auto-config PR (#48) landed, this was dangerous, because the config would be automatically re-generated every time you ran the start script, so local speculative edits would be overwritten if you hadn't copied them aside. Hence, they asked for this change.

tmostak · 2025-10-30T00:19:11Z

Apologies I'm chiming in late, but whether in this PR or a follow-up, I'd strongly advocate for separate configs for GPU, CPU Native, and Java, as I've hit some nasty surprised switching between them (after I removed the generate command from the start presto scripts, which will become the default with this PR), particularly as the number of drivers doesn't change when you move from GPU to CPU and vice-versa, with all the ensuing impacts to runtime and memory usage.

simoneves · 2025-10-30T00:28:20Z

Apologies I'm chiming in late, but whether in this PR or a follow-up, I'd strongly advocate for separate configs for GPU, CPU Native, and Java, as I've hit some nasty surprised switching between them (after I removed the generate command from the start presto scripts, which will become the default with this PR), particularly as the number of drivers doesn't change when you move from GPU to CPU and vice-versa, with all the ensuing impacts to runtime and memory usage.

@tmostak The generate script is still called in this PR. It's just smart enough not to regenerate for the same variant unless you provide the overwrite option. It also won't let you re-use a config that was made for a different variant. It will FORCE you to use the overwrite option in that case. Unless I've messed something up, but that's how it works for me.

simoneves · 2025-10-30T03:24:57Z

@tmostak et al, per our discussion, I have now made it generate and maintain separate configs per variant, so you are free to edit one variant's generated config without risk of losing that work if you have to switch to another variant.

simoneves · 2025-10-30T03:28:53Z

presto/docker/docker-compose.common.yml

 services:
  presto-base-volumes:
    volumes:
-      - ./config/generated/etc_common:/opt/presto-server/etc


Each of the three variants now has a separate generated config tree, so ALL of the mappings have to move to the leaf container recipes.

simoneves · 2025-10-30T03:34:44Z

presto/docker/docker-compose.java.yml

      service: presto-base-coordinator
    volumes:
-      - ./config/generated/etc_coordinator/config_java.properties:/opt/presto-server/etc/config.properties
+      - ./config/generated/java/etc_common:/opt/presto-server/etc


I am assuming that order is important here, to ensure that the behavior is still to map the directory and THEN map some of the individual files (to somewhere else), same as when the directory mapping was in the parent recipe.

simoneves · 2025-10-30T03:39:33Z

Something's still not right with Java mode memory config. Test job with this branch failed, although last night's CI was fine. I don't think this branch could have broken it, but I'll check it again in the morning...

mattgara

LGTM, just had some questions.

mattgara · 2025-10-30T16:58:38Z

presto/scripts/generate_presto_config.sh

+  echo -e "${GREEN}$1${NC}"
 }

 if [ ! -x ../pbench/pbench ]; then


Great little check, I'm wondering if we need to broadly apply such checks to all our scripts! (I realize this was not added in this PR but it just gets highlighted, and I think I've previously missed this.)

@misiugodfrey perhaps you can implement something equivalent in your script utilities stuff, that can be called from any script to verify that it's being run from the right location.

mattgara · 2025-10-30T17:00:49Z

presto/scripts/generate_presto_config.sh

-fi
+  # now perform other variant-specific modifications to the generated configs
+  if [[ "${VARIANT_TYPE}" == "gpu" ]]; then
+    # for GPU variant, uncomment these optimizer settings


(Also not added in this PR but just being brought to my attention in this one)

"uncomment these ... when" seems like a perfect fit for an CLI option that enables this (optionally of course!)

Yeah, this was #111. I don't think this should be a command-line option, as it MUST be enabled for GPU, and MUST NOT be enabled for CPU. This was my latest way of avoiding forking the template files, which @paul-aiyedun and others are opposed to.

mattgara · 2025-10-30T17:55:43Z

presto/scripts/generate_presto_config.sh

+  echo "Generating Presto Config files for '${VARIANT_TYPE}' for host with ${NPROC} CPU cores and ${RAM_GB}GB RAM"
+
+  # (re-)generate the config.json file
+  rm -rf ${CONFIG_DIR}


Does this need special handling in case the script fails to complete (i.e. do we need a trap to clean this up?)

Do you mean it should remove any config files it manages to generate if it doesn't generate it PROPERLY? At this point, the only thing that can fail is pbench itself, and if that fails, you get nothing anyway. The server won't start in that situation, so even if something WAS generated, the server won't try to use it, and it'll get deleted again when it next succeeds.

paul-aiyedun

Looks good to me.

@paul-aiyedun

Previous attempts at allowing automatic memory config for Java mode failed, with the Coordinator failing to start, with errors of the form: Invalid memory configuration. The sum of max total query memory per node (96636764160) and heap headroom (36399847833) cannot be larger than the available heap memory (121332826112) This turns out to be because there are two additional memory-related configs that were in the Native Coordinator config but not in the Java one which turn out to be essential in order for the memory assignment math to be correct. Thanks to @paul-aiyedun for finding the fix. Also fixes the Java config.properties file container mapping broken accidentally in #109.

More robust handling of any existing config

9776c83

simoneves requested review from Avinash-Raj, karthikeyann, mattgara, misiugodfrey, patdevinwilson, paul-aiyedun and tmostak October 27, 2025 15:54

simoneves added the improvement Improves an existing functionality label Oct 27, 2025

simoneves changed the title ~~Presto: Move robust handling of existing config~~ Presto: More robust handling of existing config Oct 27, 2025

patdevinwilson reviewed Oct 28, 2025

View reviewed changes

patdevinwilson approved these changes Oct 28, 2025

View reviewed changes

paul-aiyedun reviewed Oct 29, 2025

View reviewed changes

simoneves requested a review from paul-aiyedun October 29, 2025 20:46

Merge branch 'main' into seves/avoid_overwriting_presto_memory_config

79f76c7

simoneves mentioned this pull request Oct 29, 2025

Add Presto GPU landing page README #114

Merged

simoneves added 2 commits October 29, 2025 20:10

Separate configs per variant

15ab87d

Map separate configs per variant

c0cca8e

simoneves requested a review from patdevinwilson October 30, 2025 03:24

simoneves commented Oct 30, 2025

View reviewed changes

mattgara reviewed Oct 30, 2025

View reviewed changes

simoneves requested a review from mattgara October 30, 2025 18:04

paul-aiyedun approved these changes Oct 30, 2025

View reviewed changes

simoneves merged commit 16cdd54 into main Oct 30, 2025
1 of 3 checks passed

simoneves deleted the seves/avoid_overwriting_presto_memory_config branch October 30, 2025 19:59

This was referenced Oct 30, 2025

Presto: Fix Java memory config #115

Closed

Presto: Fix for Java Automatic Memory Config (REPLACES #115) #117

Merged

Conversation

simoneves commented Oct 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

simoneves commented Oct 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

patdevinwilson left a comment

Choose a reason for hiding this comment

Uh oh!

patdevinwilson commented Oct 28, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tmostak commented Oct 30, 2025

Uh oh!

simoneves commented Oct 30, 2025

Uh oh!

simoneves commented Oct 30, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

simoneves commented Oct 30, 2025

Uh oh!

mattgara left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mattgara Oct 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

paul-aiyedun left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

simoneves commented Oct 27, 2025 •

edited

Loading

simoneves commented Oct 27, 2025 •

edited

Loading

mattgara Oct 30, 2025 •

edited

Loading