Conversation

@GregOwen GregOwen commented Aug 21, 2018

What changes were proposed in this pull request?

The value returned by a call to spark.conf.get(key) may be specified in one of three ways (see the sketch after this list):

  1. By explicitly setting the value of key via e.g. spark.conf.set(key, value)
  2. By the client providing a default value at call-time via e.g. spark.conf.get(key, clientDefault)
  3. By the ConfigBuilder providing a default value when it is declared via e.g. SQLConf.buildConf(key).createWithDefault(builderDefault)
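
For concreteness, here is a minimal sketch of the three sources (the key name below is made up, and SQLConf.buildConf is Spark-internal API shown only in a comment):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[1]").getOrCreate()
    val key = "spark.sql.example.key"  // hypothetical key

    // (1) Explicit value set by the user.
    spark.conf.set(key, "explicitValue")

    // (2) Client default supplied at call time; it matters only when no explicit value is set.
    spark.conf.get(key, "clientDefault")  // returns "explicitValue" because of the set() above

    // (3) Builder default, declared where the conf itself is defined (Spark-internal API):
    //   SQLConf.buildConf("spark.sql.example.key").stringConf.createWithDefault("builderDefault")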

Currently, the order of precedence among these different sources of value is:

  1. Explicit value if one was set
  2. Otherwise, client default if one was provided
  3. Otherwise, builder default

There is currently one exception to this rule: if the conf was declared with another conf as its default (via e.g. SQLConf.buildConf(key).fallbackConf(parentConfigBuilder)), then we check the builder default first, walking the parent conf and potentially the parent's parent and so on, and we only use the client default if none of the confs in that fallback chain has an explicit value set.

From a user perspective, this means that the client default may or may not be used depending on whether the config in question was declared with a scalar default or a fallback conf default, which is usually not obvious to the caller. In the fallback case, the value returned by spark.conf.get may depend on an arbitrary number of parent confs, which seems likely to produce errors that are particularly difficult to diagnose.

This PR removes that exception and adds a test to guarantee that the precedence order listed above applies to all conf types. Note that this is an explicit change from the previous implementation in #19974.
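
To make the proposed order concrete, here is a small self-contained sketch of the lookup it implies. This is not Spark's implementation; ConfDef, lookup, and the example keys are invented purely for illustration:

    object PrecedenceSketch {
      // A conf's builder default is either a scalar value or a fallback to a parent conf.
      sealed trait Default
      final case class Scalar(value: String) extends Default
      final case class Fallback(parent: ConfDef) extends Default

      final case class ConfDef(key: String, default: Default)

      // Proposed precedence: explicit value, else client default, else builder default.
      // The fallback chain is only consulted at the builder-default step.
      def lookup(settings: Map[String, String], conf: ConfDef, clientDefault: Option[String]): String =
        settings.get(conf.key)
          .orElse(clientDefault)
          .getOrElse(builderDefault(settings, conf))

      private def builderDefault(settings: Map[String, String], conf: ConfDef): String =
        conf.default match {
          case Scalar(v)        => v
          case Fallback(parent) => settings.getOrElse(parent.key, builderDefault(settings, parent))
        }

      def main(args: Array[String]): Unit = {
        val parent = ConfDef("spark.example.parent", Scalar("parentDefault"))
        val child  = ConfDef("spark.example.child", Fallback(parent))
        val settings = Map.empty[String, String]

        // With nothing set explicitly, the client default wins over the fallback chain.
        println(lookup(settings, child, clientDefault = Some("clientDefault"))) // clientDefault
        // Without a client default, the fallback chain supplies the builder default.
        println(lookup(settings, child, clientDefault = None))                  // parentDefault
      }
    }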

How was this patch tested?

Unit test to verify the precedence order for confs with both scalar and fallback conf defaults.
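
For reference, a hedged sketch of what a precedence check of this shape could look like for the explicit-value and client-default cases (the key name is made up, and this is not the PR's actual test code):

    import org.apache.spark.sql.SparkSession
    import org.scalatest.funsuite.AnyFunSuite

    class ConfPrecedenceSketchSuite extends AnyFunSuite {
      test("explicit value beats client default; client default is used when nothing is set") {
        val spark = SparkSession.builder().master("local[1]").getOrCreate()
        try {
          val key = "spark.sql.example.precedenceTest"  // hypothetical, unregistered key
          // Nothing set yet: the client default supplied at call time is returned.
          assert(spark.conf.get(key, "clientDefault") == "clientDefault")
          // Once a value is set explicitly, it takes precedence over the client default.
          spark.conf.set(key, "explicitValue")
          assert(spark.conf.get(key, "clientDefault") == "explicitValue")
        } finally {
          spark.stop()
        }
      }
    }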

@gatorsmile (Member) commented:

ok to test

@GregOwen GregOwen changed the title [SPARK-22779] Fallback config defaults should behave like scalar defaults [SPARK-22779][SQL] Fallback config defaults should behave like scalar defaults Aug 21, 2018

A Member left an inline review comment on this diff excerpt:

        case _ => defaultValue
      }
    }
    Option(settings.get(key)).getOrElse(defaultValue)

This is a behavior change. We already have a conf that uses this fallback config. See:

  val SQL_STRING_REDACTION_PATTERN =
    buildConf("spark.sql.redaction.string.regex")
      .doc("Regex to decide which parts of strings produced by Spark contain sensitive " +
        "information. When this regex matches a string part, that string part is replaced by a " +
        "dummy value. This is currently used to redact the output of SQL explain commands. " +
        "When this conf is not set, the value from `spark.redaction.string.regex` is used.")
      .fallbackConf(org.apache.spark.internal.config.STRING_REDACTION_PATTERN)

SparkQA commented Aug 21, 2018

Test build #4284 has finished for PR 22174 at commit 28ea230.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@GregOwen (Contributor, Author) commented:

Closing this for now since it seems like this might break existing workflows

@GregOwen GregOwen closed this Aug 22, 2018