158 changes: 158 additions & 0 deletions Documentation/scalar.adoc
@@ -197,6 +197,164 @@ delete <enlistment>::
This subcommand lets you delete an existing Scalar enlistment from your

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

"as the upgrade story solidifies"?

> +fetch.writeCommitGraph=false::
> +	This config setting was created to help users automatically udpate their
> +	commit-graph files as they perform fetches. However, this takes time
> +	from foreground fetches and pulls and Scalar uses background maintenance
> +	for this function instead.

"update their files".

> +index.threads=true::
> +	This tells Git to automatically detect how many threads it should use
> +	when reading the index in parallel due to the `core.preloadIndex=true`
> +	setting.

Is "due to the `core.preloadIndex=true` setting" part of this
sentence still relevant?


Other than that, superbly written.  Thanks, will queue.

local file system, unregistering the repository.

REQUIRED AND RECOMMENDED CONFIG
-------------------------------

As part of both `scalar clone` and `scalar register`, certain Git config
values are set to optimize for large repositories or cross-platform support.
These options are updated in new Git versions according to the best known
advice for large repositories, and users can get the latest recommendations
by running `scalar reconfigure [--all]`.

This section lists justifications for the config values that are set in the
latest version.

am.keepCR=true::
This setting is important for cross-platform development across Windows
and non-Windows platforms and keeping carriage return (`\r`) characters
in certain workflows.

commitGraph.changedPaths=true::
This setting helps the background maintenance steps that compute the
serialized commit-graph to also store changed-path Bloom filters. This
accelerates file history commands and allows users to automatically
benefit without running a foreground command.

commitGraph.generationVersion=1::
While the preferred version is 2 for performance reasons, existing users
that had version 1 by default will need special care in upgrading to
version 2. This is likely to change in the future as the upgrade story
solidifies.

core.autoCRLF=false::
This removes the transformation of worktree files to add CRLF line
endings when only LF line endings exist. This is removed for performance
reasons. Repositories that use tools that care about CRLF line endings
should commit the necessary files with those line endings instead.

core.logAllRefUpdates=true::
This enables the reflog on all branches. While this is a performance
cost for large repositories, it is frequently an important data source
for users to get out of bad situations or to seek support from experts.

core.safeCRLF=false::
Similar to `core.autoCRLF=false`, this disables checks around whether
the CRLF conversion is reversible. This is a performance improvement,
but can be dangerous if `core.autoCRLF` is reenabled by the user.

credential.https://dev.azure.com.useHttpPath=true::
This setting enables the `credential.useHttpPath` feature only for web
URLs for Azure DevOps. This is important for users interacting with that
service using multiple organizations and thus multiple credential
tokens.

feature.experimental=false::
This disables the "experimental" optimizations grouped under this
feature config. The expectation is that all valuable optimizations are
also set explicitly by Scalar config, and any differences are
intentional. Notable differences include several bitmap-related config
options which are disabled for client-focused Scalar repos.

feature.manyFiles=false::
This disables the "many files" optimizations grouped under this feature
config. The expectation is that all valuable optimizations are also set
explicitly by Scalar config, and any differences are intentional.

fetch.showForcedUpdates=false::
This disables the check at the end of `git fetch` that notifies the user
if the ref update was a forced update (one where the previous position
is not reachable from the latest position). This check can be very
expensive in large repositories, so is disabled and replaced with an
advice message. Set `advice.fetchShowForcedUpdates=false` to disable
this advice message.

fetch.unpackLimit=1::
This setting prevents Git from unpacking packfiles into loose objects
as they are downloaded from the server. This feature was intended as a
way to prevent performance issues from too many packfiles, but Scalar
uses background maintenance to group packfiles and cover them with a
multi-pack-index, removing this issue.

fetch.writeCommitGraph=false::
This config setting was created to help users automatically update their
commit-graph files as they perform fetches. However, this takes time
from foreground fetches and pulls and Scalar uses background maintenance
for this function instead.

gc.auto=0::
This disables automatic garbage collection, since Scalar uses background
maintenance to keep the repository data in good shape.

gui.GCWarning=false::
Since Scalar disables garbage collection by setting `gc.auto=0`, the
`git-gui` tool may start to warn about this setting. Disable this
warning as Scalar's background maintenance configuration makes the
warning irrelevant.

index.skipHash=true::
Disable computing the hash of the index contents as it is being written.
This assists with performance, especially for large index files.

index.threads=true::
This tells Git to automatically detect how many threads it should use
when reading the index in parallel due to the `core.preloadIndex=true`
setting.

index.version=4::
This index version adds compression to the path names, reducing the size
of the index in a significant way for large repos. This is an important
performance boost.

merge.renames=true::
When computing merges in large repos, it is particularly important to
detect renames to maximize the potential for a result that will validate
correctly. Users performing merges locally are more likely to be doing
so because a server-side merge (via pull request or similar) resulted in
conflicts. While this is the default setting, it is set specifically to
override a potential change to `diff.renames` which a user may set for
performance reasons.

merge.stat=false::
This disables a diff output after computing a merge. This improves
performance of `git merge` for large repos while reducing noisy output.

pack.useBitmaps=false::
This disables the use of `.bitmap` files attached to packfiles. Bitmap
files are optimized for server-side use, not client-side use. Scalar
disables this to avoid some performance issues that can occur if a user
accidentally creates `.bitmap` files.

pack.usePathWalk=true::
This enables the `--path-walk` option to `git pack-objects` by default.
This can accelerate the computation and compression of packfiles created
by `git push` and other repack operations.

receive.autoGC=false::
Similar to `gc.auto`, this setting is disabled in preference of
background maintenance.

status.aheadBehind=false::
This disables the ahead/behind calculation that would normally happen
during a `git status` command. This information is frequently ignored by
users but can be expensive to calculate in large repos that receive
thousands of commits per day. The calculation is replaced with an advice
message that can be disabled by disabling the `advice.statusAheadBehind`
config.

The following settings are different based on which platform is in use:

core.untrackedCache=(true|false)::
The untracked cache feature is important for performance benefits on
large repositories, but has demonstrated some bugs on Windows
filesystems. Thus, this is set for other platforms but disabled on
Windows.

http.sslBackend=schannel::
On Windows, the `openssl` backend has some issues with certain types of
remote providers and certificate types. Override the default setting to
avoid these common problems.


SEE ALSO
--------
linkgit:git-clone[1], linkgit:git-maintenance[1].
81 changes: 43 additions & 38 deletions scalar.c
@@ -19,6 +19,7 @@
#include "help.h"

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> Add "# set by scalar" to the end of each config option to assist users
> in identifying why these config options were set in their repo.

The implementation is quite straight-forward, inlining expansion of
repo_config_set_gently() in the places that we want to add comment to.

If we had (a lot) more than two callsites, I would have suggested to
add a simple helper function, something like

    static int scalar_config_set(struct repository *r, const char *key, const char *value)
    {
	char *file = repo_git_path(r, "config");
        int res = repo_config_set_multivar_in_file_gently(r, file,
		key, value, NULL, " # set by scalar", 0);
	free(file);
	return res;
    }

and then the updates to the callers would have been absolute minimum.

Well, even with only two callsites, perhaps such a refactoring may
still have value in reducing the risk of typo in the comment.

> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

Looks good.
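
The `test_grep` checks quoted above can be imitated outside the test suite with plain `grep`. The following is a minimal sketch, not part of the patch: the helper name `list_scalar_set` and the sample file are made up for illustration, and the sample only imitates the `key = value # set by scalar` layout that the test expects.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): print every line of a Git
# config file that ends with the "# set by scalar" comment.
list_scalar_set() {
	grep -e '# set by scalar$' "$1"
}

# Sample file imitating the layout the quoted test greps for.
sample=$(mktemp)
cat >"$sample" <<'EOF'
[core]
	preloadIndex = true # set by scalar
	editor = vi
[log]
	excludeDecoration = refs/prefetch/* # set by scalar
EOF

# Prints the two annotated lines and skips the user-set "editor" entry.
list_scalar_set "$sample"
rm -f "$sample"
```

Pointing the helper at a real `.git/config` would show which entries Scalar wrote, assuming the user has not edited the trailing comments.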

#include "setup.h"
#include "trace2.h"
+#include "path.h"

static void setup_enlistment_directory(int argc, const char **argv,
const char * const *usagestr,
@@ -99,16 +100,20 @@ static int set_scalar_config(const struct scalar_config *config, int reconfigure
 {
 	char *value = NULL;
 	int res;
+	char *file = repo_git_path(the_repository, "config");

 	if ((reconfigure && config->overwrite_on_reconfigure) ||
 	    repo_config_get_string(the_repository, config->key, &value)) {
 		trace2_data_string("scalar", the_repository, config->key, "created");
-		res = repo_config_set_gently(the_repository, config->key, config->value);
+		res = repo_config_set_multivar_in_file_gently(the_repository, file, config->key,
+							      config->value, NULL,
+							      " # set by scalar", 0);
 	} else {
 		trace2_data_string("scalar", the_repository, config->key, "exists");
 		res = 0;
 	}

+	free(file);
 	free(value);
 	return res;
 }
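
The branch in `set_scalar_config()` above decides whether a value is written or left alone. As a rough model only, the same decision can be sketched in shell. The function name `scalar_config_action` and the 0/1 argument encoding are illustrative assumptions and do not appear in scalar.c.

```shell
#!/bin/sh
# Model of the branch in set_scalar_config(); the function name and the
# 0/1 argument encoding are illustrative, not part of scalar.c.
#   $1: 1 when running "scalar reconfigure", else 0
#   $2: 1 when the table entry has overwrite_on_reconfigure set, else 0
#   $3: 1 when the key already has a value in the repository, else 0
scalar_config_action() {
	if { [ "$1" = 1 ] && [ "$2" = 1 ]; } || [ "$3" = 0 ]
	then
		echo created	# value is written, with the trailing comment
	else
		echo exists	# existing value is left untouched
	fi
}

scalar_config_action 0 0 0	# key unset on first registration: created
scalar_config_action 1 0 1	# reconfigure, value already present: exists
scalar_config_action 1 1 1	# reconfigure, entry forces overwrite: created
```

In the updated table shown in this diff, `index.skipHash` is the only entry that still carries the overwrite flag, so under this model every other existing value survives a reconfigure.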
@@ -122,13 +127,33 @@ static int have_fsmonitor_support(void)
 static int set_recommended_config(int reconfigure)
 {
 	struct scalar_config config[] = {

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> From: Derrick Stolee <[email protected]>
>
> These config values were added in the original Scalar contribution,
> d0feac4e8c (scalar: 'register' sets recommended config and starts
> maintenance, 2021-12-03), but were never fully checked for validity in
> the upstream Git project. At the time, Scalar was only intended for the
> contrib/ directory so did not have as rigorous of an investigation.
>
> Each config option has its own justification for removal:
>
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.
>
> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
>
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manager so this option is no longer needed to help users who
>   were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for a comprehensive list.  Very well described.
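
Values written by an older Scalar for the keys listed above may still be present in an existing config file after this change. The sketch below is one hedged way to spot them. The helper name `find_retired_keys` is hypothetical, and the parsing is deliberately simplified: it understands plain `[section]` headers only and ignores subsections, includes, and continuation lines.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): report keys that this series
# stops setting but that may linger in a config file. Simplified parser:
# plain "[section]" headers only; keys under any other header are skipped.
find_retired_keys() {
	awk '
		/^[ \t]*\[/ {
			section = ""
			if ($0 ~ /^\[[A-Za-z0-9.-]+\][ \t]*$/)
				section = tolower(substr($0, 2, index($0, "]") - 2))
			next
		}
		section != "" && /^[ \t]*[A-Za-z][A-Za-z0-9-]*[ \t]*=/ {
			key = $0
			sub(/^[ \t]*/, "", key)
			sub(/[ \t]*=.*/, "", key)
			print section "." tolower(key)
		}
	' "$1" |
	grep -x -F -e core.preloadindex -e core.fscache \
		-e core.multipackindex -e credential.validate \
		-e pack.usesparse
}

# Sample file: two retired keys mixed with two that Scalar still sets.
sample=$(mktemp)
cat >"$sample" <<'EOF'
[core]
	preloadIndex = true
	autoCRLF = false
[pack]
	useSparse = true
	usePathWalk = true
EOF

find_retired_keys "$sample"
rm -f "$sample"
```

The helper only reports matches. Whether to unset a reported key is left to the user, since the commit message notes that several of these values are now Git's defaults anyway.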

-		/* Required */
-		{ "am.keepCR", "true", 1 },
-		{ "core.FSCache", "true", 1 },
-		{ "core.multiPackIndex", "true", 1 },
-		{ "core.preloadIndex", "true", 1 },
+		{ "am.keepCR", "true" },
+		{ "commitGraph.changedPaths", "true" },
+		{ "commitGraph.generationVersion", "1" },
+		{ "core.autoCRLF", "false" },
+		{ "core.logAllRefUpdates", "true" },
+		{ "core.safeCRLF", "false" },
+		{ "credential.https://dev.azure.com.useHttpPath", "true" },
+		{ "feature.experimental", "false" },
+		{ "feature.manyFiles", "false" },
+		{ "fetch.showForcedUpdates", "false" },
+		{ "fetch.unpackLimit", "1" },
+		{ "fetch.writeCommitGraph", "false" },
+		{ "gc.auto", "0" },
+		{ "gui.GCWarning", "false" },
+		{ "index.skipHash", "true", 1 /* Fix previous setting. */ },
+		{ "index.threads", "true"},
+		{ "index.version", "4" },
+		{ "merge.renames", "true" },
+		{ "merge.stat", "false" },
+		{ "pack.useBitmaps", "false" },
+		{ "pack.usePathWalk", "true" },
+		{ "receive.autoGC", "false" },
+		{ "status.aheadBehind", "false" },
+
+		/* platform-specific */
 #ifndef WIN32
-		{ "core.untrackedCache", "true", 1 },
+		{ "core.untrackedCache", "true" },
 #else
 		/*
 		 * Unfortunately, Scalar's Functional Tests demonstrated
@@ -142,36 +167,11 @@ static int set_recommended_config(int reconfigure)
 		 * Therefore, with a sad heart, we disable this very useful
 		 * feature on Windows.
 		 */
-		{ "core.untrackedCache", "false", 1 },
-#endif
-		{ "core.logAllRefUpdates", "true", 1 },
-		{ "credential.https://dev.azure.com.useHttpPath", "true", 1 },
-		{ "credential.validate", "false", 1 }, /* GCM4W-only */
-		{ "gc.auto", "0", 1 },
-		{ "gui.GCWarning", "false", 1 },
-		{ "index.skipHash", "false", 1 },
-		{ "index.threads", "true", 1 },
-		{ "index.version", "4", 1 },
-		{ "merge.stat", "false", 1 },
-		{ "merge.renames", "true", 1 },
-		{ "pack.useBitmaps", "false", 1 },
-		{ "pack.useSparse", "true", 1 },
-		{ "receive.autoGC", "false", 1 },
-		{ "feature.manyFiles", "false", 1 },
-		{ "feature.experimental", "false", 1 },
-		{ "fetch.unpackLimit", "1", 1 },
-		{ "fetch.writeCommitGraph", "false", 1 },
-#ifdef WIN32
-		{ "http.sslBackend", "schannel", 1 },
+		{ "core.untrackedCache", "false" },
+
+		/* Other Windows-specific required settings: */
+		{ "http.sslBackend", "schannel" },
 #endif
-		/* Optional */
-		{ "status.aheadBehind", "false" },
-		{ "commitGraph.changedPaths", "true" },
-		{ "commitGraph.generationVersion", "1" },
-		{ "core.autoCRLF", "false" },
-		{ "core.safeCRLF", "false" },
-		{ "fetch.showForcedUpdates", "false" },
-		{ "pack.usePathWalk", "true" },
 		{ NULL, NULL },
 	};
 	int i;
@@ -195,13 +195,18 @@ static int set_recommended_config(int reconfigure)
 	 * for multiple values.
 	 */
 	if (repo_config_get_string(the_repository, "log.excludeDecoration", &value)) {
+		char *file = repo_git_path(the_repository, "config");
 		trace2_data_string("scalar", the_repository,
 				   "log.excludeDecoration", "created");
-		if (repo_config_set_multivar_gently(the_repository, "log.excludeDecoration",
+		if (repo_config_set_multivar_in_file_gently(the_repository, file,
+							    "log.excludeDecoration",
 							    "refs/prefetch/*",
-							    CONFIG_REGEX_NONE, 0))
+							    CONFIG_REGEX_NONE,
+							    " # set by scalar",
+							    0))
 			return error(_("could not configure "
 				       "log.excludeDecoration"));
+		free(file);
 	} else {
 		trace2_data_string("scalar", the_repository,
 				   "log.excludeDecoration", "exists");
26 changes: 17 additions & 9 deletions t/t9210-scalar.sh
@@ -202,14 +202,17 @@ test_expect_success 'scalar clone --no-... opts' '
 test_expect_success 'scalar reconfigure' '
 	git init one/src &&
 	scalar register one &&
-	git -C one/src config core.preloadIndex false &&
+	git -C one/src config unset gui.gcwarning &&
 	scalar reconfigure one &&
-	test true = "$(git -C one/src config core.preloadIndex)" &&
-	git -C one/src config core.preloadIndex false &&
+	test false = "$(git -C one/src config gui.gcwarning)" &&
+	git -C one/src config unset gui.gcwarning &&
 	rm one/src/cron.txt &&
 	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
 	test_path_is_file one/src/cron.txt &&
-	test true = "$(git -C one/src config core.preloadIndex)" &&
+	test false = "$(git -C one/src config gui.gcwarning)" &&
+	test_grep "GCWarning = false # set by scalar" one/src/.git/config &&
+	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
+
 	test_subcommand git maintenance start <reconfigure &&
 	test_subcommand ! git maintenance unregister --force <reconfigure &&

@@ -231,25 +234,30 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
 		git init $num/src &&
 		scalar register $num/src &&
 		git -C $num/src config includeif."onbranch:foo".path something &&
-		git -C $num/src config core.preloadIndex false || return 1
+		git -C $num/src config unset gui.gcwarning || return 1
 	done &&

 	scalar reconfigure --all &&

 	for num in $repos
 	do
-		test true = "$(git -C $num/src config core.preloadIndex)" || return 1
+		test false = "$(git -C $num/src config gui.gcwarning)" || return 1
 	done
 '

 test_expect_success 'scalar reconfigure --all with detached HEADs' '
+	# This test demonstrates an issue with index.skipHash=true and
+	# this test variable for the split index. Disable the test variable.
+	GIT_TEST_SPLIT_INDEX= &&
+	export GIT_TEST_SPLIT_INDEX &&
+
 	repos="two three four" &&
 	for num in $repos
 	do
 		rm -rf $num/src &&
 		git init $num/src &&
 		scalar register $num/src &&
-		git -C $num/src config core.preloadIndex false &&
+		git -C $num/src config unset gui.gcwarning &&
 		test_commit -C $num/src initial &&
 		git -C $num/src switch --detach HEAD || return 1
 	done &&
@@ -258,7 +266,7 @@ test_expect_success 'scalar reconfigure --all with detached HEADs' '

 	for num in $repos
 	do
-		test true = "$(git -C $num/src config core.preloadIndex)" || return 1
+		test false = "$(git -C $num/src config gui.gcwarning)" || return 1
 	done
 '

@@ -290,7 +298,7 @@ test_expect_success 'scalar supports -c/-C' '
 	git init sub &&
 	scalar -C sub -c status.aheadBehind=bogus register &&
 	test -z "$(git -C sub config --local status.aheadBehind)" &&
-	test true = "$(git -C sub config core.preloadIndex)"
+	test false = "$(git -C sub config gui.gcwarning)"
 '

 test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '