Support for adding unique ID into volume mount names #62

Chuxel · 2022-07-15T19:29:36Z

This proposal is an enhancement idea to the current dev container features proposal.

This spec proposal is a port of the proposed solution to microsoft/vscode-remote-release#5679 and relates to #2 (comment).

For both Dev Container Features (#61) and normal devcontainer.json scenarios, it is sometimes necessary to have a unique identifier generated that can be referenced in different places.

The most critical example of this is named volumes. When you have multiple dev containers running in the same environment (e.g. locally), the docker-in-docker feature needs to be able to create unique volume names for each to store docker data. Any feature that references a mount point will face the same challenge.

The proposed solution is to introduce a variable like ${devcontainerId} that is a unique identifier for the resulting container. The format of the identifier does not matter as much as the fact it can be broadly used. So generally it should be alpha numeric. Features can then reference this identifier where appropriate (like in mounts).

@alexdima @chrmarti @jkeech @edgonmsft - This has been the most consistent ask from early adopters of dev container features to date. So I think we'll want to cover it in the spec.

The text was updated successfully, but these errors were encountered:

stuartleeks · 2022-07-29T14:25:47Z

Is there any consensus on how the devcontainerId value would be generated?

My initial thought when thinking about this a while back was to use the name from devcontainer.json. The downside is that changing the name would break the volume links, which isn't ideal. Especially as I often start working with a dev container for a while before noticing that I haven't put something meaningful in the name value so end up changing it later to something to help identify the project.

Another idea is to add an id property to devcontainer.json and have the CLI/extension populate it with a new randomly generated value if it doesn't exist. This could then be the value used for devcontainerId and users could still rename the dev container without breaking the volume links in features.

stuartleeks · 2022-07-29T14:54:02Z

An alternative to generating the id and modifying a user's devcontainer.json would be to use id if it exists and name (or a derivative of it) as a fallback. In this case, it might help steer into the pit of success if the extension auto-generated a random id value when a user adds a devcontainer.json via the "Remote-Containers: Add Development Container Configuration Files..." command?

chrmarti · 2022-08-11T10:11:54Z

I would use a similar approach to how we name images which uses vsc-${workspaceFolderBasename}-${workspaceFolderHash}.

To also support the repository in volume case in Remote-Containers (and possibly other cases) we need to base the hash on the id labels we use to identify the dev container (the id label includes the workspace folder path in the local case). It would be nice to have a human-readable part, but that might not be easy to derive in all cases.

So my proposal is to use a sha-256 hash of the id labels (there can be multiple such labels and they all need to match to identify the dev container).

stuartleeks · 2022-08-11T11:58:48Z

When I look at a dev container running locally, I have the following labels:

  "com.visualstudio.code.devcontainers.id": "python-3",
  "com.visualstudio.code.devcontainers.release": "v0.231.6",
  "com.visualstudio.code.devcontainers.source": "https://github.com/microsoft/vscode-dev-containers/",
  "com.visualstudio.code.devcontainers.timestamp": "Wed, 13 Apr 2022 00:28:49 GMT",
  "com.visualstudio.code.devcontainers.variant": "3.10-bullseye",
  "desktop.docker.io/wsl-distro": "Ubuntu",
  "devcontainer.local_folder": "\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project",
  "version": "0.203.5"

Which of these do you suggest are included in the hash?

My goal is to have an ID that is scoped to an individual dev container and that doesn't change across dev container renames (i.e. changing name in devcontainer.json) or VS Code release versions etc

chrmarti · 2022-08-11T14:04:48Z

In this case only "devcontainer.local_folder": "\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project" would go into the hash.

A container created from a repository cloned to a volume has different id labels, e.g.:

"vsch.local.repository": "https://github.com/chrmarti/vscode-regex.git/tree/main",
"vsch.local.repository.folder": "vscode-regex",
"vsch.local.repository.volume": "vscode-regex-main-29a8aa263ac65c28ae2f5568b6fd157f"

In the code these labels are passed around as idLabels IIRC.

Chuxel · 2022-08-11T14:22:59Z

@chrmarti I assume this would also work if a volume was used for the source code in the Remote - Containers case, correct?

chrmarti · 2022-08-11T19:09:32Z

@Chuxel Yes, the variations I'm aware of are:

Local folder:

"devcontainer.local_folder": "\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project"

"Clone Repository in Container Volume":

"vsch.local.repository": "https://github.com/chrmarti/vscode-regex.git/tree/main",
"vsch.local.repository.folder": "vscode-regex",
"vsch.local.repository.volume": "vscode-regex-main-29a8aa263ac65c28ae2f5568b6fd157f"

When inspecting a volume:

"vsch.local.volume": "myvolume"

Codespaces (here the unique id might not be relevant at the moment):

"Type": "codespaces"

(Using a fixed id label is fine when that uniquely identifies the dev container among other containers on the same Docker host.)

The Dev Container CLI only has the local folder case built-in, in all other cases we pass id labels to the CLI as command line options (--id-label).

chrmarti · 2022-08-12T15:18:52Z

So we could use a hash of the id labels and then base64 encode that like:

const crypto = require('crypto');

// Array of label=value.
const idLabels = ['"devcontainer.local_folder=C:\\Users\\chrmarti\\repos\\hello'];

const hash = crypto.createHash('sha256')
	.update(JSON.stringify(idLabels.sort())) // sort to avoid order dependency
	.digest('base64url'); // omits padding =

// For a volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed,
// base64 includes "[a-zA-Z0-9+/]".
const uniqueId = 'a' + hash.replace(/\+/g, '_')
	.replace(/\//g, '-');

console.log(uniqueId); // Example prints: adg5DShAK-VwSz7TkEPJqOyzEmKEAJc3EKAsKsiCfsRE

Notes:

Is sha256 too long?
- Do we need a cryptographic hash?
JSON.stringify needs to be specified in abstract terms as not including any additional whitespace (to avoid JavaScript specifics in the spec).
The result works for volumes, other names might have other limitations on allowed characters.
- Should compare with, e.g., container and image name limitations.
- Includes upper- and lowercase characters.

tschaffter · 2022-08-13T04:39:42Z

@chrmarti

What do you think about removing the dash from the set of available chars so that the unique ID could be easily selected whenever needed by double clicking on it?

chrmarti · 2022-08-13T12:58:34Z

@tschaffter We could use dot instead of dash (to still match base64).

tschaffter · 2022-08-15T17:34:08Z

Thanks for the prompt response! Dot would lead to the same issue, but it's maybe better than using the dash since VS Code uses the dash as separator in the name of Docker images. For example: vsc-my-project-e7f0e6fdd3e51765526df2844ad10816-features-uid. I realize that the double-click feature I mentioned depends on where the string is double clicked. But if VS Code use dash as separator in resource names, maybe it's best to exclude it from the generation of the hash.

alexdima · 2022-08-17T12:23:55Z

Some time ago I also needed to encode a sha256 in a short string (a subdomain name which are under 63 chars). I went with encoding the number in base32 which requires at most 52 chars:

function sha256AsBase32(bytes: ArrayBuffer): string {
	const array = Array.from(new Uint8Array(bytes));
	const hexArray = array.map(b => b.toString(16).padStart(2, '0')).join('');
	// sha256 has 256 bits, so we need at most ceil(lg(2^256-1)/lg(32)) = 52 chars to represent it in base 32
	return BigInt(`0x${hexArray}`).toString(32).padStart(52, '0');
}

Here's a code pointer (uses crypto browser API, but the principle is the same for nodejs): https://github.com/microsoft/vscode/blob/6fa43c8e9dd85f83ec9f3bbd48f0eeb24db3c76b/src/vs/workbench/browser/webview.ts#L5-L26

chrmarti · 2022-09-01T13:19:12Z

@alexdima I like that idea. It uses [0-9a-v] to represent numbers 0-31 and avoids punctuation characters and case-sensitivity.

chrmarti · 2022-09-05T13:02:00Z

Updated code using base32:

const crypto = require('crypto');

function uniqueIdForLabels(idLabels) {
	const stringInput = JSON.stringify(idLabels, Object.keys(idLabels).sort()); // sort properties
	const bufferInput = Buffer.from(stringInput, 'utf-8');

	const hash = crypto.createHash('sha256')
		.update(bufferInput)
		.digest();

	const uniqueId = BigInt(`0x${hash.toString('hex')}`)
		.toString(32)
		.padStart(52, '0');
	return uniqueId;
}

const examples = [
	{
		'devcontainer.local_folder': '\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project'
	},
	{
		'vsch.local.repository.volume': 'vscode-regex-main-29a8aa263ac65c28ae2f5568b6fd157f',
		'vsch.local.repository': 'https://github.com/chrmarti/vscode-regex.git/tree/main',
		'vsch.local.repository.folder': 'vscode-regex'
	},
	{
		'vsch.local.volume': 'myvolume'
	},
	{
		'Type': 'codespaces'
	}
];

const uniqueIds = examples.map(uniqueIdForLabels);
console.log(uniqueIds);
/* Prints:
[
  '1rr0ifggvslov34du7rckm9v3v1vq76fgstqpg49k5ettifijujv',
  '0ogk7ui0niiilqvq4n3cstdcmg5fa4fc2ldsetncf60o3jvu33j3',
  '09h2k0dvao1m9l93kqpcs7kenvhoeik9jlgu3q6k7l4k3nold7ce',
  '0hcbhh2c7vldoj773drm1bjldnb89u0rt96sl9nju22d9ou8d14n'
]
*/

chrmarti · 2022-09-07T08:04:26Z

Goal

Allow features to refer to an identifier that is unique to the dev container they are installed into.

E.g., the docker-in-docker feature needs a way to mount a volume per dev container.

Proposal

The identifier will be refered to as ${devcontainerId} in the feature metadata and that will be replaced with the dev container's id. It should only be used in parts of the metadata that is not used for building the image because that would otherwise prevent prebuilding the image at a time when the dev container's id is not known yet.

The identifier is derived from a set of container labels that uniquely identify the dev container on a Docker host. It is up to the implementation to choose these labels. (E.g., a single label with the workspace folder as its value, or a set of labels identifying a Git repository and a volume to clone the Git repository into.)

E.g., the docker-in-docker feature could use the dev container id with:

{
    "id": "docker-in-docker",
    "version": "1.0.4",
    // ...
    "mounts": [
        {
            "source": "dind-var-lib-docker-${devcontainerId}",
            "target": "/var/lib/docker",
            "type": "volume"
        }
    ]
}

Computing the Identfier

Input the labels as a JSON object with the object's keys being the label names and the object's values being the labels' values.
- To ensure all implemenations get to the same result, the object keys must be sorted and any optional whitespace outside of the keys and values must be removed.
Compute a SHA-256 hash from the UTF-8 encoded input string.
Use a base-32 encoded representation left-padded with '0' to 52 characters as the result.

JavaScript implemenation taking an object with the labels as argument and returning a string as the result:

const crypto = require('crypto');

function uniqueIdForLabels(idLabels) {
	const stringInput = JSON.stringify(idLabels, Object.keys(idLabels).sort()); // sort properties
	const bufferInput = Buffer.from(stringInput, 'utf-8');

	const hash = crypto.createHash('sha256')
		.update(bufferInput)
		.digest();

	const uniqueId = BigInt(`0x${hash.toString('hex')}`)
		.toString(32)
		.padStart(52, '0');
	return uniqueId;
}

jkeech · 2022-09-07T16:20:24Z

Overall looks good. This will be a useful feature.

It should only be used in parts of the metadata that is not used for building the image because that would otherwise prevent prebuilding the image at a time when the dev container's id is not known yet.

Can we formalize this to make it a requirement? For instance, we could only support that variable expansion in devcontainer properties that apply to container runtime and not image build time (mounts, runArgs, etc).

The identifier is derived from a set of container labels that uniquely identify the dev container on a Docker host. It is up to the implementation to choose these labels.

To ensure all implemenations get to the same result, the object keys must be sorted and any optional whitespace outside of the keys and values must be removed.

These two things seem to be in opposition to each other. Unless we define the labels that must be used, implementations will use different labels as the input and get different results, even if they follow the same hashing logic. I would suggest that

If we care about implementation interoperability, then we must define the set of labels that are used for the ID so that implementations all agree and use the same inputs
If we don't care about interoperability, then we also don't need to define the hashing logic and can instead just document that the labels should be hashed to produce a unique identifier for that instance of the devcontainer. We can provide an example hashing algorithm, but it's ultimately an implementation detail of the implementor at that point.

edgonmsft · 2022-09-07T16:43:33Z

+1 on the idea of defining the set of labels explicitly. Besides that this would be a great addition.

Chuxel · 2022-09-08T13:41:18Z

It should only be used in parts of the metadata that is not used for building the image because that would otherwise prevent prebuilding the image at a time when the dev container's id is not known yet.

Can we formalize this to make it a requirement? For instance, we could only support that variable expansion in devcontainer properties that apply to container runtime and not image build time (mounts, runArgs, etc).

@jkeech @edgonmsft @chrmarti Why put a limit on it? Mounts is the primary use case and we do need this to work from image labels as well so that would apply to the image. Also, if this affects the contents of the image, then the ID would be isolated to the image, which would work great. The main place there's a challenge is runtime params.

chrmarti · 2022-09-08T13:53:35Z

It should only be used in parts of the metadata that is not used for building the image because that would otherwise prevent prebuilding the image at a time when the dev container's id is not known yet.

Can we formalize this to make it a requirement? For instance, we could only support that variable expansion in devcontainer properties that apply to container runtime and not image build time (mounts, runArgs, etc).

Sounds good.

The identifier is derived from a set of container labels that uniquely identify the dev container on a Docker host. It is up to the implementation to choose these labels.

To ensure all implemenations get to the same result, the object keys must be sorted and any optional whitespace outside of the keys and values must be removed.

These two things seem to be in opposition to each other. Unless we define the labels that must be used, implementations will use different labels as the input and get different results, even if they follow the same hashing logic. I would suggest that

If we care about implementation interoperability, then we must define the set of labels that are used for the ID so that implementations all agree and use the same inputs

If we don't care about interoperability, then we also don't need to define the hashing logic and can instead just document that the labels should be hashed to produce a unique identifier for that instance of the devcontainer. We can provide an example hashing algorithm, but it's ultimately an implementation detail of the implementor at that point.

The labels to identify a dev container seem to depend on how the dev container is created (in its context, e.g., a Codespace implies a single dev container, so the label can be constant, see #62 (comment)). Other implementations might come up with additional ways to create dev containers, so we probably don't want to limit the allowed labels.

We could specify that implementations can use the devcontainer.local_folder label for interoperability when a dev container is based on a local folder and that they can choose other labels as desired.

jkeech · 2022-09-08T16:09:29Z

@jkeech @edgonmsft @chrmarti Why put a limit on it? Mounts is the primary use case and we do need this to work from image labels as well so that would apply to the image.

@Chuxel, what @chrmarti and I are suggesting is that the devcontainer ID is tied to a single instance of the devcontainer. You can have multiple instances of the same image running on the same machine, so if the ID is evaluated in an image, it's no longer unique. Therefore, the ID can only be evaluated at the time when a devcontainer is spun up, and it only applies to properties at that point which are tied to the container rather than the image. That's why we should restrict the property in the spec to only be allowed in runtime properties like mounts.

Even though the ID is evaluated at runtime does not mean that the ID variable cannot be present in image metadata. It would just remain as ${devcontainerId} in the image label (for the allowed properties like mounts) and then be replaced/evaluated by the CLI at runtime when spinning up the containers.

chrmarti · 2022-09-14T08:59:56Z

Posted the proposal with the discussed clarifications in PR #96 for review.

Chuxel · 2022-10-24T20:38:03Z

//cc @bamurtaugh FYI - Another for the list of items to add to the reference over the next week or so.

Chuxel added the proposal Still under discussion, collecting feedback label Jul 15, 2022

Chuxel mentioned this issue Jul 15, 2022

Port features.json properties for most common runtime arguments #2

Open

bamurtaugh added this to the July 2022 milestone Jul 21, 2022

bamurtaugh modified the milestones: July 2022, August 2022 Jul 28, 2022

tschaffter mentioned this issue Jul 31, 2022

Enable more than one devcontainer to use Docker at a time Sage-Bionetworks/sage-monorepo#477

Closed

Chuxel mentioned this issue Aug 10, 2022

Running multiple devcontainers with docker-in-docker in parallel fails, need unique volume name microsoft/vscode-remote-release#5679

Closed

Chuxel added the active label Aug 10, 2022

alexdima assigned chrmarti Aug 24, 2022

bamurtaugh modified the milestones: August 2022, September 2022 Aug 25, 2022

Chuxel mentioned this issue Aug 31, 2022

Docker in Docker feature only accessible by single devcontainer at a time microsoft/vscode-dev-containers#1596

Closed

kieferrm mentioned this issue Sep 4, 2022

Remote: Iteration Plan for September 2022 microsoft/vscode-remote-release#7160

Closed

7 tasks

chrmarti mentioned this issue Sep 14, 2022

Dev Container Id Proposal #96

Merged

chrmarti modified the milestones: September 2022, October 2022 Sep 30, 2022

kieferrm mentioned this issue Oct 8, 2022

Remote: Iteration Plan for October 2022 microsoft/vscode-remote-release#7315

Closed

8 tasks

chrmarti added a commit to devcontainers/cli that referenced this issue Oct 24, 2022

${devcontainerId} (devcontainers/spec#62)

edf2f1f

chrmarti added a commit to devcontainers/cli that referenced this issue Oct 24, 2022

${devcontainerId} (devcontainers/spec#62)

e651585

chrmarti closed this as completed Oct 24, 2022

chrmarti mentioned this issue Oct 24, 2022

Test: ${devcontainerId} variable microsoft/vscode-remote-release#7405

Closed

2 tasks

This was referenced Oct 24, 2022

Latest spec changes #128

Merged

Future / proposed spec changes devcontainers/devcontainers.github.io#56

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for adding unique ID into volume mount names #62

Support for adding unique ID into volume mount names #62

Chuxel commented Jul 15, 2022 •

edited

Loading

stuartleeks commented Jul 29, 2022

stuartleeks commented Jul 29, 2022

chrmarti commented Aug 11, 2022

stuartleeks commented Aug 11, 2022

chrmarti commented Aug 11, 2022

Chuxel commented Aug 11, 2022

chrmarti commented Aug 11, 2022

chrmarti commented Aug 12, 2022

tschaffter commented Aug 13, 2022

chrmarti commented Aug 13, 2022

tschaffter commented Aug 15, 2022 •

edited

Loading

alexdima commented Aug 17, 2022 •

edited

Loading

chrmarti commented Sep 1, 2022 •

edited

Loading

chrmarti commented Sep 5, 2022

chrmarti commented Sep 7, 2022

jkeech commented Sep 7, 2022 •

edited

Loading

edgonmsft commented Sep 7, 2022

Chuxel commented Sep 8, 2022 •

edited

Loading

chrmarti commented Sep 8, 2022

jkeech commented Sep 8, 2022

chrmarti commented Sep 14, 2022

Chuxel commented Oct 24, 2022

Support for adding unique ID into volume mount names #62

Support for adding unique ID into volume mount names #62

Comments

Chuxel commented Jul 15, 2022 • edited Loading

stuartleeks commented Jul 29, 2022

stuartleeks commented Jul 29, 2022

chrmarti commented Aug 11, 2022

stuartleeks commented Aug 11, 2022

chrmarti commented Aug 11, 2022

Chuxel commented Aug 11, 2022

chrmarti commented Aug 11, 2022

chrmarti commented Aug 12, 2022

tschaffter commented Aug 13, 2022

chrmarti commented Aug 13, 2022

tschaffter commented Aug 15, 2022 • edited Loading

alexdima commented Aug 17, 2022 • edited Loading

chrmarti commented Sep 1, 2022 • edited Loading

chrmarti commented Sep 5, 2022

chrmarti commented Sep 7, 2022

Goal

Proposal

Computing the Identfier

jkeech commented Sep 7, 2022 • edited Loading

edgonmsft commented Sep 7, 2022

Chuxel commented Sep 8, 2022 • edited Loading

chrmarti commented Sep 8, 2022

jkeech commented Sep 8, 2022

chrmarti commented Sep 14, 2022

Chuxel commented Oct 24, 2022

Chuxel commented Jul 15, 2022 •

edited

Loading

tschaffter commented Aug 15, 2022 •

edited

Loading

alexdima commented Aug 17, 2022 •

edited

Loading

chrmarti commented Sep 1, 2022 •

edited

Loading

jkeech commented Sep 7, 2022 •

edited

Loading

Chuxel commented Sep 8, 2022 •

edited

Loading