- Install and run
- Troubleshooting
- Example usage
- Commands description
- Workspace context for applications
- Exit codes
To install the latest version:
curl -L https://github.com/DataBiosphere/terra-cli/releases/latest/download/download-install.sh | bash && export SUPPRESS_GCLOUD_CREDS_WARNING=true
# Optional: Move to somewhere in PATH
sudo mv terra /usr/local/bin
# Talk to Verily production environment (as opposed to a test environment):
terra server set --name=verily --quiet
To install a specific version, set the version as an environment variable
export TERRA_CLI_VERSION=0.106.0
By default, the CLI will be installed without support for Docker (i.e. it won't pull the Docker image). The TERRA_CLI_DOCKER_MODE environment variable controls Docker support. Set it to
- DOCKER_NOT_AVAILABLE (default) to skip pulling the Docker image
- DOCKER_AVAILABLE to pull the image (requires Docker to be installed and running).
Re-installing will overwrite any existing installation (i.e. all JARs and
scripts will be overwritten), but will not modify the $PATH
. If you have added
the location of the Terra CLI to your $PATH
, you will therefore need to add
its location to your path again after each install.
- Java 17
- Docker 20.10.2 (Must be running if installing in DOCKER_AVAILABLE mode)
curl
,tar
,gcloud
(For install only)
Note: The CLI doesn't use gcloud
directly either during installation or normal
operation.
However, docker pull
may use gcloud
under the covers to pull the default Docker image from GCR; therefore, gcloud
is required for installation.
Note: If you are using the CLI on a Terra cloud environment, you do not need to run the commands below. You are already logged in. You can verify this by running
terra auth status
-
To Launch an OAuth flow that creates a new tab in your browser window where you will complete the login
terra auth login
-
If the machine where you're running the CLI does not have a browser available to it, then use the manual login flow by setting the browser flag using. See the Authentication section below for more details.
terra config set browser MANUAL
In order to spend money in Terra (e.g. by creating a workspace and resources within it), you need access to a billing account via a spend profile. Currently, there is a single spend profile used by each team. An admin user can grant you access. Admins, see ADMIN.md for more details.
A user can be given access to multiple spend profiles, and an admin can set the
default spend profile to use. Users can override the default spend profile used
to create a workspace by using the --spend-profile
option during workspace
create. See usage under example usage.
In order to read data from or write data to a private external resource in
Terra, you must grant the appropriate data access permissions to your proxy
group. To view the email address of your proxy group, run terra auth status
When running terra app
commands in LOCAL_PROCESS
app-launch
mode (the
default), it's necessary to install various tools locally. The following
instructions are for both macOS and Linux.
-
gcloud
- Make sure you have Python installed, then download the .tar.gz archive file from the [installation page](https://cloud.google. com/sdk/docs/install). Rungcloud version
to verify the installation.gcloud builds submit
has--gcs-bucket-resource
option to specify the--gcs-source-staging-dir
and--gcs-log-dir
options as default.
-
gsutil
- This command is included in thegcloud
CLI, or available separately here. Verify its installation withgsutil version
(also printed as part ofgcloud version
) -
bq
- This command is included withgcloud
. More details are available here. Similarly, verify its installation withbq version
. -
nextflow
- Install by downloading abash
script and running it locally. Create anextflow
directory somewhere convenient (e.g.$HOME/nextflow
) and switch to it. Then runcurl -s https://get.nextflow.io | bash
. Finally, move thenextflow
executable script to a location on the$PATH
: the$PATH
by runningsudo mv nextflow /usr/local/bin/
. Verify the installation withnextflow -version
. -
git
- Follow these instructions for installing Git on your platform.
Now, these applications are available in terra
by running, for
example, terra gsutil ls
. When run with terra
, environment variables are set
based on resources in the active workspace, and context such as the active GCP
project is set up automatically.
Clear the context file and all credentials. This will require you to login and select a workspace again.
cd $HOME/.terra
rm context.json
rm StoredCredential
A Terra CLI release includes a GitHub release of the terra-cli
repository and
a corresponding Docker image in GCR. download-install.sh
is a convenience
script that downloads the latest (or specific) version of the installation
package, unarchives it, runs the install.sh
script included inside, and then
deletes the installation package.
You can also skip the download-install.sh
script and do the installation
manually.
- Download the
terra-cli.tar
install package directly from the GitHub releases page. - Unarchive the
tar
file. - Run the installation script from the unarchived directory:
./install.sh
There is not yet an uninstaller. You can clear the entire context directory, which includes the context file, all credentials, and all JARs. This will then require a re-installation (see above).
rm -R $HOME/.terra
rm /usr/local/bin/terra
The commands below walk through a brief demo of the existing commands.
-
Fetch the user's credentials. Check the authentication status to confirm the login was successful.
terra auth login terra auth status
-
Ping the Terra server.
terra server status
-
Create a new Terra workspace and backing Google project, using the user's default spend profile. Check the current context to confirm it was created successfully.
terra workspace create --id=<my-workspace-id> terra status
-
Create a new Terra workspace and backing Google project, overriding the user's default spend profile. User must have access to the alternate spend profile.
terra workspace create --id=<my-workspace-id> --spend-profile=<alt-spend-profile> terra status
-
List all workspaces the user has read or write access to.
terra workspace list
-
If you want to use an existing Terra workspace, use the
set
command instead ofcreate
.terra workspace set --id=eb0753f9-5c45-46b3-b3b4-80b4c7bea248
-
Set the Gcloud user and application default credentials.
gcloud auth login gcloud auth application-default login
Run a Nextflow hello world example (requires Docker image set and container
running, or Nextflow to be installed locally. For Docker
support, export TERRA_CLI_DOCKER_MODE=DOCKER_AVAILABLE
before
installing terra
):
terra nextflow run hello
Run an example Nextflow workflow in the context of the Terra workspace (i.e. in the workspace's backing Google project). This is the same example workflow used in the GCLS tutorial.
-
Download the workflow code from GitHub.
git clone https://github.com/nextflow-io/rnaseq-nf.git cd rnaseq-nf git checkout v2.0 cd ..
-
Create a bucket in the workspace for Nextflow to use.
terra resource create gcs-bucket --name=mybucket --bucket-name=mybucket
-
Update the
gls
section of thernaseq-nf/nextflow.config
file to point to the workspace project and bucket we just created.gls { params.transcriptome = 'gs://rnaseq-nf/data/ggal/transcript.fa' params.reads = 'gs://rnaseq-nf/data/ggal/gut_{1,2}.fq' params.multiqc = 'gs://rnaseq-nf/multiqc' process.executor = 'google-lifesciences' process.container = 'nextflow/rnaseq-nf:latest' workDir = "$TERRA_mybucket/scratch" google.region = 'us-east1' google.project = "$GOOGLE_CLOUD_PROJECT" google.lifeSciences.serviceAccountEmail = "$GOOGLE_SERVICE_ACCOUNT_EMAIL" google.lifeSciences.network = 'network' google.lifeSciences.subnetwork = 'subnetwork' }
-
Perform a dry-run to confirm the config is set correctly.
terra nextflow config rnaseq-nf/main.nf -profile gls
-
Kick off the workflow. (This takes about 10 minutes to complete.)
terra nextflow run rnaseq-nf/main.nf -profile gls
-
To send metrics about the workflow run to a Nextflow Tower server, first define an environment variable with the Tower access token. Then specify the
-with-tower
flag when kicking off the workflow.export TOWER_ACCESS_TOKEN=***** terra nextflow run hello -with-tower terra nextflow run rnaseq-nf/main.nf -profile gls -with-tower
-
Call the Gcloud CLI tools in the current workspace context. This means that Gcloud is configured with the backing Google project and environment variables are defined that contain workspace and resource properties (e.g. bucket names, pet service account email).
terra gcloud config get-value project terra gsutil ls terra bq version
-
See the list of supported third-party tools. The CLI runs these tools in a Docker image, if
app-launch
mode isDOCKER_CONTAINER
. If theapp-launch
mode isLOCAL_PROCESS
, the CLI will assume the tools are available in the current shell environment and launch them there.terra app list
-
Print the image tag that the CLI is currently using.
terra config get image
Usage: terra [COMMAND]
Terra CLI
Commands:
app Run applications in the workspace.
auth Retrieve and manage user credentials.
bq Call bq in the Terra workspace.
config Configure the CLI.
cromwell cromwell Generate a Cromwell configuration file.
gcloud Call gcloud in the Terra workspace.
git Call git in the Terra workspace.
group Manage groups of users.
gsutil Call gsutil in the Terra workspace.
nextflow Call nextflow in the Terra workspace.
notebook Use GCP Notebooks in the workspace.
resolve Resolve a resource to its cloud id or path.
resource Manage resources in the workspace.
server Connect to a Terra server.
spend Manage spend profiles.
status Print details about the current workspace and server.
user Manage users.
version Get the installed version.
workspace Setup a Terra workspace.
- The
resolve
command is an alias for theterra resource resolve
command. - The
status
command prints details about the current workspace and server. - The
version
command prints the installed version string. - The
bq
,gcloud
,git
,gsutil
andnextflow
commands call third-party applications in the context of a Terra workspace and are aliases for theterra app [application]
command
The other commands are groupings of sub-commands, described in the sections below.
app
Applicationsauth
Authenticationconfig
Configcromwell
Cromwellgit
Gitgroup
Groupsgsutil
gsutilnotebook
Notebooksresource
Resourcesserver
Serverspend
Spenduser
Userworkspace
Workspacefolder
Folder
Usage: terra app [COMMAND]
Run applications in the workspace.
Commands:
execute [FOR DEBUG] Execute a command in the application container for the
Terra workspace, with no setup.
list List the supported applications.
The Terra CLI allows running supported third-party tools within the context of a workspace. To see supported tools, run
terra app list
The app-launch
configuration property controls how tools are run: in a Docker
container, or a local child process.
If you pass --workspace
flag, it must come immediately after the tool:
# Works
terra bq --workspace=<workpspace-id> ls
# Doesn't work, --workspace is passed to bq instead of terra
terra bq ls --workspace=<workpspace-id>
For creating resources such as BigQuery dataset or GCS bucket, you must create through terra rather than through tool. This is because terra configures permissions for you.
# Works
terra resource create gcs-bucket --name=<resource-name>
# Doesn't work
terra gsutil mb gs://<bucket-name>
Usage: terra auth [COMMAND]
Retrieve and manage user credentials.
Commands:
login Authorize the CLI to access Terra APIs and data with user credentials.
revoke Revoke credentials from an account.
status Print details about the currently authorized account.
Only one user can be logged in at a time. To login as a different user, run
terra auth login
Login uses the Google OAuth 2.0 installed application flow.
You don't need to login again after switching workspaces. You will need to login again after switching servers, because different Terra deployments may have different OAuth flows.
By default, the CLI opens a browser window for the user to click through the OAuth flow. For some use cases (e.g. CloudShell, notebook VM), this is not practical because there is no default (or any) browser on the machine. The CLI has a browser option that controls this behavior. The below command displays a URL, which the user can copy to a browser on a different machine (e.g. their laptop), complete the login prompt, and then copy/paste the response token back into a shell on the machine where they want to use the Terra CLI.
# set the browse option
> terra config set browser MANUAL
Browser launch mode for login is MANUAL (CHANGED).
# auth flow
> terra auth login
Please open the following address in a browser on any machine:
https://accounts.google.com/o/oauth2/auth?access_type=offline&approval_prompt=force&client_id=[...]
Please enter code: *****
Login successful: [email protected]
Usage: terra cluster [COMMAND]
Use spark clusters in the workspace.
Commands:
start Start a stopped Dataproc cluster within your workspace.
stop Stop a started Dataproc cluster within your workspace.
launch Launch a running Dataproc cluster within your workspace.
You can create a Dataproc (controlled resource) with
terra resource create gcp-dataproc-cluster --name=<resourcename> --bucket=<configBucket> --temp-bucket=<tempBucket> [--workspace=<id>]
Note: You must have controlled config and temp gcs buckets created within your workspace before creating a cluster.
These stop
, start
and launch
commands are provided for convenience.
- gcp-dataproc-clusters are supported on workspaces
created on cloud platform GCP. You can also stop and start the notebook using
the
gcloud dataproc clusters [start|stop]
commands.
To launch a component web interface, ensure that the component has been enabled during cluster creation via the --components
flag, then run:
terra cluster launch --name=<cluster-name> --proxy-view=<component>
The supported components are: YARN_RESOURCE_MANAGER, MAPREDUCE_JOB_HISTORY, SPARK_HISTORY_SERVER, HDFS_NAMENODE, YARN_APPLICATION_TIMELINE, HIVESERVER2, TEZ, JUPYTER, JUPYTERLAB, ZEPPELIN, SOLR, FLINK_HISTORY_SERVER.
See Dataproc optional components for more details.
Usage: terra config [COMMAND]
Configure the CLI.
Commands:
get Get a configuration property value.
list List all configuration properties and their values.
set Set a configuration property value.
These commands are property getters and setters for configuring the Terra CLI. Currently, the available configuration properties are:
OPTION VALUE DESCRIPTION
app-launch DOCKER_CONTAINER app launch mode
browser AUTO browser launch for login
image gcr.io/terra-cli-dev/terra-cli/0.246.0:stable docker image id
resource-limit 1000 max number of resources to allow per workspace
console-logging OFF logging level for printing directly to the terminal
file-logging INFO logging level for writing to files/Users/ginay/.terra/logs
server broad-dev-cli-testing (unset)
workspace (unset) (unset)
format TEXT output format
Utility commands for using the Cromwell workflow engine with Terra.
Usage: terra cromwell [COMMAND]
Commands related to Cromwell workflows.
Commands:
generate-config Generate a Cromwell configuration file (cromwell.conf) for use on a Terra workspace cloud environment.
To run Cromwell in a notebook instance:
-
Generate the config
terra cromwell generate-config \ (--workspace-bucket-name=bucket_name | --google-bucket-name=gs://my-bucket) \ [--dir=my/path]
-
One of
workspace-bucket-name
orgoogle-bucket-name
is required to specify the bucket used by Cromwell for workflow orchestration.workspace-bucket-name
is a Terra resource name.google-bucket-name
is a Google Cloud Storage bucket. Ifgoogle-bucket-name
does not begin with thegs://
prefix, it will be automatically added.
-
Start the Cromwell server on
localhost:8000
, runjava -Dconfig.file=path/to/cromwell.conf -jar cromwell/cromwell-81.jar server
-
In another terminal window, run
cromshell
. Enterlocalhost:8000
for cromwell server. -
Start workflow through cromshell: e.g.
cromshell submit workflow.wdl inputs.json [options.json] [dependencies.zip]
For more information, see https://github.com/broadinstitute/cromshell.
Usage: terra git [COMMAND]
Call git command in the terra workspace. Besides calling normal Git operation, this command allow cloning git-repo resources in the workspace.
Commands:
all Clone all the git-repo resources in the workspace. Usage: terra git clone --all
resource Clone specified git-repo resources in the workspace. Usage: terra git clone --resource=<repoResource1Name> --resource=<repoResource2Name>
To add a git repo:
terra resource add-ref git-repo --name=<resource_name> --repo-url=<repo_url>
Usage: terra group [COMMAND]
Manage groups of users.
Commands:
add-user Add a user to a group with a given policy.
create Create a new Terra group.
delete Delete an existing Terra group.
describe Describe the group.
list List the groups to which the current user belongs.
list-users List the users in a group.
remove-user Remove a user from a group with a given policy.
Terra groups are managed by SAM. These commands are utility wrappers around the group endpoints.
Say a Terra group's email is [email protected]
. name
is mygroup
,
not [email protected]
:
terra group list-users --name=mygroup
Adding a member to a Terra group implicitly adds their pet service accounts. For
example, say terra-user
is added to [email protected]
. When mygroup
is
granted access to a resource, terra-user
is able to access that resource from
any of their Terra workspaces.
You can run terra gsutil
or terra gcloud alpha storage
. gcloud alpha storage
is a newer version of gsutil
. It doesn't support everything, but what it does
support may be significantly faster.
Usage: terra notebook [COMMAND]
Use GCP Notebooks in the workspace.
Commands:
start Start a stopped Notebook instance within your workspace.
stop Stop a running Notebook instance within your workspace.
launch Launch a running Notebook instance within your workspace.
You can create a notebook (controlled resource) with
terra resource create [notebook-type] --name=<resourcename> [--workspace=<id>]
These stop
, start
and launch
commands are provided for convenience.
- [gcp-notebooks](https://cloud.google.
com/vertex-ai/docs/workbench/notebook-solution) are supported on workspaces
created on cloud platform GCP. You can also stop and start the notebook using
the
gcloud notebooks instances [start|stop]
commands. - sagemaker-notebooks are supported on workspaces created on cloud platform
AWS. You can also stop and start the notebook using the
aws --profile=profile-name sagemaker [start-notebook-instance|stop-notebook-instance]
commands.
Usage: terra resource [COMMAND]
Manage resources in the workspace.
Commands:
add-ref, add-referenced Add a new referenced resource.
check-access Check if you have access to a referenced resource.
credentials Retrieve temporary credentials to access a cloud resource.
create, create-controlled Add a new controlled resource.
delete Delete a resource from the workspace.
describe Describe a resource.
list List all resources.
list-tree List all resources and folders in tree view.
mount Mounts all workspace bucket resources.
move Move resource to a folder.
open-console Retrieve console link to access a cloud resource.
resolve Resolve a resource to its cloud id or path.
unmount Unmounts all workspace bucket resources.
update Update the properties of a resource.
A controlled resource is a cloud resource that is managed by Terra. It exists
within the current workspace context. For example, a bucket within the workspace
Google project. You can create these with the create
command.
A referenced resource is a cloud resource that is NOT managed by Terra. It
exists outside the current workspace context. For example, a BigQuery dataset
hosted outside of Terra or in another workspace. You can add these with the
add-ref
command. The workspace currently supports the following referenced
resource:
gcs-bucket
gcs-object
bq-dataset
bq-table
git-repo
The check-access
command lets you see whether you have access to a particular
resource. This is useful when a different user created or added the resource and
subsequently shared the workspace with you. check-access
currently always
returns true for git-repo
reference type because workspace doesn't support
authentication to external git services yet.
The list of resources in a workspace is maintained on the Terra Workspace Manager server.
User can update the name and description of a reference resource. User can also
update a reference resource to another of the same type. For instance, if a user
creates a reference resource to Bq dataset foo
and later on wants to point to
Bq dataset bar
in the same project, one can use the below command to update
the reference. However, one is not allowed to update the reference to a
different type (e.g. update a dataset reference to a data table reference is not
allowed).
terra resource udpate --name=<fooReferenceName> --new-dataset-id=bar
GCS bucket lifecycle rules are specified by passing a JSON-formatted file path
to the terra resource create gcs-bucket
command. The expected JSON structure
matches the one used by the gsutil lifecycle
command. This
structure is a subset of the GCS
resource specification.
Below are some example file contents for specifying a lifecycle rule.
-
Change the storage class to
ARCHIVE
after 10 days.{ "rule": [ { "action": { "type": "SetStorageClass", "storageClass": "ARCHIVE" }, "condition": { "age": 10 } } ] }
-
Delete any objects with storage class
STANDARD
that were created before December 3, 2007.{ "rule": [ { "action": { "type": "Delete" }, "condition": { "createdBefore": "2007-12-03", "matchesStorageClass": [ "STANDARD" ] } } ] }
-
Delete any objects that are more than 365 days old.
{ "rule": [ { "action": { "type": "Delete" }, "condition": { "age": 365 } } ] }
There is also a command shortcut for specifying this type of lifecycle rule (3).
terra resource create gcs-bucket --name=mybucket --bucket-name=mybucket --auto-delete=365
A reference to an GCS bucket object can be created by calling
terra resource add-ref gcs-object --name=referencename --bucket-name=mybucket --object-name=myobject
A file or folder is treated as an object in GCS bucket. By either creating a
folder through the cloud console UI or copying an existing folder of files to
the GCS bucket, a user can create a folder object. So the user can create a
reference to the folder if they have at least READER
access to the bucket
and/or READER
access to the folder. Same with a file.
Different from other referenced resource type, there is also support for
creating a reference to objects in the folder. For instance, a user may create
a foo/
folder with bar.txt
and secret.txt
in it. If the user have at least
READ access to foo/ folder, they have access to anything in the foo/ folder. So
they can add a reference to foo/bar.txt
, foo/\*
or foo/\*.txt
.
NOTE Be careful to provide the correct object name when creating a reference. We only check if the user has READER access to the provided path, we do not check whether the object exists. This is helpful because when referencing to foo/*, it is actually not a real object! So a reference to
fooo/
(where objectfooo
does not exist) can be created if the user hasREADER
access to the bucket orfoo/\*.png
(where there is no png files) if they have access to thefoo/
folder.
Users can mount GCS buckets and referenced folder objects locally to the user's
home directory in $HOME/workspace/
by running
terra resource mount
Users can specify the --name
flag with the name of a GCS bucket or GCS object
resource to only mount that individual resource. This flag is useful for
remounting a resource that had failed to mount or has been moved to a different
folder in the workspace.
By default, controlled GCS buckets and referenced folder objects created by the
user will be mounted with read-write permissions while controlled buckets
created by other users and referenced bucket folders will be mounted with
read-only permissions. Users can override this default behavior by specifying
the --read-only
flag.
# all mounts to be read-only
terra resource mount --read-only
# all mounts to be read-write
terra resource mount --name=mybucket --read-only=false
Users can specify the --disable-cache
flag. This will disable file metadata
caching and file type caching for objects in the mounted buckets. List
operations such as ls
will be slower, but will reflect the most up-to-date
state of the bucket. This is useful when working with collaborators in a shared
workspace. See more details in
the gcsfuse
repository.
If a mount has failed, an empty directory will be left at mount point with the resource name and a suffix error string indicating the failure. Users can remount the bucket after resolving bucket access or bucket reference issues.
Unmounting a single resource can fail if the resource has been renamed or moved
to a different workspace folder. In this case, users can either
run terra resource unmount
to unmount all mounted resources
in $HOME/workspace/
. Or, users can directly list out all mounted filesystems
with mount
and then unmount the resource using its mount path
with fusermount -u
(for linux) or umount
for (MacOS).
Usage: terra server [COMMAND]
Connect to a Terra server.
Commands:
list List all available Terra servers.
set Set the Terra server to connect to.
status Print status and details of the Terra server context.
A Terra server or environment is a set of connected Terra services (e.g. Workspace Manager, Data Repo, SAM).
Workspaces exist on a single server, so switching servers will change the list of workspaces available to you.
These commands are intended for admin users. Admins, see ADMIN.md for more details.
These user management commands are intended for admin users. Admins, see ADMIN.md for more details.
Ensure you have the latest CLI version. To install new CLI version, first manually uninstall the existing CLI and then install the latest CLI.
terra user ssh-key
is how Terra does source control in a notebook environment.
It handles the ssh key of the current user. There is one single Terra ssh key
per user in a given server (e.g. broad-dev). With this SSH key, you can perform
source control in a terra-managed notebook instance using git.
To set up an ssh key, add the terra ssh key to your local machine using the below command
terra user ssh-key add
You should see in the output an ssh public key starting with ssh-rsa
. Then
copy the public key from the command output and add it to
GitHub. GitHub's instruction link.
If you think your key is compromised (e.g. the private key on your local machine is leaked to other user), you must delete the key from your GitHub account and generate a new Terra ssh key using the below command
terra user ssh-key generate
Once a new key is generated, you need to associate this new key with your GitHub account again. GitHub's instruction link.
Usage: terra workspace [COMMAND]
Setup a Terra workspace.
Commands:
add-user Add a user or group to the workspace.
break-glass Grant break-glass access to a workspace user.
configure-aws Generate an AWS configuration file for a workspace.
duplicate Duplicate an existing workspace.
create Create a new workspace.
delete Delete an existing workspace.
delete-property Delete the workspace properties.
describe Describe the workspace.
list List all workspaces the current user can access.
list-users List the users of the workspace.
remove-user Remove a user or group from the workspace.
set Set the workspace to an existing one.
set-property Set the workspace properties.
update Update an existing workspace.
A Terra workspace created on cloud platform GCP is backed by a Google project. Creating/deleting a workspace also creates/deletes the project.
The break-glass
command is intended for admin users. Admins,
see ADMIN.md
for more details.
Usage: terra folder [COMMAND]
Setup a folder in a Terra workspace.
Comands:
create create a folder
delete delete a folder
update update a folder's name, description, parent folder
set-property set key-value properties on a folder
tree print out the folders in the workspace in a tree hiearchy
The Terra CLI defines a workspace context for applications to run in. This context includes:
GOOGLE_CLOUD_PROJECT
environment variable set to the backing google project id.GOOGLE_SERVICE_ACCOUNT_EMAIL
environment variable set to the current user's pet SA email in the current workspace.- Environment variables that are the name of the workspace resources, prefixed
with
TERRA_
are set to the resolved cloud identifier for those resources ( e.g.mybucket
->TERRA_mybucket
set togs://mybucket
). Applies to referenced and controlled resources.
To use a workspace reference in a Terra CLI command, escape the environment variable to bypass the shell substitution on the host machine.
Example commands for creating a new controlled bucket resource and then
using gsutil
to get its IAM bindings.
> terra resource create gcs-bucket --name=mybucket --bucket_name=mybucket
Successfully added controlled GCS bucket.
> terra gsutil iam get \$TERRA_mybucket
Setting the gcloud project to the workspace project
Updated property [core/project].
{
"bindings": [
{
"members": [
"projectEditor:terra-wsm-dev-e3d8e1f5",
"projectOwner:terra-wsm-dev-e3d8e1f5"
],
"role": "roles/storage.legacyBucketOwner"
},
{
"members": [
"projectViewer:terra-wsm-dev-e3d8e1f5"
],
"role": "roles/storage.legacyBucketReader"
}
],
"etag": "CAE="
}
To use a workspace reference in a file or config that will be read by an application, do not escape the environment variable. Since this will be running inside the Docker container or local process, there is no need to bypass shell substitution.
Example nextflow.config
file that includes a reference to a bucket resource in
the workspace, the backing Google project, and the workspace pet SA email.
profiles {
gls {
params.transcriptome = 'gs://rnaseq-nf/data/ggal/transcript.fa'
params.reads = 'gs://rnaseq-nf/data/ggal/gut_{1,2}.fq'
params.multiqc = 'gs://rnaseq-nf/multiqc'
process.executor = 'google-lifesciences'
process.container = 'nextflow/rnaseq-nf:latest'
workDir = "$TERRA_mybucket/scratch"
google.region = 'us-east1'
google.project = "$GOOGLE_CLOUD_PROJECT"
google.lifeSciences.serviceAccountEmail = "$GOOGLE_SERVICE_ACCOUNT_EMAIL"
google.lifeSciences.network = 'network'
google.lifeSciences.subnetwork = 'subnetwork'
}
}
To see all environment variables defined in the Docker container or local process when applications are launched
terra app execute env
The terra app execute ...
command is intended for debugging. It lets you
execute any command in the Docker container or local process, not just the ones
we've officially supported (i.e. gsutil
, bq
, gcloud
, nextflow
).
To run tools that are not yet supported by the Terra CLI, or to use local
versions of tools, set the app-launch
configuration property to launch a child
process on the local machine instead of inside a Docker container.
terra config set app-launch LOCAL_PROCESS
Then call the tool with terra app execute
. Before running the tool command,
the CLI defines environment variables for each workspace resource and
configures gcloud
with the workspace project. After running the tool command,
the CLI restores the original gcloud
project configuration.
terra app execute dsub \
--provider google-v2 \
--project \$GOOGLE_CLOUD_PROJECT \
--regions us-central1 \
--logging \$TERRA_MY_BUCKET/logging/ \
--output OUT=\$TERRA_MY_BUCKET/output/out.txt \
--command 'echo "Hello World" > "${OUT}"' \
--wait
(Note: The command above came from
the dsub
README
.)
Accessing AWS Workspace resources via the AWS CLI or SDK can be configured using
the terra workspace configure-aws
command. This command writes an
AWS configuration file
with two profiles for each AWS resource in the workspace to a file named after
the Workspace in $HOME/terra/aws/
.
The output of this command is meant to be used with the
bash eval
command to set up the current
environment to access resource in the current workspace by outputting a bash
command to set the AWS_CONFIG_FILE
environment variable to the newly created
AWS configuration file. In turn this file contains two profiles for each
resource, one named after the resource, and one suffixed with -ro
; the former
provides write/read access to the resource, the latter provides read-only.
Workspace has a single AWS S3 Storage folder (output truncated):
> terra resource describe --name aws_folder_20230422
Name: aws_folder_20230422
Description: My First Storage Folder
Type: AWS_S3_STORAGE_FOLDER
Region: us-east-1
...
AWS S3 Storage Folder: s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/
# Objects: 0
Call terra workspace configure-aws
wrapped in a bash eval
command and note
that environment variable AWS_CONFIG_FILE
points at the newly written config
file:
> eval "$(terra workspace configure-aws)"
> echo $AWS_CONFIG_FILE
/Users/jczerk/.terra/aws/verily_devel/jczerk_aws_202304131028.conf
Now we can use profile aws_folder_20230422
to copy a file into our S3 Storage
Folder:
> aws --profile=aws_folder_20230422 s3 cp /tmp/hello.txt s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/
upload: /tmp/hello.txt to s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/hello.txt
And using read-only profile aws_folder_20230422-ro
allows us to list this
file:
> aws --profile=aws_folder_20230422-ro s3 ls s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/
2023-04-22 09:53:15 0
2023-05-01 11:32:10 14 hello.txt
But not delete it:
> aws --profile=aws_folder_20230422-ro s3 rm s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/hello.txt
delete failed: s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/hello.txt An error occurred (AccessDenied) when calling the DeleteObject operation: Access Denied
Switching back to profile aws_folder_20230422
, delete succeeds:
> aws --profile=aws_folder_20230422 s3 rm s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/hello.txt
delete: s3://v0-saas-devel-us-east-1-terra/aws_folder_20230422/hello.txt
Caching credentials using
tool aws-vault
is recommended, and
can be configured using options --cache-with-aws-vault
and
--aws-vault-path
.
The CLI sets the process exit code as follows.
- 0 = Successful program execution
- 1 = User-actionable error (e.g. missing parameter, workspace not defined in the current context)
- 2 = System or internal error (e.g. error making a request to a Terra service)
- 3 = Unexpected error (e.g. null pointer exception)
App exit codes will be passed through to the caller. e.g.
If gcloud --malformedOption
returns exit code 2
, then
terra gcloud --malformedOption
will also return exit code 2
.