Load data into Blob Container in the CLFS format

Ze Qian Zhang edited this page Mar 8, 2022 · 6 revisions

DEPRECATED

Please note that this command has been deprecated. The recommended approach is to mount an HPC Cache over NFS and use it as a regular POSIX filesystem.

Getting Started

The load command (only available on Linux) can copy data into an empty Azure Blob Container and store it in Microsoft's Avere Cloud FileSystem (CLFS) format. The proprietary CLFS format is used by the Azure HPC Cache and Avere vFXT for Azure products.

This command is a simple way to move existing data to cloud storage for use with specific Microsoft high-performance computing cache products. Because these products use a proprietary cloud filesystem format to manage data, that data cannot be loaded through the native copy command. Instead, the data must be loaded either through the cache product itself or via this load command, which writes the correct proprietary format. This command lets you transfer data without using the cache, for example to pre-populate storage or to add files to a working set without increasing cache load.

The command relies on a Python-based extension called CLFSLoad. Before running the load command, install the extension with:

pip3 install clfsload~=1.0.23 # For AzCopy 10.5

To validate that the extension is installed correctly and accessible in your PATH, run:

CLFSLoad.py

If you encounter any issues, please try the steps listed here.
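
The PATH check can also be scripted so that a load is not attempted when the extension is missing. The following is a minimal sketch, not part of AzCopy: the function name `find_on_path` is hypothetical, and it relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Hypothetical helper: print where an executable resolves on PATH,
# or return non-zero if it cannot be found.
find_on_path() {
    name="$1"
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
    else
        echo "not found on PATH: $name" >&2
        return 1
    fi
}

# Example use (left commented out so that sourcing this file has no side effects):
# find_on_path CLFSLoad.py || echo "install the extension or fix PATH first" >&2
```

If the lookup fails right after a user-level pip3 install, the script directory (commonly ~/.local/bin on Linux) may be missing from PATH.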

Usage:

azcopy load clfs [local dir] [container URL] [flags]

Examples

Load an entire directory:

azcopy load clfs "/path/to/dir" "https://[account].blob.core.windows.net/[container]?[SAS]" --state-path="/path/to/state/path"

Sample output:

INFO: Invoking the CLFSLoad Extension located at: /home/zed/.local/bin/CLFSLoad.py
INFO: Init phase has started.
INFO: CLFSLoad Extension version: 1.0.23
INFO: CLFSLoad Extension configurations: compression type=LZ4, preserve hard links=false
INFO: Starting a new job.
INFO: Transfer phase has started.
INFO: Finalize phase has started.
97.3 %, 994 Done, 0 Failed, 30 Pending, 0 Skipped, 1024 Total, Throughput (Mb/s): 2.517131081650587

Elapsed Time (Minutes): 0.8233
Total Number Of Transfers: 1024
Number of Transfers Completed: 1024
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 16971254
Final Job Status: Completed
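
When the command runs from a script, the closing summary can be checked mechanically. The sketch below assumes the summary labels appear exactly as in the sample output above ("Final Job Status" and "Number of Transfers Failed"). The function name `job_succeeded` is hypothetical, and status values other than "Completed" are not documented on this page, so anything else is treated as a failure.

```shell
#!/bin/sh
# Hypothetical helper: read azcopy output on stdin and succeed only if the
# summary reports "Final Job Status: Completed" with zero failed transfers.
# ASSUMPTION: label text matches the sample output shown on this page.
job_succeeded() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }                      # drop trailing whitespace
        /^Final Job Status:/           { status = $2 }
        /^Number of Transfers Failed:/ { failed = $2 }
        END { exit !(status == "Completed" && failed + 0 == 0) }
    '
}

# Example use (commented out; requires azcopy and a real container URL):
# azcopy load clfs "/path/to/dir" "$CONTAINER_URL" --state-path="/path/to/state/path" \
#     | tee load.log | job_succeeded || echo "load needs attention, see load.log" >&2
```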

To resume a failed job:

azcopy load clfs "/path/to/dir" "https://[account].blob.core.windows.net/[container]?[SAS]" --state-path="/path/to/state/path" --new-session=false
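
The first run and the resume can be combined in a small retry wrapper. This is a sketch under stated assumptions rather than documented AzCopy behavior: it assumes azcopy exits with a non-zero status when a job does not complete. The function name `load_with_resume` and the `AZCOPY` override variable are hypothetical. Only the `--state-path` and `--new-session` flags come from the commands above.

```shell
#!/bin/sh
# Sketch: run `azcopy load clfs`, resuming with --new-session=false on failure.
# ASSUMPTION: azcopy exits non-zero when the job does not complete.
# AZCOPY (hypothetical override) selects the binary; it defaults to "azcopy".
load_with_resume() {
    src="$1"
    dest="$2"
    state="$3"
    max_attempts="${4:-3}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Rebuild the argument list on every pass.
        set -- load clfs "$src" "$dest" "--state-path=$state"
        if [ "$attempt" -gt 1 ]; then
            # Later attempts resume the job recorded under the state path.
            set -- "$@" --new-session=false
        fi
        if "${AZCOPY:-azcopy}" "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "load did not complete after $max_attempts attempts" >&2
    return 1
}

# Example use (commented out; requires azcopy and a real container URL):
# load_with_resume "/path/to/dir" "$CONTAINER_URL" "/path/to/state/path" 3
```

The state path is the same on every attempt so that a resumed session can find the record of the earlier one.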

For more information, refer to the in-app help message:

azcopy load clfs -h

Note

This is a preview release of the load command. Please report any issues on the AzCopy GitHub repo.