Usage
The Python command hpc_campaign is the entry point to all commands:
connector Launches a service that can make SSH tunnels on demand to remote hosts
genkey Generates/validates keys used for encrypting datasets in the campaign archive
list Lists the available campaign archives, all or those that match an expression
manager The main command with many sub-commands to create/delete/update a campaign archive
taridx Creates an index from a TAR file that can be used to point to replicas on an archival storage
cache Lists/clears the content of the local cache
Creating a Campaign archive file
The hpc_campaign manager command is the primary tool for creating, modifying, and viewing campaign archive files (.aca). It enables full lifecycle management for datasets, replicas, and archival storage locations.
A campaign archive name without the required .aca extension will be automatically corrected. Relative paths for archive names are resolved using the campaignstorepath defined in ~/.config/hpc-campaign/config.yaml unless otherwise specified. Multiple sub-commands can be chained in a single execution.
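The naming rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the tool's actual implementation; the function name resolve_archive_path is hypothetical, and the store path is a placeholder.

```python
from pathlib import Path


def resolve_archive_path(name: str, campaign_store: str) -> Path:
    """Illustrative only: mimic the documented naming rules for archives."""
    path = Path(name)
    # Append the .aca extension when it is missing.
    if path.suffix != ".aca":
        path = path.with_name(path.name + ".aca")
    # Relative names are resolved against the campaign store path.
    if not path.is_absolute():
        path = Path(campaign_store) / path
    return path


print(resolve_archive_path("demoproject/test_campaign_001", "/path/to/campaign-store"))
```

An absolute archive name is returned unchanged apart from the extension check, which matches the statement that only relative paths are resolved against the store.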
Note
Recording that data has moved to another location is only properly supported in version 0.6 if the new location is an archival location (see the hpc_campaign manager <archive> archived --move command). Otherwise, one has to manually add the dataset at its new location to the campaign archive and delete the old replica from the campaign archive.
Global Usage and Options
The manager command is invoked using the following general format:
usage: hpc_campaign manager <archive> [sub-command] [options]
The following global options of the manager command override the defaults:
--campaign_store, -s <CAMPAIGN_STORE> specifies the path to the local campaign store, used instead of the default path set in ~/.config/hpc-campaign/config.yaml.
--hostname, -n <HOSTNAME> specifies the host name, used instead of the default hostname set in ~/.config/hpc-campaign/config.yaml. Host names must be unique within a campaign.
--keyfile, -k <KEYFILE> specifies the key file used to encrypt metadata.
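The precedence described above (a command-line value wins over the default from config.yaml) can be sketched as follows. This is a minimal illustration under stated assumptions: the function effective_settings is hypothetical, the configuration is represented as a flat dictionary rather than the real file layout, and /scratch/my-store is a made-up path.

```python
from typing import Optional


def effective_settings(config: dict,
                       campaign_store: Optional[str] = None,
                       hostname: Optional[str] = None) -> dict:
    """Illustrative only: command-line values override config defaults."""
    return {
        "campaignstorepath": (campaign_store if campaign_store is not None
                              else config["campaignstorepath"]),
        "hostname": hostname if hostname is not None else config["hostname"],
    }


# Hypothetical defaults, standing in for values read from config.yaml.
defaults = {"campaignstorepath": "/path/to/campaign-store", "hostname": "OLCF"}

# Only the campaign store is overridden; the hostname keeps its default.
print(effective_settings(defaults, campaign_store="/scratch/my-store"))
```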
Manager sub-commands
The [sub-command] argument can take one of the following values:
create First command to create a new campaign archive
delete Delete dataset/replica from a campaign archive, or the entire archive
info List the content of a campaign archive
dataset Add ADIOS2 or HDF5 files
text Add text files, embedded or just reference to remote file
image Add images, embedded or remote optionally with an embedded thumbnail image
add-archival-storage Register an archival location (tape system, https, s3)
archived Create a replica of a dataset pointing to an archival storage location
time-series Organize a series of individual datasets as a single entry with an extra dimension for time
upgrade Upgrade an older ACA format to a newer format
1. create
Creates a new campaign archive file stored in the specified or default path to the local campaign store folder. Example usage:
hpc_campaign manager demoproject/test_campaign_001 create
2. delete
Delete specific items (datasets or replicas) from a campaign archive file. Example usage:
hpc_campaign manager demoproject/test_campaign_001 delete [options]
The options specify what will be deleted:
--uuid <id> [<id> …] removes datasets by their universally unique identifier (UUID).
--name <str> [<str> …] removes datasets by their representation name.
--replica <id> [<id> …] removes replicas by their ID number.
--campaign deletes the entire campaign file.
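When delete calls are generated from a script, it can help to assemble the argument list first and inspect it before running anything. The sketch below only builds the command from the options listed above and does not execute it; the helper delete_command is hypothetical, and whether several selection options may be combined in one call is an assumption not confirmed here.

```python
import shlex
from typing import List, Sequence


def delete_command(archive: str,
                   uuids: Sequence[str] = (),
                   names: Sequence[str] = (),
                   replicas: Sequence[int] = (),
                   campaign: bool = False) -> List[str]:
    """Build (but do not run) a delete command from the documented options."""
    cmd = ["hpc_campaign", "manager", archive, "delete"]
    if uuids:
        cmd += ["--uuid", *uuids]
    if names:
        cmd += ["--name", *names]
    if replicas:
        cmd += ["--replica", *(str(r) for r in replicas)]
    if campaign:
        cmd.append("--campaign")
    return cmd


cmd = delete_command("demoproject/test_campaign_001",
                     names=["runs/simulation-output.bp"])
# Print the command for review; pass cmd to subprocess.run to execute it.
print(shlex.join(cmd))
```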
3. info
Prints the content and metadata of a campaign archive file. Example usage:
hpc_campaign manager demoproject/test_campaign_001 info [options]
Options are available for listing replicas, deleted entries, and checksums. The complete list of options is shown in the help output (-h option).
4. dataset
Adds one or more datasets to the archive. Each dataset must be a valid HDF5 or ADIOS2 BP file.
Note
A temporary file is created from HDF5 files during processing, so write access to the /tmp directory is required.
Example usage:
hpc_campaign manager demoproject/test_campaign_001 dataset run_001.bp run_002.h5
An additional option, --name <NAME>, specifies the representation name for one dataset in the campaign hierarchy. The same option is available for the text and image sub-commands.
5. text/image
Add one or more text files or image files to the archive. Text files are always stored compressed directly within the archive. By default, only a remote reference is stored for image files.
Note
Since text is stored internally, be mindful of the resulting archive’s size when adding large text files.
Example usage:
hpc_campaign manager demoproject/test_campaign_001 text input.json --store
hpc_campaign manager demoproject/test_campaign_001 image 2dslice.jpg --thumbnail
Additional options for images include:
--name, -n <NAME> allows multiple files with different resolutions to share the same name.
--store, -s stores the image file directly in the campaign archive instead of just a reference.
--thumbnail <X> <Y> stores a resized image with an X-by-Y resolution as a thumbnail, while referring to the original.
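The image options can be combined into an argument list in the same spirit. The following is a sketch only: image_command is a hypothetical helper, nothing is executed, and the 64-by-64 thumbnail size is an arbitrary value chosen for illustration.

```python
import shlex
from typing import List, Optional, Tuple


def image_command(archive: str, image: str,
                  name: Optional[str] = None,
                  store: bool = False,
                  thumbnail: Optional[Tuple[int, int]] = None) -> List[str]:
    """Build (but do not run) an image command from the documented options."""
    cmd = ["hpc_campaign", "manager", archive, "image", image]
    if name is not None:
        cmd += ["--name", name]
    if store:
        cmd.append("--store")
    if thumbnail is not None:
        x, y = thumbnail
        # The thumbnail option takes the X and Y resolution as two values.
        cmd += ["--thumbnail", str(x), str(y)]
    return cmd


cmd = image_command("demoproject/test_campaign_001", "2dslice.jpg",
                    thumbnail=(64, 64))
print(shlex.join(cmd))
```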
6. add-archival-storage
Adds an archival storage location (e.g., a tape system) to the list of known storage locations for the campaign.
hpc_campaign manager demoproject/test_campaign_001 add-archival-storage \
--longhostname users.nccs.gov https USERS.NCCS ~pnorbert/campaign-test/gray-scott-ensemble
This adds a second host/directory location into the campaign archive:
USERS.NCCS longhostname = users.nccs.gov
2. ~pnorbert/campaign-test/gray-scott-ensemble - Archive: https
Replicas of datasets can then be created (in the campaign file) with the archived sub-command. Note that hpc-campaign does not copy or move files on disk; that has to be done separately. These commands only record the action in the campaign archive file.
If we put a TAR file there instead of individual files, we can simply point to it and use archived one by one for each dataset. However, it is easier, and provides more information, to create an index file with the taridx command and then let the manager create (i.e., record) a new replica for every dataset/replica already in the campaign archive that is also in the TAR index.
7. archived
Indicates that a (replica of a) dataset has been manually copied or moved to an archival storage location. A new replica entry is created pointing to the archival host/directory. This sub-command only works if the metadata of the dataset is still included in the ACA file, so that it can be copied for the new replica. Therefore, always execute this sub-command before deleting the original replica from the ACA file. The two operations can be combined using the --move option.
This sub-command requires that the location (host/directory/TAR file) has first been added to the campaign with the add-archival-storage sub-command.
If many files are added to the archival location in a TAR file, it is better to use the taridx command to create an index of the TAR file and then use that index in the add-archival-storage operation to automatically create replicas of all datasets involved. However, the archived sub-command allows placing the replica at a different relative path than the original, while TAR indexing requires replicas to be placed at exactly the same relative paths.
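The ordering rule (record the archived replica before deleting the original, or do both at once with the move option) can be illustrated with a toy in-memory model. This is not the ACA format or the tool's implementation: the class ToyArchive, its methods, and the dataset name are hypothetical and exist only to show why the order matters.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyArchive:
    """Toy stand-in for an ACA file: dataset name -> replica locations."""
    replicas: Dict[str, List[str]] = field(default_factory=dict)

    def archived(self, dataset: str, location: str, move: bool = False) -> None:
        existing = self.replicas.get(dataset)
        if not existing:
            # No replica left means there is no metadata to copy.
            raise LookupError(f"no metadata left for {dataset!r}")
        original = existing[0]
        existing.append(location)  # new replica at the archival location
        if move:
            existing.remove(original)  # drop the original in the same step

    def delete_replica(self, dataset: str, location: str) -> None:
        self.replicas[dataset].remove(location)


aca = ToyArchive({"run_001.bp": ["OLCF"]})
aca.archived("run_001.bp", "USERS.NCCS", move=True)
print(aca.replicas)
```

In this model, calling delete_replica first and archived afterwards raises LookupError, which mirrors the requirement that the metadata must still be present when the new replica is recorded.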
8. time-series
Organizes a sequence of datasets into a single named time-series. Subsequent calls with the same name will add datasets to the list, unless --replace is used.
hpc_campaign manager test.aca dataset series/array00.bp --name array00
hpc_campaign manager test.aca dataset series/array01.bp --name array01
hpc_campaign manager test.aca dataset series/array02.bp --name array02
hpc_campaign manager test.aca dataset series/array03.bp --name array03
hpc_campaign manager test.aca time-series array array00 array01 array02
hpc_campaign manager test.aca time-series array array03
hpc_campaign manager test.aca info
...
Time-series and their datasets:
array
89635fe22f85314ebfc04c902bca42f3 ADIOS Jun 9 08:49 array00
cea0302ea4ce39ccabca6b40bbeb09d1 ADIOS Jun 9 08:49 array01
fad5daf925e13f938c2649d81a1821f2 ADIOS Jun 9 08:49 array02
180b6d3123a832d786e1d0ff99c7e303 ADIOS Jun 9 08:49 array03
Other Datasets:
...
# ADIOS tools will present them as a single dataset with multiple steps
bpls -l test.aca array/*
int64_t array/Nx 4*scalar = 10 / 10
double array/bpArray 4*{10} = 0 / 9
double array/time 4*scalar = 0 / 0
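The append-versus-replace behavior of repeated time-series calls can be modeled with a small dictionary update. This is a sketch of the documented semantics only; add_to_series is a hypothetical function and does not touch any archive.

```python
from typing import Dict, List


def add_to_series(series: Dict[str, List[str]], name: str,
                  datasets: List[str], replace: bool = False) -> None:
    """Append datasets to a named series, or start over when replace is set."""
    if replace or name not in series:
        series[name] = list(datasets)
    else:
        series[name].extend(datasets)


series: Dict[str, List[str]] = {}
# Mirrors the two time-series calls in the example above.
add_to_series(series, "array", ["array00", "array01", "array02"])
add_to_series(series, "array", ["array03"])
print(series["array"])  # ['array00', 'array01', 'array02', 'array03']
```

Each dataset in the example contributes one step, which is consistent with bpls reporting four steps for every variable under array/.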
9. upgrade
An ADIOS2 release will only read the latest ACA version and throws an error if an older ACA file is opened. The upgrade sub-command modifies an old ACA file to move it to the next version. It may need to be called multiple times to reach the current version. This is an in-place conversion. If an error occurs during conversion, all changes are cancelled, leaving the original file intact.
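One way to decide whether another upgrade call is needed is to read the version from the header that the info sub-command prints. The sketch below assumes the header format shown in the example info output on this page, which may differ between releases; parse_archive_version is a hypothetical helper, and the target version (0, 6) is an assumed value for illustration.

```python
import re
from typing import Tuple


def parse_archive_version(info_output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the header line of the info output."""
    match = re.search(r"Campaign Archive, version (\d+)\.(\d+)", info_output)
    if match is None:
        raise ValueError("no version header found in info output")
    return int(match.group(1)), int(match.group(2))


header = "ADIOS Campaign Archive, version 0.5, created on Oct 18 14:29"
current = parse_archive_version(header)
target = (0, 6)  # assumed latest version, for illustration only
needs_upgrade = current < target
print(current, needs_upgrade)
```

A driver script could repeat the upgrade call and re-check the version until needs_upgrade becomes false.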
Example: creating a campaign archive file
In this example we will create a campaign archive file with:
- the JSON text input file for a simulation
- the data generated by the simulation code
- analysis data generated by a code that reads the simulation data and produces histograms
- the images generated by a visualization code from the simulation data
Configuration:
- the campaignstorepath in ~/.config/hpc-campaign/config.yaml is set to /path/to/campaign-store
- the path /path/to/campaign-store/demoproject is a writable directory
- the runs are made on a machine whose Campaign hostname in ~/.config/hpc-campaign/config.yaml is set to OLCF
- all the files above are generated and stored in ${PWD}/runs
$ hpc_campaign manager demoproject/test_campaign_001 delete --campaign
$ hpc_campaign manager demoproject/test_campaign_001 create
$ hpc_campaign manager demoproject/test_campaign_001 text runs/input-configuration.json
$ hpc_campaign manager demoproject/test_campaign_001 dataset runs/simulation-output.bp runs/simulation-chekpoint.bp
$ hpc_campaign manager demoproject/test_campaign_001 dataset analysis/pdf.bp
$ hpc_campaign manager demoproject/test_campaign_001 image analysis/plot-2d.json --store
$ hpc_campaign manager demoproject/test_campaign_001 info
=========================================================
ADIOS Campaign Archive, version 0.5, created on Oct 18 14:29
Hosts and directories:
OLCF longhostname = frontier05341.frontier.olcf.ornl.gov
1. /path/to/simulation
Other Datasets:
3a4bf0b14cc33424a470862bd67ed007 TEXT Oct 18 14:25 runs/input-configuration.json
0fce4b1173f432f7ae5d2282df9077a6 ADIOS Oct 18 14:25 runs/simulation-output.bp
aa5d2282df9077a60fc643f5ab53b351 ADIOS Oct 18 14:26 runs/simulation-chekpoint.bp
b42d0da4a0793adca341ace1ff6e628d ADIOS Oct 18 14:28 analysis/pdf.bp
85a0b724b22f37a4a79ad8a0cf1127d1 IMAGE Oct 18 14:24 analysis/plot-2d.json
The size of the campaign archive can be compared with the size of the data it points to using the standard tools of the operating system, for example:
$ du -sh runs/*bp
263M simulation-chekpoint.bp
3.8G simulation-output.bp
$ du -sh /path/to/campaign-store/demoproject/test_campaign_001.aca
127K /path/to/campaign-store/demoproject/test_campaign_001.aca
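For a portable comparison that does not depend on du, the sizes can also be summed in Python. This is a minimal sketch: tree_size and size_ratio are hypothetical helpers, they report apparent file sizes rather than disk usage (so the numbers can differ slightly from du), and the directory walk is there because BP outputs are often directories rather than single files.

```python
import os
from typing import Iterable


def tree_size(path: str) -> int:
    """Total size in bytes of a file, or of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            total += os.path.getsize(os.path.join(root, fname))
    return total


def size_ratio(data_paths: Iterable[str], archive_path: str) -> float:
    """How many times larger the referenced data is than the archive file."""
    data_bytes = sum(tree_size(p) for p in data_paths)
    return data_bytes / tree_size(archive_path)
```

For the example above, the call would be size_ratio(["runs/simulation-output.bp", "runs/simulation-chekpoint.bp"], archive) with archive set to the full path of the .aca file.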
Launch local connection server
to be continued…