ARC¶
This guide describes how to use the Nordugrid ARC client for storing and retrieving files from Swestore. The ARC tools are usually used for managing jobs and data for grid-enabled computing clusters, but in the Swestore context we focus on the data management tools.
See also the ARC client tools documentation page for detailed information.
Requirements¶
To access Swestore using the ARC client you need to use certificate access and be a member of a Swestore storage project.
You also need to have the certificate installed on the resource where you want to run the ARC commands. For NAISS resources this process includes transferring the certificate to the intended NAISS resource and preparing it for use with grid tools.
All NAISS HPC systems should have the ARC client installed. If yours doesn't, please contact support at your centre so they can fix this as soon as possible. To install the ARC client on your own computer, see the official Nordugrid Installing ARC Client Tools page for more information.
Quickstart¶
Access protocol¶
ARC supports multiple access protocols. We recommend using WebDAV; see the Enabled Access Protocols page for details that might affect transfers of larger data volumes.
Basic commands¶
- arcproxy: unlock your certificate so you can use it.
  arcproxy
- arcls: for listing files. Works similarly to ls.
  arcls https://webdav.swestore.se/snic/YOUR_PROJECT_NAME
- arcmkdir: for creating directories. Works similarly to mkdir.
  arcmkdir https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/newdir
- arccp: for copying files. Works similarly to cp.
  arccp myfile.txt https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/myfile.txt
- arcrm: for deleting files. Works similarly to rm.
  arcrm https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/whoops.txt
Use man and --help to get more info on each command.
man arcrm
or
arcls --help
Paths¶
The ARC commands support multiple storage protocols. We recommend using WebDAV with paths of the form:
https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/...
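As a small illustration of the path form above, the following shell sketch builds a WebDAV URL from a project name and a path inside the project. The function name swestore_url is hypothetical and not part of the ARC tools; it only assembles a string.

```shell
# Hypothetical helper (not part of ARC): build a Swestore WebDAV URL.
# $1 = project name, $2 = optional path inside the project.
swestore_url() {
    swestore_path=${2:-}
    # Strip one leading slash from the path to avoid "//" in the URL.
    printf 'https://webdav.swestore.se/snic/%s/%s\n' "$1" "${swestore_path#/}"
}
```

It could then be combined with any of the ARC commands, for example: arcls "$(swestore_url YOUR_PROJECT_NAME results/)"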
Unlock your certificate¶
Your certificate needs to be unlocked before you can do anything. Think of the process as logging in. When it succeeds, the result is a proxy certificate, which inherits the properties of your normal certificate but has a shorter lifetime.
arcproxy
To see the lifetime of your session, use:
arcproxy -I
Copying files¶
Copying files to and from resources is accomplished using the arccp command.
Copying single files¶
Copying a single file works the same way as with the normal cp command, as shown in the following example:
arccp archive.tar.gz https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/
Please note the trailing /, which marks the destination as a directory. Without a trailing / the destination is treated as a file name, which may or may not be what you wanted. All required directories are created when needed, so the destination may be a directory that does not exist yet.
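To make the trailing-slash rule harder to forget, a thin wrapper can normalise the destination before calling arccp. This is only a sketch: upload_to_dir is a hypothetical name, and it does nothing beyond passing two arguments to arccp.

```shell
# Hypothetical wrapper: copy one local file INTO a remote directory.
# $1 = local file, $2 = destination directory URL (with or without "/").
upload_to_dir() {
    # Remove one trailing slash if present, then add exactly one back,
    # so the destination is always marked as a directory.
    arccp "$1" "${2%/}/"
}
```

Example call: upload_to_dir archive.tar.gz https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/backups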
Recursive copying¶
Recursive copying is accomplished using the --recursive option to arccp.
Example:
arccp --recursive foobar/ https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/
NOTE: The above example will copy all files in the directory foobar into the destination directory YOUR_PROJECT_NAME. If you want the directory foobar to be part of the destination path you have to explicitly supply it as shown in the example below:
arccp --recursive foobar/ https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/foobar/
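The pattern of repeating the directory name in the destination can be sketched as a small wrapper. The name upload_tree is hypothetical; the only ARC call it makes is arccp --recursive, as in the examples above.

```shell
# Hypothetical wrapper: copy a local directory so that its own name
# becomes part of the destination path.
# $1 = local directory, $2 = destination base URL.
upload_tree() {
    tree_src=${1%/}                    # drop a trailing slash, if any
    tree_name=$(basename "$tree_src")  # last path component, e.g. "foobar"
    arccp --recursive "$tree_src/" "${2%/}/$tree_name/"
}
```

Example call: upload_tree foobar https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/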
Long-running operations¶
Note that copying large directory trees can take quite some time, and might fail if you're not aware of the following:
- Your login session created with the arcproxy command has a limited lifetime. Use arcproxy -I to show the remaining time. Use arcproxy -c validityPeriod=xxH to initiate a session with a longer lifetime.
- The command will abort if you lose your network connection to the computer where you are running arccp. A utility such as screen or tmux can be used to create a terminal session you can reattach to.
- Transfer rates depend largely on the average file size: if you have a lot of small files, the transfer will be slower than if you have large files.
- We recommend limiting your transfer sessions (i.e. the directory tree copied with each arccp command) to 1 TB if you have mostly large (100+ MB) files and to 100 GB if you have smaller files.
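One way to keep each transfer session small is to copy a large tree one top-level subdirectory at a time. The sketch below assumes that splitting by subdirectory gives suitably sized batches, which depends on your data; it does not measure sizes itself. The name copy_in_batches is hypothetical, and files located directly in the parent directory are not copied by it.

```shell
# Hypothetical helper: copy each top-level subdirectory of a tree in its
# own arccp call, so that every transfer session covers a smaller tree.
# $1 = local parent directory, $2 = destination base URL.
copy_in_batches() {
    for batch_dir in "${1%/}"/*/; do
        [ -d "$batch_dir" ] || continue      # glob matched nothing
        batch_name=$(basename "$batch_dir")
        # Stop at the first failure so the failed batch can be retried.
        arccp --recursive "$batch_dir" "${2%/}/$batch_name/" || return 1
    done
}
```

Run it inside screen or tmux, after starting a session with a lifetime long enough for the whole loop, for example: copy_in_batches /path/to/data https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/data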
Listing files¶
Listing files on a resource is done using the arcls command. In its simplest form the command takes a URL as input and displays the names of files and directories without any extra information, as shown in the following example:
arcls https://webdav.swestore.se/snic/bils/db/uniprot/2012_05
Example output:
reldate.txt
speclist.txt
uniprot_sprot.dat.gz
uniprot_sprot.fasta.gz
uniprot_trembl.dat.gz
uniprot_trembl.fasta.gz
Additional information can be listed by adding the --long option:
arcls --long https://webdav.swestore.se/snic/bils/db/uniprot/2012_05
Example output:
<Name> <Type> <Size> <Creation> <Validity> <CheckSum> <Latency>
reldate.txt file 151 2012-05-23 03:00:19 (n/a) adler32:f3f52f1d (n/a)
speclist.txt file 1715169 2012-05-23 03:00:17 (n/a) adler32:91e59dae (n/a)
uniprot_sprot.dat.gz file 462895141 2012-05-23 02:57:18 (n/a) adler32:0f131bb2 (n/a)
uniprot_sprot.fasta.gz file 79935897 2012-05-23 03:00:20 (n/a) adler32:89844c57 (n/a)
uniprot_trembl.dat.gz file 9162678278 2012-05-23 02:52:01 (n/a) adler32:b2d7cfd5 (n/a)
uniprot_trembl.fasta.gz file 4456514443 2012-05-23 02:57:34 (n/a) adler32:2b73b2a1 (n/a)
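Because the --long output is column-oriented, it can be post-processed with standard tools. The sketch below sums the Size column to estimate how much data a directory holds. It assumes the column layout shown in the example output above and file names without whitespace; sum_listing_bytes is a hypothetical name.

```shell
# Hypothetical helper: read "arcls --long" output on stdin and print the
# total size in bytes of all entries whose Type column is "file".
# Assumes columns: <Name> <Type> <Size> ... and no spaces in names.
sum_listing_bytes() {
    awk '$2 == "file" { total += $3 } END { printf "%.0f\n", total }'
}
```

Example call: arcls --long https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/ | sum_listing_bytes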
Metadata¶
Metadata for a specific file can be listed by specifying the -m or --metadata option. Note that the amount of metadata available differs depending on which protocol is used.
Examples:
arcls --metadata https://webdav.swestore.se/ops/nikke/smallfile
Example output:
/ops/nikke/smallfile
checksum:adler32:762606eb
mtime:2013-04-12 11:06:56
path:/ops/nikke/smallfile
size:30
type:file
arcls --metadata srm://srm.swegrid.se/ops/nikke/smallfile
Example output:
/ops/nikke/smallfile
accessperm:rw-r-----
checksum:adler32:762606eb
ctime:2013-04-12 11:06:56
filestoragetype:PERMANENT
group:25001
latency:ONLINE
lifetimeassigned:PT1S
lifetimeleft:PT1S
mtime:2013-04-12 11:06:56
owner:25001
path:/ops/nikke/smallfile
size:30
spacetokens:
type:file
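The metadata output consists of key:value lines, so a single field can be extracted with a short filter, for instance to compare a checksum or size after an upload. This is a sketch based only on the example output above; metadata_field is a hypothetical name.

```shell
# Hypothetical helper: read "arcls --metadata" output on stdin and print
# the value of one key. Everything after the first ":" is the value.
# $1 = key, e.g. checksum, size, mtime.
metadata_field() {
    awk -F: -v key="$1" '$1 == key { sub(/^[^:]*:/, ""); print }'
}
```

Example call: arcls --metadata https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/myfile.txt | metadata_field checksum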
Creating directories¶
Directories are generally created on demand. If you copy a file with the destination /snic/YOUR_PROJECT_NAME/newdir/dummyfile, the newdir directory will be created if it is missing. You can also create directories explicitly using the arcmkdir command.
arcmkdir https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/newdir
Removing files or directories¶
arcrm https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/newdir/dummyfile
arcrm https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/newdir/
To remove directories they have to be empty.
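Since a directory must be empty before it can be removed, the two steps can be combined. The sketch below is deliberately limited: it assumes the directory contains only files (no subdirectories) and that arcls prints one name per line, as in the listing examples in this guide. The name remove_flat_dir is hypothetical, and because the operation is destructive it is worth replacing arcrm with echo for a first dry run.

```shell
# Hypothetical helper: empty a remote directory that contains ONLY files,
# then remove the directory itself. $1 = directory URL.
remove_flat_dir() {
    flat_dir=${1%/}
    # The explicit subshell keeps "exit 1" from ending the calling shell.
    arcls "$flat_dir/" | (
        while IFS= read -r entry; do
            [ -n "$entry" ] || continue
            arcrm "$flat_dir/$entry" || exit 1
        done
    ) || return 1
    arcrm "$flat_dir/"
}
```

Example call: remove_flat_dir https://webdav.swestore.se/snic/YOUR_PROJECT_NAME/newdir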
FAQ¶
1) I get this message when I try to list files:
$ arcls gsiftp://gsiftp.swestore.se/snic/
ERROR: Unsupported URL given
- The nordugrid-arc-plugins-globus package is missing. Without it ARC is not able to use the gsiftp protocol. Note that Globus and related protocols are being discontinued; most communities are migrating to WebDAV instead.
2) arcproxy gives WARNING or ERROR messages.
- The most common reason is a missing certificate file. See Requirements.
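A quick way to check for a missing certificate file is to test for the files directly. The sketch below assumes the certificate and key live in ~/.globus/usercert.pem and ~/.globus/userkey.pem, a common convention for grid tools that this guide does not itself state; your installation may be configured to use other paths. The name check_cert_files is hypothetical.

```shell
# Hypothetical check: report whether the certificate files exist.
# ASSUMPTION: the files are in ~/.globus; adjust the paths if your
# setup stores them elsewhere.
check_cert_files() {
    cert_status=0
    for cert_file in "$HOME/.globus/usercert.pem" "$HOME/.globus/userkey.pem"; do
        if [ -f "$cert_file" ]; then
            echo "found: $cert_file"
        else
            echo "missing: $cert_file"
            cert_status=1
        fi
    done
    return "$cert_status"
}
```

The function returns a non-zero status if either file is missing, so it can be used as a guard before running arcproxy.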