The aim of this nationally accessible storage (Swestore) is to build a robust, flexible and expandable system that can be used in most cases where access to large-scale storage is needed. To the user, the resource should appear as a single large system, although parts of it are deliberately deployed across several locations to gain the benefits of, among other things, locality and redundancy.
We continuously investigate new technologies that are suitable for implementing Swestore. This storage solution is intended as versatile short- and medium-term storage for large-scale research data. It is best suited for so-called “warm” data: data that is not being analysed or processed right now but is still relevant to the active research project.
Project allocations are normally not backed up unless otherwise agreed. All files, i.e. digital objects, exist as two replicas in two geographically different locations. The main purpose of this type of storage is to offload the fast storage (Center Storage) during the active research phase and to move data to and from HPC systems.
What's in the package
Distribution and redundancy
Swestore is distributed across HPC centres at several universities: C3SE, HPC2N, Lunarc and NSC. Data is stored in two copies, with each copy at a different centre. This enables the system to cope with issues ranging from a simple crash of a storage element, through a power outage in a computer room, to the loss of an entire centre due to fire or flooding, while still providing access to the stored data.
The core services of Swestore, such as the core software and the metadata database, currently run in one place and are thus a single point of failure. The metadata is replicated, and there are contingency plans for moving these services to another location within a few working days in case of a total loss of core services.
One of the major advantages of the distributed nature of Swestore is the excellent aggregated transfer rate. This is achieved by bypassing a central node and having data transfers go directly to and from the storage elements. Swestore can achieve aggregated transfer rates in excess of 100 Gigabit per second, but in practice the rate is limited by connectivity to the end user, by university network performance, or by a limited number of concurrent file transfers (each file/connection is typically limited to about 100 MB/s).
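The relationship between the per-connection limit and the aggregate rate can be illustrated with some simple arithmetic. The following is a minimal sketch, not a Swestore tool: the function names are hypothetical, decimal units are assumed (1 Gbit/s = 125 MB/s), and the 100 MB/s per-connection figure is the approximate value quoted above. Real transfers will also be bounded by the end user's network.

```python
import math


def connections_needed(target_gbit_per_s: float,
                       per_connection_mb_per_s: float = 100.0) -> int:
    """Parallel transfers needed to reach a target aggregate rate.

    Assumes decimal units (1 Gbit/s = 1000 Mbit/s = 125 MB/s) and that
    every connection sustains the same rate, which is an idealisation.
    """
    target_mb_per_s = target_gbit_per_s * 1000 / 8
    return math.ceil(target_mb_per_s / per_connection_mb_per_s)


def transfer_hours(size_tb: float, connections: int,
                   per_connection_mb_per_s: float = 100.0) -> float:
    """Idealised time in hours to move size_tb terabytes (1 TB = 10^6 MB)."""
    size_mb = size_tb * 1_000_000
    return size_mb / (connections * per_connection_mb_per_s) / 3600


if __name__ == "__main__":
    # Reaching 100 Gbit/s at about 100 MB/s per connection
    print(connections_needed(100))  # -> 125
    # Moving 1 TB over a single connection versus ten parallel ones
    print(round(transfer_hours(1, 1), 2), round(transfer_hours(1, 10), 2))
```

Under these assumptions, a single file transfer uses under one percent of the aggregate capacity, which is why splitting a data set into several files transferred concurrently tends to matter more than the raw capacity of the system.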
To protect against silent data corruption, the dCache storage system checksums all stored data and periodically verifies the data against this checksum. Transfers to and from Swestore may also be checksummed, depending on the client and transfer protocol selected.
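Because transfer checksumming depends on the client and protocol, it can be useful to compute a checksum of a local file and compare it before and after a transfer. The sketch below assumes Adler-32, a checksum type commonly used with dCache; which algorithm Swestore actually records for a given file is not stated here and should be confirmed before comparing values. The function name `adler32_of_file` is hypothetical.

```python
import zlib


def adler32_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits.

    The file is read in chunks so that large files do not have to fit
    in memory. Adler-32 is an assumption; use whichever algorithm the
    storage system reports for the file.
    """
    checksum = 1  # Adler-32 starts from 1, not 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            checksum = zlib.adler32(chunk, checksum)
    return f"{checksum & 0xFFFFFFFF:08x}"
```

Computing the value once before upload and again on a downloaded copy gives an end-to-end check that does not rely on the transfer protocol, for example `adler32_of_file("original.dat") == adler32_of_file("downloaded.dat")` with hypothetical file names.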
What's not in the package
Mounted file systems
Swestore is not mounted as a local file system on the HPC resources. Mounting remote file systems is generally complicated: it is exposed to latency and network problems outside the control of Swestore and the HPC centre, and it is very likely to cause instabilities in the HPC services.
Apart from the redundancy and distribution of the stored files, Swestore is not backed up to tape or similar backup solutions. Swestore does NOT yet provide protection against user errors such as inadvertent file deletions.