Object storage is a type of storage for storing and sharing data in cloud environments. An object consists of a unique name, the actual data and associated metadata (such as access restrictions and user-defined metadata). In contrast to filesystems, objects are not organised in a hierarchy but in a flat container. The data is accessed via HTTP-based protocols/APIs (application programming interfaces) such as S3 and Swift.
Object stores are more strongly abstracted from the underlying hardware than filesystems, which gives administrators more flexibility, e.g. for distribution of data, replication of data and transparent extension of storage capacity.
Users benefit from the ubiquitous availability of the storage service. The data can be accessed and written from anywhere, as long as the user has the appropriate access/write permissions. These permissions can be controlled by the container owners via access control lists. With the concept of large objects, objects have no size limitation. A caveat of object stores is the potentially higher network latency between the storage servers and the client machines.
The object storage in the de.NBI cloud can be accessed either via the S3 API or the Swift API. These APIs differ in their terminology, capabilities and respective tools: Objects are stored in containers (Swift) or buckets (S3). Access is controlled via access control lists (S3 and Swift) or via policies (S3 only).
The S3 and Swift APIs in the de.NBI cloud are based on radosgw, a part of the Ceph object store. For further details on this implementation see http://docs.ceph.com/docs/giant/radosgw/ .
Both APIs have command line clients as well as graphical user interfaces.
URL of s3/swift server: https://s3.computational.bio.uni-giessen.de.
The OpenStack dashboard provides an easy way to manage the object storage in the side menu:
Clicking on Containers displays the object storage buckets available to the current project:
This basic interface allows the definition of containers and upload/download of files.
S3 offers a lot of functionality and a very fine-grained authorization model (see Amazon's description of the S3 API). The OpenStack dashboard offers only the bare minimum needed to handle buckets and files. The same is true for the Swift API, which is not explained on this site.
In all standard use cases, a bucket should be created via the OpenStack dashboard. This assigns the bucket to the OpenStack project and grants all project users access to the bucket. Buckets for individual users or use cases are possible, but require a different way of creating access credentials. Contact support if you need such a bucket.
Access to the S3 API requires credentials that the radosgw instance accepts. The instance is linked to Keystone and is able to authenticate users based on credentials generated by OpenStack. These credentials are bound to a project, so different projects require different credentials.
Unfortunately the OpenStack dashboard does not provide access to the necessary credentials, so the command line has to be used:
Project → Compute → Access & Security → API Access → Download OpenStack RC File v3
openstack --os-identity-api-version 3 ec2 credentials list
openstack --os-identity-api-version 3 ec2 credentials create
The access and secret values are the credentials for accessing the S3 buckets of the project.
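The two values can also be picked out of the client output by a script. The sketch below assumes the command was run with the generic `-f json` output option of the openstack client and that the result contains the `access` and `secret` fields mentioned above. The function name and the sample values are hypothetical.

```python
import json

def extract_s3_credentials(create_output):
    """Return (access, secret) from the JSON text printed by
    `openstack ec2 credentials create -f json` (assumed output format)."""
    record = json.loads(create_output)
    return record["access"], record["secret"]

# Placeholder values for illustration only, not real credentials.
sample = '{"access": "EXAMPLEACCESS", "secret": "EXAMPLESECRET"}'
access_key, secret_key = extract_s3_credentials(sample)
print(access_key)
```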
Using the s3cmd command line utility, this section demonstrates how to access the BCF S3 object storage. The setup should be similar for other tools, libraries or applications.
Run s3cmd --configure to start the interactive configuration dialog
Set the host_base value to s3.computational.bio.uni-giessen.de
Set the host_bucket value to %(bucket).s3.computational.bio.uni-giessen.de
Run s3cmd ls to list all buckets. The output should print one line containing s3://<bucket name> for each bucket in the project
If you need access to a different project, you can create multiple configuration files with s3cmd.
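One way to keep several projects apart is one configuration file per project, selected with the `-c` option of s3cmd. The following sketch renders a minimal file with the host settings from the steps above. The function name and the placeholder keys are assumptions, and a file written by the interactive dialog contains further settings that are omitted here.

```python
import configparser
import io

ENDPOINT = "s3.computational.bio.uni-giessen.de"

def render_s3cfg(access_key, secret_key):
    """Return the text of a minimal s3cmd configuration for one project."""
    # interpolation=None keeps the literal "%(bucket)" placeholder untouched.
    config = configparser.ConfigParser(interpolation=None)
    config["default"] = {
        "access_key": access_key,
        "secret_key": secret_key,
        "host_base": ENDPOINT,
        "host_bucket": "%(bucket)." + ENDPOINT,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()

# Placeholder credentials, to be replaced by the project's access and secret values.
cfg_text = render_s3cfg("EXAMPLEACCESS", "EXAMPLESECRET")
print(cfg_text)
```

Saved under a name of your choice, for example `project-a.s3cfg`, the file can then be used with `s3cmd -c project-a.s3cfg ls`.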
Certain aspects of S3 are not supported yet in Ceph, including:
The official Ceph documentation for the AWS SDK for Java is outdated, and its example results in an error. Please use the builder of the AmazonS3Client class instead, as in the following example:
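import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.Bucket;

// executorS3Key and executorS3Secret hold the access and secret values
AWSCredentials credentials = new BasicAWSCredentials(this.executorS3Key, this.executorS3Secret);
// The endpoint configuration carries the region, so withRegion() is not called as well
AwsClientBuilder.EndpointConfiguration regionOne =
    new AwsClientBuilder.EndpointConfiguration("s3.computational.bio.uni-giessen.de", "RegionOne");
AmazonS3 amazonS3 = AmazonS3Client.builder()
    .withCredentials(new AWSStaticCredentialsProvider(credentials))
    .withEndpointConfiguration(regionOne)
    .build();
// Print the name of every bucket visible to these credentials
for (Bucket bucket : amazonS3.listBuckets()) {
    System.out.println(bucket.getName());
}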
S3 supports the generation of temporary download URLs, so-called signed URLs. You can generate signed URLs with s3cmd. For a download link that is valid for a defined amount of time, use
s3cmd signurl <path_to_object> +<availability_in_seconds>
If you want to generate a download link that is available until a certain moment you can use
s3cmd signurl <path_to_object> <seconds_since_01_01_1970>
This generates the signed URL, which you can pass to the person who will download the object/file. (Check the protocol of the link. If it is http, you have to change it to https.)
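The second form of the command expects an absolute expiry time in seconds since 1 January 1970 (UTC). A small sketch for computing that value from a calendar date follows. The helper name and the example date are arbitrary choices.

```python
from datetime import datetime, timezone

def expiry_timestamp(year, month, day, hour=0, minute=0):
    """Seconds since 1970-01-01 00:00 UTC for the given UTC date and time."""
    moment = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(moment.timestamp())

# Link valid until 1 January 2030, 00:00 UTC.
print(expiry_timestamp(2030, 1, 1))  # 1893456000
```

The printed number is then passed as the second argument of `s3cmd signurl`.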
The de.NBI object storage can be accessed through the Swift API with the swift command line tool. For this you need a cloud project, the swiftclient and an OpenStack RC file (see Setup of swift tool).
source <youropenstackrc>
swift list
swift post <containername>
# for small files
swift upload <container> <file>
# for large files >=5GB
swift upload -S <chunksize in bytes max 5GB> <container> <file>
swift download <container> <file>
swift delete <container> <file>
swift delete <container>
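With the `-S` option of `swift upload` shown above, a large file is uploaded in segments of at most the given size. The following sketch shows the arithmetic behind choosing a segment size. Reading the "5GB" limit as 5 GiB is an assumption, and the function name is hypothetical.

```python
MAX_SEGMENT_BYTES = 5 * 1024**3  # assumption: "5GB" interpreted as 5 GiB

def segment_count(file_size, segment_size):
    """Number of segments needed to upload file_size bytes."""
    if not 0 < segment_size <= MAX_SEGMENT_BYTES:
        raise ValueError("segment size must be positive and within the limit")
    # Integer ceiling division; an empty file still counts as one segment.
    return max(1, -(-file_size // segment_size))

# A 12 GiB file uploaded with 1 GiB segments.
print(segment_count(12 * 1024**3, 1024**3))  # 12
```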
See the s3cmd documentation for how to interact with your S3 buckets. You get your access key and secret key via the OpenStack platform (Project → API Access → View Access Data).
CORS can be configured via an XML configuration file.
<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>https://<replace-me></AllowedOrigin>
    <AllowedOrigin>http://<replace-me>:*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
Set this configuration for a bucket with
s3cmd setcors cors.xml s3://<your-bucket>
Further information can be found at https://docs.aws.amazon.com/AmazonS3/latest/userguide/ManageCorsUsing.html
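If the same rule is needed for several origins or buckets, the configuration file can also be generated. The following minimal sketch uses Python's standard library and mirrors the structure of the file shown above. The function name and the example origin are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_cors_xml(origins, methods=("GET", "PUT"), headers=("*",)):
    """Return a CORSConfiguration document with a single CORSRule."""
    root = ET.Element("CORSConfiguration")
    rule = ET.SubElement(root, "CORSRule")
    for origin in origins:
        ET.SubElement(rule, "AllowedOrigin").text = origin
    for method in methods:
        ET.SubElement(rule, "AllowedMethod").text = method
    for header in headers:
        ET.SubElement(rule, "AllowedHeader").text = header
    return ET.tostring(root, encoding="unicode")

# Example origin for illustration; replace it with the real site.
xml_text = build_cors_xml(["https://example.org"])
print(xml_text)
```

The printed text can be saved as `cors.xml` and applied with the `s3cmd setcors` command shown above.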