While Synapse provides physical storage for files (using Amazon’s S3), not all data ‘in’ Synapse is stored in Synapse-controlled locations. For example, data files can physically reside in a user-owned S3 bucket, on an SFTP server, or on a local file server accessed through a proxy server. Creating a custom storage location gives users ownership and control of their files, which is especially useful when there is a large amount of data or when additional restrictions need to be set on the data.
Note: System metadata, annotations, and provenance records are still stored in Synapse's S3 storage.
Setting Up an External AWS S3 Bucket
Please note that your S3 bucket must be in the us-east-1 (N. Virginia) region for this to work.
Follow the documentation on the Amazon Web Services (AWS) site to create a bucket.
Make the following adjustments to customize it to work with Synapse:
When the AWS instructions prompt you to Create a Bucket - Select a Bucket Name and Region, use a unique name. For example, thisisthenameofmybucket.
Select the newly created bucket and click the Properties button. Expand the Permissions section and:
Make sure that all the boxes (List, Upload/Delete, View Permissions, and Edit Permissions) are checked. They should be checked by default.
Select the Add bucket policy button and copy one of the policies below (read-only or read-write permissions). In both Resource entries, replace “synapse-share.yourcompany.com” with the name of your new bucket, and ensure that the Principal is "AWS":"325565585839". This is Synapse’s account number.
To allow authorized Synapse users to upload data to your bucket, set read-write permissions on that bucket (this lets Synapse both upload and retrieve files):
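As an illustration, a read-write policy of this shape can be generated with a few lines of Python and pasted into the bucket policy editor. This is a minimal sketch: the account number and the two Resource entries come from the steps above, but the helper name `read_write_policy` is hypothetical and the exact list of S3 actions is an assumption, so compare it against the policy published for your Synapse deployment before using it.

```python
import json

# Synapse's AWS account number, as given in the step above.
SYNAPSE_AWS_ACCOUNT = "325565585839"


def read_write_policy(bucket_name):
    """Build a bucket policy that lets Synapse list, read, write, and delete objects.

    The action lists are an assumption for illustration, not an official policy.
    """
    principal = {"AWS": SYNAPSE_AWS_ACCOUNT}
    return {
        "Statement": [
            {
                # Bucket-level access: the Resource is the bucket itself.
                "Action": ["s3:ListBucket*", "s3:GetBucketLocation"],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3:::{bucket_name}",
                "Principal": principal,
            },
            {
                # Object-level access: the Resource is every key in the bucket.
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:AbortMultipartUpload",
                ],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Principal": principal,
            },
        ]
    }


# Print the JSON to paste into the bucket policy editor.
print(json.dumps(read_write_policy("thisisthenameofmybucket"), indent=4))
```

The bucket name appears twice, once for the bucket and once for its contents, which is why the instructions ask you to change Resource in two places.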
For read-write permissions, you also need to create an object that proves to the Synapse service that you own this bucket. To do this, create an owner.txt file containing your Synapse username and upload it to your bucket. You can upload the file with the Amazon Web Console or, if you have the AWS command line client installed, from the command line.
If you do not want authorized Synapse users to upload data to your bucket but still want to provide read access, change the permissions to read-only:
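The ownership file can also be created and uploaded from Python. The sketch below assumes the third-party `boto3` package and AWS credentials with write access to the bucket; the function names and the example username are hypothetical.

```python
from pathlib import Path


def write_owner_file(synapse_username, directory="."):
    """Create owner.txt containing the Synapse username and return its path."""
    path = Path(directory) / "owner.txt"
    # A trailing newline mirrors what `echo username > owner.txt` would produce.
    path.write_text(synapse_username + "\n")
    return path


def upload_owner_file(path, bucket_name):
    """Upload owner.txt to the root of the bucket (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the file can be created without it

    boto3.client("s3").upload_file(str(path), bucket_name, "owner.txt")


# Example usage ("my_synapse_username" is a placeholder):
# owner = write_owner_file("my_synapse_username")
# upload_owner_file(owner, "thisisthenameofmybucket")
#
# Equivalent upload with the AWS command line client:
#   aws s3 cp owner.txt s3://thisisthenameofmybucket/owner.txt
```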
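A read-only policy differs only in the object-level actions it grants. The following sketch builds one in Python; as before, the helper name is hypothetical and the action list is an assumption rather than an official policy.

```python
import json

# Synapse's AWS account number, as given in the bucket policy step.
SYNAPSE_AWS_ACCOUNT = "325565585839"


def read_only_policy(bucket_name):
    """Build a bucket policy that lets Synapse list the bucket and read objects only."""
    principal = {"AWS": SYNAPSE_AWS_ACCOUNT}
    return {
        "Statement": [
            {
                # Listing and locating the bucket is still required for reads.
                "Action": ["s3:ListBucket*", "s3:GetBucketLocation"],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3:::{bucket_name}",
                "Principal": principal,
            },
            {
                # No PutObject or DeleteObject: Synapse can download but not modify.
                "Action": ["s3:GetObject"],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Principal": principal,
            },
        ]
    }


print(json.dumps(read_only_policy("thisisthenameofmybucket"), indent=4))
```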
Make sure to enable cross-origin resource sharing (CORS)
In Permissions, click CORS configuration. In the CORS configuration editor, edit the configuration so that Synapse is included in the AllowedOrigin tag. An example CORS configuration that would allow this is:
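For reference, a configuration of this kind can be generated with Python's standard library. This is a sketch only: the allowed methods, headers, and max age are assumptions, and `https://www.synapse.org` is used as an assumed origin for the Synapse web client. Using `*` as the origin would allow any site, which also includes Synapse.

```python
import xml.etree.ElementTree as ET


def cors_configuration(allowed_origin):
    """Return an S3 CORS configuration (XML text) with one rule for the given origin."""
    root = ET.Element("CORSConfiguration")
    rule = ET.SubElement(root, "CORSRule")
    ET.SubElement(rule, "AllowedOrigin").text = allowed_origin
    # Methods a browser client needs for downloads and uploads (assumed list).
    for method in ("GET", "POST", "PUT", "HEAD"):
        ET.SubElement(rule, "AllowedMethod").text = method
    ET.SubElement(rule, "MaxAgeSeconds").text = "3000"
    ET.SubElement(rule, "AllowedHeader").text = "*"
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Paste the printed XML into the CORS configuration editor.
print(cors_configuration("https://www.synapse.org"))
```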
If your bucket is set for read-write access, files can be added to the bucket using the standard Synapse interface (web or programmatic).
If the bucket is read-only or you already have content in the bucket, you will have to add representations of the files in Synapse programmatically. This is done using a FileHandle, which is a Synapse representation of the file.
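A sketch of that registration step with the Python client might look like the following. It assumes the `synapseclient` package, a storage location ID that has already been created for the bucket, and the `/externalFileHandle/s3` REST endpoint; the function names and example values are hypothetical, and the request fields should be checked against the REST docs.

```python
import hashlib
import json
import mimetypes
import os


def s3_file_handle_body(local_copy, bucket_name, s3_key, storage_location_id):
    """Describe an object that already exists in the bucket, using a local copy
    of the file to compute its size and MD5."""
    md5 = hashlib.md5()
    with open(local_copy, "rb") as fh:
        # Hash in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            md5.update(chunk)
    content_type = mimetypes.guess_type(local_copy)[0] or "application/octet-stream"
    return {
        "concreteType": "org.sagebionetworks.repo.model.file.S3FileHandle",
        "fileName": os.path.basename(s3_key),
        "contentSize": os.path.getsize(local_copy),
        "contentType": content_type,
        "contentMd5": md5.hexdigest(),
        "bucketName": bucket_name,
        "key": s3_key,
        "storageLocationId": storage_location_id,
    }


def register_file(syn, project_id, body):
    """Create the FileHandle in Synapse and attach it to a new File entity."""
    import synapseclient  # third-party; imported lazily

    handle = syn.restPOST(
        "/externalFileHandle/s3", json.dumps(body), endpoint=syn.fileHandleEndpoint
    )
    return syn.store(
        synapseclient.File(parentId=project_id, dataFileHandleId=handle["id"])
    )


# Example usage (all identifiers are placeholders):
# import synapseclient
# syn = synapseclient.login()
# body = s3_file_handle_body("data.csv", "thisisthenameofmybucket", "data.csv", 1234)
# register_file(syn, "syn12345", body)
```

The file itself is never sent to Synapse in this flow; only its description is, so the size and MD5 must match the object that is actually in the bucket.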
Please see the REST docs for more information on setting external storage location settings using our REST API.
To set up an SFTP server as a storage location, the settings on the Project need to be changed; specifically, the storageLocation needs to be set. This is best done using either R or Python, but it also has alpha support in the web browser.
Customize the code below to set the storage location as your SFTP server:
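A minimal Python sketch of these settings follows. It assumes the `synapseclient` package and the `/storageLocation` and `/projectSettings` REST endpoints; the server URL, description, banner, and function names are placeholders to customize, and the field names should be verified against the REST docs.

```python
import json


def sftp_destination(url, description="My SFTP upload location", banner=None):
    """Build the storage location setting for an SFTP server."""
    if not url.startswith("sftp://"):
        raise ValueError("expected an sftp:// URL")
    destination = {
        "uploadType": "SFTP",
        "concreteType": "org.sagebionetworks.repo.model.project.ExternalStorageLocationSetting",
        "description": description,
        "supportsSubfolders": True,
        "url": url,
    }
    if banner:
        destination["banner"] = banner  # optional message shown to uploaders
    return destination


def set_project_storage(syn, project_id, destination):
    """Create the storage location, then make it the project's upload destination."""
    created = syn.restPOST("/storageLocation", body=json.dumps(destination))
    project_setting = {
        "concreteType": "org.sagebionetworks.repo.model.project.UploadDestinationListSetting",
        "settingsType": "upload",
        "locations": [created["storageLocationId"]],
        "projectId": project_id,
    }
    return syn.restPOST("/projectSettings", body=json.dumps(project_setting))


# Example usage (placeholders throughout):
# import synapseclient
# syn = synapseclient.login()
# set_project_storage(syn, "syn12345", sftp_destination("sftp://your-sftp-server.com"))
```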
Using a Proxy to Access a Local File Server or SFTP Server
For files stored outside of Amazon, an additional proxy is needed to validate the pre-signed URL and then proxy the requested file contents. View more information here about the process as well as about creating a local proxy or an SFTP proxy.
Set Project Settings for a Local Proxy
You must have a key (“your_secret_key”) to allow Synapse to interact with the filesystem.
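The project settings for a local proxy can be sketched in Python as shown here. This assumes the `synapseclient` package, a proxy that is already running at a URL you control, and the `PROXYLOCAL` upload type with the `ProxyStorageLocationSettings` type from the REST API; the function name and all example values are hypothetical.

```python
import json


def set_proxy_storage(syn, project_id, proxy_url, secret_key):
    """Register a local-proxy storage location and attach it to a project."""
    destination = {
        "uploadType": "PROXYLOCAL",
        "secretKey": secret_key,  # the key shared between Synapse and your proxy
        "proxyUrl": proxy_url,
        "concreteType": "org.sagebionetworks.repo.model.project.ProxyStorageLocationSettings",
    }
    created = syn.restPOST("/storageLocation", body=json.dumps(destination))
    project_setting = {
        "concreteType": "org.sagebionetworks.repo.model.project.UploadDestinationListSetting",
        "settingsType": "upload",
        "locations": [created["storageLocationId"]],
        "projectId": project_id,
    }
    return syn.restPOST("/projectSettings", body=json.dumps(project_setting))


# Example usage (placeholders throughout):
# import synapseclient
# syn = synapseclient.login()
# set_proxy_storage(syn, "syn12345", "https://your-proxy.example.org", "your_secret_key")
```

Avoid committing the secret key to source control; reading it from an environment variable or a local configuration file is a safer choice.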
Let us know what was unclear or what has not been covered. Reader feedback is key to making the documentation better, so please contact us or open an issue in our GitHub repository (Sage-Bionetworks/synapseDocs).