Getting Started with Synapse

This getting started guide is for non-technical users who are interested in learning about Synapse. By following this guide, you’ll learn fundamental Synapse features by performing some simple tasks. You’ll learn how to:

  • Create your own project and add content to Synapse
  • Install one of the Synapse clients (R, Python or command line)
  • Incorporate Synapse in your workflows to read shared content and upload analysis results
  • Find content in Synapse
  • Understand and use provenance

What is Synapse?

Synapse is an open source software platform that data scientists can use to carry out, track, and communicate their research in real time. Synapse enables co-location of scientific content (data, code, results) and narrative descriptions of that work. Synapse has seeded a growing number of living research projects and resources including Sage/DREAM Challenges.

With Synapse, you can:

  • create your own personal project workspaces
  • populate your projects with files and tables such as data, code, and results as well as the provenance relationships that tie these resources together
  • richly annotate files and tables to increase discoverability and aid in programmatic querying of these resources
  • provide a project narrative which lives right alongside the scientific artifacts of your work, via the Synapse Wiki engine
  • create a DOI for any resource for easy citation of your work
  • share your work with other Synapse users, teams of users, or make your work public
  • discuss your work with other researchers in a project using group email and group chat

Synapse was created to encourage open science initiatives to advance our understanding of human health. Sage Bionetworks provides Synapse services free of charge to the scientific community through generous support from the National Cancer Institute (NCI), the Washington State Life Science Development Fund (LSDF), and the National Heart, Lung, and Blood Institute (NIH NHLBI).

Synapse operates under a complete governance process that includes well-documented Terms and Conditions of Use, guidelines and operating procedures, privacy enhancing technologies, as well as the right of audit and external reviews.

Installing Synapse Clients

Synapse is built on a set of RESTful web APIs that allow users to interact with the system through several clients. One of these is the web client, i.e. the website www.synapse.org. Synapse also provides three programmatic clients (R, Python, and command line). Content can be uploaded, downloaded, annotated, and queried from any of these interfaces. In this guide we run through examples using all three programmatic interfaces. Unless otherwise noted, the examples can be typed directly into the respective environment: a shell prompt for the command line examples, a Python session (such as an IPython notebook or a script) for the Python examples, and an R session for the R examples.

In a terminal window type the following command and hit enter. (For alternative methods of installation see the Python client installation instructions.)

pip install synapseclient

Logging into Synapse

Synapse credentials are required to use the programmatic clients. Register to create an account; even if you log in with a Google account, make sure you complete the extra step of creating a Synapse username and password.

At the command line you can log in by specifying your Synapse username and password.

The login credentials can be specified for every Synapse client session, but this is not the recommended practice as your password will be visible. Instead, by passing the rememberMe parameter you can cache your credentials for use in future Synapse client sessions.

The full list of possible login parameters for the Python client can be found in the Python Docs and for the R client in the R Docs.

To log in with your username/email and password:

# by passing --rememberMe, the username/password will not need to be specified on subsequent calls to Synapse.
synapse login -u me@example.com -p secret --rememberMe

Using a config file

You can store your credentials in a file called .synapseConfig in your home directory. (Note the period at the beginning of the file name, which makes this a hidden file on Linux-like operating systems.) The format is as follows:

[authentication]
username: me@example.com
password: secret
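
As a rough illustration of the layout above, the following Python sketch writes a config file in that format and reads it back using only the standard library. The helper names write_synapse_config and read_synapse_config are hypothetical and are not part of any Synapse client. The sketch assumes the file can be parsed as an INI-style file, and the permission step is an added precaution rather than something this guide requires.

```python
import configparser
import os
import tempfile


def write_synapse_config(path, username, password):
    """Write credentials using the [authentication] layout shown above."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("[authentication]\n")
        handle.write(f"username: {username}\n")
        handle.write(f"password: {password}\n")
    # Added precaution (not from the guide): make the file readable and
    # writable by its owner only, since it holds a password.
    os.chmod(path, 0o600)


def read_synapse_config(path):
    """Return (username, password), assuming an INI-style file."""
    # interpolation=None keeps '%' characters in passwords literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")
    section = parser["authentication"]
    return section["username"], section["password"]


if __name__ == "__main__":
    # A temporary directory stands in for the home directory in this sketch.
    with tempfile.TemporaryDirectory() as fake_home:
        config_path = os.path.join(fake_home, ".synapseConfig")
        write_synapse_config(config_path, "me@example.com", "secret")
        print(read_synapse_config(config_path))
```

To use it for real, the path would point at .synapseConfig in your own home directory, for example via os.path.expanduser("~/.synapseConfig").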

Becoming a Certified User

Anyone can browse public content in Synapse but in order to download and create content you will need to register for an account:

Register

As Synapse can store human subject data that has sharing and use restrictions, you will also need to become certified and take a quiz about what kinds of items can be shared in Synapse. To start this process:

Become a Certified User

Explore our accounts, certification and profile validation page to find out more information on the different levels of users.

Project and Data Management on Synapse

Now that you have your Synapse account you can start adding content. All Synapse content is organized into user-created Projects. Select a unique name for your Project, such as “My uniquely named project”, and create your Project. Projects are organizational units in which you can collaboratively access and share Wikis (narratives), Files (a distributed file system to store data, code, and results from your work), and Tables (web-accessible, sharable, and queryable data where columns can have a user-specified structured schema). Each Project also contains a project-specific Discussion Forum. By default, your newly created Project is private; you are the only person who can access it and any content you include in it. To invite others to view or edit your Project, click the Share icon in the upper right-hand portion of the screen. For more information on sharing, please see the Content Controls article.

As an exercise, we are going to create an example Project to store a cell line analysis.

Since Project names must be unique in Synapse, choose a name that no one else is likely to have used. The example below uses "My uniquely named project"; replace it with your own name.

synapse create Project -name "My uniquely named project"
	


As noted above, your newly created Project starts out private. Later on we will share it with other users.

Objects created in Synapse, such as Files, Folders, and Projects, are each assigned a unique identifier (a Synapse ID) of the form syn12345678, which is used to reference them. For example, your newly created Project will be assigned a Synapse ID.

Note: Synapse IDs are used to uniquely identify Files, Folders, Projects and Tables in Synapse.
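
When scripting against Synapse, it can help to check that a string looks like a Synapse ID before passing it to a client. The Python sketch below is one possible check. It assumes an ID is the literal prefix "syn" followed by one or more digits, which matches the syn12345678 example but is not a rule stated in this guide; the function name is_synapse_id is hypothetical.

```python
import re

# Assumption: "syn" followed by one or more digits, as in syn12345678.
# The guide shows only this example, so the exact rules may differ.
SYNAPSE_ID_PATTERN = re.compile(r"syn\d+")


def is_synapse_id(text):
    """Return True if the whole string has the form syn<digits>."""
    return SYNAPSE_ID_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["syn12345678", "syn###", "12345678"]:
        print(candidate, is_synapse_id(candidate))
```

A check like this only validates the shape of the identifier; it cannot tell whether the ID actually exists in Synapse or whether you have access to it.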

You can view what you have created in Synapse on the web with:

synapse onweb syn123  # where syn123 is replaced with the Synapse ID of your Project
	

Adding a Wiki to your Project

The Wiki tab in a Project provides a space for you to build narrative content to describe your research. These Wikis can also be nested as subpages to build up a hierarchy of content within your Project as well as be attached to specific Files and Folders in your Project. Examples of content that you may want to include are project descriptions, specific aims, progress updates of data generation or analysis, analysis results (either in prose or via markdown-based notebooks such as knitr or IPython notebook), or web-accessible publication-like summaries of your research.

Wiki pages can contain highly customized content including, but not limited to, images, tables, code blocks, LaTeX-formatted equations, and scholarly references. Synapse-specific widgets also allow users to embed dynamic content based on other resources stored in Synapse (e.g., Entity List, User/Team badge, Query Table, or Provenance Graph).

Here we will create a small Wiki:

The command line client does not support the creation of Wiki content.
Instead, we suggest opening the web page of your Project with
synapse onweb syn###
where syn### is the Synapse ID of your created Project, and then editing the Wiki using the web client.
	

Organizing Data: creating Files and Folders

The Files tab houses a remote file system that you can utilize to share your project’s data, code, results, and any other information pertinent to your research. Unlike the file system on your local computer, Synapse Files and Folders are identified by a unique identifier, are versioned, and can be linked to one another using the Synapse Provenance services. These Files and Folders, like all Synapse content, can be accessed either through the web or through one of our analytical clients using their unique Synapse ID.

Synapse Folders are used just as folders are on a local file system – to organize and segment content. Folders can also contain (or be parents of) any number of other Folders and Files.

To add a Folder:

synapse create Folder -name "results" -parentId syn123  # where syn123 is replaced by the Synapse ID of your Project
	


Synapse Files are also much like files on a local file system, except that they are web-accessible to anyone who has access, can be richly annotated (and queried on), can be embedded as links or images within a Synapse Wiki, and can be associated with a DOI. Files carry the Conditions for Use of the Folder they are placed into, in addition to any specific Conditions for Use of their own.

Let's upload a local file, data/cell_lines_raw_data.csv, into this newly created Folder. To follow along you can pick any file you have and replace the name with your chosen file. We will also attach some annotations to this file describing its content. In the example, we associate the key foo with the value bar, along with two numerical annotations.

synapse add data/cell_lines_raw_data.csv --parentId=syn123  --annotations '{"foo": "bar", "number1":42, "number2": 3.14}'
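
Writing the annotations JSON by hand is error-prone once there are more than a few keys. As a minimal sketch, the Python below builds the same command line from a dictionary, using json.dumps for the annotations and shlex.quote for POSIX-shell quoting. The function name build_add_command is hypothetical, and the sketch only assembles the command text shown above; it does not run the Synapse client.

```python
import json
import shlex


def build_add_command(path, parent_id, annotations):
    """Return a POSIX-shell command line for `synapse add` with annotations."""
    # Serialize the annotations dictionary to the JSON text the flag expects.
    annotations_json = json.dumps(annotations)
    args = [
        "synapse", "add", path,
        f"--parentId={parent_id}",
        "--annotations", annotations_json,
    ]
    # Quote each argument so the JSON survives the shell unchanged.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(build_add_command(
        "data/cell_lines_raw_data.csv",
        "syn123",  # placeholder: replace with the Synapse ID of your Folder
        {"foo": "bar", "number1": 42, "number2": 3.14},
    ))
```

The printed line can be pasted into a terminal. Note that shlex.quote targets POSIX shells, so the quoting may need adjusting on Windows.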

	


Local Folder and File Sharing Settings

Access to Files, Tables, and Folders is controlled by the Sharing setting that you select for your project. You may also set individual Sharing settings for specific Files, Tables, or Folders within a Project.

Provenance and Tracking Content

Synapse provides advanced capabilities for formally tracking the relationships between digital assets (e.g. data, code, analytical results) stored within the system. The Synapse provenance system helps users disseminate their work in ways that others can reproduce and reuse: it lets them formally track their analysis history and communicate and share a sequence of processing steps. Provenance relationships can, for example, be specified between the raw data, analysis code, and results of a complex processing pipeline, regardless of where it is run. Synapse’s web services for managing provenance expose a very general data model based on the W3C PROV specification. Central to the design, users are not required to use a particular execution environment or workflow tool. Instead, provenance can be specified by inserting calls to the Synapse web service layer into normal workflows to record activity; pipelines may be created through simple scripting or by using workflow execution engines. The provenance system allows users to branch off workflows at any point in prior analyses, while maintaining detailed records of the data, code, and environment versions needed to reproduce the work.

Provenance is most easily specified when you are uploading or editing a File in Synapse. To specify provenance, you indicate the files used as input and any files that were executed to generate the File. Both used and executed items can be either an arbitrary URL (such as a reference to a code commit on GitHub or a file stored on an FTP site) or a reference to an item in Synapse. Here we are going to add a figure to Synapse and indicate that the code at https://github.com/Sage-Bionetworks/synapseTutorials was used to generate the figure from the data in the file data/cell_lines_raw_data.csv that we uploaded previously.

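# where syn123 is replaced by the Synapse ID of the target Folder and raw_data_file
# by a reference to the previously uploaded data file (for example, its Synapse ID)
synapse add images/plot_2.png --parentId=syn123  \
--used raw_data_file  \
--executed https://github.com/Sage-Bionetworks/synapseTutorials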
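
To make the used/executed relationship more concrete, the Python sketch below models provenance as a plain dictionary and walks it backwards to list everything a result depends on. This is only an illustration of the idea described in this section. The Activity class and trace_inputs function are hypothetical names, not the Synapse data model or client API, and file paths and URLs stand in for Synapse references.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One processing step: the inputs it used and the code it executed."""
    used: list = field(default_factory=list)
    executed: list = field(default_factory=list)


def trace_inputs(provenance, item):
    """Return every upstream item that `item` depends on, nearest first."""
    seen = []
    queue = [item]
    while queue:
        current = queue.pop(0)  # breadth-first: closest ancestors first
        activity = provenance.get(current)
        if activity is None:
            continue  # no recorded provenance, e.g. raw data or external code
        for parent in activity.used + activity.executed:
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


# The figure from the example above, recorded in this illustrative model.
provenance = {
    "images/plot_2.png": Activity(
        used=["data/cell_lines_raw_data.csv"],
        executed=["https://github.com/Sage-Bionetworks/synapseTutorials"],
    ),
}

if __name__ == "__main__":
    for dependency in trace_inputs(provenance, "images/plot_2.png"):
        print(dependency)
```

In a longer pipeline each intermediate result would have its own entry, and the same traversal would recover the full chain back to the raw data.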


Find additional information and tutorials through our User Guide.

Need More Help?

Try posting a question to our Forum.

Let us know what was unclear or what has not been covered. Reader feedback is key to making the documentation better, so please let us know or open an issue in our GitHub repository (Sage-Bionetworks/synapseDocs).

2016 Sage Bionetworks Contact us Creative Commons License