Tables is a feature designed to provide users the ability to create web-accessible, sharable, and queryable
data where columns can have a user-specified structured schema. Users may define table columns to contain common primitive
data types (text, numbers, dates, etc.) or references to other Synapse objects (e.g.,
Files). Therefore, it is possible to
organize sample metadata into tables that include columns for desired sample fields, as well as columns that link the sample
information to a heterogeneous collection of files.
Tables may be queried and edited through both the Synapse web portal as well as through our analytical clients
that enable direct access to these data from analysis pipeline code. Unlike most NoSQL systems, the data in Synapse
is strongly consistent, not eventually consistent. This is an important design consideration for scientific data processing,
as analysis on eventually-consistent data sources can limit the types of analysis performed, and may require special coding
strategies to ensure reasonable accuracy.
Tables contain metadata that specifies the types of columns included in the
Table. These columns can be specified manually,
or Synapse can recommend column types when a user uploads a data file.
The data contained within a Synapse
Table can be retrieved by using a SQL-like query language either through the web portal or through
the analytical clients. See the API docs for an enumeration of the types of queries that can be performed. Here are a couple of simple examples.
To get all of the columns from a
Table with id syn3079449, the following query would be used:
SELECT * FROM syn3079449
To get only the two columns called “age” and “gender”:
SELECT age, gender FROM syn3079449
To count the number of rows:
SELECT count(*) FROM syn3079449
To get all columns, but only rows where age is greater that 50:
SELECT * FROM syn3079449 WHERE age > 50
To get all columns, but only rows where age is greater that 50 - and sort by treatmentArm:
SELECT * FROM syn3079449 WHERE age > 50 ORDER BY "treatmentArm" ASC
A Schema is defined in terms of Column objects that specify types from the following choices:
1. Start with a dataframe, .csv, or .tsv:
2. Create table schema: Each client has a utility function to create Columns from a data frame.
synapseclient.as_table_columns and R uses
The Table() function takes two arguments, a schema object and data in some form, which can be:
3. Store table in Synapse:
4. Query the table:
Python returns an iterator and R returns a data frame by default. To return a data frame in Python use Pandas
Once the schema is settled, changes can be made by adding, appending, and deleting.
When updating, begin by querying the table to ensure you have the latest schema and values.
Updating existing values
Adding new columns
Modifying existing columns
Adding new rows
Modifying existing rows
Delete the entire table
Tables support a special column type called
File which contain a file handle, an identifier of a file stored in Synapse. Here’s an example of how to upload files into Synapse, associate them with a table and read them back later.
1. Add a new column for files in the table we’re currently working with
2. Retrieve the most current table and save as a data frame
3. Upload the files: The files used in this example can be downloaded here. It is assumed that the files are in your current working directory.
4. Query the table and download the album cover files
The faceted navigation on
Tables (also known as simple search) can be used to simplify your search without having to use SQL-like queries. Simple search uses radio buttons and sliders to show all available facets in a menu to the left of the
Table whereas advanced search employs a SQL-like query to filter the
Table. To use table facets, navigate to a
Table or a
File View. For this example, we will be using the Synapse Table Demo found in the Wondrous Research Example. Simple and advanced search both allow you to query for features of interest in a
Table using different methods.
In order to use simple search, you must first set columns to be facets in the schema editor. Select Schema in the upper right of your table and click on Edit Schema. In the resulting pop-up, select Values or Range from the dropdowns under the Facet option. Values can be thought of as categories whereas Range is a date or number.
To see all the facets, click on Show simple search found above the SQL-query bar:
Select the features you are interested in to filter the table.
You can toggle from the simple search to the advanced search without losing the query results. For example, if in the simple search you had selected treatmentArm
A, age of
23:64, and gender as
female, the query will be preserved in the advanced search bar. However, this is unidirectional because the advance search allows parameters that are not available with facets. Therefore switching from advanced to simple search will result in resetting the search query.
There are additional docs available for
Tables in both Python and R that cover more advanced topics.
For Python check out our Python Docs.
For R, open up your R session and check out our vignettes by typing
vignette("tables", package="synapseClient") into your console.
Try posting a question to our Forum.
Let us know what was unclear or what has not been covered. Reader feedback is key to making the documentation better, so please let us know or open an issue in our Github repository (Sage-Bionetworks/synapseDocs).