This vignette combines concepts from Annotations and Queries, Views, and Uploading and Downloading Data in Bulk to create a manifest, velociraptor_manifest.txt, upload 100 files, and edit annotations on these files using the Synapse clients.
The Sage Bionetworks Systems Biology and Computational Oncology teams maintain annotation dictionaries in GitHub. You can use the terms found in the synapseAnnotations GitHub repo as a starting point for your own annotations.
To batch upload files, create a tab-delimited manifest which contains, at minimum, the columns path and parent. You can also add annotations as extra columns in your manifest. For example, your manifest might have the following headers:
path: the local path to your file
parent: the Synapse ID (in the format syn123456) that the files will be uploaded to
specimenID: the unique identifier for each of your specimens
assay: the technology used to generate the data in this file (e.g. rnaSeq, ChIPSeq, wholeGenomeSeq)
species: the species of your sample (e.g. Mouse, Human, Triceratops)
platform: the hardware used to generate the data (e.g. HiSeq2500, Affy6.0, HoodDNASequencer)
sex: e.g. male or female
fileFormat: the type of file (e.g. fastq, R script)
Here it is in a visual example:
See Creating a Manifest in Uploading and Downloading Data in Bulk for additional details.
Save this file in a tab-delimited format called velociraptor_manifest.txt.
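A manifest like this can also be generated programmatically. The following is a minimal sketch using only the Python standard library. The file paths, the Synapse ID syn123456, the specimen identifiers, and the annotation values are placeholders chosen for illustration, not values from a real project.

```python
import csv

# Column order follows the example headers; only path and parent are required.
COLUMNS = ["path", "parent", "specimenID", "assay",
           "species", "platform", "sex", "fileFormat"]


def write_manifest(manifest_path, rows):
    """Write a tab-delimited manifest with a header and one row per file."""
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical example data: 100 local files destined for one Synapse folder.
rows = [
    {
        "path": f"/data/velociraptor/sample_{i}.fastq",  # placeholder path
        "parent": "syn123456",                           # placeholder Synapse ID
        "specimenID": f"specimen_{i}",                   # placeholder identifier
        "assay": "rnaSeq",
        "species": "Velociraptor mongoliensis",
        "platform": "HiSeq2500",
        "sex": "female",
        "fileFormat": "fastq",
    }
    for i in range(1, 101)
]

write_manifest("velociraptor_manifest.txt", rows)
```

Generating the rows in code keeps the annotation values consistent across all 100 files, which makes later queries and facets on the file View more reliable.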
Files can be uploaded in one go with a manifest file. If you would like to do a “dry run” validation of the file before uploading, you can add the parameter dryRun = True to the function syncToSynapse. Note that the dryRun feature checks everything but does not upload the files.
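As a rough sketch of this step with the Python client: the helper names below (missing_required_columns, upload_manifest) are hypothetical, and the login call assumes credentials are already cached or configured. The local header check is an extra convenience and does not replace the dry run performed by syncToSynapse.

```python
import csv

REQUIRED_COLUMNS = ("path", "parent")


def missing_required_columns(manifest_path):
    """Return the required columns that are absent from the manifest header."""
    with open(manifest_path, newline="") as handle:
        header = next(csv.reader(handle, delimiter="\t"), [])
    return [name for name in REQUIRED_COLUMNS if name not in header]


def upload_manifest(manifest_path, dry_run=True):
    """Validate (dry_run=True) or upload (dry_run=False) the files in a manifest."""
    # Imported lazily so the local check above works without the client installed.
    import synapseclient
    import synapseutils

    syn = synapseclient.login()  # assumes cached credentials or a configured token
    synapseutils.syncToSynapse(syn, manifest_path, dryRun=dry_run)


# Typical usage (not executed here because it needs a Synapse account):
#   if not missing_required_columns("velociraptor_manifest.txt"):
#       upload_manifest("velociraptor_manifest.txt", dry_run=True)   # validate only
#       upload_manifest("velociraptor_manifest.txt", dry_run=False)  # real upload
```

Running the dry run first surfaces manifest problems before any of the 100 files are transferred.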
And ta-da! Your files have been uploaded!
Since the files have been uploaded with annotations, a file View allows users to query, facet, and bulk manipulate the files and metadata.
To create your File View:
parent column in the manifest.
etag listed as one of your columns.
An annotation for a single file can be modified in the Web client View in the case that
delta_1 needs to be updated to
A bulk annotation update is required in the case that Velociraptor mongoliensis should be modified to Utahraptor ostrommaysorum in all 100 files.
To download the annotation values from the Web client:
Include row metadata (Row Id and Row Version) selected when downloading.
Now that you have the file View downloaded, you can edit the values using your tool of preference, whether that is Python, R, Excel, LibreOffice, etc.
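If you choose Python for the edit, a minimal sketch with the standard library might look like the following. The function name replace_annotation is hypothetical, and the sketch assumes the download is a comma-separated file with a species column. The row metadata columns are passed through unchanged so the edited file can still be matched to the original rows.

```python
import csv
import io


def replace_annotation(csv_text, column, old_value, new_value):
    """Return (updated CSV text, rows changed); other columns pass through untouched."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    changed = 0
    for row in reader:
        if row[column] == old_value:
            row[column] = new_value
            changed += 1
        writer.writerow(row)
    return out.getvalue(), changed


# Typical usage with a downloaded file (the file names are placeholders):
#   with open("view_download.csv", newline="") as handle:
#       updated, count = replace_annotation(
#           handle.read(), "species",
#           "Velociraptor mongoliensis", "Utahraptor ostrommaysorum")
#   with open("view_edited.csv", "w", newline="") as handle:
#       handle.write(updated)
```

Checking the returned count against the expected 100 rows is a quick way to confirm that every file was updated.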
With the changes saved, go back to the file View in your browser.
Alternatively, download the View with the R or Python client:
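A minimal sketch with the Python client, assuming pandas is installed and credentials are already configured. The View ID passed in is a placeholder for your own file View, the helper names build_view_query and update_species are hypothetical, and the query assumes the species value contains no single quotes.

```python
def build_view_query(view_id, species):
    """Build a query selecting the View rows annotated with one species."""
    # Assumes the species value contains no single quotes.
    return f"SELECT * FROM {view_id} WHERE species = '{species}'"


def update_species(view_id, old_species, new_species):
    """Download matching View rows, change the species annotation, and store them."""
    # Imported lazily so the query helper works without the client installed.
    import synapseclient

    syn = synapseclient.login()  # assumes cached credentials or a configured token
    results = syn.tableQuery(build_view_query(view_id, old_species))
    df = results.asDataFrame()   # requires pandas
    df["species"] = new_species
    syn.store(synapseclient.Table(view_id, df))


# Typical usage (not executed here because it needs a Synapse account):
#   update_species("syn123456",  # placeholder View ID
#                  "Velociraptor mongoliensis", "Utahraptor ostrommaysorum")
```

Restricting the query to the rows that need the change keeps the update small and leaves other files in the View untouched.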