Synapse Docs
Breadcrumbs

Managing Metadata with Curator

About

Curator provides a spreadsheet-style grid on the Synapse website that offers a familiar row-and-column layout used in many data management tools. This interface allows you to add metadata to describe files and enter data describing records, such as those for participants and biospecimens.

In addition to manual editing, Curator offers an AI-powered chat assistant to help accelerate metadata completion. The assistant can guide you through metadata requirements, suggest appropriate values, and help populate fields based on your project context, while keeping final decisions and edits under human control.

There are two types of metadata to keep in mind: file-based metadata and record-based metadata.

  • File-based metadata is information attached to files you upload to Synapse, stored as Synapse Annotations. Synapse annotations refer to key-value pairs that describe important details about a file. If you have used Synapse before, you may recognize annotations as the metadata associated with the tag icon shown here annotation.png . Annotations are used by downstream systems (for example, Synapse data portals) to support tasks such as search and filtering, which is why well-annotated data is important. Keep in mind that Synapse Annotations are visible to the public, so contributors should not enter any potentially private information.


    In file-based metadata spreadsheets, the leftmost columns are id and name. These are system-generated and should not be edited.

    • id corresponds to the Synapse ID, which uniquely identifies files, folders, datasets, and other entities, and follows the format syn plus an eight-digit number (e.g. syn12345678)

    • name corresponds to the file name. See best practices for filenames.

      Columns to the right of id and name capture additional metadata defined by the associated metadata template and are displayed as the header row. The table below shows an example for RNA sequencing FASTQ files, annotated with metadata fields such as assay, tissue, species, file_type, and sample_id.

id

name

assay

tissue

species

file_type

sample_id

syn12345678

sample1_read1.fastq

RNA-seq

blood

human

fastq

001

syn12345679

sample1_read2.fastq

RNA-seq

blood

human

fastq

001

syn12345680

sample2_read1.fastq

RNA-seq

blood

human

fastq

002

syn12345681

sample2_read2.fastq

RNA-seq

blood

human

fastq

002

  • Record-based metadata is used to capture information about entities such as participants, samples, or specimens.

    Definitions

    • Record set: A tabular file (for example, CSV or TSV) that contains a collection of related records describing the same type of entity (participants, samples, or specimens).

    • Record
      A single row in a record set representing one entity (for example, one participant or one sample).

    • Record ID
      A user-defined identifier (such as participant_id, sample_id, or specimen_id) that uniquely identifies a record within a record set.
      This is not a Synapse ID and is not auto-generated by Synapse.

Upload and update behavior

Each record must include a record ID column. This ID is used to determine whether a record should be created or updated when a record set is uploaded.

When a record set is uploaded:

  • Rows with a record ID that does not already exist in the record set are added as new records

  • Rows with a record ID that already exists in the record set overwrite (update) the existing record with the same ID

  • Records in the existing record set whose IDs are not present in the uploaded file are left unchanged

Synapse does not generate record IDs automatically. If a record ID is missing, the row cannot be matched to an existing record.

As shown in the table below, record-based metadata capture additional clinical context (for example, diagnosis, age bracket, and sex) for the files listed above. Record-based metadata are stored in a separate table so that multiple files can be linked to the same participant or sample without repeating the same information. This keeps metadata consistent and easier to update.

sample_id

diagnosis

age_bracket

sex

001

Glioblastoma

50–59

Male

002

Glioblastoma

40–49

Female

Whether metadata is applied as file-based or record-based metadata depends on the type of content. In general, when you upload files, you will always have at least file-based metadata applied. In some cases, you may also have record-based metadata that describes patients, biospecimens, etc.

💡 Test your knowledge!

This diagram provides an overview of the metadata contribution workflow in Curator. Additional details and instructions for each step are included in this documentation.

Curator_Workflow.png

Requirements for using Curator

To use Curator, you must:

  • Create a Synapse account

  • Complete the Synapse Certification Quiz

  • Have View access (or higher) to the Synapse project, granted by the project Administrator

  • Have Edit access (or higher) to any files you will work with, granted by the project Administrator

Setting up Curator

Data Coordination Plan (i.e. Data Coordinating Centers)

If you are part of a Data Coordination plan (i.e. data coordinating center such as ADKP, ARK, ALS, ELITE, Classic, HTAN, MC2, or NF-OSI) your project setup is handled by Sage Bionetworks. Curator is enabled automatically on your behalf, and no additional configuration is required from you.

If you have questions or need support, please contact your consortium’s service desk:

If none of the above apply, please submit questions to the Synapse Help Desk

Free Basic Plan

Curator is free to use. If you manage your own project on the free Basic plan, Curator is disabled by default and hidden in the user interface. To enable it, follow the programmatic setup steps in the instructions linked here.

To access Curator, go to your Synapse project and select the Tasks and Actions tab. This tab is the central place to see tasks associated with your project.


Screenshot 2026-01-07 100237.png
Navigate to your Synapse project by selecting it from the Projects menu (for example, under Shared With Me).
image-20260213-220316.png
The Tasks and Actions tab appears on Synapse projects and provides access to curation tasks.

What Are Tasks?

Curator introduces the concept of tasks in Synapse. Tasks guide you through the steps needed to complete required or recommended steps for your project or data.

How Curation Tasks Relate to Your Project Files

Curation tasks are directly connected to your project’s file structure.

When you navigate to the Files tab, you will see folders corresponding to each curation task. The type of folder depends on the metadata workflow used.

Conceptual Diagram of a Synapse Project

Project
│
├── Tasks and Actions (Tab)
│     ├── Task: Biospecimen Metadata
│     └── Task: File-Level Metadata
│
└── Files (Tab)
      ├── Biospecimen_Metadata/
      │     └── Biospecimen_RecordSet (table)
      │
      └── Experimental_Files/
            ├── file_001.fastq
            ├── file_002.fastq
            └── file_003.fastq

If you are unsure which task to complete, or if you do not see an expected task, please contact your data manager or project administrator for guidance.

Permissions and Collaboration

Curator is designed to promote collaboration by ensuring that everyone on a project sees a shared, consistent set of curation tasks. Curation tasks are assigned to a team so that team members can work together, review progress, and maintain alignment throughout the metadata curation process.

Access to curation tasks follows Synapse project permissions, so existing collaboration models remain intact.

Synapse permissions

Curator access

Can View

View curation tasks

Can Download

View curation tasks

Can Edit

View/Edit curation tasks

Can Edit & Delete

View/Edit/Delete curation tasks

Administrator

View/Edit/Delete curation tasks

There is currently no built-in search feature or filters on the Tasks and Actions tab, so leverage your browser’s “control / CMD F” feature to search for the curation task you need to complete.

Working in Curator

  1. From the Tasks and Actions tab, click Open Curator under the Actions menu.

image-20260213-220454.png


  1. For file-based metadata, the grid displays the list of files in the leftmost column. For record-set data, the grid displays all records in the dataset, either populated from a previous session or with empty cells if no metadata has been entered yet. Column headers indicate the fields or properties that need to be completed.

  • Blue column headers = required fields

  • White column headers = optional fields

  • Red cells = missing or invalid values

  • White cells = validated values

    Dependencies (conditional fields)
    Some fields are conditionally required based on values entered in other fields, as defined by the schema. When a dependency is triggered:

    • A column header may change from white to blue to indicate it has become required.

    • Cells in the affected column may change from white to red if values are missing or no longer valid under the new condition.

    • If the condition is no longer met, the column may revert to optional (white header) and cells return to white once validation passes.

    • Curator_Example.jpg

Grid actions

  • Add rows: Select + Add to create a new row.

  • Edit faster: Right-click a cell (or Control + Click) to open the context menu:

    • Copy

    • Cut

    • Paste

    • Insert row below

    • Duplicate row

    • Delete row

  • See more fields: Scroll horizontally to view additional columns.

  • Bulk upload: Upload a CSV to populate the grid with metadata (record-based submissions only).

  • View JSON Schema at the top of the page to review column details.

  1. Enter metadata by clicking directly into a cell and typing a value. For fields with a controlled vocabulary, click the cell and select one or more values from the drop-down list.

  2. Click Sync Changes to save your latest edits

  3. Once applied, your updates take effect immediately and become the project’s Synapse annotations.

Uploading a CSV

For record-based metadata, you can fill out your data in another tool (like Excel or Google Sheets) by downloading a CSV template and uploading it back into the grid.

  1. Go to the Tasks and Actions tab.

  2. Click Open Curator next to the relevant task to open Curator.

  3. In Curator, click Download to export the spreadsheet as a CSV.

  4. Open the CSV in your preferred tool and complete it offline.
    Important: Don’t change the column headers.

  5. Back in Curator, click Upload and select your completed CSV file.

How uploads create and update records

Every row must include a record ID column. Curator uses this ID to decide whether each row should create a new record or update an existing one.

When you upload a record set:

  • New IDs (not already in the record set) are added as new records.

  • Existing IDs overwrite (update) the record with the same ID.

  • Records not included in the uploaded file are left unchanged.

Note: Synapse does not generate record IDs automatically. If a row is missing a record ID, it can’t be matched to an existing record.

AI Grid Assistant

The Grid Assistant is an AI-powered chat assistant designed to help you curate, clean, validate, and populate tabular data directly within a grid. It understands the grid’s schema and can analyze selected rows or the entire dataset to provide guidance, corrections, and transformations.

The assistant is especially useful when working with structured data that must conform to specific validation rules or when you want to quickly populate a grid using a project description or other large blocks of text.

Accessing the Grid Assistant

To open the AI Grid Assistant:

  1. Navigate to the grid.

  2. Click the Open Chat button in the upper-right corner of the grid. The chat panel will open, allowing you to ask questions or give instructions related to the grid.

  3. To close the chat, click the × in the upper-right corner of the chat panel. Closing the chat preserves your current conversation.

  4. If you leave the page or refresh your browser, the chat session will end. A new chat session will start when you reopen the Grid Assistant.

  5. Grid Assistant can make mistakes, so please check all work prior to submitting!

What the Grid Assistant Can Help With

  • Identify errors, missing values, and duplicate records in the grid.

  • Normalize and clean inconsistent or poorly formatted data.

  • Explore, filter, and summarize data in the grid.

  • Create new columns or derive values from existing data.

Example Prompts:

  • Can you check the selected rows for schema validation errors and explain what’s wrong?

  • These rows have inconsistent date formats in the sample_date column - can you fix this?

  • I’m seeing errors in the species column for this data. What’s the issue, and can you fix the selected rows?

  • For the selected rows, can you generate a sample_id by combining project_id and specimen_number?

  • Are there any common issues across the selected rows that I should address before submission?

💡 Tips

  • Be specific about which rows or columns you want the assistant to work on.

  • Use prompt templates for repeatable workflows.

  • Review suggested changes before applying them to ensure correctness.

What can the AI see?

The Grid Assistant can see:

  • The content currently visible to you in the grid, and

  • The information you choose to provide in the chat.

It follows the same permissions and access controls as the rest of Synapse and cannot see anything you do not already have access to. The Grid Assistant is hosted in AWS Bedrock and does not use submitted content for model training.

As with any Synapse feature, please use care when working with sensitive information and follow the Synapse Terms of Service.

Metadata Requirements

Consortium or Funder Requirements

If you are part of a larger consortium or are working under a specific funder, metadata requirements are often already defined for you. These requirements ensure consistency across datasets and help make data interoperable within a research domain.

You do not need to configure or apply these requirements yourself. When you fill out the Curator grid, the appropriate metadata schema is automatically applied for you. It can be helpful to be aware of these metadata requirements ahead of time, so you know what information to collect during data generation and preparation.

Below are examples of consortium-managed metadata dictionaries and data models that Curator uses behind the scenes:

Individual Synapse Users

If you are an individual Synapse user and are not part of a larger consortium, we plan to make this easier in the future by offering a set of standard metadata schemas that you can choose from. These schemas will cover common data types and research workflows and will help guide you in providing clear, consistent metadata when uploading your data.

For now, if you’re interested in learning more or working with a more advanced setup, you can explore these resources:

Validating

The grid editor checks your entries as you type and points out missing or incorrect information. Highlighted cells and messages show you what needs attention.

You can still click Sync Changes even if there are messages. This is on purpose, so you can save your work when some information isn’t available yet (for example, if you don’t know a required value or don’t see the right option in the list).

Where to See Validation Errors

  • Row Indicator (Left Panel): A highlighted row indicator means at least one value in that row is missing or invalid.

  • Highlighted Cells: Highlighted cells show exactly which values are missing, invalid, or incorrectly formatted.

  • Validation Message Panel (Bottom of Screen): A yellow panel displays explanations of validation errors. An example is shown below:

image-20251217-161959.png

Resolving Validation Errors

Message

Explanation

How to fix

null is not a valid enum

One or more required fields in the selected row are missing (null) or left empty, and the field expects a value from a predefined list.

Populate all required fields (blue columns) with a valid value from the allowed list.

<value> is not a valid enum value

The entered value does not match one of the allowed values defined in the metadata dictionary.

Select a valid option from the predefined list. If the correct term is missing, notify your DCC contact.

Required field <field name> is missing

A required column defined in the schema is missing entirely from the submission.

Add the missing column and populate it with valid values for all rows.

Cannot convert <value> to integer

The value entered cannot be interpreted as a whole number.

Enter a whole number (e.g., 1, 10). Do not use text or decimals.

Value <value> is not a whole number

A decimal or non-integer number was entered where an integer is required.

Replace the value with a whole number.

Cannot convert <value> to float

The value entered cannot be interpreted as a numeric (decimal) value.

Enter a numeric value (e.g., 3.14, 10).

Cannot convert <value> to boolean

The value does not match accepted boolean values.

Use one of the supported values: true, false, yes, no, 1, or 0.

String length is less than minimum

The entered text is shorter than the minimum allowed length.

Enter a longer value that meets the minimum length requirement.

String length is greater than maximum

The entered text exceeds the maximum allowed length.

Shorten the value to meet the maximum length requirement.

String does not match pattern

The value does not match the required format (pattern).

Update the value to follow the required format (for example, a specific ID or naming pattern).

Value is less than minimum

The numeric value is below the allowed minimum.

Enter a number that meets or exceeds the minimum value.

Value is greater than maximum

The numeric value exceeds the allowed maximum.

Enter a number that is less than or equal to the maximum value.

Expected type: <type>, found: <type>

The value entered is the wrong data type for this column (for example, text entered where a number is required).

Replace the value with the correct type expected by the column (for example, enter a number instead of text).

[null] is not a valid date. Expected [yyyy-MM-dd]

The date field is empty or not entered in the required format.

Enter a valid date using the required format: YYYY-MM-DD (for example, 2024-03-15).

Saving

Edits are saved in real time as you enter information in the form. If you leave the page and return later, your information will still be saved.

However, these changes are not committed to Synapse as final annotations until you click Sync Changes.

If the update is successful, a green bar at the bottom of the screen will confirm that the sheet has been successfully updated.

image-20260303-191958.png

Viewing Submitted Metadata

Viewing File-Based Metadata

File-based metadata annotations also become “Synapse annotations” and are visible with each individual file.

  1. Navigate to the Files tab in the Synapse project.

  2. From the folder tree, navigate to the folder associated with the records.

  3. Hover over the tag icon :ann: next to an individual file name to view its annotations.

    • A yellow tag :y: indicates that one or more annotations have invalid metadata.

  4. To see a complete list, click the file entity and click the :annotation: icon on the top right of the entity page.

Viewing Record-Based Metadata
Annotations associated with a record set are also visible on the file-tree view.

  1. Navigate to the Files tab in the Synapse project

  2. Click the appropriate folder associated with the records.

  3. Select the record set entity :re: , which will open a preview displaying the associated metadata.

Troubleshooting

Error message

How to fix

The CSV file cannot be empty

Ensure the CSV file contains at least one row. If a header is expected, include a header row followed by data.

Expected the first line to be the header but was empty

Make sure the CSV includes a header row as the first line and that it is not blank.

The CSV header does not match the schema size

Confirm that the number of columns in the CSV header exactly matches the expected schema. Do not add or remove columns.

The CSV header column “<column>” does not match the schema column “<expected>” at index <n>

Ensure the column names and their order exactly match the schema or Curator grid. Column order matters.

Unexpected processing error

Retry the operation. If the error persists, contact support and include the CSV file and details about what you were attempting to upload.

Reporting a Bug, Requesting a Feature, or Getting Help

If you run into an issue, have a question, or would like to request a new feature for Curator, please submit a support ticket to the Synapse Help Desk

To help us resolve your issue faster, include:

  • A brief description of the problem or request

  • The synapse ID (syn########) of relevant projects, folders, files

  • Any error messages you saw

  • Screenshots or recordings, if available

For questions related to specific projects or Data Coordination Plans, please contact your consortium’s service desk instead:

Frequently Asked Questions

What features are available today?

Below is a summary of features available today and under construction.

Feature

Status

Notes

Spreadsheet-style grid editor

✅ Available

Row-and-column interface for metadata entry on Synapse

File-based metadata (Synapse annotations)

✅ Available

Metadata attached directly to files

Record-based metadata (record sets)

✅ Available

Separate tables for participants, samples, etc., with upsert keys

Project-level curation tasks

✅ Available

Shared tasks visible to all collaborators

Required vs optional field indicators

✅ Available

Blue headers = required, white = optional

Real-time validation while editing

✅ Available

Highlights missing/invalid values as you type

CSV export and upload

✅ Available

Download templates and upload completed CSVs

Autosave during grid session

✅ Available

Drafts saved automatically

Sync Changes to commit to Synapse

✅ Available

Explicit submission step

Permission-based access control

✅ Available

Follows Synapse project permissions

AI Grid Assistant (human-in-the-loop)

✅ Available

Chat-based assistance for cleaning, validating, and populating data

Schema inspection (View Validation Schema)

✅ Available

JSON schema view for advanced users

Consortium-managed metadata schemas

✅ Available

Applied by Sage Bionetworks for Data Coordination projects

Browser-based search (Ctrl/Cmd + F)

✅ Available


Simultaneous (multi-user) editing

⚠️ Limited

Must be added using Synapse Teams

Real-time user presence indicators

🚧 Under Construction / Planned

Cannot see if others are editing

Version history / rollback in UI

🚧 Under Construction / Planned

Previous grid versions not viewable

Shareable grid session URLs

⚠️ Limited

Synapse team must be configured

Built-in grid search/filter tools

🚧 Under Construction / Planned

Planned improvement over browser search

Self-serve schema selection for individuals

🚧 Under Construction / Planned

Standard schemas planned for non-consortium users

What happens if two users are on the same grid in different browser sessions?

At this time, Curator does not indicate whether another user is currently working in the same grid. The most recent saved session determines the current set of Synapse annotations, so we recommend coordinating with your team to avoid overlapping edits.

The value I need is not in the valid-value list. What should I do?

If the term you need does not appear in the valid-value list, go ahead and submit your data using the appropriate value. Then, notify your data manager or consortium contact that the term is missing from the metadata dictionary. They can review the request and update the list if the term is appropriate to add.

Is Synapse updated immediately as I edit the grid?

No. The grid functions as a draft workspace. Your edits are saved as you work, but metadata is not submitted to Synapse until you click Sync Changes.

Why do I only see the Tasks and Actions tab on certain Synapse projects?

The Tasks and Actions tab is hidden if there are no curation tasks.

If you believe you should see the Tasks and Actions tab, please contact the appropriate portal help desk:

If none of these apply, please submit questions to the Synapse Help Desk

Glossary

Curator
Synapse feature for completing metadata in a spreadsheet-style grid.

File-based metadata
Metadata attached directly to Synapse files (stored as annotations).

Metadata template
Configuration that defines which metadata fields/columns appear in a sheet.

Record set
Table containing a collection of related records (e.g., participants or samples).

Record-based metadata
Metadata stored in tables to describe study entities (e.g., participants, samples, specimens).

Schema

Definition that specifies the structure, fields, data types, relationships, and validation rules for metadata. JSON schema is a popular format.

Spreadsheet-style grid
Row-and-column interface for viewing and editing metadata.

Synapse Annotations
Public key–value metadata fields associated with Synapse entities.

Sync Changes
Button that commits grid edits to Synapse as submitted metadata.

Task
Guided action item for completing required or recommended project/data steps.

Tasks and Actions tab
Synapse project tab where tasks and related metadata sheets are accessed.