On-site Data Upload

Uploading data (simple uploads)

A project can upload files for its project members. Each request uploads one file for one project member, and files must be uploaded sequentially. Large files may time out during upload; for those, use the large file upload API instead.

The API endpoint for simple file uploads is /api/direct-sharing/project/files/upload/.

File upload format

A POST API request would contain the following:

  • As a querystring parameter or in the "Authorization" header:
    • access_token=<MASTER_ACCESS_TOKEN>
    This identifies and authorizes your project.
  • As "multipart/form-data" fields:
    • project_member_id: Project member ID (string)
    • data_file: File
    • metadata: File metadata (JSON formatted string, see below for format)

See below for example API calls using httpie on the command line and raw request examples.

About the master access token

Each project has a "master access token" used for API calls. This token is a password for your project.

To find the token, go to your project management page and click on your project's name. The master access token should be listed in your project's details.

Keep this token PRIVATE.

  • Do NOT publicly share this token.
  • Do NOT share this token in an unsecured manner.
  • NEVER put this token into a git repository.

This token is used to authorize the following:

  • API access to any private data shared with your project
  • sending messages to project members via API
  • uploading data for project members via API

If you ever believe the security of this token may have been compromised, contact us at support@openhumans.org and we'll reset it to a new value.

When using this token in programs, we recommend you do NOT store it. You should enter the token each time you run your software.
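One way to follow this recommendation in a shell script is to read the token at run time instead of writing it into the script. The sketch below is only an illustration: the function name read_token is hypothetical, and it assumes a POSIX shell with stty available.

```shell
#!/bin/sh
# Hypothetical helper: read the master access token from standard input
# at run time so it never has to be written into the script or a file.
read_token() {
  if [ -t 0 ]; then
    # Interactive terminal: prompt on stderr and turn off echo while typing.
    printf 'Master access token: ' >&2
    stty -echo
    IFS= read -r token
    stty echo
    printf '\n' >&2
  else
    # Non-interactive input (e.g. piped from another program): read one line.
    IFS= read -r token
  fi
  [ -n "$token" ] || { echo 'No token entered' >&2; return 1; }
  printf '%s' "$token"
}

# Usage (commented out so that loading this file does not prompt):
# TOKEN=$(read_token) || exit 1
```

Because the prompt goes to stderr, only the token itself ends up in the TOKEN variable.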

If you want to have fully automated API transactions with Open Humans, you should use OAuth2 endpoints with user-specific access tokens.

Metadata format

The 'metadata' POST form field is a JSON object.

The following name/value items are always required in the metadata JSON:

tags
An array of strings. Tags should describe the file, e.g. its data type and format, and may be useful for automated file search on Open Humans.
description
A string. This should be a short string describing the file, 100 characters max.

Both items may be empty, although we hope you fill them in. For example, the following metadata JSON is valid:

{"tags": [], "description": ""}

Though we'd prefer to see some content, e.g.:

{
  "tags": ["survey", "diet", "csv"],
  "description": "Diet survey questions and responses.",
  "md5": "156da7fc980988c51682374436849943"
}
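Before uploading, it can help to check a metadata string locally against the rules above. The following is a minimal sketch, not part of the Open Humans API: the function name valid_metadata is hypothetical, and it assumes jq is installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the JSON on stdin is an object
# whose "tags" is an array of strings and whose "description" is a
# string of at most 100 characters. Assumes jq is installed.
valid_metadata() {
  jq -e '
    type == "object"
    and (.tags | type == "array" and all(.[]; type == "string"))
    and (.description | type == "string" and length <= 100)
  ' >/dev/null 2>&1
}

# Usage (commented out):
# echo "$METADATA" | valid_metadata || echo "metadata is not valid" >&2
```

Extra items such as "md5" are ignored by this check, so the longer example above passes as well.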

Recommended metadata items with reserved meanings -- use if appropriate.

md5
A string. This should be an md5 hash of the file, which can be used to check file integrity.
creation_date
A string. This should be an ISO 8601 formatted date or date + time, indicating when this file was created.
start_date
A string. This should be an ISO 8601 formatted date or date + time, and can be used to indicate the start of a time range (if this file represents data for a particular time range).
end_date
A string. This should be an ISO 8601 formatted date or date + time, and can be used to indicate the end of a time range (if this file represents data for a particular time range).
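The recommended items can be filled in automatically when building the metadata string. The sketch below is one possible approach under stated assumptions: build_metadata is a hypothetical name, jq 1.6 or later and GNU md5sum are assumed to be installed (macOS ships `md5 -q` instead), and the current UTC time stands in for the file's creation time.

```shell
#!/bin/sh
# Hypothetical helper: print metadata JSON for a file, filling in "md5"
# and "creation_date". Assumes jq >= 1.6 and GNU md5sum.
# usage: build_metadata FILE DESCRIPTION [TAG...]
build_metadata() {
  file=$1
  description=$2
  shift 2
  # Hash the file contents; reading from stdin keeps the name out of the output.
  md5=$(md5sum < "$file" | cut -d ' ' -f 1)
  # Current UTC time in ISO 8601, standing in for the file's creation time.
  created=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  # Let jq do the quoting so that any description or tag is escaped correctly.
  jq -c -n \
    --arg description "$description" \
    --arg md5 "$md5" \
    --arg created "$created" \
    '{tags: $ARGS.positional, description: $description,
      md5: $md5, creation_date: $created}' \
    --args "$@"
}

# Usage (commented out):
# METADATA=$(build_metadata ./test-file.txt "Diet survey questions and responses" survey diet csv)
```

Building the JSON with jq rather than by string concatenation avoids broken output when a description contains quotes.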

Example file upload API calls

Example for httpie:

#!/bin/sh

echo "test file data" > test-file.txt

TOKEN=example_token
URL="https://www.openhumans.org/api/direct-sharing/project/files/upload/?access_token=$TOKEN"
METADATA='{"tags": ["survey", "diet", "csv"], "description": "Diet survey questions and responses", "md5": "156da7fc980988c51682374436849943"}'

# quote the variables so the shell passes the URL and the JSON as single arguments
http --verbose --form POST "$URL" \
  project_member_id='12345678' \
  metadata="$METADATA" \
  data_file@./test-file.txt

Raw request example:

POST /api/direct-sharing/project/files/upload/?access_token=example_token HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 491
Content-Type: multipart/form-data; boundary=60b1416fed664815a28bf8be840458ae
Host: www.openhumans.org
User-Agent: HTTPie/0.9.3

--60b1416fed664815a28bf8be840458ae
Content-Disposition: form-data; name="project_member_id"

12345678
--60b1416fed664815a28bf8be840458ae
Content-Disposition: form-data; name="metadata"

{"tags": ["survey", "diet", "csv"], "description": "Diet survey questions and responses", "md5": "156da7fc980988c51682374436849943"}
--60b1416fed664815a28bf8be840458ae
Content-Disposition: form-data; name="data_file"; filename="test-file.txt"

test file data

--60b1416fed664815a28bf8be840458ae--

Uploading data (large files)

Uploading large files is similar to the simple file upload API described above but requires two additional calls. In return, this method places no limit on file size and allows uploads to be made in parallel.

The first API endpoint for large file uploads is /api/direct-sharing/project/files/upload/direct/.

File upload format

A POST API request would contain the following:

  • As a querystring parameter or in the "Authorization" header:
    • access_token=<MASTER_ACCESS_TOKEN>
    This identifies and authorizes your project.
  • As "multipart/form-data" fields:
    • project_member_id: Project member ID (string)
    • filename: The name of the file to upload
    • metadata: File metadata (JSON formatted string, see below for format)

In the response to this request you will receive a file ID and an upload URL. Upload your file with a PUT request to that URL, then call the completion endpoint to let Open Humans know you've finished uploading. Do not send a Content-Type header with the PUT request: the URL is pre-signed, and a Content-Type header will invalidate the signature.
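If the first call fails (for example, because of a bad token), the response will not contain a file ID or an upload URL, and the following steps would run with empty values. A minimal sketch of a guard is shown here; the function name upload_target is hypothetical, jq is assumed to be installed, and the field names "id" and "url" are the ones read in the httpie example in this section.

```shell
#!/bin/sh
# Hypothetical helper: read the first call's JSON response on stdin and
# print "ID URL" on one line. Fails (non-zero exit) if either field is
# missing or the input is not valid JSON. Assumes jq is installed.
upload_target() {
  jq -e -r 'select(.id != null and .url != null) | "\(.id) \(.url)"'
}

# Usage (commented out):
# target=$(echo "$JSON" | upload_target) || { echo "upload request failed" >&2; exit 1; }
# ID=${target%% *}    # text before the first space
# URL=${target#* }    # text after the first space
```

Splitting on the first space relies on the upload URL not containing a literal space, which holds for any properly encoded URL.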

The completion API endpoint for large file uploads is /api/direct-sharing/project/files/upload/complete/.

Completion format

A POST API request would contain the following:

  • As a querystring parameter or in the "Authorization" header:
    • access_token=<MASTER_ACCESS_TOKEN>
    This identifies and authorizes your project.
  • As "multipart/form-data" fields:
    • project_member_id: Project member ID (string)
    • file_id: The ID of the file from the first step

An example using httpie

TOKEN=
PROJECT_MEMBER_ID=

BASE_URL=https://www.openhumans.org

UPLOAD_URL=$BASE_URL/api/direct-sharing/project/files/upload/direct/?access_token=$TOKEN
COMPLETE_URL=$BASE_URL/api/direct-sharing/project/files/upload/complete/?access_token=$TOKEN

# get a file ID and an upload URL
JSON=$(http --form POST "$UPLOAD_URL" \
  project_member_id="$PROJECT_MEMBER_ID" \
  metadata='{"tags": ["survey", "diet", "csv"], "description": "Diet survey questions and responses"}' \
  filename="test-file.json")

# get the URL and the ID from the JSON response
# ($JSON is quoted so the shell does not split or glob the response text)
URL=$(echo "$JSON" | jq -r .url)
ID=$(echo "$JSON" | jq -r .id)

echo '{"testing": "just testing..."}' > test-file.json

# actually upload the file, ensuring that we don't send a Content-Type
http --verbose PUT "$URL" Content-Type: @./test-file.json

# notify Open Humans that we've completed the upload
http --verbose --form POST "$COMPLETE_URL" \
  project_member_id="$PROJECT_MEMBER_ID" \
  file_id="$ID"

Deleting files

The API endpoint for deleting files is:

/api/direct-sharing/project/files/delete/

A project can delete files that it has uploaded to member profiles. Deletion requests are made per project member, for individual files or all files.

File deletion format

There are three ways to delete files:

  • One file, by file ID (unique)
    File ID is available in the project's API for data access. (Projects implicitly have access to all data for their project via the direct sharing API.)
  • One or more files, by matching filename
    Filename is the file's basename, including extensions explicitly, e.g. 'surveydata.csv'. All exact filename matches for this file basename will be deleted, for this project member.
  • Delete all files for this project member
    All files associated with your project will be deleted for this member.

A POST API request would contain the following:

  • As a querystring parameter named access_token or in the "Authorization" header:
    • access_token=<MASTER_ACCESS_TOKEN>
    This identifies and authorizes your project.
  • A JSON object request body with the project member ID (string) and one of the following items: 'file_id' (integer), 'file_basename' (string), 'all_files' (boolean). See examples below for usage.

Example JSON

Delete file ID #12345 for member 12345678 for this project:

{"project_member_id": "12345678", "file_id": 12345}

Delete any files with the name 'foobar.txt' for member 12345678 for this project. Note, file extensions are included in the basename.

{"project_member_id": "12345678", "file_basename": "foobar.txt"}

Delete all files for project member 12345678 for this project:

{"project_member_id": "12345678", "all_files": true}
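The three request bodies above can also be generated rather than typed by hand, which keeps the types right (integer file ID, lowercase JSON boolean). This is a sketch only: the function name delete_body and its mode names are hypothetical, and jq is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: print a deletion request body as JSON.
# usage: delete_body MEMBER_ID id FILE_ID
#        delete_body MEMBER_ID name BASENAME
#        delete_body MEMBER_ID all
delete_body() {
  member=$1
  mode=$2
  case $mode in
    # --argjson passes the file ID as a JSON number, not a string.
    id)   jq -c -n --arg m "$member" --argjson v "$3" \
            '{project_member_id: $m, file_id: $v}' ;;
    name) jq -c -n --arg m "$member" --arg v "$3" \
            '{project_member_id: $m, file_basename: $v}' ;;
    all)  jq -c -n --arg m "$member" \
            '{project_member_id: $m, all_files: true}' ;;
    *)    echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Usage (commented out): httpie sends piped stdin as the request body.
# delete_body 12345678 all | http POST "$URL"
```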

Examples for httpie

#!/bin/sh

TOKEN=example_token
URL=https://www.openhumans.org/api/direct-sharing/project/files/delete/?access_token=$TOKEN

# deleting a file by its ID (:= sends the value as raw JSON, here an integer)
http POST "$URL" \
  project_member_id='12345678' \
  file_id:=12345

# deleting a file by its name
http POST "$URL" \
  project_member_id='12345678' \
  file_basename='foobar.txt'

# deleting all files for a project member (JSON booleans are lowercase)
http POST "$URL" \
  project_member_id='12345678' \
  all_files:=true

Raw HTTP request example

POST /api/direct-sharing/project/files/delete/?access_token=example_token HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 51
Content-Type: application/json
Host: www.openhumans.org
User-Agent: HTTPie/0.9.3

{
    "file_id": 12345,
    "project_member_id": "12345678"
}