Upload Binary Data

Motivation

Since every local Git repository contains a copy of the entire project history, it is important to avoid adding large binary files directly to the repository. Large binary files added and removed throughout a project’s history bloat the repository: it takes up more disk space and requires more time and bandwidth to download.

The solution adopted by this project is to store binary files, such as images, in a separate location outside the Git repository and download them at build time with CMake.

A “content link” file contains an identifying SHA512 hash. The content link is stored in the Git repository at the path where the file would exist, but with a “.sha512” extension appended to the file name. CMake finds these content link files at build time, downloads the files they reference from a list of server resources, and creates symlinks or copies of the original files at the corresponding location in the build tree.
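To make the naming and hashing convention concrete, the following is a minimal Python sketch that computes the SHA512 hash of a local file and writes it to a sibling file with “.sha512” appended. The function name write_content_link is hypothetical, and the sketch assumes the content link holds only the hexadecimal digest followed by a newline. It may be useful for checking that a content link matches a local file.

```python
import hashlib
from pathlib import Path


def write_content_link(data_file: Path) -> Path:
    """Write `<data_file>.sha512` next to data_file and return its path."""
    digest = hashlib.sha512()
    with data_file.open("rb") as stream:
        # Hash in 1 MiB chunks so large binary files are not read into memory at once.
        for chunk in iter(lambda: stream.read(1 << 20), b""):
            digest.update(chunk)
    # Append the extension to the full file name: cthead1.png -> cthead1.png.sha512
    link = data_file.with_name(data_file.name + ".sha512")
    # Assumption: the content link holds only the hex digest and a trailing newline.
    link.write_text(digest.hexdigest() + "\n")
    return link
```

For example, write_content_link(Path("/tmp/cthead1.png")) would create /tmp/cthead1.png.sha512 containing a 128-character hexadecimal string.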

Prerequisites

The data.kitware.com server is an ITK community resource where any community member can upload binary data files. There are two methods available to upload data files:

  1. The Girder web interface.

  2. The girder-cli command line executable that comes with the girder-client Python package.

Before uploading data, please visit data.kitware.com and register for an account.

Once files have been uploaded to your account, they will be publicly available and accessible since data is content addressed. At release time, the release manager will upload and archive repository data references in the ITK collection and other redundant storage locations.

Upload Via the Web Interface

Log in welcome page

After logging in, you will be presented with the welcome page. Click on the personal data space link.

Personal data space

Next, select the Public folder of your personal data space.

Public folder

Click the green upload button.

The Upload files dialog

Click Browse or drop files to select the files to upload.

The Upload files dialog with files selected

Click Start Upload to upload the file to the server.

Next, proceed to Download the Content Link.

Upload Via Python Script

girder-cli, a Python script to upload files from the command line, is available with the girder-client Python package. To install it:

python -m pip install girder-client

To upload files with the girder-cli script, we need to obtain an API key and a parent folder id from the web interface.

My account link

After logging in, select My account from the user drop down.

API key tab

Next, select the API keys tab.

Create new key

Create a new API key if one is not available.

Create new key

Click the show link to reveal the key, which can then be copied into the command line.

My Folders link

Next, select My Folders from the user drop down.

Personal data space

Next, select the Public folder of your personal data space.

Public folder information

Click the i button for information about the folder.

Public folder information modal

The Unique ID can be copied into the command line.

Use both the API key and the folder ID when calling girder-cli. For example:

girder-cli \
  --api-key 12345ALongSetOfCharactersAndNumbers \
  --api-url https://data.kitware.com/api/v1 \
  upload \
  58becaee8d777f0aefede556 \
  /tmp/cthead1.png
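When uploading several files, it can be convenient to wrap this command in a small script. The sketch below assembles the same girder-cli arguments shown above and runs them with subprocess. The helper names build_upload_command and upload are hypothetical, and reading the key from an environment variable named GIRDER_API_KEY is this sketch's own choice, made so that the key does not have to be typed on the command line.

```python
import os
import subprocess

API_URL = "https://data.kitware.com/api/v1"


def build_upload_command(api_key, folder_id, file_path):
    """Return the girder-cli argument list for uploading one file to a folder."""
    return [
        "girder-cli",
        "--api-key", api_key,
        "--api-url", API_URL,
        "upload",
        folder_id,
        file_path,
    ]


def upload(folder_id, file_path):
    """Run girder-cli; raises CalledProcessError if the upload fails."""
    # Assumption: the API key was exported beforehand as GIRDER_API_KEY.
    api_key = os.environ["GIRDER_API_KEY"]
    subprocess.run(build_upload_command(api_key, folder_id, file_path), check=True)
```

Calling upload("58becaee8d777f0aefede556", "/tmp/cthead1.png") would then run the same command as the example above, with the folder ID replaced by the Unique ID of your own Public folder.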

Next, proceed to Download the Content Link.