[Insight-developers] Tarball of external data files

Brad King brad.king at kitware.com
Mon Nov 28 12:07:22 EST 2011


On 11/28/2011 11:48 AM, Cory Quammen wrote:
> While waiting for the completion of 'make ITKData' run for the first
> time on a new machine, I had time to ponder how to speed up the
> initial download of ITK's external test data. My solution was to tar
> up the external data directory that already had the complete data on
> one machine, transfer it to my new machine, untar it, set the
> ExternalData_OBJECT_STORES to this directory, and voila, no need to
> wait the 20-30 minutes that the usual one-by-one download procedure
> takes when you invoke 'make ITKData' for the first time.

Nice.  I've been meaning to work on a tool like this but never had time.

FYI, the ITK release tarballs will come with a .ExternalData directory
containing all the objects referenced by content links in the tarballs.
That way release builds won't need to download anything.  The script
that creates the tarballs is here:

   http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/Maintenance/SourceTarball.bash;hb=v4.0rc03

Returning to your topic, what we need is a more efficient object
transport for "make ITKData" to use.  Git uses "packs" to achieve the
same thing, but it can efficiently compute which objects the client
needs and produce a custom pack on the server because it knows which
commit history the client is missing.  Our objects are all independent,
so the server has no such history to work from.
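
Since there is no history to walk, the client would have to work out
the missing set itself.  A minimal Python sketch of that step, assuming
the object store keeps one file per object under a per-algorithm
subdirectory such as MD5/<hash> (the layout, the function name
missing_objects, and its parameters are assumptions for illustration,
not part of ExternalData):

```python
import os


def missing_objects(referenced, store_dir, algo="MD5"):
    """Return the referenced hashes that have no file in store_dir/algo/.

    Assumed layout (hypothetical): <store_dir>/<algo>/<hash>, one file
    per object.  `referenced` is any iterable of hash strings, e.g. the
    hashes read from the content links in the source tree.
    """
    algo_dir = os.path.join(store_dir, algo)
    # A store that does not exist yet simply has no objects.
    present = set(os.listdir(algo_dir)) if os.path.isdir(algo_dir) else set()
    return sorted(set(referenced) - present)
```

The result is a flat list of hashes, which is all a server would need
in order to build a custom bundle for that client.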

A tarball of objects for getting started is a simple approximation of
Git's "pack" concept.  Further progress would require client and server
protocol implementations to download a bunch of objects simultaneously
from the server and then extract them on the client.
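
The tarball approach can be sketched with Python's standard tarfile
module.  This is only an illustration of the idea, not an existing
tool; the names pack_store and unpack_store are hypothetical, and the
store directory is assumed to be whatever ExternalData_OBJECT_STORES
points at:

```python
import os
import tarfile


def pack_store(store_dir, tarball_path):
    """Bundle everything under store_dir into a gzip-compressed tarball."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        # Add each top-level entry under its own name so the archive
        # paths are relative to the store root.
        for name in sorted(os.listdir(store_dir)):
            tar.add(os.path.join(store_dir, name), arcname=name)


def unpack_store(tarball_path, store_dir):
    """Extract a tarball made by pack_store into store_dir."""
    os.makedirs(store_dir, exist_ok=True)
    with tarfile.open(tarball_path, "r:gz") as tar:
        # Only unpack tarballs from a source you trust.
        tar.extractall(store_dir)
```

Because objects are named by their content hash, unpacking over a
store that already holds some of them only rewrites files with
identical content.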

-Brad

