Git/WorkflowPrimer

From KitwarePublic
< Git
Jump to navigationJump to search

This primer details a possible workflow for using git and ITK. There are many ways to use git and it is a very flexible tool. This page details my particular way of working with git, and is certainly not the last word. It is also a rambling collection of tips and experiences that I've picked up in the last few years of using git.

Basics

It's worth trying to explain some high-level git concepts. A feature (or a bug) that sets git apart from Subversion is it's distributed nature. In practice that means that ITK needs to "bless" a repo for it to be the "official" source of ITK. This has already been done.

Another git concept is that of a commit. git uses a SHA1 hash to uniquely identify a change set. The hash is (almost) guaranteed to be unique across your project, and even across all projects everywhere and for all time. Quoting from the excellent Pro Git book:

<quote> A lot of people become concerned at some point that they will, by random happenstance, have two objects in their repository that hash to the same SHA-1 value. What then?

If you do happen to commit an object that hashes to the same SHA-1 value as a previous object in your repository, GIt will see the previous object already in your Git database and assume it was already written. If you try to check out that object again at some point, you’ll always get the data of the first object.

However, you should be aware of how ridiculously unlikely this scenario is. The SHA-1 digest is 20 bytes or 160 bits. The number of randomly hashed objects needed to ensure a 50% probability of a single collision is about 2^80 (the formula for determining collision probability is p = (n(n-1)/2) * (1/2^160)). 2^80 is 1.2 x 10^24 or 1 million billion billion. That’s 1,200 times the number of grains of sand on the earth.

Here’s an example to give you an idea of what it would take to get a SHA-1 collision. If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (1 million Git objects) and pushing it into one enormous Git repository, it would take 5 years until that repository contained enough objects to have a 50% probability of a single SHA-1 object collision. A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night. </quote>