directory. Here are all the objects in the example directory now,
commented with what they store:
$ find .git/objects -type f
.git/objects/01/55eb4229851634a0f03eb265b69f5a2d56f341 # tree 2
.git/objects/1a/410efbd13591db07496601ebc7a059dd55cfe9 # commit 3
.git/objects/1f/7a7a472abf3dd9643fd615f6da379c4acb3e3a # test.txt v2
.git/objects/3c/4e9cd789d88d8d89c1073707c3585e41b0e614 # tree 3
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30 # test.txt v1
.git/objects/ca/c0cab538b970a37ea1e769cbbde608743bc96d # commit 2
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4 # 'test content'
.git/objects/d8/329fc1cc938780ffdd9f94e0d364e0ea74f579 # tree 1
.git/objects/fa/49b077972391ad58037050f2a75f74e3671e92 # new.txt
.git/objects/fd/f4fc3344e67ab068f836878b6c4951e3b15f3d # commit 1
If you follow all the internal pointers, you get an object graph something like Figure
9.3.
Figure 9.3: All the objects in your Git directory
9.2.3 Object Storage
I mentioned earlier that a header is stored with the content. Let’s take a minute to look
at how Git stores its objects. You’ll see how to store a blob object (in this case, the
string “what is up, doc?”) interactively in the Ruby scripting language. You can start
up interactive Ruby mode with the irb command:
$ irb
>> content = "what is up, doc?"
=> "what is up, doc?"
Git constructs a header that starts with the type of the object, in this case a blob.
Then, it adds a space followed by the size of the content and finally a null byte:
>> header = "blob #{content.length}\0"
=> "blob 16\000"
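The header rule can be wrapped in a small helper. This is a minimal sketch, not Git's own code, and the name object_header is hypothetical. It uses bytesize rather than length because the size field counts bytes; in Ruby 1.9 and later, length counts characters, and the two differ for non-ASCII content.

```ruby
# Hypothetical helper (not part of Git): builds the header for an object.
def object_header(type, content)
  # The size field counts bytes, so use bytesize instead of length.
  "#{type} #{content.bytesize}\0"
end

ascii  = "what is up, doc?"
accent = "h\u00e9llo" # 5 characters, but 6 bytes in UTF-8

puts object_header("blob", ascii).inspect
puts accent.length   # character count
puts accent.bytesize # byte count, which is what the header needs
```

For the ASCII string used in this section, length and bytesize agree, so the header is the same either way.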
Git concatenates the header and the original content and then calculates the SHA-1
checksum of that new content. You can calculate the SHA-1 value of a string in Ruby by
including the SHA1 digest library with the require command and then calling
Digest::SHA1.hexdigest() with the string:
>> store = header + content
=> "blob 16\000what is up, doc?"
>> require 'digest/sha1'
=> true
>> sha1 = Digest::SHA1.hexdigest(store)
=> "bd9dbf5aae1a3862dd1526723246b20206e5fc37"
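The header and checksum steps can be combined into one function. The sketch below is illustrative only; blob_id is a hypothetical name, not something Git or Ruby provides.

```ruby
require 'digest/sha1'

# Hypothetical helper (not part of Git): returns the SHA-1 that names a
# blob holding the given content.
def blob_id(content)
  store = "blob #{content.bytesize}\0" + content
  Digest::SHA1.hexdigest(store)
end

puts blob_id("what is up, doc?")
# prints bd9dbf5aae1a3862dd1526723246b20206e5fc37, matching the irb session
```

If Git is installed, running printf 'what is up, doc?' | git hash-object --stdin should print the same value, which is a convenient cross-check.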
Git compresses the new content with zlib, which you can do in Ruby with the zlib
library. First, you need to require the library and then run Zlib::Deflate.deflate()
on the content:
>> require 'zlib'
=> true
>> zlib_content = Zlib::Deflate.deflate(store)
=> "x\234K\312\311OR04c(\317H,Q\310,V(-\320QH\311O\266\a\000_\034\a\235"
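The exact compressed bytes can vary with the zlib version and compression level, so comparing them against the string printed here is fragile. A more dependable check, sketched below, is to inflate the result and confirm that it matches the original header plus content.

```ruby
require 'zlib'

store = "blob 16\0what is up, doc?"
zlib_content = Zlib::Deflate.deflate(store)

# Inflating must give back exactly the header plus the content.
restored = Zlib::Inflate.inflate(zlib_content)
puts restored == store

# The first null byte separates the header from the content.
header, body = restored.split("\0", 2)
puts header
puts body
```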
Finally, you’ll write your zlib-deflated content to an object on disk. You’ll determine
the path of the object you want to write out (the first two characters of the SHA-1
value being the subdirectory name, and the last 38 characters being the filename within
that directory). In Ruby, you can use the FileUtils.mkdir_p() function to create the
subdirectory if it doesn’t exist. Then, open the file with File.open() and write out the
previously zlib-compressed content to the file with a write() call on the resulting file
handle:
>> path = '.git/objects/' + sha1[0,2] + '/' + sha1[2,38]
=> ".git/objects/bd/9dbf5aae1a3862dd1526723246b20206e5fc37"
>> require 'fileutils'
=> true
>> FileUtils.mkdir_p(File.dirname(path))
=> ".git/objects/bd"
>> File.open(path, 'w') { |f| f.write zlib_content }
=> 32
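The whole sequence (header, checksum, compression, path, write) fits in one short function. The following is a sketch under stated assumptions rather than a description of how Git itself writes objects: write_loose_object and its parameters are hypothetical names, and the example writes into a temporary directory so that it cannot touch a real repository.

```ruby
require 'digest/sha1'
require 'zlib'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not part of Git): stores content as an object file
# under objects_dir and returns [sha1, path].
def write_loose_object(objects_dir, type, content)
  store = "#{type} #{content.bytesize}\0" + content
  sha1  = Digest::SHA1.hexdigest(store)
  # First two hex characters name the subdirectory, the other 38 the file.
  path  = File.join(objects_dir, sha1[0, 2], sha1[2, 38])

  FileUtils.mkdir_p(File.dirname(path))
  # binwrite avoids newline translation on platforms that distinguish
  # text and binary file modes.
  File.binwrite(path, Zlib::Deflate.deflate(store))
  [sha1, path]
end

# Try it in a throwaway directory rather than a real repository.
Dir.mktmpdir do |dir|
  sha1, path = write_loose_object(File.join(dir, 'objects'), 'blob', 'what is up, doc?')
  puts sha1
  puts File.exist?(path)
end
```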
That’s it: you’ve created a valid Git blob object. All Git objects are stored the same
way, just with different types. Instead of the string blob, the header will begin with
commit or tree. Also, although the blob content can be nearly anything, the commit and
tree content are very specifically formatted.
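Because every object shares this layout, reading one back is the same procedure in reverse: inflate the file, split at the first null byte, and take the type and size from the header. The sketch below assumes an object file written as described in this section; read_loose_object is a hypothetical name, and it makes no attempt to parse commit or tree content.

```ruby
require 'zlib'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not part of Git): reads an object file and returns
# [type, size, content].
def read_loose_object(path)
  raw = Zlib::Inflate.inflate(File.binread(path))
  header, content = raw.split("\0", 2)
  type, size = header.split(' ', 2)
  [type, size.to_i, content]
end

# Demonstrate on a blob written to a throwaway directory.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'bd', '9dbf5aae1a3862dd1526723246b20206e5fc37')
  FileUtils.mkdir_p(File.dirname(path))
  File.binwrite(path, Zlib::Deflate.deflate("blob 16\0what is up, doc?"))

  type, size, content = read_loose_object(path)
  puts type
  puts size
  puts content
end
```

For a commit or tree object, the same function would report that type in the first element and hand back the formatted content unparsed.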

