Introductory Notes on the Git
Source Control Management System
Ric Holt, 8 Oct 2009
Git is an SCM (Source Control
Management System)
• Management of changes to documents, programs,
and other information stored as computer files.
• Its first uses were for Linux and for Git itself.
• It is open source and free
Torvalds and “Git”
• Written by Linus Torvalds -- Mr. Linux
• What does the word “git” mean?
– Nothing?
– “A person who is deemed to be despicable or
contemptible”?
• From http://www.google.ca/search?hl=en&client=firefox-a&rls=org.mozilla:en-US:official&hs=Ll1&defl=en&q=define:git&ei=wi7NSofxA4rflAevs-TgBQ&sa=X&oi=glossary_definition&ct=title
Git: A Fast Version Control
System
• Git
– Is distributed
– Has no master copy
– Has fast merges
– Is controversial
– Scales up
– Has convenient tools that are still being built
– Safeguards against corruption
Bibliography
• Overview of Git (1-hour video), by Randal
Schwartz, 2007, Google talk, no diagrams
http://www.youtube.com/watch?v=8dhZ9BXQgc4
• Tv's cobweb: Git for Computer Scientists (nice
diagrams)
http://eagain.net/articles/git-for-computer-scientists/
• A tutorial introduction to Git (key example
uses of commands)
http://www.kernel.org/pub/software/scm/git/docs/v1.2.6/tutorial.html
UW CS746 Seminar Course on
Software Architecture
• Studying Git architecture (Fall 2009)
• Suggested/clarified various ideas given here
Fundamental Concepts of
Source Control Management
• Source control management = SCM. Records
information about an evolving software project.
Key example: CVS.
• Project (set of files and directories, typically
comprising the source files of a software system
under development, but could be any other
“content”)
• Version (one instance of a project) (called a
commit in Git)
More Fundamental Concepts
of Source Control Management
• Branch. A sequence of versions, each one
evolved from the previous one
• To branch. Split development into two parallel
branches.
• To merge. Combine two branches into a single
branch
• Repository. In this case, a specialized database to
store an evolving project with its branches
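The branch and merge concepts above can be sketched in a few lines of Python. This is only an illustrative model, not how Git is implemented: the `Version` class, the `history` helper, and the version names are all hypothetical. A branch is read here as the chain of versions reached by walking back from its newest version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One version of the project (hypothetical model, not Git's format)."""
    name: str
    parents: tuple = ()  # zero, one, or (after a merge) several earlier versions


def history(tip):
    """Walk back from a branch tip along first parents, newest first."""
    names = []
    current = tip
    while current is not None:
        names.append(current.name)
        current = current.parents[0] if current.parents else None
    return names


v1 = Version("v1")
v2 = Version("v2", (v1,))

# To branch: two versions evolve from the same predecessor.
main_tip = Version("v3-main", (v2,))
feature_tip = Version("v3-feature", (v2,))

# To merge: one new version that has both branch tips as parents.
merged = Version("v4-merged", (main_tip, feature_tip))

print(history(feature_tip))  # ['v3-feature', 'v2', 'v1']
print(history(merged))       # ['v4-merged', 'v3-main', 'v2', 'v1']
```

Both branches share v1 and v2, which is why a repository can store the common history once.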
Git’s Distributed Architecture: P2P
[Figure: repositories R1 to R5 connected peer to peer. Ovals are servers; boxes are individual repositories.]
System of distributed repositories R1, R2, ...
Mostly Git repositories, but can be CVS etc.
Architecture of Git Individual
Repository
• Object storage. Sometimes called simply a
repository. Stores a representation of the versions
of the project (in a directory called .git)
• Git’s Index. Caches objects from the work space that
have changed (added to the index), and that will be
stored in the repository with the next commit
• Work space. User’s sandbox (ordinary files and
directories) for the active version
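The three parts described above can be imitated with a toy Python class. This is a rough sketch under loud assumptions: `ToyRepository` and its attributes are hypothetical names, files are held in dictionaries rather than on disk, and object keys are a plain SHA-1 of the content, which is simpler than what Git really stores.

```python
import hashlib


class ToyRepository:
    """Hypothetical in-memory model of work space, index, and object storage."""

    def __init__(self):
        self.work_space = {}    # path -> bytes: the user's sandbox
        self.index = {}         # path -> object key staged for the next commit
        self.object_store = {}  # object key -> bytes
        self.commits = []       # each commit is a snapshot: path -> object key

    def _store(self, content):
        # Simplification: key is SHA-1 of the raw content only.
        key = hashlib.sha1(content).hexdigest()
        self.object_store[key] = content
        return key

    def add(self, path):
        """Record the current work-space content of `path` in the index."""
        self.index[path] = self._store(self.work_space[path])

    def commit(self):
        """Freeze whatever the index currently lists as a new version."""
        snapshot = dict(self.index)
        self.commits.append(snapshot)
        return snapshot


repo = ToyRepository()
repo.work_space["README"] = b"first draft\n"
repo.add("README")
repo.work_space["README"] = b"second draft\n"  # edited again, but not added
snapshot = repo.commit()
print(repo.object_store[snapshot["README"]])  # b'first draft\n'
```

The commit holds the first draft because only added content reaches the index; the later edit stays in the work space until it is added too.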
Architecture of Git Individual
Repository
[Figure: one repository. Working tree (sandbox) -> add -> Index (cache) -> commit -> Object store. Pull and push connect it to more repositories.]
Data Structure of Repository:
ERD Defines Allowed DAGs
[Figure: entity-relationship diagram with three node types: commit (version), tree (tree of folders), and blob (file).]
• Each commit (version) is based on zero or more previous versions
• Each folder contains other folders and files (blobs)
• Generally, committed structure is immutable
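The three node types and their allowed relationships can be written down as Python dataclasses. This is a sketch for illustration only; the class and field names are assumptions, and Git's real on-disk objects carry more information. Marking the classes `frozen=True` mirrors the point that committed structure is immutable.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple, Union


@dataclass(frozen=True)
class Blob:
    data: bytes  # contents of one file


@dataclass(frozen=True)
class Tree:
    # A folder: named entries that are either sub-folders or files.
    entries: Tuple[Tuple[str, Union["Tree", Blob]], ...]


@dataclass(frozen=True)
class Commit:
    tree: Tree                         # the top-level folder of this version
    parents: Tuple["Commit", ...] = () # zero or more previous versions
    message: str = ""


readme = Blob(b"hello\n")
src = Tree((("main.c", Blob(b"int main(void) { return 0; }\n")),))
root = Tree((("README", readme), ("src", src)))

first = Commit(root, (), "first version")
second = Commit(root, (first,), "second version")  # reuses the same tree

try:
    first.message = "edited"  # attempts to change a committed node
except FrozenInstanceError:
    immutable = True
else:
    immutable = False

print(immutable)  # True
```

Because a commit can only point at nodes that already exist and cannot be changed afterwards, the resulting graph has no cycles, which is what makes it a DAG.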
Naming Nodes by Their Hashes
• The name (or key) of a node is the hash
(SHA1) of its contents.
• A hash can be used as a “pointer” to locate
its content
• Identical files have the same hash and are
represented by a single blob
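A short Python sketch shows the naming scheme at work. One detail beyond the slide: for a blob, Git hashes a small header (`blob`, the content length, and a NUL byte) followed by the content, not the bare content. The `store` dictionary and the `put` helper are hypothetical stand-ins for the object storage.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """SHA-1 name of a blob: hash of 'blob <length>\\0' followed by the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


store = {}  # hypothetical object store: hash -> content


def put(content: bytes) -> str:
    """Store content under its hash and return the hash as a 'pointer'."""
    key = git_blob_hash(content)
    store[key] = content
    return key


a = put(b"hello world\n")
b = put(b"hello world\n")  # identical file, stored a second time

print(a == b, len(store))  # True 1
print(store[a])            # b'hello world\n'
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The last line is the name Git gives to the empty file, so it can be compared against `git hash-object` run on an empty file.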
Layered Structure of
Implementation of Repository
• GUI
• Porcelain (High Level Operations)
• Plumbing (Low Level Operations)
Operations are separate executables (not API)
Key Git Operations
1) Init. Create an empty Git repository
2) Clone. Copy a repository into a new directory. After cloning, edit, create and remove files for new version
3) Add. Add file contents from work space to the index. (The files are edited locally)
4) Remove = rm. Remove files from work space and from the index
5) Commit. Store the changes (that are added) to the repository, using the index. Completes this version.
6) Branch. Create (or delete) a branch
7) Merge. Join two or more branches
8) Rebase. Combine/restructure a set of commits to simplify them
9) Checkout. Check out files etc. from a commit, and switch work space to that new branch
10) Fetch. Download objects and refs from another repository
11) Pull. Fetch from and merge with another repository or a local branch
12) Push. Update remote refs (in another repo) along with associated objects
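A few of these operations (init, add, commit) can be tried from Python by calling the `git` executable, which fits the idea that operations are separate programs. This is a minimal sketch that assumes `git` is installed and on the PATH; the file name, commit message, and the throwaway user identity are hypothetical, and everything happens in a temporary directory.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_git(args, cwd):
    """Run one git command in `cwd` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def demo():
    """Init a repository, add one file, commit it, and return the log."""
    if shutil.which("git") is None:
        return None  # git is not installed; nothing to demonstrate
    with tempfile.TemporaryDirectory() as tmp:
        run_git(["init"], tmp)                          # 1) Init
        Path(tmp, "hello.txt").write_text("hello\n")    # edit in the work space
        run_git(["add", "hello.txt"], tmp)              # 3) Add
        run_git(                                        # 5) Commit
            [
                "-c", "user.name=Example User",         # throwaway identity
                "-c", "user.email=user@example.com",
                "-c", "commit.gpgsign=false",
                "commit", "-m", "first version",
            ],
            tmp,
        )
        return run_git(["log", "--oneline"], tmp)


log = demo()
print(log)
```

The log should contain a single line ending in "first version"; the abbreviated hash in front of it differs from run to run because the commit records a timestamp.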