Inside Git: How It Works and the Role of the .git Folder
Most developers use Git every day:
git add .
git commit -m "message"
git push
But very few know how Git works internally. When something breaks (merge conflicts, a reset HEAD, a rebase gone wrong, or corrupted history), Git feels mysterious.
This article includes:
Core model of Git
Understand what the .git folder really is
Learn how Git stores data
Understand blob, tree, and commit objects
See what actually happens during git add and git commit
This article is not for beginners. If you have just started learning Git, read the Git for Beginners article first to learn the basics.
The core model of Git
First, understand that Git is not a file tracker in the usual sense. It is a snapshot database: each commit stores a snapshot of your whole project, and changes are worked out by comparing one snapshot with another.
What Git doesn't do:
Track each file's history separately
Store differences between versions as its data model
Record changes line by line
What Git does:
Stores snapshots of the project as immutable objects (they cannot be changed once written)
Connects them using hashes and pointers
Every commit records a snapshot of the whole project, but Git is smart enough to reuse the objects for unchanged files instead of storing them again. Differences are computed by comparing the previous snapshot with the new one.
What is the .git folder?
The .git folder is the entire Git repository database. When you initialize Git in a project with git init, Git creates a .git folder in the root of the project.
my-project
|--file1
|--file2
|--src/
|--.git/ -> Git repository database
.git folder structure
.git/
|-- objects/ -> Database: blobs, trees, commits
|-- refs/ -> Branch and tag pointers
|-- HEAD -> Pointer to the current branch or commit
|-- index -> Staging area: what the next commit will contain
|-- config -> Repository config
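To make the role of HEAD and refs/ more concrete, here is a minimal Python sketch that follows HEAD to the branch it names and then to the commit hash stored in that branch file. The function name resolve_head is made up for this article, the demo builds a throwaway folder that only imitates the .git layout, and the hash of forty "a" characters is a placeholder, not a real commit. It also assumes the branch is stored as a loose file under refs/ and not only in packed-refs.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def resolve_head(git_dir: Path) -> Tuple[Optional[str], Optional[str]]:
    """Return (ref_name, commit_hash) for HEAD.

    ref_name is None when HEAD is detached. commit_hash is None when the
    branch has no loose ref file yet (for example, before the first commit).
    """
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        # Detached HEAD: the file holds a commit hash directly.
        return None, head
    ref_name = head[len("ref: "):]            # e.g. "refs/heads/main"
    ref_file = git_dir / ref_name
    if ref_file.is_file():
        return ref_name, ref_file.read_text().strip()
    return ref_name, None


# Demo on a fake .git layout so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    (git_dir / "refs" / "heads" / "main").write_text("a" * 40 + "\n")  # placeholder hash
    result = resolve_head(git_dir)
    print(result)
```

Pointing resolve_head at the .git folder of a real project should give the current branch and the commit it points to, as long as the assumptions above hold.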
Understanding Git Objects
Git stores everything as one of three main object types:
Blob - Short for Binary Large Object. A blob stores only the content of a file. It does not store the file name or metadata such as the creation date.
When you run a command like git add, Git calculates a hash of the file's content (plus a small header) and stores that content as a blob in the internal object database (located in the .git/objects directory).
message.txt
|
|-- Good Morning -> Good Evening (message changed)
Git does:
hash("Good Evening") -> e.g. 557db03...
A blob represents WHAT is in the file, not WHERE it is. If two files have the same content, Git stores only one blob.
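The hashing step can be reproduced in a few lines of Python. This is a sketch that assumes a repository using SHA-1 object IDs (the default); the function name blob_hash is made up for this article. Git hashes the word "blob", the content length, and a zero byte, followed by the content itself.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Object ID Git gives to a blob with this content (SHA-1 repositories)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


morning = blob_hash(b"Good Morning\n")
evening = blob_hash(b"Good Evening\n")
copy_of_evening = blob_hash(b"Good Evening\n")

print(morning != evening)           # True: different content, different blob
print(evening == copy_of_evening)   # True: same content, same blob
print(blob_hash(b""))               # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The file name never enters the hash, which is why two files with identical content end up sharing one blob. The result should match what git hash-object prints for the same bytes.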
Trees - A tree represents a directory (folder). A tree does not contain file content. Instead, it stores file names and folder names, each with a pointer to the blob or subtree it refers to.
If your folder structure looks like this:
project
|-- file1.txt
|-- file2.txt
|-- folder1
|   |-- file3.txt
|-- .git
Git creates a structure like the one below:
Tree (root)
|-- file1.txt -> blob
|-- file2.txt -> blob
|-- folder1 -> Tree
    |-- file3.txt
Commits - A commit represents a snapshot of the entire project. A commit stores a pointer to the root tree, pointers to its parent commits, and its metadata (author, committer, date, commit message).
Commit
|
|
Tree (root)
|-- file1.txt -> blob
|-- file2.txt -> blob
|-- folder1 -> Tree
    |-- file3.txt
So, internally, if you change something inside folder1/file3.txt:
New blob created for file3.txt
New tree for folder1
New root tree
New commit created
Everything else is reused and remains unchanged.
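The reuse described above can be checked with a small in-memory model. The sketch below is an illustration under stated assumptions, not Git's own code: the helper names put, put_tree, and snapshot are made up, a Python dict stands in for .git/objects, the file contents are invented, and entries are sorted by plain name (Git's real ordering treats folders slightly differently). The commit object is left out because every commit is a new object anyway.

```python
import hashlib
from typing import Dict, Tuple

Store = Dict[str, bytes]   # object ID -> raw object, a stand-in for .git/objects


def put(store: Store, kind: str, payload: bytes) -> str:
    """Hash and store an object; identical objects land on the same key."""
    raw = kind.encode() + b" " + str(len(payload)).encode() + b"\0" + payload
    oid = hashlib.sha1(raw).hexdigest()
    store[oid] = raw
    return oid


def put_tree(store: Store, entries: Dict[str, Tuple[str, str]]) -> str:
    """entries maps a name to (mode, oid); mode '40000' marks a subtree."""
    payload = b""
    for name in sorted(entries):   # simplified ordering, fine for this example
        mode, oid = entries[name]
        payload += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
    return put(store, "tree", payload)


def snapshot(store: Store, file3_text: bytes) -> str:
    """Store the example project and return the ID of its root tree."""
    f1 = put(store, "blob", b"one\n")
    f2 = put(store, "blob", b"two\n")
    f3 = put(store, "blob", file3_text)
    folder1 = put_tree(store, {"file3.txt": ("100644", f3)})
    return put_tree(store, {
        "file1.txt": ("100644", f1),
        "file2.txt": ("100644", f2),
        "folder1": ("40000", folder1),
    })


store: Store = {}
root_v1 = snapshot(store, b"three\n")
before = set(store)
root_v2 = snapshot(store, b"three, edited\n")
new_objects = set(store) - before

print(len(before))        # 5: three blobs and two trees
print(len(new_objects))   # 3: new blob, new folder1 tree, new root tree
```

The blobs for file1.txt and file2.txt hash to the same IDs in both snapshots, so the second snapshot adds nothing for them.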
What Really Happens During git add and git commit?
Let’s understand this with an example:
You create a file named file1.txt:
Hello World
The file now exists but is untracked.
Then you run git add file1.txt. This adds a snapshot of the file's content to the staging area.
Internally, this is what happens:
Working Directory
|
(git add)
|
Read file content
|
|
Create blob object
|
|
Store in .git/objects
|
|
Update .git/index (staging area)
The index file acts as the staging area: it records which blob each staged path points to.
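The git add steps above can be sketched in Python as follows. This is a simplified model: add_file is a made-up name, the work happens in a temporary folder and not in a real repository, and a plain dict stands in for .git/index, which in real Git is a binary file with more fields. The on-disk layout of the blob (zlib-compressed, stored under the first two characters of its hash) follows how Git stores loose objects in SHA-1 repositories.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path
from typing import Dict


def add_file(git_dir: Path, index: Dict[str, str], path: str, content: bytes) -> str:
    """Mimic git add: write a loose blob object, then record it in the index."""
    raw = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(raw).hexdigest()

    # Loose objects live at objects/<first 2 hex chars>/<remaining 38>.
    obj_path = git_dir / "objects" / oid[:2] / oid[2:]
    if not obj_path.exists():               # identical content is stored once
        obj_path.parent.mkdir(parents=True, exist_ok=True)
        obj_path.write_bytes(zlib.compress(raw))

    index[path] = oid                       # stand-in for the binary .git/index
    return oid


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    index: Dict[str, str] = {}
    oid = add_file(git_dir, index, "file1.txt", b"Hello World\n")
    stored = zlib.decompress((git_dir / "objects" / oid[:2] / oid[2:]).read_bytes())
    print(index)     # {'file1.txt': <40-character object ID>}
    print(stored)    # b'blob 12\x00Hello World\n'
```

Nothing is committed at this point. The blob already sits in the object database, and the index only remembers that file1.txt should point to it in the next commit.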
Then you run the command git commit -m "message"
git commit -m "message"
|
|
Reads the staging area
|
|
Creates tree objects
|
|
Creates a commit object
|
|
Updates the branch pointer
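The last two steps of that flow, creating the commit object and moving the branch pointer, can be sketched like this. The names write_object and commit and the author line are placeholders for illustration. The sketch assumes HEAD points at a branch (not detached), runs in a temporary folder, and uses an empty tree so that the tree-building step can be skipped.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path

WHO = "Example Author <author@example.com> 0 +0000"   # placeholder identity and date


def write_object(git_dir: Path, kind: str, payload: bytes) -> str:
    """Store a loose object and return its ID."""
    raw = kind.encode() + b" " + str(len(payload)).encode() + b"\0" + payload
    oid = hashlib.sha1(raw).hexdigest()
    path = git_dir / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(raw))
    return oid


def commit(git_dir: Path, tree_oid: str, message: str) -> str:
    """Create a commit for tree_oid and move the current branch to it."""
    head = (git_dir / "HEAD").read_text().strip()
    ref_file = git_dir / head[len("ref: "):]          # assumes HEAD is not detached
    parent = ref_file.read_text().strip() if ref_file.exists() else None

    lines = [f"tree {tree_oid}"]
    if parent:
        lines.append(f"parent {parent}")              # link to the previous commit
    lines += [f"author {WHO}", f"committer {WHO}", "", message, ""]
    oid = write_object(git_dir, "commit", "\n".join(lines).encode())

    ref_file.parent.mkdir(parents=True, exist_ok=True)
    ref_file.write_text(oid + "\n")                   # the branch pointer moves here
    return oid


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    tree = write_object(git_dir, "tree", b"")         # empty tree keeps the demo short
    first = commit(git_dir, tree, "first")
    second = commit(git_dir, tree, "second")
    branch_tip = (git_dir / "refs" / "heads" / "main").read_text().strip()
    print(branch_tip == second)                       # True: main points at the newest commit
```

HEAD itself never changes during a normal commit. Only the branch file it names is rewritten, which is why a branch is often described as a movable pointer.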
Conclusion
Git is:
A content-addressed object database
That stores snapshots
Uses hashes for identity & integrity
And uses pointers for branches and HEAD




