Exploring Git: From Blobs and Trees to Practical Workflows

How does Git store files and keep track of everything while staying efficient? What's inside the .git folder? Alfiya didi kick-started the session with these interactive questions.

One of the most interesting questions she asked was: Can Git store or commit an empty folder? We'll come back to this at the end :)

Git is a distributed version control tool that helps in managing code and promotes collaboration. It was started by Linus Torvalds, who built the first version of Git in just 10 days.

Before Git, Linus used another version control system called BitKeeper for Linux development. When its license was revoked, he decided to build his own system from scratch, which eventually became Git.

But what actually goes on under the hood? And why is Git so fast and efficient?

1. How does Git actually work?

Git doesn't actually track files the way I assumed it did. It tracks content.

At its core, Git has something called an object database, which is where it stores everything as compressed objects identified by hashes. Instead of tracking files directly, Git stores data based on content, meaning every piece of data is uniquely identified by its hash.

This database is what makes Git fast and efficient, because identical content is stored only once and reused wherever needed. Inside this database, Git keeps three main kinds of objects: blobs for file content, trees for structure, and commits for snapshots. They all point to each other, which is how Git keeps everything connected.

Git assigns every object a unique ID using SHA-1, based on what's inside it. So even the smallest change produces a completely different ID, and Git treats the result as a different object.
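
To make the "ID comes from the content" idea concrete, here is a minimal Python sketch. It assumes the commonly documented object format, where Git hashes a short header (the object type, the size, and a null byte) followed by the raw bytes. The function name git_object_id is my own, not something from Git.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 ID for an object of the given type and content."""
    # Assumed format: "<type> <size>\0" header followed by the raw content.
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    original = git_object_id("blob", b"hello world\n")
    changed = git_object_id("blob", b"hello world!\n")  # one extra character
    print(original)
    print(changed)
    # Hashing the same bytes again always gives the same ID.
    print(original == git_object_id("blob", b"hello world\n"))
```

If the assumption about the header holds, the first ID should match what git hash-object prints for a file containing the same bytes, while the one-character change gives an unrelated-looking ID.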

All of this is built using the same three core objects we talked about:

• Commit: it's not just a list of changes. It's a full snapshot of your whole project at a point in time, recorded through a tree object and linked to its parent commit(s). That's how Git builds history. It also stores metadata like:
→ Author
→ Committer
→ Commit message
→ Commit date

• Tree: A tree connects names to blobs and to other trees, essentially building your folder structure without actually "storing" folders. Each commit points to one top-level tree, and that tree represents the complete snapshot of your project at that time. For each entry, a tree keeps track of:
→ Filename
→ Object type
→ Object hash
→ Permissions

• Blob (Binary Large Object): a blob is just raw file content. It doesn't store the file name, the path, or anything extra.

The most interesting part of this session was how it answered the question: "why is Git efficient?"

If two files have the exact same content, Git doesn't store them twice. It stores one blob and lets every tree entry that needs it point to that same blob. That works because Git uses SHA-1, so the same content always produces the same hash, letting Git recognize it as the same blob. That means no unnecessary duplication, even across commits.

Same content → same blob → no duplication.
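
Here is a rough sketch of that deduplication. It is a toy in-memory store, not Git's real on-disk format: ToyObjectStore and snapshot are hypothetical names, and the "tree" is simplified to a plain mapping from file name to blob ID.

```python
import hashlib
import zlib


class ToyObjectStore:
    """A tiny in-memory stand-in for Git's object database (illustrative only)."""

    def __init__(self):
        self.objects = {}  # object ID -> compressed bytes

    def put(self, obj_type, content):
        data = f"{obj_type} {len(content)}".encode() + b"\x00" + content
        oid = hashlib.sha1(data).hexdigest()
        # Identical content lands on the same key, so nothing new is stored.
        self.objects.setdefault(oid, zlib.compress(data))
        return oid

    def get(self, oid):
        data = zlib.decompress(self.objects[oid])
        return data.split(b"\x00", 1)[1]  # drop the header, keep the content


def snapshot(store, files):
    """Simplified 'tree': map each file name to the ID of its blob."""
    return {name: store.put("blob", content) for name, content in sorted(files.items())}


store = ToyObjectStore()
tree = snapshot(store, {
    "README.md": b"# demo\n",
    "docs/README.md": b"# demo\n",  # same content under a different name
    "main.py": b"print('hi')\n",
})
print(len(tree), "names ->", len(store.objects), "stored blobs")
```

Three names end up pointing at only two stored blobs, because the two README files share one ID. The names live in the tree, which is why the blob itself never needs to know what it is called.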

Coming back to the question from the start - can Git store or commit an empty folder?

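Git tracks content (blobs) and the state of files, not folders on their own. A folder only shows up in Git as a tree listing the entries inside it, so an empty folder gives Git nothing to record or hash, which makes it effectively invisible to Git. That's why an empty folder cannot be committed unless you add a file inside it (a placeholder file named .gitkeep is a common convention).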
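
A small sketch of why the empty folder goes unnoticed, assuming a simplified model in which only file paths are ever recorded. The function trackable_paths is a made-up helper, and .gitkeep is only a naming convention, not a Git feature.

```python
import os
import tempfile


def trackable_paths(root):
    """List file paths under root, the way a file-only index would see them."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "empty_dir"))
    os.mkdir(os.path.join(root, "src"))
    with open(os.path.join(root, "src", "app.py"), "w") as fh:
        fh.write("print('hi')\n")

    before = trackable_paths(root)  # empty_dir contributes nothing

    # Add a placeholder file so the folder has something to record.
    open(os.path.join(root, "empty_dir", ".gitkeep"), "w").close()
    after = trackable_paths(root)

print(before)
print(after)
```

Before the placeholder exists, the listing contains only src/app.py. Once the folder holds a file, its path appears and the folder comes along with it.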

If you want an even deeper look at the three object types, I suggest this beautiful blog on Git internals: How Git Internally Works.

With this, Alfiya didi wrapped up her part of the session, and Chetan bhaiya stepped in to show how all of this actually works in practice.

Conclusion of Part 1

• Always try to understand how tools work under the hood, not just use commands.
• Theory + practical usage together give a complete understanding of any tech.
• Git uses SHA-1 hashes to deduplicate content and links everything together through interconnected snapshots, which is how it achieves its efficiency, integrity, and reliability under the hood.
• We have three types of objects in Git: blobs, trees, and commits. Git generates a key for each of them, a SHA-1 hash, and then stores the content compressed in the repository.
• Commits capture full snapshots, trees organize structure, and blobs store raw data.

2. Git in practical workflows

Most of us use Git and GitHub without really understanding what happens behind the scenes, or what advantages and risks come with it. Chetan bhaiya continued the session by explaining the practical side of Git and how the common commands we use every day actually work.

Chetan bhaiya simplified Git into a flow:

  Working directory → Staging area → Local repo → Remote (GitHub)

Your code starts in the working directory, where you make changes. When you use git add, you are not saving everything immediately but rather moving the changes you want into the staging area. Then git commit takes that staged version and stores it in your local repository as a proper snapshot. Finally, git push sends it to GitHub, so it is available remotely.
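
The flow above can be sketched as a toy model. This is only my mental picture written as Python, not how Git is implemented: ToyRepo and its attributes are invented names, and plain dictionaries stand in for hashed objects.

```python
class ToyRepo:
    """Toy model of the four areas; real Git stores hashed objects, not dicts."""

    def __init__(self):
        self.working = {}  # files as you edit them
        self.staging = {}  # what the next commit will contain
        self.local = []    # commits in the local repository
        self.remote = []   # commits the remote already has

    def add(self, path):
        # Copy one file's current content into the staging area.
        self.staging[path] = self.working[path]

    def commit(self, message):
        # Freeze the staged state as a snapshot; unstaged edits are left out.
        self.local.append({"message": message, "files": dict(self.staging)})

    def push(self):
        # Send the commits the remote does not have yet.
        self.remote.extend(self.local[len(self.remote):])


repo = ToyRepo()
repo.working["a.txt"] = "first draft"
repo.working["b.txt"] = "not ready yet"
repo.add("a.txt")            # only a.txt is staged
repo.commit("Add a.txt")
print(len(repo.local), len(repo.remote))
repo.push()
print(repo.remote[0]["files"])
```

Because b.txt was never staged, it stays out of the commit, and the remote only learns about the commit after push runs.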

Chetan bhaiya used visuals with nodes to explain the structure, where each node represents a commit, and from there he explained how branching actually works. That part made the whole Git tree feel much easier to imagine.

He also clarified how git fetch, git merge, and git pull are related. git fetch only downloads the changes from the remote repository, while git merge applies those changes to your current branch. git pull is essentially a combination of both: it fetches and then merges.
He also spoke about merge conflicts, which happen when Git cannot automatically decide how to combine changes. This usually occurs when the same part of a file is modified differently on the two sides being merged. In such cases, Git pauses and lets you manually resolve what the final version should look like.
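
To see when a merge can be automatic and when it cannot, here is a simplified three-way merge sketch. It compares whole files against a common base version, whereas real Git works on regions within a file, so treat it as an approximation. The names merge_file and merge are my own.

```python
def merge_file(base, ours, theirs):
    """Return (merged_content, conflict_flag) for one file; None means 'absent'."""
    if ours == theirs:
        return ours, False    # both sides agree
    if ours == base:
        return theirs, False  # only their side changed
    if theirs == base:
        return ours, False    # only our side changed
    return None, True         # both changed differently: needs a human


def merge(base, ours, theirs):
    """Merge two snapshots (path -> content) that share a common base snapshot."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        content, conflict = merge_file(base.get(path), ours.get(path), theirs.get(path))
        if conflict:
            conflicts.append(path)
        elif content is not None:
            merged[path] = content
    return merged, conflicts


base = {"app.py": "v1", "README": "hello"}
ours = {"app.py": "v1 + our fix", "README": "hello"}
theirs = {"app.py": "v1 + their fix", "README": "hello world"}

merged, conflicts = merge(base, ours, theirs)
print(merged)
print(conflicts)
```

README merges cleanly because only one side touched it. app.py was changed on both sides in different ways, so it is reported as a conflict instead of being guessed at.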

During the explanation of pull and push, someone jokingly said "force push", and Chetan bhaiya immediately warned us to avoid force pushing wherever possible and to be very careful if we ever have to use it. That reminded me of an incident at the Zenith hackathon we conducted, where one team force pushed and deleted a very essential file. Since force push rewrites history, the file didn't just disappear from the latest version; it was no longer part of the visible commit history.
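
Here is a small sketch of why a normal push is safe and a force push is not. It assumes the usual rule that a plain push is accepted only when the remote's latest commit is an ancestor of the commit being pushed. The commit names and the functions is_ancestor and push are hypothetical.

```python
def is_ancestor(parents, ancestor, commit):
    """True if `ancestor` can be reached by following parent links from `commit`."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def push(parents, remote_tip, local_tip, force=False):
    """Return the new remote tip, or raise if the push would discard commits."""
    if force or is_ancestor(parents, remote_tip, local_tip):
        return local_tip
    raise ValueError("rejected: remote has commits that the local branch lacks")


# Hypothetical history: A <- B <- C on the remote, A <- B <- X locally.
parents = {"A": [], "B": ["A"], "C": ["B"], "X": ["B"]}

try:
    push(parents, remote_tip="C", local_tip="X")
except ValueError as err:
    print(err)

new_tip = push(parents, remote_tip="C", local_tip="X", force=True)
print(new_tip, is_ancestor(parents, "C", new_tip))
```

The plain push is rejected because accepting it would drop commit C. With force=True the remote moves to X anyway, and C can no longer be reached from the branch, which is what happened to that team's file.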

Chetan bhaiya also covered git branch and git checkout, which are essential for navigating in Git. Using git checkout, you can switch to another branch or even move to a specific commit.
For example, when you run git checkout main, HEAD is attached to the main branch. Any new commit you make moves the main branch forward, and HEAD simply stays attached to that branch.

Most of the time, HEAD points to a branch. But if you check out a specific commit by passing its hash to git checkout, HEAD points directly to that commit instead. This is called a detached HEAD state.

You can still make commits here, but they won’t belong to any branch. They exist, but no branch is pointing to them. So if you switch away without creating a branch, those commits can easily get lost.

He suggested a simple fix: if you ever check out a commit and plan to make changes, create a new branch from there so your work is preserved. And even if you mess this up, Git still gives you a safety net through reflog, which can help you trace where your HEAD was and recover those commits, but only for a limited time.
So even when it feels like something is gone, Git usually still has a memory of it.
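
The attached and detached HEAD behaviour can be modelled in a few lines. This is a toy model with invented names (ToyRefs, commit IDs like c1), and its reflog is just a list of every commit HEAD has pointed at, which is much simpler than the real thing.

```python
class ToyRefs:
    """Toy model of branches, HEAD, and a reflog (illustrative only)."""

    def __init__(self):
        self.parents = {}                # commit -> parent commit (or None)
        self.branches = {"main": None}   # branch name -> tip commit
        self.head = ("branch", "main")   # or ("commit", some_id) when detached
        self.reflog = []                 # every commit HEAD has pointed at
        self._counter = 0

    def head_commit(self):
        kind, value = self.head
        return self.branches[value] if kind == "branch" else value

    def commit(self):
        self._counter += 1
        cid = f"c{self._counter}"
        self.parents[cid] = self.head_commit()
        kind, value = self.head
        if kind == "branch":
            self.branches[value] = cid   # attached: the branch moves forward
        else:
            self.head = ("commit", cid)  # detached: only HEAD moves
        self.reflog.append(cid)
        return cid

    def checkout(self, target):
        self.head = ("branch", target) if target in self.branches else ("commit", target)
        self.reflog.append(self.head_commit())

    def reachable(self):
        """Commits reachable from any branch."""
        seen = set()
        for tip in self.branches.values():
            while tip is not None and tip not in seen:
                seen.add(tip)
                tip = self.parents[tip]
        return seen


refs = ToyRefs()
c1 = refs.commit()             # main -> c1
c2 = refs.commit()             # main -> c2
refs.checkout(c1)              # detached HEAD at c1
c3 = refs.commit()             # c3 belongs to no branch
refs.checkout("main")          # switch away from c3
print(c3 in refs.reachable(), c3 in refs.reflog)
refs.branches["rescue"] = c3   # point a new branch at the "lost" commit
print(c3 in refs.reachable())
```

After switching back to main, c3 is not reachable from any branch, but the reflog still remembers it. Pointing a new branch at it brings it back, which is the recovery Chetan bhaiya described.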

Later, he explained revert and reset. revert is the safer option, as it undoes changes by creating a new commit, so history stays intact. reset, on the other hand, moves the branch pointer back and so rewrites history. It can make your commit tree look cleaner, but you can lose commits. This made it clear that Git is not just about saving code; it is also about recovering from mistakes when used properly.
Even if you lose commits after a reset, Git doesn’t immediately forget them. git log shows your visible commit history, while git reflog quietly keeps track of where your HEAD has been (by default for about 90 days, or about 30 days for entries that are no longer reachable from any branch), making it possible to recover commits that seemed lost.
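
The difference between the two can be sketched on a tiny made-up history. Here revert is modelled only for the latest commit (it adds a commit restoring the parent's snapshot), and reset is modelled as moving the branch pointer while noting the old position in a reflog list. All names and commit IDs are illustrative.

```python
def log(parents, tip):
    """Commits visible from `tip`, newest first."""
    out = []
    while tip is not None:
        out.append(tip)
        tip = parents[tip]
    return out


def revert_tip(parents, snapshots, tip, new_id):
    """Undo the tip commit by ADDING a commit that restores its parent's snapshot.

    Assumes the tip has a parent; reverting a root commit is not handled here.
    """
    snapshots[new_id] = dict(snapshots[parents[tip]])
    parents[new_id] = tip
    return new_id


def reset_to(reflog, tip, target):
    """Move the branch pointer back; the old tip survives only in the reflog."""
    reflog.append(tip)
    return target


# Hypothetical history: c1 <- c2 <- c3, each commit holding a full snapshot.
parents = {"c1": None, "c2": "c1", "c3": "c2"}
snapshots = {
    "c1": {"app.py": "v1"},
    "c2": {"app.py": "v2"},
    "c3": {"app.py": "v3 (buggy)"},
}

# Option 1: revert. History grows, nothing disappears.
reverted = revert_tip(parents, snapshots, "c3", "c4")
print(log(parents, reverted), snapshots[reverted])

# Option 2: reset (starting again from c3). History gets shorter.
reflog = []
main = reset_to(reflog, "c3", "c2")
print(log(parents, main), reflog)
```

With revert, the buggy commit c3 stays in the log and c4 sits on top with the old content. With reset, the log stops at c2 and c3 is only remembered by the reflog.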

At the end, we got into another really useful concept: git cherry-pick. It basically takes one specific commit and applies it somewhere else without bringing everything along. It's like selectively copying just what you need instead of merging entire branches.
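
Here is a rough sketch of the cherry-pick idea: work out what one commit changed compared with its parent, then apply just those changes on top of another snapshot. It ignores conflicts entirely, which real cherry-picks can run into, and the function names and file contents are made up.

```python
def diff(parent, commit):
    """Changes a commit introduced: path -> new content (None means deleted)."""
    changes = {}
    for path in set(parent) | set(commit):
        if parent.get(path) != commit.get(path):
            changes[path] = commit.get(path)
    return changes


def cherry_pick(target, parent, commit):
    """Apply only the changes from one commit on top of another snapshot."""
    result = dict(target)
    for path, content in diff(parent, commit).items():
        if content is None:
            result.pop(path, None)  # the picked commit deleted this file
        else:
            result[path] = content
    return result


# Hypothetical feature branch: an earlier commit added util.py, a later one added fix.py.
feature_parent = {"app.py": "v1", "util.py": "helpers"}
feature_commit = {"app.py": "v1", "util.py": "helpers", "fix.py": "hotfix"}
main_tip = {"app.py": "v1", "README": "docs"}

picked = cherry_pick(main_tip, feature_parent, feature_commit)
print(sorted(picked))
```

Only fix.py arrives on main. util.py came from an earlier commit on the feature branch, so it is not brought along, which is exactly the "just what you need" part.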

Chetan bhaiya also walked us through several other basic Git commands and workflows that help in daily usage, but I’ve focused here on the core concepts and practical insights we found most impactful. Later, everyone had a fun chat about the things they are working on, making it a really interactive and enjoyable session overall.

Conclusion of Part 2

• Git is a three-stage pipeline: by moving changes from the working directory to the staging area before the local commit, you control exactly what becomes part of your project’s history.
• Git is forgiving: between the reflog and revert commands, the system is designed with the assumption that we will make mistakes, and it is built to help us recover from them.
• Understand the workflow: knowing the actual path your code takes, from the working directory to the staging area, then to the local repo, and finally to the remote, makes the everyday commands easier to reason about.
• git fetch only downloads data, while git merge applies it, and git pull = git fetch + git merge.

Thanks to this session by Alfiya didi and Chetan bhaiya, we learned a lot about Git, from what actually exists inside the .git folder to how Git behaves in real workflows.

I highly recommend exploring learngitbranching.js.org for interactive, easy, visual learning, and the blog How Git Internally Works for a deeper understanding of Git’s core objects and internals.

Written by Abhishek R P

Sources: octobot blog, learngitbranching.js.org, KodeKloud Blog, cinqict blog