In this chapter, we will go deep into some of the fundamentals of Git. It is essential to understand how Git thinks about files, its way of tracking the history of commits, and all the basic commands that we need to master in order to become proficient.
The first thing to understand while working with Git is how it manages files and folders within the repository. Let's start by analyzing the structure of a default repository.
In Chapter 1, Getting Started with Git, we created an empty folder and initialized a new repository using the git init command (in C:\Repos\MyFirstRepo). From now on, we will call this folder the working directory: a folder that contains an initialized Git repository. You can move the working directory around your file system without losing or corrupting your repository.
You also learned that the working directory contains a .git directory. Let's call it the git directory from now on. The git directory holds the files and folders that make up our repository; they are what allows Git to track file status, store the repository configuration, and so on.
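To make the two terms concrete, here is a minimal sketch that creates a throwaway repository and asks Git where the working directory and the git directory are. The temporary folder and the "-moved" suffix are assumptions made for the demo; they stand in for C:\Repos\MyFirstRepo.

```shell
# Create a throwaway working directory (a hypothetical temp path, not the book's repo)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# The working directory is the top-level folder of the repository
git rev-parse --show-toplevel

# The git directory lives inside it; this prints ".git" when run from the top level
git rev-parse --git-dir

# Some of the files and folders Git created inside the git directory
ls .git

# Moving the working directory does not break the repository
moved="${repo}-moved"
cd "$(dirname "$repo")"
mv "$repo" "$moved"
cd "$moved"
git status
```

Because everything Git needs is stored inside .git, the repository keeps working after the move.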
In Chapter 1, Getting Started with Git, we used two different commands: git add and git commit. These commands allowed us to change the status of a file, moving it from "I don't know who you are" to "You are in a safe place".
When you create or copy a new file in the working directory, its first state is untracked. This means that Git sees that there is something new, but it does not take care of it (it does not track the new file). If you want to include the file in your repository, you have to add it using the add command. Once added, the file is tracked and staged: it has reached the staging area (also called the index) and is ready to be committed. After you commit it, the file becomes unmodified, meaning that the copy in the working directory matches what Git has recorded. If you then change a tracked file, its status becomes modified until you stage the change again.
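A minimal sketch of how git status reports a file as it moves through these states. The temporary repository, the user identity, and the commit message are placeholders chosen for the demo.

```shell
# Throwaway repository with a placeholder identity (assumptions for the demo)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

touch NewFile.txt
git status --short      # "?? NewFile.txt" : untracked

git add NewFile.txt
git status --short      # "A  NewFile.txt" : staged (in the index)

git commit --quiet -m "Add NewFile.txt"
git status --short      # no output        : unmodified

echo "a change" >> NewFile.txt
git status --short      # " M NewFile.txt" : modified, not yet staged
```

In the short format, the first column describes the index and the second column describes the working directory.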
The following screenshot explains the file status life cycle:
The staging area or index is a virtual place that collects all the files you want to include in the next commit. You will often hear people talk about staged files with regard to Git, so make sure you understand this concept. All the files (new or modified) you want to include in the next commit have to be staged using the git add command. If you staged a file accidentally, you have to unstage it to remove it from the next commit. Unstaging is not difficult; you can do it in several ways.
Having several ways to do the same thing is a recurring problem in Git. Its constant and fast evolution sometimes leaves us with different commands that do the same thing. Older commands are kept so that people used to working in a particular manner are not penalized and have time to learn the new or better way. Fortunately, Git often suggests the best way to do what you want to do and warns you when you use obsolete commands. When in doubt, remember that there are man pages. Typing git <command> --help opens the manual page for the command, while git <command> -h prints a short usage summary, so you can see what the command is for and how to use it.
Well, back to our main topic. Before continuing, let's try to understand the unstaging concept better. Open the repo folder (C:\Repos\MyFirstRepo) in Bash and follow these simple steps:
1. Run git status. If Git says "nothing to commit, working directory clean" (recent Git versions say "working tree clean"), we are ready to start.
2. Create a new file with touch NewFile.txt.
3. Run git status again and verify that NewFile.txt is untracked.
4. Stage the file with the git add NewFile.txt command and go on.
5. Run the git reset HEAD <file> command to put the file back in the untracked state.
It should be clear now. It worked as expected: our file returned to the untracked state, as it was before the add command.
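The steps above can be sketched as a single Bash session. This is only an illustration: it uses a throwaway repository instead of the repo folder from Chapter 1, and it creates an empty initial commit (an assumption of this sketch) so that HEAD exists before we reset to it.

```shell
# Throwaway repository standing in for the book's repo folder (assumption)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"
git commit --quiet --allow-empty -m "Initial commit"   # HEAD must exist for git reset HEAD

git status                        # step 1: nothing to commit
touch NewFile.txt                 # step 2: create the file
git status --short                # step 3: "?? NewFile.txt" (untracked)
git add NewFile.txt               # step 4: stage it
git status --short                #         "A  NewFile.txt" (staged)
git reset HEAD -- NewFile.txt     # step 5: unstage it
git status --short                #         "?? NewFile.txt" (untracked again)
```

The file itself is never touched by the reset; only its entry in the index is removed. On Git 2.23 or later, git restore --staged NewFile.txt should be an equivalent way to unstage.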
Another way to unstage a file that you just added is to use the git rm command with the --cached option. This option removes the file from the index, but not from the filesystem, so the file is preserved in your folder. However, remember that the purpose of git rm is to remove files from the index. So, if you use the git rm command on an already committed file, you actually mark it for deletion, and without --cached the file is also deleted from the working directory. The next commit will record the deletion.
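Here is a small sketch of the two behaviors of git rm, assuming a throwaway repository with one committed file. The file name Committed.txt is hypothetical and exists only for this demo.

```shell
# Throwaway repository with one committed file (names are placeholders)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"
echo "already committed" > Committed.txt
git add Committed.txt
git commit --quiet -m "Add Committed.txt"

# Case 1: unstage a file that was just added, keeping it on disk
touch NewFile.txt
git add NewFile.txt
git rm --cached --quiet NewFile.txt
git status --short      # "?? NewFile.txt" : back to untracked, still on disk

# Case 2: git rm on a committed file marks it for deletion
git rm --quiet Committed.txt
git status --short      # "D  Committed.txt" plus "?? NewFile.txt"
ls                      # only NewFile.txt is left in the folder
```

In the second case the deletion is only staged; it becomes part of the history with the next commit.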
Using the git reset command, we also came across another fundamental concept of Git, the HEAD pointer. Let's try to understand it.
A repository is made of commits, as a life is made of days. Every time you commit something, you write a piece of the history.
The past is represented by the previous commits that we did, as shown by C1 and C2 in the following diagram:
The HEAD pointer is the reference to the last commit we made, which is also the parent of the next commit we will make, as shown in the diagram:
So, the HEAD pointer is the road sign that indicates the way to move one step back to the past.
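To see what HEAD points to, here is a minimal sketch that builds a history with two commits named C1 and C2, like the ones in the diagrams. The commits are empty and the repository is a throwaway one; both are assumptions made to keep the example short.

```shell
# Throwaway repository with two empty commits, C1 and C2 (assumptions for the demo)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

git commit --quiet --allow-empty -m "C1"
c1="$(git rev-parse HEAD)"        # remember the hash of C1

git commit --quiet --allow-empty -m "C2"
c2="$(git rev-parse HEAD)"        # remember the hash of C2

git log --oneline                 # C2 first, then C1
git rev-parse HEAD                # same hash as C2, the last commit
git rev-parse HEAD~1              # same hash as C1, one step back in the past
```

Every new commit moves HEAD forward, and the commit HEAD pointed to before becomes the parent of the new one.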
The present is where we work. Once a commit is done, it becomes part of the past, and the present looks like this diagram:
We have a HEAD reference that points to where we came from (the C2 commit). Resetting to HEAD, as we did earlier, is a way of returning the index to this initial state, where no modifications are staged yet. Then, we have the working directory. This directory holds the files added to the repository in the previous commits. Right now, it is in the untouched state. Within this place, we do our work on files and folders, adding, removing, or modifying them.
Our work remains in the working directory until we decide to put it in the next commit. Using the git add command, we choose what we want to promote to the next commit by recording it in the index, as shown in this diagram:
With git rm --cached <file or folder> (add the -r option for a folder), you can unstage a file by removing it from the index, as shown here:
With git reset --hard HEAD, we go back to the initial state, losing all the uncommitted changes we made to tracked files, both in the index and in the working directory. Untracked files are left alone.
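The following sketch shows the effect of git reset --hard HEAD on a staged change and on an untracked file. The repository, the file names, and the file contents are hypothetical and were picked only for this demo.

```shell
# Throwaway repository with one committed file (names and contents are placeholders)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"
echo "version 1" > Tracked.txt
git add Tracked.txt
git commit --quiet -m "Add Tracked.txt"

# Work in the present: modify a tracked file, stage it, and create a new file
echo "version 2 (uncommitted)" > Tracked.txt
git add Tracked.txt
touch Untracked.txt
git diff --cached --name-only     # Tracked.txt is staged for the next commit

# Go back to the state recorded by HEAD
git reset --quiet --hard HEAD
cat Tracked.txt                   # prints "version 1": the change is lost
git status --short                # "?? Untracked.txt": the untracked file survives
```

Since the discarded changes cannot be recovered from the repository, it is worth running git status before a hard reset.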
In the end, once you commit, the present becomes part of the past. The working directory comes back to the initial state, where all is untouched and nothing is left staged in the index, as shown in this diagram:
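As a last illustration, this sketch checks that nothing is left to commit once the staged work has been committed. It assumes a throwaway repository and a hypothetical file named Notes.txt.

```shell
# Throwaway repository with an empty first commit (assumptions for the demo)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"
git commit --quiet --allow-empty -m "C1"

echo "hello" > Notes.txt
git add Notes.txt
git diff --cached --name-only     # Notes.txt is waiting in the index

git commit --quiet -m "C2"
git diff --cached --name-only     # no output: nothing is staged anymore
git status --short                # no output: the working directory is untouched again
git log --oneline                 # C2 is now part of the past, on top of C1
```

Strictly speaking, the index is not empty after the commit; it simply matches the commit that HEAD now points to, so there is no difference left to record.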