Git History

A commit is linked to one or more parent commits, which gives the commits a branch-like structure when working with Git.

How can we explore this history?

In particular we might like to know general things like:

  • What happened when/where?

  • What changed between here and there?

And particular things like:

  • What exactly changed at a specific point?

  • When and who changed a specific line in a file?

  • What is “my history”, i.e. what was I doing?

The Git-Historian’s Toolset

git log - What happened when/where?

git log is the command to gain an overview of the history and explore the course of changes.

The git log command provides a plethora of ways to explore the history of a repository. Its focus is on reporting how individual changes relate to each other, which gives you an overview of what has happened in a repository.

Some particularly useful options of git log are:

  • --oneline shows a condensed view with each commit on a single line.

  • --graph visualizes the commit history as a branching graph.

  • --author="Author Name" shows commits made by a specific author.

  • --since="2 weeks ago" displays commits made since a specific date, useful for reviewing recent changes.

  • -p displays the patch (diff) introduced in each commit, allowing you to see what changes were made.

  • --grep="keyword" filters commits to show only those with messages containing a specific keyword.
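
These options can be combined. The following is a minimal sketch in a throwaway repository; the author names, the file name and the commit messages are made up for illustration.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

# Throwaway repository; all names and messages below are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "two" >> notes.txt
git commit -q -am "Fix typo in notes"

# A commit attributed to a different author
echo "three" >> notes.txt
git commit -q -am "Extend notes" --author="Bob Example <bob@example.com>"

# Condensed graph view of the whole history
git log --oneline --graph

# Only Alice's commits whose message contains "Fix", including the patch
git log --author="Alice" --grep="Fix" -p

# Count the commits that match both filters
alice_fixes=$(git log --author="Alice" --grep="Fix" --oneline | wc -l)
echo "Commits by Alice mentioning 'Fix': $alice_fixes"
```

Filters of different kinds (here --author and --grep) narrow the result together, so only commits matching both are listed.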

log: explore the history

  • commits are indicated with * <hash> followed by the commit message

  • Branches, Tags and HEAD are indicated in (...) after the commit hash

Example: git log --all --decorate --oneline --graph:
* dd0df9f (HEAD -> 21-basic-..., origin/21-basic-...) done with ...
* 31bc3cc ...
* b4b660f (origin/main, origin/HEAD, main) ingnore ...
* f5c1e53 ready ...
* 40a7b68 (origin/17-introduce-..., 17-introduce-) Merge ...
|\
| * bcb585a include ...
| * ...
| * 98c37ae changing ...
|/
* 8e4215f (tag: 1.0.0) ignoring ...
* 253dab3 Dev-...
| * 2cc8c54 added ...
| * 0b476b6 typos
|/
* 5d1c76e adding ...
|

Advanced History Investigation

Useful history commands
Search commit messages
git log --grep="bug fix"

Find commits with specific words in commit messages

Search code changes
git log -S "function_name"

Find commits that added or removed specific code

Show file history
git log --follow -- <filename>

Shows history of a specific file, even through renames

Show changes in time range
git log --since="2024-01-01" --until="2024-12-31"

Filter commits by date range
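
To see how -S and --follow behave, here is a small sketch in a throwaway repository. The file names and contents are hypothetical and only serve the illustration.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'def greet():\n    return "hi"\n' > old_name.py
git add old_name.py
git commit -q -m "Add greet"

# Rename the file without changing its content
git mv old_name.py new_name.py
git commit -q -m "Rename old_name.py to new_name.py"

printf 'def greet():\n    return "hello"\n' > new_name.py
git commit -q -am "Change greeting text"

# Commits that added or removed the string "hello"
git log --oneline -S "hello"

# File history across the rename, and for comparison without --follow
git log --oneline --follow -- new_name.py
git log --oneline -- new_name.py

pickaxe_hits=$(git log --oneline -S "hello" | wc -l)
followed=$(git log --oneline --follow -- new_name.py | wc -l)
without_follow=$(git log --oneline -- new_name.py | wc -l)
echo "-S hits: $pickaxe_hits, with --follow: $followed, without: $without_follow"
```

Without --follow, the history of new_name.py stops at the rename, because the path did not exist before that commit.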

git diff - What changed between here and there?

git diff shows you exactly what changed between different files, branches and even repositories.

git diff is one of the most essential commands for investigating changes in your Git repository. It reports, in a standardized format, the exact set of changes between different versions of your files.

Working Directory vs Staging Area
git diff

Shows unstaged changes (what’s different between your working directory and staging area)

Staging Area vs Last Commit
git diff --staged
# or
git diff --cached

Shows staged changes (what’s different between staging area and last commit)

Working Directory vs Last Commit
git diff HEAD

Shows all changes since last commit (staged + unstaged)

Between Two Commits
git diff commit1 commit2
# or using commit hashes
git diff abc123 def456

Shows changes between any two commits
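
The same form works with branch names, and the output can be narrowed down. The sketch below assumes a throwaway repository with a hypothetical branch called feature; --name-only, --stat and the trailing -- <path> are standard git diff options.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "shared" > common.txt
git add common.txt
git commit -q -m "Base commit"
base=$(git rev-parse HEAD)

# Hypothetical branch with one additional commit
git checkout -q -b feature
echo "feature work" > feature.txt
echo "changed on feature" >> common.txt
git add feature.txt common.txt
git commit -q -m "Work on feature"

# Which files differ between the base commit and the branch?
git diff --name-only "$base" feature

# How much changed per file?
git diff --stat "$base" feature

# Full diff, limited to a single path
git diff "$base" feature -- common.txt

changed_count=$(git diff --name-only "$base" feature | wc -l)
echo "Files that differ: $changed_count"
```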

Example: git diff HEAD~2 HEAD:
diff --git a/example.py b/example.py
index 83db48f..84d55c5 100644
--- a/example.py
+++ b/example.py
@@ -1,7 +1,8 @@
 def greet(name):
-    print("Hello, " + name)
+    print(f"Hello, {name}!")
     return True
 
 def main():
+    print("Starting application...")
     greet("World")
Lines removed

Lines starting with - show content that was deleted

Lines added

Lines starting with + show content that was added

Exercise: Exploring Changes with git diff

Let’s practice using git diff to understand changes in a repository:

Exercise 1: Basic diff operations
  1. Create a new Git repository and add a simple text file:

    mkdir diff-practice
    cd diff-practice
    git init
    echo "Hello World" > greeting.txt
    git add greeting.txt
    git commit -m "Initial commit"
    
  2. Make some changes to the file:

    echo "Hello Beautiful World!" > greeting.txt
    echo "Goodbye World" >> greeting.txt
    
  3. Now explore the differences:

    # See unstaged changes
    git diff
    
    # Stage the changes
    git add greeting.txt
    
    # See staged changes
    git diff --staged
    
    # See all changes since last commit
    git diff HEAD
    

Question: What’s the difference between git diff, git diff --staged, and git diff HEAD?

Exercise 2: Comparing commits
  1. Make another commit:

    git commit -m "Update greeting message"
    
  2. Make more changes and commit:

    echo "Welcome to Git!" >> greeting.txt
    git add greeting.txt
    git commit -m "Add welcome message"
    
  3. Now compare different commits:

    # Compare current commit with previous
    git diff HEAD~1 HEAD
    
    # Compare first and current commit
    git diff HEAD~2 HEAD
    
    # See the history
    git log --oneline
    

Question: How can you see what changed in a specific commit?

git show - What exactly changed at a specific point?

git show allows you to explore specific changes.

git show lets you inspect individual objects, such as commits or tags. Given a branch name, it shows the commit the branch points to.

Example: git show HEAD~5 -- README.md
commit 65xxx
Author: Jonas I. Liechti <j-i-l@t4d.ch>
Date:   Mon Nov 4 13:43:05 2024 +0100

    adding stars

diff --git a/README.md b/README.md
index f3b5fbe..262826d 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,14 @@ track and secure their digital projects.
 
 <!-- include-before -->
 
+---
+
+_If you find this course useful, please share it with others! Show your support by giving it a 🌟 using the ⭐-button at the top right of the page._
+
+---
+
 ## Contributing 🤝🎉
 
 We welcome contributions to this project!
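
Besides a commit's patch, git show can also print a summary or the content of a file at a given commit. A minimal sketch in a throwaway repository (file name and messages are illustrative):

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Version 1" > file.txt
git add file.txt
git commit -q -m "Add version 1"

echo "Version 2" > file.txt
git commit -q -am "Add version 2"

# Commit metadata plus the patch it introduced
git show HEAD

# Only a summary of the touched files
git show --stat HEAD

# The content of a file as it was one commit ago
old_content=$(git show HEAD~1:file.txt)
echo "file.txt one commit ago: $old_content"
```
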
git blame - When and who changed a specific line in a file?

git blame shows who last changed each line, when, and in which commit.

Example: git blame -L 15,18 -- source/index.md
47c24e85 (Jonas I. Liechti  2025-10-16 10:31:57 +0200 15) 
00000000 (Not Committed Yet 2025-10-20 20:18:01 +0200 16) ```{toctree} 
886a0933 (Matteo Delucchi   2024-10-14 15:18:16 +0200 17) :maxdepth: 4
47c24e85 (Jonas I. Liechti  2025-10-16 10:31:57 +0200 18) :caption: Content
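
The following sketch reproduces such an output in a throwaway repository with two made-up authors. It also uses the --porcelain format, which is easier to process in scripts than the human-readable layout.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"

printf 'line one\nline two\n' > poem.txt
git add poem.txt
git commit -q -m "Add poem"

# A second (hypothetical) author rewrites only the second line
printf 'line one\nline 2\n' > poem.txt
git commit -q -am "Reword second line" --author="Bob Example <bob@example.com>"

# Human-readable blame for lines 1 to 2
git blame -L 1,2 -- poem.txt

# Machine-readable output: extract the author of line 2
line2_author=$(git blame -L 2,2 --porcelain -- poem.txt | sed -n 's/^author //p')
echo "Line 2 was last changed by: $line2_author"
```
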
git reflog - What is “my history”, i.e. what was I doing?

git reflog shows the log of updates to references, such as HEAD and branch tips, in the local repository.

Safety net

git reflog should be your go-to place if you suspect that you have “lost” some commits.

When resetting or rebasing, commits can become de-referenced, which makes them difficult to access.

git reflog allows you to find such commits again.

Example: git reflog

git log --format=oneline

435... (HEAD -> main) Reapply "Initial commit"
cae... Revert "Initial commit"
349... (origin/main, origin/HEAD) Initial commit

git reflog

435... (HEAD -> main) HEAD@{0}: revert: Reapply "Initial commit"
cae... HEAD@{1}: revert: Revert "Initial commit"
349... (origin/main, origin/HEAD) HEAD@{2}: checkout: moving from dev/14-... to main
1d2... (origin/dev/14-..., dev/14-...) HEAD@{3}: checkout: moving from main to dev/14-...
349... (origin/main, origin/HEAD) HEAD@{4}: clone: from github.com:j-i-l/test.git
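
As a sketch of the safety net in action, the snippet below "loses" a commit with a hard reset and then attaches a branch to it again. The repository and the branch name rescued are hypothetical.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Version 1" > file.txt
git add file.txt
git commit -q -m "Add version 1"

echo "Version 2" > file.txt
git commit -q -am "Add version 2"
lost=$(git rev-parse HEAD)

# "Lose" the latest commit
git reset -q --hard HEAD~1

# The commit no longer shows up in git log, but the reflog remembers it
git log --oneline
git reflog

# HEAD@{1} is where HEAD pointed before the reset; attach a branch to it
git branch rescued "HEAD@{1}"
rescued=$(git rev-parse rescued)
echo "Recovered commit: $rescued"
```

Creating a branch instead of resetting back keeps the current state untouched while making the commit reachable again.
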
(How) can we alter history?
  • How to "git undo"?

  • How to step back and take a different turn?

  • How to consolidate and clean-up?

  • How to remove sensitive data?

  • When is it fine to change history?

The Git-Editor’s Toolset

git revert - How to "git undo"?

git revert  creates a new commit that undoes changes from a specific commit or range of commits.

The most transparent and safest way to “undo” one or several commits is to revert them. By calling git revert you are actually adding another step to the history, that is the step of undoing a previous commit or range of commits.

By using git revert you make sure that:

  • Only the changes from the specified commit(s) are removed.

  • The “undo” action is transparently documented in the history, so that it can even be undone again later.

Example: git revert:

Shown is output of git log --format=oneline:

349... (HEAD -> main, origin/main, origin/HEAD) Initial commit

git revert HEAD

cae... (HEAD -> main) Revert "Initial commit"
349... (origin/main, origin/HEAD) Initial commit

git revert HEAD

435... (HEAD -> main) Reapply "Initial commit"
cae... Revert "Initial commit"
349... (origin/main, origin/HEAD) Initial commit
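
git revert is not limited to the latest commit. The sketch below reverts a commit in the middle of a small, made-up history; --no-edit accepts the default revert message without opening an editor.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, each adding one (illustrative) file
for name in a b c; do
    echo "content of $name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name.txt"
done

# Revert the middle commit; the later commit stays in place
git revert --no-edit HEAD~1

# The history grew by one commit and b.txt is gone again
git log --oneline
ls
commit_count=$(git rev-list --count HEAD)
```
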
git reset - How to step back and take a different turn?

git reset  moves the current HEAD to a specific state.

Using git reset has some important and potentially unexpected consequences:

  1. Moving HEAD to a previous commit will potentially leave all following commits without a branch or tag. Commits in such a de-referenced state are hard to reach (use git reflog to find them) and will vanish eventually, that is, once their reflog entries have expired and the garbage collector runs.

  2. Some options of git reset change more than HEAD. --mixed (the default) also resets the staging area, so staged changes become un-staged but stay in your working directory. --hard additionally resets the working directory and thereby discards staged and un-staged changes to tracked files. This might not be undoable.

  3. Setting a reference back bears the risk of creating an alternative history, leading your local version of the repository to diverge from any other copy.

Example: git reset HEAD~2:
A ── B ── C ── D ── E   (old tip, now orphaned)
          ^
          └─ HEAD after `reset HEAD~2`
A ── B ── C ── F        (new tip after the new commit)
git reset --soft

Moves HEAD back but keeps changes staged. Safest reset option.

git reset --mixed

Moves HEAD back and unstages changes, but keeps them in working directory.

git reset --hard

⚠️ DESTRUCTIVE: Moves HEAD back and deletes all changes. Use with extreme caution!
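
The three modes can be compared side by side. The sketch below builds the same two-commit throwaway repository three times and applies one mode each; try_reset is a helper defined here for illustration only.

```shell
set -e

# Build a fresh two-commit repository and apply one reset mode to it.
try_reset() {
    mode=$1
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Example User"
    git config user.email "user@example.com"
    echo "Line 1" > file.txt
    git add file.txt
    git commit -q -m "Commit 1"
    echo "Line 2" >> file.txt
    git commit -q -am "Commit 2"
    git reset -q "$mode" HEAD~1
    # Short status: first column = staging area, second = working directory
    git status --short
}

# Each call runs in a subshell, so the directory change stays local
soft=$(try_reset --soft)
mixed=$(try_reset --mixed)
hard=$(try_reset --hard)

echo "soft:  '$soft'"    # change of Commit 2 is staged
echo "mixed: '$mixed'"   # change of Commit 2 is unstaged
echo "hard:  '$hard'"    # change of Commit 2 is gone
```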

git rebase - How to consolidate and clean-up?

git rebase alters the history by re-applying commits, optionally modifying them along the way.

The most common usage of git rebase is when applying one Branch “on top of” another Branch. Technically, this means that you apply all the changes from the Commits on one Branch to the other Branch, creating new Commits.

With the -i/--interactive option set, git rebase allows you to take action on each Commit that you are going to re-apply, effectively enabling you to completely rewrite this part of the history.

Some of the actions you can apply to each commit during an interactive rebase:

| Action | Alias | What it does |
|--------|-------|--------------|
| pick | p | Keep the commit as is (default). |
| reword | r | Keep the commit, but edit its commit message. |
| edit | e | Stop after applying this commit, allowing you to amend it (e.g., change the patch, add/remove files, edit the commit message). |
| squash | s | Combine this commit with the previous one; the resulting commit’s message is the concatenation of both (you can edit it). |
| fixup | f | Like squash, but discards this commit’s message; the commit is merged into the previous one automatically. |
| exec | x | Run an arbitrary shell command at this point of the sequence (linting, tests, etc.). |
| drop | d | Omit the commit entirely (same as deleting the line). |
| break | b | Stop the rebase at this point so that you can intervene manually; continue afterwards with git rebase --continue. |

And there are more!
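
An interactive rebase normally opens an editor with the list of actions. For a reproducible sketch, the example below avoids the editor: git commit --fixup marks a correction for an earlier commit, --autosquash pre-fills the matching fixup action, and GIT_SEQUENCE_EDITOR=true accepts the prepared list unchanged. The repository and file names are made up.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

echo "draft" > b.txt
git add b.txt
git commit -q -m "Add b.txt"
target=$(git rev-parse HEAD)

# A follow-up correction, recorded as a fixup for the commit above
echo "final" > b.txt
git commit -q -a --fixup="$target"

git log --oneline   # three commits, the last one titled "fixup! Add b.txt"

# Re-apply the last two commits and fold the fixup into its target
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline   # two commits; "Add b.txt" now contains the correction
commit_count=$(git rev-list --count HEAD)
```

Run without GIT_SEQUENCE_EDITOR=true, the same command opens the action list in your editor, where any of the actions from the table can be applied by hand.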

Example: git rebase:
A ── A1 ── A2          (branch A)
 \
  └─ B ── B1 ── B2      (branch B)

git rebase B A

A ── B ── B1 ── B2 ── A1' ── A2'   (branch A)
                ^
                └─ (branch B still points at B2)
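
The situation in the diagram can be reproduced with a few commands. This is a simplified sketch with one commit per branch; the branch names branch-a and branch-b stand in for A and B and are otherwise arbitrary.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Common starting point
echo "base" > base.txt
git add base.txt
git commit -q -m "A"
git branch branch-a

# branch-b moves ahead with its own commit
git checkout -q -b branch-b
echo "b1" > b1.txt
git add b1.txt
git commit -q -m "B1"

# branch-a gets a commit of its own as well
git checkout -q branch-a
echo "a1" > a1.txt
git add a1.txt
git commit -q -m "A1"

# Re-apply the commits of branch-a on top of branch-b
git rebase -q branch-b branch-a

# branch-a now contains A, B1 and a re-created A1
git log --oneline --graph branch-a
a_count=$(git rev-list --count branch-a)
```

The two branches touch different files here, so the rebase finishes without conflicts; overlapping changes would make Git stop and ask you to resolve them.
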
git-filter-repo - How to remove sensitive data?

git-filter-repo is a Python project and the de facto default tool for removing content from a repository.

In case you ever (and you will) accidentally commit some sensitive data to a repository, the first crucial thing to understand is that the data, even if you remove it again, say with a git revert, remains accessible to anyone who has access to your repository. Thus, to make sure that no sensitive data ships with your repository, you need to make sure that the history does not contain it either. The safest way to achieve this is to use a tool like git-filter-repo.

While it is possible to contain a sensitive commit in a local environment by means of git reset, you need to know precisely how to use it in order to avoid any leakage of the sensitive data when synchronizing with others or a remote server. Therefore, we strongly recommend opting directly for git-filter-repo if you realize that you have a commit with sensitive data.

Next, if the sensitive commit was already pushed to a remote server, be sure to always check the recommended procedure of the specific remote server. For the record, here are the currently recommended approaches for GitHub and for GitLab.

Also, once data has left your device you should always assume that it can never be removed completely from the public domain. For unencrypted sensitive data this means that you need to proceed with the necessary measures to mitigate the risk of abuse. For committed passwords, for example, this means that you will want to change them as soon as possible, even if you think you managed to remove the sensitive commit completely from the repository again.

Sensitive data

  1. The full history ships with a repository.

  2. Assume that the data cannot be ‘unpublished’.

  3. Check the recommended procedures on the remotes.

  4. git reset does not necessarily help you here.
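
The first point can be checked directly. The sketch below commits a made-up secret, reverts it, and then reads it back from the history. The git filter-repo call at the end is left commented out because it requires the tool to be installed and rewrites the repository; treat it as an outline and consult the tool's documentation before running it on real data.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "# Demo project" > README.md
git add README.md
git commit -q -m "Add README"

# Made-up secret, for illustration only
echo "password=hunter2" > secret.txt
git add secret.txt
git commit -q -m "Add config"
leak=$(git rev-parse HEAD)

# "Undo" the mistake with a revert
git revert --no-edit HEAD

# The file is gone from the working directory ...
ls

# ... but it can still be read from the history
leaked=$(git show "$leak":secret.txt)
echo "Still recoverable: $leaked"

# Rewriting the history is what removes the file from every commit.
# --force is needed here because this repository is not a fresh clone.
# git filter-repo --force --invert-paths --path secret.txt
```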

When is it fine to change history?

When to Rewrite Git History

Rewriting Git history generally undermines Git’s ability to help you collaborate, maintain a reproducible record, and effectively track/debug changes. For this reason, it’s often best to avoid it.

When It May Be Necessary

Sometimes, you need to change history for a critical reason, such as removing sensitive data. In these specific cases, altering history might be acceptable because the benefit outweighs the cost of not acting.

A Simple Guideline

A good rule of thumb is to ask: Is the history you’re changing already shared?

  • Local History Only: It’s usually fine to modify history that only exists on your local machine (e.g., squashing commits, rewriting commit messages, or rebasing an unpublished branch). When you push, this revised history will be the only version shared, preventing conflicts.

  • Shared History: Avoid modifying history that has already been pushed to a remote repository. This requires a force push and creates a situation where different copies of the project have conflicting histories, which breaks the normal collaboration workflow and requires manual coordination to fix.
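
Whether commits are already shared can be checked before rewriting anything. In the sketch below, a bare repository in a temporary directory plays the role of the remote; the directory names are arbitrary. @{u} refers to the upstream branch of the current branch.

```shell
set -e
export GIT_PAGER=cat   # print directly instead of opening a pager

work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the remote server
git init -q --bare remote.git

git init -q local
cd local
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "v1" > file.txt
git add file.txt
git commit -q -m "Shared commit"

# Publish the current branch and remember its upstream
git push -q -u origin HEAD

echo "v2" >> file.txt
git commit -q -am "Local-only commit"

# Commits reachable from HEAD but not from the upstream branch
git log --oneline "@{u}..HEAD"
unpushed=$(git rev-list --count "@{u}..HEAD")
echo "Commits not yet pushed: $unpushed"
```

Only the commits listed by this range exist solely in the local repository; everything older has already been shared through the upstream branch.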

Changing history

Avoid rewriting history:

It breaks collaboration, reproducibility, and the ability to track changes.

Never change shared history:

Rewriting history that is already on the remote repository breaks the workflow and requires a force push and manual cleanup by collaborators.

Exception 1: Critical need:

It’s acceptable for necessary actions like removing sensitive data.

Exception 2 (optional): Local changes only:

It is generally fine to change history that has not yet been pushed (e.g., squashing commits on a local branch).

When to Use Which Method?

Decision Guide: Which “go back” method to use?

Q: Are the commits already pushed/shared with others?

  • YES → Use git revert (creates new commit, safe for shared history)

  • NO → You can use destructive methods, but consider the consequences

Q: Do you want to keep your changes?

  • Keep as staged changes → git reset --soft

  • Keep as unstaged changes → git reset --mixed (default)

  • Discard all changes → git reset --hard ⚠️

Q: Do you want to fix the history/commit messages?

  • YES → git rebase -i (interactive rebase)

  • NO → Use git reset methods

Q: Do you just want to temporarily save work?

  • YES → git stash (safest option)
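
Since git stash is not covered elsewhere in this section, here is a minimal sketch of the save-and-restore cycle in a throwaway repository (file name and content are illustrative):

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Version 1" > file.txt
git add file.txt
git commit -q -m "Add version 1"

# Uncommitted work in progress
echo "work in progress" >> file.txt

# Put the change aside; the working directory is clean again
git stash push -q
clean=$(git status --short)
echo "status after stash: '$clean'"

# Bring the change back
git stash pop -q
restored=$(git status --short)
echo "status after pop:   '$restored'"
```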

Exercise: Practicing “Time Travel” Commands

Exercise 3: Safe history manipulation
  1. Setup: Create a repository with multiple commits:

    mkdir time-travel
    cd time-travel
    git init
    
    echo "Version 1" > file.txt
    git add file.txt
    git commit -m "Add version 1"
    
    echo "Version 2" > file.txt
    git add file.txt
    git commit -m "Add version 2"
    
    echo "Version 3" > file.txt
    git add file.txt
    git commit -m "Add version 3"
    
  2. Practice safe undoing:

    # See the history
    git log --oneline
    
    # Revert the last commit (creates new commit)
    git revert HEAD
    
    # Check what happened
    git log --oneline
    cat file.txt
    

Question: What’s in file.txt now? Why is this safer than git reset?

Exercise 4: Understanding reset modes

⚠️ Note: Only do this on a local, non-shared repository!

  1. Setup: Create some commits and changes:

    mkdir reset-practice
    cd reset-practice
    git init
    
    echo "Line 1" > file.txt
    git add file.txt
    git commit -m "Commit 1"
    
    echo "Line 2" >> file.txt
    git add file.txt
    git commit -m "Commit 2"
    
    echo "Line 3" >> file.txt
    git add file.txt
    echo "Line 4" >> file.txt  # This stays unstaged
    
  2. Observe current state:

    git status
    git diff --staged
    git diff
    
  3. Try different reset modes:

    # Soft reset - moves HEAD back one commit, keeps everything staged
    git reset --soft HEAD~1
    git status
    
    # Mixed reset - unstages but keeps changes
    # (HEAD is now "Commit 1", which has no parent, so reset to HEAD itself)
    git reset --mixed HEAD  # or just git reset
    git status
    
    # Hard reset - DESTROYS changes!
    # git reset --hard HEAD  # BE CAREFUL!
    

Question: What’s the difference between the three reset modes?

Exercise 5: When things go wrong

Sometimes you make a mistake with history manipulation. Git has safety nets:

  1. The safety net - reflog:

    # See all recent HEAD movements
    git reflog
    
    # You can recover from almost any mistake using reflog
    # git reset --hard <reflog-entry>
    
  2. Practice recovery:

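    # Make a "mistake" (needs a repository with at least three commits,
    # e.g. the time-travel repository from Exercise 3)
    git reset --hard HEAD~2
    
    # Oh no! Find the lost commit
    git reflog
    
    # Recover: HEAD@{1} is where HEAD pointed before the reset
    # (alternatively, use the commit hash shown by git reflog)
    # git reset --hard HEAD@{1}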
    

Key learning: git reflog is your safety net for local repository disasters!