git blame: My daily driver for code archaeology

As a developer, I use git blame almost daily. It’s my go-to tool for code archaeology, helping me understand the history and context of our codebase. Recently, I’ve enhanced my git blame skills with some powerful features that have significantly improved my workflow.

I learned about these from the YouTube video So You Think You Know Git - FOSDEM 2024 by Scott Chacon, an insightful video about Git that recently reached 1 million views!

The Power of -L :function

One of the most game-changing features I’ve started using is the -L :function option. This allows me to trace the evolution of a specific function over time. Here’s how it works:

git blame -L :functionName fileName

This command will show the blame information only for the lines within the specified function. It’s incredibly useful when you’re focused on understanding changes to a particular piece of functionality.

For example, if I’m investigating changes to a function called calculateTotalPrice in a file named pricing.js, I’d use:

git blame -L :calculateTotalPrice pricing.js

This narrows down the blame output to just the lines of that function, making it much easier to track its evolution.

The Magic of Multiple -C Options

Another powerful feature I’ve recently incorporated into my workflow is the use of multiple -C options. The -C option is used to detect moved or copied lines from other files, but its behavior changes depending on how many times you use it:

  • -C: Detects lines moved or copied from other files modified in the same commit.
  • -CC (two times): Additionally looks for copies from other files in the commit that creates the file.
  • -CCC (three times): Goes even further, looking for copies from other files in any commit.

Here’s how I use these in practice:

# Basic copy detection
git blame -C -L :calculateTotalPrice pricing.js

# More thorough copy detection, including the file creation commit
git blame -CC -L :calculateTotalPrice pricing.js

# Exhaustive copy detection across all commits
git blame -CCC -L :calculateTotalPrice pricing.js

This graduated approach to copy detection has been a game-changer for understanding how our code has evolved, especially in projects with a lot of refactoring or code reuse.

Other Useful Options

In addition to -L :function and -C, I frequently use these options:

  • -w: Ignores whitespace changes. This is crucial when your team has different code formatting practices.
  • --since="2 weeks ago": Limits the blame to recent changes, perfect for investigating recent regressions.

My Enhanced Git Blame Workflow

  1. Initial Investigation: I start with a basic git blame on the file I’m investigating.
  2. Function Focus: If I’m looking at a specific function, I use -L :functionName.
  3. Ignoring Formatting: I almost always include -w to focus on actual code changes.
  4. Basic Copy Detection: I add -C to see if code was moved from other files in the same commit.
  5. Thorough Copy Detection: If I suspect the code might have been copied when the file was created, I use -CC.
  6. Exhaustive Copy Detection: For deep investigations, especially in large codebases, I use -CCC to track copies from any commit.
  7. Recent Changes: If I’m debugging a recent issue, I’ll add --since="1 week ago" to focus on recent changes.

Conclusion

git blame has become an indispensable part of my daily workflow. By leveraging these advanced features, especially -L :function and the graduated -C options, I’ve been able to navigate and understand our codebase much more efficiently. It’s not just about finding who wrote what line, but about understanding the complex evolution of our code over time, including refactoring and code reuse patterns.

I encourage you to experiment with these git blame features in your own workflow. The combination of function-specific blame and thorough copy detection can together with great commit messages provide insights into your codebase that you might never have discovered otherwise!