My First On-Call Experience: From Panic to Production Fix in 15 Minutes

Even though we don't usually run on-call missions at Stash, when a new client lands, we need to stay alert and take care of any bugs that come up. Onboarding new clients and getting them started with Stash processes usually goes smoothly. But this time, it was different.

It was my first night on-call. I'll be honest, I was nervous. Being responsible for keeping production stable while everyone else sleeps felt like carrying a ticking time bomb in my pocket. I remember hoping quietly: "Please, let nothing go wrong tonight."

Of course, something went wrong… But before diving into my story, let me tell you a bit more about on-call missions in tech companies.

On-call Missions in Tech Companies

In many tech companies, especially those running high-traffic B2C platforms or critical infrastructure, developers take turns being "on-call." It's like being a night-shift doctor for production systems. If something breaks, you're the first responder. You carry your laptop like a stethoscope, and your phone becomes the alarm. Even if incidents are rare, when you're on-call you have to be ready at any hour to diagnose, fix, and deploy under pressure. It's not always glamorous, but it's a core part of building resilient systems.

Now that we know a bit more about being "on-call," let's get back to the story.

One Letter, Two Meanings: The 'X' That Broke Production

At 2:13 AM, my phone rang. An error alert. I jumped out of bed and opened the logs:

Fatal: ambiguous argument 'X': unknown revision or path not in the working tree.

My heart dropped. Fatal? At 2 AM? Was this the night everything crashed on my watch?

The issue came from a customer in the U.S. who had just created a new project and selected a repository. The bug only reproduced when the selected branch was named X, which, it turns out, was also the name of a file in their repo.

After some digging, I discovered a quirk of the git checkout command: it can be used both to switch branches and to restore files. When a file and a branch share the same name, the argument becomes ambiguous, and in our case Git gave up with a fatal error.

This was new to me. I didn't realize git checkout had this dual behavior. As developers, we typically use git checkout <branch> to switch between branches, but it also has a second, lesser-known function: restoring files. When you run git checkout <filename>, Git doesn't switch branches. Instead, it restores the specified file from the index (the staged version, which matches HEAD when nothing is staged), discarding unstaged local changes.
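To make the two meanings concrete, here is a minimal Python sketch of how each form of git checkout can be spelled unambiguously with the "--" separator. The helper names are hypothetical and only build the argument lists; they are not taken from our codebase.

```python
from typing import List


def checkout_branch_argv(branch: str) -> List[str]:
    """Switch branches: the trailing "--" says no paths follow, so the name is a revision."""
    return ["git", "checkout", branch, "--"]


def checkout_file_argv(path: str) -> List[str]:
    """Restore a file: everything after "--" is treated as a path, never as a branch."""
    return ["git", "checkout", "--", path]


if __name__ == "__main__":
    # Same name, two different intents, spelled out explicitly.
    print(" ".join(checkout_branch_argv("X")))
    print(" ".join(checkout_file_argv("X")))
```

The only difference between the two commands is which side of "--" the name sits on, and that is easy to get wrong when the name comes from user input.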

Apparently, I wasn't the only one confused by this. The overlap in functionality was unintuitive enough that Git introduced two more focused commands in version 2.23: git switch for changing branches and git restore for recovering files. This separation helps avoid accidental mistakes, especially for newer users.

So I quickly refactored our code to use git switch instead of git checkout when switching branches. I pushed the fix, deployed the new version, and just like that, everything worked again.
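The real change isn't reproduced here, but a fix along these lines could look roughly like the sketch below. It assumes the service shells out to Git from Python and that Git 2.23 or newer is installed; switch_argv and switch_branch are hypothetical names used for illustration.

```python
import subprocess
from typing import List


def switch_argv(branch: str) -> List[str]:
    """Build the `git switch` command for an existing branch (needs Git 2.23+)."""
    if not branch or branch.startswith("-"):
        # Refuse names that Git would parse as an option instead of a branch.
        raise ValueError(f"refusing suspicious branch name: {branch!r}")
    return ["git", "switch", branch]


def switch_branch(repo_dir: str, branch: str) -> None:
    """Run the switch inside repo_dir; raises CalledProcessError if Git fails."""
    subprocess.run(
        switch_argv(branch),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
```

Because git switch only deals with branches, a file that happens to be called X can no longer change what the command means.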

The Resolution

The customer? Happy.
The bug? Gone.
Me? Wiser.

Looking back, I'm kind of glad it happened. It was a scary moment, sure, but it taught me about a valuable edge case I hadn't considered. And now our system is just a bit more bulletproof than it was yesterday.

Key Takeaways for Developers

Always use git switch (available since Git 2.23) for branch switching to avoid ambiguity
Test edge cases like files and branches sharing the same name
On-call incidents, while stressful, are excellent learning opportunities
Production fixes at 2 AM can still be straightforward with the right approach
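
As one way to act on the second takeaway, here is a hedged sketch of a regression check that builds a throwaway repository in which a file and a branch share a name, then switches to that branch. It assumes Git 2.23+ is on the PATH; the helper names and the test identity are made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical throwaway identity so `git commit` works without any global Git config.
GIT = [
    "git",
    "-c", "user.name=Test",
    "-c", "user.email=test@example.com",
    "-c", "commit.gpgsign=false",
]


def git(repo: str, *args: str) -> str:
    """Run one git command inside repo and return its trimmed stdout."""
    result = subprocess.run(
        [*GIT, *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def branch_after_switch(name: str = "X") -> str:
    """Create a repo where a file and a branch share `name`, then switch to that branch."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")
        Path(repo, name).write_text("same name as the branch\n")
        git(repo, "add", "--", name)
        git(repo, "commit", "-q", "-m", "add file")
        git(repo, "branch", name)
        git(repo, "switch", name)
        # Report which branch HEAD points at after the switch.
        return git(repo, "symbolic-ref", "--short", "HEAD")
```

A test suite can then assert that branch_after_switch("X") reports the branch X, so the edge case stays covered if the branch-switching code changes again.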

Have you had a similar on-call experience? Sometimes the smallest details can cause the biggest headaches, but they also make us better developers. That's the beauty of working with production systems: every bug teaches you something new.

Ismail Emir
