Writing code is the easy part. The real skill in programming — the one that separates good engineers from great ones — is debugging. Here's why, and how to get better at it.
I spent my first two years as a developer genuinely believing I was getting better at programming because I was getting faster at writing code. I'd memorized common patterns. I could scaffold a REST API in under ten minutes. I knew the React hooks lifecycle well enough to write it without looking it up. I felt productive. I felt skilled.
Then I joined a team working on a high-throughput payment processing system, and I got handed my first real bug. A race condition that only appeared under load, on production, roughly once every 48 hours. No stack trace. No reproducible test case. Just a corrupted transaction record and a very unhappy finance team.
I was completely lost. All that speed, all those patterns I'd memorized — none of it helped me. I couldn't Google my way out. I couldn't find a Stack Overflow answer that matched. I sat in front of that codebase for four days, adding log statements, building mental models that turned out to be wrong, and slowly, painfully, learning what it actually meant to understand a system. When I finally found it — a shared counter being incremented across goroutines without a mutex — I felt something I hadn't felt from writing code before. I felt like an engineer.
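The shape of that bug is worth pausing on. Here's a minimal sketch of the same class of race in Python (the original system was Go; the class and function names here are mine, purely for illustration):

```python
import threading

class Counter:
    """A shared counter. Incrementing it looks atomic but isn't."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # BUG: `self.value += 1` is read-modify-write. Two threads can
        # both read the same value and both write value + 1, silently
        # losing an update -- the same shape as the Go counter that was
        # incremented across goroutines without a mutex.
        self.value += 1

    def safe_increment(self):
        # FIX: hold a lock across the read-modify-write, the role a
        # sync.Mutex played in the original Go fix.
        with self._lock:
            self.value += 1

def hammer(increment, n_threads=8, n_iters=50_000):
    """Run `increment` from many threads at once."""
    threads = [
        threading.Thread(target=lambda: [increment() for _ in range(n_iters)])
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

counter = Counter()
hammer(counter.safe_increment)
# With the lock, the count is exact: 8 threads * 50_000 increments.
assert counter.value == 400_000
```

The unsafe version may still pass on a lucky run, which is exactly what makes this a once-every-48-hours bug rather than an immediate one.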
That experience changed how I think about this craft. Writing code is learnable. Debugging is the actual job.
The industry has a narrative problem. We celebrate the builder, not the investigator. GitHub profiles show green squares for commits, not hours spent in a debugger. Job descriptions ask for proficiency in frameworks and languages. Bootcamps teach you to ship features. Technical interviews make you write algorithms on a whiteboard, almost never asking you to diagnose a broken system.
This creates a systematic illusion: that writing code is the core competency, and that debugging is a secondary skill — something you learn passively, almost by accident.
It isn't. Debugging is the primary skill. Writing code is the secondary one.
Here's the uncomfortable math: in any production codebase that has been alive for more than a year, the time spent reading and investigating code dwarfs the time spent writing net-new code. Robert C. Martin puts the reading-to-writing ratio at well over 10:1 in Clean Code, and that figure has been informally confirmed by every senior engineer I've ever worked with. You spend the overwhelming majority of your time understanding systems — yours, your teammates', the third-party library you didn't write and the original author hasn't touched since 2019.
Copy-paste is not a dirty word here. Reusing patterns, reaching for battle-tested libraries, adapting boilerplate from previous projects — these are all rational engineering decisions. The real question is: what do you do when the thing you copied stops working?
Most developers treat debugging as a search: you look for the bug until you find it. That framing is subtly wrong, and it's why junior engineers stay junior for longer than they should.
Debugging is hypothesis-driven investigation. It has more in common with the scientific method than with a grep search. You form a mental model of how the system should behave, you compare it to how the system is behaving, and you systematically narrow the gap between those two things until you find where reality diverges from the model.
This means that the prerequisite to good debugging is good mental modeling. You cannot effectively debug a system you don't understand. And building that understanding — reading code, tracing execution paths, reading documentation (including the parts nobody reads), understanding memory models and concurrency guarantees and network semantics — is itself a skill that takes years to develop.
When I watch a senior engineer debug, it looks almost effortless. They sit down, ask two questions, add one log statement, and say "I think it's probably here." And they're usually right. That isn't magic. It's accumulated model-building. They've seen enough systems that they know where the load-bearing walls are, and where the cracks tend to form.
Not all bugs are equal, and part of becoming a better debugger is knowing which type you're dealing with before you start digging. Misidentifying the category is one of the most expensive mistakes you can make.
Deterministic bugs are the friendly ones. Same input, same broken output, every time. These are usually logic errors — an off-by-one, a wrong condition, a misunderstood API contract. They're annoying but straightforward. Your standard debugger workflow handles them fine.
Heisenbugs are the ones that change behavior when you observe them. The bug disappears when you add a console.log. The race condition that goes away when you run it under a debugger because the debugger's overhead changes the thread timing. These require you to instrument without disturbing — structured logging, metrics, careful use of atomic operations in your investigation code.
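One practical way to instrument without disturbing timing, sketched in Python with the standard library (the logger names are mine): push records onto an in-memory queue on the hot path and do the slow I/O on a separate thread.

```python
import logging
import logging.handlers
import queue

# The hot path logs through a QueueHandler: a near-constant-time
# enqueue, instead of synchronous console/file I/O that can perturb
# thread timing enough to hide a race.
log_queue = queue.Queue(-1)  # unbounded, so the hot path never blocks
logger = logging.getLogger("investigation")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.handlers.QueueHandler(log_queue))

# A QueueListener drains the queue on its own background thread and
# performs the actual emission there, away from the code under study.
listener = logging.handlers.QueueListener(
    log_queue, logging.StreamHandler()
)
listener.start()

logger.debug("worker tick: %s", 3)
listener.stop()  # processes any remaining queued records, then exits
```

This doesn't make the observer effect zero, but it shrinks it by orders of magnitude compared to printing to a console inside the critical section.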
Environment-divergence bugs manifest consistently in one environment but not another. "Works on my machine" is the classic report. They're usually a symptom of something implicit — a dependency version, an environment variable, an OS-level difference in filesystem behavior or case sensitivity.
Schrödinger bugs are the cruelest: the bug is reported, you can't reproduce it, and then it silently goes away. Production log data is your only forensic evidence. This is where good observability infrastructure is the difference between a one-hour investigation and a week of guesswork.
Understanding which category you're in changes your entire strategy.
Let me be direct about something: the GUI debugger your IDE ships with is a good starting point, but it is not sufficient. Here's what a professional debugging practice actually looks like.
Structured logging is not optional. I have seen more production bugs diagnosed through log analysis than through any other method. If you are writing console.log("here") to debug production issues, you have a problem. Use a structured logging library — winston in Node, zerolog in Go, structlog in Python — and emit machine-readable JSON with consistent field names: correlation IDs, request context, durations, error codes. This is what makes log aggregation useful.
```javascript
// Don't do this in production
console.log("user created", userId);

// Do this instead
logger.info("user.created", {
  userId,
  email: user.email,
  durationMs: Date.now() - startTime,
  requestId: ctx.requestId,
});
```
Learn your platform's actual debugger. For Python, that means knowing pdb and ipdb deeply — not just breakpoint(), but post-mortem debugging with pdb.pm() after an uncaught exception. For Node.js, know how to use --inspect and connect Chrome DevTools to a running process. For Go, know dlv (Delve). The mental model of stepping through execution, inspecting memory, and evaluating expressions in-context is irreplaceable.
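The post-mortem pattern is worth internalizing. Here's a sketch (the wrapper is my own naming, not a pdb API): run the code, and if it throws, drop into the debugger inside the frame that raised — locals intact — instead of re-running with breakpoints.

```python
import pdb
import sys

def run_with_postmortem(fn, *args, interactive=False):
    """Call fn(*args); on an uncaught exception, optionally open pdb
    at the exact frame that raised."""
    try:
        return fn(*args)
    except Exception:
        _, _, tb = sys.exc_info()
        if interactive:
            # Equivalent to typing pdb.pm() at the REPL right after a
            # crash: you land in the failing frame and can inspect its
            # live locals, not just read a printed traceback.
            pdb.post_mortem(tb)
        raise

def buggy(items):
    return items[10]  # IndexError whenever items is short

# In an interactive session:
# run_with_postmortem(buggy, [1, 2, 3], interactive=True)
```

The payoff over print-debugging is that you inspect state at the moment of failure rather than guessing which variables to log on the next run.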
Understand the core dump / heap dump workflow. When your process crashes or hangs in production, you often can't reproduce it. Core dumps (on Linux/macOS) and heap dumps (JVM, .NET) give you a snapshot of the process state at the moment of failure. Learning to analyze these — with gdb, jstack, WinDbg, or language-specific tools — is a force multiplier for serious production incidents.
```shell
# Triggering a heap dump from a running JVM process
jmap -dump:format=b,file=heap.hprof <pid>

# Analyzing it with Eclipse MAT or VisualVM
# Look for: large retained heaps, duplicate objects, leaked references
```
Distributed tracing is mandatory in modern systems. In a microservices architecture, a single user request can touch 15 services. A traditional stack trace is useless here. OpenTelemetry has become the de facto standard — instrument your services once with the SDK, ship traces to a backend (Jaeger, Tempo, Honeycomb, Datadog), and suddenly you can follow a request across service boundaries with a single correlation ID.
```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("process-payment") as span:
    span.set_attribute("payment.amount", amount)
    span.set_attribute("payment.currency", currency)
    result = charge_card(card_token, amount)
    span.set_attribute("payment.success", result.success)
```
Bisection is underused. When you have a regression — something worked in version 1.4, it's broken in 1.7, and you don't know where it broke — the right tool is git bisect. It drives a binary search through your commit history: git checks out a midpoint commit, you run your test (or let git bisect run do it for you), and it narrows in on the exact commit that introduced the regression in O(log n) steps.
```shell
git bisect start
git bisect bad HEAD
git bisect good v1.4.0

# git will now check out the midpoint commit
# Run your test, then:
git bisect good   # or git bisect bad

# Repeat until git identifies the culprit commit
```
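The O(log n) claim is just binary search over history. Here's a toy model of what git bisect does under the hood (the commit list and the breakage point are invented for illustration):

```python
def bisect_first_bad(commits, is_bad):
    """Return the index of the first bad commit, assuming history
    starts good and stays bad once broken -- the invariant git bisect
    depends on. Tests only O(log n) commits."""
    lo, hi = 0, len(commits) - 1
    # Mirrors `git bisect good <old>` / `git bisect bad HEAD`:
    assert not is_bad(commits[lo]) and is_bad(commits[hi])
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression landed after mid
    return hi

# 100 commits; suppose the regression landed at commit 73.
history = list(range(100))
culprit = bisect_first_bad(history, lambda c: c >= 73)
assert culprit == 73  # found in ~7 checks instead of ~73
```

This is also why bisect breaks down when history isn't monotonic — flaky tests or commits that don't build violate the good-then-bad invariant, which is what `git bisect skip` exists for.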
Technical tools are only half the equation. The other half is epistemological rigor — being honest with yourself about what you know versus what you're assuming.
The most common debugging mistake I've watched smart people make is confirmation bias. You form a hypothesis early, and then you unconsciously interpret every piece of evidence as supporting it. You explain away the data points that don't fit. You stop when you find something that looks like the cause and call it done, when actually you've only found a symptom.
I have a practice I've found genuinely useful: write down your hypothesis before you test it. Literally write "I believe the bug is in the session invalidation logic because X and Y" before you look at the code. Then test it. If it's wrong, cross it out and write the next hypothesis. This sounds tedious. It prevents you from lying to yourself.
The second discipline is changing one thing at a time. When a system is broken, the temptation is to "fix" multiple things simultaneously. Resist this. You get two outcomes from multi-variable changes: it gets better, and you don't know why; or it doesn't, and you've added noise. Always isolate variables.
The third is knowing when to stop and ask. There's a concept in medicine called the "diagnostic timeout" — a point in an investigation where you've exhausted your differential diagnosis and you need a second opinion before you do something drastic. Engineering has an equivalent. Pride is expensive. The person who's been in the codebase for three years will often point you at the relevant module in thirty seconds. Ask sooner.
I'd be writing a dishonest article if I didn't address this directly. GitHub Copilot, Claude, GPT-4 — these tools have genuinely changed the economics of writing code. A solid LLM can write a working CRUD endpoint, a regex, a data transformation pipeline, faster than most humans.
This has accelerated the shift I've been describing, not reversed it. If AI can write the first draft of your code, the bottleneck has moved decisively toward understanding, validating, and debugging that code. The engineer who can take a Copilot-generated function, reason about its edge cases, identify why it fails under nil inputs, and fix it with a precise understanding of what it was actually doing — that person is more valuable now, not less.
What LLMs are actually terrible at, in my experience, is deep debugging. They hallucinate plausible-sounding root causes. They pattern-match to surface-level symptoms. They confidently suggest fixes that address the wrong layer of the stack. This isn't a dig at the tools — it's a structural limitation of systems trained on code samples, not on the act of reasoning through a broken system in real time.
Your debugging ability is, currently, the part of your job that is hardest to automate. Invest there.
I'm making a strong claim here, and I should be honest about where it's limited.
This thesis is most true for engineers working on complex, long-lived production systems. If you're writing a small automation script, building a personal project, or working in a domain where systems get thrown away after a short lifecycle, the ratio of debugging to writing genuinely shifts. The argument is weakest at the edges.
I'm also not arguing that writing code well doesn't matter. A codebase written with care — with clear boundaries, consistent abstractions, and observable internals — is dramatically easier to debug than a tangled one. The skills compound. But the priority ordering I'm suggesting is: first, learn to read and understand systems deeply; second, learn to investigate and diagnose failures systematically; third, learn to write code that makes those first two activities easier for the next person.
Finally: debugging skills are harder to measure and harder to demonstrate in a hiring process, which creates a perverse incentive for engineers to optimize for the legible signals (frameworks, commits, shipped features) rather than the real ones. I don't have a perfect solution to this. But knowing the incentive is skewed is useful.
If you're in your first two years of professional engineering, this is an argument for where to deliberately invest practice time. Seek out the hairy bugs, not just the greenfield features. Ask to be on the on-call rotation sooner than you think you're ready. When a postmortem happens, read it carefully — not to assign blame, but to understand how the failure mode developed.
If you're a senior engineer mentoring more junior folks, consider what you're modeling. When you debug in front of them, narrate your thought process. Show them the dead ends, not just the solution. The investigation is the lesson.
If you're evaluating engineering candidates, consider adding a debugging component to your process. Hand someone a broken system and watch how they reason through it. You'll learn more about their engineering ability in forty-five minutes than in two hours of algorithm questions.
The craft of programming was never primarily about the syntax, the frameworks, or the ability to produce code quickly. It was always about understanding systems — how they behave, how they fail, and how to reason about the gap between those two things. Everything else is learnable. The investigation is what separates people who write programs from people who understand them.
That race condition took me four days to find. I've never regretted a single hour of it.