Fix: Go fatal error: all goroutines are asleep - deadlock!
Quick Answer
How to fix Go fatal error all goroutines are asleep deadlock caused by unbuffered channels, missing goroutines, WaitGroup misuse, and channel direction errors.
The Crash With the Helpful Error Message
I love this error. It is one of the few runtime panics I have hit that points directly at the bug, names the goroutine, and tells you what synchronization primitive it was waiting on. Most languages let deadlocks just hang. Go’s runtime decides “everyone is waiting on something nobody will send,” kills the program, and prints a stack trace. The first time I encountered it on a production service I was relieved, not frustrated.
Your Go program crashes with:
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan send]:
main.main()
/app/main.go:8 +0x50
Or variations:
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan receive]:
main.main()
/app/main.go:10 +0x68
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [semacquire]:
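sync.runtime_Semacquire(...)
Every goroutine in the program is blocked waiting for something (a channel operation, a mutex, a WaitGroup), and nothing can unblock any of them. Go’s runtime detects this condition and terminates the program. Strictly speaking this is a fatal error rather than an ordinary panic, so recover() cannot intercept it.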
Quick Reference Before You Dive In
If you arrived here from Google with a panic in your terminal, the five facts that resolve roughly 90 percent of the cases I have triaged:
- The fatal panic fires only when ALL goroutines are blocked. If even one goroutine is running (an HTTP server, a time.Sleep loop, a network read), Go reports nothing and the program hangs silently. “Why doesn’t Go see my deadlock” almost always means the program is not actually deadlocked, just leaking goroutines.
- The most common cause is sending to an unbuffered channel with no receiver, or receiving from one with no sender. The [chan send] or [chan receive] annotation in the goroutine state tells you which.
- Unbuffered channels (make(chan T)) are synchronization primitives; buffered (make(chan T, N)) are decoupling. Confusing the two is the single most common deadlock cause in code I review.
- range over a channel blocks until the channel is closed. Forget to call close() and the receiver waits forever. The convention: only the sender closes, and only once.
- WaitGroup deadlocks have a fixed recipe: always call wg.Add() BEFORE launching the goroutine, and put wg.Done() in a defer at the top of the goroutine body. Doing either step backwards races the main goroutine and produces hangs that do not reproduce locally.
The rest of this article walks through each of those in detail, plus the failure modes most other guides skip. The canonical concurrency references are Effective Go (the “Concurrency” section) and the Go Blog post on pipelines and cancellation.
How Go Decides Everything Is Stuck
Go detects deadlocks when all goroutines are blocked. The runtime checks if any goroutine can make progress. If none can, the program is deadlocked and cannot continue. This detection is performed by the runtime scheduler when it has no runnable goroutines left and every parked goroutine is waiting on a synchronization primitive that nothing can ever signal.
The detection is intentionally conservative. The runtime can only see goroutines that exist inside the current process. It cannot see file descriptors, network sockets, kernel timers, or anything outside the Go runtime. That is why a goroutine sitting inside time.Sleep, blocked on a network read, or waiting on a syscall does not count as deadlocked even when the rest of the program is hung. The runtime assumes those operations might eventually return. The deadlock check fires only when literally every goroutine is parked on a channel, mutex, or sync primitive.
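To make the "conservative detection" point concrete, here is a minimal sketch of a program state that is stuck but never reported. The function name hangsSilently and the heartbeat goroutine are made up for illustration; the heartbeat stands in for an HTTP server or any other periodic background work.

```go
package main

import (
	"fmt"
	"time"
)

// hangsSilently reports whether a receive that nothing will ever satisfy is
// still blocked after waitMillis milliseconds. No fatal error is raised in
// the meantime, because the ticker goroutine and the timer below keep the
// runtime from concluding that every goroutine is stuck for good.
func hangsSilently(waitMillis int) bool {
	stop := make(chan struct{})
	defer close(stop) // lets the heartbeat goroutine exit when we return

	// Heartbeat: stands in for an HTTP server or periodic background work.
	go func() {
		t := time.NewTicker(5 * time.Millisecond)
		defer t.Stop()
		for {
			select {
			case <-t.C:
			case <-stop:
				return
			}
		}
	}()

	orphan := make(chan int) // nothing ever sends on this channel
	done := make(chan struct{})
	go func() {
		<-orphan // parks here forever; leaked on purpose for the demo
		close(done)
	}()

	select {
	case <-done:
		return false
	case <-time.After(time.Duration(waitMillis) * time.Millisecond):
		return true // still blocked, and the runtime said nothing
	}
}

func main() {
	fmt.Println("receive still blocked, no fatal error:", hangsSilently(100))
}
```

The goroutine parked on orphan is exactly the kind of bug the detector would catch in a tiny program, yet here it goes unnoticed because other goroutines can still make progress.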
Common causes:
- Sending to an unbuffered channel with no receiver. The sender blocks forever.
- Receiving from a channel with no sender. The receiver blocks forever.
- Forgetting to close a channel. A range loop over a channel blocks forever waiting for more values.
- WaitGroup counter never reaches zero. wg.Wait() blocks because wg.Done() is never called.
- Mutex double-lock. Locking a mutex that is already locked by the same goroutine.
- Circular channel dependencies. Goroutine A waits on channel X, goroutine B waits on channel Y, and they need each other to proceed.
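The circular-dependency case in the last bullet is easiest to see with two unbuffered channels. The sketch below is an illustration only (exchange and its channel names are hypothetical): it works because the two sides perform their operations in opposite order, and the comment marks the one-line change that would turn it into a deadlock.

```go
package main

import "fmt"

// exchange swaps one value between the caller and a helper goroutine over
// two unbuffered channels.
func exchange(a, b int) (gotA, gotB int) {
	x := make(chan int) // caller -> helper
	y := make(chan int) // helper -> caller
	done := make(chan struct{})

	go func() {
		defer close(done)
		gotB = <-x // helper receives first...
		y <- b     // ...then sends
	}()

	// The caller does the opposite: send first, then receive. If it also
	// started with a receive (gotA = <-y), both sides would wait on each
	// other and the runtime would report the deadlock.
	x <- a
	gotA = <-y
	<-done // make sure the helper's write to gotB is visible
	return gotA, gotB
}

func main() {
	a, b := exchange(1, 2)
	fmt.Println(a, b)
}
```

When a cycle like this is hard to untangle, giving one of the channels a buffer of 1 or merging both directions into a single request/response goroutine are the usual ways out.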
Version History That Changes the Failure Mode
Go’s scheduler and channel runtime have evolved in ways that change which deadlock-like bugs surface as the explicit fatal error panic versus silent hangs. Knowing which Go version you are on helps you interpret the symptom.
- Go 1.14 (Feb 2020), asynchronous preemption. Before 1.14, tight loops without function calls could starve the scheduler. A goroutine spinning in a CPU-bound loop would never yield, so other goroutines never got to run. Some “deadlock-shaped” hangs in older code were actually starvation. From 1.14 onward, the runtime can preempt at safe points using signals, so a true all-goroutines-blocked condition is reported as a deadlock rather than disguised as starvation.
- Go 1.18 (Mar 2022), generics. Generics changed how people write channel and worker-pool helpers. Type-parameterized channel utilities (func Recv[T any](ch <-chan T) T) became common, and the older interface{}-based helpers gradually disappeared. The deadlock conditions did not change, but the call stacks in panic output now include generic frames printed as main.send[...] (the runtime elides the type arguments), which can be harder to read.
- Go 1.20 (Feb 2023), errors.Join. Combined with the older context cancellation patterns, this made it easier to thread cancellation through worker goroutines and surface the actual cause of a hang, instead of seeing only the goroutine that happened to be detected first.
- Go 1.21 (Aug 2023), sync.OnceFunc, OnceValue, OnceValues. These helpers replaced a lot of hand-rolled mutex-and-flag initialization patterns. Code written before 1.21 still works, but porting it removes a common source of double-lock and partial-init deadlocks.
- Go 1.22 (Feb 2024), loop variable scoping change. for i := range items { go func() { use(i) }() } now captures a fresh i per iteration. Older code that relied on the old scoping behaviour (intentionally or not) may now finish faster or differently, occasionally exposing existing WaitGroup races that were previously masked.
- Go 1.23+ (2024 onward), improved runtime/trace and GODEBUG=schedtrace. Tooling for diagnosing hangs improved. Use go tool trace on a recent toolchain to see which goroutines are blocked on which channel addresses.
The race detector (go run -race) has been part of Go since 1.1, but its output and overhead improved noticeably in 1.19 and 1.20. If you are on an older toolchain, upgrading just to run -race against suspect code is often worth it.
When to Use Which Fix
The next eight sections cover the eight fixes in detail. Before diving in, the table below maps your symptom (taken from the panic output or the running behavior) to the specific fix I would reach for first.
| Your symptom | Recommended fix | Why |
|---|---|---|
| Panic shows [chan send] and no goroutine was started before the send | Fix 1: receive in a goroutine, or use a buffered channel | The single most common deadlock shape |
| Panic shows [chan receive] and the sending side never calls close() | Fix 2: close the channel when sending is done | range over a channel waits for close |
| Program hangs at wg.Wait() while goroutines look like they finished | Fix 3: defer wg.Done() and call wg.Add() before launching | Counter never reaches zero |
| You need to wait on a channel with a timeout or cancellation | Fix 4: select with time.After or ctx.Done() | Avoids unconditional block on a channel |
| Producer-consumer pipeline that blocks before the consumer is ready | Fix 5: start the consumer goroutine first, or buffer the channel | Unbuffered handshake requires both sides ready at the same instant |
| Mutex acquired twice from the same goroutine, or held across a blocking call | Fix 6: restructure to avoid nested or long-held locks | sync.Mutex is not reentrant |
| Goroutine should exit when its caller is canceled | Fix 7: pass context.Context and select on Done() | The idiomatic cancellation path |
| Suspected deadlock-prone code in tests | Fix 8: go test -race | Catches data races that often accompany deadlock-prone code |
If multiple rows look like they apply, pick the topmost one. The fixes are ordered roughly from “most common cause” to “diagnostic tool you should also be running.”
Fix 1: Fix Unbuffered Channel Sends
An unbuffered channel blocks the sender until a receiver is ready.
Broken: sending with no goroutine to receive:
func main() {
	ch := make(chan int)
	ch <- 42 // Deadlock! No goroutine is receiving
	fmt.Println(<-ch) // Never reached
}
Fixed: send from a goroutine so main can receive:
func main() {
	ch := make(chan int)
	go func() {
		ch <- 42 // Send in a goroutine
	}()
	fmt.Println(<-ch) // Receive in main
}
Fixed: use a buffered channel:
func main() {
	ch := make(chan int, 1) // Buffer size 1
	ch <- 42                // Does not block (buffer has space)
	fmt.Println(<-ch)       // 42
}
A mental model that has saved me hours: I treat unbuffered channels as synchronization primitives and buffered channels as decoupling primitives. An unbuffered make(chan T) requires sender and receiver to handshake at the same instant, which is what I want when the goroutines need to coordinate. A buffered make(chan T, N) lets the sender push up to N values without a receiver waiting, which is what I want for queueing. Picking the wrong one is the most common deadlock cause in code I review.
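The difference between the two kinds of channel can be measured directly. The following sketch (sendsBeforeBlocking is a hypothetical helper, not a standard function) counts how many sends go through when no receiver exists, using a select with a default case as a non-blocking probe.

```go
package main

import "fmt"

// sendsBeforeBlocking counts how many sends succeed on a channel of the given
// capacity when no receiver exists. The select/default probe stops at the
// first send that would block.
func sendsBeforeBlocking(capacity int) int {
	ch := make(chan int, capacity)
	n := 0
	for {
		select {
		case ch <- n:
			n++
		default:
			return n
		}
	}
}

func main() {
	fmt.Println("unbuffered:", sendsBeforeBlocking(0))
	fmt.Println("buffered(3):", sendsBeforeBlocking(3))
}
```

An unbuffered channel accepts zero sends on its own, which is why it behaves as a handshake; a channel with capacity N accepts exactly N before the sender has to wait.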
Fix 2: Fix Channel Range Loops
range over a channel blocks until the channel is closed.
Broken: channel never closed:
func main() {
	ch := make(chan int)
	go func() {
		for i := 0; i < 5; i++ {
			ch <- i
		}
		// Forgot to close(ch)!
	}()
	for v := range ch { // Deadlock! range waits for close(ch) forever
		fmt.Println(v)
	}
}
Fixed: close the channel when done sending:
ch := make(chan int)
go func() {
	for i := 0; i < 5; i++ {
		ch <- i
	}
	close(ch) // Signal that no more values will be sent
}()
for v := range ch {
	fmt.Println(v) // Prints 0-4, then exits the loop
}
Fixed: use a known count instead of range:
// The sender still sends exactly 5 values; no close is needed
for i := 0; i < 5; i++ {
	fmt.Println(<-ch)
}
A channel ownership rule I enforce on my teams: only the sender closes a channel, and only once. The compiler helps only partway: it rejects close() on a receive-only <-chan T, but nothing stops a receiver that holds a bidirectional chan T from closing it, and that produces panics under race conditions that do not reproduce locally. Closing twice always panics. When the sender side is split across multiple goroutines, I introduce a single owning goroutine whose only job is to close, or I use sync.Once. The “who owns this close” question has saved me more panics than any other discipline.
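For the sync.Once option mentioned above, a minimal sketch might look like this. The Stream type and its methods are hypothetical names introduced for illustration, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
)

// Stream is a hypothetical wrapper that owns a channel and guards its close.
type Stream struct {
	C    chan int
	once sync.Once
}

// NewStream creates a Stream whose channel has the given buffer size.
func NewStream(buffer int) *Stream {
	return &Stream{C: make(chan int, buffer)}
}

// Close closes the channel exactly once, no matter how many callers invoke it.
func (s *Stream) Close() {
	s.once.Do(func() { close(s.C) })
}

func main() {
	s := NewStream(1)
	s.C <- 7
	s.Close()
	s.Close() // no-op instead of a "close of closed channel" panic

	v, ok := <-s.C
	fmt.Println(v, ok) // the buffered value is still delivered after close
	_, ok = <-s.C
	fmt.Println(ok) // channel is closed and drained
}
```

Note the limit of this approach: sync.Once prevents a double close, but it does not prevent another goroutine from sending after the close, which still panics. With several active senders, the single closer goroutine that waits for all of them remains the safer design.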
Fix 3: Fix WaitGroup Misuse
sync.WaitGroup deadlocks if Done() is never called.
Broken: Done() not called:
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
	wg.Add(1)
	go func(n int) {
		// Forgot wg.Done()!
		fmt.Println(n)
	}(i)
}
wg.Wait() // Deadlock! Counter never reaches 0
Fixed: use defer wg.Done():
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
	wg.Add(1)
	go func(n int) {
		defer wg.Done() // Always called, even if the function panics
		fmt.Println(n)
	}(i)
}
wg.Wait()
Broken: Add() called inside the goroutine (race condition):
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
	go func(n int) {
		wg.Add(1) // WRONG! Main goroutine might reach Wait() before Add()
		defer wg.Done()
		fmt.Println(n)
	}(i)
}
wg.Wait() // Might return too early
Fixed: always call Add() before launching the goroutine:
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
	wg.Add(1) // Add before starting the goroutine
	go func(n int) {
		defer wg.Done()
		fmt.Println(n)
	}(i)
}
wg.Wait()
Fix 4: Fix Select with Default
Use select to avoid blocking unconditionally on a channel operation.
Non-blocking receive:
ch := make(chan int)
// A plain receive (value := <-ch) blocks until a value arrives,
// and deadlocks if nothing ever sends.
// With select, the default case runs when no value is ready:
select {
case value := <-ch:
	fmt.Println("Received:", value)
default:
	fmt.Println("No value available")
}
Timeout pattern:
select {
case value := <-ch:
	fmt.Println("Received:", value)
case <-time.After(5 * time.Second):
	fmt.Println("Timed out waiting for value")
}
Multiple channels:
select {
case msg := <-msgCh:
	handleMessage(msg)
case err := <-errCh:
	handleError(err)
case <-ctx.Done():
	fmt.Println("Context canceled")
	return
}
Fix 5: Fix Producer-Consumer Patterns
A common pattern that can deadlock if not implemented correctly.
Broken: producer and consumer in the same goroutine:
func main() {
	jobs := make(chan int)
	results := make(chan int)
	// Producer
	for i := 0; i < 5; i++ {
		jobs <- i // Deadlock! No consumer running yet
	}
	close(jobs)
	// Consumer (never reached)
	for j := range jobs {
		results <- j * 2
	}
}
Fixed: start the consumer goroutine first:
func main() {
	jobs := make(chan int, 10) // Buffered
	results := make(chan int, 10)
	// Start consumer goroutine first
	go func() {
		for j := range jobs {
			results <- j * 2
		}
		close(results)
	}()
	// Producer
	for i := 0; i < 5; i++ {
		jobs <- i
	}
	close(jobs)
	// Collect results
	for r := range results {
		fmt.Println(r)
	}
}
Worker pool pattern:
func main() {
	jobs := make(chan int, 100)
	results := make(chan int, 100)
	var wg sync.WaitGroup
	// Start 3 workers
	for w := 0; w < 3; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- j * 2
			}
		}()
	}
	// Send jobs
	for i := 0; i < 10; i++ {
		jobs <- i
	}
	close(jobs)
	// Wait for workers and close results
	go func() {
		wg.Wait()
		close(results)
	}()
	for r := range results {
		fmt.Println(r)
	}
}
Fix 6: Fix Mutex Deadlocks
Go’s sync.Mutex is not reentrant. Locking it twice from the same goroutine deadlocks.
Broken:
var mu sync.Mutex
func doWork() {
	mu.Lock()
	defer mu.Unlock()
	helper() // Calls Lock() again: deadlock!
}
func helper() {
	mu.Lock() // Deadlock! Already locked by doWork()
	defer mu.Unlock()
	// ...
}
Fixed: restructure to avoid nested locks:
func doWork() {
	mu.Lock()
	data := readData()
	mu.Unlock()
	result := processData(data) // No lock held during processing
	mu.Lock()
	writeResult(result)
	mu.Unlock()
}
Fixed: use an inner helper that assumes the lock is already held:
func doWork() {
	mu.Lock()
	defer mu.Unlock()
	helperLocked() // Assumes lock is already held
}
func helperLocked() {
	// Does NOT lock; the caller must hold the lock
	// Document this requirement in a comment
}
Fix 7: Fix Context Cancellation
Use context.Context for proper goroutine lifecycle management:
func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	// Buffered so the goroutine can still deliver its result and exit
	// even if main has already given up waiting
	ch := make(chan string, 1)
	go func() {
		result := longOperation()
		ch <- result
	}()
	select {
	case result := <-ch:
		fmt.Println("Result:", result)
	case <-ctx.Done():
		fmt.Println("Operation timed out:", ctx.Err())
	}
}
Pass context to goroutines:
func worker(ctx context.Context, ch chan<- int) {
	for i := 0; ; i++ {
		select {
		case <-ctx.Done():
			return // Exit when context is canceled
		case ch <- i:
			time.Sleep(100 * time.Millisecond)
		}
	}
}
Fix 8: Use the Race Detector
While the race detector does not detect deadlocks directly, it catches data races that often accompany deadlock-prone code:
go run -race main.go
go test -race ./...
Debug with GOTRACEBACK:
GOTRACEBACK=all go run main.go
# Prints stack traces for all user-created goroutines
Use runtime.NumGoroutine() to monitor goroutine leaks:
fmt.Println("Goroutines:", runtime.NumGoroutine())
Subtle Goroutine Failures I Have Hunted Down
Note: Go only detects deadlocks when all goroutines are blocked. If even one goroutine is running (e.g., a time.Sleep loop, an HTTP server), Go will not detect the deadlock. The program hangs silently instead of panicking.
Use pprof to debug hanging programs:
import (
	"net/http"
	_ "net/http/pprof" // Registers the /debug/pprof/ handlers on the default mux
)
// In main(), before the code that hangs:
go func() {
	http.ListenAndServe(":6060", nil)
}()
// Visit http://localhost:6060/debug/pprof/goroutine?debug=2
// Shows all goroutine stacks
Check for nil channels. A nil channel blocks forever on both send and receive. This is sometimes used intentionally inside select to disable a case, but if you accidentally hit a nil channel outside a select, your goroutine parks and never wakes. Inspect any channel assigned from a function whose error path leaves the channel as the zero value.
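The intentional use of nil channels is worth seeing once, because it makes the accidental case easier to recognize. The sketch below (drainBoth is a hypothetical helper) reads from two channels until both are closed, setting each to nil as it closes so its select case stops firing.

```go
package main

import "fmt"

// drainBoth reads from a and b until both are closed and returns the sum and
// count of everything received.
func drainBoth(a, b <-chan int) (sum, n int) {
	for a != nil || b != nil {
		select {
		case v, ok := <-a:
			if !ok {
				a = nil // a nil channel is never ready, so this case is disabled
				continue
			}
			sum += v
			n++
		case v, ok := <-b:
			if !ok {
				b = nil
				continue
			}
			sum += v
			n++
		}
	}
	return sum, n
}

func main() {
	a := make(chan int, 2)
	b := make(chan int, 1)
	a <- 1
	a <- 2
	b <- 10
	close(a)
	close(b)
	fmt.Println(drainBoth(a, b))
}
```

Without the nil assignment, a closed channel would be ready on every iteration and the loop would spin. The same property is what makes an accidental nil channel outside a select so quiet: the operation is simply never ready.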
Inspect the goroutine count over time. Add runtime.NumGoroutine() to a periodic log line. A steadily climbing number means goroutines are being created faster than they exit. Eventually one of them blocks on the wrong primitive and the runtime reports a deadlock that is really a leak. See Fix: Go goroutine leak for the diagnosis pattern.
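As a small illustration of what a climbing count looks like, the sketch below starts a number of goroutines that can never exit and measures how much runtime.NumGoroutine() grows. goroutineDelta is a made-up name, and the goroutines are leaked deliberately.

```go
package main

import (
	"fmt"
	"runtime"
)

// goroutineDelta starts n goroutines that block on a channel nobody sends on
// and returns how much runtime.NumGoroutine grew as a result.
func goroutineDelta(n int) int {
	before := runtime.NumGoroutine()
	never := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-never }() // parks forever: a deliberate leak
	}
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("goroutines leaked:", goroutineDelta(5))
}
```

In a real service you would log the absolute count on a timer and watch the trend rather than compute a delta around one call.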
Make goroutine count a first-class SLI, and fail tests on leaks. In production the Prometheus client library already exports go_goroutines; alert on its derivative (count rising by more than X/min for several minutes) rather than an absolute threshold, and pair it with process_resident_memory_bytes. When both climb in lockstep you have a parked-channel leak, not a slow cache fill. At PR time, add goleak.VerifyTestMain(m) (from go.uber.org/goleak) to the TestMain of your test packages; it fails the package’s test run if goroutines are still alive after the tests finish, which is the cheapest place to catch the bug.
Check channel direction in function signatures. A function declared as func send(ch chan<- int) (send-only) can send on and close the channel but cannot receive from it, and a function declared as func recv(ch <-chan int) (receive-only) can neither send nor close. Calling close() on a receive-only parameter is a compile error, but more subtle direction confusions can lead to a channel that is never closed by anyone. Audit who owns the close.
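One way to make close ownership visible in signatures is to have the producer create, fill, and close the channel, and hand callers only a receive-only view. The following is a sketch with hypothetical function names (produce, sum).

```go
package main

import "fmt"

// produce owns the channel: it creates it, sends n values, and closes it.
// Callers only ever see a receive-only view.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // the sending side closes, exactly once
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

// sum can only receive: with a <-chan int parameter the compiler rejects
// both a send and a close inside this function.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(produce(4)))
}
```

With this shape there is only one place a close can legally appear, so the "who closes?" audit reduces to reading the producer.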
Look for blocking calls inside a held mutex. A goroutine that holds mu.Lock() and then waits on a channel whose sender must take the same lock before it can send is the textbook circular wait. The deadlock panic will name one of the goroutines, but the root cause is the lock-held-during-blocking-call pattern.
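The usual way out of the lock-held-during-blocking-call pattern is to copy what you need under the lock, release it, and only then block. A minimal sketch, assuming a hypothetical Store type guarded by a mutex:

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical mutex-guarded value.
type Store struct {
	mu  sync.Mutex
	val int
}

func (s *Store) Set(v int) {
	s.mu.Lock()
	s.val = v
	s.mu.Unlock()
}

func (s *Store) Get() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.val
}

// Publish copies the value while holding the lock, releases the lock, and
// only then performs the send, which may block for an arbitrary time.
func (s *Store) Publish(out chan<- int) {
	s.mu.Lock()
	v := s.val
	s.mu.Unlock()
	out <- v
}

// demo updates the store while a Publish call may be parked on its send.
func demo() int {
	s := &Store{}
	s.Set(1)
	out := make(chan int) // unbuffered: Publish blocks until we receive

	go s.Publish(out)
	s.Set(2) // could deadlock here if Publish held mu across the send
	<-out    // receives 1 or 2, depending on scheduling
	return s.Get()
}

func main() {
	fmt.Println(demo())
}
```

The trade-off is that the published value is a snapshot and may be stale by the time it is received, which is usually acceptable and far easier to reason about than a lock held across a channel operation.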
What Other Tutorials Get Wrong About Goroutine Deadlocks
Most tutorials on this error list the right fixes but frame them in misleading ways. The gaps I see most often:
They claim “Go detects all deadlocks.” Go detects the very narrow case where every goroutine in the process is blocked. A program with one running goroutine and a hundred parked ones is hung but not detected. The “Go reports deadlocks for you” framing produces false confidence and sends readers chasing the wrong symptom.
They recommend adding a buffer as a fix without explaining the semantic change. Adding capacity to make(chan T, N) removes the synchronization handshake that the unbuffered version provides. Sometimes the synchronization was load-bearing and the buffer hides a real bug behind apparent success. The right question is “do the sender and receiver need to coordinate at the same instant?”, not “how do I stop the panic?”
They show worker pools without addressing channel-close ownership. A typical “fan-out workers, fan-in results” example has multiple workers writing to a shared results channel. Who closes it? If no one does, the consumer’s range waits forever. If one worker does, the others may panic by writing to a closed channel. The correct pattern is a dedicated goroutine that waits on the worker WaitGroup and then closes the channel. Tutorials that skip this leave the reader to discover it via panic in production.
They use time.Sleep to “wait for goroutines.” Sleep-based synchronization works in toy examples and breaks in CI under load. It also masks the real bug. The right primitives are sync.WaitGroup for “wait for these goroutines to finish” and channels for “wait for this value.” Anyone teaching time.Sleep(100 * time.Millisecond) as a wait-for-goroutine pattern is teaching a bug.
They omit the race detector as the second debugging tool. After reading the deadlock panic, my next step is always go test -race ./... on the suspect package. Data races and deadlocks share root causes (incorrect synchronization), and the race detector catches issues that have not yet manifested as deadlocks but are one timing change away. Tutorials that present the deadlock fix without the race-detector follow-up only solve half the problem.
They confuse goroutine leaks with deadlocks. A goroutine leak (goroutine count climbs without bound) is not a deadlock until all goroutines are blocked. The leak comes first; the deadlock is the eventual consequence. Many “deadlock” articles are actually about leaks and never explain the distinction. See Go goroutine leak for the diagnosis pattern.
Frequently Asked Questions
Why doesn’t Go detect my hanging program’s deadlock?
The runtime detector only fires when literally every goroutine in the process is parked on a synchronization primitive. If you have a running HTTP server, a periodic timer, or any goroutine that the runtime believes “might eventually make progress,” the detector stays silent and the program just hangs. To diagnose a silent hang, use pprof (visit /debug/pprof/goroutine?debug=2 for a stack dump of every goroutine) or send SIGQUIT to the process to print stacks to stderr.
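If exposing an HTTP endpoint is not an option, the same dump can be produced in-process with runtime/pprof. This is a sketch only: dumpGoroutines and sawBlockedReceive are hypothetical helpers, and the polling loop exists just to give the demo goroutine time to park.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// dumpGoroutines returns the stacks of all goroutines, in the same format
// the pprof goroutine profile uses with debug=2.
func dumpGoroutines() string {
	var buf bytes.Buffer
	_ = pprof.Lookup("goroutine").WriteTo(&buf, 2)
	return buf.String()
}

// sawBlockedReceive parks one goroutine on a receive and polls the dump
// until a goroutine in the "chan receive" state shows up.
func sawBlockedReceive() bool {
	never := make(chan int)
	go func() { <-never }() // leaked on purpose for the demonstration

	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) {
		if strings.Contains(dumpGoroutines(), "[chan receive") {
			return true
		}
		time.Sleep(10 * time.Millisecond)
	}
	return false
}

func main() {
	fmt.Println("found a goroutine parked on a receive:", sawBlockedReceive())
}
```

In a hung service you would write the dump to a log or file from a signal handler or admin command, then search it for the [chan send], [chan receive], and [semacquire] states described at the top of this article.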
What is the difference between buffered and unbuffered channels?
An unbuffered channel (make(chan T)) requires sender and receiver to be ready at the same instant. The send blocks until a receive is pending, and vice versa. This is a synchronization primitive: the handshake itself is the value. A buffered channel (make(chan T, N)) lets the sender push up to N values without a receiver waiting. This is a decoupling primitive: the sender and receiver run at independent rates. Pick unbuffered when goroutines need to coordinate; pick buffered when they need to queue.
What is the difference between a goroutine leak and a deadlock?
A leak is a goroutine that should have exited but did not, because its exit condition (a channel receive, a context cancellation) never fires. Leaked goroutines pile up over time and the goroutine count climbs without bound. A deadlock is the specific case where every goroutine in the process is blocked simultaneously and the runtime panics. Leaks typically precede deadlocks: the last running goroutine eventually blocks and the runtime reports the panic, but the underlying cause is the leak. Track runtime.NumGoroutine() over time to catch the leak before it becomes a deadlock.
Who should close a channel?
The sender. Only the sender. And only once. The compiler enforces this only for code that holds a receive-only <-chan T (calling close() on one is a compile error); a receiver holding a bidirectional chan T can still close it, and closing twice always panics. When the sender side is split across multiple goroutines, introduce a single owning goroutine whose only responsibility is to close the channel after waiting on a WaitGroup, or use sync.Once to guard the close.
Why is calling wg.Add() inside the goroutine wrong?
wg.Add(1) increments a counter that wg.Wait() checks. If the main goroutine reaches wg.Wait() before any spawned goroutine has run wg.Add(1), the counter is zero and Wait() returns immediately. The spawned goroutines then keep running with no one waiting for them, and any results they produce are lost. The fix is to call wg.Add() before the go statement that launches the worker, in the goroutine that owns the WaitGroup.
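Put together, the safe ordering looks like the sketch below. sumSquares is a hypothetical example function; the mutex is there only because the goroutines share one accumulator.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares computes 0*0 + 1*1 + ... + (n-1)*(n-1) with one goroutine per term.
func sumSquares(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < n; i++ {
		wg.Add(1) // counted before the goroutine exists, so Wait cannot miss it
		go func(k int) {
			defer wg.Done() // runs even if the body returns early or panics
			mu.Lock()
			total += k * k
			mu.Unlock()
		}(i)
	}
	wg.Wait() // returns only after all n goroutines have called Done
	return total
}

func main() {
	fmt.Println(sumSquares(4))
}
```

Because every Add happens in the launching goroutine before the go statement, there is no window in which Wait can observe a zero counter while work is still pending.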
When should I use select with a default case?
When you want to make the send or receive non-blocking. Without a default, select blocks until one of its cases is ready. With a default, the default branch runs immediately if no case is ready. This is the right primitive for “try to send but skip if no one is listening” patterns, common in metrics or log shipping where dropping a value is preferable to blocking the main loop. It is the wrong primitive for “wait for one of these to be ready” — for that case, omit the default.
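The "try to send but skip if no one is listening" pattern can be sketched in a few lines. trySend is a hypothetical helper name, and the metrics channel is just an example of a destination where dropping is acceptable.

```go
package main

import "fmt"

// trySend attempts a send and reports whether it went through. It never blocks.
func trySend(ch chan<- string, msg string) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false // no receiver ready and no buffer space: drop the value
	}
}

func main() {
	metrics := make(chan string, 1)
	fmt.Println(trySend(metrics, "first"))  // fits in the buffer
	fmt.Println(trySend(metrics, "second")) // buffer full, dropped
}
```

If you use this in production code, count the dropped values somewhere; a silent drop is easy to mistake for a message that was never produced.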
For Go index out of range panics, see Fix: Go panic: runtime error: index out of range. For concurrent map access that causes a different fatal error, see Fix: Go fatal error: concurrent map writes. For context-deadline patterns that interact with channel waits, see Fix: Go context deadline exceeded.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.