Robust HTTP Client Design in Go
Daniel Hayes
Full-Stack Engineer · Leapcell

Introduction
In modern distributed systems, services frequently interact with each other over HTTP. While Go's net/http package provides a robust and efficient http.Client for making these requests, raw usage often falls short of production demands. Network calls are inherently unreliable; they can fail due to transient network issues, server overloads, or unexpected latencies. Without proper safeguards, these failures can propagate throughout a system, leading to cascading outages and a poor user experience. This article delves into how we can wrap Go's standard http.Client to incorporate essential fault-tolerance patterns: retries, timeouts, and circuit breakers. By adopting these strategies, we can significantly improve the resilience and stability of our applications, ensuring reliable communication even in the face of adversity.
Core Concepts Explained
Before diving into the implementation details, let's clarify the core distributed system patterns we'll be discussing:
Timeout: A timeout defines the maximum duration an operation is allowed to take before it is aborted. Its primary purpose is to prevent a client from waiting indefinitely for a response, thus freeing up resources and preventing a backlog of stalled requests. There are generally two types: connection timeouts (for establishing a connection) and request timeouts (for the entire request-response cycle).
Retry: A retry mechanism automatically re-attempts a failed operation, assuming the failure might be transient. It's crucial to implement retries with an exponential backoff strategy and a maximum number of attempts to avoid overwhelming the target service and to give it time to recover. Not all errors are retryable; for instance, a 400 Bad Request won't magically become a 200 on retry.
Circuit Breaker: Inspired by electrical circuit breakers, this pattern prevents an application from repeatedly trying to execute an operation that is bound to fail. When a circuit breaker detects a high rate of failures, it "trips" (opens), immediately failing subsequent calls without attempting the operation. After a predefined interval, it transitions to a "half-open" state, allowing a limited number of test requests to pass through. If these succeed, the circuit "closes" again; otherwise, it returns to the open state. This pattern prevents cascading failures and gives the failing service time to recover.
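To make the timeout distinction concrete, here is a minimal sketch, assuming a local endpoint, of where each kind of timeout is configured on a standard http.Client: the dialer and TLS handshake timeouts bound connection establishment, Client.Timeout bounds the entire request-response cycle, and a context deadline can tighten the limit for an individual request.

package main

import (
	"context"
	"log"
	"net"
	"net/http"
	"time"
)

func main() {
	client := &http.Client{
		Timeout: 10 * time.Second, // Request timeout: the entire request-response cycle
		Transport: &http.Transport{
			DialContext: (&net.Dialer{
				Timeout: 2 * time.Second, // Connection timeout: establishing the TCP connection
			}).DialContext,
			TLSHandshakeTimeout: 2 * time.Second,
		},
	}

	// A per-request deadline can further tighten the limit for a single call.
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:8080/api/data", nil)
	if err != nil {
		log.Fatal(err)
	}
	resp, err := client.Do(req)
	if err != nil {
		log.Printf("request failed: %v", err)
		return
	}
	defer resp.Body.Close()
	log.Printf("status: %d", resp.StatusCode)
}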
Building a Resilient HTTP Client
Our goal is to create a decorator for http.Client that seamlessly integrates these fault-tolerance features. Let's design a custom HTTPClient interface and its implementation.
Initial Setup and Basic Client
First, we define a simple interface for our HTTP client to allow for easy mocking and testing.
package resilientclient

import (
	"net/http"
	"time"
)

// HTTPClient defines the contract for our client.
type HTTPClient interface {
	Do(req *http.Request) (*http.Response, error)
}

// defaultClient wraps the standard http.Client.
type defaultClient struct {
	client *http.Client
}

// NewDefaultClient creates a new default client with a request timeout.
func NewDefaultClient(timeout time.Duration) HTTPClient {
	return &defaultClient{
		client: &http.Client{
			Timeout: timeout, // Basic request timeout for the whole request-response cycle
			Transport: &http.Transport{
				MaxIdleConns:        100,
				IdleConnTimeout:     90 * time.Second,
				TLSHandshakeTimeout: 10 * time.Second,
				// More transport options can be added here
			},
		},
	}
}

// Do implements the HTTPClient interface for defaultClient.
func (c *defaultClient) Do(req *http.Request) (*http.Response, error) {
	return c.client.Do(req)
}
Here, we've already introduced a basic request timeout via http.Client.Timeout. This is a good starting point for preventing indefinite waits.
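Because callers depend only on the HTTPClient interface, tests can swap in a stub without touching the network. A minimal sketch, assuming an in-package test file; mockClient is an illustrative name, not part of the package:

package resilientclient

import (
	"io"
	"net/http"
	"strings"
	"testing"
)

// mockClient is a hypothetical stub that satisfies the HTTPClient interface.
type mockClient struct {
	status int
	body   string
}

func (m *mockClient) Do(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: m.status,
		Body:       io.NopCloser(strings.NewReader(m.body)),
		Header:     make(http.Header),
		Request:    req,
	}, nil
}

func TestHTTPClientInterface(t *testing.T) {
	var client HTTPClient = &mockClient{status: http.StatusOK, body: `{"ok":true}`}

	req, _ := http.NewRequest(http.MethodGet, "http://example.com", nil)
	resp, err := client.Do(req)
	if err != nil || resp.StatusCode != http.StatusOK {
		t.Fatalf("unexpected result: %v, %v", resp, err)
	}
	resp.Body.Close()
}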
Implementing Retries
Retries will wrap the Do method. We'll introduce a RetryClient that takes another HTTPClient as an argument.
package resilientclient

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

// RetryConfig holds parameters for retry logic.
type RetryConfig struct {
	MaxRetries   int
	InitialDelay time.Duration
	MaxDelay     time.Duration
	// ShouldRetry decides whether a given response/error combination is retryable.
	ShouldRetry func(*http.Response, error) bool
}

// RetryClient provides retry logic for HTTP requests.
type RetryClient struct {
	delegate HTTPClient
	config   RetryConfig
}

// NewRetryClient creates a new client with retry capabilities.
func NewRetryClient(delegate HTTPClient, config RetryConfig) *RetryClient {
	if config.MaxRetries == 0 {
		config.MaxRetries = 3 // Default number of attempts
	}
	if config.InitialDelay == 0 {
		config.InitialDelay = 100 * time.Millisecond // Default initial delay
	}
	if config.MaxDelay == 0 {
		config.MaxDelay = 5 * time.Second // Default max delay
	}
	if config.ShouldRetry == nil {
		config.ShouldRetry = func(resp *http.Response, err error) bool {
			if err != nil {
				return true // Network errors are usually retryable
			}
			// Status codes indicating server-side issues (5xx)
			return resp.StatusCode >= 500
		}
	}
	return &RetryClient{
		delegate: delegate,
		config:   config,
	}
}

func (c *RetryClient) Do(req *http.Request) (*http.Response, error) {
	var (
		resp  *http.Response
		err   error
		delay = c.config.InitialDelay
	)

	for i := 0; i < c.config.MaxRetries; i++ {
		// Important: for requests with a body, the body reader is consumed by the
		// first attempt, so it must be rewound (or rebuilt) before every retry.
		if req.Body != nil {
			if seeker, ok := req.Body.(io.Seeker); ok {
				if _, seekErr := seeker.Seek(0, io.SeekStart); seekErr != nil {
					return nil, fmt.Errorf("failed to seek request body: %w", seekErr)
				}
			} else {
				// If not seekable, buffer the body in memory and replace it with a
				// replayable reader. This is not ideal for large payloads; prefer
				// building the request from a rewindable reader if retries are expected.
				bodyBytes, readErr := io.ReadAll(req.Body)
				if readErr != nil {
					return nil, fmt.Errorf("failed to read request body for retry: %w", readErr)
				}
				req.Body = io.NopCloser(bytes.NewReader(bodyBytes))
			}
		}

		resp, err = c.delegate.Do(req)

		if c.config.ShouldRetry(resp, err) {
			if i == c.config.MaxRetries-1 {
				break // Out of attempts; return the last response/error as-is
			}
			// Drain and close the failed response body so the connection can be reused.
			if resp != nil {
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
			}
			log.Printf("Request failed (attempt %d/%d), retrying in %v. Error: %v",
				i+1, c.config.MaxRetries, delay, err)
			time.Sleep(delay)
			delay = time.Duration(float64(delay) * 2) // Exponential backoff
			if delay > c.config.MaxDelay {
				delay = c.config.MaxDelay
			}
			continue
		}
		return resp, err // Success or non-retryable error
	}
	return resp, err // All attempts exhausted; return the last response/error
}
Important consideration for request bodies: when retrying requests that send a body (e.g., POST, PUT), the req.Body (an io.ReadCloser) is consumed by the first Do call, so subsequent retries would send an empty body and produce incorrect requests. The code above handles this by seeking back to the start of the body when it implements io.Seeker, and otherwise by reading the entire body into memory and replacing it with a replayable NopCloser. For large payloads, buffering in memory is inefficient, so it is preferable to build the request with a rewindable body (for example from a bytes.Reader) so retries can simply seek back to the beginning.
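One way to give the retry layer a rewindable body is a small wrapper that keeps the Seek method visible through req.Body. This is a minimal sketch under that assumption; seekableBody and newPostRequest are illustrative helpers, not part of the package:

package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// seekableBody wraps a bytes.Reader so that req.Body satisfies io.ReadCloser
// while keeping Seek visible to the retry logic's io.Seeker type assertion.
type seekableBody struct {
	*bytes.Reader
}

func (seekableBody) Close() error { return nil }

// newPostRequest is an illustrative helper that builds a POST request with a
// rewindable body.
func newPostRequest(url string, payload []byte) (*http.Request, error) {
	body := seekableBody{bytes.NewReader(payload)}
	req, err := http.NewRequest(http.MethodPost, url, body)
	if err != nil {
		return nil, err
	}
	// Because seekableBody already implements io.ReadCloser, http.NewRequest
	// uses it as-is rather than hiding it behind a NopCloser.
	req.ContentLength = int64(len(payload))
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newPostRequest("http://localhost:8080/api/data", []byte(`{"name":"demo"}`))
	if err != nil {
		panic(err)
	}
	_, seekable := req.Body.(io.Seeker)
	fmt.Println("body is seekable:", seekable) // true: RetryClient can rewind it in place
}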
Implementing Circuit Breaker
For circuit breakers, we can leverage a mature library like sony/gobreaker. This library provides a robust implementation of the circuit breaker pattern.
package resilientclient

import (
	"errors"
	"fmt"
	"net/http"
	"time"

	"github.com/sony/gobreaker"
)

// CircuitBreakerClient wraps an HTTPClient with circuit breaking logic.
type CircuitBreakerClient struct {
	delegate HTTPClient
	breaker  *gobreaker.CircuitBreaker
}

// NewCircuitBreakerClient creates a new client with circuit breaker capabilities.
func NewCircuitBreakerClient(delegate HTTPClient, settings gobreaker.Settings) *CircuitBreakerClient {
	if settings.Name == "" {
		settings.Name = "default-circuit-breaker"
	}
	if settings.Timeout == 0 {
		settings.Timeout = 60 * time.Second // How long to stay 'open' before attempting 'half-open'
	}
	if settings.MaxRequests == 0 {
		settings.MaxRequests = 1 // Allow 1 request in the 'half-open' state
	}
	if settings.Interval == 0 {
		settings.Interval = 5 * time.Second // How often counts are reset in the 'closed' state
	}
	if settings.ReadyToTrip == nil {
		settings.ReadyToTrip = func(counts gobreaker.Counts) bool {
			failureRatio := float64(counts.TotalFailures) / float64(counts.Requests)
			// Trip if there have been at least 3 requests and 60% of them failed
			return counts.Requests >= 3 && failureRatio >= 0.6
		}
	}
	return &CircuitBreakerClient{
		delegate: delegate,
		breaker:  gobreaker.NewCircuitBreaker(settings),
	}
}

func (c *CircuitBreakerClient) Do(req *http.Request) (*http.Response, error) {
	// Execute calls the provided function if the breaker is closed or half-open.
	// If the breaker is open, it immediately returns gobreaker.ErrOpenState.
	result, err := c.breaker.Execute(func() (interface{}, error) {
		resp, err := c.delegate.Do(req)
		if err != nil {
			// Errors from the delegate (network errors, timeouts) count as failures.
			return nil, err
		}
		// Treat server-side errors (5xx) as failures as well. We return the response
		// alongside the error so the caller can still inspect (and close) the body.
		if resp.StatusCode >= 500 {
			return resp, fmt.Errorf("server error: %d", resp.StatusCode)
		}
		return resp, nil
	})

	if err != nil {
		if errors.Is(err, gobreaker.ErrOpenState) || errors.Is(err, gobreaker.ErrTooManyRequests) {
			return nil, fmt.Errorf("circuit breaker rejected request: %w", err)
		}
		// For server errors we returned the response as the result, so hand it
		// back to the caller together with the error.
		if resp, ok := result.(*http.Response); ok && resp != nil {
			return resp, err
		}
		return nil, err
	}
	return result.(*http.Response), nil
}
The gobreaker library handles the state transitions (closed, open, half-open) and failure counting automatically. We configure it with gobreaker.Settings to define thresholds for tripping and recovery. The Execute method takes a function that performs the actual operation and returns (interface{}, error); gobreaker uses the returned error to determine whether the operation failed.
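gobreaker.Settings also accepts an OnStateChange callback, which is useful for logging or metrics when the breaker trips and recovers. A minimal sketch of wiring it up (the service name and log format are illustrative):

package main

import (
	"log"
	"time"

	"github.com/sony/gobreaker"
)

func main() {
	settings := gobreaker.Settings{
		Name:    "ExternalService",
		Timeout: 30 * time.Second,
		// Log every state transition (closed -> open -> half-open -> closed).
		OnStateChange: func(name string, from gobreaker.State, to gobreaker.State) {
			log.Printf("circuit breaker %q changed state: %s -> %s", name, from, to)
		},
	}
	cb := gobreaker.NewCircuitBreaker(settings)

	// Execute wraps an arbitrary operation; errors returned by the function
	// are counted as failures by the breaker.
	_, err := cb.Execute(func() (interface{}, error) {
		return nil, nil // Placeholder for a real call
	})
	if err != nil {
		log.Printf("call failed: %v", err)
	}
}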
Chaining Them Together
The beauty of this decorator pattern is that we can chain these clients together. A typical setup is a CircuitBreakerClient wrapping a RetryClient wrapping the base defaultClient. This ensures that:
- Before even trying, the circuit breaker checks if the remote service is likely down.
- If the circuit is closed (or half-open), the request proceeds to the retry logic.
- The retry logic handles transient failures with backoff.
- The underlying defaultClient handles the actual HTTP request with its base timeout.
package main

import (
	"io"
	"log"
	"net/http"
	"time"

	"github.com/sony/gobreaker"

	"your_module_path/resilientclient" // Assuming resilientclient is a subpackage of your module
)

func main() {
	// 1. Create the base client with a default timeout
	baseClient := resilientclient.NewDefaultClient(5 * time.Second)

	// 2. Wrap with retry logic
	retryConfig := resilientclient.RetryConfig{
		MaxRetries:   5,
		InitialDelay: 200 * time.Millisecond,
		MaxDelay:     10 * time.Second,
		ShouldRetry: func(resp *http.Response, err error) bool {
			if err != nil {
				return true // Retry on network errors
			}
			// Only retry on rate limiting or server errors
			return resp.StatusCode == http.StatusTooManyRequests ||
				resp.StatusCode >= http.StatusInternalServerError
		},
	}
	retryClient := resilientclient.NewRetryClient(baseClient, retryConfig)

	// 3. Wrap with circuit breaker
	cbSettings := gobreaker.Settings{
		Name:        "ExternalService",
		MaxRequests: 3,                // Allow 3 requests in the half-open state
		Interval:    5 * time.Second,  // Reset counts every 5 seconds
		Timeout:     30 * time.Second, // Stay open for 30 seconds before a half-open attempt
		ReadyToTrip: func(counts gobreaker.Counts) bool {
			failureRatio := float64(counts.TotalFailures) / float64(counts.Requests)
			return counts.Requests >= 10 && failureRatio >= 0.3 // Trip once 30% of at least 10 requests fail
		},
	}
	resilientHTTPClient := resilientclient.NewCircuitBreakerClient(retryClient, cbSettings)

	// Example usage
	for i := 0; i < 20; i++ {
		req, err := http.NewRequest("GET", "http://localhost:8080/api/data", nil)
		if err != nil {
			log.Fatalf("Error creating request: %v", err)
		}

		log.Printf("Making request %d...", i+1)
		resp, err := resilientHTTPClient.Do(req)
		if err != nil {
			log.Printf("Request %d ERROR: %v", i+1, err)
			time.Sleep(500 * time.Millisecond) // Simulate some delay between calls
			continue
		}

		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close() // Close explicitly instead of defer, since we are in a loop
		log.Printf("Request %d SUCCESS: Status %d, Body: %s", i+1, resp.StatusCode, string(body))
		time.Sleep(500 * time.Millisecond)
	}
}
This main function demonstrates how to construct and use our resilientHTTPClient. You would typically replace http://localhost:8080/api/data with your actual service endpoint.
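If you want to exercise the whole chain without a real backend, you can point it at an httptest server that fails intermittently. A minimal sketch under that assumption (the flaky handler and module path are illustrative):

package main

import (
	"log"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"

	"github.com/sony/gobreaker"

	"your_module_path/resilientclient"
)

func main() {
	var calls atomic.Int64

	// A flaky test server: every third request succeeds, the rest return 500.
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if calls.Add(1)%3 == 0 {
			w.Write([]byte(`{"data":"ok"}`))
			return
		}
		http.Error(w, "boom", http.StatusInternalServerError)
	}))
	defer server.Close()

	base := resilientclient.NewDefaultClient(2 * time.Second)
	retrying := resilientclient.NewRetryClient(base, resilientclient.RetryConfig{MaxRetries: 3})
	client := resilientclient.NewCircuitBreakerClient(retrying, gobreaker.Settings{Name: "flaky"})

	for i := 0; i < 5; i++ {
		req, _ := http.NewRequest(http.MethodGet, server.URL, nil)
		resp, err := client.Do(req)
		if err != nil {
			log.Printf("request %d failed: %v", i+1, err)
			continue
		}
		log.Printf("request %d succeeded: %d", i+1, resp.StatusCode)
		resp.Body.Close()
	}
}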
Application Scenarios
This pattern is invaluable for:
- Microservices Communication: Ensuring service-to-service calls remain stable despite network hiccups or temporary overload of dependencies.
- External API Integrations: Reliably consuming third-party APIs that might be rate-limited or occasionally unstable.
- Database Interactions (indirectly): While http.Client isn't for direct database interaction, if a service exposes an HTTP API that fronts a database, these patterns protect callers against database-related service failures.
Conclusion
By programmatically wrapping Go's standard http.Client with retry, timeout, and circuit breaker logic, we've transformed a basic HTTP client into a production-ready, fault-tolerant component. This layered approach, using the decorator pattern, keeps concerns separate and makes our code more maintainable and robust. Implementing these patterns is not just about handling errors; it's about building resilient systems that gracefully degrade and recover, ensuring continuous operation even in challenging distributed environments. Robustness is not optional; it is fundamental to reliable distributed systems.