Decoding the Intricacies of JSON with json.RawMessage and Custom Unmarshaling
Takashi Yamamoto
Infrastructure Engineer · Leapcell

Unraveling Complex JSON Structures in Go
The world of modern software development is deeply intertwined with JSON. It's the lingua franca for data exchange across diverse systems, from web APIs to configuration files. While Go's encoding/json package offers powerful and convenient tools for marshaling and unmarshaling JSON, developers often encounter scenarios where the structure of incoming JSON is dynamic, polymorphic, or simply too complex for standard automatic decoding. This is where json.RawMessage and custom UnmarshalJSON implementations become indispensable. They let you defer parsing, inspect raw data, and apply tailored logic, so that data processing stays robust even when the incoming JSON is unpredictable.
The Go-JSON Toolkit for Advanced Parsing
Before diving into the advanced techniques, let's briefly touch upon the core concepts that underpin Go's JSON handling.
JSON (JavaScript Object Notation): A lightweight data-interchange format designed to be easily readable by humans and easily parsed by machines. It builds on two main structures: a collection of name/value pairs (objects) and an ordered list of values (arrays).
encoding/json package: Go's standard library package for encoding and decoding JSON. It provides functions like json.Marshal and json.Unmarshal for converting Go structs to JSON and vice-versa.
json.Unmarshal: The function responsible for decoding JSON data into a Go value. By default, it uses reflection to match JSON fields to struct fields based on names (or json struct tags).
json.RawMessage: This is the star of our show. json.RawMessage is defined as type RawMessage []byte. It's a byte slice that specifically holds raw, unparsed JSON data. When Unmarshal encounters a json.RawMessage field, it treats the corresponding JSON value as a byte slice without attempting to further decode it. This means you can store an entire JSON object, array, string, number, boolean, or null as a raw byte slice within your struct.
json.Unmarshaler interface: This interface defines a single method: UnmarshalJSON([]byte) error. When json.Unmarshal decodes a value into a type that implements this interface, it calls the UnmarshalJSON method with the raw JSON data for that specific field (or the entire object if implemented at the struct level). This gives developers complete control over the decoding process.
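The last two definitions can be sketched together in a few lines. The example below is a minimal illustration, not part of the original article: Timeout, Job, and decodeJob are hypothetical names, and the choice of a duration string is only an assumption made to give UnmarshalJSON something to parse.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Timeout is a hypothetical type that decodes from a JSON string such as "1.5s".
type Timeout time.Duration

// UnmarshalJSON implements json.Unmarshaler. It receives the raw bytes of
// the JSON value for this field, surrounding quotes included.
func (t *Timeout) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return fmt.Errorf("timeout must be a JSON string: %w", err)
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return fmt.Errorf("invalid timeout %q: %w", s, err)
	}
	*t = Timeout(d)
	return nil
}

// Job is a hypothetical message. Timeout is decoded by the method above,
// while Payload is stored as raw, undecoded bytes.
type Job struct {
	Timeout Timeout         `json:"timeout"`
	Payload json.RawMessage `json:"payload"`
}

// decodeJob is a small helper used for illustration.
func decodeJob(data string) (Job, error) {
	var j Job
	err := json.Unmarshal([]byte(data), &j)
	return j, err
}

func main() {
	j, err := decodeJob(`{"timeout":"1.5s","payload":{"k":[1,2]}}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(time.Duration(j.Timeout)) // the parsed duration
	fmt.Println(string(j.Payload))        // the payload bytes, still undecoded
}
```

Note that the custom method sits on the field's type, so any struct that uses Timeout gets the same decoding without extra code.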
The Power of json.RawMessage
Let's imagine a scenario where you're consuming an API that returns a list of events. Each event has some common fields like id and timestamp, but the details field can vary significantly based on the type of the event.
Without json.RawMessage, you might be forced to define a very large struct with many optional fields, or to decode into a generic map[string]interface{} and pull values out with type assertions. Both approaches are cumbersome and error-prone.
Here's how json.RawMessage simplifies this:
package main

import (
	"encoding/json"
	"fmt"
)

// Event represents a generic event structure.
type Event struct {
	ID        string          `json:"id"`
	Timestamp int64           `json:"timestamp"`
	Type      string          `json:"type"`
	Details   json.RawMessage `json:"details"` // RawMessage to hold varying detail structures
}

// LoginDetails represents details for a "login" event.
type LoginDetails struct {
	Username  string `json:"username"`
	IPAddress string `json:"ip_address"`
}

// PurchaseDetails represents details for a "purchase" event.
type PurchaseDetails struct {
	ProductID string  `json:"product_id"`
	Amount    float64 `json:"amount"`
	Currency  string  `json:"currency"`
}

func main() {
	jsonBlob := `
[
	{
		"id": "evt-123",
		"timestamp": 1678886400,
		"type": "login",
		"details": {
			"username": "alice",
			"ip_address": "192.168.1.100"
		}
	},
	{
		"id": "evt-456",
		"timestamp": 1678886500,
		"type": "purchase",
		"details": {
			"product_id": "prod-A",
			"amount": 99.99,
			"currency": "USD"
		}
	},
	{
		"id": "evt-789",
		"timestamp": 1678886600,
		"type": "logout",
		"details": "user logged out successfully"
	}
]`

	var events []Event
	err := json.Unmarshal([]byte(jsonBlob), &events)
	if err != nil {
		fmt.Println("Error unmarshaling events:", err)
		return
	}

	for _, event := range events {
		fmt.Printf("Event ID: %s, Type: %s\n", event.ID, event.Type)

		switch event.Type {
		case "login":
			var loginDetails LoginDetails
			if err := json.Unmarshal(event.Details, &loginDetails); err != nil {
				fmt.Println(" Error unmarshaling login details:", err)
				continue
			}
			fmt.Printf(" Login Details: Username=%s, IP=%s\n", loginDetails.Username, loginDetails.IPAddress)
		case "purchase":
			var purchaseDetails PurchaseDetails
			if err := json.Unmarshal(event.Details, &purchaseDetails); err != nil {
				fmt.Println(" Error unmarshaling purchase details:", err)
				continue
			}
			fmt.Printf(" Purchase Details: ProductID=%s, Amount=%.2f %s\n", purchaseDetails.ProductID, purchaseDetails.Amount, purchaseDetails.Currency)
		case "logout":
			var message string
			if err := json.Unmarshal(event.Details, &message); err != nil {
				fmt.Println(" Error unmarshaling logout message:", err)
				continue
			}
			fmt.Printf(" Logout Message: %s\n", message)
		default:
			fmt.Printf(" Unhandled event type, raw details: %s\n", string(event.Details))
		}
		fmt.Println("---")
	}
}
In this example, Event.Details is a json.RawMessage. The initial json.Unmarshal parses the common fields (ID, Timestamp, Type) and stores the details value as raw bytes without decoding it. We then inspect the Type field and unmarshal event.Details into the matching concrete type. Because the raw bytes are preserved, an event type the code does not recognize is not an error: the default branch still has the original details available to log or forward.
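As the number of event types grows, the switch statement can be replaced by a lookup table. The following is a rough sketch of that idea rather than a prescribed design: detailFactories and decodeDetails are hypothetical names, the Event struct is trimmed to the fields the sketch needs, and the any keyword assumes Go 1.18 or later.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event mirrors the struct from the example above, minus the timestamp.
type Event struct {
	ID      string          `json:"id"`
	Type    string          `json:"type"`
	Details json.RawMessage `json:"details"`
}

type LoginDetails struct {
	Username  string `json:"username"`
	IPAddress string `json:"ip_address"`
}

type PurchaseDetails struct {
	ProductID string  `json:"product_id"`
	Amount    float64 `json:"amount"`
	Currency  string  `json:"currency"`
}

// detailFactories is a hypothetical registry. Each entry returns a pointer
// to a fresh zero value that the raw details can be decoded into.
var detailFactories = map[string]func() any{
	"login":    func() any { return new(LoginDetails) },
	"purchase": func() any { return new(PurchaseDetails) },
	"logout":   func() any { return new(string) },
}

// decodeDetails looks up the target type for e.Type and decodes the
// deferred raw bytes into it.
func decodeDetails(e Event) (any, error) {
	newTarget, ok := detailFactories[e.Type]
	if !ok {
		return nil, fmt.Errorf("event %s: unknown type %q", e.ID, e.Type)
	}
	target := newTarget()
	if err := json.Unmarshal(e.Details, target); err != nil {
		return nil, fmt.Errorf("event %s: decoding %s details: %w", e.ID, e.Type, err)
	}
	return target, nil
}

func main() {
	raw := `{"id":"evt-123","type":"login","details":{"username":"alice","ip_address":"192.168.1.100"}}`
	var e Event
	if err := json.Unmarshal([]byte(raw), &e); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	details, err := decodeDetails(e)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Callers recover the concrete type with a type switch.
	switch d := details.(type) {
	case *LoginDetails:
		fmt.Println("login by", d.Username)
	case *PurchaseDetails:
		fmt.Println("purchase of", d.ProductID)
	case *string:
		fmt.Println("message:", *d)
	}
}
```

Adding a new event type then means adding one map entry instead of another case branch; the trade-off is that callers get an any value back and must type-switch on it.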
Implementing Custom UnmarshalJSON
While json.RawMessage is excellent for deferring parsing, custom UnmarshalJSON implementations provide an even finer grain of control, allowing you to manipulate the JSON data before, during, or after decoding, or even decode into different types based on specific conditions.
Consider a scenario where a User's Age might be represented as an integer or a string (e.g., "unknown"). Standard Unmarshal would fail if it expected an int but received a string.
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

type User struct {
	Name string
	Age  int // We want Age to always be an int, even if it arrives as the string "unknown"
}

// CustomUnmarshalerUser has the same fields as User and carries the
// custom UnmarshalJSON method.
type CustomUnmarshalerUser User

func (u *CustomUnmarshalerUser) UnmarshalJSON(data []byte) error {
	// Decode into a temporary struct that has no UnmarshalJSON method, so this
	// call cannot recurse. Keeping Age as json.RawMessage lets us inspect it.
	type TempUser struct {
		Name string          `json:"name"`
		Age  json.RawMessage `json:"age"` // Holds age as raw bytes
	}

	var temp TempUser
	if err := json.Unmarshal(data, &temp); err != nil {
		return err
	}
	u.Name = temp.Name

	// A missing field leaves temp.Age empty; an explicit "age": null
	// arrives as the literal bytes null, not as a nil slice.
	if len(temp.Age) == 0 || string(temp.Age) == "null" {
		u.Age = 0 // Or some default value
		return nil
	}

	// Try to unmarshal as an int
	var ageInt int
	if err := json.Unmarshal(temp.Age, &ageInt); err == nil {
		u.Age = ageInt
		return nil
	}

	// If it failed as an int, try unmarshaling as a string
	var ageStr string
	if err := json.Unmarshal(temp.Age, &ageStr); err != nil {
		return fmt.Errorf("age field is neither a number nor a recognized string: %s", temp.Age)
	}
	if ageStr == "unknown" || ageStr == "" {
		u.Age = 0 // Represent "unknown" or empty string as 0
		return nil
	}

	// Try to parse the string as an int if it's a number in string form
	parsedAge, err := strconv.Atoi(ageStr)
	if err != nil {
		// Reject other unexpected string values
		return fmt.Errorf("could not parse age string %q: %w", ageStr, err)
	}
	u.Age = parsedAge
	return nil
}

func main() {
	jsonUsers := []string{
		`{"name": "Alice", "age": 30}`,
		`{"name": "Bob", "age": "unknown"}`,
		`{"name": "Charlie", "age": "25"}`, // Age as a string number
		`{"name": "David", "age": null}`,
		`{"name": "Eve"}`,                    // Age missing
		`{"name": "Frank", "age": "thirty"}`, // Invalid string
	}

	for i, j := range jsonUsers {
		var user CustomUnmarshalerUser
		err := json.Unmarshal([]byte(j), &user)
		if err != nil {
			fmt.Printf("User %d: Error unmarshaling: %v\n", i+1, err)
			continue
		}
		fmt.Printf("User %d: Name: %s, Age: %d\n", i+1, user.Name, user.Age)
	}
}
In this enhanced example, UnmarshalJSON for CustomUnmarshalerUser first unmarshals the top-level fields using a temporary struct where Age is a json.RawMessage. This keeps encoding/json from decoding Age directly into an int, which would fail as soon as the value is a string. We then inspect the raw bytes of temp.Age and attempt to unmarshal them as an int, then as a string, handling null, "unknown", and even string-encoded numbers gracefully. Values that fit none of these cases, such as "thirty", are reported as errors rather than silently coerced. This demonstrates powerful error recovery and data standardization.
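The same logic can also live on the field's type instead of the enclosing struct, which makes it reusable across structs. The sketch below is one possible arrangement, not the article's own code: FlexInt and decodeUser are hypothetical names, and mapping null, "" and "unknown" to 0 simply follows the convention used above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strconv"
)

// FlexInt is a hypothetical integer type that accepts a JSON number,
// a numeric string, "unknown", an empty string, or null.
type FlexInt int

func (f *FlexInt) UnmarshalJSON(data []byte) error {
	// An explicit null is passed to UnmarshalJSON as the bytes "null".
	if bytes.Equal(data, []byte("null")) {
		*f = 0
		return nil
	}

	// First attempt: a plain JSON integer.
	var n int
	if err := json.Unmarshal(data, &n); err == nil {
		*f = FlexInt(n)
		return nil
	}

	// Second attempt: a JSON string.
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return fmt.Errorf("expected a number or a string, got %s", data)
	}
	if s == "" || s == "unknown" {
		*f = 0
		return nil
	}
	parsed, err := strconv.Atoi(s)
	if err != nil {
		return fmt.Errorf("could not parse %q as an integer: %w", s, err)
	}
	*f = FlexInt(parsed)
	return nil
}

// User needs no UnmarshalJSON of its own; the field type does the work.
type User struct {
	Name string  `json:"name"`
	Age  FlexInt `json:"age"`
}

// decodeUser is a small helper used for illustration.
func decodeUser(data string) (User, error) {
	var u User
	err := json.Unmarshal([]byte(data), &u)
	return u, err
}

func main() {
	inputs := []string{
		`{"name":"Alice","age":30}`,
		`{"name":"Charlie","age":"25"}`,
		`{"name":"Frank","age":"thirty"}`,
	}
	for _, in := range inputs {
		u, err := decodeUser(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u.Name, int(u.Age))
	}
}
```

When the field is missing entirely, UnmarshalJSON is never called and Age keeps its zero value, so the missing case needs no special handling here.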
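When a struct needs a custom UnmarshalJSON but should still get the default decoding for most of its fields, a common pattern is to define a second type with the same underlying struct (usually called an alias, although it is a new defined type rather than a Go type alias):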
type MyConfig struct {
	// ... usual fields ...
	SpecialField string
}

// AliasMyConfig has the same fields as MyConfig but none of its methods,
// so decoding into it does not call MyConfig.UnmarshalJSON again.
type AliasMyConfig MyConfig

func (mc *MyConfig) UnmarshalJSON(data []byte) error {
	var alias AliasMyConfig
	if err := json.Unmarshal(data, &alias); err != nil {
		return err
	}
	*mc = MyConfig(alias) // Copy data from alias to original struct

	// Now, apply custom logic to mc.SpecialField or other fields
	// after the default unmarshaling has occurred.
	// E.g., validation, transformation.
	if mc.SpecialField == "oldvalue" {
		mc.SpecialField = "newvalue"
	}
	return nil
}
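This alias pattern works because AliasMyConfig has the same fields as MyConfig but does not inherit its methods, so json.Unmarshal falls back to its default reflection-based decoding for AliasMyConfig instead of calling UnmarshalJSON again. You can then layer your custom logic on top.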
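A widely used variant of this pattern overrides just one field while the rest decode by default. The sketch below is an illustration under assumptions, not the article's code: Config, its fields, and decodeConfig are hypothetical, and the duration-string format is chosen only to give the override something to do. The outer Timeout field takes precedence over the embedded one because encoding/json prefers the least nested field when two share a JSON name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Config is a hypothetical struct whose timeout arrives as a string like "2s".
type Config struct {
	Name    string        `json:"name"`
	Timeout time.Duration `json:"timeout"`
}

func (c *Config) UnmarshalJSON(data []byte) error {
	// Alias has Config's fields but not its methods, so decoding into it
	// uses the default behavior and cannot recurse.
	type Alias Config

	// aux embeds a pointer to c (viewed as Alias), so default-decoded fields
	// are written straight into c. The outer Timeout field shadows the
	// embedded one and captures the raw string instead.
	aux := struct {
		*Alias
		Timeout string `json:"timeout"`
	}{Alias: (*Alias)(c)}

	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}
	if aux.Timeout == "" {
		return nil // timeout absent: leave c.Timeout unchanged
	}
	d, err := time.ParseDuration(aux.Timeout)
	if err != nil {
		return fmt.Errorf("config %q: invalid timeout: %w", c.Name, err)
	}
	c.Timeout = d
	return nil
}

// decodeConfig is a small helper used for illustration.
func decodeConfig(data string) (Config, error) {
	var c Config
	err := json.Unmarshal([]byte(data), &c)
	return c, err
}

func main() {
	c, err := decodeConfig(`{"name":"svc","timeout":"2s"}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(c.Name, c.Timeout)
}
```

Compared with copying the whole alias back, this variant avoids an intermediate copy and keeps the override local to the one field that needs it.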
When to Use Which
json.RawMessage: Ideal when you have a specific field (or several fields) whose internal structure varies, and you want to defer its parsing until you know what concrete type it should be. It's excellent for polymorphic data structures because you don't need to write a full custom UnmarshalJSON for the entire struct.
Custom UnmarshalJSON: Provides the ultimate control. Use it when:
- You need to handle data that can come in multiple formats for a single field (e.g., Age as int or string).
- You need to validate fields during unmarshaling.
- You need to perform transformations or enrich data during unmarshaling.
- The overall parsing logic depends on values of other fields within the same object (e.g., a Type field dictates how another Data field is parsed).
- You need special handling for unknown fields (by default, encoding/json silently ignores them) or advanced error recovery.
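The two tools also combine naturally. The case where a Type field dictates how a Data field is parsed can be sketched as a struct-level UnmarshalJSON that uses json.RawMessage internally, so callers receive an already-typed value. All names here (Message, TextPayload, ImagePayload, decodeMessage) and the wire format are hypothetical, chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TextPayload and ImagePayload are hypothetical concrete payload types.
type TextPayload struct {
	Body string `json:"body"`
}

type ImagePayload struct {
	URL   string `json:"url"`
	Width int    `json:"width"`
}

// Message exposes a decoded payload; callers never see the raw bytes.
type Message struct {
	ID      string
	Payload any // TextPayload or ImagePayload after decoding
}

func (m *Message) UnmarshalJSON(data []byte) error {
	// wire describes the JSON layout. Data stays raw until Type is known.
	var wire struct {
		ID   string          `json:"id"`
		Type string          `json:"type"`
		Data json.RawMessage `json:"data"`
	}
	if err := json.Unmarshal(data, &wire); err != nil {
		return err
	}
	// Validation happens during decoding, before any field is assigned.
	if wire.ID == "" {
		return fmt.Errorf("message is missing an id")
	}

	switch wire.Type {
	case "text":
		var p TextPayload
		if err := json.Unmarshal(wire.Data, &p); err != nil {
			return fmt.Errorf("message %s: %w", wire.ID, err)
		}
		m.Payload = p
	case "image":
		var p ImagePayload
		if err := json.Unmarshal(wire.Data, &p); err != nil {
			return fmt.Errorf("message %s: %w", wire.ID, err)
		}
		m.Payload = p
	default:
		return fmt.Errorf("message %s: unsupported type %q", wire.ID, wire.Type)
	}
	m.ID = wire.ID
	return nil
}

// decodeMessage is a small helper used for illustration.
func decodeMessage(data string) (Message, error) {
	var m Message
	err := json.Unmarshal([]byte(data), &m)
	return m, err
}

func main() {
	m, err := decodeMessage(`{"id":"m1","type":"text","data":{"body":"hi"}}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if p, ok := m.Payload.(TextPayload); ok {
		fmt.Println(m.ID, p.Body)
	}
}
```

Whether to dispatch inside UnmarshalJSON like this or at the call site, as in the earlier event example, is a design choice: the former hides the wire format, the latter keeps unknown types available as raw bytes.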
Conclusion
json.RawMessage and custom UnmarshalJSON are powerful features in Go's encoding/json package that move you beyond basic JSON parsing. By mastering json.RawMessage, you gain the ability to gracefully handle dynamic and polymorphic JSON payloads, deferring specific field parsing until the context demands it. When you need absolute control over the decoding process for type conversions, validation, or complex conditional logic, implementing the json.Unmarshaler interface with a custom UnmarshalJSON method offers the ultimate flexibility. These tools are crucial for building robust Go applications that interact with real-world, often messy, JSON APIs. They let you write code that tolerates schema variations, interprets data precisely, and reports clear errors for input it cannot handle.

