Neural Networks in Go: A Complete Guide to Building from Scratch
James Reed
Infrastructure Engineer · Leapcell

Building a Neural Network from Scratch with Go: Principles, Structure, and Implementation
This article will introduce how to use the Go programming language to build a simple neural network from scratch and demonstrate its workflow through the Iris classification task. It will combine principle explanations, code implementations, and visual structure displays to help readers understand the core mechanisms of neural networks.
Ⅰ. Basic Principles and Structure of Neural Networks
A neural network is a computational model that simulates biological neurons, achieving non-linear mapping through connections of nodes across multiple layers. A typical three-layer neural network structure includes an input layer, a hidden layer, and an output layer. Nodes in each layer are connected via weights and biases, and inter-layer transmission is processed through activation functions. Below is a schematic diagram of a simple three-layer neural network structure (drawn with ASCII characters):
```
+-------------+         +--------------+         +--------------+
| Input Layer |         | Hidden Layer |         | Output Layer |
|   4 Nodes   | ------> |   3 Nodes    | ------> |   3 Nodes    |
+-------------+         +--------------+         +--------------+
                 weights + bias,          weights + bias,
               activation function      activation function
```
Core Concepts:
- Forward Propagation
  Input data undergoes a linear transformation through the weight matrices (input × weights + bias), then passes through an activation function that introduces non-linearity, propagating layer by layer to the output layer.
  Formula examples:
  - Hidden layer input: ( Z_1 = X \cdot W_1 + b_1 )
  - Hidden layer output: ( A_1 = \sigma(Z_1) ) ((\sigma) is the Sigmoid function)
  - Output layer input: ( Z_2 = A_1 \cdot W_2 + b_2 )
  - Output layer output: ( A_2 = \sigma(Z_2) )
- Backpropagation
  The error between the predicted and true values (for example, the mean squared error) is computed, and the weights and biases of every layer are updated in reverse order via the chain rule to optimize the model parameters. A minimal scalar sketch of one update step follows this list.
  Key steps:
  - Output layer error: ( \delta_2 = (A_2 - Y) \odot \sigma'(Z_2) )
  - Hidden layer error: ( \delta_1 = \delta_2 \cdot W_2^T \odot \sigma'(Z_1) ) ((\odot) denotes element-wise multiplication)
  - Update weights: ( W_2 \leftarrow W_2 - \eta \cdot A_1^T \cdot \delta_2 ), ( W_1 \leftarrow W_1 - \eta \cdot X^T \cdot \delta_1 )
  - Update biases: ( b_2 \leftarrow b_2 - \eta \cdot \sum \delta_2 ), ( b_1 \leftarrow b_1 - \eta \cdot \sum \delta_1 )
  ((\eta) is the learning rate, (\sigma') is the derivative of the activation function)
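To make the chain rule concrete, here is a minimal scalar sketch of a single training step for a toy network with one input, one hidden node, and one output. All numbers are made up for illustration and are not part of the Iris code that follows.

```go
package main

import (
	"fmt"
	"math"
)

func sigmoid(x float64) float64 { return 1.0 / (1.0 + math.Exp(-x)) }

func main() {
	x, y := 0.5, 1.0   // input and target
	w1, b1 := 0.4, 0.1 // hidden layer parameters
	w2, b2 := 0.3, 0.2 // output layer parameters
	eta := 0.2         // learning rate

	// Forward propagation
	z1 := x*w1 + b1
	a1 := sigmoid(z1)
	z2 := a1*w2 + b2
	a2 := sigmoid(z2)

	// Backpropagation (mean squared error)
	d2 := (a2 - y) * a2 * (1 - a2) // δ2 = (A2 - Y) ⊙ σ'(Z2)
	d1 := d2 * w2 * a1 * (1 - a1)  // δ1 = δ2·W2^T ⊙ σ'(Z1)

	// Gradient descent updates
	w2 -= eta * a1 * d2
	b2 -= eta * d2
	w1 -= eta * x * d1
	b1 -= eta * d1

	fmt.Println(w1, b1, w2, b2) // updated parameters after one step
}
```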
Ⅱ. Key Design of Neural Network Implementation in Go
1. Data Structure Definition
Use the gonum.org/v1/gonum/mat
package in Go for matrix operations, defining network structures and parameters:
```go
// neuralNet stores the trained neural network parameters
type neuralNet struct {
	config  neuralNetConfig // Network configuration
	wHidden *mat.Dense      // Hidden layer weight matrix
	bHidden *mat.Dense      // Hidden layer bias vector
	wOut    *mat.Dense      // Output layer weight matrix
	bOut    *mat.Dense      // Output layer bias vector
}

// neuralNetConfig defines the network architecture and training parameters
type neuralNetConfig struct {
	inputNeurons  int     // Number of input layer nodes (e.g., the 4 Iris features)
	outputNeurons int     // Number of output layer nodes (e.g., the 3 Iris species)
	hiddenNeurons int     // Number of hidden layer nodes (tunable hyperparameter)
	numEpochs     int     // Number of training epochs
	learningRate  float64 // Learning rate
}
```
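The main program later calls a newNetwork constructor that the article does not list explicitly; a minimal version consistent with the structs above could simply store the configuration:

```go
// newNetwork creates a neuralNet with the given configuration.
// The weights and biases are allocated later, during training.
func newNetwork(config neuralNetConfig) *neuralNet {
	return &neuralNet{config: config}
}
```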
2. Activation Function and Its Derivative
The Sigmoid function is used as the activation function; its derivative can be computed directly from the function's own value, which makes it well suited to backpropagation:
```go
// sigmoid is the Sigmoid activation function
func sigmoid(x float64) float64 {
	return 1.0 / (1.0 + math.Exp(-x))
}

// sigmoidPrime is the derivative of the Sigmoid function
func sigmoidPrime(x float64) float64 {
	s := sigmoid(x)
	return s * (1.0 - s)
}
```
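Why this shortcut works: differentiating ( \sigma(x) = 1 / (1 + e^{-x}) ) gives ( \sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \sigma(x)(1 - \sigma(x)) ), so the derivative can be computed from the already-known activation value without re-evaluating the exponential.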
3. Backpropagation Training Logic
Parameter Initialization
Weights and biases are initialized with random values; random initialization breaks the symmetry between neurons so that the network can actually learn:
```go
func (nn *neuralNet) train(x, y *mat.Dense) error {
	// Random number generator seeded with the current time
	randGen := rand.New(rand.NewSource(time.Now().UnixNano()))

	// Allocate weights and biases for the hidden and output layers
	wHidden := mat.NewDense(nn.config.inputNeurons, nn.config.hiddenNeurons, nil)
	bHidden := mat.NewDense(1, nn.config.hiddenNeurons, nil)
	wOut := mat.NewDense(nn.config.hiddenNeurons, nn.config.outputNeurons, nil)
	bOut := mat.NewDense(1, nn.config.outputNeurons, nil)

	// Fill the parameter matrices with random numbers
	for _, param := range [][]*mat.Dense{{wHidden, bHidden}, {wOut, bOut}} {
		for _, m := range param {
			raw := m.RawMatrix().Data
			for i := range raw {
				raw[i] = randGen.Float64() // Random values in [0, 1)
			}
		}
	}

	// Invoke backpropagation for training
	return nn.backpropagate(x, y, wHidden, bHidden, wOut, bOut)
}
```
Core Backpropagation Logic
Error backpropagation and parameter updates are implemented with matrix operations. The code uses the Apply method to apply the activation function and its derivative to whole matrices at once:
```go
func (nn *neuralNet) backpropagate(x, y, wHidden, bHidden, wOut, bOut *mat.Dense) error {
	// Wrap the scalar functions so they match the signature expected by mat.Dense.Apply
	applySigmoid := func(_, _ int, v float64) float64 { return sigmoid(v) }
	applySigmoidPrime := func(_, _ int, v float64) float64 { return sigmoidPrime(v) }

	for epoch := 0; epoch < nn.config.numEpochs; epoch++ {
		// Forward propagation: compute the output of each layer
		hiddenInput := new(mat.Dense)
		hiddenInput.Mul(x, wHidden) // Hidden layer linear input: X·W_hidden
		hiddenInput.Apply(func(_, col int, v float64) float64 {
			return v + bHidden.At(0, col) // Add the bias term
		}, hiddenInput)

		hiddenAct := new(mat.Dense)
		hiddenAct.Apply(applySigmoid, hiddenInput) // Hidden layer activated output

		outputInput := new(mat.Dense)
		outputInput.Mul(hiddenAct, wOut) // Output layer linear input: A_hidden·W_out
		outputInput.Apply(func(_, col int, v float64) float64 {
			return v + bOut.At(0, col) // Add the bias term
		}, outputInput)

		output := new(mat.Dense)
		output.Apply(applySigmoid, outputInput) // Output layer activated output

		// Backpropagation: compute errors and gradients
		networkErr := new(mat.Dense)
		networkErr.Sub(y, output) // Output error: Y - A_out

		outputSlope := new(mat.Dense)
		outputSlope.Apply(applySigmoidPrime, outputInput) // σ'(Z_out)
		dOutput := new(mat.Dense)
		dOutput.MulElem(networkErr, outputSlope) // δ_out = (Y - A_out) ⊙ σ'(Z_out)

		hiddenErr := new(mat.Dense)
		hiddenErr.Mul(dOutput, wOut.T()) // Propagate the error back: δ_out·W_out^T
		hiddenSlope := new(mat.Dense)
		hiddenSlope.Apply(applySigmoidPrime, hiddenInput) // σ'(Z_hidden)
		dHidden := new(mat.Dense)
		dHidden.MulElem(hiddenErr, hiddenSlope) // δ_hidden = δ_out·W_out^T ⊙ σ'(Z_hidden)

		// Update weights and biases (gradient descent; adding works because the error is Y - A_out)
		wOutAdj := new(mat.Dense)
		wOutAdj.Mul(hiddenAct.T(), dOutput)
		wOutAdj.Scale(nn.config.learningRate, wOutAdj)
		wOut.Add(wOut, wOutAdj)

		bOutAdj := sumAlongAxis(0, dOutput)
		bOutAdj.Scale(nn.config.learningRate, bOutAdj)
		bOut.Add(bOut, bOutAdj)

		wHiddenAdj := new(mat.Dense)
		wHiddenAdj.Mul(x.T(), dHidden)
		wHiddenAdj.Scale(nn.config.learningRate, wHiddenAdj)
		wHidden.Add(wHidden, wHiddenAdj)

		bHiddenAdj := sumAlongAxis(0, dHidden)
		bHiddenAdj.Scale(nn.config.learningRate, bHiddenAdj)
		bHidden.Add(bHidden, bHiddenAdj)
	}

	// Save the trained parameters
	nn.wHidden, nn.bHidden, nn.wOut, nn.bOut = wHidden, bHidden, wOut, bOut
	return nil
}
```
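The bias updates above rely on a sumAlongAxis helper that the article does not show. A minimal sketch of such a helper, assuming axis 0 sums each column into a 1×n row vector (which is what the bias updates need), could look like this; it also uses gonum.org/v1/gonum/floats for the sums:

```go
// sumAlongAxis sums a matrix along one dimension, keeping the other.
// Axis 0 sums every column into a 1×cols row; axis 1 sums every row into a rows×1 column.
func sumAlongAxis(axis int, m *mat.Dense) *mat.Dense {
	numRows, numCols := m.Dims()
	switch axis {
	case 0:
		data := make([]float64, numCols)
		for j := 0; j < numCols; j++ {
			data[j] = floats.Sum(mat.Col(nil, j, m))
		}
		return mat.NewDense(1, numCols, data)
	case 1:
		data := make([]float64, numRows)
		for i := 0; i < numRows; i++ {
			data[i] = floats.Sum(mat.Row(nil, i, m))
		}
		return mat.NewDense(numRows, 1, data)
	default:
		panic("sumAlongAxis: axis must be 0 or 1")
	}
}
```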
4. Forward Prediction Function
After training, use trained weights and biases for forward propagation to output predictions:
```go
func (nn *neuralNet) predict(x *mat.Dense) (*mat.Dense, error) {
	// Make sure the network has been trained
	if nn.wHidden == nil || nn.wOut == nil {
		return nil, errors.New("neural network not trained")
	}

	applySigmoid := func(_, _ int, v float64) float64 { return sigmoid(v) }

	// Hidden layer: X·W_hidden + b_hidden, then the activation function
	hiddenInput := new(mat.Dense)
	hiddenInput.Mul(x, nn.wHidden)
	hiddenInput.Apply(func(_, col int, v float64) float64 {
		return v + nn.bHidden.At(0, col)
	}, hiddenInput)
	hiddenAct := new(mat.Dense)
	hiddenAct.Apply(applySigmoid, hiddenInput)

	// Output layer: A_hidden·W_out + b_out, then the activation function
	outputInput := new(mat.Dense)
	outputInput.Mul(hiddenAct, nn.wOut)
	outputInput.Apply(func(_, col int, v float64) float64 {
		return v + nn.bOut.At(0, col)
	}, outputInput)
	output := new(mat.Dense)
	output.Apply(applySigmoid, outputInput)

	return output, nil
}
```
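As a hypothetical usage example (the feature values are made up and assumed to be scaled the same way as the training data), a single flower could be classified like this:

```go
sample := mat.NewDense(1, 4, []float64{0.72, 0.45, 0.69, 0.97})
probs, err := network.predict(sample)
if err != nil {
	log.Fatalf("prediction failed: %v", err)
}
fmt.Printf("class scores: %v\n", mat.Formatted(probs))
```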
Ⅲ. Data Processing and Experimental Validation
1. Dataset Preparation
Use the classic Iris Dataset, containing 4 features (sepal length, sepal width, petal length, petal width) and 3 species (Setosa, Versicolor, Virginica). Data preprocessing steps:
- Convert the species labels to one-hot encoding, e.g., Setosa corresponds to [1, 0, 0] and Versicolor to [0, 0, 1]; a small encoding sketch follows this list.
- Split the data 80% / 20% into training and test sets, and add a small amount of random noise to increase the training difficulty.
- Sample data (excerpt from train.csv):

```
sepal_length,sepal_width,petal_length,petal_width,setosa,virginica,versicolor
0.0873,0.6687,0.0,0.0417,1.0,0.0,0.0
0.7232,0.4533,0.6949,0.967,0.0,1.0,0.0
0.6617,0.4567,0.6580,0.6567,0.0,0.0,1.0
```
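As referenced above, the label encoding can be done while the CSV is generated. One possible sketch in Go, using the same column order as the header (setosa, virginica, versicolor):

```go
// oneHot converts an Iris species name into the one-hot label columns
// used in train.csv (column order: setosa, virginica, versicolor).
func oneHot(species string) []float64 {
	switch species {
	case "setosa":
		return []float64{1, 0, 0}
	case "virginica":
		return []float64{0, 1, 0}
	case "versicolor":
		return []float64{0, 0, 1}
	default:
		return []float64{0, 0, 0} // unknown label
	}
}
```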
2. Main Program Flow
Read Data and Convert to Matrices
```go
func main() {
	// Open the training data file
	f, err := os.Open("data/train.csv")
	if err != nil {
		log.Fatalf("failed to open file: %v", err)
	}
	defer f.Close()

	reader := csv.NewReader(f)
	reader.FieldsPerRecord = 7 // 4 features + 3 labels
	rawData, err := reader.ReadAll()
	if err != nil {
		log.Fatalf("failed to read CSV: %v", err)
	}

	// Parse the records into input features (X) and labels (Y)
	numSamples := len(rawData) - 1 // Skip the header row
	inputsData := make([]float64, 4*numSamples)
	labelsData := make([]float64, 3*numSamples)
	for i, record := range rawData {
		if i == 0 {
			continue // Skip header
		}
		for j, val := range record {
			fVal, err := strconv.ParseFloat(val, 64)
			if err != nil {
				log.Fatalf("invalid value: %v", val)
			}
			if j < 4 {
				inputsData[(i-1)*4+j] = fVal // First 4 columns are features
			} else {
				labelsData[(i-1)*3+(j-4)] = fVal // Last 3 columns are labels
			}
		}
	}
	inputs := mat.NewDense(numSamples, 4, inputsData)
	labels := mat.NewDense(numSamples, 3, labelsData)

	// ... training and evaluation continue in the next snippets
}
```
Configure Network Parameters and Train
```go
// Define the network structure: 4 inputs, 5 hidden nodes, 3 outputs
config := neuralNetConfig{
	inputNeurons:  4,
	outputNeurons: 3,
	hiddenNeurons: 5,
	numEpochs:     8000, // Train for 8000 epochs
	learningRate:  0.2,  // Learning rate
}

network := newNetwork(config)
if err := network.train(inputs, labels); err != nil {
	log.Fatalf("training failed: %v", err)
}
```
Test Model Accuracy
The test matrices testInputs and testLabels are assumed to be read and parsed in exactly the same way as the training data above (for example, from a data/test.csv file); the predictions are then compared against the true labels:
```go
// Run the forward pass on the test data
predictions, err := network.predict(testInputs)
if err != nil {
	log.Fatalf("prediction failed: %v", err)
}

// Calculate the classification accuracy
trueCount := 0
numPreds, _ := predictions.Dims()
for i := 0; i < numPreds; i++ {
	// True class: index of the 1.0 in the one-hot label row
	trueLabel := mat.Row(nil, i, testLabels)
	trueClass := -1
	for j, val := range trueLabel {
		if val == 1.0 {
			trueClass = j
			break
		}
	}

	// Predicted class: index of the highest score in the prediction row
	predRow := mat.Row(nil, i, predictions)
	predClass := floats.MaxIdx(predRow)

	if trueClass == predClass {
		trueCount++
	}
}
fmt.Printf("Accuracy: %.2f%%\n", float64(trueCount)/float64(numPreds)*100)
```
Ⅳ. Experimental Results and Summary
After 8000 training epochs, the model achieves approximately 98% classification accuracy on the test set (results may vary slightly due to random initialization). This demonstrates that even a simple three-layer neural network can effectively solve non-linear classification problems.
Core Advantages:
- Pure Go Implementation: no reliance on C extensions (no cgo), so the program can be compiled into a static binary, which is convenient for cross-platform deployment.
- Matrix Abstraction: numerical computation is built on the gonum/mat package, keeping the code structure clear and easy to extend.
Improvement Directions:
- Experiment with different activation functions (e.g., ReLU, sketched after this list) or optimizers (e.g., Adam).
- Add regularization (e.g., L2 regularization) to prevent overfitting.
- Support multiple hidden layers to build deeper neural networks.
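For the first direction, a ReLU pair that could stand in for sigmoid and sigmoidPrime in the hidden layer (the output layer typically keeps a squashing function such as Sigmoid or Softmax) might look like this sketch:

```go
// relu is the Rectified Linear Unit activation function.
func relu(x float64) float64 {
	if x > 0 {
		return x
	}
	return 0
}

// reluPrime is the derivative of ReLU (the value at x == 0 is set to 0 by convention).
func reluPrime(x float64) float64 {
	if x > 0 {
		return 1
	}
	return 0
}
```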
Leapcell: The Best of Serverless Web Hosting
Finally, a recommendation for the best platform for deploying Go services: Leapcell
🚀 Build with Your Favorite Language
Develop effortlessly in JavaScript, Python, Go, or Rust.
🌍 Deploy Unlimited Projects for Free
Only pay for what you use—no requests, no charges.
⚡ Pay-as-You-Go, No Hidden Costs
No idle fees, just seamless scalability.
🔹 Follow us on Twitter: @LeapcellHQ