Learning Large-Scale Go Project Architecture from Kubernetes
Grace Collins
Solutions Engineer · Leapcell

Before attempting to build a large-scale project with high scalability, reliability, and maintainability using Go, let's first look at the project structure of Kubernetes to see how it organizes a series of functional modules for container orchestration.
Kubernetes Code Layout
Below is a list of Kubernetes’ main top-level directories and their primary functions. Next, we will explain the purpose of each directory one by one.
- api: Stores interface protocols
- build: Code related to building applications
- cmd: `main` entry points for each application
- pkg: Main implementation of each component
- staging: Temporarily stores code that is interdependent among components
api
This stores OpenAPI and Swagger files, including the JSON and Protocol Buffers definitions.
build
This contains scripts for building the Kubernetes project, including building each component of K8s as well as required images, such as the pause program.
cmd
The `cmd` directory stores the source files of the `main` packages for building executable files. If multiple executables need to be built, each executable can be placed in its own subdirectory. Let's look at the specific subdirectories under the Kubernetes `cmd` directory:
- cmd: The `main` methods of each application
  - kube-proxy: Responsible for network-related rules
  - kube-apiserver: Exposes the K8s APIs and handles requests, providing CRUD operations for the various resources (Pod, ReplicaSet, Service)
  - kube-controller-manager: Runs the built-in controllers that drive the cluster toward the desired state
  - kube-scheduler: Watches newly created Pods and selects nodes for them to run on
  - kubectl: The command-line tool for accessing the cluster
As we can see, familiar components in K8s such as kube-proxy and kube-apiserver can be found here.
pkg
The `pkg` directory contains both dependencies needed by the project itself and exported packages.
- pkg: Main implementations of each component
  - proxy: Network proxy implementation
  - kubelet: Maintains Pods on the Node
    - cm: Container management, such as cgroups
    - stats: Resource usage, implemented by `cAdvisor`
  - scheduler: Implementation of Pod scheduling
    - framework: The scheduling framework
  - controlplane: Control plane
    - apiserver: The API server implementation
staging
Packages in the staging directory are linked into k8s.io via symbolic links. First, because the Kubernetes project is huge, this helps avoid development obstacles caused by fragmented repositories, allowing all code to be submitted and reviewed in one pull request. In this way, modularity is ensured, while also maintaining the completeness of the main code repository.
At the same time, by using the `replace` directive in go.mod, you don't need to tag every dependency, which simplifies version management and the release process.
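For a rough sense of what this looks like, here is a simplified excerpt in the style of Kubernetes' go.mod (the real file lists many more staging modules):

```
module k8s.io/kubernetes

require (
    k8s.io/api v0.0.0
    k8s.io/apimachinery v0.0.0
)

// Redirect the published module paths to the in-tree staging copies,
// so day-to-day development needs no tags or version bumps.
replace (
    k8s.io/api => ./staging/src/k8s.io/api
    k8s.io/apimachinery => ./staging/src/k8s.io/apimachinery
)
```

Consumers outside the repository still import `k8s.io/api` by its published path; only the main repository resolves it to the local staging directory.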
If we didn't do it this way and instead split all the code under staging into independent repositories (a multi-repo approach), then whenever the code of one of these sub-repositories changed, we would need to commit in the sub-repository first, publish a new tag, and then update the tag reference in go.mod before development could continue. This would undoubtedly increase the overall development cost.
Therefore, linking the packages in the staging directory to the main repository through symbolic links effectively simplifies version management and the release process.
Comparison with the Standard Go Project Layout
The `internal` directory is used for packages that are not intended to be exported for external use. In Go, the principle behind `internal` is that packages under it can be used normally within the project itself while remaining invisible to external projects.
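As a minimal sketch, assuming a hypothetical module `example.com/shop`:

```go
// File: internal/auth/auth.go inside the hypothetical module example.com/shop.
package auth

// Token returns a session token. Any package rooted at example.com/shop
// may import example.com/shop/internal/auth and call this function.
func Token(user string) string {
	return "token-for-" + user
}
```

If code in a different module tries to `import "example.com/shop/internal/auth"`, the build fails with a "use of internal package ... not allowed" error; the restriction is enforced by the toolchain, not by convention.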
However, there is no `internal` directory in Kubernetes. This is because the Kubernetes project started around 2014, while the `internal` directory convention was only introduced in Go 1.4, released at the end of 2014. During the early development of Kubernetes, the convention of using `internal` was not yet widely established, and no large-scale refactoring was done later to introduce it.
At the same time, one of Kubernetes' design goals is modularity and decoupling. It achieves encapsulation through explicit package organization and code structure, without needing `internal` packages to restrict package access.
At this point, we already understand the standard top-level directory structure for building a project.
Go does not have a standard directory framework like Java does. As a result, when starting different projects, one always has to get used to each project’s particular code structure. Even within the same team, different structures might exist, which can be a significant obstacle for newcomers trying to understand the project.
Because of these obstacles, collaboration can be difficult. A unified top-level directory structure enables us to quickly find code and have a standard entry point when taking over a project, improving development efficiency and reducing confusion about code locations during collaborative development.
But does a unified code directory structure alone make for a perfect large-scale project? The answer is of course no.
Relying solely on a unified directory structure cannot once and for all solve the problem of code gradually decaying and becoming chaotic. Only sound design principles can keep the design context clear as the project continues to expand.
Declarative Design Philosophy
The declarative API runs throughout the entire code design of Kubernetes, preventing it from falling into procedural programming.
For example, when changing the state of a resource, you should tell K8s the desired state rather than telling it what steps to take. This is also why `kubectl rolling-update` was phased out: its design micromanaged the entire process of updating a Pod.
By informing Kubernetes of the desired state, components such as the kubelet can take the appropriate actions to reach that state, and there is no need for excessive intervention from outside.
At this point, you might wonder: Why does a declarative API help keep modules clear when the project expands? Isn’t this something users perceive when using Kubernetes? How does it relate to internal design?
When we design interfaces, if we expose the entire operational process to users and let them intervene step by step in how a Pod is updated, then the modules we design will inevitably be procedural. Such modules become hard to keep clear, because they are coupled with many user operations.
However, by using a declarative API, after we tell K8s the desired state, the cluster can coordinate among multiple internal components to ultimately achieve the desired state. Users don’t need to know how things are updated internally. Moreover, when additional collaboration plugins are needed, new modules can be directly added without exposing more APIs for user operations.
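Internally, this philosophy takes the shape of a reconcile loop. Below is a minimal, self-contained sketch of the idea in Go; the `Cluster` interface and state types are invented for illustration (real controllers use client-go informers and work queues), but the core move is the same: compare desired state with observed state and act only on the difference.

```go
package main

import "fmt"

// DesiredState is what the user declares; ObservedState is what is running.
type DesiredState struct{ Replicas int }
type ObservedState struct{ Replicas int }

// Cluster abstracts reads and writes against the real system.
type Cluster interface {
	Observe() ObservedState
	CreateReplica()
	DeleteReplica()
}

// Reconcile nudges the observed state toward the desired state. Callers
// never say *how* to update; they only declare *what* they want.
func Reconcile(c Cluster, desired DesiredState) {
	switch observed := c.Observe(); {
	case observed.Replicas < desired.Replicas:
		c.CreateReplica()
	case observed.Replicas > desired.Replicas:
		c.DeleteReplica()
	}
}

// fakeCluster is an in-memory stand-in so the sketch runs on its own.
type fakeCluster struct{ replicas int }

func (f *fakeCluster) Observe() ObservedState { return ObservedState{Replicas: f.replicas} }
func (f *fakeCluster) CreateReplica()         { f.replicas++ }
func (f *fakeCluster) DeleteReplica()         { f.replicas-- }

func main() {
	c := &fakeCluster{replicas: 1}
	desired := DesiredState{Replicas: 3}
	// Loop until convergence; a real controller re-runs on every event.
	for c.Observe().Replicas != desired.Replicas {
		Reconcile(c, desired)
	}
	fmt.Println("converged to", c.Observe().Replicas, "replicas")
}
```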
cAdvisor monitors resources deployed by K8s and collects container resource metrics. It works independently, not relying on external components. The controller then compares these metrics with user-declared targets to determine whether conditions for scaling up or down are met.
Because the modules are independent, cAdvisor only needs to focus on collecting and returning monitoring metrics, without caring about how these metrics are used—whether for observation or as a basis for automatic scaling.
This is also a key principle when designing task components: clearly define the requirements the component must meet; when passing information, focus only on inputs and outputs; and keep the internal implementation encapsulated rather than exposed, so that external business code can use the component as simply as possible.
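A hedged Go sketch of this principle (the `Collector` interface and `Usage` type are invented here, not cAdvisor's actual API): the component's contract is just its inputs and outputs.

```go
package metrics

// Usage is the output contract; consumers see only this shape.
type Usage struct {
	CPUMillicores int64
	MemoryBytes   int64
}

// Collector is everything a consumer (an autoscaler, a dashboard) needs
// to know. How samples are gathered (cgroups, /proc parsing, caching)
// stays hidden behind the interface.
type Collector interface {
	CollectUsage(containerID string) (Usage, error)
}
```

An autoscaler and an observability dashboard can both depend on `Collector` without knowing, or caring, how the metrics are produced.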
Avoiding Over-Engineering
Excessive engineering design is often worse than insufficient design.
The earliest version of Kubernetes was 0.4. For networking, the official implementation had GCE run Salt scripts to create bridges, while the recommended solutions for other environments were Flannel and OVS.
As Kubernetes developed, Flannel was no longer sufficient in some situations. Around 2015, Calico and Weave emerged in the community, which basically solved the networking problem. Kubernetes thus no longer needed to spend effort doing this itself, so it introduced CNI to standardize network plugins.
It is clear that Kubernetes was not perfectly designed from the very beginning. Instead, as new problems emerged, new designs were introduced to adapt to changes in different environments.
When starting a project, dependencies are relatively clear. Therefore, at the beginning of the engineering design, circular dependencies do not occur. But as the project grows, these issues gradually appear. Functional requirements in the product will lead to cross-references in code design.
Even if we try our best to understand all the business background and problems to be solved before starting, new problems will inevitably arise as product features change and programs iterate. What we can do is pay attention to module design and dependency management, keep functions cohesive as much as possible, and, when adding abstractions later, avoid having to overhaul all previous code in a "refactoring" manner.
Over-designing a system for "scalability," designing just for the sake of design, can become a stumbling block for future changes.
Let’s illustrate design evolution with an e-commerce business scenario.
Initially, the system has two modules:
- Order Module: Responsible for handling order creation, payment, status updates, etc. It depends on the User Module for user information (such as shipping address, contact details, etc.).
- User Module: Responsible for managing user information, registration, login, and storing user data. It does not depend on the Order Module.
In this initial design, the dependency is one-way: the Order Module depends on the User Module.
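In Go terms, the starting point might look like the following sketch (the packages, the module path `example.com/shop`, and the functions are all hypothetical):

```go
// File: user/user.go -- knows nothing about orders.
package user

type Profile struct {
	ID      string
	Address string
}

// Lookup would hit storage in a real system; stubbed here.
func Lookup(id string) Profile {
	return Profile{ID: id, Address: "42 Example St"}
}
```

```go
// File: order/order.go -- imports user; the arrow points one way only.
package order

import "example.com/shop/user"

type Order struct {
	ID     string
	ShipTo string
}

func Create(id, userID string) Order {
	p := user.Lookup(userID)
	return Order{ID: id, ShipTo: p.Address}
}
```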
At this stage, there is no need to over-abstract in the code. Many projects cannot foresee whether they will succeed or fail, so spending too much effort on design is not feasible from a product release perspective, and, if the product concept changes drastically, over-design may become an obstacle to future modifications.
As requirements evolve, a new need arises: the platform needs to recommend personalized products to users based on their purchase history (order records).
To achieve personalized recommendations, the User Module now needs to call the Order Module’s API to get a user’s order history.
Now, the dependencies become:
- The Order Module depends on the User Module for user information.
- The User Module depends on the Order Module for order history.
This change creates a circular dependency: the Order Module depends on the User Module, and the User Module also depends on the Order Module.
To solve the circular dependency, several solutions can be considered:
Decouple module responsibilities: Introduce a new module, such as a Recommendation Module, dedicated to handling personalized recommendation logic. The Recommendation Module can obtain data separately from the User and Order Modules, avoiding direct dependencies between them.
By extracting modules, we solve the coupling between the User and Order Modules.
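Sketching the extraction (same hypothetical module as before, plus an assumed `order.HistoryFor` accessor for a user's order history): `recommend` imports both `user` and `order`, while neither of them imports `recommend`, so every dependency arrow still points one way.

```go
// File: recommend/recommend.go (hypothetical)
package recommend

import (
	"example.com/shop/order"
	"example.com/shop/user"
)

// ForUser reads from both modules; because user and order never import
// recommend, extracting this package keeps the dependency graph cycle-free.
func ForUser(userID string) []string {
	p := user.Lookup(userID)          // user data for personalization
	history := order.HistoryFor(p.ID) // assumed accessor returning []order.Order
	if len(history) == 0 {
		return nil
	}
	// Toy logic: recommend something related to the latest purchase.
	return []string{"related-to-" + history[len(history)-1].ID}
}
```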
However, a new requirement arises: during promotional events, users purchase event-specific products. The product manager wants the Recommendation Module to be able to immediately detect such orders and provide recommendations for related promotional products. For example, if a user buys a discounted sports watch, and we also recommend discounted Bluetooth sports earphones, the user’s repurchase rate might be higher.
In this scenario, having the Order Module call the Recommendation Module directly to pass data is clearly undesirable, because the Recommendation Module already depends on the Order Module for user purchase data, establishing a one-way dependency. If we let the Order Module call the Recommendation Module, that creates a circular dependency again.
So how can the Recommendation Module quickly sense changes in orders? This is where an event-driven architecture comes in.
By using an event-driven approach, when a user places an order, the Order Module triggers an event, and the Recommendation Module subscribes to events related to user orders. In this way, the two modules don’t need to call each other’s APIs directly; instead, data is passed through events.
After receiving the data, the Recommendation Module can immediately retrain a new recommendation model and recommend related products to the user.
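Here is a minimal in-process sketch of that flow in Go; the `Bus`, the event type, and the toy logic are invented for illustration, and a production system would typically put a message broker between the modules.

```go
package main

import "fmt"

// OrderPlaced is the event the Order module publishes.
type OrderPlaced struct {
	UserID    string
	ProductID string
	Promo     bool
}

// Bus is a tiny synchronous pub/sub hub standing in for a real broker.
type Bus struct {
	subscribers []func(OrderPlaced)
}

func (b *Bus) Subscribe(fn func(OrderPlaced)) {
	b.subscribers = append(b.subscribers, fn)
}

func (b *Bus) Publish(e OrderPlaced) {
	for _, fn := range b.subscribers {
		fn(e)
	}
}

func main() {
	bus := &Bus{}

	// Recommendation module: subscribes to order events; the Order
	// module never imports it.
	bus.Subscribe(func(e OrderPlaced) {
		if e.Promo {
			fmt.Printf("recommend promo items related to %s for user %s\n",
				e.ProductID, e.UserID)
		}
	})

	// Order module: publishes an event when an order is placed.
	bus.Publish(OrderPlaced{UserID: "u1", ProductID: "sports-watch", Promo: true})
}
```

Because both sides depend only on the event contract, adding another subscriber (inventory, notifications) requires no change to the Order Module.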
From the above example, we can see a major challenge in enterprise applications: business domain modeling.
Modeling is less a one-off activity than a process of continuously optimizing the design as requirements evolve.
The User, Order, and Recommendation Modules described above are also common scenarios in the evolution of most To-C (consumer-facing) products.
How to continuously optimize our module design and code structure in the process of evolution and improve iteration speed is something we need to explore and think about.
Summary
Let’s review the content of this article:
- When building large projects, a unified directory structure can improve collaboration efficiency, but sound design principles are the key to maintaining clarity and extensibility as the project grows.
- Kubernetes' declarative API helps modules remain independent and avoids the pitfalls of procedural programming.
- Project design should evolve step by step according to actual needs and avoid over-engineering.
- Focus on proper separation of module responsibilities and dependencies, and use event-driven approaches to solve coupling between modules.
We are Leapcell, your top choice for hosting Go projects.
Leapcell is the Next-Gen Serverless Platform for Web Hosting, Async Tasks, and Redis:
Multi-Language Support
- Develop with Node.js, Python, Go, or Rust.
Deploy Unlimited Projects for Free
- Pay only for usage; if there are no requests, there are no charges.
Unbeatable Cost Efficiency
- Pay-as-you-go with no idle charges.
- Example: $25 supports 6.94M requests at a 60ms average response time.
Streamlined Developer Experience
- Intuitive UI for effortless setup.
- Fully automated CI/CD pipelines and GitOps integration.
- Real-time metrics and logging for actionable insights.
Effortless Scalability and High Performance
- Auto-scaling to handle high concurrency with ease.
- Zero operational overhead — just focus on building.
Explore more in the Documentation!
Follow us on X: @LeapcellHQ