
Platform engineering has a funny habit: it starts as a noble effort to “make developers faster,” and within a year it’s also responsible for compliance, cost controls, uptime, multi-cloud portability, and answering the eternal question: “Why is this cluster bill shaped like a hockey stick?”
That’s why the latest argument from Giant Swarm — that the future of Kubernetes-based platforms is modular — lands at exactly the right moment. In a February 13, 2026 post titled “The future is modular: what a decade of running Kubernetes taught us about platforms”, Giant Swarm’s Oliver Thylmann makes the case that bundled “all-in-one” platforms increasingly act as constraints rather than accelerators, especially for organizations that have already built opinions, skills, and investments across observability, CI/CD, and security.
This article expands on that thesis, adds industry context from CNCF and FinOps research, and (politely) interrogates the trade-offs. Because modularity can be liberation — or it can be the start of a new kind of chaos if you don’t design it with discipline.
Original source note: This piece is an independent analysis and expansion based on Giant Swarm’s RSS item and original blog post by Oliver Thylmann. You should read the original for the cleanest statement of their position: Giant Swarm blog.
Why “platform” means something different in 2026 than it did in 2016
Ten years ago, “we run Kubernetes” was often shorthand for “we have bravely adopted a complicated open-source control plane and now require a small priesthood to operate it.” Today, Kubernetes is still complicated, but its role has shifted from novelty to default substrate.
Even CNCF’s public reporting over the last couple of years reflects this maturity. Kubernetes adoption is widespread, and yet fewer developers interact with it directly — a sign that abstraction layers (internal platforms, portals, managed services) are becoming the primary interface to infrastructure. Giant Swarm cites a CNCF/SlashData data point that only about 30% of backend developers say they use Kubernetes directly, down from an earlier peak. The implications aren’t subtle: the platform layer — not the cluster — is where developer experience is won or lost.
Meanwhile, the definition of “platform requirements” keeps expanding:
- Security: policy enforcement, image provenance, runtime detection, auditability.
- Observability: metrics/logs/traces, SLOs, multi-tenant access, incident workflows.
- Cost governance: rightsizing, autoscaling, chargeback/showback, waste reduction.
- Multi-cloud/hybrid: because regulatory and latency realities don’t care about your cloud strategy slide deck.
- AI workloads: GPUs, model serving, telemetry across inference, and new cost units like tokens.
In that environment, the “one platform to rule them all” pitch starts to wobble — especially for mature orgs that already have functioning components and don’t want to rip-and-replace everything to adopt a vendor’s preferred bundle.
The “bundle trap”: paying for a platform, using a slice, and bolting on the rest
Thylmann names a dynamic many platform teams recognize instantly: the bundle trap. You buy (or build) a comprehensive platform, but in practice:
- You only adopt the components that match your existing constraints and skills.
- You run parallel tools for the parts that don’t fit.
- You still carry integration and operational complexity — sometimes more, because now you have “the platform way” and “the reality way.”
The result: a platform that was meant to reduce cognitive load becomes yet another abstraction layer with its own friction.
This is also where cost becomes political. Platform licensing isn’t the same as cloud waste, but it rhymes. The FinOps community has been loudly focused on waste reduction for years, and more recent FinOps reporting shows that waste and optimization remain top concerns. Giant Swarm points to FinOps research and the broader waste-reduction priority trend as context for why paying for unused capabilities (shelf-ware) is increasingly intolerable.
Why bundles worked (and sometimes still do)
To be fair to bundled platforms: they were a rational response to the CNCF “landscape problem.” When the ecosystem was younger and less stable, buying a curated package could be a shortcut through evaluation paralysis. In regulated industries, a bundle can also simplify procurement and compliance by reducing the number of vendors and integration points.
And for smaller platform teams, the bundle can be a survival mechanism: fewer moving parts, fewer decisions, fewer integration chores.
But the trade-off is lockstep adoption. Bundles work best when your organization is willing to adopt the vendor’s worldview end-to-end. As soon as you have strong existing investments, bundles can become the wrong kind of “standardization.”
What modularity actually means (and what it should not mean)
“Modular platform” can mean several different things, ranging from “carefully designed capabilities that compose cleanly” to “an app store full of YAML, good luck.”
In the Giant Swarm post, the core argument is that modularity should let teams start with what they need, keep what already works, and add capabilities when they have a real reason. The modular model is also presented as a way to make costs more visible — you should know what you’re paying for and why.
But Thylmann also acknowledges the hard part: modularity has trade-offs. You must make more decisions up front. You must think about integration. And you can absolutely create a patchwork platform nobody fully understands.
The difference between “modular” and “fragmented”
Modularity is only a win if the modules are designed around a few principles:
- Clear interfaces: each capability has well-defined APIs and lifecycle hooks.
- Compatibility policy: versioning rules and tested combinations are documented.
- Opinionated defaults: enough standardization that teams don’t reinvent basics.
- Composable governance: policies can apply fleet-wide without breaking team autonomy.
Without those, “modular” becomes “an expensive scavenger hunt through the CNCF landscape, with incident response as your integration test suite.”
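One way to make “clear interfaces” concrete is a shared lifecycle contract that every capability must implement before it enters the catalog. Here is a minimal sketch in Python; the `PlatformModule` contract, its hook names, and the example module are all illustrative assumptions, not any vendor’s actual API:

```python
from abc import ABC, abstractmethod


class PlatformModule(ABC):
    """Illustrative lifecycle contract every platform capability must honour."""

    name: str
    version: str

    @abstractmethod
    def install(self, cluster: str) -> None: ...

    @abstractmethod
    def upgrade(self, cluster: str, target_version: str) -> None: ...

    @abstractmethod
    def health(self, cluster: str) -> bool: ...


class ObservabilityModule(PlatformModule):
    """Example capability; real installs would apply manifests via GitOps."""

    name = "observability"
    version = "1.4.0"

    def __init__(self) -> None:
        self.installed_on: set[str] = set()

    def install(self, cluster: str) -> None:
        # Record intent; a real module would reconcile manifests here.
        self.installed_on.add(cluster)

    def upgrade(self, cluster: str, target_version: str) -> None:
        if cluster not in self.installed_on:
            raise RuntimeError(f"{self.name} not installed on {cluster}")
        self.version = target_version

    def health(self, cluster: str) -> bool:
        return cluster in self.installed_on


# A registry that only accepts contract-compliant modules is what keeps
# "modular" from drifting into "fragmented".
modules: dict[str, PlatformModule] = {}
obs = ObservabilityModule()
obs.install("prod-eu-1")
obs.upgrade("prod-eu-1", "1.5.0")
modules[obs.name] = obs
```

The point of the sketch is the shape, not the implementation: if every module exposes the same install/upgrade/health surface, the platform can reason about all of them uniformly.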
Why Kubernetes pushed platforms toward modularity in the first place
Kubernetes is simultaneously a unifier and an amplifier. It unifies because the Kubernetes API (and related ecosystems like Cluster API) offers a common control model. But it amplifies because once you standardize on Kubernetes, you now have an explosion of optional add-ons: networking, policy engines, identity, secrets, ingress, service mesh, telemetry, autoscaling, node provisioning, GitOps controllers, and so on.
At some point, every platform team ends up building a platform out of platforms.
Giant Swarm’s own documentation describes a layered architecture and a Kubernetes-centric “platform API” approach, using a management cluster as the central place to orchestrate workload clusters and platform capabilities. The platform is built on top of Cluster API for lifecycle management and uses Flux for GitOps-style reconciliation.
This is important because it hints at a practical path to modularity: treat each capability (observability, security tooling, app management, developer portal) as something that can be installed, configured, and evolved declaratively — but under a consistent control plane and workflow.
Modular platforms and the rise of the internal developer portal (Backstage is the usual suspect)
One reason modularity is becoming more viable is that we finally have a credible “front door” for platforms: the developer portal. Rather than forcing developers to learn every underlying subsystem, the portal can present curated workflows, templates, documentation, and service ownership metadata.
Backstage is the most prominent open-source example. CNCF describes Backstage as an open framework for building developer portals, and notes its CNCF trajectory (accepted in 2020, incubating since 2022).
From a modularity perspective, portals matter because they decouple experience composition from infrastructure composition. Your underlying modules can change (new policy engine, different telemetry backend, new cluster provisioning mechanism) without constantly breaking the developer-facing workflow — as long as the portal’s contracts remain stable.
Practical example: a “create service” flow that doesn’t care what you run underneath
Consider a typical golden path:
- Create a new backend service (repository, CI pipeline, base manifests, ownership metadata).
- Deploy to dev and staging via GitOps.
- Enforce security baselines (Pod Security Standards, network policies, image scanning).
- Expose dashboards (latency, errors, resource usage, cost allocation tags).
In a bundled platform, those steps often assume specific tool choices. In a modular platform, the portal can provide the workflow while allowing the platform team to swap components — within a compatibility framework — as long as the workflow remains consistent.
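That decoupling can be sketched as a workflow whose steps are stable while the implementations are injected. Everything below is a hypothetical illustration — the step names (`create_repo`, `register_pipeline`, and so on) stand in for whatever tools an organization actually runs:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ServiceRequest:
    name: str
    team: str
    template: str = "backend-default"


@dataclass
class GoldenPath:
    """Portal workflow: stable steps, swappable implementations."""

    create_repo: Callable[[ServiceRequest], str]
    register_pipeline: Callable[[str], None]
    commit_manifests: Callable[[str], None]
    apply_baselines: Callable[[str], None]
    steps_run: list[str] = field(default_factory=list)

    def create_service(self, req: ServiceRequest) -> str:
        repo = self.create_repo(req)      # repository + ownership metadata
        self.register_pipeline(repo)      # CI wiring
        self.commit_manifests(repo)       # base manifests, deployed via GitOps
        self.apply_baselines(repo)        # PSS, network policies, scanning
        self.steps_run = ["repo", "ci", "gitops", "security"]
        return repo


# Wiring in stub backends shows the contract; real ones could be any
# SCM, CI system, GitOps controller, or policy engine, without changing
# the developer-facing flow.
path = GoldenPath(
    create_repo=lambda r: f"git@example.com:{r.team}/{r.name}.git",
    register_pipeline=lambda repo: None,
    commit_manifests=lambda repo: None,
    apply_baselines=lambda repo: None,
)
repo_url = path.create_service(ServiceRequest(name="orders", team="payments"))
```

Swapping a backend means replacing one callable; the workflow the portal exposes to developers never changes.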
GitOps as the glue for modularity (and why Flux keeps showing up)
If you want modules that can be independently installed and upgraded without turning your platform into a pet project, you need repeatable change management. GitOps is the obvious candidate because it gives you:
- A single source of truth (Git).
- Auditability (pull requests, code review, history).
- Reconciliation (drift detection and correction).
- Repeatability across fleets and environments.
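Of those four properties, reconciliation is the one doing the heavy lifting for modularity. A toy sketch of the loop — desired state from Git, actual state from the cluster, both modeled here as plain dicts of resource name to revision, which is a deliberate simplification of what controllers like Flux do against the Kubernetes API:

```python
def reconcile(desired: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare desired (Git) vs. actual (cluster) state and apply corrections."""
    to_create = [k for k in desired if k not in actual]
    to_update = [k for k in desired if k in actual and actual[k] != desired[k]]
    to_prune = [k for k in actual if k not in desired]

    # Converge the cluster on what Git declares.
    for k in to_create + to_update:
        actual[k] = desired[k]
    for k in to_prune:
        del actual[k]

    return {"created": to_create, "updated": to_update, "pruned": to_prune}


desired = {"ingress-nginx": "v1.9.0", "kyverno": "v1.12.0"}
actual = {"ingress-nginx": "v1.8.0", "legacy-dashboard": "v0.3.1"}
drift = reconcile(desired, actual)
```

Because the loop runs continuously, adding or removing a module is just another Git commit — which is exactly what makes modules independently installable.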
Giant Swarm’s documentation and tutorials emphasize Flux-based GitOps for managing clusters and applications, including recommended repository structures that map to management clusters, organizations, and workload clusters.
Flux itself has been evolving toward better operational tooling, including efforts to connect AI assistants to GitOps context through a dedicated MCP server. Whether you want AI in the loop or not, the underlying point is that GitOps ecosystems are increasingly building richer “operator experience” tooling.
A warning: GitOps doesn’t magically eliminate integration work
GitOps makes changes reproducible. It does not make them automatically safe. A modular platform still needs:
- Staging environments that mirror production enough to matter.
- Compatibility matrices and upgrade playbooks.
- Policy testing (e.g., “will this Kyverno change block deploys?”).
- Rollback strategy and “blast radius” thinking.
Modularity shifts the challenge from “how do we install all of this?” to “how do we continuously evolve this without breaking teams every Tuesday?” Which is progress — but it’s still work.
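Policy testing in particular is cheap to automate before changes merge. A minimal sketch — the policy shape here is a made-up simplification, not Kyverno’s actual schema: replay today’s workloads against the candidate policy and fail CI if anything currently deployed would start getting blocked.

```python
def violates(policy: dict, workload: dict) -> bool:
    """True if the workload would be blocked by the (simplified) policy.

    Real engines such as Kyverno or Gatekeeper evaluate full admission
    requests; a required-labels check stands in for that here.
    """
    required = policy.get("require_labels", [])
    return any(label not in workload.get("labels", {}) for label in required)


def dry_run(policy: dict, workloads: list[dict]) -> list[str]:
    """List workloads a proposed policy change would start blocking."""
    return [w["name"] for w in workloads if violates(policy, w)]


# Existing production workloads, replayed against the *candidate* policy.
current_workloads = [
    {"name": "orders", "labels": {"team": "payments", "tier": "backend"}},
    {"name": "legacy-cron", "labels": {"tier": "batch"}},
]
candidate_policy = {"require_labels": ["team"]}
blocked = dry_run(candidate_policy, current_workloads)
```

A non-empty `blocked` list is the answer to “will this change break deploys?” before anyone finds out in production.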
Cluster fleets make bundles painful: one size doesn’t fit dozens (or hundreds) of clusters
As fleets grow, the cost of rigid bundling rises. Giant Swarm’s platform messaging references operating at scale across many production clusters, and their documentation talks about fleet management practices aligned with organizational structure.
Fleet reality introduces a few modularity drivers:
- Different risk profiles: prod vs. dev vs. edge clusters won’t share the same requirements.
- Different regulatory constraints: data residency and access control can vary per business unit or region.
- Different performance economics: AI/GPU clusters vs. general compute clusters behave like different planets.
- Different lifecycle needs: some clusters are long-lived, others ephemeral for testing.
Bundles generally assume uniformity. Fleets punish uniformity.
Security as a modular capability: installable stacks, consistent baselines
Security is where modular platforms get tested hardest, because the cost of inconsistency is not “a developer got annoyed,” it’s “you shipped a compliance violation.”
Giant Swarm’s security documentation describes a secure-by-default posture and a set of integrated open-source tools and approaches: policy enforcement, RBAC, network policies, Pod Security Standards, and optional components such as Trivy/Trivy Operator, Kyverno, Falco, and Harbor. They also explicitly frame this as a stack with independently installable components to match different security requirements.
This is a good example of modularity done responsibly: you can tailor the stack, but you don’t reinvent the fundamentals every time. There’s a baseline, and then there are selectable layers.
Modular security still needs centralized policy ownership
One lesson platform teams learn the hard way: if nobody owns global policy, “team autonomy” becomes “team lottery.” Modular platforms should still support:
- Fleet-wide minimum standards (PSS levels, network isolation, image scanning requirements).
- Exception workflows (time-boxed policy bypass with approval and auditing).
- Evidence generation for audits (what was enforced, where, when).
Modularity can increase flexibility, but it should not dilute accountability.
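The exception workflow is worth sketching because “time-boxed” is the part teams skip. A minimal model under assumed semantics (the field names and matching rules are illustrative): every bypass carries an approver and an expiry, and expired exceptions simply stop applying.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PolicyException:
    """A time-boxed, audited bypass of a fleet-wide policy."""

    policy: str
    cluster: str
    approved_by: str
    expires_at: datetime

    def active(self, now: datetime) -> bool:
        return now < self.expires_at


def is_enforced(policy: str, cluster: str,
                exceptions: list[PolicyException], now: datetime) -> bool:
    """The policy applies unless a matching, unexpired exception exists."""
    return not any(
        e.policy == policy and e.cluster == cluster and e.active(now)
        for e in exceptions
    )


now = datetime(2026, 2, 13, tzinfo=timezone.utc)
exceptions = [
    PolicyException("pss-restricted", "dev-eu-1", "secops-lead",
                    expires_at=now + timedelta(days=14)),
]
```

Because the exception is data, not a config edit, the audit trail (who approved what, where, until when) falls out for free.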
Observability and the modular platform: standard signals, flexible backends
Observability is another domain where organizations have strong opinions. Some teams are standardized on the Grafana stack; others are deep into OpenTelemetry pipelines; others run vendor platforms for logs, metrics, and traces. A bundle that insists you use “observability the vendor way” is likely to be rejected by the teams who already have a mature setup.
The modular approach suggests a compromise: define standard signals (metrics naming conventions, tracing propagation, log schemas, SLO definitions) and allow backends to vary — with an integration contract that keeps the developer experience stable.
In other words: don’t force everyone onto the same dashboards; force everyone to emit the same signals.
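A “signal contract” like that can be enforced mechanically. A sketch that checks emitted metrics against naming and label conventions — the conventions themselves are invented for illustration, not a standard:

```python
import re

# Illustrative conventions: snake_case names ending in a unit suffix,
# plus a mandatory label set for ownership and cost allocation.
METRIC_NAME = re.compile(r"^[a-z][a-z0-9_]*_(seconds|bytes|total|ratio)$")
REQUIRED_LABELS = {"service", "team", "env"}


def check_metric(name: str, labels: dict[str, str]) -> list[str]:
    """Return the contract violations for one emitted metric."""
    problems = []
    if not METRIC_NAME.match(name):
        problems.append(f"bad name: {name}")
    missing = REQUIRED_LABELS - labels.keys()
    if missing:
        problems.append(f"missing labels: {sorted(missing)}")
    return problems


ok = check_metric("http_request_duration_seconds",
                  {"service": "orders", "team": "payments", "env": "prod"})
bad = check_metric("RequestTime", {"service": "orders"})
```

Run this in CI against whatever a service emits and the backends can vary freely; the signals cannot.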
Cost visibility is driving modularity more than most people admit
There is a gentle fiction in platform engineering that platforms are purchased because of “developer productivity.” That’s true, but it’s also incomplete. Platforms are purchased because of risk reduction and cost governance.
FinOps research and commentary across the industry consistently highlight waste reduction and optimization as top priorities, and the scope of FinOps is expanding beyond pure public cloud into SaaS, licensing, data centers, and AI. This means platform decisions increasingly get evaluated through a “total technology spend” lens rather than a narrow infrastructure lens.
That context strengthens the modular argument: if cost discipline is a first-class requirement, paying for unused platform capabilities becomes harder to justify, and swapping components to match economic reality becomes more attractive.
Modular platform ROI is easier to prove incrementally
One practical win of modularity is sequencing:
- Start with Kubernetes fleet management and secure defaults.
- Add autoscaling/capacity optimization when spend becomes painful.
- Add portal workflows when onboarding and discoverability are slowing teams.
- Add AI infrastructure modules when experimentation moves to production.
Each step can be justified with a measurable bottleneck. That’s a far easier internal conversation than “we need a giant bundle because the future is complex.” The future is always complex. Finance prefers receipts.
AI changed the platform conversation: why “add it later” is suddenly the sane option
AI infrastructure is a perfect stress test for modular platforms, because AI adoption is uneven. Some organizations are all-in on internal model serving and GPU fleets; others are using managed APIs and have minimal infrastructure requirements.
The CNCF launched a Certified Kubernetes AI Conformance Program to standardize expectations for AI workloads on Kubernetes, explicitly aiming to reduce fragmentation and improve interoperability. The announcement frames this as a community-led effort to define minimum capabilities required for reliable AI on Kubernetes.
Giant Swarm’s post mentions that it is among the first platforms certified through this program (and ties that to their broader modular capability story).
Even if you ignore certifications, the operational reality is this: GPUs are expensive, AI workloads are spiky, and the “unit economics” are still evolving (tokens, inference, training runs, storage, data egress). In that environment, bundling AI infrastructure into every platform contract is like bundling a snowplow into every car purchase. Some people need it. Others live in Florida.
Industry perspective: “choice and modularity” is becoming an expectation
Thylmann references a comment attributed to Benjamin Brial (Cycloid) about businesses expecting internal developer platforms to offer choice and modularity. Whether or not you take that as a universal truth, the broader industry discourse supports the trend: IDPs and platform tooling increasingly position themselves as curation layers rather than monoliths.
Brial has also discussed IDPs publicly in other venues, emphasizing that IDPs address scaling DevOps, hybrid complexity, and cloud waste, and describing common IDP components like service catalogs and orchestration layers.
The bigger point: modularity is not only a technical preference. It’s becoming a procurement and organizational preference, because it aligns with how enterprises change: slowly, unevenly, and with lots of legacy constraints.
Case studies and comparisons: DIY vs. bundled vs. modular-curated
Most organizations end up choosing one of three models, even if they don’t label them this way.
1) DIY platform engineering (assemble your own stack)
Pros: maximum control, best fit for unique constraints, no vendor coupling.
Cons: integration tax, upgrade burden, staffing burden, operational risk. You become the vendor. Congratulations, you now support your own product — and your customers are very loud engineers.
DIY can work brilliantly when you have strong platform talent and stable requirements. It becomes painful when your platform team is small, the fleet is large, and compliance requirements evolve faster than your backlog.
2) Bundled platform (one vendor stack)
Pros: faster initial time-to-value, fewer integration decisions, simpler procurement.
Cons: bundle trap risk, paying for unused capabilities, parallel tool sprawl, reduced flexibility when requirements change.
Bundles often shine for greenfield teams or organizations that are willing to standardize aggressively. They struggle in enterprises with existing tooling and strong opinions.
3) Modular-curated platform (capabilities you can add/remove, with tested integration)
Pros: incremental adoption, reduced shelf-ware, flexibility to keep existing investments, easier to prove ROI per capability.
Cons: more decisions, governance complexity, risk of fragmentation if contracts and compatibility aren’t managed.
This is the space Giant Swarm is arguing for: modularity with curation and integration, not modularity as “here is a catalog of CNCF projects.”
How to adopt modularity without building a patchwork platform
If you’re a platform lead reading this and thinking, “Modular sounds good, but I’ve seen ‘plug and play’ become ‘plug and pray’,” you’re not wrong. Here are practical guardrails that make modularity survivable.
Define your platform contracts first
Before you pick modules, define what must stay stable:
- How teams request environments and resources.
- How deployments happen (e.g., GitOps workflow, CI triggers).
- How identity and authorization are mapped to org structures.
- How telemetry is accessed and what minimum signals exist.
- How policy exceptions are handled.
Contracts are the difference between “modular” and “constantly re-platforming.”
Run a compatibility program (yes, like a mini CNCF conformance)
Even if you never publish a logo, you need internal conformance:
- Supported versions of Kubernetes and core addons.
- Supported combinations of modules.
- Upgrade sequencing and rollback procedures.
CNCF’s conformance programs exist for a reason: interoperability is expensive without shared tests and standards.
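An internal compatibility matrix does not need tooling to start; it needs to be machine-checkable. A sketch — the version numbers and module names below are invented — that gates a proposed fleet change against tested combinations:

```python
# Tested combinations: Kubernetes version -> supported module versions.
# All values here are illustrative, not real support statements.
SUPPORTED = {
    "1.31": {"kyverno": {"1.12", "1.13"}, "flux": {"2.3", "2.4"}},
    "1.32": {"kyverno": {"1.13"}, "flux": {"2.4"}},
}


def validate_combo(k8s: str, modules: dict[str, str]) -> list[str]:
    """Return reasons a proposed combination is unsupported (empty = OK)."""
    matrix = SUPPORTED.get(k8s)
    if matrix is None:
        return [f"untested Kubernetes version: {k8s}"]
    return [
        f"{name} {ver} untested on Kubernetes {k8s}"
        for name, ver in modules.items()
        if ver not in matrix.get(name, set())
    ]


ok = validate_combo("1.32", {"kyverno": "1.13", "flux": "2.4"})
bad = validate_combo("1.32", {"kyverno": "1.12"})
```

Run the check as a pre-merge gate on the Git repository that declares fleet state, and the matrix enforces itself instead of living in a wiki.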
Invest in “module lifecycle” automation, not just day-one installation
Most pain happens after go-live: upgrades, CVEs, new org requirements, expansions, migrations. Modular platforms should be judged on their lifecycle ergonomics, not their demo day.
GitOps helps, but you also need:
- Automated policy testing.
- Progressive delivery patterns for platform changes.
- Fleet-wide drift reporting.
- Clear deprecation pathways.
So is the future modular?
Yes — but not because modularity is fashionable. Because the enterprise reality is modular already:
- Your org is modular (business units, regions, compliance regimes).
- Your infrastructure is modular (hybrid, multi-cloud, edge).
- Your workload types are modular (web, data, AI, batch, streaming).
- Your developer experience needs are modular (golden paths differ by team maturity).
A monolithic platform bundle can still work in certain contexts, but the default trend line is toward pick-and-choose capabilities, integrated through consistent workflows and APIs — especially as AI infrastructure becomes a new, unevenly adopted requirement.
Giant Swarm’s argument is ultimately a pragmatic one: platforms should adapt to how organizations actually evolve, not force organizations to adapt to a vendor’s reference architecture. That’s the kind of position that sounds obvious — right up until you try to operationalize it across a fleet, a compliance regime, and a budget meeting.
In other words, modularity is the future… provided you do the unglamorous work that makes modularity coherent.
Sources
- Giant Swarm – “The future is modular: what a decade of running Kubernetes taught us about platforms” (Oliver Thylmann, Feb 13, 2026)
- Giant Swarm Docs – Architecture of the Giant Swarm cloud-native developer platform
- Giant Swarm Docs – Fleet management
- Giant Swarm Docs – Managing workload clusters with GitOps
- Giant Swarm Docs – Security
- Giant Swarm Docs – Platform Security
- CNCF – Certified Kubernetes AI Conformance Program announcement (Nov 11, 2025)
- CNCF – CNCF and SlashData: cloud native ecosystem surges to 15.6M developers (Nov 11, 2025)
- CNCF – Backstage project page
- CNCF – Backstage joins the CNCF incubator (Mar 15, 2022)
- FinOps Foundation – Reducing waste and managing commitments top priorities
- FinOps Foundation – State of FinOps (site)
- CIO – “FinOps breaks out of the cloud”
- FluxCD blog – AI-assisted GitOps with Flux Operator MCP Server
- Software Engineering Radio – Episode 699: Benjamin Brial on Internal Dev Platforms (Dec 17, 2025)
Bas Dorland, Technology Journalist & Founder of dorland.org