VCF 9.0 GA Mental Model Part 4: Fleet Topologies and SSO Boundaries (Single Site, Dual Site, Multi-Region)

TL;DR

This post targets VCF 9.0 GA only: VCF 9.0 (17 JUN 2025) build 24755599, with GA BOM examples including VCF Installer 9.0.1.0 build 24962180, ESX 9.0.0.0 build 24755229, vCenter 9.0.0.0 build 24755230, NSX 9.0.0.0 build 24733065, SDDC Manager 9.0.0.0 build 24703748, VCF Operations 9.0.0.0 build 24695812, VCF Automation 9.0.0.0 build 24701403, and VCF Identity Broker 9.0.0.0 build 24695128.
Your topology choice is the first big day-0 decision:
- Single site usually starts with one fleet + one instance.
- Two sites in one region is typically one fleet + one instance with stretched clusters for higher availability.
- Multi-region typically becomes one fleet + multiple instances (often one instance per region for latency, sovereignty, and isolation).
Your identity choice is the second big day-0 decision:
- Embedded VCF Identity Broker is the simplest and aligns to one broker per instance.
- Appliance VCF Identity Broker is a 3-node cluster, recommended for multi-instance SSO due to availability and scale (rule of thumb: up to 5 instances per broker).
- SSO model controls blast radius: fleet-wide SSO has the largest login blast radius; per-instance SSO has the smallest.
Fleets are best treated as shared governance and lifecycle scope for management components, not a shared instance management plane. Instances keep their own SDDC Manager, vCenter, and NSX control planes.

Architecture Diagram

Table of Contents

Scope and Code Levels
Assumptions
Scenario
Core Concepts Refresher
Decision Criteria
Challenge: Pick Your Topology
Challenge: Pick Your SSO and Identity Boundaries
Architecture Tradeoff Matrix
Failure Domain Analysis
Who Owns What
Operational Runbook Snapshot
Change Management Considerations
Anti-patterns
Validation
Summary and Takeaways
Conclusion

Scope and Code Levels

This article is written against VCF 9.0 GA terminology and design guidance.

Version Compatibility Matrix

The GA build levels referenced throughout this post:

Component	Version	Build
VMware Cloud Foundation	9.0	24755599
VCF Installer	9.0.1.0	24962180
ESX	9.0.0.0	24755229
vCenter	9.0.0.0	24755230
NSX	9.0.0.0	24733065
SDDC Manager	9.0.0.0	24703748
VCF Operations	9.0.0.0	24695812
VCF Automation	9.0.0.0	24701403
VCF Identity Broker	9.0.0.0	24695128

Assumptions

You are greenfield and building your first VCF 9.0 platform.
You deploy both VCF Operations and VCF Automation from day-1.
You need patterns for:
- Single site
- Two sites in one region
- Multi-region
You may need either:
- Shared identity (one IdP and unified SSO experience)
- Regulated isolation (separate IdPs and separate SSO boundaries)

Scenario

You are trying to align:

Architects who draw the boxes
Operators who keep the lights on
Leaders who approve budgets and risk decisions

Each group asks a version of the same questions:

“How many independent failure domains do we actually have?”
“What is the blast radius when identity breaks?”
“Do we have one private cloud or multiple?”
“Who is on the hook when something fails at fleet vs instance scope?”

Core Concepts Refresher

The four scopes, from widest to narrowest:

Fleet: where you centralize governance and fleet-scoped management services (not your per-instance management planes).
Instance: a discrete VCF deployment footprint with its own management domain and workload domains.
Domain: where lifecycle and isolation are designed to be independently managed (management domain + workload domains).
Clusters: where you scale capacity and availability inside a domain.

Quick mental model

If the question is “How do we run workloads here?” think instance -> domains -> clusters.
If the question is “How do we standardize and govern across footprints?” think fleet.
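
The fleet -> instance -> domain -> cluster nesting can be written down as a small data model. This is only an illustrative sketch: the class names, field names, and sample values (`fleet-01`, `instance-a`, host counts) are assumptions made for this example, not VCF API objects.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    hosts: int


@dataclass
class Domain:
    name: str
    kind: str  # "management" or "workload"
    clusters: list[Cluster] = field(default_factory=list)


@dataclass
class Instance:
    name: str
    domains: list[Domain] = field(default_factory=list)

    def management_domains(self) -> list[Domain]:
        # Each instance carries its own management domain.
        return [d for d in self.domains if d.kind == "management"]


@dataclass
class Fleet:
    name: str
    instances: list[Instance] = field(default_factory=list)

    def total_hosts(self) -> int:
        # Walk fleet -> instance -> domain -> cluster and sum host counts.
        return sum(
            c.hosts
            for i in self.instances
            for d in i.domains
            for c in d.clusters
        )


# Hypothetical single-site footprint: one fleet, one instance.
fleet = Fleet("fleet-01", [
    Instance("instance-a", [
        Domain("mgmt", "management", [Cluster("mgmt-cl01", 4)]),
        Domain("wld-01", "workload", [Cluster("wld-cl01", 8), Cluster("wld-cl02", 8)]),
    ]),
])
print(fleet.total_hosts())
```

Capacity questions resolve by walking down the tree, while governance questions stop at the `Fleet` object.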

Decision Criteria

Use these criteria to avoid arguing in circles.

Design-time decisions you should treat as “hard to change”

Fleet count and why each fleet exists
Instance per site/region strategy
SSO scope and identity broker deployment mode
Which components must survive a site failure vs a region failure
Certificate and backup architecture for management components

Day-2 decisions you can iterate on

Adding workload domains
Adding clusters to domains
Adding instances to an existing fleet
Moving from per-instance SSO to cross-instance SSO (note: requires reset and reconfiguration, so treat as a major change event)

Challenge: Pick Your Topology

The three footprints from the assumptions (single site, two sites in one region, multi-region) map to the options below.

Solutions

Option A: Single site (one fleet, one instance)

What it is

One VCF fleet.
One VCF instance in one physical site.
One management domain plus one or more workload domains.

Option B: Two sites in one region (one fleet, one instance)

What it is

One VCF fleet.
One VCF instance.
Two sites in the same metro region, using stretched clusters to increase availability across sites.

Option C: Multi-region (one fleet, multiple instances)

What it is

One VCF fleet.
Multiple VCF instances, typically aligned to regions (or sovereignty boundaries).
Each instance has its own management domain and workload domains.
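
The three starting patterns can be condensed into a small helper for design workshops. This is a minimal sketch, assuming the footprint is described as a map of region name to site count; `suggest_topology` and `TopologySuggestion` are hypothetical names, and the result is a starting point rather than a sizing rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopologySuggestion:
    fleets: int
    instances: int
    stretched_clusters: bool


def suggest_topology(sites_per_region: dict[str, int]) -> TopologySuggestion:
    """Map a physical footprint to one of the three starting patterns."""
    if not sites_per_region or any(n < 1 for n in sites_per_region.values()):
        raise ValueError("need at least one region with at least one site")

    if len(sites_per_region) == 1:
        sites = next(iter(sites_per_region.values()))
        if sites > 2:
            # The patterns in this post only cover one or two sites per region.
            raise ValueError("more than two sites in one region is not covered here")
        # One site: single instance. Two sites: single instance, stretched clusters.
        return TopologySuggestion(fleets=1, instances=1, stretched_clusters=(sites == 2))

    # Multi-region: one fleet, typically one instance per region.
    # Stretching inside a region is a separate decision and is not modeled here.
    return TopologySuggestion(
        fleets=1, instances=len(sites_per_region), stretched_clusters=False
    )


print(suggest_topology({"emea": 1}))
print(suggest_topology({"emea": 2}))
print(suggest_topology({"emea": 1, "apac": 1, "amer": 1}))
```

Sovereignty or isolation requirements can still justify more instances or fleets than this helper suggests.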

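Challenge: Pick Your SSO and Identity Boundaries

In VCF 9.0, identity is not a minor checkbox. It drives operator experience, automation integration, and incident response.

Per-instance SSO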

Each VCF instance uses its own dedicated VCF Identity Broker.
SSO scope is limited to that instance.
Users re-authenticate when moving across instances.

Cross-instance SSO

Multiple identity brokers exist.
Each identity broker serves a set of instances in the same fleet.
VCF management components (VCF Operations and VCF Automation) connect to only one identity broker for SSO, so choose that mapping deliberately.

Fleet-wide SSO

One identity broker services all instances in a fleet.
Users log in once and move across instances without re-authentication.

Why you choose it

You value simplicity and a unified admin experience.
You accept that identity is a shared dependency for the fleet.

Failure domain notes

This has the largest login blast radius.
You should strongly consider the appliance deployment mode for availability.

Embedded vs appliance identity broker

The identity broker can run in one of two deployment modes:

Embedded mode
- Runs as a service inside the management domain vCenter.
- vCenter maintenance impacts your ability to authenticate to VCF components.
- Simplest footprint.
Appliance mode
- A standalone 3-node identity broker cluster deployed via VCF Operations fleet management.
- High availability comes from nodes running on separate hosts.
- Operational tasks on vCenter do not impact the authentication stack in the same way.
- Recommended for multi-instance SSO due to availability and scale (rule of thumb: up to five instances per broker).
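
The rule of thumb above turns into simple sizing arithmetic. This is a rough sketch, assuming the five-instances-per-broker guideline is applied as a planning ceiling rather than a hard product limit; the function names are made up for this example.

```python
import math

MAX_INSTANCES_PER_BROKER = 5  # rule of thumb quoted above, not a hard limit
NODES_PER_APPLIANCE_BROKER = 3  # appliance mode is a 3-node cluster


def min_appliance_brokers(instances: int) -> int:
    """Smallest number of appliance brokers that keeps each at or under the guideline."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return math.ceil(instances / MAX_INSTANCES_PER_BROKER)


def appliance_broker_nodes(instances: int) -> int:
    """Total broker nodes to budget for at that minimum broker count."""
    return min_appliance_brokers(instances) * NODES_PER_APPLIANCE_BROKER


for n in (1, 5, 6, 12):
    print(n, min_appliance_brokers(n), appliance_broker_nodes(n))
```

A fleet at or under the guideline can sit behind a single appliance broker (fleet-wide SSO). Beyond that, the arithmetic points toward cross-instance SSO with more than one broker.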

Tenant multi-tenancy identity patterns in VCF Automation

VCF Automation supports two provider/tenant identity models:

Enterprise model
- Provider and tenants use the same identity provider.
- Simplest for internal enterprise IT.
Service provider model
- Provider and tenants use different identity providers.
- Better fit for regulated tenants, partner access, or MSP-style separation.

Architecture Tradeoff Matrix

Use this matrix in design reviews to avoid subjective debates.

Decision point	Option	Strengths	Tradeoffs
Physical topology	Single site	Fastest to deploy, lowest complexity	No site-level resilience by default
	Two sites in one region	Site resilience with one instance	Requires stretched network/storage design discipline
	Multi-region	Region isolation, scalable org model	Higher footprint, more coordination, DR becomes explicit
Fleet count	One fleet	Centralized governance and consistency	Shared governance dependencies, shared change windows
	Multiple fleets	Stronger governance isolation, separate identity boundaries possible	Duplicate fleet services, more ops overhead
Platform SSO model	Fleet-wide	Lowest footprint, best UX	Largest login blast radius
	Cross-instance	Balanced footprint and blast radius	More moving parts than fleet-wide
	Per-instance	Smallest login blast radius	Highest footprint and operational overhead
Identity broker mode	Embedded	Lowest footprint	Coupled to vCenter maintenance, simpler availability story
	Appliance	HA and scale, decoupled from vCenter maintenance	More resources and lifecycle tasks

Failure Domain Analysis

Treat these as distinct failure domains with different operational responses.

Fleet service failure domains

VCF Operations / VCF Automation unavailable
- Provisioning workflows, centralized operations views, and governance functions degrade.
- Your vCenters and NSX managers inside instances still exist, but you lose the consolidated interface and some automation paths.
VCF Identity Broker outage
- Impacts logins based on your SSO model:
  - Fleet-wide: impacts logins for the fleet.
  - Cross-instance: impacts the subset of instances attached.
  - Per-instance: impacts only one instance.
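
To make the login blast radius concrete, map each identity broker to the instances it serves and ask what a single broker outage takes out. The sketch below assumes hypothetical broker and instance names and is not tied to any VCF API.

```python
def login_blast_radius(
    broker_to_instances: dict[str, list[str]], failed_broker: str
) -> tuple[list[str], float]:
    """Return the impacted instances and the fraction of the fleet they represent."""
    all_instances = {i for members in broker_to_instances.values() for i in members}
    if not all_instances:
        raise ValueError("no instances are mapped to any broker")
    impacted = sorted(broker_to_instances[failed_broker])
    return impacted, len(impacted) / len(all_instances)


# The same four-instance fleet under each SSO model (names are illustrative).
fleet_wide = {"idb-fleet": ["inst-a", "inst-b", "inst-c", "inst-d"]}
cross_instance = {"idb-1": ["inst-a", "inst-b"], "idb-2": ["inst-c", "inst-d"]}
per_instance = {f"idb-{n}": [f"inst-{n}"] for n in "abcd"}

print(login_blast_radius(fleet_wide, "idb-fleet"))
print(login_blast_radius(cross_instance, "idb-1"))
print(login_blast_radius(per_instance, "idb-a"))
```

The fraction is the number to bring to a risk review: it moves from the whole fleet, to a subset, to a single instance as the SSO model narrows.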

Instance failure domains

Management domain outage (inside an instance)
- Impacts that instance’s lifecycle and management capabilities.
- May also impact authentication if using embedded identity broker in that instance.
Workload domain outage
- Impacts workloads isolated to that domain.
- Does not necessarily take down the instance management domain.

Who Owns What

Use this chart to stop escalations from bouncing between teams.

Capability / task	Platform team	VI admin	App/platform teams
Choose fleet count and topology blueprint	✅	⬜	⬜
Define instance-per-site/region strategy	✅	✅	⬜
Deploy first instance and management domain	✅	✅	⬜
Deploy fleet services (VCF Operations + VCF Automation)	✅	⬜	⬜
Create workload domains	✅	✅	⬜
Define SSO model and identity broker mode	✅	✅	⬜
Configure VCF Single Sign-On and component registration	✅	✅	⬜
Provider identity in VCF Automation	✅	⬜	⬜
Tenant identity in VCF Automation (enterprise vs service provider model)	✅	⬜	✅
Day-n operations in a region (multi-region)	⬜	✅	✅ (workload level)
Certificate lifecycle standard and tooling	✅	✅	⬜
Backup and restore strategy for management components	✅	✅	⬜
Workload onboarding, catalogs, templates, guardrails	✅	⬜	✅

Operational Runbook Snapshot

Keep your runbook short and repeatable. This is a starting point.

Daily

Check platform health in VCF Operations (fleet services and connected instances).
Validate identity broker health and login paths.
Verify capacity alarms and failed automation runs.

Weekly

Confirm backups for:
- VCF Operations and fleet management services
- VCF Automation
- Identity broker (appliance mode) or vCenter backups (embedded mode)
- Instance core components (SDDC Manager, vCenter, NSX)
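
The weekly check can be scripted as a gap report. This is a minimal sketch: it assumes you already collect a "last successful backup" timestamp per target from your own backup tooling, and the target labels and the seven-day threshold are assumptions chosen to match the list above.

```python
from datetime import datetime, timedelta, timezone


def required_backup_targets(broker_mode: str) -> list[str]:
    """Targets from the weekly list, adjusted for the identity broker mode."""
    targets = ["VCF Operations and fleet management services", "VCF Automation"]
    if broker_mode == "appliance":
        targets.append("VCF Identity Broker")
    elif broker_mode != "embedded":
        raise ValueError("broker_mode must be 'embedded' or 'appliance'")
    # Embedded mode relies on the vCenter backup, which is in the core list below.
    return targets + ["SDDC Manager", "vCenter", "NSX"]


def backup_gaps(
    last_success: dict[str, datetime],
    broker_mode: str,
    now: datetime,
    max_age: timedelta = timedelta(days=7),
) -> dict[str, str]:
    """Report required targets with a missing or stale successful backup."""
    gaps = {}
    for target in required_backup_targets(broker_mode):
        ts = last_success.get(target)
        if ts is None:
            gaps[target] = "no successful backup recorded"
        elif now - ts > max_age:
            gaps[target] = f"last success {(now - ts).days} days ago"
    return gaps


# Hypothetical timestamps for one instance.
now = datetime(2025, 7, 1, tzinfo=timezone.utc)
last_success = {
    "VCF Operations and fleet management services": now - timedelta(days=1),
    "VCF Automation": now - timedelta(days=10),
    "SDDC Manager": now - timedelta(days=2),
    "vCenter": now - timedelta(days=2),
    "NSX": now - timedelta(days=2),
}
for target, problem in backup_gaps(last_success, "appliance", now).items():
    print(f"{target}: {problem}")
```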

Monthly

Review certificate expirations and renewal pipeline.
Review drift and out-of-band changes.
Review tenancy boundaries and entitlement creep.

Incident workflow

Identify scope first:
- Fleet services issue vs instance issue vs workload domain issue
For identity incidents:
- Identify which identity broker and which SSO model is in use for impacted components.
- Decide if this is a login outage only, or also an authorization/role mapping problem.

Change Management Considerations

These are the changes that most often turn into “why is this so hard?”

Identity resets are a major change event

Treat changes to either of the following as reset-class events:

Identity broker deployment mode (embedded <-> appliance)
Identity provider changes

For each of these, plan on:

A planned outage window
A rollback-resistant change (because users/groups and component registrations are impacted)
A runbook that includes role and permission re-assignment

Lifecycle sequencing matters

VCF 9.x formalizes separation between:

Management components managed at fleet level
Core components managed at instance level

Even if you are on 9.0 GA today, your day-2 operating model should assume this separation so upgrades do not surprise you later.
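
One way to keep the fleet-level and instance-level split visible in change planning is to group components by lifecycle scope before scheduling anything. The mapping below is a sketch built only from the components this post names; the function name is hypothetical, and the grouping says nothing about upgrade order.

```python
# Lifecycle scope per component, as described in this post.
LIFECYCLE_SCOPE = {
    "VCF Operations": "fleet",
    "VCF Automation": "fleet",
    "SDDC Manager": "instance",
    "vCenter": "instance",
    "NSX": "instance",
}


def group_by_change_scope(
    components: list[str], instances: list[str]
) -> dict[str, list[str]]:
    """Group components into one fleet-level bucket plus one bucket per instance."""
    buckets: dict[str, list[str]] = {"fleet-level": []}
    for instance in instances:
        buckets[instance] = []
    for component in components:
        scope = LIFECYCLE_SCOPE.get(component)
        if scope is None:
            raise ValueError(f"unknown component: {component}")
        if scope == "fleet":
            buckets["fleet-level"].append(component)
        else:
            # Instance-scoped components are changed once per instance.
            for instance in instances:
                buckets[instance].append(component)
    return {name: items for name, items in buckets.items() if items}


plan = group_by_change_scope(
    ["VCF Operations", "vCenter", "NSX"], ["instance-a", "instance-b"]
)
for scope, items in plan.items():
    print(scope, items)
```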

Anti-patterns

These are common failure modes that look fine in diagrams but hurt in production.

Designing multi-region without explicitly deciding where fleet-level services live and how you recover them.
Choosing fleet-wide SSO for operator convenience without acknowledging the login blast radius.
Using embedded identity broker in environments where vCenter maintenance windows are frequent and strict uptime is required.
Treating “separate IdP” as enough isolation while keeping everything in one governance boundary.
Letting tenants share identity or entitlements by accident in VCF Automation due to weak onboarding guardrails.
Skipping workload domains and placing consumer workloads in the management domain.

Validation

Validate code levels quickly (vCenter example)
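
However you collect the reported version and build (UI, API, or an inventory export), the comparison against the GA bill of materials is easy to script. The sketch below assumes you pass in the reported values yourself. `bom_mismatches` is a made-up helper name, and the sample `reported` values are invented for the example.

```python
# GA build levels copied from the version matrix in this post.
EXPECTED_BOM = {
    "VCF Installer": ("9.0.1.0", "24962180"),
    "ESX": ("9.0.0.0", "24755229"),
    "vCenter": ("9.0.0.0", "24755230"),
    "NSX": ("9.0.0.0", "24733065"),
    "SDDC Manager": ("9.0.0.0", "24703748"),
    "VCF Operations": ("9.0.0.0", "24695812"),
    "VCF Automation": ("9.0.0.0", "24701403"),
    "VCF Identity Broker": ("9.0.0.0", "24695128"),
}


def bom_mismatches(reported: dict[str, tuple[str, str]]) -> dict[str, str]:
    """Return one message per component whose (version, build) differs from the BOM."""
    problems = {}
    for component, actual in reported.items():
        expected = EXPECTED_BOM.get(component)
        if expected is None:
            problems[component] = "not in the GA BOM table"
        elif actual != expected:
            problems[component] = (
                f"expected {expected[0]} build {expected[1]}, "
                f"got {actual[0]} build {actual[1]}"
            )
    return problems


# Hypothetical reported values: vCenter matches, NSX does not.
reported = {
    "vCenter": ("9.0.0.0", "24755230"),
    "NSX": ("9.0.0.0", "24700000"),
}
for component, message in bom_mismatches(reported).items():
    print(f"{component}: {message}")
```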

Validate SSO configuration

Navigate to:

Fleet Management -> Identity & Access -> SSO Overview
Verify:
- Selected VCF instance
- Identity provider configuration
- Component configuration state for vCenter, NSX, VCF Operations, VCF Automation

Summary and Takeaways

Use topology blueprints to align teams quickly:
- Single site for speed
- Two sites in one region for site resilience
- Multi-region for sovereignty, latency, and isolation
Treat instances as your discrete infrastructure footprints.
Treat domains as lifecycle and workload isolation units.
Treat fleets as your centralized governance and fleet-scoped lifecycle boundary.
Choose your SSO model based on blast radius tolerance, not just convenience.
Decide early if tenants need separate identity providers, and use VCF Automation provider/tenant identity models intentionally.

Conclusion

You get operational clarity in VCF 9.0 when you design topology and identity as first-class boundaries:

Topology sets your failure domains and scaling ceiling.
Identity sets your operator experience and incident blast radius.
Fleets centralize governance, while instances keep their own management stacks.

Sources

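VMware Cloud Foundation 9.0 and later documentation (includes VCF 9.0 Release Notes, Bill of Materials, Design Blueprints, and VCF Single Sign-On models): https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-9-0-and-later/9-0.html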
