Production Architectures for Agentic AI: Microsoft, Google, and AWS Compared

The hyperscalers have spoken. Microsoft, Google, and AWS have each published production reference architectures for agentic AI. The naming differs, the services differ, but the underlying patterns are remarkably similar. This matters because it signals where enterprise AI is heading and what infrastructure decisions you need to make now.

Microsoft calls theirs the "Golden Path." Google publishes detailed architecture guides through its Cloud Architecture Center. AWS offers Bedrock AgentCore. All three address the same fundamental challenge: moving AI agents from pilot to production at enterprise scale.

The Common Pattern

Strip away the branding and all three architectures solve for the same requirements:

Multi-agent orchestration. A single agent solving an isolated problem was the 2023 model. Production systems in 2025 require networks of specialised agents working together under unified coordination.

State and memory management. Agents need to maintain context across conversations and learn from interactions. All three platforms provide managed memory services for both short-term session state and long-term knowledge retention.

Tool integration. Agents must connect to enterprise systems, APIs, and data sources. Each platform offers gateways that transform existing services into agent-callable tools.

Security and identity. Enterprise-grade authentication, authorisation, and audit trails. Non-negotiable for production workloads.

Observability. Tracing, logging, and monitoring of agent decisions. You cannot govern what you cannot see.
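
As a rough illustration of how these five requirements fit together, the sketch below wires them into one small coordinator in plain Python. It is platform-neutral and deliberately simplified: every name in it (`Coordinator`, `Agent`, `lookup_order`, the allow-list check) is hypothetical and does not correspond to any vendor SDK, and the "agents" are ordinary functions standing in for model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical sketch: none of these names come from a vendor SDK.
ToolCall = Callable[[str, str], str]   # (tool_name, argument) -> result


@dataclass
class Agent:
    name: str
    allowed_tools: List[str]                 # security: per-agent allow-list
    handle: Callable[[str, ToolCall], str]   # stands in for a model call


@dataclass
class Coordinator:
    agents: Dict[str, Agent] = field(default_factory=dict)           # orchestration
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: Dict[str, List[str]] = field(default_factory=dict)       # session state
    trace: List[str] = field(default_factory=list)                   # audit trail

    def _call_tool(self, agent: Agent, tool: str, arg: str) -> str:
        # Authorise before the call and record the decision either way.
        if tool not in agent.allowed_tools:
            self.trace.append(f"DENIED {agent.name} -> {tool}")
            raise PermissionError(f"{agent.name} may not call {tool}")
        self.trace.append(f"TOOL {agent.name} -> {tool}({arg})")
        return self.tools[tool](arg)

    def run(self, session_id: str, agent_name: str, message: str) -> str:
        agent = self.agents[agent_name]
        self.trace.append(f"AGENT {agent.name} <- {message}")
        reply = agent.handle(message, lambda t, a: self._call_tool(agent, t, a))
        self.memory.setdefault(session_id, []).extend([message, reply])
        return reply


def lookup_order(order_id: str) -> str:
    # Placeholder for a call into an enterprise system.
    return f"order {order_id}: shipped"


def support_agent(message: str, call_tool: ToolCall) -> str:
    return call_tool("lookup_order", message.split()[-1])


def billing_agent(message: str, call_tool: ToolCall) -> str:
    # Tries a tool that is not on its allow-list, so the call is refused.
    return call_tool("lookup_order", message.split()[-1])


coord = Coordinator()
coord.tools["lookup_order"] = lookup_order
coord.agents["support"] = Agent("support", ["lookup_order"], support_agent)
coord.agents["billing"] = Agent("billing", [], billing_agent)

print(coord.run("session-1", "support", "status of 42"))  # order 42: shipped
```

The managed services in each cloud replace these in-memory dictionaries with durable, audited equivalents, but the division of responsibilities is broadly the same.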

Platform Comparison

The table below maps equivalent components across all three platforms:

Capability | Microsoft Azure | Google Cloud | AWS
--- | --- | --- | ---
Unified Platform | Azure AI Foundry | Vertex AI Agent Builder | Amazon Bedrock
Agent Runtime | Foundry Agent Service | Agent Engine | AgentCore Runtime
Orchestration Framework | Semantic Kernel / Agent Framework | Agent Development Kit (ADK) | Strands Agents / Bedrock Agents
Memory Management | Cosmos DB (threads) | Agent Engine Sessions / Firestore | AgentCore Memory
Tool Gateway | Logic Apps, Functions, MCP | Cloud Run, MCP servers | AgentCore Gateway
Knowledge/RAG | Azure AI Search | Vertex AI Search | Bedrock Knowledge Bases
Identity | Microsoft Entra ID | IAM / Agent Identity | AgentCore Identity
Observability | Azure Monitor, Application Insights | Cloud Logging, Cloud Trace | AgentCore Observability (CloudWatch)
Code Execution | Code Interpreter | Code Interpreter | AgentCore Code Interpreter
Container Deployment | Container Apps | Cloud Run / GKE | Lambda / ECS

The architectural patterns are nearly identical. The choice between platforms typically comes down to existing cloud investments, not technical capability gaps.

What Each Platform Does Well

Microsoft Azure offers the tightest integration with enterprise productivity tools. If your organisation runs on Microsoft 365, SharePoint, and Dynamics, the Golden Path provides native connectors. The Semantic Kernel framework has strong enterprise adoption, with organisations like KPMG and Fujitsu using it for multi-agent orchestration in production.

Google Cloud leads on the open-source developer experience. The Agent Development Kit (ADK) has been downloaded over 7 million times and powers agents across Google's own products. The Agent Garden provides pre-built samples and tools that accelerate development. Google also pioneered the Agent2Agent (A2A) protocol for cross-platform agent communication.

AWS offers the broadest model selection and framework flexibility. AgentCore is explicitly framework-agnostic, working with CrewAI, LangGraph, LlamaIndex, or custom implementations. If you need to mix models from different providers or want maximum portability, AWS provides that optionality.

The MCP Convergence

All three platforms now support Model Context Protocol (MCP), the open standard for connecting agents to tools and data sources. This is significant. It means agents built on one platform can potentially use tools exposed by another. The walls between ecosystems are becoming more permeable.

For enterprises, this reduces lock-in risk. Invest in MCP-compatible tool development now, and those tools remain usable regardless of which platform you standardise on later.
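
To make "MCP-compatible tool" a little more concrete, here is a minimal sketch of the message shapes involved, written with only the Python standard library. MCP is built on JSON-RPC 2.0: a server describes each tool with a name, a description, and a JSON Schema for its inputs, and a client invokes it through a `tools/call` request. The `get_invoice_status` tool and the `handle_request` dispatcher are hypothetical, and this is not a complete server (there is no transport, initialisation handshake, or input validation). Treat it as an illustration and check the current MCP specification before relying on any field.

```python
import json
from typing import Any, Dict


def get_invoice_status(invoice_id: str) -> str:
    # Hypothetical business function; in practice this would wrap an existing API.
    return f"Invoice {invoice_id} is paid"


# Tool registry: the descriptor fields mirror what an MCP server advertises.
TOOLS: Dict[str, Dict[str, Any]] = {
    "get_invoice_status": {
        "description": "Look up the payment status of an invoice.",
        "inputSchema": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
        "handler": get_invoice_status,
    }
}


def handle_request(raw: str) -> str:
    """Answer a single JSON-RPC request for tools/list or tools/call."""
    req = json.loads(raw)
    method = req.get("method")
    if method == "tools/list":
        # Advertise descriptors only; the handler itself is never serialised.
        result = {"tools": [
            {"name": name, "description": t["description"],
             "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = req["params"]
        text = TOOLS[params["name"]]["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"),
                           "error": {"code": -32601, "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "result": result})


call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "get_invoice_status",
                   "arguments": {"invoice_id": "INV-7"}}}
print(handle_request(json.dumps(call)))
```

Because the business logic sits behind a plain descriptor and handler, the same function can in principle be exposed through whichever platform's gateway you standardise on later.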

Where the Platforms Diverge

Managed vs. flexible. Microsoft and Google offer more opinionated, managed services. AWS AgentCore provides more infrastructure primitives that you assemble yourself. The trade-off is speed-to-production versus customisation depth.

Framework preference. If your team knows LangChain or LangGraph, Google's integration is smoother. If you prefer .NET and C#, Microsoft's Semantic Kernel is the natural choice. AWS supports a wide range of frameworks but optimises for none in particular.

Pricing models. All three charge for compute, memory, and model inference. The specifics vary significantly. Model your expected workload before committing.
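
One way to model a workload is a small cost calculator along the three dimensions named above. The sketch below is only a starting point: the field names are assumptions about how a price sheet might be structured, and every price in the example is a made-up placeholder, not a real rate from any provider. Substitute figures from each vendor's current pricing page.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Workload:
    sessions_per_month: int
    model_calls_per_session: int
    input_tokens_per_call: int
    output_tokens_per_call: int
    runtime_seconds_per_session: float
    memory_gb_stored: float


@dataclass
class PriceSheet:
    # Placeholder structure; real pricing dimensions differ by platform.
    usd_per_1k_input_tokens: float
    usd_per_1k_output_tokens: float
    usd_per_runtime_second: float
    usd_per_gb_month_memory: float


def monthly_cost(w: Workload, p: PriceSheet) -> Dict[str, float]:
    """Estimate monthly spend, split into inference, compute, and memory."""
    calls = w.sessions_per_month * w.model_calls_per_session
    per_call = (w.input_tokens_per_call / 1000 * p.usd_per_1k_input_tokens
                + w.output_tokens_per_call / 1000 * p.usd_per_1k_output_tokens)
    inference = calls * per_call
    compute = (w.sessions_per_month * w.runtime_seconds_per_session
               * p.usd_per_runtime_second)
    memory = w.memory_gb_stored * p.usd_per_gb_month_memory
    return {"inference": inference, "compute": compute, "memory": memory,
            "total": inference + compute + memory}


# Illustrative workload and invented prices.
workload = Workload(sessions_per_month=10_000, model_calls_per_session=5,
                    input_tokens_per_call=2_000, output_tokens_per_call=500,
                    runtime_seconds_per_session=30, memory_gb_stored=50)
prices = PriceSheet(usd_per_1k_input_tokens=0.003, usd_per_1k_output_tokens=0.015,
                    usd_per_runtime_second=0.0001, usd_per_gb_month_memory=0.25)

for item, usd in monthly_cost(workload, prices).items():
    print(f"{item}: ${usd:,.2f}")
```

With these invented numbers, inference accounts for about $675 of a roughly $717.50 monthly total. That split is an artefact of the placeholder inputs, so rerun the calculation with each platform's real prices and your own traffic estimates.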

For the C-Suite: Platform Selection Criteria

The platform choice is less about technical features and more about organisational fit:

  1. Where do you already have cloud investment? Migrating to a new cloud for agentic AI alone rarely makes sense. Build on what you have.

  2. What is your developer ecosystem? The framework your team knows will be deployed faster than the framework with better features.

  3. What are your compliance requirements? All three meet major compliance standards, but the specific certifications differ. Check before assuming.

  4. How important is model flexibility? If you need to swap models frequently or use specialised fine-tuned models, evaluate each platform's model catalogue.

Getting Started

Audit your current state. Map every active AI initiative against production readiness. Be honest about what is progressing versus what is stalled.

Identify your first multi-agent candidate. Look for workflows requiring multiple handoffs, approvals, or system integrations. These are natural candidates for agent orchestration.

Run a proof-of-architecture. Before committing to a platform, build the same simple agent on two platforms. Compare developer experience, deployment complexity, and observability quality.

Define success criteria before building. Latency thresholds, accuracy requirements, compliance obligations, cost limits. Write them down. Measure against them.
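
Writing the criteria down can be as simple as putting them in a structure that a test harness can check. The sketch below shows one possible shape, assuming four criteria of the kind listed above; the class names, thresholds, and control names are all hypothetical examples, not recommended values.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class SuccessCriteria:
    max_p95_latency_s: float
    min_accuracy: float
    max_cost_per_task_usd: float
    required_controls: FrozenSet[str]   # compliance obligations


@dataclass(frozen=True)
class Measurement:
    p95_latency_s: float
    accuracy: float
    cost_per_task_usd: float
    controls_in_place: FrozenSet[str]


def evaluate(c: SuccessCriteria, m: Measurement) -> List[str]:
    """Return a list of failed criteria; an empty list means all were met."""
    failures: List[str] = []
    if m.p95_latency_s > c.max_p95_latency_s:
        failures.append(f"latency: p95 {m.p95_latency_s}s exceeds {c.max_p95_latency_s}s")
    if m.accuracy < c.min_accuracy:
        failures.append(f"accuracy: {m.accuracy} is below {c.min_accuracy}")
    if m.cost_per_task_usd > c.max_cost_per_task_usd:
        failures.append(f"cost: ${m.cost_per_task_usd} exceeds ${c.max_cost_per_task_usd}")
    missing = c.required_controls - m.controls_in_place
    if missing:
        failures.append(f"compliance: missing {sorted(missing)}")
    return failures


# Hypothetical targets agreed before the build, and results from a pilot run.
criteria = SuccessCriteria(max_p95_latency_s=3.0, min_accuracy=0.95,
                           max_cost_per_task_usd=0.10,
                           required_controls=frozenset({"audit_log", "pii_redaction"}))
pilot = Measurement(p95_latency_s=2.4, accuracy=0.91, cost_per_task_usd=0.08,
                    controls_in_place=frozenset({"audit_log"}))

for failure in evaluate(criteria, pilot):
    print("FAILED", failure)
```

In this made-up pilot, latency and cost pass while accuracy and compliance fail, which is exactly the kind of result that is easy to argue away if the thresholds were never written down.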

The Bottom Line

The three major cloud providers have converged on a common architectural pattern for production agentic AI: multi-agent orchestration, managed memory, tool gateways, security, and observability. The implementations differ in detail but not in substance.

This convergence tells us something important: the production patterns for agentic AI are stabilising. The experimental phase is ending. The question is no longer whether to build multi-agent systems, but on which foundation.

For most organisations, the answer is straightforward: build on the cloud where you already operate. The architectural patterns are similar enough that platform switching later, while not trivial, remains possible. What matters more is getting production experience now, learning what works for your specific use cases, and building organisational capability while competitors are still running pilots.


Which platform is your organisation evaluating for agentic AI? What factors are driving your decision? Share in the comments.


Paul Bratcher is a Partner at Prosper AI Consulting, specialising in AI transformation for mid-market and enterprise organisations. He developed the ODTA (Outcome-Driven Technology Adoption) framework to help leadership teams navigate technology decisions with clarity.