Architecture AI/MCP

Complete architecture diagram of the AI and MCP integration with OpenShift Lightspeed, LiteLLM, MCP Gateway, and vLLM CPU fallback

Complete Architecture

flowchart TB
    subgraph CLIENTS["External Clients"]
        direction LR
        OC["🖥️ OpenShift Console\nLightspeed Plugin"]
        N8N_EXT["⚙️ n8n :443\nWorkflows"]
        MCPI["🔍 MCP Inspector"]
        CURL["📡 curl / SDK"]
    end
    subgraph OLS_NS["Namespace: openshift-lightspeed"]
        OLS["🤖 lightspeed-app-server\nOLS v1.0.10\nfeatureGate: MCP"]
        PROXY["🔄 tool-choice-proxy :4001\n─────────────────\n• model remap llama→deepseek\n• tool_choice injection\n• SSE bridge\n• budget fail → fallback qwen-cpu"]
        LITELLM["📊 LiteLLM :4000\n─────────────────\nmodels: deepseek, llama-scout,\nqwen-cpu │ PostgreSQL"]
        N8N_INT["📋 n8n :5678\n17 workflows + Mailpit"]
        subgraph MCP_SERVERS["MCP Servers"]
            direction LR
            OCP_MCP["🔧 openshift-mcp-server :8080\nQuarkus · 19 tools\nocp_* prefix"]
            K8S_MCP["🔧 kubernetes-mcp-server :8085\nGo · 19 tools\nk8s_* prefix"]
        end
    end
    subgraph MCP_NS["Namespace: mcp-system"]
        GW["🌐 MCP Gateway\nKuadrant / Envoy\n─────────────────\n38 federated tools on /mcp\nocp_* → openshift-mcp-server\nk8s_* → kubernetes-mcp-server"]
    end
    subgraph MODELS["LLM Models"]
        direction LR
        MAAS["☁️ MaaS LiteLLM\ndeepseek-r1-distill-qwen-14b\nEXTERNAL"]
        VLLM["🖥️ vLLM CPU · KServe\nQwen2.5-7B-Instruct :8080\nRHOAI + Serverless\n~170s/response"]
    end
    K8S_API["⚡ Kubernetes API Server"]

    OC -->|"question"| OLS
    OLS -->|"OpenAI API"| PROXY
    OLS -->|"MCP Protocol\n(direct)"| OCP_MCP
    OLS -->|"MCP Protocol\n(direct)"| K8S_MCP
    PROXY --> LITELLM
    N8N_EXT --> N8N_INT
    N8N_INT --> LITELLM
    LITELLM -->|"budget ✅"| MAAS
    LITELLM -->|"budget ❌\nfallback"| VLLM
    MCPI --> GW
    CURL --> GW
    GW --> OCP_MCP
    GW --> K8S_MCP
    OCP_MCP -->|"cluster-admin"| K8S_API
    K8S_MCP -->|"cluster-admin"| K8S_API

Data Flow (OLS → Tool Call → Response)

sequenceDiagram
    actor User as 👤 User
    participant Console as OpenShift Console<br/>Lightspeed Plugin
    participant OLS as OLS v1.0.10<br/>lightspeed-app-server
    participant Proxy as tool-choice-proxy<br/>:4001
    participant LiteLLM as LiteLLM<br/>:4000
    participant MaaS as MaaS<br/>deepseek-r1
    participant vLLM as vLLM CPU<br/>qwen-2.5-7b
    participant MCP as MCP Server<br/>openshift / kubernetes

    User->>Console: Ask question
    Console->>OLS: Forward query
    OLS->>MCP: Discover tools (MCP protocol)
    MCP-->>OLS: Available tools list
    OLS->>Proxy: chat/completions + tools
    Note over Proxy: Remap model:<br/>llama-scout → deepseek-r1<br/>Inject tool_choice<br/>Set stream=false
    Proxy->>LiteLLM: Modified request
    LiteLLM->>MaaS: Forward to model
    alt Budget OK
        MaaS-->>LiteLLM: tool_call response
    else Budget Exceeded
        MaaS-->>LiteLLM: 400 budget error
        LiteLLM-->>Proxy: Error response
        Note over Proxy: Detect budget error<br/>Retry with fallback model
        Proxy->>LiteLLM: Retry → qwen-2.5-7b-cpu
        LiteLLM->>vLLM: Forward to CPU model
        vLLM-->>LiteLLM: tool_call response (~170s)
    end
    LiteLLM-->>Proxy: tool_call JSON
    Note over Proxy: Convert to SSE stream<br/>for OLS compatibility
    Proxy-->>OLS: SSE stream with tool_call
    OLS->>MCP: Execute tool (e.g. checkClusterHealth)
    MCP-->>OLS: Tool result
    OLS->>Proxy: chat/completions + tool result
    Proxy->>LiteLLM: Forward
    LiteLLM-->>Proxy: Final text response
    Proxy-->>OLS: SSE stream
    OLS-->>Console: Formatted answer
    Console-->>User: Display response
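
The budget branch of the sequence above can be sketched as a small decision helper. This is only an illustration, not the proxy's actual code: the function names `is_budget_error` and `next_model` are hypothetical, and the matching rule (HTTP 400 plus the word "budget" anywhere in the body) is an assumption based on the "400 budget error" step in the diagram.

```shell
# Model names taken from this page.
PRIMARY_MODEL="deepseek-r1-distill-qwen-14b"
FALLBACK_MODEL="qwen-2.5-7b-cpu"

# is_budget_error <http_status> <response_body>
# Returns 0 when the response looks like a budget error.
# Assumption: status 400 and "budget" (case-insensitive) in the body.
is_budget_error() {
  local status="$1" body="$2"
  [ "$status" = "400" ] || return 1
  case "$(printf '%s' "$body" | tr '[:upper:]' '[:lower:]')" in
    *budget*) return 0 ;;
    *)        return 1 ;;
  esac
}

# next_model <http_status> <response_body>
# Prints the model to use for the next attempt: the CPU fallback after a
# budget error, otherwise the primary model.
next_model() {
  if is_budget_error "$1" "$2"; then
    printf '%s\n' "$FALLBACK_MODEL"
  else
    printf '%s\n' "$PRIMARY_MODEL"
  fi
}
```

For example, `next_model 400 '{"error":{"message":"Budget has been exceeded"}}'` prints `qwen-2.5-7b-cpu`, while any other status or body keeps the primary model. Checking the status first avoids retrying on the slow CPU model (~170s/response) for unrelated failures.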

Component Reference

flowchart LR
    subgraph CORE["Core Components"]
        direction TB
        A["🔧 openshift-mcp-server\nQuarkus · 19 tools · ocp_*"]
        B["🔧 kubernetes-mcp-server\nGo · 19 tools · k8s_*"]
    end
    subgraph PROXY_LAYER["Proxy & Model Layer"]
        direction TB
        C["🔄 tool-choice-proxy\nStreaming fix · tool_choice\nModel remap · CPU fallback"]
        D["📊 LiteLLM\nMulti-model proxy\nPostgreSQL · API keys"]
    end
    subgraph MODELS_REF["Models"]
        direction TB
        E["☁️ MaaS deepseek-r1\nPrimary · External"]
        F["🖥️ vLLM qwen-2.5-7b\nFallback · Local CPU"]
    end
    subgraph EXTERNAL["External Access"]
        direction TB
        G["🌐 MCP Gateway\nKuadrant/Envoy\n38 federated tools"]
    end
    A --> PROXY_LAYER
    B --> PROXY_LAYER
    C --> D
    D --> E
    D --> F
    G --> CORE

| Component | Purpose | Required? |
|---|---|---|
| tool-choice-proxy | Fixes OLS bugs: streaming, tool_choice, model remap, CPU fallback | Yes (OLS v1.0.10 bug) |
| MCP Gateway | Unified endpoint for external clients (n8n, curl, SDK) | Yes, for external access |
| LiteLLM | Multi-model proxy with PostgreSQL, API keys, rate limiting | Yes, model management |
| vLLM CPU | Local fallback model when MaaS is unavailable | Yes, resilience |
| openshift-mcp-server | Quarkus tools for OpenShift (ocp_*) | Yes, core |
| kubernetes-mcp-server | Go tools for Kubernetes (k8s_*) | Yes, core |

Active model: deepseek-r1-distill-qwen-14b via MaaS (remapped by proxy from llama-scout-17b)
Fallback model: qwen-2.5-7b-cpu local via OpenShift AI (vLLM CPU, ~170s/response)
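
As a rough illustration of the remap mentioned above, the mapping can be written as a lookup that rewrites only the one model name and passes everything else through. The function name `remap_model` is hypothetical and the pass-through behavior for other names is an assumption; the real proxy may handle this differently.

```shell
# remap_model <requested_model>
# Prints the model name that should be sent on to LiteLLM.
# Only llama-scout-17b is remapped (per this page); any other name is
# passed through unchanged, which is an assumption of this sketch.
remap_model() {
  case "$1" in
    llama-scout-17b) printf '%s\n' "deepseek-r1-distill-qwen-14b" ;;
    *)               printf '%s\n' "$1" ;;
  esac
}
```

Here `remap_model llama-scout-17b` prints `deepseek-r1-distill-qwen-14b`, and `remap_model qwen-2.5-7b-cpu` prints `qwen-2.5-7b-cpu`.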

Access Channels

| Channel | Tool calling | URL |
|---|---|---|
| Console Lightspeed plugin | Yes | https://console-openshift-console.apps.cluster-6tx4w.6tx4w.sandbox5380.opentlc.com |
| n8n workflows (LiteLLM) | Yes | https://n8n.apps.cluster-6tx4w.6tx4w.sandbox5380.opentlc.com |
| MCP Gateway (38 tools) | Yes | https://mcp-gateway.apps.cluster-6tx4w.6tx4w.sandbox5380.opentlc.com/mcp |
| RHOAI Dashboard | Yes | https://rhods-dashboard-redhat-ods-applications.apps.cluster-6tx4w.6tx4w.sandbox5380.opentlc.com |

MCP Servers

openshift-mcp-server (Quarkus · 19 tools · prefix ocp_)

kubernetes-mcp-server (Go · 19 tools · prefix k8s_)

MCP Gateway (Kuadrant · 38 federated tools)
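
The prefix-based federation can be pictured as a simple dispatch on the tool name. The sketch below is not the gateway's implementation: `route_tool` is a hypothetical helper, and stripping the prefix before forwarding to the backend is an assumption (the tool tables on this page list the backend names without prefixes, while the gateway example calls `k8s_pods_list_in_namespace`).

```shell
# route_tool <federated_tool_name>
# Prints "<backend-server> <backend-tool-name>" for a prefixed tool name.
# Returns 1 (and writes to stderr) when the prefix is not recognized.
# Assumption: the prefix is removed before the call reaches the backend.
route_tool() {
  local name="$1"
  case "$name" in
    ocp_?*) printf '%s %s\n' "openshift-mcp-server" "${name#ocp_}" ;;
    k8s_?*) printf '%s %s\n' "kubernetes-mcp-server" "${name#k8s_}" ;;
    *)      printf 'unknown prefix: %s\n' "$name" >&2
            return 1 ;;
  esac
}
```

For instance, `route_tool k8s_pods_list_in_namespace` prints `kubernetes-mcp-server pods_list_in_namespace`. Because each server owns a distinct prefix, the two sets of 19 tools cannot collide when merged into the 38-tool list.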


Tool Distribution by Category

pie title 38 MCP Tools Distribution
    "k8s Read (13)" : 13
    "k8s Write (6)" : 6
    "ocp Observability (9)" : 9
    "ocp Provisioning (5)" : 5
    "ocp Benchmarks (5)" : 5

Complete Tool Reference (38)

kubernetes-mcp-server: Read Operations

| Tool | Description |
|---|---|
| pods_list | List all pods in the cluster (all namespaces) |
| pods_list_in_namespace | List pods in a specific namespace |
| pods_get | Get pod details by name |
| pods_log | Get pod logs (with tail and container) |
| pods_top | CPU/memory usage of pods (metrics-server) |
| nodes_top | CPU/memory usage of nodes |
| nodes_log | Logs from a node (kubelet, kube-proxy, journals) |
| nodes_stats_summary | Detailed node statistics via kubelet Summary API |
| events_list | K8s events (warnings, errors, state changes) |
| namespaces_list | List all namespaces |
| projects_list | List all OpenShift Projects |
| resources_list | List resources by type (deployments, services, PVCs, CRDs, etc.) |
| resources_get | Get a specific resource by apiVersion/kind/name |

kubernetes-mcp-server: Write Operations

| Tool | Description |
|---|---|
| resources_create_or_update | Create or update a resource with YAML |
| resources_delete | Delete a resource by apiVersion/kind/name |
| resources_scale | Scale replicas of a Deployment/StatefulSet/ReplicaSet |
| pods_delete | Delete a pod (force restart) |
| pods_exec | Execute a command inside a pod |
| pods_run | Create an ephemeral pod with an image |

openshift-mcp-server: Observability

| Tool | Description |
|---|---|
| checkClusterHealth | Overall cluster status (nodes, pods, control plane) |
| checkNodeConditions | Node conditions (disk/memory/PID pressure, NotReady) |
| detectResourceIssues | Detect pods/nodes with high CPU/memory or excessive restarts |
| getPerformanceMetrics | Performance metrics by namespace (kubectl top) |
| monitorDeployments | Rollout status and replicas of deployments |
| analyzePodDisruptions | Analyze disruptions: evictions, OOM kills, restarts |
| checkKubeletStatus | Kubelet status and error logs on nodes |
| checkCrioStatus | CRI-O runtime status and recent logs |
| analyzeJournalctlPodErrors | Analyze journalctl logs for pod and system errors |

openshift-mcp-server: Provisioning

| Tool | Description |
|---|---|
| createDeployment | Create a Deployment with image, replicas, resources, ports |
| createService | Create a Service (ClusterIP/NodePort/LoadBalancer) |
| createHpa | Create HorizontalPodAutoscaler with CPU/memory targets |
| createNetworkPolicy | Create NetworkPolicy (deny-all, allow-same-namespace, custom) |
| deployDatabase | Deploy ephemeral DB (postgresql, mysql, mongodb, redis) |

openshift-mcp-server: Benchmarks

| Tool | Description |
|---|---|
| runCpuStressTest | CPU/memory stress test with stress-ng |
| runStorageBenchmark | Storage benchmark with FIO |
| runNetworkTest | Network test with iperf3 (throughput, latency, packet loss) |
| runDatabaseBenchmark | DB benchmark with pgbench/sysbench |
| runKubeBurner | Cluster density test with kube-burner |

n8n Workflow Pipeline

flowchart LR
    A["🔘 Manual\nTrigger"] --> B["⚙️ Set\nParameters"]
    B --> C["🔧 MCP\nTool Call"]
    C --> D["🤖 AI Analysis\nvia LiteLLM"]
    D --> E["📧 Email Report\nvia Mailpit"]
    style A fill:#2d2d2d,stroke:#CC0000,color:#e0e0e0
    style B fill:#2d2d2d,stroke:#CC0000,color:#e0e0e0
    style C fill:#2d2d2d,stroke:#CC0000,color:#e0e0e0
    style D fill:#2d2d2d,stroke:#CC0000,color:#e0e0e0
    style E fill:#2d2d2d,stroke:#CC0000,color:#e0e0e0

| # | Workflow | MCP Tool | Server |
|---|---|---|---|
| 1 | Pod Status (Agent) | pods_list_in_namespace | kubernetes-mcp |
| 2 | Pod Status (Granite) | pods_list_in_namespace | kubernetes-mcp |
| 3 | Deployment Rollout Status | monitorDeployments | openshift-mcp |
| 4 | Resource Quota Monitor | resources_list | kubernetes-mcp |
| 5 | Helm Releases Audit | helm_list | kubernetes-mcp |
| 6 | Route TLS Expiry | resources_list (routes) | kubernetes-mcp |
| 7 | Event Anomaly Detector | events_list | kubernetes-mcp |
| 8 | Cluster Test Suite | Multi-tool (7 tools) | both |
| 9 | Cluster Revert | resources_delete | kubernetes-mcp |
| 10 | Health Dashboard | Multi-tool (4 tools) | both |
| 11 | Resource Provisioner | createDeployment + deployDatabase | openshift-mcp |
| 12 | Node Capacity Monitor | nodes_top | kubernetes-mcp |
| 13 | Pod Resource Usage (Top) | pods_top | kubernetes-mcp |
| 14 | Namespace Inventory | projects_list | kubernetes-mcp |
| 15 | PVC Storage Monitor | resources_list (PVCs) | kubernetes-mcp |
| 16 | Operator Health Check | resources_list (Subscriptions) | kubernetes-mcp |
| 17 | MCP Gateway Status | Multi-tool (3 tools) | kubernetes-mcp |

Testing with MCP Gateway (curl)

# Gateway endpoint (sandbox cluster; -k skips TLS certificate verification)
MCP_URL="https://mcp-gateway.apps.cluster-6tx4w.6tx4w.sandbox5380.opentlc.com/mcp"

# 1. Initialize a session and capture the Mcp-Session-Id response header
#    (-D - prints the headers to stdout, -o /dev/null discards the body)
SESSION=$(curl -sk -X POST "$MCP_URL" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' \
  -D - -o /dev/null 2>/dev/null | grep -i '^mcp-session-id:' | tr -d '\r' | awk -F': ' '{print $2}')

# 2. Send the initialized notification (a notification carries no "id")
curl -sk -X POST "$MCP_URL" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Mcp-Session-Id: $SESSION" \
  -d '{"jsonrpc":"2.0","method":"notifications/initialized"}'

# 3. List all 38 federated tools
curl -sk -X POST "$MCP_URL" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Mcp-Session-Id: $SESSION" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'

# 4. Call a tool (e.g. list pods in openshift-lightspeed)
curl -sk -X POST "$MCP_URL" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Mcp-Session-Id: $SESSION" \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"k8s_pods_list_in_namespace","arguments":{"namespace":"openshift-lightspeed"}}}'
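
The requests above repeat the same JSON-RPC envelope, and because they accept `text/event-stream` the gateway may answer with SSE framing rather than plain JSON. The helpers below are a minimal sketch for both points; `mcp_request` and `sse_data` are hypothetical names, and whether a given response is SSE-framed depends on the gateway, so treat the second helper as optional.

```shell
# mcp_request <id> <method> [params_json]
# Prints a JSON-RPC 2.0 request body. params_json, when given, must
# already be valid JSON; it is inserted as-is without escaping.
mcp_request() {
  local id="$1" method="$2" params="${3:-}"
  if [ -n "$params" ]; then
    printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}\n' \
      "$id" "$method" "$params"
  else
    printf '{"jsonrpc":"2.0","id":%s,"method":"%s"}\n' "$id" "$method"
  fi
}

# sse_data
# Reads a response on stdin and prints only the payload of each SSE
# "data:" line, with carriage returns removed. Lines without a "data:"
# prefix (event names, blank separators, plain JSON) are dropped.
sse_data() {
  tr -d '\r' | sed -n 's/^data: \{0,1\}//p'
}
```

The output of `mcp_request 2 tools/list` can be passed to curl with `-d`, together with the gateway URL and the `Mcp-Session-Id` header from step 1, and an SSE-framed response can be piped through `sse_data` to recover the JSON payload. If the gateway replies with plain JSON, skip `sse_data`, since it would print nothing.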