Platform Overview

The WebGrip Organisation Public Platform is a comprehensive Kubernetes-based infrastructure platform that provides the foundation for application development, deployment, and operations across the WebGrip organization.

Platform Purpose

This platform serves as the organizational backbone for:

  • Development Teams: Providing self-service application deployment and management
  • Infrastructure Teams: Centralizing platform operations and maintenance
  • Security Teams: Enforcing security policies and compliance requirements
  • Operations Teams: Supplying monitoring, alerting, and incident response capabilities

Platform Architecture

High-Level Architecture

Platform Layers

| Layer | Components | Purpose |
|---|---|---|
| Ingress | Traefik, cert-manager | External traffic routing, TLS termination |
| Application | User applications, Platform services | Business logic and platform capabilities |
| Platform | Monitoring, Logging, Storage | Cross-cutting platform services |
| Infrastructure | Kubernetes nodes, Networking, Security | Foundation compute and network |

Core Capabilities

Infrastructure as Code

Repository Location: ops/helm/

All infrastructure is defined as code using Helm charts, providing:

  • Reproducible Deployments: Consistent environments across development, staging, and production
  • Version Control: All infrastructure changes tracked in Git
  • Rollback Capability: Easy rollback to previous working configurations
  • Documentation: Self-documenting infrastructure through code

Key Infrastructure Components:

  • Cluster Monitoring: ops/helm/007-cluster-monitoring/
  • Certificate Management: ops/helm/010-cert-manager/
  • Ingress Controllers: ops/helm/030-ingress-controllers/
  • CI/CD Infrastructure: ops/helm/040-gha-runners-controller/
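The chart directories listed above carry numeric prefixes (007, 010, 030, 040). The sketch below assumes those prefixes indicate install order, which this overview does not state explicitly, and builds one `helm upgrade --install` command per chart. The release and namespace names are hypothetical: they simply reuse the directory name with the prefix stripped.

```python
import re
from pathlib import PurePosixPath

# Chart directories named in this overview, deliberately listed out of order.
CHART_DIRS = [
    "ops/helm/040-gha-runners-controller/",
    "ops/helm/007-cluster-monitoring/",
    "ops/helm/030-ingress-controllers/",
    "ops/helm/010-cert-manager/",
]

PREFIX = re.compile(r"^(\d+)-(.+)$")


def split_prefix(chart_dir: str) -> tuple[int, str]:
    """Split a path ending in '010-cert-manager' into (10, 'cert-manager')."""
    name = PurePosixPath(chart_dir).name
    match = PREFIX.match(name)
    if match is None:
        raise ValueError(f"no numeric prefix in {name!r}")
    return int(match.group(1)), match.group(2)


def install_commands(chart_dirs: list[str]) -> list[list[str]]:
    """Build one `helm upgrade --install` argv per chart, lowest prefix first."""
    commands = []
    for chart_dir in sorted(chart_dirs, key=lambda d: split_prefix(d)[0]):
        _, release = split_prefix(chart_dir)
        # Assumption: release name and namespace both reuse the directory name.
        commands.append([
            "helm", "upgrade", "--install", release, chart_dir,
            "--namespace", release, "--create-namespace",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so the order can be reviewed.
    for argv in install_commands(CHART_DIRS):
        print(" ".join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; the actual rollout mechanism used by the platform may differ.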

Service Discovery & Catalog

Repository Location: catalog/

A Backstage-powered service catalog.

CI/CD Automation

Repository Location: .github/workflows/

GitHub Actions-based automation.

Secret Management

Repository Location: ops/secrets/

SOPS and Age-based secret management providing:

  • Encrypted at Rest: All secrets encrypted in repository
  • Fine-grained Access: Role-based access to secret categories
  • Audit Trail: All secret changes tracked in Git history
  • Rotation Support: Structured approach to secret rotation
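To illustrate the "Encrypted at Rest" point, here is a minimal sketch of a check that flags YAML files which do not look SOPS-encrypted. It relies on two properties of SOPS output: encrypted values are wrapped as `ENC[AES256_GCM,...]`, and metadata is stored under a top-level `sops:` key. The check is a text heuristic only and does not replace the `sops` tool; the file layout inside ops/secrets/ and the function names are assumptions.

```python
import re
from pathlib import Path

# SOPS wraps each encrypted value as ENC[AES256_GCM,data:...,iv:...,tag:...,type:...].
ENC_VALUE = re.compile(r"ENC\[AES256_GCM,data:[^\]]*\]")
# SOPS stores its metadata under a top-level `sops:` key in YAML files.
SOPS_KEY = re.compile(r"^sops:\s*$", re.MULTILINE)


def looks_sops_encrypted(text: str) -> bool:
    """Heuristic: True if the text has SOPS metadata and at least one ENC[...] value."""
    return bool(SOPS_KEY.search(text)) and bool(ENC_VALUE.search(text))


def unencrypted_files(root: Path) -> list[Path]:
    """Return YAML files under root that do not look SOPS-encrypted."""
    candidates = sorted(list(root.rglob("*.yaml")) + list(root.rglob("*.yml")))
    return [
        path for path in candidates
        if not looks_sops_encrypted(path.read_text(encoding="utf-8"))
    ]


if __name__ == "__main__":
    # ops/secrets/ is the location named above; its internal layout is assumed.
    for path in unencrypted_files(Path("ops/secrets")):
        print(f"not encrypted: {path}")
```

A check like this could run in CI before merge, so that a plaintext secret is caught before it reaches Git history.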

Observability

Repository Location: grafana-dashboards/

Monitoring and observability dashboards for the platform.

Platform Benefits

For Developers

  • 🚀 Fast Time-to-Market: Standardized templates and deployment pipelines
  • 📊 Built-in Observability: Monitoring and alerting included by default
  • 🔐 Security by Default: Security policies and secret management built-in
  • 📚 Self-Service Documentation: Complete platform documentation and runbooks

For Operations

  • ⚙️ Standardized Operations: Consistent deployment and management procedures
  • 🔍 Full Visibility: Comprehensive monitoring across all platform components
  • 🛡️ Security Compliance: Built-in security scanning and policy enforcement
  • 📈 Scalability: Auto-scaling and resource management capabilities

For Organization

  • 💰 Cost Efficiency: Shared infrastructure and standardized tooling
  • ⚡ Developer Productivity: Reduced operational overhead for development teams
  • 🎯 Consistency: Standardized approaches across all projects and teams
  • 📋 Governance: Clear ownership, documentation, and decision tracking

Getting Started

Ready to start using the platform? Choose your path:

  • 👨‍💻 I'm a Developer

    Start with the Onboarding Guide to set up your local environment and deploy your first application.

  • 📋 I'm a Product Manager

    Explore the Service Catalog to understand the organizational structure and service ownership.

Platform Metrics

Key platform health indicators:

| Metric | Current Status | Target |
|---|---|---|
| Platform Uptime | 99.9% | >99.5% |
| Application Deployment Time | <5 minutes | <10 minutes |
| Mean Time to Recovery (MTTR) | <15 minutes | <30 minutes |
| Developer Onboarding Time | <2 hours | <4 hours |
| Security Scan Coverage | 100% | 100% |

📊 Live Metrics: View real-time platform metrics in Grafana dashboards
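As a rough illustration of how the health indicators above could be checked against their targets, the sketch below encodes each row as a bound and compares it. The `Metric` class and `allowed_downtime_minutes` helper are hypothetical, and the 30-day window used to turn an uptime percentage into a downtime budget is an assumption, not a documented platform convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    current: float
    target: float
    higher_is_better: bool
    unit: str

    def meets_target(self) -> bool:
        """Compare the current bound with the target in the right direction."""
        if self.higher_is_better:
            return self.current >= self.target
        return self.current <= self.target


def allowed_downtime_minutes(uptime_pct: float, days: int = 30) -> float:
    """Downtime budget implied by an uptime percentage over a window of days."""
    return (100.0 - uptime_pct) / 100.0 * days * 24 * 60


# Values are the bounds from the indicators above ("<5 minutes" is stored as 5).
METRICS = [
    Metric("Platform Uptime", 99.9, 99.5, True, "%"),
    Metric("Application Deployment Time", 5, 10, False, "minutes"),
    Metric("Mean Time to Recovery (MTTR)", 15, 30, False, "minutes"),
    Metric("Developer Onboarding Time", 2, 4, False, "hours"),
    Metric("Security Scan Coverage", 100, 100, True, "%"),
]

if __name__ == "__main__":
    for m in METRICS:
        status = "ok" if m.meets_target() else "MISSED"
        print(f"{m.name}: {m.current} {m.unit} (target {m.target} {m.unit}) -> {status}")
    budget = allowed_downtime_minutes(99.5)
    print(f"Downtime budget at the 99.5% target over 30 days: {budget:.1f} minutes")
```

Under the 30-day assumption, the 99.5% uptime target corresponds to roughly 216 minutes of permitted downtime, while the reported 99.9% corresponds to about 43 minutes.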

Next Steps

  1. 📋 Prerequisites - Ensure you have required tools and access
  2. 🏗️ Cluster Architecture - Understand the underlying infrastructure
  3. 🔧 Platform Components - Learn about core platform services
  4. 📖 Operations Runbooks - Master platform operations procedures

💡 Platform Evolution: Architectural decisions for this platform are recorded as Architecture Decision Records (ADRs). Any proposal that involves a significant architectural change should include an ADR.