
Platform engineering, data you can trust

Schemas, APIs, deploys, observability: shipped the dependable way.

PostgreSQL first · OpenAPI & GraphQL · IaC (Terraform) · Zero-downtime deploys · p95 performance · Least privilege

Engineering principles

Boring is beautiful. Reliable beats flashy.
1. Design for clarity

Simple beats clever

  • Small surface-area APIs with explicit contracts
  • Readable schemas, real names, docstrings
  • Single owners, single sources of truth
2. Make it observable

Measure before tuning

  • RED + USE metrics, golden signals
  • Structured logs, trace IDs everywhere
  • SLOs with burn alerts, not noise
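As an illustration of "structured logs, trace IDs everywhere", here is a minimal sketch using only the Python standard library. The names `trace_id_var`, `JsonFormatter`, `new_trace_id`, and `build_logger` are hypothetical, and the JSON field names are an assumption rather than a fixed convention.

```python
import contextvars
import json
import logging
import uuid

# Hypothetical context variable holding the current request's trace ID.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": trace_id_var.get(),  # attached to every line
        }
        return json.dumps(payload)


def new_trace_id() -> str:
    """Generate a trace ID and bind it to the current context."""
    trace_id = uuid.uuid4().hex
    trace_id_var.set(trace_id)
    return trace_id


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    new_trace_id()  # typically done once per incoming request
    log.info("order created")
```

In a web app the trace ID would usually be set in middleware at the start of each request, so every log line emitted while handling that request carries the same value.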
3. Ship safely

Automate the scary parts

  • Blue/green or canary deploys
  • Migrations with guards & fallbacks
  • Backups tested with restores
4. Secure by default

Least privilege, always

  • Short-lived creds & minimal IAM
  • CSP, TLS, dependency hygiene
  • Threat modeling as a habit

What we ship

From design to production, and after
Web Apps

Modern & accessible

SSR/SPA setups that load fast and are a joy to maintain.

Next/React · Vite · Testing
Data & DB

Clean schemas

PostgreSQL first, indexes shaped by queries, safe migrations.

PostgreSQL · Partitioning · Backups
APIs

Small & well-documented

OpenAPI/GraphQL, strong typing, rate limits, testable stubs.

OpenAPI · GraphQL · gRPC
Cloud & Ops

Confidence at release

Containers, IaC, rollbacks, and hooks to APM/uptime.

Docker · Terraform · CDN/Edge
Perf

p95 reality

Real-user monitoring and targeted cache strategies.

RUM · HTTP/3 · CDN
Security

Practical safety

OWASP-first with CSP, SAST/DAST, and alert drills.

OWASP · CSP · SAST/DAST

Django-first delivery

Fast CRUD, real auth, clean migrations, solid ops
Django + DRF

Clean APIs quickly

  • Typed serializers & validators
  • ViewSets + routers for consistent URLs
  • OpenAPI schema export + API docs
Admin + Permissions

Useful back office

  • Curated Django Admin for staff workflows
  • Role-based access (groups/permissions)
  • Row-level checks where needed
Async & Jobs

Celery + Redis

  • Idempotent tasks + retries
  • Rate limits + backoff for 3rd parties
  • Beat schedules with audit logs
Performance

Make queries cheap

  • Index-first modeling, pg_stat_statements
  • select_related/prefetch_related hygiene
  • Per-view caching + CDN edge
Security

Safe by default

  • Settings split by env, DJANGO_SECURE_*
  • CSP, HTTPS only, HSTS, cookie flags
  • Dependency audit, secret rotation
Ops

Ship & observe

  • 12-factor settings, Docker + Terraform
  • Zero-downtime deploys + migrations guard
  • APM + SLOs wired to alerts

Django REST Framework pattern

Tiny surface area, strong contracts

Serializer + ViewSet + Router

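# serializers.py
from rest_framework import serializers

class ProductIn(serializers.Serializer):
    # Write contract: only the fields a client may send.
    sku = serializers.CharField(max_length=64)
    price = serializers.DecimalField(max_digits=10, decimal_places=2)

class ProductOut(ProductIn):
    # Read contract: adds server-controlled fields.
    id = serializers.IntegerField(read_only=True)
    in_stock = serializers.BooleanField()

# views.py
from rest_framework import viewsets
from rest_framework.permissions import IsAuthenticated
from .models import Product  # assumes models.py lives in the same app
from .serializers import ProductOut

class ProductViewSet(viewsets.ModelViewSet):
    queryset = Product.objects.all().select_related("category")
    # A plain Serializer needs create()/update() implemented before writes work.
    serializer_class = ProductOut
    permission_classes = [IsAuthenticated]
    filterset_fields = ["sku", "category_id"]  # needs django-filter's DjangoFilterBackend
    search_fields = ["sku"]  # needs DRF's SearchFilter in filter_backends

# urls.py
from django.urls import include, path
from rest_framework.routers import DefaultRouter
from .views import ProductViewSet

router = DefaultRouter()
router.register(r"products", ProductViewSet)
urlpatterns = [path("api/", include(router.urls))]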

Celery task (idempotent)

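from celery import shared_task
from django.db import transaction
from requests import HTTPError  # assumption: fetch_external raises requests' HTTPError

@shared_task(bind=True, autoretry_for=(HTTPError,), retry_backoff=True)
def sync_product(self, product_id):
    # Call the third party first so no row lock is held during network I/O.
    sku = Product.objects.values_list("sku", flat=True).get(pk=product_id)
    ext = fetch_external(sku)  # pure function
    with transaction.atomic():
        # select_for_update() must run inside a transaction.
        p = Product.objects.select_for_update().get(pk=product_id)
        Product.objects.filter(pk=p.pk).update(price=ext.price)
        AuditLog.objects.create(kind="sync", ref=p.pk)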
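
To make the retry spacing concrete, here is a generic sketch of exponential backoff with an upper cap and optional jitter. It is an illustration of the idea, not Celery's exact implementation; `backoff_delay` and its parameters are hypothetical names.

```python
import random


def backoff_delay(retries, factor=1.0, maximum=600.0, jitter=True, rng=None):
    """Return the wait in seconds before retry number `retries` (0-based).

    The ceiling doubles on each retry and is capped at `maximum`.
    With jitter enabled, the delay is drawn uniformly from [0, ceiling]
    so that many failing tasks do not retry in lockstep.
    """
    ceiling = min(maximum, factor * (2 ** retries))
    if not jitter:
        return ceiling
    rng = rng or random
    return rng.uniform(0, ceiling)


if __name__ == "__main__":
    # Deterministic ceilings for the first few retries.
    for attempt in range(6):
        print(attempt, backoff_delay(attempt, jitter=False))
```

Without jitter the first retries wait 1, 2, 4, and 8 seconds; the cap keeps a long outage from pushing delays out indefinitely.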

Safe migration checklist

Zero downtime, audit-friendly

Pattern

  • Write + deploy code that tolerates BOTH schemas
  • Deploy 1: add nullable column/index concurrently
  • Backfill in batches (Celery, ETL)
  • Deploy 2: switch reads, then writes
  • Deploy 3: drop old column
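
The "backfill in batches" step above can be sketched as follows. This is a minimal illustration that uses the standard-library `sqlite3` module as a stand-in for PostgreSQL; the `users` table and the `legacy_name` and `display_name` columns are hypothetical.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Copy legacy_name into display_name in primary-key order.

    Each batch runs in its own short transaction, so locks are held
    briefly and the job can be stopped and resumed safely.
    Returns the number of rows updated.
    """
    total = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM users "
            "WHERE id > ? AND display_name IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return total
        first_id, last_id = rows[0][0], rows[-1][0]
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE users SET display_name = legacy_name "
                "WHERE id BETWEEN ? AND ? AND display_name IS NULL",
                (first_id, last_id),
            )
        total += cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, "
        "legacy_name TEXT, display_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (legacy_name) VALUES (?)",
        [(f"user-{i}",) for i in range(25)],
    )
    conn.commit()
    print(backfill_in_batches(conn, batch_size=10))
```

Because the update only touches rows where the new column is still NULL, running the job twice is harmless, which matters when a worker is retried halfway through.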

Postgres helpers

-- add index concurrently
CREATE INDEX CONCURRENTLY idx_orders_user_created
ON orders(user_id, created_at);

-- partition by month (example)
CREATE TABLE orders_2025_02 PARTITION OF orders
FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
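
Monthly partitions are usually created ahead of time by a scheduled job. A small sketch of generating the statement above for any month, assuming the same `<parent>_<YYYY>_<MM>` naming; `month_bounds` and `partition_ddl` are hypothetical helper names.

```python
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month)."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def partition_ddl(parent, year, month):
    """Build the CREATE TABLE ... PARTITION OF statement for one month.

    `parent` is interpolated directly, so it must be a trusted identifier,
    never user input.
    """
    start, end = month_bounds(year, month)
    name = f"{parent}_{year:04d}_{month:02d}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent}\n"
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


if __name__ == "__main__":
    print(partition_ddl("orders", 2025, 2))
```

The upper bound is exclusive in PostgreSQL range partitioning, which is why each partition ends on the first day of the following month.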

Architecture patterns

Choose boring, scale when it matters

API Gateway + BFF

client ──► BFF ──► services
             │        ├─ accounts
             └─ cache └─ catalog

Event-driven jobs

app ──► queue ──► workers
           │         ├─ email
           └─ retry  └─ reports
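
The queue, worker, and retry flow in the diagram can be sketched in-process as below. This is a toy stand-in for a real broker and worker pool; `run_jobs`, the job dictionaries, and the handler names are hypothetical.

```python
from collections import deque


def run_jobs(jobs, handlers, max_attempts=3):
    """Drain a queue of jobs, dispatching each one by its "kind".

    A job whose handler raises is put back on the queue until it has
    been tried `max_attempts` times, then moved to a dead-letter list.
    Returns (done, dead).
    """
    queue = deque((job, 1) for job in jobs)
    done, dead = [], []
    while queue:
        job, attempt = queue.popleft()
        try:
            handlers[job["kind"]](job)
            done.append(job)
        except Exception:
            if attempt < max_attempts:
                queue.append((job, attempt + 1))  # retry later
            else:
                dead.append(job)  # give up; keep for inspection
    return done, dead


if __name__ == "__main__":
    def send_email(job):
        print("email sent for job", job["id"])

    def build_report(job):
        raise RuntimeError("report service unavailable")

    jobs = [{"kind": "email", "id": 1}, {"kind": "reports", "id": 2}]
    done, dead = run_jobs(jobs, {"email": send_email, "reports": build_report})
    print(len(done), "done,", len(dead), "dead-lettered")
```

A production setup would add a delay between attempts and make each handler idempotent, since a retried job may have partly run already.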

Partitioned Postgres

CREATE TABLE orders_y2025_m02
  PARTITION OF orders FOR VALUES
  FROM ('2025-02-01') TO ('2025-03-01');

Preferred stack & tools

Interchangeable parts, no lock-in
React / Next · Node / Deno · PostgreSQL · Redis · OpenAPI / GraphQL · Docker · Terraform · Cloudflare · GitHub Actions · Sentry / DataDog

Example SLOs

Targets you can run a business on
99.95% · Uptime (month)
≤ 300 ms · p95 API
≤ 30 min · RTO
≤ 15 min · First reply
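
To see what an uptime target means in practice, it helps to convert it into allowed downtime. A minimal sketch, assuming a 30-day month; `allowed_downtime_minutes` is a hypothetical helper name.

```python
def allowed_downtime_minutes(slo_percent, days=30):
    """Return the downtime, in minutes, that an uptime SLO permits."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - slo_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.95, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes per 30 days")
```

At 99.95% this comes to about 21.6 minutes per 30-day month, so a 30-minute recovery from a single full outage would by itself exceed that month's uptime budget.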

Recent wins

Ask for case study details

Payments Platform Migration

Partitioned Postgres, replicas, failover drills

p95 ↓ 63% · Throughput ↑ 3.4× · Zero lost txns
PostgreSQL · Kafka · Kubernetes
CREATE TABLE payments (
  id BIGSERIAL,
  user_id BIGINT NOT NULL,
  amount NUMERIC(12,2) NOT NULL,
  status TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  -- a primary key on a partitioned table must include the partition key
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

E-commerce Catalog API

GraphQL gateway + Redis cache + CDN edge

Cache hit 94% · $ infra ↓ 28% · SLA 99.99%
type Product { id: ID!, sku: String!, price: Float!, inStock: Boolean! }
type Query   { products(sku: String): [Product]! }

SLO & Error Budget Simulator

Forecast burn, time-to-violation, and "what-if" scenarios

Adjust inputs and see how fast you burn your monthly error budget. Uses simple math with clear assumptions.

Outputs: budget (1 − SLO), allowed errors this window, burn rate (× of safe), and time to run out, plus an hourly table of remaining budget and errors used.

Assumes stationary error rate; use for direction, not absolutes.

How the simulator works

Clear math you can re-use in docs

Budget

Budget fraction = 1 − SLO_target. For a 99.9% SLO, the budget is 0.1% of total requests in the window.

Allowed errors

allowed = traffic_rpm × 60 × 24 × days × budget

Burn rate

burn = (error_rate / 100) / budget. 1.0× means you spend the budget at exactly the sustainable pace; 2.0× burns it twice as fast.

Time-to-run-out

ttr_hours = allowed / (traffic_rpm × 60 × error_rate_fraction), where error_rate_fraction = error_rate / 100
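
The four formulas above can be combined into one small function. This is a sketch under the same stationary-error-rate assumption; `BurnForecast` and `forecast` are hypothetical names, and inputs are percentages as in the formulas.

```python
from dataclasses import dataclass


@dataclass
class BurnForecast:
    budget: float               # fraction of requests allowed to fail
    allowed_errors: float       # failed requests permitted in the window
    burn_rate: float            # 1.0 means exactly on budget pace
    hours_to_exhaustion: float  # inf when no errors are occurring


def forecast(slo_percent, traffic_rpm, error_rate_percent, days=30):
    """Apply the budget, allowed-errors, burn-rate, and time-to-run-out formulas."""
    budget = 1 - slo_percent / 100
    if budget <= 0:
        raise ValueError("an SLO of 100% leaves no error budget")
    allowed = traffic_rpm * 60 * 24 * days * budget
    error_fraction = error_rate_percent / 100
    burn = error_fraction / budget
    errors_per_hour = traffic_rpm * 60 * error_fraction
    if errors_per_hour > 0:
        hours = allowed / errors_per_hour
    else:
        hours = float("inf")
    return BurnForecast(budget, allowed, burn, hours)


if __name__ == "__main__":
    result = forecast(slo_percent=99.9, traffic_rpm=1000, error_rate_percent=0.2)
    print(result)
```

With a 99.9% SLO, 1,000 requests per minute, and a 0.2% error rate over 30 days, this gives roughly 43,200 allowed errors, a 2.0× burn rate, and about 360 hours until the budget is gone, which is half of the 720-hour window.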

Let's build something dependable

Tell us about your project

Send us a quick brief and we'll reply within one business day.
