CI/CD Patterns Quick Reference

The canonical pipeline

Every modern CI pipeline should have these stages, in order:

1. Lint / format check: cheap, catches ~20% of problems.
2. Type check (if typed language): catches another 30%.
3. Unit tests: fast feedback, runs on every commit.
4. Integration tests: slower, may need a database/network.
5. Build: compile / bundle / package the artifact.
6. Container image build and push (if deploying containers).
7. Security scan: SAST, dependency audit, container scan.
8. Deploy to staging: automatic on merge to main.
9. Smoke tests on staging: prove the deploy actually works.
10. Deploy to prod: manual approval gate.

Stages 1–4 should complete in under 5 minutes. If they don’t, developers stop running them locally and start pushing broken code.

GitLab CI example

# .gitlab-ci.yml
stages:
  - lint
  - test
  - build
  - deploy

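variables:
  PYTHON_VERSION: "3.11"
  # Point uv at the directory listed under cache:paths; otherwise the cached path stays empty.
  UV_CACHE_DIR: .uv-cache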

default:
  image: python:${PYTHON_VERSION}-slim
  cache:
    key:
      files:
        - pyproject.toml
        - uv.lock
    paths:
      - .uv-cache/
  before_script:
    - pip install uv
    - uv sync --frozen

lint:
  stage: lint
  script:
    - uv run ruff check .
    - uv run ruff format --check .

typecheck:
  stage: lint
  script:
    - uv run mypy src/

test:
  stage: test
  services:
    - postgres:16-alpine
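  variables:
    POSTGRES_PASSWORD: test
    # Create the "test" database that DATABASE_URL points at (the image default is "postgres").
    POSTGRES_DB: test
    DATABASE_URL: postgresql://postgres:test@postgres:5432/test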
  script:
    - uv run pytest --cov=src --cov-report=term --cov-report=xml
  coverage: '/^TOTAL.+?(\d+\%)$/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  # The docker image has no Python or pip, so skip the default uv setup and cache.
  before_script: []
  cache: []
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" "$CI_REGISTRY_IMAGE:latest"
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
    - docker push "$CI_REGISTRY_IMAGE:latest"

deploy_staging:
  stage: deploy
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - ./scripts/deploy.sh staging $CI_COMMIT_SHA

deploy_prod:
  stage: deploy
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
  environment:
    name: production
    url: https://example.com
  script:
    - ./scripts/deploy.sh prod $CI_COMMIT_SHA
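
The example above stops short of the smoke-test stage from the canonical list. A minimal sketch of such a job might look like the following. It assumes the service exposes a /health endpoint on the staging URL; the job name smoke_staging, the alpine image, and the endpoint path are illustrative and not part of the original pipeline.

```yaml
smoke_staging:
  stage: deploy
  image: alpine:3.20
  # Start only after the staging deploy has finished (needs within the same stage
  # is supported in recent GitLab versions).
  needs: ["deploy_staging"]
  # The default before_script installs uv, which this job does not need.
  before_script: []
  cache: []
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - apk add --no-cache curl
    # Fail the job on an HTTP error status; retry a few times on transient errors.
    - curl --fail --silent --show-error --retry 5 --retry-delay 5 https://staging.example.com/health
```

One option is to add needs: ["smoke_staging"] to deploy_prod as well, so the manual production gate waits for the smoke test to pass.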

GitHub Actions equivalent

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
      - run: uv sync --frozen
      - run: uv run ruff check .
      - run: uv run ruff format --check .
      - run: uv run mypy src/  # type check, matching the GitLab typecheck job

  test:
    runs-on: ubuntu-latest
    needs: lint
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_PASSWORD: test
        ports: ["5432:5432"]
        options: >-
          --health-cmd pg_isready
          --health-interval 5s
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
      - run: uv sync --frozen
      - run: uv run pytest --cov=src
        env:
          DATABASE_URL: postgresql://postgres:test@localhost:5432/postgres

  build-and-push:
    runs-on: ubuntu-latest
    needs: test
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: |
            ghcr.io/${{ github.repository }}:${{ github.sha }}
            ghcr.io/${{ github.repository }}:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

Secrets management

Never commit secrets to Git. Options in order of preference:

1. Cloud-native identity: GitHub Actions OIDC → AWS IAM assume role, with no long-lived keys at all.
2. Secret manager integration: AWS Secrets Manager, HashiCorp Vault, or Infisical, injected at job start.
3. CI-native secrets: GitLab CI variables, GitHub Actions secrets. Easy, and still much better than Git.
4. Never: environment variables committed to the repo (for example, a checked-in .env file).
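
As a rough illustration of the first option, the workflow below sketches OIDC federation from GitHub Actions to AWS. The role ARN, region, and workflow file name are placeholders. It also assumes an IAM role whose trust policy already allows this repository's OIDC tokens.

```yaml
# Hypothetical workflow: .github/workflows/deploy.yml
name: Deploy

on:
  push:
    branches: [main]

permissions:
  id-token: write   # lets the job request an OIDC token from GitHub
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder ARN; the role's trust policy must allow this repository.
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: us-east-1
      # Later steps now run with short-lived credentials for the assumed role.
      - run: aws sts get-caller-identity
```

No access key is stored anywhere in this setup: the credentials are issued per run and expire on their own.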

Caching tips

- Cache what installing from the lockfile produces (downloaded packages, the installed environment), not the manifest or lockfile themselves.
- Use a content-addressed cache key (a hash of uv.lock, pnpm-lock.yaml, etc.) so the cache invalidates automatically when dependencies change.
- Cache build tool output (ruff, mypy, tsc incremental files) for a 2–5× speedup on incremental runs.
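
A content-addressed key could look like this in GitHub Actions. This is a sketch of the steps only. The paths assume uv's default Linux cache directory and the default mypy and ruff cache directories; adjust them if UV_CACHE_DIR or the uv setup action points somewhere else.

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: actions/cache@v4
    with:
      path: |
        ~/.cache/uv
        .mypy_cache
        .ruff_cache
      # The key changes whenever uv.lock changes, which forces a fresh cache entry to be saved.
      key: deps-${{ runner.os }}-${{ hashFiles('uv.lock') }}
      # Fall back to the most recent older cache to avoid a fully cold start.
      restore-keys: |
        deps-${{ runner.os }}-
```

The GitLab example earlier gets the same effect with cache:key:files, which derives the key from the listed files.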

Zero-downtime deploys

For containerized services, the deploy step should:

1. Push the new image with a unique tag (commit SHA, never latest).
2. Update the Kubernetes Deployment / ECS service / Cloud Run revision to point at the new tag.
3. Use a rolling strategy with a readiness probe so old pods drain only when new ones pass /ready.
4. Verify with a smoke test against the new version before marking success.
5. Keep the previous version’s image available for fast rollback.
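
For the Kubernetes case, the rolling strategy and readiness probe might be declared roughly as follows. The service name myapp, the registry host, the port, and the image tag are placeholders; only the /ready path comes from the steps above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp              # hypothetical service name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0    # never drop below the desired replica count
      maxSurge: 1          # bring up one extra pod at a time
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          # Placeholder registry and tag; the pipeline would set the real commit SHA.
          image: registry.example.com/myapp:3f9c2e1
          ports:
            - containerPort: 8000
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            periodSeconds: 5
            failureThreshold: 3
```

With a manifest like this, the deploy script could run kubectl set image deployment/myapp myapp=<image>:<sha> followed by kubectl rollout status deployment/myapp. Rolling back is then kubectl rollout undo deployment/myapp.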

Common mistakes

- Slow pipelines: past 15 minutes, devs stop waiting, start pushing broken code, and the signal value collapses.
- Flaky tests: one flaky test ruins the entire feedback loop. Quarantine or fix aggressively.
- Everything in one job: makes failures hard to diagnose. Split stages.
- No branch protection: merge buttons that don’t require CI to pass defeat the whole point.
- Manual steps hidden in runbooks: if it’s not in the pipeline, it will drift.

Practice

1. Build the canonical pipeline

Take a small Python FastAPI service. Write a .gitlab-ci.yml (or .github/workflows/ci.yml) that implements the full canonical pipeline described above: lint → typecheck → test → build → deploy (to a fake staging target).

Target: total pipeline under 5 minutes on cache hit.

2. Matrix test against multiple Python versions

Run the test suite against Python 3.11, 3.12, and 3.13 in parallel. Use parallel: matrix: (GitLab) or strategy.matrix (GitHub). Verify the pipeline summary shows all three as independent jobs.
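
One possible shape for the GitHub Actions variant is sketched below. It assumes the project's requires-python range covers all three versions and that uv is allowed to download interpreters that are not already on the runner.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # let every version finish even if one fails
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
    env:
      # uv reads UV_PYTHON to choose the interpreter for sync and run.
      UV_PYTHON: ${{ matrix.python-version }}
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
      - run: uv sync --frozen
      - run: uv run pytest
```

Each matrix entry appears as its own job in the run summary, which is what the exercise asks you to verify.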

3. Cache hit rate

Run your pipeline twice. Measure total time and cache hit rate on the second run. If cache hit rate isn’t >80%, your cache key is wrong — fix it.

4. Secret rotation

Add a secret (e.g., a fake API key) to the CI-native secret store. Use it in a job via an environment variable. Rotate it — confirm the next pipeline run uses the new value without any code change.

Bonus: migrate the same secret to a cloud secret manager and inject it via OIDC instead of a static CI variable.
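
A starting point for the first part of this exercise, sketched for GitHub Actions. FAKE_API_KEY is a hypothetical secret name you would create in the repository settings. Printing a short hash is acceptable only because the key is fake.

```yaml
jobs:
  call-api:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Use the secret without printing it
        env:
          # FAKE_API_KEY is a hypothetical secret defined in the repository settings.
          API_KEY: ${{ secrets.FAKE_API_KEY }}
        run: |
          # Confirm the value is present; never echo the secret itself.
          test -n "$API_KEY" && echo "API_KEY is set"
          # A short fingerprint changes after rotation (fine for a fake key only).
          printf '%s' "$API_KEY" | sha256sum | cut -c1-8
```

After rotating the secret in the settings page, a rerun should print a different fingerprint with no change to the workflow file.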

5. Manual approval gate

Add a deploy_prod job that requires manual approval via GitLab when: manual (or a GitHub Actions environment with required reviewers). Confirm the pipeline pauses and waits for a human to click the button before running the deploy step.
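
The GitLab variant already appears in the deploy_prod job of the GitLab example. For GitHub Actions, a sketch might look like this. It assumes an environment named production has been created in the repository settings with required reviewers, and that a ./scripts/deploy.sh script exists as in the GitLab example.

```yaml
jobs:
  deploy-prod:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    # The job waits here until a required reviewer approves the deployment.
    environment:
      name: production
      url: https://example.com
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh prod ${{ github.sha }}
```

The approval rule lives in the environment's settings rather than in the workflow file, so the YAML alone does not show who can approve.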

6. Flake detection

Deliberately introduce a flaky test (for example, assert random.random() > 0.3). Run the pipeline 10 times. Configure the pipeline to retry failing tests once and report the flake. Then fix or quarantine it.
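
One way to get a single retry per failing test is the pytest-rerunfailures plugin, which provides the --reruns flag. The GitLab job below is a sketch that assumes the plugin is listed among the project's dev dependencies and that uv is set up as in the earlier example.

```yaml
test:
  stage: test
  script:
    # --reruns is provided by pytest-rerunfailures (assumed to be installed).
    - uv run pytest --reruns 1 --junitxml=report.xml
  artifacts:
    when: always          # keep the report even when the job fails
    reports:
      junit: report.xml
```

A test that fails and then passes on the rerun is counted as a rerun in pytest's summary line, which gives you a record of the flake even though the job is green.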

Review Questions

1. What is the target wall-clock time for the lint-through-integration-test phase of a CI pipeline?
- A. Under 60 minutes
- B. Under 15 minutes; ideally under 5
- C. Under 2 hours
- D. There is no target

2. Why is latest a dangerous tag for container images in a deploy pipeline?
- A. It’s slower to pull
- B. It’s mutable: you cannot reliably roll back or identify what is running in production
- C. It’s banned by Docker Hub
- D. It uses more disk space

3. What is the most secure way to give a GitHub Actions workflow access to AWS?
- A. Long-lived access keys stored as GitHub Secrets
- B. OIDC federation with an AWS IAM role and an assume-role trust policy (no long-lived keys)
- C. Committing keys to a private repo
- D. Sharing the root account password

4. Which stage of a canonical CI pipeline should run first?
- A. Integration tests
- B. Lint and format check (cheap, catches ~20% of problems)
- C. Build and push Docker image
- D. Deploy to production

5. What makes a cache key effective in CI?
- A. Using a fixed string like "cache"
- B. Hashing the dependency lockfile (e.g., uv.lock, pnpm-lock.yaml) so the cache invalidates automatically on changes
- C. Using the current timestamp
- D. Not caching at all

6. A flaky test is in your pipeline. What should you do?
- A. Ignore it and re-run the pipeline until it passes
- B. Quarantine (skip) or fix it: one flake destroys the signal value of the whole pipeline
- C. Delete the test
- D. Mark the whole suite as optional

7. How should the deploy step tag a container image?
- A. With latest
- B. With an immutable unique tag like the commit SHA
- C. With a random UUID generated at deploy time
- D. It shouldn’t tag at all

8. Why should the production deploy require a manual approval gate?
- A. To give someone credit
- B. To add a human checkpoint for high-blast-radius changes, even after automated tests pass
- C. It’s required by law
- D. It’s free extra compute time

9. What does “zero-downtime deploy” typically require?
- A. Taking the service offline during deploys
- B. A rolling update strategy with health/readiness probes, so old instances drain only when new ones are healthy
- C. Recompiling the kernel
- D. Running two separate clusters

10. Why should manual steps be avoided in the deploy pipeline?
- A. Manual steps are slower
- B. They drift from documentation, are unauditable, and can’t be reproduced; anything not in the pipeline eventually breaks
- C. They use more electricity
- D. Manual steps are illegal

Answer Key

1. B: Under 15 minutes; ideally under 5 for fast feedback.
2. B: Mutable tags break rollback and auditability.
3. B: OIDC federation is the modern, keyless approach.
4. B: Cheap checks first; they catch a large fraction of problems at minimal cost.
5. B: Content-addressed keys (hashing lockfiles) give automatic invalidation.
6. B: Quarantine or fix; never just re-run.
7. B: Immutable, unique tags (commit SHA) for rollback and traceability.
8. B: Manual approval is a human checkpoint for high-blast-radius changes.
9. B: Rolling updates with readiness probes are the standard zero-downtime pattern.
10. B: Manual steps drift and become un-reproducible; automate everything.