Release Process
The CMS follows a pull-based, manually-gated release flow: every push
to main ships to staging automatically, and a human triggers production
when staging looks good.
Roles
- Author — Writes code, opens PR, merges to
main. The merge triggers a staging deploy automatically. - Releaser — Decides “staging is healthy enough, ship it.” Triggers the production workflow with a version bump.
- Approver — Reviewer on the
productionGitHub Environment. Approves the workflow run before it touches prod. (Often the same person as the Releaser, but the two-person rule is what the Environment enforces if you configure multiple required reviewers.)
The flow
-
Merge to
main.ci.ymlruns build/test/lint and, on success, chains thedeploy-stagingjob that ships to staging. Watch the Actions tab; checkhttps://cms-staging.alokai-apps.comonce it goes green. -
Verify on staging. Run through the pre-deploy checklist.
-
Trigger production. Actions → Deploy to Production → Run workflow → choose
version_bump:patch(default) — bug fixes, doc/UI tweaks, no schema changes.minor— new feature, additive schema, additive API.major— breaking schema or API change, or anything that requires a coordinated storefront update.
-
Approve. A required reviewer on the
productionGitHub Environment approves the run. The job stays paused until they do. -
Workflow runs. It tests, bumps the version in
apps/cms/package.json, applies any pending D1 migrations, deploys MCP, deploys CMS, smoke-checks/api/health, and only then commits the version bump, tagscms-vX.Y.Z, pushes, and creates a GitHub Release with auto-generated notes. -
Verify. Confirm
https://cms.alokai-apps.com/api/healthreturns the new version, the dashboard’s sidebar badge matches, and the Release shows up under GitHub Releases.
When a deploy fails
| Failure | What happened | What to do |
|---|---|---|
| Test step | A unit test regressed on main. | Don’t bypass. Fix on main, let staging redeploy, retry the prod run. |
| Bump CMS version | Should not happen — uses a direct JSON rewrite. If it ever does, fix the inline node script and retry. | Investigate the failing log; the workflow stopped before any deploy work. |
Backfill d1_migrations if needed | First-time run on a runtime-migrated DB; the auto-detect+backfill probe failed. | Run the manual backfill (yarn workspace cms db:migrate:backfill:prod) from a wrangler-authed shell, then re-trigger the workflow. |
| Apply D1 migrations | New migration is invalid or conflicts with prod schema. | Workflow stopped before deploying. Fix the migration on main, let staging redeploy and verify, then retry prod. |
| Deploy MCP worker | MCP build broken or wrangler config mismatch. | Workflow stopped before CMS was redeployed; prod is still on the previous CMS+MCP. Fix and retry. |
| Deploy CMS worker | New CMS bundle fails to deploy; old bundle still serves traffic. | Investigate; if MCP already deployed and is back-incompatible with the old CMS, roll MCP back via wrangler rollback (see Deployment). |
| Smoke check | Deploy went out but /api/health failed or returned the wrong version. | The new bundle is live and bad. Roll back via wrangler rollback. The version bump commit was not pushed, so re-running picks up cleanly after a fix. |
Versioning policy
- Patch (
X.Y.Z+1) — default for everything that isn’t a feature or a breaking change. Hotfixes, doc-only changes, refactors, internal cleanups. - Minor (
X.Y+1.0) — new feature you want users to notice, or any change that adds tables / columns / API endpoints (additive only). - Major (
X+1.0.0) — coordinate-with-storefront changes: API contract breaks, removed columns, removed routes, auth-flow changes.
The version is always the version of apps/cms (the dashboard worker).
@alokai/cms-api and @alokai/cms-cli have their own publish workflows
(publish-api.yml, publish-cli.yml) — those are independent.
Hotfix flow
If prod breaks and a fix needs to ship before the staging cadence:
- Branch from the last released tag (
cms-vX.Y.Z), notmain. - Apply the minimum fix.
- Open a PR straight to
mainand merge once green. - Trigger Deploy to Production with
version_bump=patch.
This skips the “soak in staging” step deliberately. Reserve it for actual
incidents. After the hotfix lands, the next normal main push will
re-roll staging onto the new tip, picking up the hotfix in the staging
deploy too.