Agent development plan¶

This document tracks how design-ai should evolve its local AI learning and agent workflow surface after reviewing adjacent open-source agent projects. It is a product and engineering plan, not a mandate to copy code from those repositories.

Reference baseline¶

Reference	Useful pattern	design-ai decision
NousResearch/hermes-agent	Closed learning loop, skills from experience, session search, scheduled automations, subagent delegation	Adopt the pattern as local deterministic learning, route evals, skill proposals, and explicit operator approval. Do not add autonomous background collection yet.
harness/harness	Pipeline, conformance tests, release evidence, local service readiness	Adopt evidence-first release gates and conformance-style CLI smoke checks. Do not build a DevOps platform in this repo.
strands-agents/sdk-python	Model/tool abstraction, MCP-native agent composition, lightweight SDK shape	Keep design-ai model-agnostic and add tool readiness metadata before adding provider adapters.
obra/superpowers	Skill-triggered workflow, planning before coding, test-first checkpoints	Adopt mandatory workflow checkpoints in skills and route evals. Keep user approval for destructive or external actions.
affaan-m/ECC	Cross-harness packaging, memory persistence, eval/checkpoint framing, security guardrails	Adopt cross-harness compatibility and eval checkpoint language. Avoid hidden hooks that mutate state without explicit CLI commands.
anomalyco/opencode	Separate plan/build agents and terminal-first agent UX	Add route/eval support for plan vs implementation prompts; no full coding agent runtime in design-ai.
langflow-ai/langflow and langgenius/dify	Visual workflow builders, API/MCP deployment, observability	Future Website Console can export workflow JSON and reports. MVP stays static/local.
anthropics/skills	Self-contained `SKILL.md` folders with metadata, scripts, and resources	Keep skills self-contained and add validation for route/skill coverage.
langchain-ai/langchain	Agent engineering layers, integrations, observability/evals	Adopt eval/observability concepts only; no dependency on LangChain.
google-gemini/gemini-cli	Terminal-first CLI, MCP support, checkpointing, GitHub action workflows	Add CLI checkpoint reports and future CI smoke targets.
TauricResearch/TradingAgents	Role-specialized multi-agent debate and risk review	Use role debate as a prompt template for design decisions, not as financial-domain logic.
farion1231/cc-switch	Cross-tool provider, MCP, and skill management	Future UI can manage provider/readiness metadata. Avoid API relay or provider switching inside this repo.
Shubhamsaboo/awesome-llm-apps	Runnable app examples and RAG/agent catalog	Use as inspiration for examples only.
x1xhlol/system-prompts-and-models-of-ai-tools	Prompt surface comparison	Do not copy prompts or code. Treat as a red-team/input hygiene reference because licensing and provenance are risky.

Architecture stance¶

design-ai should remain a local, deterministic control layer:

It routes tasks to skills, commands, agents, examples, and checked knowledge files.
It stores explicit local learning entries and usage metadata.
It validates artifacts and captures warn/fail feedback only when requested.
It produces prompts, packs, site handoff reports, and release evidence.
It should not become a hosted model runtime, external telemetry system, or hidden background trainer.

Phase plan¶

Phase 271: route eval harness¶

Add design-ai route --eval-template and design-ai route --eval so route selection can be checked with deterministic fixtures. This protects agent routing before deeper learning features rely on it.

Example:

design-ai route --eval-template --json > route-eval.json
design-ai route --eval --from-file route-eval.json --strict --json

Phase 272: prompt/pack eval harness¶

Extend the eval pattern from route selection to prompt plans and context bundles:

expected route id
required files to read
required checklist items
required prompt fragments
optional learning context expectations
strict failure on missing playbook files, missing checklist items, route drift, or context bundle drift

Examples:

design-ai prompt --eval-template --json > prompt-eval.json
design-ai prompt --eval --from-file prompt-eval.json --strict --json

design-ai pack --eval-template --json > pack-eval.json
design-ai pack --eval --from-file pack-eval.json --strict --json

Prompt evals report the generated prompt plan. Pack evals report a context snapshot with file metadata, context status, and markdown byte counts without dumping full context file bodies into eval JSON.

Phase 273: learning signal registry¶

Implemented design-ai learn --signals as a read-only registry report that joins:

learning profile audit
usage sidecar
route/prompt/pack/learning eval signal files
check learning capture entries
deterministic agent development backlog actions
workspace readiness

design-ai learn --signals --from-file . --json
design-ai learn --signals --from-file . --strict --json
design-ai learn --signals --from-file route-eval-report.json --usage-file learning.usage.json

This exposes drift without changing the learning profile, calling external AI APIs, adding dependencies, or storing raw brief text. Use --strict when the signal registry and agent development backlog should behave like a local deterministic gate.

Phase 488: readiness check index¶

Added automation-friendly readiness indexes to design-ai learn --signals and design-ai learn --agent-backlog JSON:

requiredCheckIds
optionalCheckIds
checkStatusById
checkRequiredById

These fields keep the existing checks array intact while letting local runners branch on checks such as check-capture or agent-development without array scanning or prose parsing. The change remains deterministic, local, read-only, and dependency-free.

Phase 274: skill evolution proposals¶

Implemented design-ai learn --propose-skills as a preview-only command that converts repeated learning/check issues into proposed skill edits:

candidate skill
evidence sources
proposed instruction delta
verification command
risk level

design-ai learn --propose-skills --from-file . --json
design-ai learn --propose-skills --from-file route-eval-report.json --usage-file learning.usage.json

The command groups repeated source: check:* learning entries by candidate skill and category. It reports single-entry groups as skipped, rejects --yes, and does not change learning.json, edit skills/*/SKILL.md, call external AI APIs, or add dependencies. No skill file should be changed unless the operator runs an explicit apply command in a later phase.

Phase 425: skill proposal patch handoffs¶

Added design-ai learn --propose-skills --patch as a preview-only handoff mode:

design-ai learn --propose-skills --from-file . --patch
design-ai learn --propose-skills --from-file . --patch --out skill-proposals.patch

The patch output is a unified diff preview that appends proposal review notes to candidate skills/*/SKILL.md files for manual review. It still does not mutate learning.json, edit skill files, call external AI APIs, add embeddings/fine-tuning, or add dependencies.

Phase 549: apply-plan decision manual apply status tones¶

Added selected-branch manual-apply status tones under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactManualApplyStatusToneByKey
decision.nextCommandOutputArtifactManualApplyStatusTone

Wrappers can now style apply badges without keeping a local status-to-tone map. Current tones are neutral for review-only artifacts, warning for blocked patch previews, and success for ready manual-apply artifacts.

Phase 548: apply-plan decision manual apply status labels¶

Added selected-branch manual-apply status display labels under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactManualApplyStatusLabelByKey
decision.nextCommandOutputArtifactManualApplyStatusLabel

Wrappers can now render apply badges without keeping a local enum-to-label map. Current labels are Review only, Blocked, and Ready to apply.

Phase 547: apply-plan decision manual apply status¶

Added selected-branch manual-apply status metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactManualApplyStatusByKey
decision.nextCommandOutputArtifactManualApplyStatus

Wrappers can now render apply badges from one enum. The values are not-applicable for review-only artifacts, blocked for manual-apply candidates with pending required preconditions, and ready once a manual-apply candidate has no required pending preconditions.

Phase 546: apply-plan decision manual apply blocked reasons¶

Added selected-branch manual-apply blocked reason metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactManualApplyBlockedReasonByKey
decision.commandOutputArtifactManualApplyBlockedReasonCodeByKey
decision.nextCommandOutputArtifactManualApplyBlockedReason
decision.nextCommandOutputArtifactManualApplyBlockedReasonCode

Wrappers can now render disabled patch-apply copy from explicit reason fields. Current review reports return not-manual-apply-candidate, while patch previews return required-preconditions-pending until manual review and clean-workspace preconditions are satisfied.

Phase 545: apply-plan decision manual apply readiness¶

Added selected-branch manual-apply readiness booleans under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactManualApplyReadyByKey
decision.nextCommandOutputArtifactManualApplyReady

Wrappers can now gate patch-apply affordances from one boolean. The field is true only when the artifact is a manual-apply candidate and all required apply preconditions are satisfied; current patch-preview output remains false until manual review and clean-workspace preconditions are explicitly satisfied.

Phase 544: apply-plan decision apply precondition state counts¶

Added selected-branch apply-precondition state counts under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactSatisfiedApplyPreconditionCountByKey
decision.commandOutputArtifactPendingApplyPreconditionCountByKey
decision.commandOutputArtifactRequiredPendingApplyPreconditionCountByKey
decision.nextCommandOutputArtifactSatisfiedApplyPreconditionCount
decision.nextCommandOutputArtifactPendingApplyPreconditionCount
decision.nextCommandOutputArtifactRequiredPendingApplyPreconditionCount

Wrappers can now render checklist progress and disabled apply affordances without reducing row objects. Current patch-preview preconditions are pending by default; a precondition counts as satisfied only when it explicitly carries satisfied: true.

Phase 543: apply-plan decision apply precondition counts¶

Added selected-branch apply-precondition summary counts under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactApplyPreconditionCountByKey
decision.commandOutputArtifactRequiredApplyPreconditionCountByKey
decision.nextCommandOutputArtifactApplyPreconditionCount
decision.nextCommandOutputArtifactRequiredApplyPreconditionCount

Wrappers can now render checklist summaries without iterating over compact precondition objects. Use the compact { id, label, required } array for row rendering and these count fields for summary or disabled-state copy.

Phase 542: apply-plan decision compact apply preconditions¶

Added selected-branch compact apply-precondition objects under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactApplyPreconditionsByKey.reviewCheckReport
decision.commandOutputArtifactApplyPreconditionsByKey.proposalPatchPreview
decision.nextCommandOutputArtifactApplyPreconditions

Wrappers can now render checklist rows from { id, label, required } objects without zipping parallel id and label arrays. Keep using split id/label arrays when a host needs separate automation and display surfaces.

Phase 541: apply-plan decision apply precondition labels¶

Added selected-branch ordered apply-precondition labels under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactApplyPreconditionLabelsByKey.reviewCheckReport
decision.commandOutputArtifactApplyPreconditionLabelsByKey.proposalPatchPreview
decision.nextCommandOutputArtifactApplyPreconditionLabels

Wrappers can now render checklist copy without maintaining a local label map for precondition ids. Use decision.commandOutputArtifactApplyPreconditionIdsByKey.<key> for stable automation ids and decision.commandOutputArtifactApplyPreconditionLabelsByKey.<key> for display copy in the same order.

Phase 540: apply-plan decision apply precondition ids¶

Added selected-branch ordered apply-precondition ids under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactApplyPreconditionIdsByKey.reviewCheckReport
decision.commandOutputArtifactApplyPreconditionIdsByKey.proposalPatchPreview
decision.nextCommandOutputArtifactApplyPreconditionIds

Wrappers can now render patch-apply checklist items from a stable array instead of recombining multiple booleans. Use decision.commandOutputArtifactApplyPreconditionIdsByKey.proposalPatchPreview for ordered apply checklist ids, then use the corresponding boolean fields for gate state and confirmation logic.

Phase 539: apply-plan decision clean workspace apply gates¶

Added selected-branch clean-workspace apply gates under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactRequiresCleanWorkspaceBeforeApplyByKey.reviewCheckReport
decision.commandOutputArtifactRequiresCleanWorkspaceBeforeApplyByKey.proposalPatchPreview
decision.nextCommandOutputArtifactRequiresCleanWorkspaceBeforeApply

Wrappers can now require a clean workspace before manually applying a patch preview without blocking the preview-generation command itself. Treat decision.nextCommandSafety.requiresCleanWorkspace as the safety requirement for running the selected command, and decision.nextCommandOutputArtifactRequiresCleanWorkspaceBeforeApply as the safety requirement for applying the generated artifact after review.

Phase 538: apply-plan decision output artifact review instructions¶

Added selected-branch output artifact review guidance under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactReviewInstructionByKey.reviewCheckReport
decision.commandOutputArtifactReviewInstructionByKey.proposalPatchPreview
decision.nextCommandOutputArtifactReviewInstruction

Wrappers can now render artifact-specific review guidance for Markdown reports and patch previews without hard-coding command keys, artifact names, or copy. Use decision.commandOutputArtifactReviewInstructionByKey.<key> for review copy, decision.commandOutputArtifactRequiresManualReviewByKey.<key> for review gates, and decision.commandOutputArtifactManualApplyCandidateByKey.<key> for boolean apply affordances.

Phase 537: apply-plan decision manual review gates¶

Added selected-branch manual-review-required metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactRequiresManualReviewByKey.reviewCheckReport
decision.commandOutputArtifactRequiresManualReviewByKey.proposalPatchPreview
decision.nextCommandOutputArtifactRequiresManualReview

Wrappers can now require human review before enabling patch-preview apply affordances without parsing command keys, disposition strings, or manual-apply candidate flags. Use decision.commandOutputArtifactRequiresManualReviewByKey.<key> for review gates, decision.commandOutputArtifactManualApplyCandidateByKey.<key> for boolean apply affordances, and decision.commandArgsByKey.<key> for automation execution.

Phase 536: apply-plan decision manual-apply candidate flags¶

Added selected-branch manual-apply candidate metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactManualApplyCandidateByKey.reviewCheckReport
decision.commandOutputArtifactManualApplyCandidateByKey.proposalPatchPreview
decision.nextCommandOutputArtifactManualApplyCandidate

Wrappers can now show manual-apply buttons, warnings, or confirmation affordances only for patch previews without parsing artifact disposition strings or hard-coding command keys. Use decision.commandOutputArtifactManualApplyCandidateByKey.<key> for boolean UI gates, decision.commandOutputArtifactDispositionByKey.<key> for post-render handling, and decision.commandArgsByKey.<key> for automation execution.

Phase 535: apply-plan decision output artifact dispositions¶

Added selected-branch output artifact disposition metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactDispositionByKey.reviewCheckReport
decision.commandOutputArtifactDispositionByKey.proposalPatchPreview
decision.nextCommandOutputArtifactDisposition

Wrappers can now distinguish review-only artifacts from manual-apply previews without hard-coding command keys, parsing file names, or inferring behavior from media types. Use decision.commandOutputArtifactDispositionByKey.<key> for post-render handling, decision.commandOutputArtifactMediaTypeByKey.<key> for content type handling, decision.commandOutputArtifactActionByKey.<key> for preview behavior, and decision.commandArgsByKey.<key> for automation execution.

Phase 534: apply-plan decision output artifact media types¶

Added selected-branch output artifact media type metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactMediaTypeByKey.reviewCheckReport
decision.commandOutputArtifactMediaTypeByKey.proposalPatchPreview
decision.nextCommandOutputArtifactMediaType

Wrappers can now configure Markdown and diff viewers, clipboard behavior, or download content types without parsing file extensions, artifact type strings, command strings, or argv arrays. Use decision.commandOutputArtifactMediaTypeByKey.<key> for content type handling, decision.commandOutputArtifactActionByKey.<key> for preview behavior, decision.commandOutputArtifactTypeByKey.<key> for artifact type, and decision.commandArgsByKey.<key> for automation execution.

Phase 533: apply-plan decision output artifact actions¶

Added selected-branch output artifact action metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactActionByKey.reviewCheckReport
decision.commandOutputArtifactActionByKey.proposalPatchPreview
decision.nextCommandOutputArtifactAction

Wrappers can now choose Markdown report rendering or unified diff preview rendering without deriving UI behavior from artifact type strings, file names, command strings, or argv arrays. Use decision.commandOutputArtifactActionByKey.<key> for preview behavior, decision.commandOutputArtifactTypeByKey.<key> for artifact type, decision.commandOutputArtifactByKey.<key> for artifact names, and decision.commandArgsByKey.<key> for automation execution.

Phase 532: apply-plan decision output artifact types¶

Added selected-branch output artifact type metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactTypeByKey.reviewCheckReport
decision.commandOutputArtifactTypeByKey.proposalPatchPreview
decision.nextCommandOutputArtifactType

Wrappers can now distinguish Markdown review reports from unified diff previews without parsing file extensions, command strings, or argv arrays. Use decision.commandOutputArtifactTypeByKey.<key> for preview/rendering type decisions, decision.commandOutputArtifactByKey.<key> for artifact names, decision.commandDisplayLabelByKey.<key> for UI labels, decision.commandDescriptionByKey.<key> for helper text, and decision.commandArgsByKey.<key> for automation execution.

Phase 531: apply-plan decision output artifacts¶

Added selected-branch output artifact metadata under operatorRunbook.stageSelection.decision:

decision.commandOutputArtifactByKey.reviewCheckReport
decision.commandOutputArtifactByKey.proposalPatchPreview
decision.nextCommandOutputArtifact

Wrappers can now display selected optional preview artifact targets without parsing --out arguments from command strings or argv arrays. Use decision.commandOutputArtifactByKey.<key> for UI/export artifact names, decision.commandDisplayLabelByKey.<key> for UI labels, decision.commandDescriptionByKey.<key> for helper text, decision.commandStringByKey.<key> for copy/display command strings, and decision.commandArgsByKey.<key> for automation execution.

Phase 530: apply-plan decision command descriptions¶

Added selected-branch command description metadata under operatorRunbook.stageSelection.decision:

decision.commandDescriptionByKey.reviewCheckReport
decision.commandDescriptionByKey.proposalPatchPreview
decision.nextCommandDescription

Wrappers can now render selected optional preview command tooltips or secondary descriptions without hard-coding command semantics or scanning decision.commands. Use decision.commandDisplayLabelByKey.<key> for UI labels, decision.commandDescriptionByKey.<key> for helper text, decision.commandStringByKey.<key> for copy/display command strings, and decision.commandArgsByKey.<key> for automation execution.

Phase 529: apply-plan decision command display labels¶

Added selected-branch command display-label metadata under operatorRunbook.stageSelection.decision:

decision.commandDisplayLabelByKey.reviewCheckReport
decision.commandDisplayLabelByKey.proposalPatchPreview
decision.nextCommandDisplayLabel

Wrappers can now render selected optional preview command labels without deriving UI copy from camelCase command keys or scanning decision.commands. Use decision.commandDisplayLabelByKey.<key> for UI labels, decision.commandStringByKey.<key> for copy/display command strings, and decision.commandArgsByKey.<key> for automation execution.

Phase 528: apply-plan decision command string lookup¶

Added selected-branch command-string lookup metadata under operatorRunbook.stageSelection.decision:

decision.commandStringByKey.reviewCheckReport
decision.commandStringByKey.proposalPatchPreview

Wrappers can now display or copy selected optional preview command strings by key without scanning decision.commands, opening decision.commandByKey, or jumping to the top-level commands object. Use decision.commandArgsByKey.<key> for automation execution and decision.commandStringByKey.<key> for human-readable copy/display handoffs.

Phase 527: apply-plan decision command args lookup¶

Added selected-branch command-args lookup metadata under operatorRunbook.stageSelection.decision:

decision.commandArgsByKey.reviewCheckReport
decision.commandArgsByKey.proposalPatchPreview

Wrappers can now retrieve selected optional preview command argv by key without scanning decision.commands, opening decision.commandByKey, or jumping to the top-level commandArgs object. The lookup currently maps both selected preview commands to their full structured argv arrays; use decision.commandByKey.<key> when a full command object is needed.

Phase 526: apply-plan decision command safety-level lookup¶

Added selected-branch safety-level lookup metadata under operatorRunbook.stageSelection.decision:

decision.commandSafetyLevelByKey.reviewCheckReport
decision.commandSafetyLevelByKey.proposalPatchPreview

Wrappers can now validate selected optional preview command safety levels by key without scanning decision.commands or opening decision.commandByKey. The lookup currently maps both selected preview commands to local-output; use decision.commandByKey.<key>.safety when full mutation details are needed.

Phase 525: apply-plan decision command run-policy lookup¶

Added selected-branch run-policy lookup metadata under operatorRunbook.stageSelection.decision:

decision.commandRunPolicyByKey.reviewCheckReport
decision.commandRunPolicyByKey.proposalPatchPreview

Wrappers can now validate the selected optional preview branch's execution policy by key without scanning decision.commands or reading the full commandSequence. The lookup currently maps both selected preview commands to output-artifact.

Phase 524: apply-plan decision command step lookup¶

Added selected-branch step lookup metadata under operatorRunbook.stageSelection.decision:

decision.commandStepByKey.reviewCheckReport
decision.commandStepByKey.proposalPatchPreview

Wrappers can now validate the selected optional preview branch's original command order by key without scanning decision.commands or reading the full commandSequence. The lookup currently maps reviewCheckReport to 2 and proposalPatchPreview to 3.

Phase 523: apply-plan decision command step metadata¶

Added command-sequence step metadata under operatorRunbook.stageSelection.decision:

decision.commands[*].step
decision.commandByKey.<key>.step
decision.nextCommandEntry.step
decision.nextCommandStep

Wrappers can now preserve the selected optional preview branch's original command order without reading the full commandSequence. The selected next command currently reports nextCommandStep: 2 for reviewCheckReport, after the read-only reviewCheckJson command at step 1.

Phase 522: apply-plan decision next command safety¶

Added selected next-command safety metadata under operatorRunbook.stageSelection.decision:

decision.nextCommandSafety: the full safety object for the command named by decision.nextCommandKey

This lets wrappers that already consume decision.nextCommand, decision.nextCommandArgs, and decision.nextCommandRunPolicy gate the selected optional preview command without reading decision.nextCommandEntry. The field mirrors decision.nextCommandEntry.safety and keeps the existing decision.nextCommandSafetyLevel string for compatibility.

Phase 521: apply-plan decision command safety objects¶

Added nested command-level safety objects under operatorRunbook.stageSelection.decision:

decision.commands[*].safety
decision.commandByKey.<key>.safety
decision.nextCommandEntry.safety

Wrappers can now inspect level, local-output writes, mutation boundaries, external-AI usage, clean-workspace requirements, and the command-specific reason directly from the selected decision command object. The older safetyLevel and flattened boolean flags remain for compatibility, while commandSequenceByKey remains the canonical full-command lookup.

Phase 520: apply-plan decision next command entry¶

Added a compact full command object under operatorRunbook.stageSelection.decision:

nextCommandEntry: the first selected optional preview command object, currently reviewCheckReport

Wrappers should prefer decision.nextCommandEntry when rendering or running the first optional preview handoff because it carries the command string, structured args, run policy, safety level, write/mutation flags, external-AI flags, and clean-workspace requirement in one object. The separate decision.nextCommand* fields remain for compatibility, while decision.commandByKey remains the lookup surface for explicit operator command choice.

Phase 519: apply-plan decision command lookup¶

Added direct lookup fields under operatorRunbook.stageSelection.decision:

commandByKey: compact lookup for selected-branch commands
nextCommandKey: currently reviewCheckReport
nextCommand / nextCommandArgs: executable first optional preview command handoff
nextCommandRunPolicy / nextCommandSafetyLevel: quick gate metadata for the first command

Wrappers should use decision.nextCommand* when offering the first optional preview command and decision.commandByKey when the operator chooses a specific preview artifact. The full canonical command contract remains commandSequenceByKey.

Phase 518: apply-plan decision command handoff¶

Added compact selected-branch command handoffs under operatorRunbook.stageSelection.decision:

commandCount: currently 2
commandKeys: reviewCheckReport, proposalPatchPreview
commands: compact command objects with command string, structured args, run policy, safety level, write/mutation flags, and external-AI flags

Wrappers can now branch on decision.action, gate on decision.safety, then offer or execute decision.commands for optional local preview artifacts. commandSequenceByKey remains the full canonical command lookup for later gates.

Phase 517: apply-plan decision safety summary¶

Added operatorRunbook.stageSelection.decision.safety so wrappers can gate the selected decision directly:

level: currently local-output
writesLocalFiles / writesOutputArtifacts: true for optional preview artifacts
mutatesProfile / mutatesReviewFile / mutatesSkillFiles: false
callsExternalAiApis: false
requiresCleanWorkspace: false

Wrappers should branch on decision.action, then inspect decision.safety before executing or offering commands. The selected-stage summaries remain available for fuller detail, but the decision object is now self-contained for first-branch gating.

Phase 516: apply-plan stage decision enum¶

Added operatorRunbook.stageSelection.decision as the first branch decision for apply-plan wrappers:

action: currently offer-optional-preview
stageKey / stageKind: the selected optional preview branch
commandKeys / runPolicy: the commands and execution policy for that branch
nextRequiredStageKey / nextRequiredCommandStageKey: the mandatory path after optional preview
requiresOperatorActionBeforeRequiredCommands: currently true, because accepted skill deltas remain manual

Wrappers should branch on decision.action before reading the selected-stage summaries. The decision enum is the routing surface; the summaries are the safety/detail surface.

Phase 515: apply-plan selected stage summaries¶

Added compact selected-stage summaries under operatorRunbook.stageSelection:

nextStage: the optional selected preview branch, currently previewArtifacts
nextRequiredStage: the first mandatory branch, currently manualSkillEdit
nextRequiredCommandStage: the first mandatory command-bearing branch, currently reviewReadiness

Each summary includes command count, command keys, optional/required state, stage kind, local-output flags, mutation flags, external-AI flags, clean-workspace requirement, and reason. Wrappers should use these summaries for branch safety checks before consulting stageByKey for full stage details.

Phase 514: apply-plan stage selection summary¶

Added operatorRunbook.stageSelection to group the stage-selection policy in one object:

strategy: currently optional-preview-before-required-manual-edit
stageOrder: stable operator stage order
nextStageKey / nextStageCommandKeys: optional preview artifacts
nextRequiredStageKey / nextRequiredStageCommandKeys: required manual skill edit stage
nextRequiredCommandStageKey / nextRequiredCommandStageCommandKeys: required read-only review gate

Wrappers should use this object when they need a single branch point for optional previews versus mandatory operator work. The top-level fields remain for backward compatibility, and invalid command contracts still return an empty stageSelection object.

Phase 513: apply-plan required stage handoff¶

Added required-stage handoff fields to operatorRunbook so local AI/agent wrappers can distinguish optional local preview artifacts from required operator work:

nextRequiredStageKey: the first required stage, currently manualSkillEdit
nextRequiredStageCommandKeys: commands on that required stage, currently empty because skill-file edits remain manual
nextRequiredCommandStageKey: the first required stage that has commands, currently reviewReadiness
nextRequiredCommandStageCommandKeys: commands for that stage, currently reviewCheckJson

This lets automation offer optional previewArtifacts while still routing the mandatory path through manual skill edits and read-only review gates. Invalid command contracts stay fail-closed with empty required-stage fields.

Phase 512: apply-plan runbook stage index¶

Added operatorRunbook.stageKeys and operatorRunbook.stageByKey to design-ai learn --propose-skills --review-file skill-proposals.review.json --apply-plan so local AI/agent wrappers can retrieve runbook stages by stable key without scanning the ordered stages array.

The index exposes the same four runbook stages:

previewArtifacts
manualSkillEdit
reviewReadiness
strictGate

Invalid command contracts stay fail-closed with an empty stageKeys list and empty stageByKey map. This mirrors the command-level commandSequenceKeys / commandSequenceByKey contract at the operator-runbook layer while preserving the same no-mutation boundary for learning profiles, review files, skill files, external AI APIs, embeddings, and fine-tuning jobs.

Phase 511: apply-plan operator runbook¶

Added operatorRunbook to design-ai learn --propose-skills --review-file skill-proposals.review.json --apply-plan so local AI/agent wrappers can follow the accepted-proposal handoff at the operator stage level.

The runbook exposes four deterministic stages:

previewArtifacts: optional local-output previews for reviewCheckReport and proposalPatchPreview
manualSkillEdit: required manual review/edit of accepted skill deltas
reviewReadiness: required read-only reviewCheckJson validation after manual edits
strictGate: required read-only strict gate before marking proposals applied

Invalid command contracts stay fail-closed with blocked: true, zero stages, and no next stage. The runbook is additive and preserves the existing boundary: preview artifact commands may write explicit --out files, but apply-plan output still does not mutate learning.json, review files, skill files, external AI APIs, embeddings, or fine-tuning jobs.

Phase 510: apply-plan sequence key index¶

Added commandSequenceKeys and commandSequenceByKey to design-ai learn --propose-skills --review-file skill-proposals.review.json --apply-plan so local AI/agent wrappers can retrieve named follow-up commands without scanning the ordered commandSequence array.

The index preserves the same validated command items:

reviewCheckJson
reviewCheckReport
proposalPatchPreview
strictGate

Invalid command contracts stay fail-closed with an empty key list and empty key map. The key index is additive and keeps the same boundary as the ordered sequence: local output previews may write requested --out artifacts, but the apply plan still does not mutate learning.json, review files, skill files, external AI APIs, embeddings, or fine-tuning jobs.

Phase 509: apply-plan sequence safety summary¶

Added commandSequenceSummary to design-ai learn --propose-skills --review-file skill-proposals.review.json --apply-plan so local AI/agent wrappers can branch on the full follow-up handoff without reducing the full commandSequence array.

The summary reports:

whether the sequence is executable or blocked
total step count
read-only vs local-output step counts
local write/output artifact flags
profile, review-file, and skill-file mutation flags
external AI API and clean-workspace boundaries
aggregate run policy

This keeps the apply-plan handoff deterministic and explicit: preview/report/patch artifacts may write local output files when requested with --out, but the sequence still does not mutate learning.json, review files, skill files, external AI APIs, embeddings, or fine-tuning jobs.

Phase 275: Website Console MCP probes¶

Implemented design-ai site --mcp-check --probes and design-ai site --mcp-plan --probes as optional read-only probe overlays for the existing Website Console MCP readiness matrix:

GitHub repo reference parseable through github.com/<owner>/<repo> or an existing local repo path
Figma URL parseable for design, file, board, slides, or make handoff references
Browser smoke target available from a valid live URL plus configured viewport set
deployment provider reference configured with a valid live URL

The probes report as a separate probes JSON block so the default --mcp-check contract stays stable. They remain deterministic, local, and read-only: no external MCP calls, no writes to GitHub/Figma/deploy providers, no crawling, no Lighthouse/axe automation, and no new dependencies.

Phase 276: workflow graph export¶

Implemented design-ai site --graph [--json] so website improvement workspaces and agent plans can be exported as portable graphs that are renderable later in the static console.

The graph includes deterministic nodes for:

workspace intake
site profile
audit categories
MCP readiness
generated and retained refactor tasks
prompt templates
handoff report, local bundle, and target website repo boundary

Edges connect profile context, audit findings, MCP readiness, task execution, prompt generation, and handoff flow. The export remains deterministic, local, and read-only: no external MCP calls, no target-repo mutation, no workflow runtime dependency, no Lighthouse/axe/crawling, and no new dependencies.

Phase 277: static workflow graph rendering¶

Implemented the static Website Console Workflow Graph tab so operators can inspect the workflow graph in the browser before exporting JSON or handing prompts to a target website repo.

The view renders:

summary metrics for graph nodes, edges, tasks, and required MCPs
lane-based node groups for intake, audit, MCP readiness, tasks, prompts, and handoff
deterministic edge rows matching the portable graph contract
boundary markers for local execution, no external MCP calls, no target-repo mutation, and no new dependencies
copy/export actions for website-workflow-graph.json

This keeps the useful part of visual workflow builders in a dependency-free, local/read-only console. It does not add a workflow runtime, backend sync, crawling, Lighthouse/axe automation, or live MCP connection checks.

Phase 278: handoff evidence tracking¶

Implemented browser-local Website Console evidence tracking for the handoff phase:

executed target-repo work
verification results from lint/typecheck/build, Browser QA, deployment checks, or manual QA
remaining risks
next actions

The Handoff Report tab stores those fields in the workspace JSON, shows compact evidence counts, and injects the evidence into copied/exported Markdown reports. This keeps the closed loop between generated prompts and final operator evidence without mutating the target repo, calling external MCPs, adding a backend, or adding dependencies.

Phase 279: CLI handoff evidence export¶

Implemented implementationEvidence support in design-ai site so browser-captured evidence survives the file-first CLI workflow:

--json reports evidence counts
--tasks preserves the evidence block
--report renders executed work, verification results, remaining risks, and next actions
--bundle stores evidence in website-workspace.tasks.json, website-handoff.md, and summary.json

The CLI validates malformed evidence array shapes, but it does not verify target-repo claims automatically. Evidence remains operator-entered, deterministic, local, and dependency-free.

Phase 280: evidence package smoke expansion¶

Expanded packed-tarball smoke coverage for Website Console evidence preservation:

installed-bin design-ai site --stdin --report preserves non-empty handoff evidence in Markdown
installed-bin design-ai site --stdin --tasks preserves the implementationEvidence JSON block
installed-bin design-ai site --stdin --bundle --out <dir> preserves evidence in summary.json, website-workspace.tasks.json, and website-handoff.md
one-shot npm exec --package <tarball> covers the same report, tasks, and bundle paths
package-smoke.py --self-test now includes evidence payload and Markdown drift fixtures

This turns the Website Console evidence loop into a release-smoked distribution contract without adding dependencies, calling external MCPs, mutating target repos, or claiming that target-repo evidence is automatically verified.

Current MVP boundary¶

In scope:

deterministic route, prompt, learning, and site evals
local JSON state
explicit preview/apply boundaries
release smoke coverage
skill and command documentation

Out of scope:

embeddings
fine-tuning
autonomous background learning
external telemetry
hosted sync
provider/API relay management
copying third-party system prompts

Verification checklist¶

For every agent/AI phase:

node --test for touched CLI modules
design-ai <command> --help for public CLI surface
JSON round-trip check for every machine-readable report
strict-mode failure fixture for every eval
git diff --check
release metadata update only when the public smoke surface changes