Mirror of https://github.com/khodges42/nightShift.git, synced 2026-06-14 18:18:36 +00:00 (1032 lines, 18 KiB, Markdown).

# NIGHTSHIFT_CODEX.md

You are Codex working on **NightShift**, a local-first AI coding pipeline runner.

This file is the implementation-driving context document. Treat it as the project brief, architectural guide, and task checklist.

---

# 0. Project Identity

## Name

**NightShift**

## Tagline

Auditable local-first AI coding pipelines.

## Core Thesis

NightShift is not an autonomous coding god.

NightShift is a deterministic pipeline runner that lets unreliable AI agents perform bounded coding work inside scoped, auditable, test-driven workflows.

The user should be able to run NightShift overnight and wake up to:

* a reviewable repository state
* task artifacts
* plans
* logs
* diffs
* test output
* review notes
* a final report

## Priority Order

Optimize in this order:

1. Cheapness
2. Correctness
3. Auditability
4. Speed

This means:

* Prefer local models first.
* Keep context compact.
* Avoid token waste.
* Make failure explicit.
* Always produce artifacts.
* Do not optimize for cleverness before trust.

---

# 1. Product Summary

NightShift runs long-running AI-assisted coding pipelines against a scoped project directory.

A user provides:

* a repository
* a markdown task file
* a declarative pipeline config
* agent definitions
* allowed test/static commands

NightShift processes one task at a time:

```text
select task
  -> plan
  -> review plan
  -> implement
  -> run tests
  -> run static checks
  -> review result
  -> retry or complete
  -> write summary
```

The output is not automatically shipped.

The output is a reviewable work package.

---

# 2. Non-Negotiable Design Constraints

## 2.1 Local-first

The first implementation should assume local execution.

Primary target backend:

* local command-driven agent execution

Future-compatible backends:

* Ollama
* Claude Code
* Codex CLI
* OpenAI API
* Anthropic API

Do not overbuild backend support in v1.

Build a clean interface first.

---

## 2.2 Scoped directory access

NightShift must only operate inside a configured project root.

It must not casually read/write arbitrary paths.

All path resolution should:

* normalize paths
* reject path traversal
* reject writes outside project root
* prefer relative paths in artifacts

---

## 2.3 One task at a time

v1 runs one task at a time.

No parallel task execution.

No DAG executor yet.

---

## 2.4 Declarative config first

Use YAML for v1.

Do not implement arbitrary Python config yet.

The config should be expressive enough for:

* agents
* stages
* commands
* retries
* artifact directory
* task file location
* scoped paths
* allowlisted commands

---

## 2.5 Auditable artifacts

Every run should create a durable artifact tree.

Artifacts are core product behavior, not debug leftovers.

---

# 3. Architecture

## 3.1 Conceptual Components

```text
NightShift CLI
    |
    v
Config Loader
    |
    v
Task Parser
    |
    v
Pipeline Runner
    |
    +--> Agent Executor
    |
    +--> Command Executor
    |
    +--> Artifact Store
    |
    +--> Context Manager
    |
    v
Run Summary
```

---

## 3.2 Suggested Module Layout

Use this layout unless the existing repo already strongly implies another structure.

```text
nightshift/
  __init__.py
  cli.py
  config.py
  tasks.py
  pipeline.py
  stages.py
  agents.py
  commands.py
  artifacts.py
  context.py
  safety.py
  reports.py
  errors.py

tests/
  test_config.py
  test_tasks.py
  test_pipeline.py
  test_safety.py
  test_artifacts.py

examples/
  pipeline.yaml
  tasks.md
  agents/
    planner.md
    implementer.md
    reviewer.md

NIGHTSHIFT_CODEX.md
README.md
```

If this project is implemented in Rust instead of Python, preserve the same conceptual boundaries.

---

# 4. Config Format

## 4.1 Example `nightshift.yaml`

```yaml
project:
  name: example-project
  root: .
  task_file: tasks.md
  artifact_dir: .nightshift

safety:
  require_clean_worktree: false
  scoped_paths:
    - src/
    - tests/
  allowed_commands:
    - cargo test
    - cargo fmt --check
    - cargo clippy -- -D warnings
  forbidden_commands:
    - rm -rf
    - git push
    - curl | bash

agents:
  planner:
    backend: command
    command: echo
    system_prompt: examples/agents/planner.md

  implementer:
    backend: command
    command: echo
    system_prompt: examples/agents/implementer.md

  reviewer:
    backend: command
    command: echo
    system_prompt: examples/agents/reviewer.md

pipeline:
  max_task_retries: 3
  stages:
    - id: plan
      type: agent
      agent: planner
      output: plan.md

    - id: review_plan
      type: agent_review
      agent: reviewer
      on_fail: plan
      output: plan-review.md

    - id: implement
      type: agent
      agent: implementer
      output: implementation-log.md

    - id: test
      type: command
      commands:
        - cargo test
      output: test-output.txt

    - id: static
      type: command
      commands:
        - cargo fmt --check
        - cargo clippy -- -D warnings
      output: static-output.txt

    - id: review
      type: agent_review
      agent: reviewer
      on_fail: implement
      output: review.md

    - id: summarize
      type: summarize
      output: final-notes.md
```

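
The cross-reference checks that a config of this shape calls for can be sketched on an already-parsed mapping. This is a minimal illustration, not the project's implementation: `ConfigError` and `validate_references` are hypothetical names, and the config is assumed to have been loaded into plain dicts by whichever YAML library the project chooses. The error wording follows the "helpful errors" style this brief asks for.

```python
class ConfigError(Exception):
    """Hypothetical error type for an internally inconsistent config."""


def validate_references(config: dict) -> None:
    """Check that stages only reference defined agents and stage ids.

    Assumes `config` mirrors the example YAML: an `agents` mapping and a
    `pipeline.stages` list of dicts that each carry an `id`.
    """
    agents = config.get("agents", {})
    stages = config.get("pipeline", {}).get("stages", [])
    stage_ids = {stage["id"] for stage in stages}

    for stage in stages:
        agent = stage.get("agent")
        if agent is not None and agent not in agents:
            raise ConfigError(
                f"Config error: pipeline stage '{stage['id']}' references "
                f"unknown agent '{agent}'.\n"
                f"Defined agents: {', '.join(agents)}."
            )
        target = stage.get("on_fail")
        if target is not None and target not in stage_ids:
            raise ConfigError(
                f"Config error: pipeline stage '{stage['id']}' has on_fail "
                f"target '{target}', which is not a defined stage.\n"
                f"Defined stages: {', '.join(s['id'] for s in stages)}."
            )
```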

---

# 5. Task File Format

## 5.1 Input Task Format

Tasks are markdown checklist items with acceptance criteria.

Example:

```markdown
# Tasks

- [ ] TASK-001: Add YAML config loading

  Description:
  Implement config loading for NightShift.

  Acceptance Criteria:
  - Loads `nightshift.yaml`
  - Validates required fields
  - Returns typed config object
  - Includes tests for valid and invalid config

- [ ] TASK-002: Add artifact directory creation

  Description:
  Create per-run and per-task artifact directories.

  Acceptance Criteria:
  - Creates `.nightshift/runs/<timestamp>/`
  - Creates task-specific folder
  - Writes task snapshot
  - Includes tests
```

## 5.2 Parser Requirements

The parser should identify:

* task id
* task title
* completion state
* description
* acceptance criteria
* optional dependency notes

For v1, parsing can be simple and documented.

Do not try to support every markdown style.


---

# 6. Pipeline Model

## 6.1 State Machine, Not DAG

v1 should use a configurable state machine.

Reason:

* one task at a time
* retry loops matter
* easier to audit
* easier to debug
* easier MVP

A stage returns a `StageResult`.

Suggested shape:

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class StageResult:
    stage_id: str
    status: Literal["pass", "fail", "retry", "escalate"]
    reason: str
    output_path: str | None = None
    next_stage: str | None = None
    context_update: str | None = None
```

Equivalent Rust structs are fine if using Rust.

## 6.2 Retry Behavior

Retry behavior should be deterministic.

Rules:

* retries are counted per task
* max retries come from config
* failed review stages can redirect to configured `on_fail`
* after max retries, task is marked failed
* failure is summarized in artifacts


---

# 7. Agent Model

## 7.1 Agent Definition

Agents have:

* id
* backend
* command or model
* system prompt file
* role

For v1, support a `command` backend first.

This lets the user wrap:

* Codex
* Claude Code
* Ollama scripts
* local model scripts
* fake test agents

## 7.2 Agent Invocation

The runner should construct a prompt/input bundle containing:

* system prompt
* task markdown
* acceptance criteria
* relevant project context
* previous stage output
* retry notes, if any
* required output contract

The agent should write output to the configured artifact path.

Do not pass giant history blobs.


---

# 8. Context System

## 8.1 Context Layers

There are three context layers:

```text
project context
  long-lived, compact, shared across tasks

task context
  specific to the current task

retry context
  compact notes from failed attempts
```

## 8.2 Project Context

Stored at:

```text
.nightshift/project-context.md
```

Contains:

* architecture notes
* repo conventions
* summaries from completed tasks
* high-value durable facts

## 8.3 Task Context

Stored per task:

```text
.nightshift/runs/<run-id>/tasks/<task-id>/context.md
```

## 8.4 Context Compaction

After each task, write:

```text
context-out.md
```

Then selectively bubble useful durable information into project context.

Do not automatically dump everything into project context.

---

# 9. Artifact Layout

Every run should create:

```text
.nightshift/
  project-context.md
  runs/
    <run-id>/
      run-summary.md
      config.snapshot.yaml
      tasks/
        TASK-001/
          task.md
          plan.md
          plan-review.md
          implementation-log.md
          test-output.txt
          static-output.txt
          review.md
          final-notes.md
          diff.patch
          context.md
          context-out.md
```

Artifacts should be written even on failure.


---

# 10. Safety Rules

## 10.1 Path Safety

Implement helpers that:

* resolve paths against project root
* reject writes outside project root
* reject `..` traversal that escapes root
* prefer pathlib/path abstractions

## 10.2 Command Safety

For v1:

* only run commands listed in `allowed_commands`
* block commands containing known forbidden fragments
* record all command output
* record exit code
* set timeouts when practical

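
A minimal sketch of these checks follows. It assumes exact-match allowlisting after whitespace normalization, which is stricter than prefix matching and is only one possible policy. `CommandRejected`, `check_command`, `run_command`, and the 600-second default timeout are hypothetical. The command is executed without a shell, so pipes and redirects in a command string are never interpreted.

```python
import shlex
import subprocess


class CommandRejected(Exception):
    """Hypothetical error for commands that fail the safety checks."""


def check_command(command: str, allowed: list[str], forbidden: list[str]) -> str:
    """Return the normalized command, or raise if it is not permitted."""
    normalized = " ".join(command.split())
    for fragment in forbidden:
        if fragment in normalized:
            raise CommandRejected(
                f"Command '{normalized}' contains forbidden fragment '{fragment}'."
            )
    if normalized not in allowed:
        raise CommandRejected(
            f"Command '{normalized}' is not in allowed_commands."
        )
    return normalized


def run_command(command, allowed, forbidden, cwd, timeout=600):
    """Run an allowlisted command; return (exit_code or None, combined output)."""
    normalized = check_command(command, allowed, forbidden)
    try:
        proc = subprocess.run(
            shlex.split(normalized),  # no shell: arguments are passed literally
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, f"Command '{normalized}' timed out after {timeout}s."
    return proc.returncode, proc.stdout + proc.stderr
```

The caller would write the returned output and exit code to the stage's artifact file whether the command passed or not.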
## 10.3 Git Safety

v1 should support config option:

```yaml
require_clean_worktree: true | false
```

If true, abort when git working tree is dirty.

Do not implement automatic branch creation in v1.

Do not push.

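
The dirty-tree check can be sketched on top of `git status --porcelain`, which prints one line per changed or untracked path and nothing for a clean tree. `is_dirty` and `ensure_clean_worktree` are hypothetical names, and raising `RuntimeError` is an illustrative choice. Note that the artifact directory itself would show up as untracked unless the repository ignores it, so a real implementation would need to account for that.

```python
import subprocess


def is_dirty(porcelain_output: str) -> bool:
    """True when `git status --porcelain` reported at least one entry."""
    return any(line.strip() for line in porcelain_output.splitlines())


def ensure_clean_worktree(project_root: str, require_clean: bool) -> None:
    """Abort (by raising) if a clean tree is required and the tree is dirty."""
    if not require_clean:
        return
    proc = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=project_root,
        capture_output=True,
        text=True,
        check=True,  # surfaces "not a git repository" as CalledProcessError
    )
    if is_dirty(proc.stdout):
        raise RuntimeError(
            "Git working tree is dirty and require_clean_worktree is true. "
            "Commit or stash changes before running NightShift."
        )
```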

---

# 11. CLI Commands

Recommended initial CLI:

```bash
nightshift init
nightshift validate
nightshift run
nightshift run --task TASK-001
nightshift status
```

## 11.1 `nightshift init`

Creates example files:

* `nightshift.yaml`
* `tasks.md`
* `agents/planner.md`
* `agents/implementer.md`
* `agents/reviewer.md`

## 11.2 `nightshift validate`

Validates:

* config file exists
* task file exists
* scoped paths are inside root
* agents exist
* prompt files exist
* allowed commands are valid strings
* pipeline references valid agents

## 11.3 `nightshift run`

Runs the next incomplete task.

## 11.4 `nightshift run --task TASK-001`

Runs a specific task.

## 11.5 `nightshift status`

Prints:

* current config
* task count
* completed/incomplete tasks
* latest run directory

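
The command surface above maps naturally onto `argparse` subcommands. The sketch below only wires up parsing; `build_parser` is a hypothetical name, the help strings are paraphrased from this section, and dispatching to real handlers is left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the init / validate / run / status subcommands."""
    parser = argparse.ArgumentParser(
        prog="nightshift",
        description="Auditable local-first AI coding pipelines.",
    )
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("init", help="create example config, task, and prompt files")
    sub.add_parser("validate", help="check config, task file, agents, and commands")
    run = sub.add_parser("run", help="run the next incomplete task")
    run.add_argument(
        "--task",
        metavar="TASK_ID",
        default=None,
        help="run a specific task instead of the next incomplete one",
    )
    sub.add_parser("status", help="print config, task counts, and latest run")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"command={args.command}")
```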

---

# 12. Testing Strategy

Write tests early.

Minimum tests:

* config loading happy path
* config missing required fields
* markdown task parsing
* artifact directory creation
* path traversal rejection
* command allowlist behavior
* forbidden command rejection
* simple pipeline execution with fake agents
* retry limit behavior

Use fake agents for tests.

Do not require real LLM calls in unit tests.

---

# 13. MVP Task Checklist

## Phase 1: Skeleton

* [ ] Create project package/module layout
* [ ] Add CLI entry point
* [ ] Add `nightshift init`
* [ ] Generate example `nightshift.yaml`
* [ ] Generate example `tasks.md`
* [ ] Generate example agent prompt files

Acceptance Criteria:

* User can run init command
* Expected files are created
* Existing files are not overwritten without confirmation or force flag

---

## Phase 2: Config Loading

* [ ] Implement YAML config loader
* [ ] Define typed config objects
* [ ] Validate required sections
* [ ] Validate agent references
* [ ] Validate pipeline stages
* [ ] Add tests

Acceptance Criteria:

* Valid config loads
* Invalid config fails with clear error
* Pipeline stages cannot reference missing agents

---

## Phase 3: Safety Layer

* [ ] Implement project root resolution
* [ ] Implement scoped path validation
* [ ] Implement safe artifact path creation
* [ ] Implement command allowlist check
* [ ] Implement forbidden command fragment check
* [ ] Add tests for path traversal
* [ ] Add tests for forbidden commands

Acceptance Criteria:

* Cannot write outside project root
* Cannot execute commands outside allowlist
* Dangerous command fragments are blocked

---

## Phase 4: Task Parser

* [ ] Parse markdown task checklist
* [ ] Extract task id
* [ ] Extract title
* [ ] Extract description
* [ ] Extract acceptance criteria
* [ ] Support selecting next incomplete task
* [ ] Support selecting specific task id
* [ ] Add tests

Acceptance Criteria:

* Parser handles documented task format
* Parser returns useful errors for malformed tasks
* Task selection works

---

## Phase 5: Artifact Store

* [ ] Create `.nightshift/`
* [ ] Create per-run directory
* [ ] Create per-task directory
* [ ] Write config snapshot
* [ ] Write task snapshot
* [ ] Write stage outputs
* [ ] Write command outputs
* [ ] Write final task notes
* [ ] Add tests

Acceptance Criteria:

* Every run creates deterministic artifact structure
* Artifacts are present even when stages fail

---

## Phase 6: Command Executor

* [ ] Implement command stage execution
* [ ] Capture stdout
* [ ] Capture stderr
* [ ] Capture exit code
* [ ] Persist command output
* [ ] Return structured stage result
* [ ] Add tests with harmless commands

Acceptance Criteria:

* Passing command returns pass
* Failing command returns fail
* Output is written to artifact file

---

## Phase 7: Agent Executor

* [ ] Implement `command` backend agent
* [ ] Load system prompt file
* [ ] Build prompt bundle
* [ ] Pass prompt to command backend
* [ ] Capture output
* [ ] Persist output
* [ ] Return structured stage result
* [ ] Add fake-agent tests

Acceptance Criteria:

* Fake command agent can produce stage output
* Prompt includes task and acceptance criteria
* Agent output is stored in artifacts

---

## Phase 8: Pipeline Runner

* [ ] Execute configured stages in order
* [ ] Stop on unrecoverable failure
* [ ] Support `on_fail` stage redirection
* [ ] Track retry count
* [ ] Enforce max task retries
* [ ] Write per-stage summaries
* [ ] Add tests

Acceptance Criteria:

* Happy path pipeline completes
* Failed review can retry implementation
* Retry limit is enforced
* Final task status is recorded

---

## Phase 9: Context Manager

* [ ] Create project context file if absent
* [ ] Create task context file
* [ ] Include project context in agent prompt bundle
* [ ] Include prior stage notes in retry prompt
* [ ] Write `context-out.md`
* [ ] Add tests

Acceptance Criteria:

* Context files are created
* Agent prompt receives compact context
* Context output is persisted

---

## Phase 10: Reports

* [ ] Generate task final report
* [ ] Generate run summary
* [ ] Include task status
* [ ] Include retry count
* [ ] Include modified files if available
* [ ] Include test/static results
* [ ] Include artifact paths
* [ ] Add tests

Acceptance Criteria:

* User can inspect one summary after run
* Summary explains what happened without reading every artifact

---

## Phase 11: README

* [ ] Explain what NightShift is
* [ ] Explain what it is not
* [ ] Add quickstart
* [ ] Add config example
* [ ] Add task file example
* [ ] Add safety model explanation
* [ ] Add MVP status

Acceptance Criteria:

* A new user can understand and run the MVP
* README emphasizes reviewable output, not blind autonomy

---

# 14. Implementation Guidance

## 14.1 Prefer boring code

This project should be reliable.

Do not make clever abstractions before the simple pipeline works.

## 14.2 Tests are part of the product

This is an AI automation safety tool.

Tests are credibility.

## 14.3 Make errors helpful

Bad:

```text
ValueError: invalid config
```

Good:

```text
Config error: pipeline stage 'review_plan' references unknown agent 'critic'.
Defined agents: planner, implementer, reviewer.
```

## 14.4 Do not assume real LLMs in tests

Use fake command agents.

Real model integration can come later.

## 14.5 Keep artifacts human-readable

Prefer markdown, YAML, and plain text.

---

# 15. Suggested Agent Prompt Files

## `agents/planner.md`

```markdown
You are the planning agent for NightShift.

Your job is to create a conservative implementation plan for one coding task.

Rules:
- Do not write code.
- Identify relevant files.
- Preserve existing behavior.
- Prefer small changes.
- Include test strategy.
- Include risks.

Output:
# Plan

## Summary

## Relevant Files

## Steps

## Test Strategy

## Risks

## Acceptance Criteria Mapping
```

## `agents/implementer.md`

```markdown
You are the implementation agent for NightShift.

Your job is to implement the approved plan inside the scoped project directory.

Rules:
- Make the smallest correct change.
- Do not edit files outside scope.
- Do not skip tests intentionally.
- Preserve existing style.
- Write useful implementation notes.

Output:
# Implementation Notes

## Changed Files

## Summary

## Tests Added or Updated

## Risks

## Follow-up Notes
```

## `agents/reviewer.md`

```markdown
You are the review agent for NightShift.

Your job is to decide whether the current task should pass, retry implementation, retry planning, or fail.

Priorities:
1. Correctness
2. Safety
3. Acceptance criteria
4. Maintainability
5. Minimality

Output exactly:

status: pass | fail | retry | escalate
reason: <short explanation>
next_stage: <optional stage id>
context_update: <compact useful note>
```

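
Because the reviewer is asked for an exact `key: value` contract, the runner can parse its verdict with plain string handling. The sketch below is an assumption-laden illustration: `parse_review` is a hypothetical helper, and mapping unparseable output to `escalate` (rather than to a silent pass or an automatic retry) is a design choice the brief does not specify.

```python
VALID_STATUSES = {"pass", "fail", "retry", "escalate"}
KNOWN_KEYS = {"status", "reason", "next_stage", "context_update"}


def parse_review(text: str) -> dict:
    """Extract the reviewer verdict fields from `key: value` lines."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in KNOWN_KEYS:
            fields[key] = value.strip()

    status = fields.get("status")
    if status not in VALID_STATUSES:
        # Assumed policy: never treat a malformed verdict as a pass.
        return {
            "status": "escalate",
            "reason": f"unparseable review output (status={status!r})",
            "next_stage": None,
            "context_update": None,
        }
    return {
        "status": status,
        "reason": fields.get("reason", ""),
        "next_stage": fields.get("next_stage") or None,
        "context_update": fields.get("context_update") or None,
    }
```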

---

# 16. Definition of Done for MVP

NightShift MVP is done when:

* `nightshift init` creates a usable starter project
* `nightshift validate` catches bad config
* `nightshift run` can process one markdown task
* pipeline stages execute in order
* fake command agents work
* command stages run safely
* artifacts are written
* retry limits work
* final report is generated
* tests cover core safety and pipeline behavior

---

# 17. Future Features

Do not implement these until MVP is stable:

* DAG workflows
* parallel tasks
* Git branches per task
* remote workers
* cloud agent APIs
* dashboard UI
* prompt A/B testing
* model cost telemetry
* agent tournaments
* constraint-language experiments
* task dependency solver
* self-improving prompt library

---

# 18. Final Instruction to Codex

Build this incrementally.

Start with the smallest vertical slice:

```text
init -> validate -> parse one task -> create artifacts -> run fake pipeline -> write summary
```

Then add safety, retries, command execution, and real agent wrappers.

Do not build the cathedral before the generator turns on.

The goal is boring, auditable leverage.