Phase 2 work

Added /keys and /help keys through src/keybindings.rs, wired into src/app.rs and src/repl.rs. It documents key actions and explicit slash-command fallbacks for copy, explain, discard, context, and stance without adding a premature TUI dependency.
This commit is contained in:
K. Hodges 2026-06-08 03:36:31 -07:00
parent ba19b6bcff
commit a4769580b0
18 changed files with 783 additions and 22 deletions

2
Cargo.lock generated
View File

@ -100,7 +100,7 @@ dependencies = [
[[package]] [[package]]
name = "exoshell" name = "exoshell"
version = "0.3.0" version = "0.4.0"
dependencies = [ dependencies = [
"async-trait", "async-trait",
"futures-util", "futures-util",

View File

@ -1,6 +1,6 @@
[package] [package]
name = "exoshell" name = "exoshell"
version = "0.3.0" version = "0.4.0"
edition = "2024" edition = "2024"
license = "GPL-3.0-or-later" license = "GPL-3.0-or-later"

View File

@ -63,6 +63,17 @@ Select an operating stance:
cargo run -- --stance audit cargo run -- --stance audit
``` ```
Configure model routing:
```toml
[router]
enabled = true
model = "qwen2.5-coder:7b"
fallback_role = "coding"
```
The default router roles are `instant`, `coding`, `heavy`, and `conversational`. For Ollama model setup examples, see [khodges42/modelfiles](https://github.com/khodges42/modelfiles).
Exoshell suggests commands. It does not execute them. Exoshell suggests commands. It does not execute them.
## Quality Checks ## Quality Checks

View File

@ -18,6 +18,8 @@ Command suggestion is a shell fenced code block such as `powershell`, `pwsh`, `s
Transcript entry is a markdown record of user prompts, assistant responses, context events, budget warnings, stance changes, command suggestions, and command actions. Transcript entry is a markdown record of user prompts, assistant responses, context events, budget warnings, stance changes, command suggestions, and command actions.
Model route is an optional provider decision made before a request is answered. When model routing is enabled, a fast router model chooses one configured role. The selected role, target model, and reason are recorded in the transcript.
## Prompt Assembly ## Prompt Assembly
Prompt assembly is deterministic: Prompt assembly is deterministic:
@ -113,9 +115,11 @@ Useful commands:
```text ```text
/panel /panel
/keys
/context /context
/context stats /context stats
/help /help
/help keys
/help context /help context
/help stance /help stance
/help commands /help commands
@ -123,6 +127,8 @@ Useful commands:
`/panel` renders stance, shell family, provider/model, transcript state, context entries, and prompt estimates without requiring a TUI. `/panel` renders stance, shell family, provider/model, transcript state, context entries, and prompt estimates without requiring a TUI.
`/keys` documents the current line-REPL key actions and slash-command fallbacks. Advanced terminal key handling is not active yet; copy, explain, discard, context, and stance controls degrade to explicit slash commands.
## Non-Goals ## Non-Goals
Phase 2 does not make Exoshell an autonomous executor. Phase 2 does not make Exoshell an autonomous executor.
@ -141,4 +147,34 @@ Command parsing is intentionally simple and based on fenced blocks.
Risk detection is heuristic and incomplete. Risk detection is heuristic and incomplete.
Advanced TUI keybindings and config profiles remain planned work. Advanced full-screen TUI keybindings and config profiles remain planned work.
## Model Routing
The configurable model router lets a fast model inspect the prompt payload and choose the model role that should answer.
Default roles:
```text
instant qwen2.5-coder:7b
coding coder-g4-26b
heavy coder-g4-26b
conversational qwen2.5-coder:7b
```
Enable routing:
```toml
[router]
enabled = true
model = "qwen2.5-coder:7b"
fallback_role = "coding"
```
The router asks for compact JSON:
```json
{"role":"coding","reason":"source code change request"}
```
If the router fails or returns an unknown role, Exoshell uses the configured fallback role and records that reason in the transcript.

View File

@ -82,6 +82,60 @@ To replace the built-in defaults entirely:
include_defaults = false include_defaults = false
``` ```
## Configure Model Routing
Exoshell can route each prompt through a fast router model before selecting the model that should answer.
Enable the default router:
```toml
[router]
enabled = true
model = "qwen2.5-coder:7b"
fallback_role = "coding"
```
Default roles:
```text
instant qwen2.5-coder:7b
coding coder-g4-26b
heavy coder-g4-26b
conversational qwen2.5-coder:7b
```
Override role models or behavior:
```toml
[router]
enabled = true
model = "qwen2.5-coder:7b"
fallback_role = "coding"
behavior = "Prefer instant for short shell questions. Use heavy only for architecture or high-context analysis."
[[router.roles]]
name = "instant"
model = "qwen2.5-coder:7b"
description = "fast responses for simple prompts"
[[router.roles]]
name = "coding"
model = "coder-g4-26b"
description = "code edits, debugging, tests, and shell command construction"
[[router.roles]]
name = "heavy"
model = "coder-g4-26b"
description = "complex reasoning and architecture"
[[router.roles]]
name = "conversational"
model = "qwen2.5-coder:7b"
description = "general discussion and explanations"
```
For Ollama model setup examples, see [khodges42/modelfiles](https://github.com/khodges42/modelfiles).
## Start Exoshell ## Start Exoshell
Run with defaults: Run with defaults:
@ -246,6 +300,16 @@ exo> /panel
The panel includes stance, shell family, provider/model, transcript state, context entries, and prompt estimates. The panel includes stance, shell family, provider/model, transcript state, context entries, and prompt estimates.
## Keybinding Fallbacks
The current REPL is line-oriented. Use `/keys` to show the available key actions and their slash-command fallbacks:
```text
exo> /keys
```
Copy, explain, discard, context, and stance actions degrade to explicit commands such as `/copy <cmd-id>`, `/explain <cmd-id>`, `/discard <cmd-id>`, `/context`, and `/stance`.
## Multi-Line Prompts ## Multi-Line Prompts
Use `/multi` for longer prompts: Use `/multi` for longer prompts:

View File

@ -102,3 +102,4 @@ Historical codenames should be tracked in docs/versioning.md below
* 0.1.0 packet-kobold * 0.1.0 packet-kobold
* 0.2.0 context-relic * 0.2.0 context-relic
* 0.3.0 stance-lantern * 0.3.0 stance-lantern
* 0.4.0 switchboard-relic

View File

@ -9,6 +9,7 @@ use crate::context::{
render_context_details, render_context_list, render_context_stats, render_context_details, render_context_list, render_context_stats,
}; };
use crate::formatting::render_assistant_output_with_policy; use crate::formatting::render_assistant_output_with_policy;
use crate::keybindings::render_keybindings;
use crate::prompts::{Stance, assemble_prompt, render_prompt_estimate}; use crate::prompts::{Stance, assemble_prompt, render_prompt_estimate};
use crate::providers::{ChatMessage, ChatRequest, ChatResponse, ChatRole, Provider, ProviderError}; use crate::providers::{ChatMessage, ChatRequest, ChatResponse, ChatRole, Provider, ProviderError};
use crate::repl::ReplError; use crate::repl::ReplError;
@ -72,6 +73,9 @@ impl App {
Ok(Ok(ChatResponse::Complete(response))) => response, Ok(Ok(ChatResponse::Complete(response))) => response,
Ok(Ok(ChatResponse::Stream(chunks))) => chunks.concat(), Ok(Ok(ChatResponse::Stream(chunks))) => chunks.concat(),
Ok(Err(error)) => { Ok(Err(error)) => {
if let Some(route) = self.provider.last_model_route() {
self.transcript.record_model_route(&route);
}
self.transcript.record_error(&error.to_string()); self.transcript.record_error(&error.to_string());
return Err(error.into()); return Err(error.into());
} }
@ -79,6 +83,9 @@ impl App {
self.conversation self.conversation
.push(ChatMessage::new(ChatRole::Assistant, response.clone())); .push(ChatMessage::new(ChatRole::Assistant, response.clone()));
self.transcript.record_assistant(&response); self.transcript.record_assistant(&response);
if let Some(route) = self.provider.last_model_route() {
self.transcript.record_model_route(&route);
}
self.last_command_suggestions = self.last_command_suggestions =
parse_command_suggestions_with_policy(&response, &self.config.commands.risk); parse_command_suggestions_with_policy(&response, &self.config.commands.risk);
for suggestion in &self.last_command_suggestions { for suggestion in &self.last_command_suggestions {
@ -102,6 +109,10 @@ impl App {
return Ok(help_overview().into()); return Ok(help_overview().into());
} }
if trimmed == "/keys" {
return Ok(render_keybindings());
}
if let Some(topic) = trimmed.strip_prefix("/help ") { if let Some(topic) = trimmed.strip_prefix("/help ") {
return Ok(help_topic(topic.trim()).into()); return Ok(help_topic(topic.trim()).into());
} }
@ -328,10 +339,11 @@ impl App {
fn render_panel(&self) -> String { fn render_panel(&self) -> String {
format!( format!(
"Exoshell session\nstance: {}\nshell: {}\nprovider: openai-compatible\nmodel: {}\ntranscript: {}\n\nContext\n{}\n\nPrompt estimate\n{}", "Exoshell session\nstance: {}\nshell: {}\nprovider: openai-compatible\nmodel: {}\nrouter: {}\ntranscript: {}\n\nContext\n{}\n\nPrompt estimate\n{}",
self.config.interaction.stance, self.config.interaction.stance,
self.config.shell.family, self.config.shell.family,
self.config.provider.model, self.config.provider.model,
self.render_router_status(),
if self.config.transcript.enabled { if self.config.transcript.enabled {
"enabled" "enabled"
} else { } else {
@ -342,6 +354,25 @@ impl App {
) )
} }
fn render_router_status(&self) -> String {
if !self.config.router.enabled {
return "disabled".into();
}
let roles = self
.config
.router
.roles
.iter()
.map(|role| format!("{}={}", role.name, role.model))
.collect::<Vec<_>>()
.join(", ");
format!(
"enabled model={} fallback={} roles=[{}]",
self.config.router.model, self.config.router.fallback_role, roles
)
}
fn set_stance(&mut self, input: &str) -> Result<String, AppError> { fn set_stance(&mut self, input: &str) -> Result<String, AppError> {
let stance = input let stance = input
.parse::<Stance>() .parse::<Stance>()
@ -519,6 +550,7 @@ fn help_overview() -> &'static str {
/explain <cmd-id> explain a suggested command /explain <cmd-id> explain a suggested command
/discard <cmd-id> mark a suggested command as discarded /discard <cmd-id> mark a suggested command as discarded
/panel show session, stance, provider, and context state /panel show session, stance, provider, and context state
/keys show keybinding fallbacks for the line REPL
/multi enter multi-line input /multi enter multi-line input
/exit quit and write transcript if enabled /exit quit and write transcript if enabled
@ -536,7 +568,10 @@ fn help_topic(topic: &str) -> &'static str {
"commands" => { "commands" => {
"Suggested commands appear as fenced shell blocks and get IDs such as cmd-001. Use /copy, /explain, or /discard by ID. Copy prints the command when clipboard support is unavailable and never runs it." "Suggested commands appear as fenced shell blocks and get IDs such as cmd-001. Use /copy, /explain, or /discard by ID. Copy prints the command when clipboard support is unavailable and never runs it."
} }
_ => "Unknown help topic. Try /help context, /help stance, or /help commands.", "keys" => {
"The current line REPL does not install advanced terminal keybindings. Use /keys to see the predictable slash-command fallbacks for copy, explain, discard, context, and stance actions."
}
_ => "Unknown help topic. Try /help context, /help stance, /help commands, or /help keys.",
} }
} }
@ -861,6 +896,16 @@ mod tests {
.expect("help") .expect("help")
.contains("/copy") .contains("/copy")
); );
assert!(
app.handle_command("/keys")
.expect("keys")
.contains("/discard <cmd-id>")
);
assert!(
app.handle_command("/help keys")
.expect("help keys")
.contains("line REPL")
);
} }
#[tokio::test] #[tokio::test]
@ -961,6 +1006,7 @@ mod tests {
model: "test-model".into(), model: "test-model".into(),
request_timeout_seconds: 120, request_timeout_seconds: 120,
}, },
router: crate::providers::router::ModelRouterConfig::default(),
shell: ShellConfig { shell: ShellConfig {
family: ShellFamily::PowerShell, family: ShellFamily::PowerShell,
}, },

View File

@ -156,7 +156,8 @@ impl CommandRisk {
} }
} }
pub fn parse_command_suggestions(response: &str) -> Vec<CommandSuggestion> { #[cfg(test)]
fn parse_command_suggestions(response: &str) -> Vec<CommandSuggestion> {
parse_command_suggestions_with_policy(response, &CommandRiskPolicy::default()) parse_command_suggestions_with_policy(response, &CommandRiskPolicy::default())
} }
@ -218,7 +219,8 @@ pub fn parse_command_suggestions_with_policy(
suggestions suggestions
} }
pub fn detect_command_risk(command: &str, shell: CommandShell) -> CommandRisk { #[cfg(test)]
fn detect_command_risk(command: &str, shell: CommandShell) -> CommandRisk {
detect_command_risk_with_policy(command, shell, &CommandRiskPolicy::default()) detect_command_risk_with_policy(command, shell, &CommandRiskPolicy::default())
} }

View File

@ -8,11 +8,13 @@ use crate::app::CliOptions;
use crate::commands::{CommandRiskPolicy, CommandRiskRule}; use crate::commands::{CommandRiskPolicy, CommandRiskRule};
use crate::context::ContextBudget; use crate::context::ContextBudget;
use crate::prompts::Stance; use crate::prompts::Stance;
use crate::providers::router::{ModelRouterConfig, ModelRouterRole};
use crate::shell::ShellFamily; use crate::shell::ShellFamily;
#[derive(Debug, Clone, PartialEq, Eq)] #[derive(Debug, Clone, PartialEq, Eq)]
pub struct Config { pub struct Config {
pub provider: ProviderConfig, pub provider: ProviderConfig,
pub router: ModelRouterConfig,
pub shell: ShellConfig, pub shell: ShellConfig,
pub interaction: InteractionConfig, pub interaction: InteractionConfig,
pub commands: CommandConfig, pub commands: CommandConfig,
@ -68,6 +70,7 @@ impl ContextConfig {
#[derive(Debug, Deserialize, Default)] #[derive(Debug, Deserialize, Default)]
struct RawConfig { struct RawConfig {
provider: Option<RawProviderConfig>, provider: Option<RawProviderConfig>,
router: Option<RawModelRouterConfig>,
shell: Option<RawShellConfig>, shell: Option<RawShellConfig>,
interaction: Option<RawInteractionConfig>, interaction: Option<RawInteractionConfig>,
commands: Option<RawCommandConfig>, commands: Option<RawCommandConfig>,
@ -83,6 +86,22 @@ struct RawProviderConfig {
request_timeout_seconds: Option<u64>, request_timeout_seconds: Option<u64>,
} }
#[derive(Debug, Deserialize, Default)]
struct RawModelRouterConfig {
enabled: Option<bool>,
model: Option<String>,
fallback_role: Option<String>,
behavior: Option<String>,
roles: Option<Vec<RawModelRouterRole>>,
}
#[derive(Debug, Deserialize, Default)]
struct RawModelRouterRole {
name: Option<String>,
model: Option<String>,
description: Option<String>,
}
#[derive(Debug, Deserialize, Default)] #[derive(Debug, Deserialize, Default)]
struct RawShellConfig { struct RawShellConfig {
family: Option<String>, family: Option<String>,
@ -128,6 +147,7 @@ impl Config {
fn from_raw(raw: RawConfig) -> Result<Self, ConfigError> { fn from_raw(raw: RawConfig) -> Result<Self, ConfigError> {
let provider = raw.provider.unwrap_or_default(); let provider = raw.provider.unwrap_or_default();
let router = raw.router.unwrap_or_default();
let shell = raw.shell.unwrap_or_default(); let shell = raw.shell.unwrap_or_default();
let interaction = raw.interaction.unwrap_or_default(); let interaction = raw.interaction.unwrap_or_default();
let commands = raw.commands.unwrap_or_default(); let commands = raw.commands.unwrap_or_default();
@ -141,6 +161,7 @@ impl Config {
.api_key_env .api_key_env
.unwrap_or_else(|| "OPENAI_API_KEY".into()); .unwrap_or_else(|| "OPENAI_API_KEY".into());
let api_key = provider_api_key(&base_url, &api_key_env)?; let api_key = provider_api_key(&base_url, &api_key_env)?;
let router = model_router_config(router)?;
let family = shell.family.unwrap_or_else(default_shell_family); let family = shell.family.unwrap_or_else(default_shell_family);
let family = family let family = family
@ -161,6 +182,7 @@ impl Config {
model: provider.model.unwrap_or_else(|| "gpt-4.1-mini".into()), model: provider.model.unwrap_or_else(|| "gpt-4.1-mini".into()),
request_timeout_seconds: provider.request_timeout_seconds.unwrap_or(120), request_timeout_seconds: provider.request_timeout_seconds.unwrap_or(120),
}, },
router,
shell: ShellConfig { family }, shell: ShellConfig { family },
interaction: InteractionConfig { stance }, interaction: InteractionConfig { stance },
commands: CommandConfig { risk }, commands: CommandConfig { risk },
@ -196,6 +218,61 @@ impl Config {
} }
} }
fn model_router_config(raw: RawModelRouterConfig) -> Result<ModelRouterConfig, ConfigError> {
let mut config = ModelRouterConfig::default();
if let Some(enabled) = raw.enabled {
config.enabled = enabled;
}
if let Some(model) = raw.model {
config.model = non_empty_config_value("router.model", model)?;
}
if let Some(fallback_role) = raw.fallback_role {
config.fallback_role = non_empty_config_value("router.fallback_role", fallback_role)?;
}
if let Some(behavior) = raw.behavior {
config.behavior = non_empty_config_value("router.behavior", behavior)?;
}
if let Some(roles) = raw.roles {
let mut parsed = Vec::new();
for role in roles {
parsed.push(ModelRouterRole {
name: non_empty_config_value(
"router.roles.name",
role.name.ok_or_else(|| {
ConfigError::Invalid("router.roles entries require name".into())
})?,
)?,
model: non_empty_config_value(
"router.roles.model",
role.model.ok_or_else(|| {
ConfigError::Invalid("router.roles entries require model".into())
})?,
)?,
description: non_empty_config_value(
"router.roles.description",
role.description.ok_or_else(|| {
ConfigError::Invalid("router.roles entries require description".into())
})?,
)?,
});
}
config.roles = parsed;
}
config
.validate()
.map_err(|error| ConfigError::Invalid(error.to_string()))?;
Ok(config)
}
fn non_empty_config_value(name: &str, value: String) -> Result<String, ConfigError> {
if value.trim().is_empty() {
Err(ConfigError::Invalid(format!("{name} cannot be empty")))
} else {
Ok(value)
}
}
fn command_risk_policy( fn command_risk_policy(
raw: Option<RawCommandRiskConfig>, raw: Option<RawCommandRiskConfig>,
) -> Result<CommandRiskPolicy, ConfigError> { ) -> Result<CommandRiskPolicy, ConfigError> {
@ -385,6 +462,7 @@ mod tests {
shell: Some(RawShellConfig { shell: Some(RawShellConfig {
family: Some("cmd".into()), family: Some("cmd".into()),
}), }),
router: None,
interaction: None, interaction: None,
commands: None, commands: None,
transcript: None, transcript: None,
@ -406,6 +484,22 @@ base_url = "http://localhost:11434/v1"
model = "local-model" model = "local-model"
request_timeout_seconds = 45 request_timeout_seconds = 45
[router]
enabled = true
model = "qwen2.5-coder:7b"
fallback_role = "instant"
behavior = "Route to the smallest model that can answer well."
[[router.roles]]
name = "instant"
model = "qwen2.5-coder:7b"
description = "fast answers"
[[router.roles]]
name = "heavy"
model = "coder-g4-26b"
description = "deep technical work"
[shell] [shell]
family = "posix" family = "posix"
@ -436,6 +530,11 @@ max_estimated_tokens = 3000
assert_eq!(config.provider.api_key, "exoshell-local-provider"); assert_eq!(config.provider.api_key, "exoshell-local-provider");
assert_eq!(config.provider.model, "local-model"); assert_eq!(config.provider.model, "local-model");
assert_eq!(config.provider.request_timeout_seconds, 45); assert_eq!(config.provider.request_timeout_seconds, 45);
assert!(config.router.enabled);
assert_eq!(config.router.model, "qwen2.5-coder:7b");
assert_eq!(config.router.fallback_role, "instant");
assert_eq!(config.router.roles.len(), 2);
assert_eq!(config.router.roles[1].model, "coder-g4-26b");
assert_eq!(config.shell.family, ShellFamily::Posix); assert_eq!(config.shell.family, ShellFamily::Posix);
assert_eq!(config.interaction.stance, Stance::Audit); assert_eq!(config.interaction.stance, Stance::Audit);
assert!(config.commands.risk.include_defaults); assert!(config.commands.risk.include_defaults);
@ -470,6 +569,36 @@ max_estimated_tokens = 3000
assert_eq!(config.provider.request_timeout_seconds, 120); assert_eq!(config.provider.request_timeout_seconds, 120);
assert!(config.commands.risk.include_defaults); assert!(config.commands.risk.include_defaults);
assert!(config.commands.risk.rules.is_empty()); assert!(config.commands.risk.rules.is_empty());
assert!(!config.router.enabled);
assert_eq!(
config
.router
.role("conversational")
.expect("conversational")
.model,
"qwen2.5-coder:7b"
);
}
#[test]
fn rejects_router_fallback_role_that_is_not_defined() {
let mut file = tempfile::NamedTempFile::new().expect("temp config");
write!(
file,
r#"
[provider]
base_url = "http://localhost:11434/v1"
[router]
enabled = true
fallback_role = "missing"
"#
)
.expect("write config");
let error = Config::load(Some(file.path())).expect_err("config should fail");
assert!(error.to_string().contains("fallback role"));
} }
#[test] #[test]

View File

@ -2,7 +2,8 @@ use crate::commands::{
CommandRiskPolicy, parse_command_suggestions_with_policy, render_suggestions, CommandRiskPolicy, parse_command_suggestions_with_policy, render_suggestions,
}; };
pub fn render_assistant_output(response: &str) -> String { #[cfg(test)]
fn render_assistant_output(response: &str) -> String {
render_assistant_output_with_policy(response, &CommandRiskPolicy::default()) render_assistant_output_with_policy(response, &CommandRiskPolicy::default())
} }

75
src/keybindings.rs Normal file
View File

@ -0,0 +1,75 @@
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Keybinding {
pub key: &'static str,
pub action: &'static str,
pub fallback: &'static str,
}
pub const BASIC_KEYBINDINGS: &[Keybinding] = &[
Keybinding {
key: "Enter",
action: "send the current prompt",
fallback: "type a prompt and press Enter",
},
Keybinding {
key: "Ctrl+C",
action: "interrupt the current terminal operation",
fallback: "keyboard interrupt remains handled by the terminal",
},
Keybinding {
key: "copy",
action: "copy or print a suggested command",
fallback: "/copy <cmd-id>",
},
Keybinding {
key: "explain",
action: "explain a suggested command",
fallback: "/explain <cmd-id>",
},
Keybinding {
key: "discard",
action: "discard a suggested command",
fallback: "/discard <cmd-id>",
},
Keybinding {
key: "context",
action: "show attached context",
fallback: "/context",
},
Keybinding {
key: "stance",
action: "show or change stance",
fallback: "/stance",
},
];
pub fn render_keybindings() -> String {
let mut rendered = String::from("Keybindings and fallbacks\n");
rendered.push_str("Advanced terminal key handling is not active in the line REPL.\n");
rendered.push_str("Use these slash commands when direct keybindings are unavailable.\n\n");
for binding in BASIC_KEYBINDINGS {
rendered.push_str(&format!(
"- {}: {}; fallback: {}\n",
binding.key, binding.action, binding.fallback
));
}
rendered.trim_end().to_string()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn renders_keybinding_fallbacks() {
let output = render_keybindings();
assert!(output.contains("/copy <cmd-id>"));
assert!(output.contains("/explain <cmd-id>"));
assert!(output.contains("/discard <cmd-id>"));
assert!(output.contains("/context"));
assert!(output.contains("Ctrl+C"));
}
}

View File

@ -3,6 +3,7 @@ mod commands;
mod config; mod config;
pub mod context; pub mod context;
mod formatting; mod formatting;
mod keybindings;
mod prompts; mod prompts;
mod providers; mod providers;
mod repl; mod repl;
@ -14,6 +15,7 @@ use std::io::{IsTerminal, Read};
use crate::app::{App, CliOptions}; use crate::app::{App, CliOptions};
use crate::config::Config; use crate::config::Config;
use crate::providers::openai_compatible::OpenAiCompatibleProvider; use crate::providers::openai_compatible::OpenAiCompatibleProvider;
use crate::providers::router::ModelRouterProvider;
use crate::repl::Repl; use crate::repl::Repl;
#[tokio::main] #[tokio::main]
@ -34,8 +36,16 @@ async fn run() -> Result<(), app::AppError> {
let mut config = Config::load(options.config_path.as_deref())?; let mut config = Config::load(options.config_path.as_deref())?;
config.apply_cli_overrides(&options)?; config.apply_cli_overrides(&options)?;
let provider = OpenAiCompatibleProvider::from_config(&config)?; let base_provider = OpenAiCompatibleProvider::from_config(&config)?;
let mut app = App::new(config, Box::new(provider)); let provider: Box<dyn crate::providers::Provider> = if config.router.enabled {
Box::new(ModelRouterProvider::new(
base_provider,
config.router.clone(),
)?)
} else {
Box::new(base_provider)
};
let mut app = App::new(config, provider);
for note in options.context_notes { for note in options.context_notes {
println!("{}", app.add_note_context(note)?); println!("{}", app.add_note_context(note)?);

View File

@ -5,9 +5,10 @@ use crate::context::{ContextBudget, ContextEntry, ContextSize, render_prompt_con
use crate::providers::{ChatMessage, ChatRole}; use crate::providers::{ChatMessage, ChatRole};
use crate::shell::ShellFamily; use crate::shell::ShellFamily;
#[derive(Debug, Clone, Copy, PartialEq, Eq, serde::Serialize, serde::Deserialize)] #[derive(Debug, Default, Clone, Copy, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
#[serde(rename_all = "snake_case")] #[serde(rename_all = "snake_case")]
pub enum Stance { pub enum Stance {
#[default]
Operator, Operator,
Audit, Audit,
Teach, Teach,
@ -47,12 +48,6 @@ impl Stance {
} }
} }
impl Default for Stance {
fn default() -> Self {
Self::Operator
}
}
impl fmt::Display for Stance { impl fmt::Display for Stance {
fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result { fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
match self { match self {

View File

@ -1,4 +1,5 @@
pub mod openai_compatible; pub mod openai_compatible;
pub mod router;
#[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)] #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
pub struct ChatMessage { pub struct ChatMessage {
@ -35,9 +36,20 @@ pub enum ChatResponse {
Stream(Vec<String>), Stream(Vec<String>),
} }
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModelRoute {
pub role: String,
pub model: String,
pub reason: String,
}
#[async_trait::async_trait] #[async_trait::async_trait]
pub trait Provider: Send + Sync { pub trait Provider: Send + Sync {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError>; async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError>;
fn last_model_route(&self) -> Option<ModelRoute> {
None
}
} }
#[derive(Debug, thiserror::Error)] #[derive(Debug, thiserror::Error)]

View File

@ -53,17 +53,25 @@ impl OpenAiCompatibleProvider {
fn chat_url(&self) -> String { fn chat_url(&self) -> String {
format!("{}/chat/completions", self.base_url) format!("{}/chat/completions", self.base_url)
} }
}
#[async_trait::async_trait] pub async fn chat_with_model(
impl Provider for OpenAiCompatibleProvider { &self,
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError> { request: ChatRequest,
model: &str,
) -> Result<ChatResponse, ProviderError> {
let payload = ChatCompletionRequest { let payload = ChatCompletionRequest {
model: self.model.clone(), model: model.to_string(),
messages: request.messages, messages: request.messages,
stream: request.stream, stream: request.stream,
}; };
self.send_chat_completion(payload).await
}
async fn send_chat_completion(
&self,
payload: ChatCompletionRequest,
) -> Result<ChatResponse, ProviderError> {
let response = self let response = self
.client .client
.post(self.chat_url()) .post(self.chat_url())
@ -108,6 +116,19 @@ impl Provider for OpenAiCompatibleProvider {
} }
} }
#[async_trait::async_trait]
impl Provider for OpenAiCompatibleProvider {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError> {
let payload = ChatCompletionRequest {
model: self.model.clone(),
messages: request.messages,
stream: request.stream,
};
self.send_chat_completion(payload).await
}
}
async fn read_streaming_response( async fn read_streaming_response(
response: reqwest::Response, response: reqwest::Response,
) -> Result<ChatResponse, ProviderError> { ) -> Result<ChatResponse, ProviderError> {

333
src/providers/router.rs Normal file
View File

@ -0,0 +1,333 @@
use std::fmt;
use std::sync::Mutex;
use serde::Deserialize;
use crate::providers::openai_compatible::OpenAiCompatibleProvider;
use crate::providers::{
ChatMessage, ChatRequest, ChatResponse, ChatRole, ModelRoute, Provider, ProviderError,
};
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModelRouterConfig {
pub enabled: bool,
pub model: String,
pub fallback_role: String,
pub behavior: String,
pub roles: Vec<ModelRouterRole>,
}
impl ModelRouterConfig {
pub fn validate(&self) -> Result<(), ModelRouterConfigError> {
if self.model.trim().is_empty() {
return Err(ModelRouterConfigError::Invalid(
"router model is empty".into(),
));
}
if self.fallback_role.trim().is_empty() {
return Err(ModelRouterConfigError::Invalid(
"router fallback role is empty".into(),
));
}
if self.behavior.trim().is_empty() {
return Err(ModelRouterConfigError::Invalid(
"router behavior is empty".into(),
));
}
if self.roles.is_empty() {
return Err(ModelRouterConfigError::Invalid(
"router requires at least one role".into(),
));
}
for role in &self.roles {
if role.name.trim().is_empty() {
return Err(ModelRouterConfigError::Invalid(
"router role name is empty".into(),
));
}
if role.model.trim().is_empty() {
return Err(ModelRouterConfigError::Invalid(format!(
"router role '{}' has an empty model",
role.name
)));
}
if role.description.trim().is_empty() {
return Err(ModelRouterConfigError::Invalid(format!(
"router role '{}' has an empty description",
role.name
)));
}
}
if self.role(&self.fallback_role).is_none() {
return Err(ModelRouterConfigError::Invalid(format!(
"router fallback role '{}' is not defined",
self.fallback_role
)));
}
Ok(())
}
pub fn role(&self, name: &str) -> Option<&ModelRouterRole> {
self.roles.iter().find(|role| role.name == name)
}
fn fallback_route(&self, reason: impl Into<String>) -> ModelRoute {
let role = self
.role(&self.fallback_role)
.expect("validated router fallback role exists");
ModelRoute {
role: role.name.clone(),
model: role.model.clone(),
reason: reason.into(),
}
}
}
impl Default for ModelRouterConfig {
fn default() -> Self {
Self {
enabled: false,
model: "qwen2.5-coder:7b".into(),
fallback_role: "coding".into(),
behavior: default_router_behavior().into(),
roles: default_router_roles(),
}
}
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModelRouterRole {
pub name: String,
pub model: String,
pub description: String,
}
#[derive(Debug, thiserror::Error, PartialEq, Eq)]
pub enum ModelRouterConfigError {
#[error("invalid model router config: {0}")]
Invalid(String),
}
pub struct ModelRouterProvider {
provider: OpenAiCompatibleProvider,
config: ModelRouterConfig,
last_route: Mutex<Option<ModelRoute>>,
}
impl ModelRouterProvider {
pub fn new(
provider: OpenAiCompatibleProvider,
config: ModelRouterConfig,
) -> Result<Self, ProviderError> {
config
.validate()
.map_err(|error| ProviderError::Configuration(error.to_string()))?;
Ok(Self {
provider,
config,
last_route: Mutex::new(None),
})
}
async fn route(&self, request: &ChatRequest) -> ModelRoute {
let router_request = ChatRequest {
messages: vec![
ChatMessage::new(ChatRole::System, self.router_system_prompt()),
ChatMessage::new(ChatRole::User, router_user_prompt(request)),
],
stream: false,
};
let response = self
.provider
.chat_with_model(router_request, &self.config.model)
.await;
match response {
Ok(ChatResponse::Complete(content)) => parse_router_response(&content, &self.config)
.unwrap_or_else(|| {
self.config
.fallback_route(format!("router returned unrecognized role: {content}"))
}),
Ok(ChatResponse::Stream(chunks)) => {
let content = chunks.concat();
parse_router_response(&content, &self.config).unwrap_or_else(|| {
self.config
.fallback_route(format!("router returned unrecognized role: {content}"))
})
}
Err(error) => self
.config
.fallback_route(format!("router failed: {error}")),
}
}
fn router_system_prompt(&self) -> String {
let mut prompt = String::new();
prompt.push_str("You are Exoshell's model router. Choose exactly one role for the next assistant response.\n");
prompt.push_str(&self.config.behavior);
prompt.push_str("\n\nAvailable roles:\n");
for role in &self.config.roles {
prompt.push_str(&format!(
"- {}: {} (model: {})\n",
role.name, role.description, role.model
));
}
prompt.push_str(
"\nRespond as compact JSON only: {\"role\":\"<role>\",\"reason\":\"<short reason>\"}",
);
prompt
}
}
#[async_trait::async_trait]
impl Provider for ModelRouterProvider {
async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, ProviderError> {
let route = self.route(&request).await;
{
let mut last_route = self
.last_route
.lock()
.expect("model route lock should not be poisoned");
*last_route = Some(route.clone());
}
self.provider.chat_with_model(request, &route.model).await
}
fn last_model_route(&self) -> Option<ModelRoute> {
self.last_route
.lock()
.expect("model route lock should not be poisoned")
.clone()
}
}
fn router_user_prompt(request: &ChatRequest) -> String {
let mut rendered = String::new();
rendered.push_str("Route this request using the current prompt payload.\n\n");
for message in &request.messages {
rendered.push_str(&format!("{:?}:\n{}\n\n", message.role, message.content));
}
rendered
}
fn parse_router_response(content: &str, config: &ModelRouterConfig) -> Option<ModelRoute> {
if let Ok(response) = serde_json::from_str::<RouterResponse>(content.trim())
&& let Some(role) = config.role(response.role.trim())
{
return Some(ModelRoute {
role: role.name.clone(),
model: role.model.clone(),
reason: response
.reason
.unwrap_or_else(|| "router selected role".into()),
});
}
let lowered = content.to_ascii_lowercase();
config.roles.iter().find_map(|role| {
if lowered.contains(&role.name.to_ascii_lowercase()) {
Some(ModelRoute {
role: role.name.clone(),
model: role.model.clone(),
reason: "router selected role from text response".into(),
})
} else {
None
}
})
}
#[derive(Debug, Deserialize)]
struct RouterResponse {
role: String,
reason: Option<String>,
}
fn default_router_behavior() -> &'static str {
"Prefer the cheapest and fastest role that is likely to answer well. Use instant for simple routing, short shell questions, quick lookups, and low-risk responses. Use coding for source changes, debugging, tests, command construction, and repository work. Use heavy for complex architecture, multi-step reasoning, risky operational analysis, or large-context synthesis. Use conversational for general explanation, planning, and non-code discussion."
}
fn default_router_roles() -> Vec<ModelRouterRole> {
vec![
ModelRouterRole {
name: "instant".into(),
model: "qwen2.5-coder:7b".into(),
description: "fast responses for simple prompts and low-latency checks".into(),
},
ModelRouterRole {
name: "coding".into(),
model: "coder-g4-26b".into(),
description: "code edits, debugging, tests, and shell command construction".into(),
},
ModelRouterRole {
name: "heavy".into(),
model: "coder-g4-26b".into(),
description: "complex reasoning, architecture, and high-context technical work".into(),
},
ModelRouterRole {
name: "conversational".into(),
model: "qwen2.5-coder:7b".into(),
description: "general discussion, planning, and explanations".into(),
},
]
}
impl fmt::Debug for ModelRouterProvider {
fn fmt(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
formatter
.debug_struct("ModelRouterProvider")
.field("config", &self.config)
.finish_non_exhaustive()
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn default_router_roles_match_expected_models() {
let config = ModelRouterConfig::default();
assert_eq!(
config.role("instant").expect("instant").model,
"qwen2.5-coder:7b"
);
assert_eq!(
config.role("conversational").expect("conversational").model,
"qwen2.5-coder:7b"
);
assert_eq!(config.role("coding").expect("coding").model, "coder-g4-26b");
assert_eq!(config.role("heavy").expect("heavy").model, "coder-g4-26b");
config.validate().expect("default config validates");
}
#[test]
fn parses_json_router_response() {
let config = ModelRouterConfig::default();
let route = parse_router_response(
r#"{"role":"heavy","reason":"large context architecture"}"#,
&config,
)
.expect("route");
assert_eq!(route.role, "heavy");
assert_eq!(route.model, "coder-g4-26b");
assert_eq!(route.reason, "large context architecture");
}
#[test]
fn rejects_missing_fallback_role() {
let config = ModelRouterConfig {
fallback_role: "missing".into(),
..ModelRouterConfig::default()
};
assert!(config.validate().is_err());
}
}

View File

@ -50,6 +50,7 @@ impl Repl {
|| input.starts_with("/explain ") || input.starts_with("/explain ")
|| input.starts_with("/discard ") || input.starts_with("/discard ")
|| input.starts_with("/help") || input.starts_with("/help")
|| input == "/keys"
|| input == "/panel" || input == "/panel"
|| input.starts_with("/add-note ") || input.starts_with("/add-note ")
|| input.starts_with("/add-file ") || input.starts_with("/add-file ")

View File

@ -5,6 +5,7 @@ use std::time::{SystemTime, UNIX_EPOCH};
use crate::commands::CommandSuggestion; use crate::commands::CommandSuggestion;
use crate::context::{ContextEntry, redacted_provider_details}; use crate::context::{ContextEntry, redacted_provider_details};
use crate::prompts::Stance; use crate::prompts::Stance;
use crate::providers::ModelRoute;
use crate::shell::ShellFamily; use crate::shell::ShellFamily;
#[derive(Debug, Clone)] #[derive(Debug, Clone)]
@ -93,6 +94,14 @@ impl Transcript {
}); });
} }
pub fn record_model_route(&mut self, route: &ModelRoute) {
self.entries.push(TranscriptEntry::ModelRoute {
role: route.role.clone(),
model: route.model.clone(),
reason: route.reason.clone(),
});
}
pub fn write_to_dir(&self, directory: &Path) -> Result<PathBuf, TranscriptError> { pub fn write_to_dir(&self, directory: &Path) -> Result<PathBuf, TranscriptError> {
fs::create_dir_all(directory).map_err(|error| TranscriptError::CreateDir { fs::create_dir_all(directory).map_err(|error| TranscriptError::CreateDir {
path: directory.to_path_buf(), path: directory.to_path_buf(),
@ -209,6 +218,16 @@ impl Transcript {
markdown.push_str(&format!("- action: `{action}`\n")); markdown.push_str(&format!("- action: `{action}`\n"));
markdown.push_str(&format!("- note: `{note}`\n\n")); markdown.push_str(&format!("- note: `{note}`\n\n"));
} }
TranscriptEntry::ModelRoute {
role,
model,
reason,
} => {
markdown.push_str("## Model Route\n\n");
markdown.push_str(&format!("- role: `{role}`\n"));
markdown.push_str(&format!("- model: `{model}`\n"));
markdown.push_str(&format!("- reason: `{reason}`\n\n"));
}
} }
} }
@ -253,6 +272,11 @@ enum TranscriptEntry {
action: String, action: String,
note: String, note: String,
}, },
ModelRoute {
role: String,
model: String,
reason: String,
},
} }
fn unix_millis() -> u128 { fn unix_millis() -> u128 {