Added the Whitepaper and System Arch docs

parent 487fc433b6
commit 8a5337dcde

Docs/SCL Whitepaper.md (normal file, 71 lines added)
@@ -0,0 +1,71 @@

## Synapse Command Language (SCL): Transforming Large Language Models into Deterministic State Machines

### Abstract

The rapid advancement of Large Language Models (LLMs) has fed a pervasive misconception: that these models are akin to living organisms, possessing a "magical" black-box superintelligence. The Synapse Command Language (SCL) sets out to disprove this notion. SCL is a lightweight, highly structured syntax designed to force an LLM to operate strictly as a deterministic state machine. By stripping away conversational overhead and enforcing a rigid command-and-response protocol, SCL demonstrates that LLMs are fundamentally massive computational engines capable of driving complex, deterministic operations.

Furthermore, while tech giants like Microsoft and Nvidia have invested years of R&D and millions of dollars to integrate AI deeply into operating systems, the SCL framework achieves a comparable level of OS integration in a matter of days, at zero cost. This white paper outlines the SCL specification, its philosophical implications, its advantages over existing agent frameworks, and its potential applications.

### 1. The Philosophy: LLMs as State Machines

The core thesis of SCL is that an LLM is not an autonomous "Agent" with free will, but a highly advanced text-processing computer. When properly constrained by a strict system prompt and a rigid syntax, an LLM can be forced into a state-machine loop:

1. **Input State:** The LLM receives a Base64-encoded user request and the current system context.
2. **Transition State:** The LLM processes the request and outputs a deterministic SCL command.
3. **Execution State:** The local client parses the command, executes it on the host machine, and returns the result to the LLM.
4. **Resolution State:** The LLM evaluates the result and either issues a subsequent command or concludes the operation.

This project demonstrates that AI can be used reliably for deterministic computing, bridging the gap between natural language understanding and rigid system execution.

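
The four states can be sketched as a small driver loop. This is a minimal illustration, not the project's client code: `runSclLoop`, `askModel`, `executeCommand`, and the `maxTurns` budget are hypothetical names, and the rule that a reply without the `~` trigger concludes the operation is an assumption. The request is assumed to be Base64-encoded by the caller already.

```javascript
// Sketch of the SCL state-machine loop. All names here are hypothetical.
// askModel(message)      -> Promise<string>: the model's reply
// executeCommand(scl)    -> Promise<string>: a `^name[status|output]` result line
async function runSclLoop(encodedRequest, askModel, executeCommand, maxTurns = 8) {
  // Input State: the first message is the (already Base64-encoded) request.
  let message = encodedRequest;
  const transcript = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    // Transition State: the model answers with an SCL command or a final reply.
    const reply = (await askModel(message)).trim();
    transcript.push(reply);
    // Resolution State (assumption): no `~` trigger means the operation is over.
    if (!reply.startsWith("~")) return { done: true, transcript };
    // Execution State: run the command locally and feed the result back.
    message = await executeCommand(reply);
    transcript.push(message);
  }
  // The turn budget ran out before the model concluded.
  return { done: false, transcript };
}
```

Passing the model and the executor in as functions keeps the loop itself free of any browser or operating-system dependency, so it can be exercised with stubs before it is wired to a real chat session.
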
### 2. SCL vs. OpenClaw and Agent Frameworks

SCL shares conceptual similarities with frameworks like OpenClaw, but it differs fundamentally in several key areas:

* **Zero Overhead:** Traditional agent frameworks rely heavily on verbose JSON or XML payloads, which consume a large share of the token budget. SCL uses a custom, character-efficient syntax (`~cmd[param]`), reducing token usage and latency.
* **Not an "Agent":** SCL does not attempt to give the AI "autonomy" or "thoughts." It treats the AI as a driver for a state machine. The AI is a function: `f(user_input, system_state) = SCL_Command`.
* **Extreme Flexibility:** Because SCL operates over standard chat interfaces (via the C2 Web Harness), it can be deployed on almost any LLM without requiring API access, complex backend orchestration, or specialized model fine-tuning.

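
As a rough illustration of the overhead point, the sketch compares the character count of one SCL command with an equivalent JSON payload. The JSON shape is a hypothetical tool-call format, not the schema of any particular framework, and character counts are only a proxy: real token counts depend on the model's tokenizer.

```javascript
// Hypothetical JSON tool call (illustrative shape only, not a real framework's schema).
const jsonCall = JSON.stringify({
  tool: "cmd",
  arguments: { command: "start chrome.exe" },
});

// The same request expressed as an SCL command.
const sclCall = "~cmd[start chrome.exe]";

// Characters are a proxy for tokens; the ratio will differ per tokenizer.
console.log(`JSON: ${jsonCall.length} chars, SCL: ${sclCall.length} chars`);
```
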
### 3. Language Specification

SCL is designed to be parsed easily by both the LLM and the client application using a robust character-stepping state machine.

**Issuing Commands (Action Trigger: `~`)**

When the AI needs to perform an action, it outputs a command consisting of the tilde (`~`) prefix, the command name, and the parameters enclosed in brackets (`[]`) and separated by pipes (`|`).

* **Syntax:** `~command_name[parameter1|parameter2]`
* **Example:** `~cmd[start chrome.exe]`

**Receiving Results (Result Trigger: `^`)**

The client application executes the command and replies to the AI using the caret (`^`) prefix.

* **Syntax:** `^command_name[status_code|output_data]`
* **Status Codes:** `0` for success; any non-zero value (typically `1`) for an error.
* **Example:** `^cmd[0|Success]`

**Escaping Rules**

If a parameter contains a literal `]`, `|`, or `\`, that character must be escaped with a preceding backslash (`\`).

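
The syntax and escaping rules in this section can be sketched as an encoder plus a character-stepping parser. This is an illustrative sketch in JavaScript, not the project's C# `SclProcessor`: the names `escapeParam`, `buildLine`, and `parseLine` are hypothetical, and two behaviors are assumptions, namely that nothing may follow the closing bracket and that `~cmd[]` yields a single empty parameter.

```javascript
// Characters that must be backslash-escaped inside a parameter.
const SPECIAL = new Set(["]", "|", "\\"]);

function escapeParam(text) {
  let out = "";
  for (const ch of text) out += SPECIAL.has(ch) ? "\\" + ch : ch;
  return out;
}

// Build `~name[p1|p2]` (command) or `^name[status|output]` (result).
function buildLine(trigger, name, params) {
  return trigger + name + "[" + params.map(escapeParam).join("|") + "]";
}

// Parse one SCL line by stepping through it character by character.
// Returns { trigger, name, params }, or null when the line is malformed.
function parseLine(line) {
  const trigger = line[0];
  if (trigger !== "~" && trigger !== "^") return null;
  const open = line.indexOf("[");
  if (open < 2) return null; // no bracket, or an empty command name
  const name = line.slice(1, open);
  const params = [];
  let current = "";
  for (let i = open + 1; i < line.length; i++) {
    const ch = line[i];
    if (ch === "\\") {
      // Escape: take the next character literally.
      if (i + 1 >= line.length) return null; // dangling backslash
      current += line[++i];
    } else if (ch === "|") {
      // Unescaped pipe: end of the current parameter.
      params.push(current);
      current = "";
    } else if (ch === "]") {
      // Unescaped bracket: end of the parameter list.
      params.push(current);
      // Assumption: trailing text after the closing bracket is an error.
      return i === line.length - 1 ? { trigger, name, params } : null;
    } else {
      current += ch;
    }
  }
  return null; // never saw an unescaped closing bracket
}
```

Building a line with `buildLine` and parsing it back with `parseLine` should return the original parameters unchanged, which is a convenient property to test whenever a new command's output may contain brackets or pipes.
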
### 4. Defining New Commands in C#

SCL is highly extensible. Developers can define new commands in the client application with just a few lines of code. The `SclProcessor` handles all the complex parsing and escaping.

```csharp
// Namespaces needed for Exception and File
using System;
using System.IO;

// 1. Instantiate the processor
var sclProcessor = new SclProcessor();

// 2. Register a new command (e.g., "read_file")
sclProcessor.RegisterCommand("read_file", async (args) =>
{
    // The first argument is the path of the file to read
    if (args.Length < 1) return SclResult.Error("Missing file path.");

    try {
        string content = await File.ReadAllTextAsync(args[0]);
        return SclResult.Success(content);
    } catch (Exception ex) {
        // Report I/O failures (missing file, access denied) back to the AI
        return SclResult.Error(ex.Message);
    }
});

// 3. Update the System Prompt Builder so the AI knows about the command
var promptBuilder = new SclPromptBuilder();
promptBuilder.AddCommand("read_file", "Reads the contents of a file.", "file_path");
```

### 5. Applications

* **Accessibility (Voice Assistance):** As demonstrated in the PoC, SCL allows users with disabilities to control their entire Windows OS using natural voice commands. The AI translates "Computer, open command prompt and print X directory" into precise SCL shell commands.
* **AI Server Backends:** SCL can be used to allow an AI to manage server infrastructure, query databases, or orchestrate microservices deterministically.
* **Automated QA Testing:** An AI can use SCL to drive UI automation tools, acting as a state machine that navigates a website and reports bugs.

### 6. Limitations: The ChatGPT Refusal

ChatGPT (specifically the web interface versions of GPT-4/4o) is currently unsupported for SCL. OpenAI's aggressive Reinforcement Learning from Human Feedback (RLHF) makes the model resist strict syntax constraints: it frequently wraps SCL commands in markdown, adds conversational filler, or outright refuses to act as a silent state machine. Models like Google Gemini, DeepSeek, and Google AI Search are far more compliant with the SCL system instructions.

Docs/System Architecture & Technical Documentation.md (normal file, 29 lines added)
@@ -0,0 +1,29 @@

# System Architecture & Technical Documentation

## 1. Overview

The system consists of three primary components:

1. **AI C2 Server (`AI C2 Server`):** A central hub that routes commands between the local client and the web browser.
2. **Unified Web Harness (Tampermonkey Script):** A JavaScript payload injected into AI chat websites to hijack the UI and use the AI as a free API.
3. **Voice Assistant PoC (`VoiceAssistantPoC_Win`):** The client application that listens to the user's voice, communicates with the C2 server, and executes SCL commands on the local OS.

## 2. AI C2 Server (Command and Control)

The C2 Server is a WinForms application running a local `HttpListener` on port `8080`. It acts as a message broker.

* **State Management (`OperationCenter.cs`):** Maintains a dictionary of active instances (browser tabs). It tracks whether an instance is busy, its turn count, and its chat history.
* **Relay Endpoints (`/api/relay/*`):** Used exclusively by the Tampermonkey script.
    * `GET /command/{id}`: The browser polls this endpoint to see if the user has queued a prompt.
    * `POST /result/{id}`: The browser posts the extracted markdown response from the AI here.
    * `POST /state/{id}`: The browser sends heartbeats to keep the instance alive in the C2 UI.
* **Admin Endpoints (`/api/admin/*`):** Used by the Voice Assistant PoC to queue prompts (`/inject`), reset chats (`/action/new_chat`), and fetch active sessions.

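
The broker role described in this section can be modeled in memory to show how the endpoints interact. This is a hypothetical JavaScript model, not the C# `OperationCenter.cs`: the class name, method names, field names, and the 10-second liveness timeout are all assumptions, and the real server exchanges these messages over HTTP rather than direct method calls.

```javascript
// In-memory model of the relay flow. All names and the timeout are hypothetical.
class RelayBroker {
  constructor() {
    // instance id -> per-tab state, mirroring "busy, turn count, chat history"
    this.instances = new Map();
  }

  _get(id) {
    if (!this.instances.has(id)) {
      this.instances.set(id, { busy: false, turnCount: 0, pending: [], history: [], lastSeen: 0 });
    }
    return this.instances.get(id);
  }

  // Models the admin `/inject` endpoint: the client queues a prompt for a tab.
  inject(id, prompt) {
    this._get(id).pending.push(prompt);
  }

  // Models the relay `GET /command/{id}` poll: hand out one prompt at a time.
  pollCommand(id) {
    const inst = this._get(id);
    if (inst.busy || inst.pending.length === 0) return null;
    inst.busy = true;
    const prompt = inst.pending.shift();
    inst.history.push({ role: "user", text: prompt });
    return prompt;
  }

  // Models the relay `POST /result/{id}`: store the AI's reply and free the tab.
  postResult(id, markdown) {
    const inst = this._get(id);
    inst.history.push({ role: "ai", text: markdown });
    inst.turnCount += 1;
    inst.busy = false;
  }

  // Models the relay `POST /state/{id}` heartbeat.
  heartbeat(id, nowMs) {
    this._get(id).lastSeen = nowMs;
  }

  // An instance counts as alive if it sent a heartbeat recently enough.
  isAlive(id, nowMs, timeoutMs = 10000) {
    const inst = this.instances.get(id);
    return inst !== undefined && nowMs - inst.lastSeen <= timeoutMs;
  }
}
```

Marking the instance busy between the poll and the posted result is one way to ensure a tab never receives a second prompt while the first response is still streaming.
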
## 3. Unified Web Harness (Tampermonkey)

The harness (`GoogleAI_Search_Deepseek_ChatGPT_UnifiedHarness.js`) is a sophisticated DOM-manipulation script that bypasses the need for paid API keys by puppeteering the web interfaces of major AI providers.

* **CORS Bypass:** Uses `GM_xmlhttpRequest` to communicate with `localhost:8080`, bypassing standard browser CORS restrictions.
* **DOM Tactics:** Employs different strategies to inject text into the AI's chat box. For Google, it uses standard `value` setters. For Gemini and ChatGPT, it uses `contenteditable` tactics (manipulating the `Selection` and `Range` APIs to simulate typing).
* **Kinetic Wait & Semantic Lock:** Because AI responses stream in dynamically, the script uses a "Kinetic Wait" (a hard delay to allow the UI to register the submit click) followed by a "Semantic Lock". The Semantic Lock monitors the DOM for text growth. Once the text length remains stable for a required "streak" (e.g., 3 seconds), the lock releases, and the script extracts the final response.
* **Smart Extraction:** Custom HTML-to-Markdown parsers strip away UI clutter (buttons, SVGs, hidden elements) and extract only the AI's actual response text.

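
The Kinetic Wait and Semantic Lock can be sketched as a stability counter driven by periodic length readings. This is an illustration under stated assumptions rather than the harness's actual code: `SemanticLock`, `waitForResponse`, and every default value (`kineticWaitMs`, `pollMs`, `maxPolls`) are hypothetical, and the "3 seconds" example is modeled as three consecutive stable polls at an assumed one-second interval.

```javascript
// Counts consecutive observations in which the response text did not grow.
class SemanticLock {
  constructor(requiredStreak) {
    this.requiredStreak = requiredStreak;
    this.lastLength = -1;
    this.streak = 0;
  }

  // Feed one length reading; returns true once the text has been stable long enough.
  observe(length) {
    if (length > 0 && length === this.lastLength) {
      this.streak += 1;
    } else {
      this.streak = 0; // still empty, or the text changed: start over
    }
    this.lastLength = length;
    return this.streak >= this.requiredStreak;
  }
}

// readLength(): current length of the response text (e.g. from the DOM).
// sleep(ms):    returns a Promise that resolves after ms milliseconds.
async function waitForResponse(readLength, sleep, options = {}) {
  const { kineticWaitMs = 1500, pollMs = 1000, requiredStreak = 3, maxPolls = 120 } = options;
  // Kinetic Wait: give the UI time to register the submit click.
  await sleep(kineticWaitMs);
  // Semantic Lock: poll until the length stops changing for the required streak.
  const lock = new SemanticLock(requiredStreak);
  for (let i = 0; i < maxPolls; i++) {
    if (lock.observe(readLength())) return true;
    await sleep(pollMs);
  }
  return false; // the response never stabilized within the poll budget
}
```

In a userscript, `readLength` might return the `textContent.length` of the latest response element and `sleep` might wrap `setTimeout` in a Promise. Ignoring zero-length readings keeps the lock from releasing before the first token has appeared.
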
## 4. Voice Assistant PoC

The client application brings the system together, acting as the bridge between the user's voice, the AI, and the Windows OS.

* **Speech-to-Text (Vosk):** Uses the Vosk offline speech recognition engine via `NAudio`. It listens continuously for the wake word ("Computer"). Once detected, it captures the subsequent phrase.
* **Base64 Encoding:** To prevent the AI from confusing user input with system instructions, the user's prompt is Base64-encoded before being sent to the C2 server. The AI is instructed to decode it before processing.
* **SCL Execution (`SclProcessor.cs`):** When the AI responds with an SCL command (e.g., `~cmd[dir]`), the `SclProcessor` parses it using a custom character-stepping algorithm that respects nested brackets and escape characters. It then executes the command via `System.Diagnostics.Process` (`cmd.exe`) and returns the `^cmd[0|output]` result to the AI. The loop repeats until the AI concludes the operation.

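
The Base64 step can be sketched as follows. The PoC client itself is a C# application, so this JavaScript version only illustrates the transformation; `encodePrompt` and `decodePrompt` are hypothetical names, and encoding the text as UTF-8 before Base64 is an assumption about how non-ASCII speech transcripts are handled.

```javascript
// Encode a prompt as Base64 over its UTF-8 bytes.
function encodePrompt(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Reverse the encoding: Base64 -> bytes -> UTF-8 text.
function decodePrompt(encoded) {
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(encodePrompt("open notepad")); // b3BlbiBub3RlcGFk
```

Going through `TextEncoder` first matters because `btoa` alone rejects characters outside the Latin-1 range, which a speech transcript may contain.
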