Primitives, not features. Local-first. Extensible.
sharur is a powerful, local-first agentic harness designed for developers who want a flexible and reliable assistant that runs on their own hardware. It prioritizes local LLMs (via Ollama and llama.cpp) but adapts seamlessly to cloud providers like OpenAI, Anthropic, and Google Gemini.
Sharur, smasher of thousands! The weapon of Ninurta, acting as his counselor and scout - flies ahead, assesses, reports back, then executes.
Core Philosophy
Local-First — Built from the ground up to favor local inference for privacy, speed, and cost-efficiency.
Aggressively Extensible — Every tool, provider, and behavior is a plugin interface. Supports gRPC extensions, markdown skills, and reusable prompt templates.
Session Persistence — Intelligent JSONL-backed session management with project-aware storage, branching, forking, and tree visualization.
Flexible Modes — TUI mode, one-shot mode, or a multi-session gRPC service — all powered by a central service-oriented architecture.
Security & Safety — Dry-run safety for destructive tools, automatic prompt injection mitigation, and a gRPC extension system for enforcing arbitrary policies.
Getting Started
Prerequisites
Go 1.26.2+
Nix (optional, recommended) — with flake support enabled
Installation
# Recommended: use Nix for a fully reproducible dev environment
nix develop
# Build binary with Go
go build -o shr ./cmd/shr
# Or install globally
go install ./cmd/shr
Quick Start
# Launch the interactive TUI
shr
# One-shot answer (JSONL output)
shr --mode json "What is the best way to structure a Go project?"
# Resume the most recent session
shr --continue
GoDoc for sdk, extensions, internal/tools, internal/agent
Subsections of Sharur
User Guide
The user guide covers day-to-day use of sharur from the terminal:
CLI — runtime modes, flags, keybindings, slash commands, and configuration
Extensibility — skills, prompt templates, and Go/Python/gRPC extensions
Subsections of User Guide
CLI
shr is the sharur CLI binary. It supports three runtime modes and a rich flag surface for model selection, session management, tools, and extensions.
Runtime Modes
| Mode | Flag | Description |
| --- | --- | --- |
| TUI | --mode tui (default) | Interactive Bubble Tea terminal interface with streaming, tool cards, and session management |
| JSON | --mode json | One-shot query with line-delimited JSON event output — useful for shell pipelines |
| gRPC | --mode grpc | Persistent multi-session gRPC service — any gRPC-capable client can connect |
Quick Start
# Launch the interactive TUI
shr
# One-shot answer (JSONL output)
shr --mode json "What is the best way to structure a Go project?"
# Resume the most recent session
shr --continue
See the sub-pages for full keybinding and slash command references, JSON event schema, gRPC proto overview, provider setup, and the full configuration schema.
Subsections of CLI
Configuration
sharur uses layered JSON configuration. Project-level settings override global defaults.
| Path | Scope |
| --- | --- |
| ~/.sharur/config.json | Global defaults — applies to all projects |
| .sharur/config.json | Project-level overrides — applies in this directory |
API keys can also be set via environment variables — env vars take priority over config file values.
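For illustration, a project can override the model while inheriting other global settings (this sketch assumes per-key merging; the keys mirror the provider examples later in this guide):

// ~/.sharur/config.json — global default
{
  "defaultProvider": "ollama",
  "defaultModel": "llama3.2"
}

// .sharur/config.json — project override, wins inside this directory
{
  "defaultProvider": "anthropic",
  "defaultModel": "claude-sonnet-4-6"
}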
Context Files
sharur auto-discovers AGENTS.md, CLAUDE.md, GEMINI.md, and .context.md in your project root and parent directories and injects them into the system prompt. Outermost files take precedence (parent directory wins over project root).
Providers
sharur supports five LLM providers. All configuration lives in config.json files or environment variables; environment variables take priority over config file values.
Model Naming
Models can be specified as provider/model shorthand or with separate flags:
# Shorthand: provider inferred from the prefix before the slash
shr --model anthropic/claude-sonnet-4-6
# Explicit: provider and model as separate flags
shr --provider anthropic --model claude-sonnet-4-6
Both forms are equivalent. The shorthand is convenient for one-off overrides; the config file form is better for persistent defaults.
Environment Variables
API keys set via environment variable take priority over values in config.json. The env var names use the SHARUR_ prefix:
| Provider | Environment Variable |
| --- | --- |
| Anthropic | SHARUR_ANTHROPIC_API_KEY |
| OpenAI | SHARUR_OPENAI_API_KEY |
| Google | SHARUR_GOOGLE_API_KEY |
Ollama and llama.cpp are local servers and do not use API keys.
Ollama
Ollama runs models locally. It is the default provider.
// ~/.sharur/config.json or .sharur/config.json
{
  "defaultProvider": "ollama",
  "defaultModel": "llama3.2",
  "ollamaBaseURL": "http://localhost:11434"
}
# Pull a model and launch
ollama pull llama3.2
shr
# Use a specific model
shr --model ollama/llama3.2
# Point at a remote Ollama server
shr --model llama3.2 --provider ollama
Notes:
Default base URL is http://localhost:11434. Override with ollamaBaseURL.
Ollama models support tools and images (vision models).
Use shr --list-models to see all locally available models.
Thinking is supported on models that emit <think> tokens (e.g. qwq, deepseek-r1).
llama.cpp
llama.cpp exposes an OpenAI-compatible HTTP server.
Google
Gemini 1.5 Pro and later have a 1M+ token context window.
Supports tools and vision (images).
Use shr --list-models to see available Gemini models.
Listing Available Models
All five providers implement model listing. Use --list-models to query the active provider:
# List Ollama models
shr --list-models
# List models from a specific provider
shr --provider anthropic --list-models
# Filter results
shr --provider openai --list-models gpt-4
The output is a plain list of model names, suitable for piping:
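For example (illustrative pipelines using only the flags documented above):

# Count locally available models
shr --list-models | wc -l
# Filter with standard tools instead of the built-in filter argument
shr --provider openai --list-models | grep -i gpt-4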
TUI Mode
The TUI is a rich, Bubble Tea-powered interface with real-time streaming, tool cards, session management, and a live context usage progress bar in the status footer.
Keybindings
| Key | Action |
| --- | --- |
| Enter | Send message (or Steer the running agent) |
| Shift+Enter | Insert newline |
| Ctrl+Enter | Queue follow-up message (runs after agent finishes) |
| Ctrl+C | Abort the current agent run and clear the input editor |
| Esc | Cancel streaming / Close modal / Abort current turn |
| Ctrl+O | Toggle tool call output expansion |
| Ctrl+P | Open model selection modal (cycling via --models flag) |
| ↑/↓ | Navigate prompt history (if at start/end of editor) / Scroll viewport |
| F1 | Show help modal |
Slash Commands
| Command | Description |
| --- | --- |
| /new | Start a fresh session |
| /resume <id> | Resume a session by ID or partial UUID (fuzzy search enabled) |
| /branch [idx] | Create a new child session branching from a specific message index (defaults to last) |
| /fork | Duplicate current session into a new independent session (no parent link) |
| /rebase | Interactive rebase: select specific messages to keep in a new session |
| /merge <id> | Merge another session’s history into the current one with a synthesis turn |
| /tree [-g\|-p] | Open session tree modal. Flags: --global (-g) or --project (-p) |
| /import <path> | Import a session from a JSONL file |
| /export <path> | Export the current session to a JSONL file |
| /model <p/m> | Switch model mid-conversation (e.g. /model anthropic/claude-sonnet-4-6) |
| /stats | View session statistics and token usage |
| /config | View and edit active configuration |
| /context | View detailed context window usage |
| /compact | Manually trigger a context compaction |
| /skill:<name> [args] | Invoke a skill |
| /prompt:<name> | Expand a prompt template into the editor |
| /exit | Quit (alias: /quit) |
Session Tree Modal (/tree)
| Key | Action |
| --- | --- |
| ↑/↓ / PgUp/PgDn | Navigate the session list |
| Enter | Resume the selected session (or branch from it if it’s an interior node) |
| B | Create a new branch from the selected session |
| F | Create an independent fork of the selected session |
| R | Start an interactive rebase from the selected session’s history |
| Esc | Close modal |
Bang Commands
Bang commands execute a shell command and inject the output into the conversation:
!ls -la          # Execute shell command, paste output into editor
!!cat README.md  # Execute shell command, send output directly to agent
!cmd — pastes stdout into the editor so you can review before sending
!!cmd — sends stdout directly to the agent without review
At-File Attachments
Type @ in the input to fuzzy-search and attach file contents to your prompt:
Tell me what this does @src/agent/loop.go
The file content is embedded inline in the message sent to the agent.
JSON Mode
JSON mode runs a single prompt and streams the agent’s events as line-delimited JSON (JSONL) to stdout. It is designed for shell pipelines and tooling integration.
shr --mode json "What is the best way to structure a Go project?"
# Pipe stdin as context
cat main.go | shr --mode json "Refactor this to use interfaces"
# Specify a model
shr --mode json "Summarize the last 10 git commits" --model anthropic/claude-opus-4-5
Event Format
Each line is the protobuf JSON encoding of an AgentEvent. Event types mirror the TUI stream:
EVENT_AGENT_START / EVENT_AGENT_END
EVENT_TEXT_DELTA — incremental response text
EVENT_THINKING_DELTA — incremental thinking text (extended thinking models)
EVENT_TOOL_CALL — tool invocation start
EVENT_TOOL_DELTA — streaming tool output
EVENT_TOOL_OUTPUT — final tool result
EVENT_TURN_START / EVENT_TURN_END
Common Patterns
# Capture only the text deltas
shr --mode json "Explain Go interfaces" \
  | jq -r 'select(.type == "EVENT_TEXT_DELTA") | .content'
# Run without saving the session
shr --mode json --no-session "Quick one-off question"
# Dry-run to see what tools would be called
shr --mode json --dry-run "Delete all .tmp files in the current directory"
gRPC Mode
gRPC mode starts a persistent AgentService server. Each connecting client supplies a session_id and gets its own isolated agent. Sessions are saved to disk after each turn and reloaded automatically on reconnect.
# Start on the default port
shr --mode grpc
# Use a custom address
shr --mode grpc --grpc-addr :9090
The server responds to SIGINT/SIGTERM with a graceful shutdown: in-flight turns are allowed to finish (30 s timeout), all sessions are flushed to disk, then the listener closes.
Proto Definition
The service is defined in proto/sharur/v1/agent.proto. Generated Go stubs live in internal/gen/sharur/v1/. Regenerate with mage generate.
Key RPCs:
| RPC | Description |
| --- | --- |
| Prompt | Send a user message; streams back AgentEvents |
| NewSession | Create a new session |
| GetMessages | Retrieve message history for a session |
| GetState | Get current agent state |
| Steer | Inject a steering message mid-turn |
| FollowUp | Queue a follow-up after the current turn |
| Abort | Cancel the current running turn |
| ForkSession | Fork a session into a new independent copy |
| ConfigureSession | Change model, provider, or thinking level |
In-Process Transport
For the TUI and JSON modes, all internal communication also goes through this same protobuf boundary using a bufconn in-memory pipe — not a network socket. This means all three modes share identical code paths. See Service Architecture for details.
Extensibility
sharur supports three extension points:
Skills — reusable Markdown instructions invoked with /skill:<name>
Prompt Templates — reusable text snippets that expand into the TUI input editor via /prompt:<name>
Extensions — Go, Python, or any executable processes that hook into the agent loop over gRPC
Skills
Skills are Markdown files that provide sharur with specialized, reusable instructions for specific tasks. When a skill is invoked, its content is sent as a user message to the agent along with any arguments you provide.
How Skills Work
When sharur starts, it scans the skill directories and adds a list of available skills to the system prompt. The agent knows which skills exist and their descriptions. You can explicitly invoke a skill with /skill:<name> from the TUI, or the agent may choose to invoke one automatically via the read tool or a specialized skill tool call.
When you invoke a skill via /skill:<name>, it is executed as a skill tool, which loads the content and sends it to the agent:
<skill name="refactor" location="/path/to/refactor/SKILL.md">
References are relative to /path/to/refactor/.
...skill content here...
</skill>
your additional arguments here
Skill Discovery Directories
sharur searches for skills in these locations (in order):
| Path | Scope |
| --- | --- |
| ~/.sharur/skills/ | Global — available in all projects |
| .sharur/skills/ (project root) | Project-specific skills |
Skills with the same name in a project directory override global ones.
Skill File Formats
Simple: Single .md file
Create a .md file directly in a skills directory. The filename (without extension) becomes the skill name.
.sharur/skills/refactor.md
Invoke with:
/skill:refactor improve error handling
Structured: Directory with SKILL.md
Create a directory containing a SKILL.md file. The directory name becomes the skill name. This format lets you include supporting files (examples, templates) alongside the skill.
Note: When a SKILL.md is found in a directory, subdirectories are not scanned further. This lets you bundle reference files with your skill.
Frontmatter (Optional)
Both formats support optional YAML frontmatter to provide metadata:
---
name: refactor
description: Refactor Go code to use idiomatic patterns and interfaces
---
You are an expert Go developer. When asked to refactor code:
1. Identify opportunities to use interfaces for testability
2. Replace repetitive code with helper functions
3. Add godoc comments to all exported symbols
4. Ensure error handling follows Go conventions (wrap with %w)
Always explain the reasoning behind each change before making it.
Frontmatter fields:
| Field | Description |
| --- | --- |
| name | Override the skill name (defaults to filename/directory name) |
| description | A short description shown to the agent in the system prompt |
Practical Examples
Code Review Skill
.sharur/skills/code-review.md
---
name: code-review
description: Perform a thorough code review with actionable feedback
---
Review the provided code and evaluate it against these criteria:
**Correctness**
- Does the logic match the intended behavior?
- Are edge cases handled?
- Are there potential nil pointer dereferences or index out-of-bounds issues?

**Maintainability**
- Is the code readable and self-documenting?
- Are functions focused on a single responsibility?
- Is there appropriate error handling?

**Performance**
- Are there obvious inefficiencies (e.g. unnecessary allocations, N+1 queries)?
Format your response as:
## Summary
<one paragraph>
## Issues
<numbered list of specific issues with file:line references>
## Suggestions
<numbered list of improvements>
Database Migration Skill
---
name: db-migration
description: Generate SQL migration files following our project conventions
---
Generate a database migration for the requested schema change.
Our migration file conventions:
- Files are named: `YYYYMMDD_HHMMSS_description.sql`
- Each file has an `-- +migrate Up` and `-- +migrate Down` section
- All tables use `BIGINT` primary keys with `AUTO_INCREMENT`
- Always include `created_at` and `updated_at` TIMESTAMP columns
See the example schema at the path listed in this skill's location directory: `schema-example.sql`
Global Utility Skill
~/.sharur/skills/explain.md
---
name: explain
description: Explain code clearly for a non-expert audience
---
Explain the following code in plain English. Assume the reader is a competent programmer but unfamiliar with this codebase.
Structure your explanation as:
1. **Purpose** — What does this code do in one sentence?
2. **How it works** — Step-by-step walkthrough of the logic
3. **Key concepts** — Any domain-specific terms or patterns used
4. **Gotchas** — Anything surprising or non-obvious
Tips
Keep skills focused. One skill = one task type. Compose them with arguments rather than making a single skill do everything.
Use relative file references — when your skill body references files, note they resolve relative to the skill’s directory. The agent is told the skill’s location so it can use the read tool on supporting files.
Test your skill by invoking it with /skill:<name> in the TUI. The skill’s content and its effect on the conversation will be visible in the tool output cards.
Override skills per-project — place a skill with the same name in .sharur/skills/ to override the global version for a specific project.
Prompt Templates
Prompt templates are reusable text snippets that expand directly into the TUI input editor. Unlike skills (which are sent to the agent immediately), prompt templates let you pre-fill the editor so you can review, edit, or complete the text before sending.
How Prompt Templates Work
When you type /prompt:<name> and press Enter, the template content is loaded into the editor input. You can then modify it, add context, attach files with @, and send it normally. This is useful for long, structured prompts you use frequently.
Prompt Template Directories
sharur searches these locations (in order):
| Path | Scope |
| --- | --- |
| ~/.sharur/prompts/ | Global — available in all projects |
| .sharur/prompts/ (project root) | Project-specific templates |
Template File Format
A prompt template is any .md file in a prompts directory. The filename (without extension) is the template name.
.sharur/prompts/bug-report.md
Invoke with:
/prompt:bug-report
Minimal Template (no frontmatter)
The entire file content becomes the template text:
Describe the bug you found:
**Steps to reproduce:**
1.
2.
3.

**Expected behavior:**

**Actual behavior:**

**Environment:**
- OS:
- shr version:
- Model:
Template with Frontmatter
Add optional YAML frontmatter for metadata:
---
description: Generate a structured bug report
argument-hint: <component-name>
---
Describe the bug you found in the $1 component:
**Steps to reproduce:**
1.
2.
3.

**Expected behavior:**

**Actual behavior:**
Frontmatter fields:
| Field | Description |
| --- | --- |
| description | Short description shown in the /prompt: picker |
| argument-hint | Hint shown in autocomplete describing expected arguments |
Argument Substitution
Templates support positional argument placeholders: $1, $2, etc.
When you invoke a template via the slash command handler (not the interactive TUI), arguments after the template name are substituted. To mitigate prompt injection, sharur automatically wraps these arguments in <untrusted_input> tags. In the TUI, the template expands as-is and you fill in the values manually.
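For illustration, using the bug-report template shown above and assuming the wrapper encloses just the substituted argument, an invocation through the slash command handler such as:

/prompt:bug-report login-form

would expand roughly to:

Describe the bug you found in the <untrusted_input>login-form</untrusted_input> component: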
Practical Examples
PR Description Template
.sharur/prompts/pr-description.md
---
description: Generate a pull request description
---
Write a pull request description for the following changes.
**Format:**

## Summary
<What does this PR do? Why?>
## Changes
<Bullet list of specific changes>
## Testing
<How was this tested?>
## Notes
<Anything reviewers should pay attention to>
The diff is:
Invoke:
/prompt:pr-description
Then paste or attach the diff before sending.
Architecture Decision Record
.sharur/prompts/adr.md
---
description: Draft an Architecture Decision Record (ADR)
argument-hint: <decision-title>
---
Draft an Architecture Decision Record (ADR) for: **$1**
Use this structure:
# ADR: $1
## Status
Proposed
## Context
<What is the issue motivating this decision?>
## Decision
<What was decided?>
## Consequences
### Positive
-
### Negative
-
### Neutral
-
## Alternatives Considered
<What other approaches were evaluated and why were they rejected?>
Invoke:
/prompt:adr Use JSONL for session storage
Global Commit Message Template
~/.sharur/prompts/commit.md
---
description: Generate a conventional commit message
---
Generate a commit message following the Conventional Commits specification for the following diff or description of changes.
Format:
```
<type>(<scope>): <short description>

<body: what changed and why, wrapped at 72 chars>

<footer: breaking changes, issue references>
```

Types: feat, fix, docs, style, refactor, perf, test, chore
Changes:
Invoke:
/prompt:commit
Code Explanation for PR Comments
.sharur/prompts/explain-for-review.md
---
description: Explain a code block suitable for a PR comment
---
Explain the following code in a way that's suitable for a GitHub PR review comment. Be concise (2-4 sentences max), assume the reader is a senior engineer, and highlight any non-obvious design decisions.
Code:
Tips
Prompt templates are for your input. They expand into the editor, not directly to the agent. This gives you a chance to customize before sending.
Use $1, $2 placeholders for dynamic parts you’ll always fill in differently. Leave static boilerplate as literal text.
Combine with @ file attachments. Type /prompt:code-review then add @src/myfile.go before pressing Enter to attach a file.
Project-specific overrides. A template in .sharur/prompts/ with the same name as a global template takes priority for that project.
Organize with subdirectories. Templates are discovered recursively, so you can group them:
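For example, a layout like the following (illustrative paths) works:

.sharur/prompts/
├── go/
│   └── refactor.md
└── architecture/
    └── adr.md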
Invoke as /prompt:refactor, /prompt:adr, etc. (name is the filename, not the full path).
Go Extensions
Extensions let you add new behaviors to sharur beyond what’s possible with skills and prompt templates. They can observe and modify every stage of the agent loop — from the raw user input through each LLM turn and tool call to compaction and session teardown. Extensions run as separate processes and communicate with sharur via gRPC.
Extension Types
| Type | Language | Use Case |
| --- | --- | --- |
| Go binary | Go | High-performance tools, direct filesystem access |
| Python script | Python | Data processing, ML integrations, API calls |
| Any executable | Any | Shell scripts, compiled binaries from any language |
All extension types use the same gRPC protocol. The loader treats .py files specially (runs them with the configured Python interpreter), and everything else is executed directly as a binary.
Extension Discovery
Extensions are loaded from directories listed in your config under extensions:
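A minimal illustration; the exact shape of the extensions value (here, a list of directory paths) is an assumption — check the configuration reference for the authoritative schema:

// ~/.sharur/config.json or .sharur/config.json (illustrative)
{
  "extensions": ["~/.sharur/extensions", ".sharur/extensions"]
}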
Every Go extension implements the extensions.Plugin interface from github.com/goppydae/sharur/extensions. Embed extensions.NoopPlugin and override only the hooks you need.
Load-time hooks
| Method | When called | Purpose |
| --- | --- | --- |
| Name() | On load | Returns the extension’s identifier string |
| Tools() | On load | Returns tool definitions the agent can call |
| ExecuteTool() | On tool call | Executes a tool registered by this extension |
Session lifecycle hooks
| Method | When called | Purpose |
| --- | --- | --- |
| SessionStart(ctx, sessionID, reason) | Session attached or first prompt | Open connections, initialize per-session state |
| SessionEnd(ctx, sessionID, reason) | Session reset | Flush buffers, close connections |
reason is "new" for a fresh session and "resume" for one loaded from disk.
Agent loop hooks
| Method | When called | Purpose |
| --- | --- | --- |
| AgentStart(ctx) | User prompt received, loop begins | Per-prompt setup, logging |
| AgentEnd(ctx) | Agent loop completes | Per-prompt teardown, emit metrics |
| TurnStart(ctx) | Start of each LLM request turn | Per-turn timing |
| TurnEnd(ctx) | After each turn’s tool calls finish | Per-turn cleanup |
Transformation hooks
| Method | When called | Can modify | Purpose |
| --- | --- | --- | --- |
| ModifyInput(ctx, text) | Before user text hits the transcript | Yes — transform or consume | Pre-process input, implement shortcuts |
| ModifySystemPrompt(prompt) | Before each LLM request | Yes — returns new prompt | Inject dynamic context into the system prompt |
| BeforePrompt(ctx, state) | Before each LLM request | Yes — returns new state | Change model, provider, or thinking level |
| ModifyContext(ctx, messagesJSON) | Before each LLM request is built | Yes — returns new JSON | Filter or inject messages sent to the LLM (transcript unchanged) |
| BeforeProviderRequest(ctx, requestJSON) | Just before the request is sent | Yes — returns new JSON | Modify temperature, max tokens, tools list |
| AfterProviderResponse(ctx, content, numToolCalls) | After LLM stream consumed | No | Observe response text and tool call count |
| BeforeToolCall(ctx, call, args) | Before each tool execution | Yes — can intercept | Block or replace tool execution |
| AfterToolCall(ctx, call, result) | After each tool execution | Yes — returns new result | Observe or modify tool results |
| BeforeCompact(ctx, prep) | Before LLM-based summarization | Yes — can skip | Provide a custom compaction summary |
| AfterCompact(ctx, freedTokens) | After compaction completes | No | Observe freed token count |
Key behaviors:
ModifyInput returns agent.InputResult. Set Action to "continue" (pass through unchanged), "transform" (use the Text field instead), or "handled" (consume the message entirely — it is not appended to the transcript and the agent does not run).
ModifyContext and BeforeProviderRequest work with JSON strings at the gRPC boundary. The GRPCClient marshals/unmarshals the Go structs automatically.
BeforeCompact returns "" (empty) to let the default LLM summarization run, or a non-empty summary string to provide your own and skip the LLM call. The prep argument includes the message count, estimated token count, and the previous summary (if any).
BeforeToolCall returns (ToolResult, true) to intercept (the tool does not execute), or (ToolResult{}, false) to allow normal execution.
ModifyInput runs before the user text is added to the transcript. Return "handled" to consume shortcuts silently, or "transform" to rewrite the text:
func (p *MyPlugin) ModifyInput(_ context.Context, text string) agent.InputResult {
	if strings.HasPrefix(text, "?quick ") {
		return agent.InputResult{
			Action: agent.InputTransform,
			Text:   "Respond in one sentence: " + text[7:],
		}
	}
	if text == "ping" {
		return agent.InputResult{Action: agent.InputHandled}
	}
	return agent.InputResult{Action: agent.InputContinue}
}
Example: Custom Compaction
Return a non-nil *agent.CompactionResult from BeforeCompact to supply your own summary and bypass the default LLM-based summarization:
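A minimal sketch; it follows the string-returning BeforeCompact form described under Key behaviors above (the in-process agent.Extension variant returns a *agent.CompactionResult instead), and the extensions.CompactionPrep type name is an assumption:

func (p *MyPlugin) BeforeCompact(_ context.Context, prep extensions.CompactionPrep) string {
	// Returning "" lets the default LLM summarization run.
	// Returning a non-empty summary skips the LLM call entirely.
	return "<!-- sharur-summary -->\n" +
		"## Goal\nContinue the refactor tracked in TODO.md; earlier exploration can be dropped."
}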
Extensions can contribute tools the agent calls just like built-in tools:
type CounterPlugin struct {
	extensions.NoopPlugin
}

func (p *CounterPlugin) Tools() []extensions.ToolDefinition {
	return []extensions.ToolDefinition{{
		Name:        "count_lines",
		Description: "Count lines in a string",
		Schema:      json.RawMessage(`{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}`),
		IsReadOnly:  true,
	}}
}

func (p *CounterPlugin) ExecuteTool(_ context.Context, name string, args json.RawMessage) extensions.ToolResult {
	if name != "count_lines" {
		return extensions.ToolResult{Content: "unknown tool", IsError: true}
	}
	var input struct {
		Text string `json:"text"`
	}
	_ = json.Unmarshal(args, &input)
	n := strings.Count(input.Text, "\n") + 1
	return extensions.ToolResult{Content: fmt.Sprintf("%d lines", n)}
}
Example: Intercepting Tool Calls (Sandbox)
BeforeToolCall lets you block or replace any built-in tool call:
type SandboxPlugin struct {
	extensions.NoopPlugin
	AllowedDir string
}

func (p *SandboxPlugin) BeforeToolCall(_ context.Context, call extensions.ToolCall, args json.RawMessage) (extensions.ToolResult, bool) {
	var input struct {
		Path string `json:"path"`
	}
	_ = json.Unmarshal(args, &input)
	if input.Path != "" && !strings.HasPrefix(input.Path, p.AllowedDir) {
		return extensions.ToolResult{
			Content: fmt.Sprintf("blocked: %s is outside %s", input.Path, p.AllowedDir),
			IsError: true,
		}, true
	}
	return extensions.ToolResult{}, false
}
flowchart TD
Start["shr startup"] --> Scan["Scan extension directories"]
Scan --> Launch["Launch subprocess
SHARUR_SOCKET_PATH=..."]
Launch --> Socket["Wait for socket · dial gRPC"]
Socket --> Init["Name() · Tools()"]
Init --> SS["SessionStart(sessionID, reason)
on new session or resume"]
SS --> MI["ModifyInput(text)"]
MI --> AS["AgentStart()"]
subgraph turn ["Per LLM turn (repeats until no tool calls)"]
direction TB
T1["BeforePrompt() · ModifySystemPrompt()
ModifyContext() · BeforeProviderRequest()"]
T2[/"LLM streams"/]
T3["AfterProviderResponse() · TurnStart()"]
subgraph toolloop ["Per tool call"]
BTC["BeforeToolCall()"] --> Intercept{"intercept?"}
Intercept -->|yes| CustomResult["return custom ToolResult"]
Intercept -->|no| Exec["execTool() · AfterToolCall()"]
end
TE["TurnEnd()"]
T1 --> T2 --> T3 --> toolloop --> TE
end
AS --> turn
turn --> AE["AgentEnd()"]
subgraph compact ["On compaction (auto or /compact)"]
direction TB
BC["BeforeCompact(prep)"] --> CustomSummary{"return non-nil?"}
CustomSummary -->|yes| SkipLLM["skip LLM summarization"]
CustomSummary -->|no| LLMSum["LLM summarizes"]
SkipLLM --> AC["AfterCompact(freedTokens)"]
LLMSum --> AC
end
AE --> SE["SessionEnd(sessionID, reason)
on session reset"]
SE --> Shutdown["shr shutdown · kill subprocess"]
In-Process Go Extension (Advanced)
If your extension is written in Go and you control the build, you can implement agent.Extension directly via the SDK and register it without the gRPC overhead:
import (
	"github.com/goppydae/sharur/internal/agent"
	"github.com/goppydae/sharur/internal/tools"
)

type MyExtension struct {
	agent.NoopExtension
}

func (e *MyExtension) AgentStart(ctx context.Context) {
	log.Println("agent started")
}

func (e *MyExtension) ModifyInput(ctx context.Context, text string) agent.InputResult {
	if text == "ping" {
		return agent.InputResult{Action: agent.InputHandled}
	}
	return agent.InputResult{Action: agent.InputContinue}
}

func (e *MyExtension) ModifySystemPrompt(prompt string) string {
	return prompt + "\n\nAlways respond in bullet points."
}

func (e *MyExtension) BeforeToolCall(ctx context.Context, call *agent.ToolCall, args json.RawMessage) (*tools.ToolResult, bool) {
	if call.Name == "bash" {
		return &tools.ToolResult{Content: "bash is disabled", IsError: true}, true
	}
	return nil, false
}
Pass the extension via ag.SetExtensions() from the SDK or directly in cmd/shr.
Tips
Extensions are isolated processes. A crash in an extension will not crash sharur — the loader catches errors and logs them.
Keep BeforePrompt and ModifySystemPrompt fast. They run before every single LLM call. Cache data when possible; avoid blocking network calls.
ModifyContext does not affect the stored transcript. Changes to the message slice are only visible to the LLM for that turn.
Use skills for static context. If you only need to append static text to the system prompt, a skill is simpler than an extension.
Extensions are global. All extensions in the configured directories are loaded for every session. There is no per-project scoping beyond the directory config.
Logs go to stderr. Stdout is not read by the host; stderr is passed through for debugging.
InputHandled stops all further processing. No agent turn is started, no message is appended to the transcript.
BeforeCompact fires before the LLM call. Return nil to let the default summarizer run. Return a *CompactionResult to supply your own summary — useful for using a cheaper model or domain-specific logic.
Python Extensions
Python extensions use the same gRPC protocol as Go extensions. The loader detects .py files and runs them with the configured Python interpreter, passing SHARUR_SOCKET_PATH as an environment variable. The extension is expected to listen on that Unix socket.
Place the script in your extensions directory. sharur runs it as python ticket_context.py on startup.
Available RPC Methods
Implement any subset of the ExtensionServicer methods. Unimplemented methods should return a sensible empty response (see the template above). The full list mirrors the Go plugin interface — see Go Extensions for hook semantics.
| RPC | Purpose |
| --- | --- |
| Name | Return extension identifier |
| Tools | Return tool definitions |
| ExecuteTool | Execute a registered tool |
| SessionStart / SessionEnd | Session lifecycle |
| AgentStart / AgentEnd | Per-prompt lifecycle |
| TurnStart / TurnEnd | Per-LLM-turn lifecycle |
| ModifyInput | Transform or consume user input |
| ModifySystemPrompt | Augment the system prompt |
| BeforePrompt | Mutate model/provider/thinking |
| ModifyContext | Filter or inject LLM-bound messages |
| BeforeProviderRequest | Modify the raw completion request |
| AfterProviderResponse | Observe LLM output |
| BeforeToolCall | Intercept or block tool calls |
| AfterToolCall | Observe or modify tool results |
| BeforeCompact / AfterCompact | Compaction lifecycle |
Tips
Logs go to stderr. Python’s print() goes to stdout, which is not read by the host. Use sys.stderr.write() or logging for debugging output.
Keep proto stubs in the same directory as your script, or adjust sys.path before importing them.
Thread safety:grpc.server with ThreadPoolExecutor handles concurrent RPC calls. If you maintain per-session state, use a lock or session-keyed dict.
gRPC Extensions
gRPC extensions run as separate processes. sharur manages their lifecycle: launching the binary, passing the socket path, waiting for readiness, dialing, and killing on shutdown. The extension communicates entirely over a Unix Domain Socket using the service defined in extensions/proto/extension.proto.
How It Works
sequenceDiagram
participant Loader as shr Loader
participant Ext as Extension process
participant Client as gRPC client
Loader->>Ext: exec binary/script
note over Ext: env: SHARUR_SOCKET_PATH=/tmp/...sock
Ext->>Ext: net.Listen("unix", socketPath)
note over Ext: signals readiness by listening
Loader->>Loader: poll for socket file
Loader->>Client: dial gRPC over Unix socket
Client->>Ext: Name()
Ext-->>Client: "my-extension"
Client->>Ext: Tools()
Ext-->>Client: [ToolDefinition, ...]
note over Loader,Ext: extension registered — hooks active for all sessions
The extension must call net.Listen("unix", os.Getenv("SHARUR_SOCKET_PATH")) and start serving before shr times out.
Writing a Go Extension
Import github.com/goppydae/sharur/extensions — no internal packages needed.
package main

import "github.com/goppydae/sharur/extensions"

type myPlugin struct {
	extensions.NoopPlugin
}

func (p *myPlugin) ModifySystemPrompt(prompt string) string {
	return prompt + "\n\nAlways respond in haiku."
}

func main() {
	extensions.Serve(&myPlugin{NoopPlugin: extensions.NoopPlugin{NameStr: "haiku"}})
}
extensions.Serve handles the socket path, gRPC server setup, and graceful shutdown. extensions.NoopPlugin provides no-op defaults for every method.
Build and place the binary in a configured extensions directory:
go build -o .sharur/extensions/haiku .
Or load at runtime:
shr --extension .sharur/extensions/haiku
Plugin Interface
All hooks map 1:1 to agent.Extension. See Go Extensions for full hook semantics and examples.
Load-time:
| Method | Called | Purpose |
| --- | --- | --- |
| Name() | Once on connect | Extension identifier |
| Tools() | Once on connect | Contribute tools to the agent |
| ExecuteTool() | Per tool call | Execute a registered tool |
Session lifecycle:
| Method | Called | Purpose |
| --- | --- | --- |
| SessionStart(ctx, sessionID, reason) | New or resumed session | Open connections, init per-session state |
| SessionEnd(ctx, sessionID, reason) | Session reset | Flush, close connections |
reason is "new" or "resume".
Proto Definition
The extension service is defined in extensions/proto/extension.proto. Generated Go stubs are in extensions/gen/. Regenerate with mage generate.
Tool definitions returned by Tools() have an IsReadOnly bool field. Set it to true for tools that are safe in dry-run mode. The GRPCClient propagates this to the internal RemoteTool.IsReadOnly() so dry-run and sandbox extensions honour it correctly.
Debugging
Logs go to stderr. The host passes the subprocess’s stderr through. Use log.Println or fmt.Fprintln(os.Stderr, ...) for debug output.
Crashes are isolated. A panicking extension does not crash shr — the loader catches errors and logs them.
Socket timeout. If the extension doesn’t listen within the timeout, the loader logs an error and skips it. Ensure extensions.Serve (or your own net.Listen + grpc.Serve) is called promptly in main().
Test in isolation. Set SHARUR_SOCKET_PATH=/tmp/test.sock and run your extension binary directly; then grpcurl the socket to verify RPCs before integrating with shr.
Developer Guide
Internals — architecture, agent loop, session format, build system
Subsections of Developer Guide
Internals
This section describes the high-level architecture of sharur: how its components are organized, how data flows through the system, and how the key abstractions relate to each other.
The agent is driven by an event-bus (internal/events). Every meaningful state transition emits an agent.Event to all subscribers.
EventBus Performance
The EventBus is async and non-blocking. Publish() enqueues to a 4096-item buffered channel per subscriber and returns immediately — it never blocks the agent loop. Each subscriber runs in its own goroutine. Slow subscribers drop events to protect the agent loop from backpressure.
Event Flow
sequenceDiagram
participant User
participant Agent
participant LLM
participant Tools
User->>Agent: Prompt(text)
Agent->>Agent: EventAgentStart
loop each LLM turn
Agent->>Agent: EventTurnStart · EventMessageStart
Agent->>LLM: provider.Stream()
LLM-->>Agent: EventTextDelta (×n)
LLM-->>Agent: EventThinkingDelta (×n, if thinking enabled)
LLM-->>Agent: EventToolCall (×n, if tools requested)
Agent->>Agent: EventMessageEnd
loop each tool call
Agent->>Tools: execTool()
Tools-->>Agent: EventToolDelta (streaming)
Agent->>Agent: EventToolOutput
end
Agent->>Agent: EventTurnEnd
end
Agent->>Agent: EventAgentEnd
State Machine
The agent transitions through explicit states to prevent concurrent modification:
A ToolRegistry holds all registered tools. During a turn, when the LLM emits a tool call, execTool looks up the tool by name, executes it, and streams partial output via EventToolDelta before emitting the final EventToolOutput.
Dry-Run Mode: When DryRun is enabled, any tool that is not marked as read-only will bypass execution and return a descriptive preview of what it would have done.
Input Sanitization: Prompt template expansion automatically wraps user inputs in <untrusted_input> tags to prevent prompt injection into the base instructions.
Service Architecture
sharur follows a Strict Protobuf Internal Architecture. Instead of UI modes calling Go functions directly, all interfaces are treated as clients of a central AgentService.
Protobuf Boundary
The interface between the UI and the core is defined in proto/sharur/v1/agent.proto. This boundary ensures:
Consistency: All modes (TUI, CLI, JSON, Remote gRPC) use the exact same code paths and logic.
Decoupling: UI logic is completely isolated from agent state, session persistence, and provider adapters.
Interoperability: Any gRPC-capable client can interact with a sharur service.
In-Process Communication
For local CLI usage, sharur uses a specialized In-Process Client (internal/service/client.go). It uses bufconn to implement the pb.AgentServiceClient interface over an in-memory pipe. This provides the safety and structure of gRPC without the latency or configuration complexity of network ports.
Backend Service (internal/service)
The Service struct implements pb.AgentServiceServer. It owns the session.Manager and manages the lifecycle of agent.Agent instances. It translates between internal agent events (Go channels) and Protobuf event streams.
Session Loading Strategy
RPCs split into three lookup strategies:
| Strategy | Used by | Behaviour |
| --- | --- | --- |
| getOrCreate(id) | Prompt, NewSession | Always returns an entry — creates a fresh agent if id is unknown, loading from disk if a matching session file exists |
|  |  | Returns the entry if it is in memory or can be loaded from disk; returns NotFound for completely unknown IDs |
| lookup(id) | Steer, Abort, FollowUp, StreamEvents | In-memory only — these only make sense for a currently-running agent |
This means a /resume <id> command can switch to any session ever saved to disk without a round-trip NewSession call: the first GetMessages or GetState call transparently loads it.
All providers return a uniform Stream of Event values — text deltas, thinking deltas, tool calls, and usage. The agent’s consumeStream function normalizes these into the internal Message format, making the agent completely provider-agnostic.
The BeforeProviderRequest extension hook receives this struct as JSON and can modify any field before it is sent to the provider — useful for overriding temperature, trimming the tool list, or adjusting MaxTokens per request.
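As a sketch of what this enables, a gRPC extension could cap the sampling temperature by editing the serialized request. The exact hook signature and the "temperature" field name in the JSON encoding are assumptions here; check the extension proto for the authoritative shapes.

func (p *MyPlugin) BeforeProviderRequest(_ context.Context, requestJSON string) string {
	var req map[string]any
	if err := json.Unmarshal([]byte(requestJSON), &req); err != nil {
		return requestJSON // leave the request untouched if it cannot be parsed
	}
	req["temperature"] = 0.2 // assumed field name in the serialized CompletionRequest
	out, err := json.Marshal(req)
	if err != nil {
		return requestJSON
	}
	return string(out)
}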
Info() is called once at startup. The service uses ContextWindow to trigger compaction when the conversation grows too large. HasImages controls whether the TUI offers image attachment UI.
All five adapters implement ModelLister. When --list-models is passed, the CLI casts the active provider to ModelLister and prints the result. Each adapter queries the appropriate API:
| Provider | Query mechanism |
| --- | --- |
| ollama | GET /api/tags |
| llamacpp | GET /v1/models |
| openai | GET /v1/models |
| anthropic | GET /v1/models |
| google | Gemini model list API |
Supported Providers
| Provider | Backend |
| --- | --- |
| ollama | Local Ollama server (HTTP) |
| llamacpp | llama.cpp server (HTTP, OpenAI-compatible) |
| openai | OpenAI API or any OpenAI-compatible endpoint |
| anthropic | Anthropic Messages API |
| google | Google Gemini API |
Each adapter lives in internal/llm/ and translates the provider’s wire format into the uniform Stream abstraction.
Feature Matrix
| Provider | Tools | Images | Thinking | Context Window |
| --- | --- | --- | --- | --- |
| ollama | ✓ | ✓ | model-dependent | 4096 (default) |
| llamacpp | ✓ | ✗ | ✗ | from server n_ctx |
| openai | ✓ | ✓ | reasoning models | model-dependent |
| anthropic | ✓ | ✓ | ✓ extended | model-dependent |
| google | ✓ | ✓ | ✗ | 1,000,000+ |
Per-Provider Notes
Ollama
The Ollama adapter uses the /api/chat endpoint with streaming enabled. Context window defaults to 4096 when not reported by the server. Thinking is supported on models that emit <think> tokens (e.g. qwq, deepseek-r1) — sharur surfaces these as EventThinkingDelta events by detecting the tag boundaries in the stream.
llama.cpp
Uses the OpenAI-compatible /v1/chat/completions endpoint. The context window (n_ctx) is queried from the server at startup. Image attachments are not supported because llama.cpp’s OpenAI endpoint does not accept multipart vision payloads in the standard format.
OpenAI
Uses the standard /v1/chat/completions streaming endpoint. Any server implementing this API — vLLM, LM Studio, Groq, Together AI — can be used by setting openAIBaseURL. Reasoning models (o3, o4-mini) emit reasoning_content deltas that are surfaced as EventThinkingDelta.
Anthropic
Uses the Messages API (/v1/messages) with streaming. Extended thinking is activated when req.Thinking is medium or high:
medium — 10,000-token thinking budget
high — 20,000-token thinking budget
The API requires temperature: 1.0 when extended thinking is enabled; the adapter sets this automatically and overrides any user-supplied temperature for that request.
Google
Uses the Gemini generateContent API via the google.golang.org/genai client library. Gemini 1.5 Pro and later have context windows of 1M+ tokens; compaction is rarely triggered for typical sessions.
Adding a Provider
Implement the Provider interface in internal/llm/yourprovider.go and register it in internal/config/factory.go. Implement ModelLister to enable --list-models. The adapter receives a fully-formed CompletionRequest; it is responsible for translating Message.ToolCalls and Message.Images into the target API’s format.
Each .jsonl file contains one JSON object per line:
Line 0 (header): kind=header — session ID, parentId, model, timestamps, system prompt, compaction settings, dryRun flag
Subsequent lines: kind=message — individual conversation messages with full payloads (role, content, thinking, tool calls, tool call ID)
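For illustration, a session file might look roughly like this (field names beyond kind, parentId, and dryRun are assumptions, and values are truncated):

{"kind":"header","id":"4f2c…","parentId":null,"model":"ollama/llama3.2","createdAt":"…","systemPrompt":"…","dryRun":false}
{"kind":"message","role":"user","content":"List the Go files in this directory"}
{"kind":"message","role":"assistant","content":"","toolCalls":[{"id":"call_1","name":"ls","args":{"path":"."}}]}
{"kind":"message","role":"tool","toolCallId":"call_1","content":"agent.go\nmain.go"}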
Session Tree
Sessions form a linked tree via parentId. The session.Manager.BuildTree() method assembles all sessions from the project directory into a []*TreeNode tree. FlattenTree produces a depth-first flat list with structured layout metadata (gutters, connectors, indentation), which the TUI layer uses to render a clean Unicode box-drawing tree diagram.
flowchart TD
A["Session A
(root)"] --> B["Session B
(/branch from A)"]
A --> C["Session C
(/fork of A)"]
B --> D["Session D
(/branch from B at msg 5)"]
B --> E["Session E
(/rebase of B)"]
B --> F["Session F
(/merge into B)"]
style C stroke-dasharray: 5 5
/fork creates an independent copy (dashed border above) with no parentId link — it does not appear as a child in the tree visualization.
Branching, Rebasing & Merging
flowchart TD
Q{"What do you need?"}
Q -->|"Explore an alternate
path from this point"| Branch["/branch [idx]
Child session, same history up to idx"]
Q -->|"Independent copy
no tree relationship"| Fork["/fork
Detached snapshot"]
Q -->|"Clean up the conversation
keep only specific messages"| Rebase["/rebase
Interactively select messages
for a new session"]
Q -->|"Combine two sessions
into one context"| Merge["/merge <id>
LLM-synthesized merge turn
appended to current session"]
| Command | Creates parent link | Copies history | Interactive |
| --- | --- | --- | --- |
| /branch [idx] | ✓ | up to idx | ✗ |
| /fork | ✗ | full | ✗ |
| /rebase | ✓ | selected messages | ✓ |
| /merge <id> | ✗ | appends other session | LLM turn |
The /tree modal (keyboard shortcut B, F, R on a selected session) exposes all of these without leaving the TUI.
Compaction & Context Management
To stay within LLM context windows, sharur implements an auto-compaction strategy:
Trigger: When tokens > ContextWindow - reserveTokens, compaction fires.
Summarization: The agent uses the LLM to generate a structured summary (<!-- sharur-summary -->) of the pruned messages.
File Tracking: The summary carries forward lists of files read and modified, so the assistant retains awareness of what it has already seen.
Split Turn Handling: If compaction cuts mid-turn, a “Turn Prefix Summary” is generated to preserve context for the remaining tool calls.
Session Tree Integration: Compaction events are stored as TypeCompaction records in the JSONL file, visible in /stats and preserved across restarts.
Compaction Configuration
// ~/.sharur/config.json or .sharur/config.json
{
  "compaction": {
    "enabled": true,
    "reserveTokens": 2048,
    "keepRecentTokens": 8192
  }
}
| Field | Default | Description |
| --- | --- | --- |
| enabled | true | Whether auto-compaction fires when the token budget is exceeded |
| reserveTokens | 2048 | Tokens to keep free at the top of the context window; compaction triggers when used > window - reserveTokens |
| keepRecentTokens | 8192 | Minimum recent-turn tokens to always retain after compaction, ensuring the current conversation thread survives |
Trigger compaction manually at any time with /compact in the TUI or by calling the Compact RPC directly.
Export & Import
Sessions can be exported to and imported from JSONL files:
# Export from TUI
/export /path/to/session.jsonl
# Import into TUI (creates a new session from the file)
/import /path/to/session.jsonl
# Export from CLI without entering TUI
shr --export /path/to/session.html  # HTML snapshot
Exported JSONL files are self-contained: they include the session header and all messages. Imported sessions are assigned a new UUID and added to the current project’s session directory.
historyEntry, contentItem, toolCallEntry — render data model
utils.go — helper functions (Capitalize)
Prompt Submission
Prompt submission uses promptGRPC(), which opens a client.Prompt() server-streaming RPC and drains *pb.AgentEvent messages into m.eventCh in a goroutine. The listenForEvent Bubble Tea command feeds that channel back into the update loop one event at a time.
Prompt History
The TUI maintains a per-session prompt history in m.promptHistory, synced from the service via GetMessages at startup and after session switches. Users navigate previous prompts using Up/Down arrow keys while the editor is focused; the current draft is preserved as m.draftInput.
Render Data Model
The TUI stores conversation history as []historyEntry. Each entry has an ordered []contentItem slice that preserves the exact stream order:
This mirrors the content[] array model, ensuring correct temporal ordering of thinking, text, and tool calls.
Modal System
Stats — Token counts, session metadata, file/path info
Config — Active model, provider, compaction settings
Session Tree — Interactive paginated tree with structured branch visualization; supports Resume (Enter) and Branch (B)
Rebase Picker — Selection interface for history manipulation
Merge Picker — Fuzzy finder for selecting sessions to merge into the current conversation
Build & Release
sharur uses a combination of Mage and GitHub Actions for CI/CD.
Versioning
The project version is maintained in a VERSION file in the repository root. During build, Magefile.go reads this file and injects it into the binary using linker flags (-ldflags "-X main.version=...").
Mage Targets
| Target | Description |
| --- | --- |
| Build | Compile shr for the current platform with version injection |
| Test | Run all unit tests with coverage |
| Vet | Static analysis with go vet |
| Lint | Run golangci-lint |
| Vuln | Vulnerability scan with govulncheck |
| All | Run generate, build, test, vet, lint, and vuln in sequence |
| Release | Cross-compile for Linux, macOS, and Windows (AMD64/ARM64), package into dist/ |
| Generate | Run buf to regenerate protobuf stubs |
| Docs | Generate API reference (gomarkdoc) and build the Hugo site |
| DocsServe | Run Hugo dev server at localhost:1313 with live reload |
| PkgSite | Run pkgsite for local full API browsing including internals |
CI/CD Pipelines
Continuous Integration (ci.yml)
Triggered on every push to main and all pull requests. Runs mage all within a Nix environment on both ubuntu-latest and macos-latest, then uploads per-platform binaries as build artifacts. Coverage is collected and summarised via go tool cover.
Automated Release (release.yml)
Triggered by pushing a version tag (e.g., v1.2.3). Runs mage release to build cross-platform assets and uses softprops/action-gh-release to publish them to a new GitHub Release.
Docs Deploy (docs.yml)
Triggered on push to main and on published releases. Runs mage docs (gomarkdoc + Hugo build) and deploys docs/public/ to the gh-pages branch via peaceiris/actions-gh-pages.
SDK
The github.com/goppydae/sharur/sdk package lets you embed a sharur agent in any Go program.
import "github.com/goppydae/sharur/sdk"
See the sub-pages for a quickstart, custom tool implementations, the EventBus API, and in-process extensions.
Subsections of SDK
Quickstart
Import github.com/goppydae/sharur/sdk to embed an agent in any Go program.
import "github.com/goppydae/sharur/sdk"

ag, err := sdk.NewAgent(sdk.Config{
	Provider: "ollama",
	Model:    "llama3.2",
	Tools:    sdk.DefaultTools(),
})
if err != nil {
	panic(err)
}

ag.Subscribe(func(e sdk.Event) {
	if e.Type == sdk.EventTextDelta {
		fmt.Print(e.Content)
	}
})

ag.Prompt(context.Background(), "List the Go files in this directory")
<-ag.Idle()
Config Fields
type Config struct {
	Provider      string            // "ollama", "openai", "anthropic", "llamacpp", "google"
	Model         string            // model name or "provider/model"
	APIKey        string            // optional; env vars take priority
	BaseURL       string            // optional provider endpoint override
	Tools         []sdk.Tool        // sdk.DefaultTools() or custom list
	Extensions    []sdk.Extension
	SystemPrompt  string
	ThinkingLevel sdk.ThinkingLevel
	SessionDir    string            // where to persist sessions
	DryRun        bool
}
Core API
| Call | Description |
| --- | --- |
| sdk.NewAgent(cfg) | Create and initialize an agent |
| ag.Subscribe(fn) | Register an event handler; called for every emitted event |
| ag.Prompt(ctx, text) | Send a user message and start the agent loop |
| ag.Idle() | Returns a channel that closes when the agent reaches Idle state |
| ag.Steer(ctx, text) | Inject a steering message into the running turn |
| ag.FollowUp(ctx, text) | Queue a message to process after the current turn |
| ag.Abort(ctx) | Cancel the current running turn |
| ag.SetExtensions(exts) | Replace the extension list (takes effect on next prompt) |
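A short sketch of steering and follow-up, reusing the agent from the quickstart above (the quickstart pattern suggests Prompt does not block until completion, so Idle() is used to wait):

ag.Prompt(context.Background(), "Summarize every file in this repository")

// Adjust course while the turn is still running.
ag.Steer(context.Background(), "Only summarize Go files")

// Queue a message to process after the current turn finishes.
ag.FollowUp(context.Background(), "Write the summaries to SUMMARY.md")

<-ag.Idle()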
Event Types
Subscribe to events by checking e.Type:
| Event type | Payload field | Description |
| --- | --- | --- |
| EventAgentStart | — | Agent loop started |
| EventAgentEnd | — | Agent loop completed |
| EventTurnStart | — | LLM turn started |
| EventTurnEnd | — | LLM turn completed |
| EventTextDelta | e.Content | Incremental response text |
| EventThinkingDelta | e.Content | Incremental thinking text |
| EventToolCall | e.ToolCall | Tool invocation started |
| EventToolDelta | e.Content | Streaming tool output |
| EventToolOutput | e.ToolOutput | Final tool result |
Minimal Example (no tools, no session)
ag, _ := sdk.NewAgent(sdk.Config{
	Provider: "anthropic",
	Model:    "claude-sonnet-4-6",
	APIKey:   os.Getenv("ANTHROPIC_API_KEY"),
})

var buf strings.Builder
ag.Subscribe(func(e sdk.Event) {
	if e.Type == sdk.EventTextDelta {
		buf.WriteString(e.Content)
	}
})

ag.Prompt(context.Background(), "What is 2+2?")
<-ag.Idle()
fmt.Println(buf.String())
Pass sdk.DefaultTools() in sdk.Config.Tools to get the full set of built-in tools:
| Tool | Description |
| --- | --- |
| read | Read file contents with offset/limit support |
| write | Create or overwrite files |
| edit | Search-and-replace edits within files |
| bash | Execute shell commands |
| grep | Search file contents via regex |
| ls | List directory contents |
| find | Locate files using glob patterns |
bash, write, and edit are destructive. In --dry-run mode they preview what they would do without executing.
Tool Interface
Implement sdk.Tool to create a custom tool:
type Tool interface {
	Name() string
	Description() string
	Schema() json.RawMessage // JSON Schema for the input parameters
	Execute(ctx context.Context, args json.RawMessage, update ToolUpdate) (*ToolResult, error)
	IsReadOnly() bool // if true, tool is allowed in dry-run mode
}
ToolUpdate is a callback for streaming partial output while the tool runs:
type ToolUpdate func(content string)
Example: Custom Tool
type CountLinesTool struct{}

func (t *CountLinesTool) Name() string        { return "count_lines" }
func (t *CountLinesTool) Description() string { return "Count the number of lines in a file" }

func (t *CountLinesTool) Schema() json.RawMessage {
	return json.RawMessage(`{
		"type": "object",
		"properties": {
			"path": {"type": "string", "description": "File path to count lines in"}
		},
		"required": ["path"]
	}`)
}

func (t *CountLinesTool) IsReadOnly() bool { return true }

func (t *CountLinesTool) Execute(ctx context.Context, args json.RawMessage, update sdk.ToolUpdate) (*sdk.ToolResult, error) {
	var input struct {
		Path string `json:"path"`
	}
	if err := json.Unmarshal(args, &input); err != nil {
		return nil, err
	}
	data, err := os.ReadFile(input.Path)
	if err != nil {
		return &sdk.ToolResult{Content: err.Error(), IsError: true}, nil
	}
	n := strings.Count(string(data), "\n") + 1
	return &sdk.ToolResult{Content: fmt.Sprintf("%d lines", n)}, nil
}
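Tools that produce output incrementally can stream it through update while they run; per the event table above, these chunks surface as EventToolDelta. A minimal sketch with a hypothetical slow tool (the other interface methods are omitted):

func (t *SlowTool) Execute(ctx context.Context, args json.RawMessage, update sdk.ToolUpdate) (*sdk.ToolResult, error) {
	for i := 1; i <= 3; i++ {
		if err := ctx.Err(); err != nil {
			return nil, err // respect cancellation/abort
		}
		update(fmt.Sprintf("step %d/3 complete\n", i)) // streamed partial output
		time.Sleep(time.Second)
	}
	return &sdk.ToolResult{Content: "done"}, nil
}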
Multiple subscribers are allowed. Each runs in its own goroutine. The EventBus is non-blocking — Publish enqueues to a 4096-item buffered channel per subscriber and returns immediately, so slow subscribers drop events rather than stalling the agent loop.
ag.Idle() returns a channel that closes when the agent returns to Idle. Use it to block until a prompt completes:
ag.Prompt(ctx, "Refactor main.go")
<-ag.Idle()
// agent is idle, safe to call Prompt again
In-Process Extensions
If your extension is written in Go and you control the build, you can implement sdk.Extension (an alias of agent.Extension) directly — no gRPC, no subprocess, no socket. This is the lowest-overhead extension path.
All types are re-exported from sdk so callers only need to import github.com/goppydae/sharur/sdk.
Key Hook Behaviours
ModifyInput — runs before the user text is added to the transcript. Return an InputResult with:
sdk.InputContinue — pass through unchanged
sdk.InputTransform — replace with result.Text
sdk.InputHandled — consume entirely; no agent turn is started and nothing is appended to the transcript
ModifyContext — receives and returns the message slice that will be sent to the LLM. Changes do not affect the stored session transcript — they are ephemeral per-turn.
BeforeToolCall — return (result, true) to intercept and block the tool; return (nil, false) to allow normal execution.
BeforeCompact — return nil to let the default LLM summarization run, or a *CompactionResult to supply your own summary and skip the LLM call.
type sandboxExt struct {
	sdk.NoopExtension
	allowedDir string
}

func (e *sandboxExt) BeforeToolCall(_ context.Context, call *sdk.ToolCall, args json.RawMessage) (*sdk.ToolResult, bool) {
	var input struct {
		Path string `json:"path"`
	}
	_ = json.Unmarshal(args, &input)
	if input.Path != "" && !strings.HasPrefix(input.Path, e.allowedDir) {
		return &sdk.ToolResult{
			Content: fmt.Sprintf("blocked: %s is outside %s", input.Path, e.allowedDir),
			IsError: true,
		}, true
	}
	return nil, false
}
This section is generated by gomarkdoc from Go source comments. Run mage docs to regenerate.
Extension interface and NoopExtension for in-process extensions
Note:internal/tools and internal/agent are documented here because they are contract surfaces for in-process extension authors who build inside the same module. Go’s import restrictions prevent external consumers from importing them directly, but the interfaces are stable and intentionally exposed through this reference.
Subsections of API Reference
agent
import "github.com/goppydae/sharur/internal/agent"
Package agent provides the stateful agent with transcript, tools, and events.
const SUMMARIZATION_PROMPT = `The messages above are a conversation to summarize. Create a structured context checkpoint summary that another LLM will use to continue the work.
Start your response with the exact string: <!-- sharur-summary -->
Then use this EXACT format:
## Goal
[What is the user trying to accomplish? Can be multiple items if the session covers different tasks.]
## Constraints & Preferences
- [Any constraints, preferences, or requirements mentioned by user]
- [Or "(none)" if none were mentioned]
## Progress
### Done
- [x] [Completed tasks/changes]
### In Progress
- [ ] [Current work]
### Blocked
- [Issues preventing progress, if any]
## Key Decisions
- **[Decision]**: [Brief rationale]
## Next Steps
1. [Ordered list of what should happen next]
## Critical Context
- [Any data, examples, or references needed to continue]
- [Or "(none)" if not applicable]
Keep each section concise. Preserve exact file paths, function names, and error messages.`
const TURN_PREFIX_SUMMARIZATION_PROMPT = `This is the PREFIX of a turn that was too large to keep. The SUFFIX (recent work) is retained.
Summarize the prefix to provide context for the retained suffix:
## Original Request
[What did the user ask for in this turn?]
## Early Progress
- [Key decisions and work done in the prefix]
## Context for Suffix
- [Information needed to understand the retained recent work]
Be concise. Focus on what's needed to understand the kept suffix.`
const UPDATE_SUMMARIZATION_PROMPT = `The messages above are NEW conversation messages to incorporate into the existing summary provided in <previous-summary> tags.
Start your response with the exact string: <!-- sharur-summary -->
Update the existing structured summary with new information. RULES:
- PRESERVE all existing information from the previous summary
- ADD new progress, decisions, and context from the new messages
- UPDATE the Progress section: move items from "In Progress" to "Done" when completed
- UPDATE "Next Steps" based on what was accomplished
- PRESERVE exact file paths, function names, and error messages
- If something is no longer relevant, you may remove it
Use this EXACT format:
## Goal
[Preserve existing goals, add new ones if the task expanded]
## Constraints & Preferences
- [Preserve existing, add new ones discovered]
## Progress
### Done
- [x] [Include previously done items AND newly completed items]
### In Progress
- [ ] [Current work - update based on progress]
### Blocked
- [Current blockers - remove if resolved]
## Key Decisions
- **[Decision]**: [Brief rationale] (preserve all previous, add new)
## Next Steps
1. [Update based on current state]
## Critical Context
- [Preserve important context, add new if needed]
Keep each section concise. Preserve exact file paths, function names, and error messages.`
func EstimateMessageTokens
func EstimateMessageTokens(m Message) int
type Agent
Agent owns the transcript, emits events, and executes tools.
type Agent struct {
    // contains filtered or unexported fields
}
func (*Agent) InvokeTool
InvokeTool manually triggers a tool call as if it came from the assistant. It executes the tool, records the result, and then starts the agent loop to allow the LLM to react to the invocation.
func (*Agent) IsRunning
func (a *Agent) IsRunning() bool
IsRunning reports whether the agent is currently processing.
func (*Agent) LifecycleState
func (a *Agent) LifecycleState() string
LifecycleState returns the current lifecycle state as a string.
func (*Agent) Messages
func (a *Agent) Messages() []Message
Messages returns a copy of the conversation messages.
type Event
type Event struct {
    Type     EventType
    Content  string
    ToolCall *ToolCall
    Usage    *llm.Usage
    Error    error

    // ToolOutput stores the result content of a tool execution.
    // Emitted when type is EventToolOutput.
    ToolOutput *ToolOutput

    // StateChange holds details of a lifecycle state transition.
    // Emitted when type is EventStateChange.
    StateChange *StateTransition

    // Value stores a numeric value (e.g. token count).
    // Emitted when type is EventTokens.
    Value int64
}
type Extension
Extension is the unified interface for all extensions (gRPC plugins, Markdown Skills, etc.).
type Extension interface {
    // Name returns the extension's unique identifier.
    Name() string

    // Tools returns additional tools to register with the agent.
    Tools() []tools.Tool

    // BeforePrompt is called before each LLM request.
    // Return a modified state to change the request.
    BeforePrompt(ctx context.Context, state *AgentState) *AgentState

    // BeforeToolCall is called before each tool execution.
    // Return (result, true) to intercept and prevent the tool from running.
    // Return (nil, false) to allow normal execution.
    BeforeToolCall(ctx context.Context, call *ToolCall, args json.RawMessage) (*tools.ToolResult, bool)

    // AfterToolCall is called after each tool call completes.
    // Return a modified result to change the outcome.
    AfterToolCall(ctx context.Context, call *ToolCall, result *tools.ToolResult) *tools.ToolResult

    // ModifySystemPrompt is called to augment the system prompt.
    ModifySystemPrompt(prompt string) string

    // SessionStart is called when a session is attached or the first prompt begins.
    SessionStart(ctx context.Context, sessionID string, reason SessionStartReason)

    // SessionEnd is called when a session is reset or the agent is torn down.
    SessionEnd(ctx context.Context, sessionID string, reason SessionEndReason)

    // AgentStart is called when the agent begins processing a user prompt.
    AgentStart(ctx context.Context)

    // AgentEnd is called when the agent loop finishes (success, error, or abort).
    AgentEnd(ctx context.Context)

    // TurnStart is called at the start of each LLM request turn.
    TurnStart(ctx context.Context)

    // TurnEnd is called after each turn's tool calls have been processed.
    TurnEnd(ctx context.Context)

    // ModifyInput is called with raw user input before it is added to the transcript.
    // Return InputHandled to consume the message without further processing.
    // Return InputTransform to replace the text.
    // Return InputContinue (or zero value) to proceed unchanged.
    ModifyInput(ctx context.Context, text string) InputResult

    // ModifyContext is called with the message slice just before building each LLM
    // request. The returned slice replaces what is sent to the LLM (not the stored
    // transcript). Extensions are chained; each receives the previous result.
    ModifyContext(ctx context.Context, messages []types.Message) []types.Message

    // BeforeProviderRequest is called with the assembled CompletionRequest before
    // it is sent to the LLM provider. Return a modified copy to alter the request.
    BeforeProviderRequest(ctx context.Context, req *llm.CompletionRequest) *llm.CompletionRequest

    // AfterProviderResponse is called after the LLM stream is fully consumed.
    AfterProviderResponse(ctx context.Context, content string, numToolCalls int)

    // BeforeCompact is called before the compaction summarization LLM call.
    // Return a non-nil *CompactionResult to provide a custom summary and skip the
    // default LLM-based summarization entirely.
    BeforeCompact(ctx context.Context, prep CompactionPrep) *CompactionResult

    // AfterCompact is called after compaction completes.
    AfterCompact(ctx context.Context, freedTokens int)
}
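As a sketch of the lifecycle hooks, an in-process extension can embed sdk.NoopExtension and override only the callbacks it needs. The timing logic and log output below are illustrative, not part of the API, and the snippet assumes the standard library context, log, and time packages are imported:

type timingExt struct {
    sdk.NoopExtension
    start time.Time
}

func (t *timingExt) Name() string { return "timing" }

// AgentStart fires when the agent begins processing a user prompt.
func (t *timingExt) AgentStart(_ context.Context) { t.start = time.Now() }

// AgentEnd fires when the agent loop finishes (success, error, or abort).
func (t *timingExt) AgentEnd(_ context.Context) {
    log.Printf("agent loop finished in %s", time.Since(t.start))
}

// ModifySystemPrompt appends a standing instruction to the system prompt.
func (t *timingExt) ModifySystemPrompt(prompt string) string {
    return prompt + "\nAlways explain destructive commands before running them."
}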
type Image
Image is an alias for types.Image.
type Image = types.Image
type InputAction
InputAction controls how ModifyInput’s result is applied.
type InputAction string
const (
    // InputContinue passes the original text through unchanged.
    InputContinue InputAction = "continue"

    // InputTransform replaces the user text with InputResult.Text.
    InputTransform InputAction = "transform"

    // InputHandled marks the input as consumed; the message is not appended to the transcript.
    InputHandled InputAction = "handled"
)
type InputResult
InputResult is returned by ModifyInput to describe how to process the user input.
func (*Loader) Cleanup
Cleanup kills all running extension subprocesses and removes their socket files.
func (*Loader) Load
func (l *Loader) Load() ([]agent.Extension, []error)
Load discovers extensions, starts them as subprocesses, and returns gRPC client interfaces. Extensions that fail to load are logged and skipped; the returned error accumulates all failures so callers can distinguish “nothing loaded” from “everything succeeded”.
func (*Loader) LoadOrLog
func (l *Loader) LoadOrLog() []agent.Extension
LoadOrLog calls Load and logs any errors, returning only the successfully loaded extensions.
type NoopPlugin
NoopPlugin is a base Plugin implementation with no-op defaults. Embed it in your Plugin struct and override only what you need.
func NewAgent
NewAgent creates a new agent from the given configuration.
type CompactionPrep
CompactionPrep describes the state passed to BeforeCompact.
type CompactionPrep = agent.CompactionPrep
type CompactionResult
CompactionResult can be returned by BeforeCompact to provide a custom summary.
type CompactionResult = agent.CompactionResult
type Config
Config holds the options for creating a new agent.
type Config struct {
    // Provider selects the LLM backend: "ollama" (default), "openai", or "anthropic".
    Provider string
    // Model is the model name to use (e.g. "llama3", "gpt-4o", "claude-sonnet-4-6").
    Model string
    // OllamaURL overrides the Ollama base URL (default: http://localhost:11434).
    OllamaURL string
    // OpenAIURL overrides the OpenAI-compatible base URL.
    OpenAIURL string
    // OpenAIKey is the API key for OpenAI or any compatible provider.
    OpenAIKey string
    // AnthropicKey is the Anthropic API key.
    AnthropicKey string
    // SystemPrompt sets the agent's system prompt.
    SystemPrompt string
    // ThinkingLevel controls reasoning depth.
    ThinkingLevel ThinkingLevel
    // MaxTokens caps the response length (0 = provider default).
    MaxTokens int
    // DryRun mode prevents tools from performing destructive actions.
    DryRun bool
    // Tools registers additional tools beyond the builtins.
    // Pass tools.Read{}, tools.Write{}, tools.Bash{}, etc.
    Tools []Tool
    // Extensions registers active extensions (gRPC plugins or Skills).
    Extensions []Extension
}
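Putting the pieces together, a minimal construction sketch might look like the following. The field values are illustrative; the snippet assumes this is the re-exported sdk.Config, that the tools package mirrors the internal/agent import path shown above, and that NewAgent returns (*Agent, error), since its signature is not reproduced in this excerpt:

cfg := sdk.Config{
    Provider:     "ollama", // local-first default
    Model:        "llama3",
    SystemPrompt: "You are a concise coding assistant.",
    MaxTokens:    4096,
    DryRun:       true, // destructive tools only report what they would do
    Tools:        []sdk.Tool{tools.Bash{Timeout: 30 * time.Second}},
    Extensions:   []sdk.Extension{&sandboxExt{allowedDir: "/home/me/project"}},
}

// Assumed signature: NewAgent(Config) (*Agent, error); check the generated
// reference for the exact form.
ag, err := sdk.NewAgent(cfg)
if err != nil {
    log.Fatal(err)
}
_ = ag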
type Event
Event is an agent lifecycle event emitted to subscribers.
type Event = agent.Event
type EventType
EventType identifies the kind of event.
type EventType = agent.EventType
type Extension
Extension is the interface for agent extensions (gRPC plugins, skills, etc.).
type Extension = agent.Extension
type InputAction
InputAction controls how ModifyInput’s result is applied.
func NormalizePath
NormalizePath strips a leading ‘@’ from a path if present.
type Bash
Bash is a tool for executing shell commands.
Security note: commands are executed as-is via `bash -c`. The subprocess runs in an isolated environment containing only an explicit allowlist of variables (PATH, HOME, LANG, TERM, TMPDIR, USER, SHELL). Extra variables can be injected via EnvAllowlist. DenyPatterns blocks commands by substring match before execution.
type Bash struct {
    // Cwd is the working directory for commands.
    Cwd string
    // Timeout for command execution.
    Timeout time.Duration
    // DenyPatterns is an optional list of substrings that, if found in the
    // command, will cause execution to be rejected. Checked case-insensitively.
    // Example: []string{"rm -rf /", "dd if=", "> /dev/sd"}
    DenyPatterns []string
    // EnvAllowlist is an optional list of KEY=VALUE pairs to inject into the
    // subprocess environment in addition to the default allowlist.
    EnvAllowlist []string
}
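As a usage sketch built only from the fields above, a locked-down shell tool could be configured like this and passed to the agent via Config.Tools (the working directory and deny list are illustrative):

shell := tools.Bash{
    Cwd:     "/home/me/project",
    Timeout: 30 * time.Second,
    // Reject obviously destructive commands by substring match (case-insensitive).
    DenyPatterns: []string{"rm -rf /", "dd if=", "> /dev/sd"},
    // Expose one extra variable on top of the default allowlist.
    EnvAllowlist: []string{"GOFLAGS=-mod=vendor"},
}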
type Tool
Tool is the universal tool interface — anything the agent can do.
type Tool interface {
    Name() string
    Description() string
    Schema() json.RawMessage // JSON Schema for parameters
    Execute(ctx context.Context, args json.RawMessage, update ToolUpdate) (*ToolResult, error)
    IsReadOnly() bool
}
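To illustrate the interface shape, a minimal read-only tool might look like the sketch below. The ToolResult field names (Content, IsError) are taken from the sandbox example earlier in this reference, the empty JSON schema is a placeholder, and the update parameter is ignored:

type Clock struct{}

func (Clock) Name() string        { return "clock" }
func (Clock) Description() string { return "Returns the current local time." }
func (Clock) IsReadOnly() bool    { return true }

func (Clock) Schema() json.RawMessage {
    // No parameters: an empty object schema.
    return json.RawMessage(`{"type":"object","properties":{}}`)
}

func (Clock) Execute(_ context.Context, _ json.RawMessage, _ tools.ToolUpdate) (*tools.ToolResult, error) {
    return &tools.ToolResult{Content: time.Now().Format(time.RFC3339)}, nil
}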
type ToolCall
ToolCall represents a tool invocation from the LLM.