User Guide

The user guide covers day-to-day use of sharur from the terminal:

  • CLI — runtime modes, flags, keybindings, slash commands, and configuration
  • Extensibility — skills, prompt templates, and Go/Python/gRPC extensions

Subsections of User Guide

CLI

shr is the sharur CLI binary. It supports three runtime modes and a rich flag surface for model selection, session management, tools, and extensions.

Runtime Modes

| Mode | Flag | Description |
| --- | --- | --- |
| TUI | --mode tui (default) | Interactive Bubble Tea terminal interface with streaming, tool cards, and session management |
| JSON | --mode json | One-shot query with line-delimited JSON event output — useful for shell pipelines |
| gRPC | --mode grpc | Persistent multi-session gRPC service — any gRPC-capable client can connect |

Quick Start

# Launch the interactive TUI
shr

# One-shot answer (JSONL output)
shr --mode json "What is the best way to structure a Go project?"

# Resume the most recent session
shr --continue

See the sub-pages for full keybinding and slash command references, JSON event schema, gRPC proto overview, provider setup, and the full configuration schema.

Subsections of CLI

Configuration

sharur uses layered JSON configuration. Project-level settings override global defaults.

| Path | Scope |
| --- | --- |
| ~/.sharur/config.json | Global defaults — applies to all projects |
| .sharur/config.json | Project-level overrides — applies in this directory |

config.json Schema

{
  "defaultModel": "llama3.2",
  "defaultProvider": "ollama",
  "theme": "dark",
  "thinkingLevel": "medium",
  "ollamaBaseURL": "http://localhost:11434",
  "openAIBaseURL": "https://api.openai.com/v1",
  "openAIApiKey": "",
  "anthropicApiKey": "",
  "anthropicApiVersion": "",
  "googleApiKey": "",
  "llamaCppBaseURL": "http://localhost:8080",
  "compaction": {
    "enabled": true,
    "reserveTokens": 2048,
    "keepRecentTokens": 8192
  }
}

API keys can also be set via environment variables — env vars take priority over config file values.
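The layering behaves like a JSON overlay: the global file is decoded first and the project file is decoded over it, so project-level keys win while untouched global keys survive. A minimal sketch of that semantics (not sharur's actual loader):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// overlay decodes each JSON document into the same map in order, so
// keys from later (project-level) documents override earlier (global)
// ones. json.Unmarshal reuses a non-nil map, keeping existing entries.
func overlay(docs ...[]byte) (map[string]any, error) {
	merged := map[string]any{}
	for _, doc := range docs {
		if err := json.Unmarshal(doc, &merged); err != nil {
			return nil, err
		}
	}
	return merged, nil
}

func main() {
	global := []byte(`{"defaultProvider": "ollama", "theme": "dark"}`)
	project := []byte(`{"defaultProvider": "anthropic"}`)

	cfg, err := overlay(global, project)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg["defaultProvider"], cfg["theme"]) // project wins, global fills gaps
}
```

Note that this sketch replaces top-level keys wholesale; it does not deep-merge nested objects such as compaction.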


Context Files

sharur auto-discovers AGENTS.md, CLAUDE.md, GEMINI.md, and .context.md in your project root and parent directories and injects them into the system prompt. Outermost files take precedence (parent directory wins over project root).

Disable with --no-context-files.


CLI Flags

Mode

| Flag | Description |
| --- | --- |
| --mode | Mode: tui (default), json, grpc |
| --grpc-addr | gRPC listen address (default :50051; --mode grpc only) |

Model / Provider

| Flag | Description |
| --- | --- |
| --model / -m | Model to use (e.g. llama3, gpt-4o, anthropic/claude-sonnet-4-6) |
| --provider | Provider: ollama, openai, anthropic, llamacpp, google |
| --api-key | API key override |
| --thinking | Thinking level: off, minimal, low, medium, high, xhigh |
| --models | Comma-separated model list for Ctrl+P cycling |

Session

| Flag | Description |
| --- | --- |
| --continue / -c | Resume the most recent session |
| --resume / -r | Select a session to resume (fuzzy search or ID) |
| --session | Use a specific session file path |
| --session-dir | Directory for session storage and lookup |
| --branch | Branch from a session file or partial UUID into a new child session |
| --no-session | Ephemeral mode: don’t save the session |

System Prompt

| Flag | Description |
| --- | --- |
| --system-prompt | Override the system prompt |
| --append-system-prompt | Append text or file to the system prompt (repeatable) |

Tools

| Flag | Description |
| --- | --- |
| --tools | Comma-separated list of tools to enable: read,bash,edit,write,grep,find,ls |
| --no-tools | Disable all built-in tools |
| --dry-run | Safety mode: destructive tools preview actions instead of running |

Extensions / Skills / Prompts

| Flag | Description |
| --- | --- |
| --extension / -e | Load a gRPC extension binary (repeatable) |
| --no-extensions | Disable extension directory auto-discovery (-e paths still load) |
| --skill | Load a skill file or directory (repeatable) |
| --no-skills | Disable skill auto-discovery |
| --prompt-template | Load a prompt template file or directory (repeatable) |
| --no-prompt-templates | Disable prompt template auto-discovery |

Output / Info

| Flag | Description |
| --- | --- |
| --export | Export current session to an HTML file and exit |
| --list-models | List available models from the configured provider (optional fuzzy filter) |
| --version / -v | Show version number |
| --verbose | Force verbose startup output |
| --offline | Disable startup network operations (model checks, etc.) |

Provider Setup

sharur supports five LLM providers. All configuration lives in config.json files or environment variables; environment variables take priority over config file values.


Model Naming

Models can be specified as provider/model shorthand or with separate flags:

# Shorthand: provider inferred from the slash-prefix
shr --model anthropic/claude-sonnet-4-6

# Explicit: provider and model as separate flags
shr --provider anthropic --model claude-sonnet-4-6

Both forms are equivalent. The shorthand is convenient for one-off overrides; the config file form is better for persistent defaults.
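The shorthand resolution amounts to splitting on the first slash; a small sketch, with the bare-name case falling back to the configured default provider:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelRef resolves the provider/model shorthand: a slash-prefixed
// reference like "anthropic/claude-sonnet-4-6" yields both parts, while
// a bare name like "llama3.2" leaves the provider empty so the
// configured default applies. Illustrative only.
func splitModelRef(ref string) (provider, model string) {
	if before, after, ok := strings.Cut(ref, "/"); ok {
		return before, after
	}
	return "", ref
}

func main() {
	p, m := splitModelRef("anthropic/claude-sonnet-4-6")
	fmt.Println(p, m)
	p, m = splitModelRef("llama3.2")
	fmt.Printf("provider=%q model=%q\n", p, m)
}
```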


Environment Variables

API keys set via environment variable take priority over values in config.json. The env var names use the SHARUR_ prefix:

| Provider | Environment Variable |
| --- | --- |
| Anthropic | SHARUR_ANTHROPIC_API_KEY |
| OpenAI | SHARUR_OPENAI_API_KEY |
| Google | SHARUR_GOOGLE_API_KEY |

Ollama and llama.cpp are local servers and do not use API keys.


Ollama

Ollama runs models locally. It is the default provider.

// ~/.sharur/config.json or .sharur/config.json
{
  "defaultProvider": "ollama",
  "defaultModel": "llama3.2",
  "ollamaBaseURL": "http://localhost:11434"
}
# Pull a model and launch
ollama pull llama3.2
shr

# Use a specific model
shr --model ollama/llama3.2

# Point at a remote Ollama server (set ollamaBaseURL in config first)
shr --model llama3.2 --provider ollama

Notes:

  • Default base URL is http://localhost:11434. Override with ollamaBaseURL.
  • Ollama models support tools and images (vision models).
  • Use shr --list-models to see all locally available models.
  • Thinking is supported on models that emit <think> tokens (e.g. qwq, deepseek-r1).

llama.cpp

llama.cpp exposes an OpenAI-compatible HTTP server.

{
  "defaultProvider": "llamacpp",
  "llamaCppBaseURL": "http://localhost:8080"
}
# Start the llama.cpp server (example)
./llama-server -m model.gguf --port 8080

# Connect with sharur
shr --provider llamacpp --model my-model

Notes:

  • Default base URL is http://localhost:8080. Override with llamaCppBaseURL.
  • The model name passed to shr is forwarded to the server as-is.
  • Image attachments are not supported.
  • The server’s own context window size is used; sharur queries /v1/models to detect it.

OpenAI

{
  "defaultProvider": "openai",
  "defaultModel": "gpt-4o",
  "openAIApiKey": "",
  "openAIBaseURL": "https://api.openai.com/v1"
}
# Via environment variable (recommended)
export SHARUR_OPENAI_API_KEY=sk-...
shr --model openai/gpt-4o

# One-off key override
shr --provider openai --model gpt-4o --api-key sk-...

OpenAI-compatible endpoints:

Any server that implements the OpenAI chat completions API can be used by pointing openAIBaseURL at it:

{
  "defaultProvider": "openai",
  "openAIBaseURL": "http://localhost:11434/v1",
  "openAIApiKey": "unused"
}

This works with vLLM, LM Studio, and others.

Notes:

  • Reasoning models (o3, o4-mini) emit thinking deltas that appear in the TUI and JSON event stream.
  • Supports tools and vision (images) for compatible models.

Anthropic

{
  "defaultProvider": "anthropic",
  "defaultModel": "claude-sonnet-4-6",
  "anthropicApiKey": "",
  "anthropicApiVersion": ""
}
export SHARUR_ANTHROPIC_API_KEY=sk-ant-...
shr --model anthropic/claude-sonnet-4-6

# Extended thinking (claude-3-7-sonnet and later)
shr --model anthropic/claude-3-7-sonnet-20250219 --thinking high

Notes:

  • Extended thinking is supported for models that enable it (e.g. claude-3-7-sonnet). Use --thinking medium or --thinking high.
  • medium thinking uses a 10,000-token budget; high uses 20,000 tokens. Temperature is automatically set to the required value.
  • anthropicApiVersion overrides the anthropic-version request header; leave empty to use the library default.

Google Gemini

{
  "defaultProvider": "google",
  "defaultModel": "gemini-2.0-flash",
  "googleApiKey": ""
}
export SHARUR_GOOGLE_API_KEY=AIza...
shr --model google/gemini-2.0-flash

Notes:

  • Gemini 1.5 Pro and later have a 1M+ token context window.
  • Supports tools and vision (images).
  • Use shr --list-models to see available Gemini models.

Listing Available Models

All five providers implement model listing. Use --list-models to query the active provider:

# List Ollama models
shr --list-models

# List models from a specific provider
shr --provider anthropic --list-models

# Filter results
shr --provider openai --list-models gpt-4

The output is a plain list of model names, suitable for piping:

shr --list-models | fzf | xargs -I{} shr --model {}

Provider Feature Matrix

| Provider | Tools | Images | Thinking | Model Listing |
| --- | --- | --- | --- | --- |
| ollama | ✓ | ✓ | model-dependent | ✓ |
| llamacpp | ✓ | — | — | ✓ |
| openai | ✓ | ✓ | reasoning models | ✓ |
| anthropic | ✓ | ✓ | ✓ extended | ✓ |
| google | ✓ | ✓ | — | ✓ |

TUI

The TUI is a rich, Bubble Tea-powered interface with real-time streaming, tool cards, session management, and a live context usage progress bar in the status footer.


Keybindings

| Key | Action |
| --- | --- |
| Enter | Send message (or Steer the running agent) |
| Shift+Enter | Insert newline |
| Ctrl+Enter | Queue follow-up message (runs after agent finishes) |
| Ctrl+C | Abort the current agent run and clear the input editor |
| Esc | Cancel streaming / Close modal / Abort current turn |
| Ctrl+O | Toggle tool call output expansion |
| Ctrl+P | Open model selection modal (cycling via --models flag) |
| ↑/↓ | Navigate prompt history (if at start/end of editor) / Scroll viewport |
| F1 | Show help modal |

Slash Commands

| Command | Description |
| --- | --- |
| /new | Start a fresh session |
| /resume <id> | Resume a session by ID or partial UUID (fuzzy search enabled) |
| /branch [idx] | Create a new child session branching from a specific message index (defaults to last) |
| /fork | Duplicate current session into a new independent session (no parent link) |
| /rebase | Interactive rebase: select specific messages to keep in a new session |
| /merge <id> | Merge another session’s history into the current one with a synthesis turn |
| /tree [-g\|-p] | Open session tree modal. Flags: --global (-g) or --project (-p) |
| /import <path> | Import a session from a JSONL file |
| /export <path> | Export the current session to a JSONL file |
| /model <p/m> | Switch model mid-conversation (e.g. /model anthropic/claude-sonnet-4-6) |
| /stats | View session statistics and token usage |
| /config | View and edit active configuration |
| /context | View detailed context window usage |
| /compact | Manually trigger a context compaction |
| /skill:<name> [args] | Invoke a skill |
| /prompt:<name> | Expand a prompt template into the editor |
| /exit | Quit (alias: /quit) |

Session Tree Modal (/tree)

| Key | Action |
| --- | --- |
| ↑/↓ / PgUp/PgDn | Navigate the session list |
| Enter | Resume the selected session (or branch from it if it’s an interior node) |
| B | Create a new branch from the selected session |
| F | Create an independent fork of the selected session |
| R | Start an interactive rebase from the selected session’s history |
| Esc | Close modal |

Bang Commands

Bang commands execute a shell command and inject the output into the conversation:

!ls -la          # Execute shell command, paste output into editor
!!cat README.md  # Execute shell command, send output directly to agent
  • !cmd — pastes stdout into the editor so you can review before sending
  • !!cmd — sends stdout directly to the agent without review

At-File Attachments

Type @ in the input to fuzzy-search and attach file contents to your prompt:

Tell me what this does @src/agent/loop.go

The file content is embedded inline in the message sent to the agent.

JSON Mode

JSON mode runs a single prompt and streams the agent’s events as line-delimited JSON (JSONL) to stdout. It is designed for shell pipelines and tooling integration.

shr --mode json "What is the best way to structure a Go project?"

# Pipe stdin as context
cat main.go | shr --mode json "Refactor this to use interfaces"

# Specify a model
shr --mode json "Summarize the last 10 git commits" --model anthropic/claude-opus-4-5

Event Format

Each line is the protobuf JSON encoding of an AgentEvent. Event types mirror the TUI stream:

  • EVENT_AGENT_START / EVENT_AGENT_END
  • EVENT_TEXT_DELTA — incremental response text
  • EVENT_THINKING_DELTA — incremental thinking text (extended thinking models)
  • EVENT_TOOL_CALL — tool invocation start
  • EVENT_TOOL_DELTA — streaming tool output
  • EVENT_TOOL_OUTPUT — final tool result
  • EVENT_TURN_START / EVENT_TURN_END

Common Patterns

# Capture only the text deltas
shr --mode json "Explain Go interfaces" \
  | jq -r 'select(.type == "EVENT_TEXT_DELTA") | .content'

# Run without saving the session
shr --mode json --no-session "Quick one-off question"

# Dry-run to see what tools would be called
shr --mode json --dry-run "Delete all .tmp files in the current directory"

gRPC Mode

gRPC mode starts a persistent AgentService server. Each connecting client supplies a session_id and gets its own isolated agent. Sessions are saved to disk after each turn and reloaded automatically on reconnect.

# Start on the default port
shr --mode grpc

# Use a custom address
shr --mode grpc --grpc-addr :9090

The server responds to SIGINT/SIGTERM with a graceful shutdown: in-flight turns are allowed to finish (30 s timeout), all sessions are flushed to disk, then the listener closes.


Proto Definition

The service is defined in proto/sharur/v1/agent.proto. Generated Go stubs live in internal/gen/sharur/v1/. Regenerate with mage generate.

Key RPCs:

| RPC | Description |
| --- | --- |
| Prompt | Send a user message; streams back AgentEvents |
| NewSession | Create a new session |
| GetMessages | Retrieve message history for a session |
| GetState | Get current agent state |
| Steer | Inject a steering message mid-turn |
| FollowUp | Queue a follow-up after the current turn |
| Abort | Cancel the current running turn |
| ForkSession | Fork a session into a new independent copy |
| ConfigureSession | Change model, provider, or thinking level |

In-Process Transport

For the TUI and JSON modes, all internal communication also goes through this same protobuf boundary using a bufconn in-memory pipe — not a network socket. This means all three modes share identical code paths. See Service Architecture for details.

Extensibility

sharur supports three extension points:

  • Skills — reusable prompt templates invoked with /skill-name
  • Prompts — system prompt injection via YAML files
  • Extensions — in-process Go, out-of-process Python, or gRPC plugins

Subsections of Extensibility

Skills

Skills are Markdown files that provide sharur with specialized, reusable instructions for specific tasks. When a skill is invoked, its content is sent as a user message to the agent along with any arguments you provide.


How Skills Work

When sharur starts, it scans the skill directories and adds a list of available skills to the system prompt. The agent knows which skills exist and their descriptions. You can explicitly invoke a skill with /skill:<name> from the TUI, or the agent may choose to invoke one automatically via the read tool or a specialized skill tool call.

When you invoke a skill via /skill:<name>, it is executed as a skill tool, which loads the content and sends it to the agent:

<skill name="refactor" location="/path/to/refactor/SKILL.md">
References are relative to /path/to/refactor/.

...skill content here...
</skill>

your additional arguments here

Skill Discovery Directories

sharur searches for skills in these locations (in order):

| Path | Scope |
| --- | --- |
| ~/.sharur/skills/ | Global — available in all projects |
| .sharur/skills/ (project root) | Project-specific skills |

Skills with the same name in a project directory override global ones.


Skill File Formats

Simple: Single .md file

Create a .md file directly in a skills directory. The filename (without extension) becomes the skill name.

.sharur/skills/refactor.md

Invoke with:

/skill:refactor improve error handling

Structured: Directory with SKILL.md

Create a directory containing a SKILL.md file. The directory name becomes the skill name. This format lets you include supporting files (examples, templates) alongside the skill.

.sharur/skills/
  code-review/
    SKILL.md
    checklist.md
    examples/
      before.go
      after.go

Invoke with:

/skill:code-review

Note: When a SKILL.md is found in a directory, subdirectories are not scanned further. This lets you bundle reference files with your skill.


Frontmatter (Optional)

Both formats support optional YAML frontmatter to provide metadata:

---
name: refactor
description: Refactor Go code to use idiomatic patterns and interfaces
---

You are an expert Go developer. When asked to refactor code:

1. Identify opportunities to use interfaces for testability
2. Replace repetitive code with helper functions
3. Add godoc comments to all exported symbols
4. Ensure error handling follows Go conventions (wrap with %w)

Always explain the reasoning behind each change before making it.

Frontmatter fields:

| Field | Description |
| --- | --- |
| name | Override the skill name (defaults to filename/directory name) |
| description | A short description shown to the agent in the system prompt |

Practical Examples

Code Review Skill

.sharur/skills/code-review.md

---
name: code-review
description: Perform a thorough code review with actionable feedback
---

Review the provided code and evaluate it against these criteria:

**Correctness**
- Does the logic match the intended behavior?
- Are edge cases handled?
- Are there potential nil pointer dereferences or index out-of-bounds issues?

**Maintainability**
- Is the code readable and self-documenting?
- Are functions focused on a single responsibility?
- Is there appropriate error handling?

**Performance**
- Are there obvious inefficiencies (e.g. unnecessary allocations, N+1 queries)?

Format your response as:
## Summary
<one paragraph>

## Issues
<numbered list of specific issues with file:line references>

## Suggestions
<numbered list of improvements>

Invoke:

/skill:code-review

Or attach a file reference:

/skill:code-review @[internal/agent/loop.go]

Structured Skill with Supporting Files

.sharur/skills/
  db-migration/
    SKILL.md
    schema-example.sql
---
name: db-migration
description: Generate SQL migration files following our project conventions
---

Generate a database migration for the requested schema change.

Our migration file conventions:
- Files are named: `YYYYMMDD_HHMMSS_description.sql`
- Each file has an `-- +migrate Up` and `-- +migrate Down` section
- All tables use `BIGINT` primary keys with `AUTO_INCREMENT`
- Always include `created_at` and `updated_at` TIMESTAMP columns

See the example schema in this skill's directory: `schema-example.sql`

Global Utility Skill

~/.sharur/skills/explain.md

---
name: explain
description: Explain code clearly for a non-expert audience
---

Explain the following code in plain English. Assume the reader is a competent programmer but unfamiliar with this codebase.

Structure your explanation as:
1. **Purpose** — What does this code do in one sentence?
2. **How it works** — Step-by-step walkthrough of the logic
3. **Key concepts** — Any domain-specific terms or patterns used
4. **Gotchas** — Anything surprising or non-obvious

Tips

  • Keep skills focused. One skill = one task type. Compose them with arguments rather than making a single skill do everything.
  • Use relative file references — when your skill body references files, note they resolve relative to the skill’s directory. The agent is told the skill’s location so it can use the read tool on supporting files.
  • Test your skill by invoking it with /skill:<name> in the TUI. The skill’s content and its effect on the conversation will be visible in the tool output cards.
  • Override skills per-project — place a skill with the same name in .sharur/skills/ to override the global version for a specific project.

Prompt Templates

Prompt templates are reusable text snippets that expand directly into the TUI input editor. Unlike skills (which are sent to the agent immediately), prompt templates let you pre-fill the editor so you can review, edit, or complete the text before sending.


How Prompt Templates Work

When you type /prompt:<name> and press Enter, the template content is loaded into the editor input. You can then modify it, add context, attach files with @, and send it normally. This is useful for long, structured prompts you use frequently.


Prompt Template Directories

sharur searches these locations (in order):

| Path | Scope |
| --- | --- |
| ~/.sharur/prompts/ | Global — available in all projects |
| .sharur/prompts/ (project root) | Project-specific templates |

Template File Format

A prompt template is any .md file in a prompts directory. The filename (without extension) is the template name.

.sharur/prompts/bug-report.md

Invoke with:

/prompt:bug-report

Minimal Template (no frontmatter)

The entire file content becomes the template text:

Describe the bug you found:

**Steps to reproduce:**
1.
2.
3.

**Expected behavior:**

**Actual behavior:**

**Environment:**
- OS:
- shr version:
- Model:

Template with Frontmatter

Add optional YAML frontmatter for metadata:

---
description: Generate a structured bug report
argument-hint: <component-name>
---

Describe the bug you found in the $1 component:

**Steps to reproduce:**
1.
2.
3.

**Expected behavior:**

**Actual behavior:**

Frontmatter fields:

| Field | Description |
| --- | --- |
| description | Short description shown in the /prompt: picker |
| argument-hint | Hint shown in autocomplete describing expected arguments |

Argument Substitution

Templates support positional argument placeholders: $1, $2, etc.

When you invoke a template via the slash command handler (not the interactive TUI), arguments after the template name are substituted. To mitigate prompt injection, sharur automatically wraps these arguments in <untrusted_input> tags. In the TUI, the template expands as-is and you fill in the values manually.
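A sketch of the substitution path, assuming the literal <untrusted_input> tag shown here (the exact wrapper sharur emits may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// substituteArgs replaces $1, $2, ... with the given arguments, each
// wrapped in <untrusted_input> tags as a prompt-injection mitigation.
// Placeholders beyond the supplied arguments are left as-is.
func substituteArgs(template string, args []string) string {
	// iterate from the highest index down so $12 is replaced before $1
	for i := len(args) - 1; i >= 0; i-- {
		placeholder := fmt.Sprintf("$%d", i+1)
		wrapped := "<untrusted_input>" + args[i] + "</untrusted_input>"
		template = strings.ReplaceAll(template, placeholder, wrapped)
	}
	return template
}

func main() {
	out := substituteArgs("Draft an ADR for: $1", []string{"Use JSONL for session storage"})
	fmt.Println(out)
}
```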


Practical Examples

PR Description Template

.sharur/prompts/pr-description.md

---
description: Generate a pull request description
---

Write a pull request description for the following changes.

**Format:**
## Summary
<What does this PR do? Why?>

## Changes
<Bullet list of specific changes>

## Testing
<How was this tested?>

## Notes
<Anything reviewers should pay attention to>

The diff is:

Invoke:

/prompt:pr-description

Then paste or attach the diff before sending.


Architecture Decision Record

.sharur/prompts/adr.md

---
description: Draft an Architecture Decision Record (ADR)
argument-hint: <decision-title>
---

Draft an Architecture Decision Record (ADR) for: **$1**

Use this structure:

# ADR: $1

## Status
Proposed

## Context
<What is the issue motivating this decision?>

## Decision
<What was decided?>

## Consequences
### Positive
-

### Negative
-

### Neutral
-

## Alternatives Considered
<What other approaches were evaluated and why were they rejected?>

Invoke:

/prompt:adr Use JSONL for session storage

Global Commit Message Template

~/.sharur/prompts/commit.md

---
description: Generate a conventional commit message
---

Generate a commit message following the Conventional Commits specification for the following diff or description of changes.

Format:
```
<type>(<scope>): <short description>

<body: what changed and why, wrapped at 72 chars>

<footer: breaking changes, issue references>
```

Types: feat, fix, docs, style, refactor, perf, test, chore

Changes:

Invoke:

/prompt:commit

Code Explanation for PR Comments

.sharur/prompts/explain-for-review.md

---
description: Explain a code block suitable for a PR comment
---

Explain the following code in a way that's suitable for a GitHub PR review comment. Be concise (2-4 sentences max), assume the reader is a senior engineer, and highlight any non-obvious design decisions.

Code:

Tips

  • Prompt templates are for your input. They expand into the editor, not directly to the agent. This gives you a chance to customize before sending.
  • Use $1, $2 placeholders for dynamic parts you’ll always fill in differently. Leave static boilerplate as literal text.
  • Combine with @ file attachments. Type /prompt:code-review then add @src/myfile.go before pressing Enter to attach a file.
  • Project-specific overrides. A template in .sharur/prompts/ with the same name as a global template takes priority for that project.
  • Organize with subdirectories. Templates are discovered recursively, so you can group them:
    .sharur/prompts/
      code/
        refactor.md
        review.md
      docs/
        readme.md
        adr.md
    Invoke as /prompt:refactor, /prompt:adr, etc. (name is the filename, not the full path).

Go Extensions

Extensions let you add new behaviors to sharur beyond what’s possible with skills and prompt templates. They can observe and modify every stage of the agent loop — from the raw user input through each LLM turn and tool call to compaction and session teardown. Extensions run as separate processes and communicate with sharur via gRPC.


Extension Types

| Type | Language | Use Case |
| --- | --- | --- |
| Go binary | Go | High-performance tools, direct filesystem access |
| Python script | Python | Data processing, ML integrations, API calls |
| Any executable | Any | Shell scripts, compiled binaries from any language |

All extension types use the same gRPC protocol. The loader treats .py files specially (runs them with the configured Python interpreter), and everything else is executed directly as a binary.
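The loader's dispatch rule is a one-line decision on the file extension; python3 below stands in for whatever interpreter is configured:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// launchArgs chooses how to start an extension: .py files run under the
// Python interpreter, everything else executes directly as a binary.
func launchArgs(path string) []string {
	if filepath.Ext(path) == ".py" {
		return []string{"python3", path}
	}
	return []string{path}
}

func main() {
	fmt.Println(launchArgs("./extensions/summarize.py"))
	fmt.Println(launchArgs("./extensions/git-context"))
}
```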


Extension Discovery

Extensions are loaded from directories listed in your config under extensions:

// .sharur/config.json
{
  "extensions": [".sharur/extensions"]
}

Or globally in ~/.sharur/config.json.

Place your extension binary or script in the configured directory. sharur will automatically discover and launch it on startup.

You can also load a specific extension at runtime with the --extension flag:

shr --extension /path/to/my-extension "Your prompt here"

The Plugin Interface

Every Go extension implements the extensions.Plugin interface from github.com/goppydae/sharur/extensions. Embed extensions.NoopPlugin and override only the hooks you need.

Load-time hooks

| Method | When called | Purpose |
| --- | --- | --- |
| Name() | On load | Returns the extension’s identifier string |
| Tools() | On load | Returns tool definitions the agent can call |
| ExecuteTool() | On tool call | Executes a tool registered by this extension |

Session lifecycle hooks

| Method | When called | Purpose |
| --- | --- | --- |
| SessionStart(ctx, sessionID, reason) | Session attached or first prompt | Open connections, initialize per-session state |
| SessionEnd(ctx, sessionID, reason) | Session reset | Flush buffers, close connections |

reason is "new" for a fresh session and "resume" for one loaded from disk.

Agent loop hooks

| Method | When called | Purpose |
| --- | --- | --- |
| AgentStart(ctx) | User prompt received, loop begins | Per-prompt setup, logging |
| AgentEnd(ctx) | Agent loop completes | Per-prompt teardown, emit metrics |
| TurnStart(ctx) | Start of each LLM request turn | Per-turn timing |
| TurnEnd(ctx) | After each turn’s tool calls finish | Per-turn cleanup |

Transformation hooks

| Method | When called | Can modify | Purpose |
| --- | --- | --- | --- |
| ModifyInput(ctx, text) | Before user text hits the transcript | Yes — transform or consume | Pre-process input, implement shortcuts |
| ModifySystemPrompt(prompt) | Before each LLM request | Yes — returns new prompt | Inject dynamic context into the system prompt |
| BeforePrompt(ctx, state) | Before each LLM request | Yes — returns new state | Change model, provider, or thinking level |
| ModifyContext(ctx, messagesJSON) | Before each LLM request is built | Yes — returns new JSON | Filter or inject messages sent to the LLM (transcript unchanged) |
| BeforeProviderRequest(ctx, requestJSON) | Just before the request is sent | Yes — returns new JSON | Modify temperature, max tokens, tools list |
| AfterProviderResponse(ctx, content, numToolCalls) | After LLM stream consumed | No | Observe response text and tool call count |
| BeforeToolCall(ctx, call, args) | Before each tool execution | Yes — can intercept | Block or replace tool execution |
| AfterToolCall(ctx, call, result) | After each tool execution | Yes — returns new result | Observe or modify tool results |
| BeforeCompact(ctx, prep) | Before LLM-based summarization | Yes — can skip | Provide a custom compaction summary |
| AfterCompact(ctx, freedTokens) | After compaction completes | No | Observe freed token count |

Key behaviors:

  • ModifyInput returns agent.InputResult. Set Action to "continue" (pass through unchanged), "transform" (use the Text field instead), or "handled" (consume the message entirely — it is not appended to the transcript and the agent does not run).
  • ModifyContext and BeforeProviderRequest work with JSON strings at the gRPC boundary. The GRPCClient marshals/unmarshals the Go structs automatically.
  • BeforeCompact returns nil to let the default LLM summarization run, or a non-nil result carrying a summary to skip the LLM call. The prep argument includes the message count, estimated token count, and the previous summary (if any).
  • BeforeToolCall returns (ToolResult, true) to intercept (the tool does not execute), or (ToolResult{}, false) to allow normal execution.

Example: Git Context Injection

// .sharur/extensions/git-context/main.go
package main

import (
    "context"
    "fmt"
    "os/exec"
    "strings"

    "github.com/goppydae/sharur/extensions"
)

type GitContextPlugin struct {
    extensions.NoopPlugin
}

func (p *GitContextPlugin) BeforePrompt(_ context.Context, state extensions.AgentState) extensions.AgentState {
    branch := gitOutput("rev-parse", "--abbrev-ref", "HEAD")
    status := gitOutput("status", "--short")
    log := gitOutput("log", "--oneline", "-5")

    state.SystemPrompt += fmt.Sprintf(
        "\n\n<git_context>\nBranch: %s\n\nRecent commits:\n%s\n\nWorking tree:\n%s\n</git_context>",
        branch, log, status,
    )
    return state
}

func gitOutput(args ...string) string {
    out, err := exec.Command("git", args...).Output()
    if err != nil {
        return "(unavailable)"
    }
    return strings.TrimSpace(string(out))
}

func main() {
    extensions.Serve(&GitContextPlugin{
        NoopPlugin: extensions.NoopPlugin{NameStr: "git-context"},
    })
}

Build and auto-discover:

cd .sharur/extensions/git-context && go build -o ../git-context .

Example: Session Lifecycle Hooks

type AuditPlugin struct {
    extensions.NoopPlugin
    log *os.File
}

func (p *AuditPlugin) SessionStart(_ context.Context, sessionID string, reason agent.SessionStartReason) {
    p.log, _ = os.OpenFile(fmt.Sprintf("/tmp/sharur-%s.log", sessionID[:8]), os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0644)
    fmt.Fprintf(p.log, "session %s (%s)\n", sessionID, reason)
}

func (p *AuditPlugin) SessionEnd(_ context.Context, sessionID string, _ agent.SessionEndReason) {
    if p.log != nil {
        p.log.Close()
    }
}

func (p *AuditPlugin) AfterProviderResponse(_ context.Context, content string, numToolCalls int) {
    fmt.Fprintf(p.log, "response: %d chars, %d tool calls\n", len(content), numToolCalls)
}

Example: Input Transformation

ModifyInput runs before the user text is added to the transcript. Return "handled" to consume shortcuts silently, or "transform" to rewrite the text:

func (p *MyPlugin) ModifyInput(_ context.Context, text string) agent.InputResult {
    if strings.HasPrefix(text, "?quick ") {
        return agent.InputResult{
            Action: agent.InputTransform,
            Text:   "Respond in one sentence: " + text[7:],
        }
    }
    if text == "ping" {
        return agent.InputResult{Action: agent.InputHandled}
    }
    return agent.InputResult{Action: agent.InputContinue}
}

Example: Custom Compaction

Return a non-nil *agent.CompactionResult from BeforeCompact to supply your own summary and bypass the default LLM-based summarization:

func (p *MyPlugin) BeforeCompact(_ context.Context, prep agent.CompactionPrep) *agent.CompactionResult {
    if prep.EstimatedTokens < 50000 {
        return nil
    }
    summary := callCheaperModel(prep.PreviousSummary, prep.MessageCount)
    return &agent.CompactionResult{
        Summary: summary,
    }
}

Example: Extension with Custom Tools

Extensions can contribute tools the agent calls just like built-in tools:

type CounterPlugin struct {
    extensions.NoopPlugin
}

func (p *CounterPlugin) Tools() []extensions.ToolDefinition {
    return []extensions.ToolDefinition{
        {
            Name:        "count_lines",
            Description: "Count lines in a string",
            Schema:      json.RawMessage(`{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}`),
            IsReadOnly:  true,
        },
    }
}

func (p *CounterPlugin) ExecuteTool(_ context.Context, name string, args json.RawMessage) extensions.ToolResult {
    if name != "count_lines" {
        return extensions.ToolResult{Content: "unknown tool", IsError: true}
    }
    var input struct {
        Text string `json:"text"`
    }
    if err := json.Unmarshal(args, &input); err != nil {
        return extensions.ToolResult{Content: "invalid arguments: " + err.Error(), IsError: true}
    }
    n := strings.Count(input.Text, "\n") + 1
    return extensions.ToolResult{Content: fmt.Sprintf("%d lines", n)}
}

Example: Intercepting Tool Calls (Sandbox)

BeforeToolCall lets you block or replace any built-in tool call:

type SandboxPlugin struct {
    extensions.NoopPlugin
    AllowedDir string
}

func (p *SandboxPlugin) BeforeToolCall(_ context.Context, call extensions.ToolCall, args json.RawMessage) (extensions.ToolResult, bool) {
    var input struct{ Path string `json:"path"` }
    _ = json.Unmarshal(args, &input)
    // Naive prefix check for illustration; see examples/sandbox/ for robust path handling.
    if input.Path != "" && !strings.HasPrefix(input.Path, p.AllowedDir) {
        return extensions.ToolResult{
            Content: fmt.Sprintf("blocked: %s is outside %s", input.Path, p.AllowedDir),
            IsError: true,
        }, true
    }
    return extensions.ToolResult{}, false
}

See examples/sandbox/ for a complete standalone implementation.


Extension Lifecycle

flowchart TD
    Start["shr startup"] --> Scan["Scan extension directories"]
    Scan --> Launch["Launch subprocess
SHARUR_SOCKET_PATH=..."]
    Launch --> Socket["Wait for socket · dial gRPC"]
    Socket --> Init["Name() · Tools()"]

    Init --> SS["SessionStart(sessionID, reason)
on new session or resume"]

    SS --> MI["ModifyInput(text)"]
    MI --> AS["AgentStart()"]

    subgraph turn ["Per LLM turn (repeats until no tool calls)"]
        direction TB
        T1["BeforePrompt() · ModifySystemPrompt()
ModifyContext() · BeforeProviderRequest()"]
        T2[/"LLM streams"/]
        T3["AfterProviderResponse() · TurnStart()"]
        subgraph toolloop ["Per tool call"]
            BTC["BeforeToolCall()"] --> Intercept{"intercept?"}
            Intercept -->|yes| CustomResult["return custom ToolResult"]
            Intercept -->|no| Exec["execTool() · AfterToolCall()"]
        end
        TE["TurnEnd()"]
        T1 --> T2 --> T3 --> toolloop --> TE
    end

    AS --> turn
    turn --> AE["AgentEnd()"]

    subgraph compact ["On compaction (auto or /compact)"]
        direction TB
        BC["BeforeCompact(prep)"] --> CustomSummary{"return non-nil?"}
        CustomSummary -->|yes| SkipLLM["skip LLM summarization"]
        CustomSummary -->|no| LLMSum["LLM summarizes"]
        SkipLLM --> AC["AfterCompact(freedTokens)"]
        LLMSum --> AC
    end

    AE --> SE["SessionEnd(sessionID, reason)
on session reset"]
    SE --> Shutdown["shr shutdown · kill subprocess"]

In-Process Go Extension (Advanced)

If your extension is written in Go and you control the build, you can implement agent.Extension directly via the SDK and register it without the gRPC overhead:

import (
    "github.com/goppydae/sharur/internal/agent"
    "github.com/goppydae/sharur/internal/tools"
)

type MyExtension struct {
    agent.NoopExtension
}

func (e *MyExtension) AgentStart(ctx context.Context) {
    log.Println("agent started")
}

func (e *MyExtension) ModifyInput(ctx context.Context, text string) agent.InputResult {
    if text == "ping" {
        return agent.InputResult{Action: agent.InputHandled}
    }
    return agent.InputResult{Action: agent.InputContinue}
}

func (e *MyExtension) ModifySystemPrompt(prompt string) string {
    return prompt + "\n\nAlways respond in bullet points."
}

func (e *MyExtension) BeforeToolCall(ctx context.Context, call *agent.ToolCall, args json.RawMessage) (*tools.ToolResult, bool) {
    if call.Name == "bash" {
        return &tools.ToolResult{Content: "bash is disabled", IsError: true}, true
    }
    return nil, false
}

Pass the extension via ag.SetExtensions() from the SDK or directly in cmd/shr.


Tips

  • Extensions are isolated processes. A crash in an extension will not crash sharur — the loader catches errors and logs them.
  • Keep BeforePrompt and ModifySystemPrompt fast. They run before every single LLM call. Cache data when possible; avoid blocking network calls.
  • ModifyContext does not affect the stored transcript. Changes to the message slice are only visible to the LLM for that turn.
  • Use skills for static context. If you only need to append static text to the system prompt, a skill is simpler than an extension.
  • Extensions are global. All extensions in the configured directories are loaded for every session. There is no per-project scoping beyond the directory config.
  • Logs go to stderr. Stdout is not read by the host; stderr is passed through for debugging.
  • InputHandled stops all further processing. No agent turn is started, no message is appended to the transcript.
  • BeforeCompact fires before the LLM call. Return nil to let the default summarizer run. Return a *CompactionResult to supply your own summary — useful for using a cheaper model or domain-specific logic.

Python Extensions

Python extensions use the same gRPC protocol as Go extensions. The loader detects .py files and runs them with the configured Python interpreter, passing SHARUR_SOCKET_PATH as an environment variable. The extension is expected to listen on that Unix socket.


Prerequisites

pip install grpcio grpcio-tools

Generate Python Stubs

python -m grpc_tools.protoc \
  -I extensions/proto \
  --python_out=.sharur/extensions \
  --grpc_python_out=.sharur/extensions \
  extensions/proto/extension.proto

This deposits extension_pb2.py and extension_pb2_grpc.py alongside your script.
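If shr launches the script from a different working directory, Python may not find the generated stubs automatically. A common pattern (plain Python, not sharur-specific) is to put the script's own directory on sys.path before importing them:

```python
import os
import sys

# Prepend the script's own directory so extension_pb2*.py resolve
# regardless of the working directory shr launches the script from.
stub_dir = os.path.dirname(os.path.abspath(__file__))
if stub_dir not in sys.path:
    sys.path.insert(0, stub_dir)
```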


Implement the Extension

# .sharur/extensions/ticket_context.py
import os
import subprocess
import grpc
from concurrent import futures
import extension_pb2
import extension_pb2_grpc


class TicketContextServicer(extension_pb2_grpc.ExtensionServicer):
    def Name(self, request, context):
        return extension_pb2.NameResponse(name="ticket-context")

    def Tools(self, request, context):
        return extension_pb2.ToolsResponse(tools=[])

    def BeforePrompt(self, request, context):
        branch = subprocess.check_output(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True
        ).strip()
        state = request.state or extension_pb2.AgentState()
        state.prompt += f"\n\n<branch>Current branch: {branch}</branch>"
        return extension_pb2.BeforePromptResponse(state=state)

    def BeforeToolCall(self, request, context):
        return extension_pb2.BeforeToolCallResponse(intercept=False)

    def AfterToolCall(self, request, context):
        return extension_pb2.AfterToolCallResponse(result=request.result)

    def ModifySystemPrompt(self, request, context):
        return extension_pb2.ModifySystemPromptResponse(
            modified_prompt=request.current_prompt
        )

    def AgentStart(self, request, context):
        return extension_pb2.Empty()

    def AgentEnd(self, request, context):
        return extension_pb2.Empty()

    def ModifyInput(self, request, context):
        return extension_pb2.ModifyInputResponse(action="continue", text=request.text)


def serve():
    socket_path = os.environ["SHARUR_SOCKET_PATH"]
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    extension_pb2_grpc.add_ExtensionServicer_to_server(TicketContextServicer(), server)
    server.add_insecure_port(f"unix:{socket_path}")
    server.start()
    server.wait_for_termination()


if __name__ == "__main__":
    serve()

Place the script in your extensions directory. sharur runs it as python ticket_context.py on startup.


Available RPC Methods

Implement any subset of the ExtensionServicer methods. Unimplemented methods should return a sensible empty response (see the template above). The full list mirrors the Go plugin interface — see Go Extensions for hook semantics.

RPC                                Purpose
Name                               Return extension identifier
Tools                              Return tool definitions
ExecuteTool                        Execute a registered tool
SessionStart / SessionEnd          Session lifecycle
AgentStart / AgentEnd              Per-prompt lifecycle
TurnStart / TurnEnd                Per-LLM-turn lifecycle
ModifyInput                        Transform or consume user input
ModifySystemPrompt                 Augment the system prompt
BeforePrompt                       Mutate model/provider/thinking
ModifyContext                      Filter or inject LLM-bound messages
BeforeProviderRequest              Modify the raw completion request
AfterProviderResponse              Observe LLM output
BeforeToolCall                     Intercept or block tool calls
AfterToolCall                      Observe or modify tool results
BeforeCompact / AfterCompact       Compaction lifecycle
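For instance, a Python BeforeToolCall handler can mirror the Go sandbox example by checking a path argument before allowing the call. The check itself is plain Python; the directory name is hypothetical, and resolving the path first avoids the escapes a naive prefix check would permit:

```python
import os

ALLOWED_DIR = "/workspace"  # hypothetical sandbox root

def path_is_allowed(path: str) -> bool:
    # Resolve symlinks and ".." so "/workspace/../etc/passwd" cannot escape.
    resolved = os.path.realpath(path)
    return resolved == ALLOWED_DIR or resolved.startswith(ALLOWED_DIR + os.sep)
```

A servicer would run this inside BeforeToolCall and return a BeforeToolCallResponse with intercept=True and an error ToolResult when the check fails.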

Tips

  • Logs go to stderr. Python’s print() goes to stdout, which is not read by the host. Use sys.stderr.write() or logging for debugging output.
  • Keep proto stubs in the same directory as your script, or adjust sys.path before importing them.
  • Thread safety: grpc.server with ThreadPoolExecutor handles concurrent RPC calls. If you maintain per-session state, use a lock or session-keyed dict.
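The last tip can be sketched with stdlib primitives alone; the class shape is illustrative, not part of the sharur API:

```python
import threading

class SessionState:
    """Per-session state guarded by a lock, safe for concurrent RPC handlers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}  # session_id -> per-session dict

    def start(self, session_id: str):
        with self._lock:
            self._sessions[session_id] = {"turns": 0}

    def bump_turns(self, session_id: str) -> int:
        with self._lock:
            state = self._sessions.setdefault(session_id, {"turns": 0})
            state["turns"] += 1
            return state["turns"]

    def end(self, session_id: str):
        with self._lock:
            self._sessions.pop(session_id, None)
```

An extension would call start from its SessionStart handler, bump_turns from TurnStart, and end from SessionEnd.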

gRPC Extensions

gRPC extensions run as separate processes. sharur manages their lifecycle: launching the binary, passing the socket path, waiting for readiness, dialing, and killing on shutdown. The extension communicates entirely over a Unix Domain Socket using the generated proto stubs in extensions/proto/extension.proto.


How It Works

sequenceDiagram
    participant Loader as shr Loader
    participant Ext as Extension process
    participant Client as gRPC client

    Loader->>Ext: exec binary/script
    note over Ext: env: SHARUR_SOCKET_PATH=/tmp/...sock
    Ext->>Ext: net.Listen("unix", socketPath)
    note over Ext: signals readiness by listening
    Loader->>Loader: poll for socket file
    Loader->>Client: dial gRPC over Unix socket
    Client->>Ext: Name()
    Ext-->>Client: "my-extension"
    Client->>Ext: Tools()
    Ext-->>Client: [ToolDefinition, ...]
    note over Loader,Ext: extension registered — hooks active for all sessions

The extension must call net.Listen("unix", os.Getenv("SHARUR_SOCKET_PATH")) and start serving before shr times out.


Writing a Go Extension

Import github.com/goppydae/sharur/extensions — no internal packages needed.

package main

import "github.com/goppydae/sharur/extensions"

type myPlugin struct {
    extensions.NoopPlugin
}

func (p *myPlugin) ModifySystemPrompt(prompt string) string {
    return prompt + "\n\nAlways respond in haiku."
}

func main() {
    extensions.Serve(&myPlugin{
        NoopPlugin: extensions.NoopPlugin{NameStr: "haiku"},
    })
}

extensions.Serve handles the socket path, gRPC server setup, and graceful shutdown. extensions.NoopPlugin provides no-op defaults for every method.

Build and place the binary in a configured extensions directory:

go build -o .sharur/extensions/haiku .

Or load at runtime:

shr --extension .sharur/extensions/haiku

Plugin Interface

All hooks map 1:1 to agent.Extension. See Go Extensions for full hook semantics and examples.

Load-time:

Method            Called             Purpose
Name()            Once on connect    Extension identifier
Tools()           Once on connect    Contribute tools to the agent
ExecuteTool()     Per tool call      Execute a registered tool

Session lifecycle:

Method                                 Called                   Purpose
SessionStart(ctx, sessionID, reason)   New or resumed session   Open connections, init per-session state
SessionEnd(ctx, sessionID, reason)     Session reset            Flush, close connections

reason is "new" or "resume".


Proto Definition

The extension service is defined in extensions/proto/extension.proto. Generated Go stubs are in extensions/gen/. Regenerate with mage generate.

Python stubs can be generated with:

python -m grpc_tools.protoc \
  -I extensions/proto \
  --python_out=.sharur/extensions \
  --grpc_python_out=.sharur/extensions \
  extensions/proto/extension.proto

Tool Read-Only Semantics

Tool definitions returned by Tools() have an IsReadOnly bool field. Set it to true for tools that are safe in dry-run mode. The GRPCClient propagates this to the internal RemoteTool.IsReadOnly() so dry-run and sandbox extensions honour it correctly.


Debugging

  • Logs go to stderr. The host passes the subprocess’s stderr through. Use log.Println or fmt.Fprintln(os.Stderr, ...) for debug output.
  • Crashes are isolated. A panicking extension does not crash shr — the loader catches errors and logs them.
  • Socket timeout. If the extension doesn’t listen within the timeout, the loader logs an error and skips it. Ensure extensions.Serve (or your own net.Listen + grpc.Serve) is called promptly in main().
  • Test in isolation. Set SHARUR_SOCKET_PATH=/tmp/test.sock and run your extension binary directly; then grpcurl the socket to verify RPCs before integrating with shr.