CLI

shr is the sharur CLI binary. It supports three runtime modes and a rich flag surface for model selection, session management, tools, and extensions.

Runtime Modes

| Mode | Flag | Description |
|------|------|-------------|
| TUI | --mode tui (default) | Interactive Bubble Tea terminal interface with streaming, tool cards, and session management |
| JSON | --mode json | One-shot query with line-delimited JSON event output, useful for shell pipelines |
| gRPC | --mode grpc | Persistent multi-session gRPC service; any gRPC-capable client can connect |

Quick Start

# Launch the interactive TUI
shr

# One-shot answer (JSONL output)
shr --mode json "What is the best way to structure a Go project?"

# Resume the most recent session
shr --continue

See the sub-pages for full keybinding and slash command references, JSON event schema, gRPC proto overview, provider setup, and the full configuration schema.

Subsections of CLI

Configuration

sharur uses layered JSON configuration. Project-level settings override global defaults.

| Path | Scope |
|------|-------|
| ~/.sharur/config.json | Global defaults; applies to all projects |
| .sharur/config.json | Project-level overrides; applies in this directory |

config.json Schema

{
  "defaultModel": "llama3.2",
  "defaultProvider": "ollama",
  "theme": "dark",
  "thinkingLevel": "medium",
  "ollamaBaseURL": "http://localhost:11434",
  "openAIBaseURL": "https://api.openai.com/v1",
  "openAIApiKey": "",
  "anthropicApiKey": "",
  "anthropicApiVersion": "",
  "googleApiKey": "",
  "llamaCppBaseURL": "http://localhost:8080",
  "compaction": {
    "enabled": true,
    "reserveTokens": 2048,
    "keepRecentTokens": 8192
  }
}

API keys can also be set via environment variables — env vars take priority over config file values.


Context Files

sharur auto-discovers AGENTS.md, CLAUDE.md, GEMINI.md, and .context.md in your project root and parent directories and injects them into the system prompt. Outermost files take precedence (parent directory wins over project root).

Disable with --no-context-files.
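
For example, with this hypothetical layout, both files are discovered and injected, and the parent-directory file wins on conflicts:

~/work/AGENTS.md            # parent directory: takes precedence
~/work/myproject/AGENTS.md  # project root

cd ~/work/myproject
shr   # both AGENTS.md files are injected into the system prompt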


CLI Flags

Mode

| Flag | Description |
|------|-------------|
| --mode | Mode: tui (default), json, grpc |
| --grpc-addr | gRPC listen address (default :50051; --mode grpc only) |

Model / Provider

| Flag | Description |
|------|-------------|
| --model / -m | Model to use (e.g. llama3, gpt-4o, anthropic/claude-sonnet-4-6) |
| --provider | Provider: ollama, openai, anthropic, llamacpp, google |
| --api-key | API key override |
| --thinking | Thinking level: off, minimal, low, medium, high, xhigh |
| --models | Comma-separated model list for Ctrl+P cycling |

Session

| Flag | Description |
|------|-------------|
| --continue / -c | Resume the most recent session |
| --resume / -r | Select a session to resume (fuzzy search or ID) |
| --session | Use a specific session file path |
| --session-dir | Directory for session storage and lookup |
| --branch | Branch from a session file or partial UUID into a new child session |
| --no-session | Ephemeral mode: don't save the session |
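
A few illustrative combinations (the partial UUID and directory are hypothetical):

# Branch a past session into a new child session by partial UUID
shr --branch a1b2c3

# Keep sessions in a project-local directory
shr --session-dir .sharur/sessions

# Ask a throwaway question without persisting anything
shr --no-session --mode json "What does errors.Is do?"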

System Prompt

| Flag | Description |
|------|-------------|
| --system-prompt | Override the system prompt |
| --append-system-prompt | Append text or a file to the system prompt (repeatable) |
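
For example (the file path is illustrative):

# Replace the system prompt entirely
shr --system-prompt "You are a terse code reviewer."

# Layer extra instructions on top; the flag is repeatable
shr --append-system-prompt ./docs/style.md --append-system-prompt "Answer in bullet points."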

Tools

| Flag | Description |
|------|-------------|
| --tools | Comma-separated list of tools to enable: read,bash,edit,write,grep,find,ls |
| --no-tools | Disable all built-in tools |
| --dry-run | Safety mode: destructive tools preview actions instead of running |
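
For example:

# Allow only read-only tools
shr --tools read,grep,find,ls

# Keep all tools, but preview destructive actions instead of running them
shr --dry-run "remove the generated files in ./build"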

Extensions / Skills / Prompts

| Flag | Description |
|------|-------------|
| --extension / -e | Load a gRPC extension binary (repeatable) |
| --no-extensions | Disable extension directory auto-discovery (explicit -e paths still load) |
| --skill | Load a skill file or directory (repeatable) |
| --no-skills | Disable skill auto-discovery |
| --prompt-template | Load a prompt template file or directory (repeatable) |
| --no-prompt-templates | Disable prompt template auto-discovery |
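
For example (the binary and directory paths are illustrative):

# Load an extension binary and a skill directory explicitly
shr -e ./bin/my-extension --skill ./skills/code-review

# Disable auto-discovery; only the explicit -e path loads
shr --no-extensions -e ./bin/my-extension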

Output / Info

| Flag | Description |
|------|-------------|
| --export | Export the current session to an HTML file and exit |
| --list-models | List available models from the configured provider (optional fuzzy filter) |
| --version / -v | Show the version number |
| --verbose | Force verbose startup output |
| --offline | Disable startup network operations (model checks, etc.) |
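
For example:

# Start offline (skip model checks) with verbose startup logs
shr --offline --verbose --model llama3.2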

Provider Setup

sharur supports five LLM providers. All configuration lives in config.json files or environment variables; environment variables take priority over config file values.


Model Naming

Models can be specified as provider/model shorthand or with separate flags:

# Shorthand: provider inferred from the slash-prefix
shr --model anthropic/claude-sonnet-4-6

# Explicit: provider and model as separate flags
shr --provider anthropic --model claude-sonnet-4-6

Both forms are equivalent. The shorthand is convenient for one-off overrides; the config file form is better for persistent defaults.


Environment Variables

API keys set via environment variable take priority over values in config.json. The env var names use the SHARUR_ prefix:

| Provider | Environment Variable |
|----------|----------------------|
| Anthropic | SHARUR_ANTHROPIC_API_KEY |
| OpenAI | SHARUR_OPENAI_API_KEY |
| Google | SHARUR_GOOGLE_API_KEY |

Ollama and llama.cpp are local servers and do not use API keys.


Ollama

Ollama runs models locally. It is the default provider.

// ~/.sharur/config.json or .sharur/config.json
{
  "defaultProvider": "ollama",
  "defaultModel": "llama3.2",
  "ollamaBaseURL": "http://localhost:11434"
}

# Pull a model and launch
ollama pull llama3.2
shr

# Use a specific model
shr --model ollama/llama3.2

# Point at a remote Ollama server
shr --model llama3.2 --provider ollama

Notes:

  • Default base URL is http://localhost:11434. Override with ollamaBaseURL.
  • Ollama models support tools and images (vision models).
  • Use shr --list-models to see all locally available models.
  • Thinking is supported on models that emit <think> tokens (e.g. qwq, deepseek-r1).

llama.cpp

llama.cpp exposes an OpenAI-compatible HTTP server.

{
  "defaultProvider": "llamacpp",
  "llamaCppBaseURL": "http://localhost:8080"
}

# Start the llama.cpp server (example)
./llama-server -m model.gguf --port 8080

# Connect with sharur
shr --provider llamacpp --model my-model

Notes:

  • Default base URL is http://localhost:8080. Override with llamaCppBaseURL.
  • The model name passed to shr is forwarded to the server as-is.
  • Image attachments are not supported.
  • The server’s own context window size is used; sharur queries /v1/models to detect it.

OpenAI

{
  "defaultProvider": "openai",
  "defaultModel": "gpt-4o",
  "openAIApiKey": "",
  "openAIBaseURL": "https://api.openai.com/v1"
}

# Via environment variable (recommended)
export SHARUR_OPENAI_API_KEY=sk-...
shr --model openai/gpt-4o

# One-off key override
shr --provider openai --model gpt-4o --api-key sk-...

OpenAI-compatible endpoints:

Any server that implements the OpenAI chat completions API can be used by pointing openAIBaseURL at it:

{
  "defaultProvider": "openai",
  "openAIBaseURL": "http://localhost:11434/v1",
  "openAIApiKey": "unused"
}

This works with vLLM, LM Studio, and others.

Notes:

  • Reasoning models (o3, o4-mini) emit thinking deltas that appear in the TUI and JSON event stream.
  • Supports tools and vision (images) for compatible models.

Anthropic

{
  "defaultProvider": "anthropic",
  "defaultModel": "claude-sonnet-4-6",
  "anthropicApiKey": "",
  "anthropicApiVersion": ""
}

export SHARUR_ANTHROPIC_API_KEY=sk-ant-...
shr --model anthropic/claude-sonnet-4-6

# Extended thinking (claude-3-7-sonnet and later)
shr --model anthropic/claude-3-7-sonnet-20250219 --thinking high

Notes:

  • Extended thinking is supported for models that enable it (e.g. claude-3-7-sonnet). Use --thinking medium or --thinking high.
  • medium thinking uses a 10,000-token budget; high uses 20,000 tokens. Temperature is automatically set to 1, as the Anthropic API requires when extended thinking is enabled.
  • anthropicApiVersion overrides the anthropic-version request header; leave empty to use the library default.

Google Gemini

{
  "defaultProvider": "google",
  "defaultModel": "gemini-2.0-flash",
  "googleApiKey": ""
}

export SHARUR_GOOGLE_API_KEY=AIza...
shr --model google/gemini-2.0-flash

Notes:

  • Gemini 1.5 Pro and later have a 1M+ token context window.
  • Supports tools and vision (images).
  • Use shr --list-models to see available Gemini models.

Listing Available Models

All five providers implement model listing. Use --list-models to query the active provider:

# List Ollama models
shr --list-models

# List models from a specific provider
shr --provider anthropic --list-models

# Filter results
shr --provider openai --list-models gpt-4

The output is a plain list of model names, suitable for piping:

shr --list-models | fzf | xargs -I{} shr --model {}

Provider Feature Matrix

| Provider | Tools | Images | Thinking | Model Listing |
|----------|-------|--------|----------|---------------|
| ollama | ✓ | ✓ | model-dependent | ✓ |
| llamacpp | ✓ | ✗ | ✗ | ✓ |
| openai | ✓ | ✓ | reasoning models | ✓ |
| anthropic | ✓ | ✓ | ✓ extended | ✓ |
| google | ✓ | ✓ | ✗ | ✓ |

TUI

The TUI is a rich, Bubble Tea-powered interface with real-time streaming, tool cards, session management, and a live context usage progress bar in the status footer.


Keybindings

| Key | Action |
|-----|--------|
| Enter | Send message (or steer the running agent) |
| Shift+Enter | Insert newline |
| Ctrl+Enter | Queue a follow-up message (runs after the agent finishes) |
| Ctrl+C | Abort the current agent run and clear the input editor |
| Esc | Cancel streaming / close modal / abort the current turn |
| Ctrl+O | Toggle tool call output expansion |
| Ctrl+P | Open the model selection modal (cycling via the --models flag) |
| ↑/↓ | Navigate prompt history (when at the start/end of the editor) / scroll the viewport |
| F1 | Show the help modal |

Slash Commands

| Command | Description |
|---------|-------------|
| /new | Start a fresh session |
| /resume <id> | Resume a session by ID or partial UUID (fuzzy search enabled) |
| /branch [idx] | Create a new child session branching from a specific message index (defaults to the last) |
| /fork | Duplicate the current session into a new independent session (no parent link) |
| /rebase | Interactive rebase: select specific messages to keep in a new session |
| /merge <id> | Merge another session's history into the current one with a synthesis turn |
| /tree [-g\|-p] | Open the session tree modal. Flags: --global (-g) or --project (-p) |
| /import <path> | Import a session from a JSONL file |
| /export <path> | Export the current session to a JSONL file |
| /model <p/m> | Switch model mid-conversation (e.g. /model anthropic/claude-sonnet-4-6) |
| /stats | View session statistics and token usage |
| /config | View and edit the active configuration |
| /context | View detailed context window usage |
| /compact | Manually trigger a context compaction |
| /skill:<name> [args] | Invoke a skill |
| /prompt:<name> | Expand a prompt template into the editor |
| /exit | Quit (alias: /quit) |
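
A typical mid-session sequence (the message index and path are illustrative):

/model anthropic/claude-sonnet-4-6
/branch 12
/export ./session-backup.jsonl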

Session Tree Modal (/tree)

| Key | Action |
|-----|--------|
| ↑/↓ / PgUp/PgDn | Navigate the session list |
| Enter | Resume the selected session (or branch from it if it is an interior node) |
| B | Create a new branch from the selected session |
| F | Create an independent fork of the selected session |
| R | Start an interactive rebase from the selected session's history |
| Esc | Close the modal |

Bang Commands

Bang commands execute a shell command and inject the output into the conversation:

!ls -la          # Execute shell command, paste output into editor
!!cat README.md  # Execute shell command, send output directly to agent

  • !cmd pastes stdout into the editor so you can review before sending
  • !!cmd sends stdout directly to the agent without review

At-File Attachments

Type @ in the input to fuzzy-search and attach file contents to your prompt:

Tell me what this does @src/agent/loop.go

The file content is embedded inline in the message sent to the agent.

JSON Mode

JSON mode runs a single prompt and streams the agent’s events as line-delimited JSON (JSONL) to stdout. It is designed for shell pipelines and tooling integration.

shr --mode json "What is the best way to structure a Go project?"

# Pipe stdin as context
cat main.go | shr --mode json "Refactor this to use interfaces"

# Specify a model
shr --mode json "Summarize the last 10 git commits" --model anthropic/claude-opus-4-5

Event Format

Each line is the protobuf JSON encoding of an AgentEvent. Event types mirror the TUI stream:

  • EVENT_AGENT_START / EVENT_AGENT_END
  • EVENT_TEXT_DELTA — incremental response text
  • EVENT_THINKING_DELTA — incremental thinking text (extended thinking models)
  • EVENT_TOOL_CALL — tool invocation start
  • EVENT_TOOL_DELTA — streaming tool output
  • EVENT_TOOL_OUTPUT — final tool result
  • EVENT_TURN_START / EVENT_TURN_END
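
A single text-delta line might look like the following; only the type and content fields are shown, and the full field set follows the AgentEvent proto:

{"type":"EVENT_TEXT_DELTA","content":"Interfaces in Go describe behavior, not data."}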

Common Patterns

# Capture only the text deltas
shr --mode json "Explain Go interfaces" \
  | jq -r 'select(.type == "EVENT_TEXT_DELTA") | .content'

# Run without saving the session
shr --mode json --no-session "Quick one-off question"

# Dry-run to see what tools would be called
shr --mode json --dry-run "Delete all .tmp files in the current directory"
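
# Inspect final tool results only (assumes the ls tool is enabled)
shr --mode json --tools ls "List the Go files here" \
  | jq 'select(.type == "EVENT_TOOL_OUTPUT")'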

gRPC Mode

gRPC mode starts a persistent AgentService server. Each connecting client supplies a session_id and gets its own isolated agent. Sessions are saved to disk after each turn and reloaded automatically on reconnect.

# Start on the default port
shr --mode grpc

# Use a custom address
shr --mode grpc --grpc-addr :9090

The server responds to SIGINT/SIGTERM with a graceful shutdown: in-flight turns are allowed to finish (30 s timeout), all sessions are flushed to disk, then the listener closes.


Proto Definition

The service is defined in proto/sharur/v1/agent.proto. Generated Go stubs live in internal/gen/sharur/v1/. Regenerate with mage generate.

Key RPCs:

| RPC | Description |
|-----|-------------|
| Prompt | Send a user message; streams back AgentEvents |
| NewSession | Create a new session |
| GetMessages | Retrieve the message history for a session |
| GetState | Get the current agent state |
| Steer | Inject a steering message mid-turn |
| FollowUp | Queue a follow-up after the current turn |
| Abort | Cancel the currently running turn |
| ForkSession | Fork a session into a new independent copy |
| ConfigureSession | Change the model, provider, or thinking level |
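
For ad-hoc testing, a generic gRPC client such as grpcurl can drive the service. A sketch, assuming the fully qualified service name sharur.v1.AgentService (inferred from the proto path) and illustrative request field names:

# session_id / message are illustrative field names, not confirmed by this page
grpcurl -plaintext \
  -proto proto/sharur/v1/agent.proto \
  -d '{"session_id": "demo", "message": "hello"}' \
  localhost:50051 \
  sharur.v1.AgentService/Prompt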

In-Process Transport

For the TUI and JSON modes, all internal communication also goes through this same protobuf boundary using a bufconn in-memory pipe — not a network socket. This means all three modes share identical code paths. See Service Architecture for details.