CLI

shr is the sharur CLI binary. It supports three runtime modes and a rich flag surface for model selection, session management, tools, and extensions.

Runtime Modes

| Mode | Flag | Description |
|------|------|-------------|
| TUI | --mode tui (default) | Interactive Bubble Tea terminal interface with streaming, tool cards, and session management |
| JSON | --mode json | One-shot query with line-delimited JSON event output, useful for shell pipelines |
| gRPC | --mode grpc | Persistent multi-session gRPC service; any gRPC-capable client can connect |

Quick Start

# Launch the interactive TUI
shr

# One-shot answer (JSONL output)
shr --mode json "What is the best way to structure a Go project?"

# Resume the most recent session
shr --continue

See the sub-pages for full keybinding and slash command references, JSON event schema, gRPC proto overview, provider setup, and the full configuration schema.

Subsections of CLI

Configuration

sharur uses layered JSON configuration. Project-level settings override global defaults.

| Path | Scope |
|------|-------|
| ~/.sharur/config.json | Global defaults; applies to all projects |
| .sharur/config.json | Project-level overrides; applies in this directory |

config.json Schema

{
  "defaultModel": "llama3.2",
  "defaultProvider": "ollama",
  "theme": "dark",
  "thinkingLevel": "medium",
  "ollamaBaseURL": "http://localhost:11434",
  "openAIBaseURL": "https://api.openai.com/v1",
  "openAIApiKey": "",
  "anthropicApiKey": "",
  "anthropicApiVersion": "",
  "googleApiKey": "",
  "llamaCppBaseURL": "http://localhost:8080",
  "compaction": {
    "enabled": true,
    "reserveTokens": 2048,
    "keepRecentTokens": 8192
  }
}

API keys can also be set via environment variables — env vars take priority over config file values.


Context Files

sharur auto-discovers AGENTS.md, CLAUDE.md, GEMINI.md, and .context.md in your project root and parent directories and injects them into the system prompt. Outermost files take precedence (parent directory wins over project root).

Disable with --no-context-files.
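
For example, with this hypothetical layout, both files are discovered and injected, and the parent-directory file wins on conflicts:

~/work/AGENTS.md            # parent directory: takes precedence
~/work/myproject/AGENTS.md  # project root

cd ~/work/myproject
shr   # both AGENTS.md files are injected into the system prompt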


CLI Flags

Mode

| Flag | Description |
|------|-------------|
| --mode | Mode: tui (default), json, grpc |
| --grpc-addr | gRPC listen address (default :50051; --mode grpc only) |

Model / Provider

| Flag | Description |
|------|-------------|
| --model / -m | Model to use (e.g. llama3, gpt-4o, anthropic/claude-sonnet-4-6) |
| --provider | Provider: ollama, openai, anthropic, llamacpp, google |
| --api-key | API key override |
| --thinking | Thinking level: off, minimal, low, medium, high, xhigh |
| --models | Comma-separated model list for Ctrl+P cycling |

Session

| Flag | Description |
|------|-------------|
| --continue / -c | Resume the most recent session |
| --resume / -r | Select a session to resume (fuzzy search or ID) |
| --session | Use a specific session file path |
| --session-dir | Directory for session storage and lookup |
| --branch | Branch from a session file or partial UUID into a new child session |
| --no-session | Ephemeral mode: don't save the session |
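
A few illustrative combinations (the partial UUID and directory are hypothetical):

# Branch a past session into a new child session by partial UUID
shr --branch a1b2c3

# Keep sessions in a project-local directory
shr --session-dir .sharur/sessions

# Ask a throwaway question without persisting anything
shr --no-session --mode json "What does errors.Is do?"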

System Prompt

| Flag | Description |
|------|-------------|
| --system-prompt | Override the system prompt |
| --append-system-prompt | Append text or a file to the system prompt (repeatable) |
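
For example (the file path is illustrative):

# Replace the system prompt entirely
shr --system-prompt "You are a terse code reviewer."

# Layer extra instructions on top; the flag is repeatable
shr --append-system-prompt ./docs/style.md --append-system-prompt "Answer in bullet points."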

Tools

| Flag | Description |
|------|-------------|
| --tools | Comma-separated list of tools to enable: read,bash,edit,write,grep,find,ls |
| --no-tools | Disable all built-in tools |
| --dry-run | Safety mode: destructive tools preview actions instead of running |
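
For example:

# Allow only read-only tools
shr --tools read,grep,find,ls

# Keep all tools, but preview destructive actions instead of running them
shr --dry-run "remove the generated files in ./build"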

Extensions / Skills / Prompts

| Flag | Description |
|------|-------------|
| --extension / -e | Load a gRPC extension binary (repeatable) |
| --no-extensions | Disable extension directory auto-discovery (explicit -e paths still load) |
| --skill | Load a skill file or directory (repeatable) |
| --no-skills | Disable skill auto-discovery |
| --prompt-template | Load a prompt template file or directory (repeatable) |
| --no-prompt-templates | Disable prompt template auto-discovery |
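
For example (the binary and directory paths are illustrative):

# Load an extension binary and a skill directory explicitly
shr -e ./bin/my-extension --skill ./skills/code-review

# Disable auto-discovery; only the explicit -e path loads
shr --no-extensions -e ./bin/my-extension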

Output / Info

| Flag | Description |
|------|-------------|
| --export | Export the current session to an HTML file and exit |
| --list-models | List available models from the configured provider (optional fuzzy filter) |
| --version / -v | Show the version number |
| --verbose | Force verbose startup output |
| --offline | Disable startup network operations (model checks, etc.) |
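
For example:

# Start offline (skip model checks) with verbose startup logs
shr --offline --verbose --model llama3.2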

Provider Setup

sharur supports five LLM providers. All configuration lives in config.json files or environment variables; environment variables take priority over config file values.


Model Naming

Models can be specified as provider/model shorthand or with separate flags:

# Shorthand: provider inferred from the slash-prefix
shr --model anthropic/claude-sonnet-4-6

# Explicit: provider and model as separate flags
shr --provider anthropic --model claude-sonnet-4-6

Both forms are equivalent. The shorthand is convenient for one-off overrides; the config file form is better for persistent defaults.


Environment Variables

API keys set via environment variable take priority over values in config.json. The env var names use the SHARUR_ prefix:

| Provider | Environment Variable |
|----------|----------------------|
| Anthropic | SHARUR_ANTHROPIC_API_KEY |
| OpenAI | SHARUR_OPENAI_API_KEY |
| Google | SHARUR_GOOGLE_API_KEY |

Ollama and llama.cpp are local servers and do not use API keys.


Ollama

Ollama runs models locally. It is the default provider.

// ~/.sharur/config.json or .sharur/config.json
{
  "defaultProvider": "ollama",
  "defaultModel": "llama3.2",
  "ollamaBaseURL": "http://localhost:11434"
}

# Pull a model and launch
ollama pull llama3.2
shr

# Use a specific model
shr --model ollama/llama3.2

# Point at a remote Ollama server
shr --model llama3.2 --provider ollama

Notes:

  • Default base URL is http://localhost:11434. Override with ollamaBaseURL.
  • Ollama models support tools and images (vision models).
  • Use shr --list-models to see all locally available models.
  • Thinking is supported on models that emit <think> tokens (e.g. qwq, deepseek-r1).

llama.cpp

llama.cpp exposes an OpenAI-compatible HTTP server.

{
  "defaultProvider": "llamacpp",
  "llamaCppBaseURL": "http://localhost:8080"
}

# Start the llama.cpp server (example)
./llama-server -m model.gguf --port 8080

# Connect with sharur
shr --provider llamacpp --model my-model

Notes:

  • Default base URL is http://localhost:8080. Override with llamaCppBaseURL.
  • The model name passed to shr is forwarded to the server as-is.
  • Image attachments are not supported.
  • The server’s own context window size is used; sharur queries /v1/models to detect it.

OpenAI

{
  "defaultProvider": "openai",
  "defaultModel": "gpt-4o",
  "openAIApiKey": "",
  "openAIBaseURL": "https://api.openai.com/v1"
}

# Via environment variable (recommended)
export SHARUR_OPENAI_API_KEY=sk-...
shr --model openai/gpt-4o

# One-off key override
shr --provider openai --model gpt-4o --api-key sk-...

OpenAI-compatible endpoints:

Any server that implements the OpenAI chat completions API can be used by pointing openAIBaseURL at it:

{
  "defaultProvider": "openai",
  "openAIBaseURL": "http://localhost:11434/v1",
  "openAIApiKey": "unused"
}

This works with vLLM, LM Studio, and others.

Notes:

  • Reasoning models (o3, o4-mini) emit thinking deltas that appear in the TUI and JSON event stream.
  • Supports tools and vision (images) for compatible models.

Anthropic

{
  "defaultProvider": "anthropic",
  "defaultModel": "claude-sonnet-4-6",
  "anthropicApiKey": "",
  "anthropicApiVersion": ""
}

export SHARUR_ANTHROPIC_API_KEY=sk-ant-...
shr --model anthropic/claude-sonnet-4-6

# Extended thinking (claude-3-7-sonnet and later)
shr --model anthropic/claude-3-7-sonnet-20250219 --thinking high

Notes:

  • Extended thinking is supported for models that enable it (e.g. claude-3-7-sonnet). Use --thinking medium or --thinking high.
  • medium thinking uses a 10,000-token budget; high uses 20,000 tokens. Temperature is automatically set to 1, as the Anthropic API requires when extended thinking is enabled.
  • anthropicApiVersion overrides the anthropic-version request header; leave empty to use the library default.

Google Gemini

{
  "defaultProvider": "google",
  "defaultModel": "gemini-2.0-flash",
  "googleApiKey": ""
}

export SHARUR_GOOGLE_API_KEY=AIza...
shr --model google/gemini-2.0-flash

Notes:

  • Gemini 1.5 Pro and later have a 1M+ token context window.
  • Supports tools and vision (images).
  • Use shr --list-models to see available Gemini models.

Listing Available Models

All five providers implement model listing. Use --list-models to query the active provider:

# List Ollama models
shr --list-models

# List models from a specific provider
shr --provider anthropic --list-models

# Filter results
shr --provider openai --list-models gpt-4

The output is a plain list of model names, suitable for piping:

shr --list-models | fzf | xargs -I{} shr --model {}

Provider Feature Matrix

| Provider | Tools | Images | Thinking | Model Listing |
|----------|-------|--------|----------|---------------|
| ollama | ✓ | ✓ | model-dependent | ✓ |
| llamacpp | ✓ | ✗ | ✗ | ✓ |
| openai | ✓ | ✓ | reasoning models | ✓ |
| anthropic | ✓ | ✓ | ✓ extended | ✓ |
| google | ✓ | ✓ | ✗ | ✓ |

TUI

The TUI is a rich, Bubble Tea-powered interface with real-time streaming, tool cards, session management, and a live context usage progress bar in the status footer.


Keybindings

| Key | Action |
|-----|--------|
| Enter | Send message (or steer the running agent) |
| Shift+Enter | Insert newline |
| Ctrl+Enter | Queue a follow-up message (runs after the agent finishes) |
| Ctrl+C | Abort the current agent run and clear the input editor |
| Esc | Cancel streaming / close modal / abort the current turn |
| Ctrl+O | Toggle tool call output expansion |
| Ctrl+P | Open the model selection modal (cycling via the --models flag) |
| ↑/↓ | Navigate prompt history (when at the start/end of the editor) / scroll the viewport |
| F1 | Show the help modal |

Slash Commands

| Command | Description |
|---------|-------------|
| /new | Start a fresh session |
| /resume <id> | Resume a session by ID or partial UUID (fuzzy search enabled) |
| /branch [idx] | Create a new child session branching from a specific message index (defaults to the last) |
| /fork | Duplicate the current session into a new independent session (no parent link) |
| /rebase | Interactive rebase: select specific messages to keep in a new session |
| /merge <id> | Merge another session's history into the current one with a synthesis turn |
| /tree [-g\|-p] | Open the session tree modal. Flags: --global (-g) or --project (-p) |
| /import <path> | Import a session from a JSONL file |
| /export <path> | Export the current session to a JSONL file |
| /model <p/m> | Switch model mid-conversation (e.g. /model anthropic/claude-sonnet-4-6) |
| /stats | View session statistics and token usage |
| /config | View and edit the active configuration |
| /context | View detailed context window usage |
| /compact | Manually trigger a context compaction |
| /skill:<name> [args] | Invoke a skill |
| /prompt:<name> | Expand a prompt template into the editor |
| /exit | Quit (alias: /quit) |
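
A typical mid-session sequence (the message index and path are illustrative):

/model anthropic/claude-sonnet-4-6
/branch 12
/export ./session-backup.jsonl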

Session Tree Modal (/tree)

| Key | Action |
|-----|--------|
| ↑/↓ / PgUp/PgDn | Navigate the session list |
| Enter | Resume the selected session (or branch from it if it is an interior node) |
| B | Create a new branch from the selected session |
| F | Create an independent fork of the selected session |
| R | Start an interactive rebase from the selected session's history |
| Esc | Close the modal |

Bang Commands

Bang commands execute a shell command and inject the output into the conversation:

!ls -la          # Execute shell command, paste output into editor
!!cat README.md  # Execute shell command, send output directly to agent

  • !cmd pastes stdout into the editor so you can review before sending
  • !!cmd sends stdout directly to the agent without review

At-File Attachments

Type @ in the input to fuzzy-search and attach file contents to your prompt:

Tell me what this does @src/agent/loop.go

The file content is embedded inline in the message sent to the agent.

JSON Mode

JSON mode runs a single prompt and streams the agent’s events as line-delimited JSON (JSONL) to stdout. It is designed for shell pipelines and tooling integration.

shr --mode json "What is the best way to structure a Go project?"

# Pipe stdin as context
cat main.go | shr --mode json "Refactor this to use interfaces"

# Specify a model
shr --mode json "Summarize the last 10 git commits" --model anthropic/claude-opus-4-5

Event Format

Each line is the protobuf JSON encoding of an AgentEvent. Event types mirror the TUI stream:

  • EVENT_AGENT_START / EVENT_AGENT_END
  • EVENT_TEXT_DELTA — incremental response text
  • EVENT_THINKING_DELTA — incremental thinking text (extended thinking models)
  • EVENT_TOOL_CALL — tool invocation start
  • EVENT_TOOL_DELTA — streaming tool output
  • EVENT_TOOL_OUTPUT — final tool result
  • EVENT_TURN_START / EVENT_TURN_END
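
A single text-delta line might look like the following; only the type and content fields are shown, and the full field set follows the AgentEvent proto:

{"type":"EVENT_TEXT_DELTA","content":"Interfaces in Go describe behavior, not data."}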

Common Patterns

# Capture only the text deltas
shr --mode json "Explain Go interfaces" \
  | jq -r 'select(.type == "EVENT_TEXT_DELTA") | .content'

# Run without saving the session
shr --mode json --no-session "Quick one-off question"

# Dry-run to see what tools would be called
shr --mode json --dry-run "Delete all .tmp files in the current directory"
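
# Inspect final tool results only (assumes the ls tool is enabled)
shr --mode json --tools ls "List the Go files here" \
  | jq 'select(.type == "EVENT_TOOL_OUTPUT")'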

gRPC Mode

gRPC mode starts a persistent AgentService server. Each connecting client supplies a session_id and gets its own isolated agent. Sessions are saved to disk after each turn and reloaded automatically on reconnect.

# Start on the default port
shr --mode grpc

# Use a custom address
shr --mode grpc --grpc-addr :9090

The server responds to SIGINT/SIGTERM with a graceful shutdown: in-flight turns are allowed to finish (30 s timeout), all sessions are flushed to disk, then the listener closes.


Proto Definition

The service is defined in proto/sharur/v1/agent.proto. Generated Go stubs live in internal/gen/sharur/v1/. Regenerate with mage generate.

Key RPCs:

| RPC | Description |
|-----|-------------|
| Prompt | Send a user message; streams back AgentEvents |
| NewSession | Create a new session |
| GetMessages | Retrieve the message history for a session |
| GetState | Get the current agent state |
| Steer | Inject a steering message mid-turn |
| FollowUp | Queue a follow-up after the current turn |
| Abort | Cancel the currently running turn |
| ForkSession | Fork a session into a new independent copy |
| ConfigureSession | Change the model, provider, or thinking level |
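
For ad-hoc testing, a generic gRPC client such as grpcurl can drive the service. A sketch, assuming the fully qualified service name sharur.v1.AgentService (inferred from the proto path) and illustrative request field names:

# session_id / message are illustrative field names, not confirmed by this page
grpcurl -plaintext \
  -proto proto/sharur/v1/agent.proto \
  -d '{"session_id": "demo", "message": "hello"}' \
  localhost:50051 \
  sharur.v1.AgentService/Prompt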

In-Process Transport

For the TUI and JSON modes, all internal communication also goes through this same protobuf boundary using a bufconn in-memory pipe — not a network socket. This means all three modes share identical code paths. See Service Architecture for details.