Internals

This section describes the high-level architecture of sharur: how its components are organized, how data flows through the system, and how the key abstractions relate to each other.


Directory Structure

sharur/
├── internal/
│   ├── service/        # Central AgentService implementation + in-process client
│   ├── gen/            # Generated Protobuf stubs (pb.AgentServiceClient/Server)
│   ├── agent/          # Core agentic loop, event bus, state machine
│   ├── llm/            # LLM provider adapters (Ollama, OpenAI, Anthropic, llama.cpp, Google)
│   ├── tools/          # Built-in tool implementations + registry
│   ├── session/        # JSONL-backed session persistence, branching, tree
│   ├── modes/
│   │   ├── interactive/ # Bubble Tea TUI (pb client)
│   │   ├── print.go    # One-shot CLI JSONL mode (pb client)
│   │   └── grpc.go     # gRPC server mode (wraps Service)
│   ├── config/         # Config loading (global + project layering)
│   ├── themes/         # TUI colour themes
│   ├── types/          # Shared value types (Message, Session, ThinkingLevel)
│   ├── events/         # Generic publish-subscribe event bus
│   ├── skills/         # Skill discovery (Markdown files → slash commands)
│   ├── prompts/        # Prompt template discovery
│   └── contextfiles/   # Auto-discovered context file injection (AGENTS.md, etc.)
├── cmd/                # Entry points (shr)
├── proto/              # Protobuf definitions (sharur/v1/agent.proto)
├── extensions/         # gRPC extension loader + proto definitions
└── sdk/                # Public Go SDK

Component Diagram

flowchart TD
    CLI["CLI flags & Config"] --> Svc

    subgraph core ["internal/agent"]
        Agent["Agent
Messages · SteerQueue · FollowUpQueue
StateMachine"]
        RunTurn["runTurn
provider.Stream · consumeStream · execTools"]
        EB["EventBus
async · non-blocking · 4096-item buffer"]
        Agent --> RunTurn
        RunTurn -->|publishes| EB
    end

    Svc["internal/service
AgentService"] --> core

    RunTurn --> LLM

    subgraph llm ["internal/llm"]
        LLM["Provider interface
Stream · Info"]
        Adapters["Ollama · OpenAI · Anthropic
llama.cpp · Google"]
        LLM --> Adapters
    end

    EB --> TUI["TUI"]
    EB --> JSON["JSON stdout"]
    EB --> GRPC["gRPC stream"]
    EB --> Session["session saver"]

Data Flow Summary

flowchart TD
    Input["User Input"] --> Mode["TUI · JSON · Remote Client"]
    Mode --> PBClient["pb.AgentServiceClient
bufconn or TCP"]
    PBClient --> Service["internal/service
getOrCreate / loadIfExists"]
    Service --> AP["agent.Prompt(ctx, text)"]
    AP --> MI["ext.ModifyInput()"]
    MI --> SS["ext.SessionStart() · ext.AgentStart()
EventAgentStart"]

    SS --> Loop

    subgraph Loop ["runTurn loop"]
        direction TB
        BP["ext.BeforePrompt() · ModifySystemPrompt()
ModifyContext() · BeforeProviderRequest()"]
        LLMStream["llm.Provider.Stream()
EventTextDelta · EventThinkingDelta · EventToolCall"]
        APR["ext.AfterProviderResponse()
EventTurnStart · ext.TurnStart()"]
        ToolExec["ext.BeforeToolCall() · execTool() · ext.AfterToolCall()
EventToolDelta · EventToolOutput"]
        TE["ext.TurnEnd()"]
        More{"more tool calls?"}
        BP --> LLMStream --> APR --> ToolExec --> TE --> More
        More -->|yes| BP
    end

    More -->|no| AgEnd["EventAgentEnd · ext.AgentEnd()"]
    AgEnd --> Save["service saves session to disk"]
    Save --> Stream["Stream Protobuf Events to client"]
    Stream --> Render["Render: TUI · JSONL stdout · gRPC stream"]

Agent Loop

The agent is driven by an event bus (internal/events). Every meaningful state transition emits an agent.Event to all subscribers.


EventBus Performance

The EventBus is async and non-blocking. Publish() enqueues to a 4096-item buffered channel per subscriber and returns immediately — it never blocks the agent loop. Each subscriber runs in its own goroutine. Slow subscribers drop events to protect the agent loop from backpressure.
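The pattern above can be sketched with plain Go channels. This is an illustrative stand-in, not sharur's actual implementation:

```go
package main

import "fmt"

// Event and EventBus are illustrative stand-ins, not sharur's real types.
type Event struct{ Kind, Payload string }

type EventBus struct {
	subs []chan Event
}

// Subscribe registers a subscriber with a 4096-item buffered channel.
// In the real system each subscriber drains its channel in its own goroutine.
func (b *EventBus) Subscribe() <-chan Event {
	ch := make(chan Event, 4096)
	b.subs = append(b.subs, ch)
	return ch
}

// Publish enqueues to every subscriber and returns immediately. A full
// buffer means the subscriber is too slow: the event is dropped rather
// than letting backpressure stall the agent loop.
func (b *EventBus) Publish(ev Event) {
	for _, ch := range b.subs {
		select {
		case ch <- ev:
		default: // drop for slow subscribers
		}
	}
}

func main() {
	bus := &EventBus{}
	ch := bus.Subscribe()
	bus.Publish(Event{Kind: "text_delta", Payload: "hello"})
	fmt.Println((<-ch).Payload) // hello
}
```

The `select` with a `default` branch is what makes Publish non-blocking: when a subscriber's buffer is full, the event is dropped instead of stalling the producer.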


Event Flow

sequenceDiagram
    participant User
    participant Agent
    participant LLM
    participant Tools

    User->>Agent: Prompt(text)
    Agent->>Agent: EventAgentStart
    loop each LLM turn
        Agent->>Agent: EventTurnStart · EventMessageStart
        Agent->>LLM: provider.Stream()
        LLM-->>Agent: EventTextDelta (×n)
        LLM-->>Agent: EventThinkingDelta (×n, if thinking enabled)
        LLM-->>Agent: EventToolCall (×n, if tools requested)
        Agent->>Agent: EventMessageEnd
        loop each tool call
            Agent->>Tools: execTool()
            Tools-->>Agent: EventToolDelta (streaming)
            Agent->>Agent: EventToolOutput
        end
        Agent->>Agent: EventTurnEnd
    end
    Agent->>Agent: EventAgentEnd

State Machine

The agent transitions through explicit states to prevent concurrent modification:

stateDiagram-v2
    [*] --> Idle
    Idle --> Thinking : Prompt()
    Thinking --> Executing : tool calls present
    Thinking --> Idle : no tool calls
    Thinking --> Compacting : token limit reached
    Thinking --> Aborting : Abort() called
    Executing --> Thinking : more turns needed
    Executing --> Idle : done
    Compacting --> Thinking : resume
    Aborting --> Idle
    Thinking --> Error
    Error --> [*]

Prompt Queues

Two queues support non-blocking interaction while the agent is running:

  • SteerQueue — Injected as a user message at the next tool boundary (interrupt-style)
  • FollowUpQueue — Processed as a new turn after the agent goes Idle
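A minimal sketch of the two-queue pattern, with hypothetical names (sharur's real types differ):

```go
package main

import (
	"fmt"
	"sync"
)

// PromptQueues is an illustrative stand-in for the two queues described above.
type PromptQueues struct {
	mu       sync.Mutex
	steer    []string // drained at the next tool boundary
	followUp []string // drained when the agent returns to Idle
}

func (q *PromptQueues) Steer(text string) {
	q.mu.Lock()
	q.steer = append(q.steer, text)
	q.mu.Unlock()
}

func (q *PromptQueues) FollowUp(text string) {
	q.mu.Lock()
	q.followUp = append(q.followUp, text)
	q.mu.Unlock()
}

// DrainSteer would be called between tool executions inside the turn loop;
// the returned strings are injected as user messages.
func (q *PromptQueues) DrainSteer() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	msgs := q.steer
	q.steer = nil
	return msgs
}

func main() {
	var q PromptQueues
	q.Steer("stop editing that file")
	fmt.Println(q.DrainSteer()) // injected at the next tool boundary
}
```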

Tool System

Tools implement a simple interface:

type Tool interface {
    Name() string
    Description() string
    Schema() json.RawMessage
    Execute(ctx context.Context, args json.RawMessage, update ToolUpdate) (*ToolResult, error)
    IsReadOnly() bool
}

A ToolRegistry holds all registered tools. During a turn, when the LLM emits a tool call, execTool looks up the tool by name, executes it, and streams partial output via EventToolDelta before emitting the final EventToolOutput.

Built-in tools: read, write, edit, bash, grep, ls, find

Safety Enforcements

  • Dry-Run Mode: When DryRun is enabled, any tool that is not marked as read-only will bypass execution and return a descriptive preview of what it would have done.
  • Input Sanitization: Prompt template expansion automatically wraps user inputs in <untrusted_input> tags to prevent prompt injection into the base instructions.

Service Architecture

sharur follows a Strict Protobuf Internal Architecture. Instead of UI modes calling Go functions directly, all interfaces are treated as clients of a central AgentService.


Protobuf Boundary

The interface between the UI and the core is defined in proto/sharur/v1/agent.proto. This boundary ensures:

  • Consistency: All modes (TUI, CLI, JSON, Remote gRPC) use the exact same code paths and logic.
  • Decoupling: UI logic is completely isolated from agent state, session persistence, and provider adapters.
  • Interoperability: Any gRPC-capable client can interact with a sharur service.

In-Process Communication

For local CLI usage, sharur uses a specialized In-Process Client (internal/service/client.go). It uses bufconn to implement the pb.AgentServiceClient interface over an in-memory pipe. This provides the safety and structure of gRPC without the latency or configuration complexity of network ports.


Backend Service (internal/service)

The Service struct implements pb.AgentServiceServer. It owns the session.Manager and manages the lifecycle of agent.Agent instances. It translates between internal agent events (Go channels) and Protobuf event streams.


Session Loading Strategy

RPCs split into three lookup strategies:

Strategy         | Used by                                                            | Behaviour
-----------------|--------------------------------------------------------------------|----------
getOrCreate(id)  | Prompt, NewSession                                                 | Always returns an entry — creates a fresh agent if id is unknown, loading from disk if a matching session file exists
loadIfExists(id) | GetState, GetMessages, ConfigureSession, ForkSession, CloneSession | Returns the entry if it is in memory or can be loaded from disk; returns NotFound for completely unknown IDs
lookup(id)       | Steer, Abort, FollowUp, StreamEvents                               | In-memory only — these only make sense for a currently-running agent

This means a /resume <id> command can switch to any session ever saved to disk without a round-trip NewSession call: the first GetMessages or GetState call transparently loads it.
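A sketch of the three strategies, using an in-memory map as a stand-in for live agents and a set as a stand-in for on-disk session files (names are illustrative):

```go
package main

import "fmt"

// Registry is an illustrative stand-in for the service's session bookkeeping.
type Registry struct {
	live map[string]*Entry // currently-loaded agents
	disk map[string]bool   // stand-in for session files on disk
}

type Entry struct{ ID string }

// getOrCreate always succeeds: reuse a live entry, or create a fresh one
// (loading from disk first if a matching session file exists).
func (r *Registry) getOrCreate(id string) *Entry {
	if e, ok := r.live[id]; ok {
		return e
	}
	e := &Entry{ID: id}
	r.live[id] = e
	return e
}

// loadIfExists returns an error (mapped to gRPC NotFound) for unknown IDs.
func (r *Registry) loadIfExists(id string) (*Entry, error) {
	if e, ok := r.live[id]; ok {
		return e, nil
	}
	if r.disk[id] {
		return r.getOrCreate(id), nil
	}
	return nil, fmt.Errorf("not found: %s", id)
}

// lookup is in-memory only: Steer/Abort/FollowUp need a running agent.
func (r *Registry) lookup(id string) (*Entry, bool) {
	e, ok := r.live[id]
	return e, ok
}

func main() {
	r := &Registry{live: map[string]*Entry{}, disk: map[string]bool{"old": true}}
	if _, err := r.loadIfExists("old"); err != nil {
		panic(err)
	}
	if _, ok := r.lookup("never-started"); ok {
		panic("lookup should be in-memory only")
	}
	fmt.Println("ok")
}
```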

LLM Providers

Provider Interface

type Provider interface {
    Stream(ctx context.Context, req *CompletionRequest) (<-chan *Event, error)
    Info() ProviderInfo
}

All providers return a uniform Stream of Event values — text deltas, thinking deltas, tool calls, and usage. The agent’s consumeStream function normalizes these into the internal Message format, making the agent completely provider-agnostic.
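The normalization step can be sketched as a fold over the event channel. Event and Message here are simplified stand-ins for the internal types:

```go
package main

import "fmt"

// Event is a simplified stand-in for the provider event type.
type Event struct {
	Kind string // "text", "thinking", "tool_call", "usage"
	Text string
}

// Message is a simplified stand-in for the internal message type.
type Message struct {
	Content  string
	Thinking string
	Tools    []string
}

// consumeStream folds a channel of provider events into one assistant message.
func consumeStream(events <-chan Event) Message {
	var m Message
	for ev := range events {
		switch ev.Kind {
		case "text":
			m.Content += ev.Text
		case "thinking":
			m.Thinking += ev.Text
		case "tool_call":
			m.Tools = append(m.Tools, ev.Text)
		}
	}
	return m
}

func main() {
	ch := make(chan Event, 3)
	ch <- Event{Kind: "thinking", Text: "plan..."}
	ch <- Event{Kind: "text", Text: "Hello"}
	ch <- Event{Kind: "tool_call", Text: "read"}
	close(ch)
	fmt.Printf("%+v\n", consumeStream(ch))
}
```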


CompletionRequest

type CompletionRequest struct {
    Model       string
    Messages    []types.Message
    Tools       []types.ToolInfo
    System      string
    Thinking    types.ThinkingLevel
    MaxTokens   int
    Temperature float64
    StreamOpts  StreamOptions
}

The BeforeProviderRequest extension hook receives this struct as JSON and can modify any field before it is sent to the provider โ€” useful for overriding temperature, trimming the tool list, or adjusting MaxTokens per request.


ProviderInfo

type ProviderInfo struct {
    Name          string
    Model         string
    MaxTokens     int
    ContextWindow int  // 0 = unknown
    HasToolCall   bool
    HasImages     bool
}

Info() is called once at startup. The service uses ContextWindow to trigger compaction when the conversation grows too large. HasImages controls whether the TUI offers image attachment UI.


ModelLister

type ModelLister interface {
    ListModels() ([]string, error)
}

All five adapters implement ModelLister. When --list-models is passed, the CLI type-asserts the active provider to ModelLister and prints the result. Each adapter queries the appropriate API:

Provider  | Query mechanism
----------|----------------
ollama    | GET /api/tags
llamacpp  | GET /v1/models
openai    | GET /v1/models
anthropic | GET /v1/models
google    | Gemini model list API
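The optional-interface check is a standard Go type assertion. A minimal self-contained sketch, with stand-in types rather than sharur's real ones:

```go
package main

import "fmt"

// Provider is a stand-in for the real interface (whose Info() returns ProviderInfo).
type Provider interface{ Info() string }

// ModelLister matches the optional interface described above.
type ModelLister interface {
	ListModels() ([]string, error)
}

// fakeProvider implements both interfaces, like the five real adapters.
type fakeProvider struct{}

func (fakeProvider) Info() string { return "fake" }
func (fakeProvider) ListModels() ([]string, error) {
	return []string{"model-a", "model-b"}, nil
}

func main() {
	var p Provider = fakeProvider{}
	// The type assertion checks the optional capability at runtime.
	if lister, ok := p.(ModelLister); ok {
		models, _ := lister.ListModels()
		fmt.Println(models)
	}
}
```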

Supported Providers

Provider  | Backend
----------|--------
ollama    | Local Ollama server (HTTP)
llamacpp  | llama.cpp server (HTTP, OpenAI-compatible)
openai    | OpenAI API or any OpenAI-compatible endpoint
anthropic | Anthropic Messages API
google    | Google Gemini API

Each adapter lives in internal/llm/ and translates the provider’s wire format into the uniform Stream abstraction.


Feature Matrix

Provider  | Tools | Images | Thinking         | Context Window
----------|-------|--------|------------------|---------------
ollama    | ✓     | ✓      | model-dependent  | 4096 (default)
llamacpp  | ✓     | ✗      | ✗                | from server n_ctx
openai    | ✓     | ✓      | reasoning models | model-dependent
anthropic | ✓     | ✓      | ✓ extended       | model-dependent
google    | ✓     | ✓      | ✗                | 1,000,000+

Per-Provider Notes

Ollama

The Ollama adapter uses the /api/chat endpoint with streaming enabled. Context window defaults to 4096 when not reported by the server. Thinking is supported on models that emit <think> tokens (e.g. qwq, deepseek-r1) — sharur surfaces these as EventThinkingDelta events by detecting the tag boundaries in the stream.

llama.cpp

Uses the OpenAI-compatible /v1/chat/completions endpoint. The context window (n_ctx) is queried from the server at startup. Image attachments are not supported because llama.cpp’s OpenAI endpoint does not accept multipart vision payloads in the standard format.

OpenAI

Uses the standard /v1/chat/completions streaming endpoint. Any server implementing this API — vLLM, LM Studio, Groq, Together AI — can be used by setting openAIBaseURL. Reasoning models (o3, o4-mini) emit reasoning_content deltas that are surfaced as EventThinkingDelta.

Anthropic

Uses the Messages API (/v1/messages) with streaming. Extended thinking is activated when req.Thinking is medium or high:

  • medium — 10,000-token thinking budget
  • high — 20,000-token thinking budget

The API requires temperature: 1.0 when extended thinking is enabled; the adapter sets this automatically and overrides any user-supplied temperature for that request.

Google

Uses the Gemini generateContent API via the google.golang.org/genai client library. Gemini 1.5 Pro and later have context windows of 1M+ tokens; compaction is rarely triggered for typical sessions.


Adding a Provider

Implement the Provider interface in internal/llm/yourprovider.go and register it in internal/config/factory.go. Implement ModelLister to enable --list-models. The adapter receives a fully-formed CompletionRequest; it is responsible for translating Message.ToolCalls and Message.Images into the target API’s format.

Session Management

Sessions are persisted as JSONL files in a project-aware directory:

~/.sharur/sessions/
  --Users-alice-Projects-myapp--/     ← sanitized CWD
    2026-04-23T07-06-54_{uuid}.jsonl  ← timestamped session file
    2026-04-23T09-12-11_{uuid}.jsonl

Session File Format

Each .jsonl file contains one JSON object per line:

  • Line 0 (header): kind=header — session ID, parentId, model, timestamps, system prompt, compaction settings, dryRun flag
  • Subsequent lines: kind=message — individual conversation messages with full payloads (role, content, thinking, tool calls, tool call ID)

Session Tree

Sessions form a linked tree via parentId. The session.Manager.BuildTree() method assembles all sessions from the project directory into a []*TreeNode tree. FlattenTree produces a depth-first flat list with structured layout metadata (gutters, connectors, indentation), which the TUI layer uses to render a clean Unicode box-drawing tree diagram.

flowchart TD
    A["Session A
(root)"] --> B["Session B
(/branch from A)"]
    A --> C["Session C
(/fork of A)"]
    B --> D["Session D
(/branch from B at msg 5)"]
    B --> E["Session E
(/rebase of B)"]
    B --> F["Session F
(/merge into B)"]

    style C stroke-dasharray: 5 5

/fork creates an independent copy (dashed border above) with no parentId link — it does not appear as a child in the tree visualization.


Branching, Rebasing & Merging

flowchart TD
    Q{"What do you need?"}

    Q -->|"Explore an alternate
path from this point"| Branch["/branch [idx]
Child session, same history up to idx"]
    Q -->|"Independent copy
no tree relationship"| Fork["/fork
Detached snapshot"]
    Q -->|"Clean up the conversation
keep only specific messages"| Rebase["/rebase
Interactively select messages
for a new session"]
    Q -->|"Combine two sessions
into one context"| Merge["/merge <id>
LLM-synthesized merge turn
appended to current session"]

Command       | Creates parent link | Copies history        | Interactive
--------------|---------------------|-----------------------|------------
/branch [idx] | ✓                   | up to idx             | ✗
/fork         | ✗                   | full                  | ✗
/rebase       | ✓                   | selected messages     | ✓
/merge <id>   | ✗                   | appends other session | LLM turn

The /tree modal (keyboard shortcuts B, F, and R on a selected session) exposes all of these without leaving the TUI.


Compaction & Context Management

To stay within LLM context windows, sharur implements an auto-compaction strategy:

  1. Trigger: When tokens > ContextWindow - reserveTokens, compaction fires.
  2. Summarization: The agent uses the LLM to generate a structured summary (<!-- sharur-summary -->) of the pruned messages.
  3. File Tracking: The summary carries forward lists of files read and modified, so the assistant retains awareness of what it has already seen.
  4. Split Turn Handling: If compaction cuts mid-turn, a “Turn Prefix Summary” is generated to preserve context for the remaining tool calls.
  5. Session Tree Integration: Compaction events are stored as TypeCompaction records in the JSONL file, visible in /stats and preserved across restarts.
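Step 1's trigger condition can be sketched directly from the fields this section uses:

```go
package main

import "fmt"

// shouldCompact sketches the auto-compaction trigger: fire when used tokens
// exceed the context window minus the reserved headroom.
func shouldCompact(usedTokens, contextWindow, reserveTokens int) bool {
	if contextWindow == 0 { // 0 = unknown window; never auto-compact
		return false
	}
	return usedTokens > contextWindow-reserveTokens
}

func main() {
	// With a 32768-token window and 2048 reserved, the threshold is 30720.
	fmt.Println(shouldCompact(30000, 32768, 2048)) // false
	fmt.Println(shouldCompact(31000, 32768, 2048)) // true
}
```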

Compaction Configuration

// ~/.sharur/config.json or .sharur/config.json
{
  "compaction": {
    "enabled": true,
    "reserveTokens": 2048,
    "keepRecentTokens": 8192
  }
}

Field            | Default | Description
-----------------|---------|------------
enabled          | true    | Whether auto-compaction fires when the token budget is exceeded
reserveTokens    | 2048    | Tokens to keep free at the top of the context window; compaction triggers when used > window - reserveTokens
keepRecentTokens | 8192    | Minimum recent-turn tokens to always retain after compaction, ensuring the current conversation thread survives

Trigger compaction manually at any time with /compact in the TUI or by calling the Compact RPC directly.


Export & Import

Sessions can be exported to and imported from JSONL files:

# Export from TUI
/export /path/to/session.jsonl

# Import into TUI (creates a new session from the file)
/import /path/to/session.jsonl

# Export from CLI without entering TUI
shr --export /path/to/session.html   # HTML snapshot

Exported JSONL files are self-contained: they include the session header and all messages. Imported sessions are assigned a new UUID and added to the current project’s session directory.

TUI Internals

The TUI is built with Bubble Tea (v2) and organized into focused files:

File           | Responsibility
---------------|---------------
interactive.go | Run() entry point, gRPC client wiring
model.go       | model struct definition, newModel()
update.go      | Update() — key handling, slash commands, picker logic, promptGRPC()
events.go      | handleAgentEvent() — maps *pb.AgentEvent payloads to TUI history updates
view.go        | View() — renders chat history, status bar, input
modal.go       | Stats, Config, and Session Tree modal overlays
slash.go       | Slash command parsing and handlers (all via gRPC client)
picker.go      | Fuzzy picker component (sessions, skills, files, prompts)
keys.go        | Keybinding helpers (Matches, K.Ctrl(...))
types.go       | historyEntry, contentItem, toolCallEntry — render data model
utils.go       | Helper functions (Capitalize)

Prompt Submission

Prompt submission uses promptGRPC(), which opens a client.Prompt() server-streaming RPC and drains *pb.AgentEvent messages into m.eventCh in a goroutine. The listenForEvent Bubble Tea command feeds that channel back into the update loop one event at a time.
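The channel-draining pattern can be sketched with stand-ins for Bubble Tea's types (in the real library, tea.Msg is an empty interface and tea.Cmd is func() tea.Msg):

```go
package main

import "fmt"

// Msg and Cmd mirror Bubble Tea's tea.Msg / tea.Cmd shapes.
type Msg any
type Cmd func() Msg

// agentEvent stands in for *pb.AgentEvent.
type agentEvent struct{ Text string }

// listenForEvent returns a command that delivers exactly one event from the
// channel back into the update loop; Update re-issues it after each event.
func listenForEvent(ch <-chan agentEvent) Cmd {
	return func() Msg {
		ev, ok := <-ch
		if !ok {
			return nil // stream closed
		}
		return ev
	}
}

func main() {
	ch := make(chan agentEvent, 2)
	ch <- agentEvent{Text: "Hel"}
	ch <- agentEvent{Text: "lo"}
	close(ch)
	cmd := listenForEvent(ch)
	for msg := cmd(); msg != nil; msg = cmd() {
		fmt.Print(msg.(agentEvent).Text)
	}
	fmt.Println()
}
```

One event per command keeps the UI responsive: each delivered event triggers a render before the next is pulled from the channel.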


Prompt History

The TUI maintains a per-session prompt history in m.promptHistory, synced from the service via GetMessages at startup and after session switches. Users navigate previous prompts using Up/Down arrow keys while the editor is focused; the current draft is preserved as m.draftInput.


Render Data Model

The TUI stores conversation history as []historyEntry. Each entry has an ordered []contentItem slice that preserves the exact stream order:

historyEntry {
  role: "assistant"
  items: [
    { kind: contentItemThinking, text: "..." }
    { kind: contentItemText,     text: "..." }
    { kind: contentItemToolCall, tc: { id, name, arg, status, streamingOutput } }
    { kind: contentItemToolOutput, out: { toolCallID, content, isError } }
  ]
}

This mirrors the content[] array model, ensuring correct temporal ordering of thinking, text, and tool calls.


Modals

  • Stats — Token counts, session metadata, file/path info
  • Config — Active model, provider, compaction settings
  • Session Tree — Interactive paginated tree with structured branch visualization; supports Resume (Enter) and Branch (B)
  • Rebase Picker — Selection interface for history manipulation
  • Merge Picker — Fuzzy finder for selecting sessions to merge into the current conversation

Build & Release

sharur uses a combination of Mage and GitHub Actions for CI/CD.


Versioning

The project version is maintained in a VERSION file in the repository root. During build, Magefile.go reads this file and injects it into the binary using linker flags (-ldflags "-X main.version=...").


Mage Targets

TargetDescription
BuildCompile shr for the current platform with version injection
TestRun all unit tests with coverage
VetStatic analysis with go vet
LintRun golangci-lint
VulnVulnerability scan with govulncheck
AllRun generate, build, test, vet, lint, and vuln in sequence
ReleaseCross-compile for Linux, macOS, and Windows (AMD64/ARM64), package into dist/
GenerateRun buf to regenerate protobuf stubs
DocsGenerate API reference (gomarkdoc) and build the Hugo site
DocsServeRun Hugo dev server at localhost:1313 with live reload
PkgSiteRun pkgsite for local full API browsing including internals

CI/CD Pipelines

Continuous Integration (ci.yml)

Triggered on every push to main and all pull requests. Runs mage all within a Nix environment on both ubuntu-latest and macos-latest, then uploads per-platform binaries as build artifacts. Coverage is collected and summarised via go tool cover.

Automated Release (release.yml)

Triggered by pushing a version tag (e.g., v1.2.3). Runs mage release to build cross-platform assets and uses softprops/action-gh-release to publish them to a new GitHub Release.

Docs Deploy (docs.yml)

Triggered on push to main and on published releases. Runs mage docs (gomarkdoc + Hugo build) and deploys docs/public/ to the gh-pages branch via peaceiris/actions-gh-pages.