Memory & Context

Gokin implements a three-level memory system (session, project, global) and manages context through token counting, automatic summarization, and tool-result compaction.

Memory Types

Type	Scope	Lifetime	Description
Session	Current session	Until session ends	Temporary notes
Project	Current project	Persistent	Project knowledge
Global	All projects	Persistent	General preferences

Entry Structure

type Entry struct {
    ID        string     // "mem_" + hash[:8], where hash is a SHA-256 digest
    Key       string     // Optional key for lookup
    Content   string     // Memory content
    Type      MemoryType // session, project, or global
    Tags      []string   // Auto-extracted tags
    Timestamp time.Time
    Project   string     // Project identifier (hash of the project path)
}
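
The ID scheme noted in the struct comment can be sketched as follows. This is a minimal illustration, not Gokin's actual code: the helper name `newEntryID` is hypothetical, and both hashing the content alone and taking the first 8 hex characters of the digest are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newEntryID (hypothetical helper) builds an ID of the form
// "mem_" + the first 8 hex characters of a SHA-256 digest.
// Assumption: only the content is hashed; the real input may differ.
func newEntryID(content string) string {
	sum := sha256.Sum256([]byte(content))
	return "mem_" + hex.EncodeToString(sum[:])[:8]
}

func main() {
	id := newEntryID("prefer table-driven tests")
	// The ID is always 12 characters: the 4-character prefix plus 8 hex characters.
	fmt.Println(id, len(id))
}
```

Because the ID is derived from a hash, identical input always yields the same ID under this assumption.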

Auto-Tagging

Tags are extracted automatically from entry content using these regular expressions:

File paths: /[a-zA-Z0-9_.\-/]+
Function names: function [a-zA-Z_][a-zA-Z0-9_]*
Package names: package [a-zA-Z_][a-zA-Z0-9_]*
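
As a rough sketch, the three patterns above could be applied with Go's `regexp` package like this. The function name `extractTags`, the de-duplication, and the choice to store only the captured identifier (rather than the whole match) for functions and packages are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns copied from the list above; capture groups are added so that
// only the identifier is kept as a tag (an assumption).
var (
	pathRe = regexp.MustCompile(`/[a-zA-Z0-9_.\-/]+`)
	funcRe = regexp.MustCompile(`function ([a-zA-Z_][a-zA-Z0-9_]*)`)
	pkgRe  = regexp.MustCompile(`package ([a-zA-Z_][a-zA-Z0-9_]*)`)
)

// extractTags (hypothetical) returns unique tags in the order:
// file paths, function names, package names.
func extractTags(content string) []string {
	seen := map[string]bool{}
	var tags []string
	add := func(t string) {
		if !seen[t] {
			seen[t] = true
			tags = append(tags, t)
		}
	}
	for _, m := range pathRe.FindAllString(content, -1) {
		add(m)
	}
	for _, m := range funcRe.FindAllStringSubmatch(content, -1) {
		add(m[1])
	}
	for _, m := range pkgRe.FindAllStringSubmatch(content, -1) {
		add(m[1])
	}
	return tags
}

func main() {
	tags := extractTags("see /src/app/main.go in package main, function parseArgs")
	fmt.Println(tags) // prints [/src/app/main.go parseArgs main]
}
```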

Memory Tools

Tool	Description
memory	CRUD: add, get, search, list, remove
memorize	Typed storage: fact, preference, convention, pattern
pin_context	Pin text to system prompt for the session
history_search	Regex search through session history

Auto-Inject

When memory.auto_inject is set to true, memories relevant to the current conversation are added to the system prompt automatically.

Storage

Persistent storage in JSON files
Debounced writes (coalescing frequent saves)
Separated by project (path hash)
Entry limit (default: 1000)
Key replacement (same key overwrites)
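
The key-replacement and entry-limit rules above can be illustrated with a small in-memory sketch. The `Store` type and its methods are hypothetical, and dropping the oldest entries once the limit is exceeded is an assumption; the source does not say which entries are evicted.

```go
package main

import "fmt"

// Entry is trimmed to the two fields this sketch needs.
type Entry struct {
	Key     string
	Content string
}

// Store is a hypothetical in-memory stand-in for the JSON-backed store.
type Store struct {
	entries []Entry
	limit   int
}

func NewStore(limit int) *Store { return &Store{limit: limit} }

// Add overwrites an existing entry with the same non-empty key;
// otherwise it appends and enforces the entry limit.
func (s *Store) Add(e Entry) {
	if e.Key != "" {
		for i := range s.entries {
			if s.entries[i].Key == e.Key {
				s.entries[i] = e // same key overwrites
				return
			}
		}
	}
	s.entries = append(s.entries, e)
	if len(s.entries) > s.limit {
		// Assumption: the oldest entries are dropped first.
		s.entries = s.entries[len(s.entries)-s.limit:]
	}
}

func main() {
	s := NewStore(2)
	s.Add(Entry{Key: "style", Content: "tabs"})
	s.Add(Entry{Key: "style", Content: "spaces"}) // replaces "tabs"
	s.Add(Entry{Content: "note 1"})
	s.Add(Entry{Content: "note 2"}) // pushes the store over its limit of 2
	fmt.Println(len(s.entries), s.entries[0].Content) // prints 2 note 1
}
```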

Context Management

Token Counting

Token counting with caching and fallback estimation:

LRU cache: 1000 entries
API counting: via the provider's client API
Fallback: character-based estimation (~4 chars = 1 token), used when API counting fails
Concurrency: a semaphore limits API counting to 3 parallel requests
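
The fallback and the concurrency limit can be sketched together. Everything named here (`estimateTokens`, `countFunc`, `countAll`, `errUnavailable`) is hypothetical; `countFunc` stands in for the provider's counting call, and rounding the estimate up is an assumption. The LRU cache is left out to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"unicode/utf8"
)

var errUnavailable = errors.New("provider unavailable")

// estimateTokens applies the ~4 characters per token rule, rounding up.
func estimateTokens(text string) int {
	return (utf8.RuneCountInString(text) + 3) / 4
}

// countFunc stands in for the provider's token-counting API (hypothetical).
type countFunc func(text string) (int, error)

// countAll counts tokens for every text with at most 3 API calls in
// flight, falling back to estimation when a call fails.
func countAll(texts []string, count countFunc) (counts []int, estimated []bool) {
	counts = make([]int, len(texts))
	estimated = make([]bool, len(texts))
	sem := make(chan struct{}, 3) // buffered channel used as a semaphore
	var wg sync.WaitGroup
	for i, t := range texts {
		wg.Add(1)
		go func(i int, t string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			n, err := count(t)
			if err != nil {
				n, estimated[i] = estimateTokens(t), true
			}
			counts[i] = n
		}(i, t)
	}
	wg.Wait()
	return counts, estimated
}

func main() {
	failing := func(string) (int, error) { return 0, errUnavailable }
	counts, est := countAll([]string{"hello world", "abcd"}, failing)
	fmt.Println(counts, est) // prints [3 1] [true true]
}
```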

Token Limits by Model

Model	Max input tokens	Max output tokens
gemini-3-flash	1M	65K
gemini-3-pro	1M	65K
gemini-2.5-flash	1M	8K
gemini-2.5-pro	1M	8K
gemini-2.0-flash	1M	8K
glm-4.7	128K	131K

Context Warnings

type TokenUsage struct {
    InputTokens  int
    MaxTokens    int
    PercentUsed  float64
    NearLimit    bool    // usage > warning_threshold (default 0.8)
    ExceedsLimit bool    // InputTokens > MaxTokens
    IsEstimate   bool    // true when the API call failed and the count is a fallback estimate
}
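
A minimal sketch of how these fields might be derived from a token count. The constructor `newTokenUsage` is hypothetical, and storing `PercentUsed` as a fraction between 0 and 1 (to match `warning_threshold: 0.8`) is an assumption; it could equally be a 0 to 100 value.

```go
package main

import "fmt"

type TokenUsage struct {
	InputTokens  int
	MaxTokens    int
	PercentUsed  float64
	NearLimit    bool
	ExceedsLimit bool
	IsEstimate   bool
}

// newTokenUsage (hypothetical) fills in the derived fields.
// Assumption: PercentUsed is a fraction in [0, 1+], not a percentage.
func newTokenUsage(input, max int, warningThreshold float64, isEstimate bool) TokenUsage {
	used := 0.0
	if max > 0 {
		used = float64(input) / float64(max)
	}
	return TokenUsage{
		InputTokens:  input,
		MaxTokens:    max,
		PercentUsed:  used,
		NearLimit:    used > warningThreshold,
		ExceedsLimit: input > max,
		IsEstimate:   isEstimate,
	}
}

func main() {
	u := newTokenUsage(850_000, 1_000_000, 0.8, false)
	fmt.Println(u.PercentUsed, u.NearLimit, u.ExceedsLimit) // prints 0.85 true false
}
```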

Auto-Summarization

When context usage reaches warning_threshold (0.8 by default, i.e. 80%), summarization triggers automatically.

Summarization priorities (what is preserved):

File paths and modified functions/methods
Error messages and their solutions
Component dependencies
Configuration changes
Architectural decisions
Unresolved issues and next steps

Exclusions (what is removed):

Verbose tool output and raw logs
Intermediate failed attempts
UI confirmations
Repeated file reads

Hierarchical Summarization

For large conversations (>100 messages), summarization runs in stages:

Split the conversation into chunks
Produce an intermediate summary for each chunk
Merge the intermediate summaries into a final summary
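
The staged approach can be sketched as a chunk, summarize, merge pipeline. The names `summarizer`, `chunk`, `summarizeHierarchically`, and `stubSummarizer` are hypothetical, the chunk size is an arbitrary choice, and the stub merely stands in for a model call.

```go
package main

import "fmt"

// summarizer stands in for the model call that condenses messages (hypothetical).
type summarizer func(messages []string) string

// chunk splits messages into consecutive slices of at most size items.
func chunk(messages []string, size int) [][]string {
	if size <= 0 {
		size = 1
	}
	var chunks [][]string
	for start := 0; start < len(messages); start += size {
		end := start + size
		if end > len(messages) {
			end = len(messages)
		}
		chunks = append(chunks, messages[start:end])
	}
	return chunks
}

// summarizeHierarchically summarizes small conversations directly and
// large ones chunk by chunk, then merges the intermediate summaries.
func summarizeHierarchically(messages []string, threshold, chunkSize int, summarize summarizer) string {
	if len(messages) <= threshold {
		return summarize(messages)
	}
	var partials []string
	for _, c := range chunk(messages, chunkSize) {
		partials = append(partials, summarize(c))
	}
	return summarize(partials) // final merge pass
}

// stubSummarizer only reports how many items it was given.
func stubSummarizer(messages []string) string {
	return fmt.Sprintf("summary(%d items)", len(messages))
}

func main() {
	msgs := make([]string, 250)
	// 250 messages in chunks of 100 give 3 intermediate summaries.
	fmt.Println(summarizeHierarchically(msgs, 100, 100, stubSummarizer)) // prints summary(3 items)
}
```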

Result Compaction

ResultCompactor compresses long tool results:

Errors: higher limit (10K chars), fully preserved
Successful results: trimmed to tool_result_max_chars
Logs: preserve beginning and end (head/tail)
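
The rules above can be sketched roughly as follows, using the error indicators Gokin recognizes. The functions `looksLikeError`, `headTail`, and `compact` are hypothetical. Two simplifications are assumed: matching is case-insensitive, and head/tail truncation is applied to every oversized result rather than to logs only.

```go
package main

import (
	"fmt"
	"strings"
)

// Indicators as listed in this section.
var errorIndicators = []string{
	"error:", "panic:", "fatal:", "stack trace:", "exception:",
	"traceback:", "undefined:", "cannot use", "--- fail",
	"permission denied", "not found", "syntax error",
	"compilation failed", "build failed",
}

// looksLikeError reports whether the result contains any indicator.
// Assumption: matching ignores case.
func looksLikeError(result string) bool {
	lower := strings.ToLower(result)
	for _, ind := range errorIndicators {
		if strings.Contains(lower, ind) {
			return true
		}
	}
	return false
}

// headTail keeps the first and last max/2 characters of an oversized
// string. The marker is added on top of max.
func headTail(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	half := max / 2
	return string(r[:half]) + "\n...[truncated]...\n" + string(r[len(r)-half:])
}

// compact picks the limit by result kind, then truncates.
func compact(result string, maxChars, errorMaxChars int) string {
	limit := maxChars
	if looksLikeError(result) {
		limit = errorMaxChars
	}
	return headTail(result, limit)
}

func main() {
	fmt.Println(looksLikeError("--- FAIL: TestParse (0.00s)")) // prints true
	fmt.Println(compact(strings.Repeat("x", 50), 10, 10000))
}
```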

Recognized error indicators:

"error:", "panic:", "fatal:", "stack trace:", "exception:",
"traceback:", "undefined:", "cannot use", "--- fail",
"permission denied", "not found", "syntax error",
"compilation failed", "build failed"

Context Configuration

context:
  warning_threshold: 0.8           # Warning at 80%
  summarization_ratio: 0.5         # Summarize to 50%
  tool_result_max_chars: 10000     # Max chars per tool result
  enable_auto_summary: true        # Auto-summarization
  auto_compact_threshold: 0.75     # Auto-compact at 75%

Summary Cache

Summaries are cached for reuse:

Hash-based key from content
Prevents re-summarizing identical context
Invalidated on conversation changes
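
A content-hashed cache along these lines might look like the sketch below. `SummaryCache`, `cacheKey`, `GetOrCompute`, and `Invalidate` are hypothetical names, and using SHA-256 for the key is an assumption; the source only says the key is hash-based.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// SummaryCache (hypothetical) maps a content hash to a summary.
type SummaryCache struct {
	entries map[string]string
}

func NewSummaryCache() *SummaryCache {
	return &SummaryCache{entries: map[string]string{}}
}

// cacheKey hashes all messages; a zero byte separates them so that
// ["ab", "c"] and ["a", "bc"] produce different keys.
func cacheKey(messages []string) string {
	h := sha256.New()
	for _, m := range messages {
		h.Write([]byte(m))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// GetOrCompute returns the cached summary for identical content and
// calls summarize only on a miss.
func (c *SummaryCache) GetOrCompute(messages []string, summarize func([]string) string) string {
	key := cacheKey(messages)
	if s, ok := c.entries[key]; ok {
		return s
	}
	s := summarize(messages)
	c.entries[key] = s
	return s
}

// Invalidate drops every cached summary, e.g. when the conversation changes.
func (c *SummaryCache) Invalidate() {
	c.entries = map[string]string{}
}

func main() {
	calls := 0
	summarize := func(m []string) string {
		calls++
		return fmt.Sprintf("summary of %d messages", len(m))
	}
	cache := NewSummaryCache()
	msgs := []string{"hello", "world"}
	cache.GetOrCompute(msgs, summarize)
	cache.GetOrCompute(msgs, summarize) // served from the cache
	fmt.Println(calls)                  // prints 1
}
```

Since the key is derived from the content, a changed conversation misses the cache on its own; the explicit `Invalidate` only frees stale entries.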