This repository has been archived on 2026-03-09. You can view files and clone it. You cannot open issues or pull requests or push a commit.
oubliette/PLAN.md
Torjus Håkestad 0133d956a5 feat: capture SSH exec commands (PLAN.md 4.4)
Bots often send commands via `ssh user@host <command>` (exec request)
rather than requesting an interactive shell. These were previously
rejected silently. Now exec commands are captured, stored on the session
record, and displayed in the web UI session detail page.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:43:11 +01:00

# Oubliette - SSH Honeypot
A fun SSH honeypot that logs login attempts, presents fake shells to "successful" logins, and tries to detect when a real human is poking around.
The name comes from the medieval dungeon concept - a place you throw people into and forget about them.
## Tech Stack
- **Language:** Go
- **SSH:** golang.org/x/crypto/ssh
- **Database:** SQLite
- **Web UI:** Go templates + htmx
- **Deployment:** Single binary with embedded assets
## Core Concepts
### Shell Profiles
Logins that "succeed" are routed to a fake shell. Shells are selected by weighted random from a registry. Each shell implements a common interface, making it easy to add new ones.
```go
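type Shell interface {
	// Name returns the shell's identifier, e.g. for registry lookups.
	Name() string
	// Description returns a short human-readable summary of the shell.
	Description() string
	// Handle runs the fake shell on the given channel until the session ends.
	Handle(ctx context.Context, ch ssh.Channel) error
}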
```
### Smart Storage
To avoid the database growing unbounded on a small VPS:
- **Deduplication:** Store unique (username, password, IP) combinations with a count + first_seen/last_seen timestamps instead of one row per attempt.
- **Retention policy:** Configurable auto-pruning of records older than N days.
- **Aggregation:** Optionally roll up old raw data into daily summary tables before pruning.
### Human Detection
Score sessions based on signals that distinguish humans from bots:
- Keystroke timing (variable delays vs instant paste)
- Typos and backspace usage
- Tab completion and arrow key usage
- Adaptive behavior (commands that respond to previous output)
- Command diversity
- Session duration
Sessions crossing a human-likelihood threshold get flagged for review and can trigger webhook notifications.
### Login Realism
- Don't accept every attempt. Most attempts should fail. Bots commonly try thousands of combinations from a single IP (20k+ is not unusual), so the acceptance threshold should be high and configurable.
- **Credential memory:** When a credential is accepted, store it as a "valid" credential for a configurable TTL (e.g. 24-72 hours). If the same bot returns with the same username/password, it gets in immediately - making the credential appear legitimate and encouraging further interaction.
- Acceptance strategy is configurable: after N failed attempts from an IP, accept the next attempt (whatever the credentials are) and remember that combo.
- Optionally also support a static list of always-accepted credentials for testing.
---
## Phase 1 - Foundation
Goal: A working SSH honeypot that logs attempts, stores them in SQLite, and can present a basic fake shell. Minimal but functional.
### 1.1 Project Setup ✅
- Go module, directory structure, basic configuration (YAML or TOML)
- Configuration for: listen address, SSH host key path/auto-generation, database path, web UI listen address
- Nix flake with devshell and package output
- NixOS module for easy deployment (listen address, config path, state directory, etc.)
### 1.2 SSH Server ✅
- Listen for SSH connections using x/crypto/ssh
- Handle authentication callbacks
- Log all login attempts (username, password, source IP, timestamp)
- Configurable credential list that triggers "successful" login
- Basic login realism: reject first N attempts before accepting
### 1.3 SQLite Storage ✅
- Schema: login_attempts table with deduplication (username, password, ip, count, first_seen, last_seen)
- Schema: sessions table for successful logins (id, ip, username, shell_name, connected_at, disconnected_at, human_score)
- Schema: session_logs table for command logging (session_id, timestamp, input, output)
- Retention policy: background goroutine that prunes old records on a schedule
- **Database migrations:** Version-tracked migrations using embedded SQL files. Store current schema version in a `schema_version` table, apply pending migrations on startup. Keep it simple - no external migration tool, just sequential numbered `.sql` files embedded in the binary.
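One way to pick the pending migrations is sketched below. It assumes file names of the form `NNN_description.sql` (the plan only says "sequential numbered"), and it takes a plain list of names so the example stays self-contained; in practice the names would come from reading the embedded directory. `pendingMigrations` and `migration` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// migration is one numbered SQL file, e.g. "003_add_exec_command.sql".
type migration struct {
	Version int
	Name    string
}

// pendingMigrations returns the migrations newer than current, in ascending
// version order, so they can be applied one after another on startup.
func pendingMigrations(names []string, current int) ([]migration, error) {
	var out []migration
	for _, name := range names {
		prefix, _, ok := strings.Cut(name, "_")
		if !ok {
			return nil, fmt.Errorf("migration %q has no numeric prefix", name)
		}
		v, err := strconv.Atoi(prefix)
		if err != nil {
			return nil, fmt.Errorf("migration %q: %w", name, err)
		}
		if v > current {
			out = append(out, migration{Version: v, Name: name})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out, nil
}
```

After each file is applied, the stored schema version would be updated in the same transaction so a failed migration does not leave the version ahead of the schema.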
### 1.4 Shell Interface & Registry ✅
- Shell interface definition
- Registry with weighted random selection
- Basic bash-like shell:
  - Prompt that looks like `user@hostname:~$`
  - Handful of commands: `ls`, `cd`, `cat`, `pwd`, `whoami`, `uname`, `id`, `exit`
  - Fake filesystem with a few interesting-looking files
- Log all input/output to the session_logs table
#### Session Context
Shells receive a `SessionContext` struct instead of just `ssh.Channel`, providing:
- `SessionID` (storage UUID)
- `Username` (authenticated user, from `ssh.ConnMetadata`)
- `RemoteAddr` (client IP, from `ssh.ConnMetadata`)
- `ClientVersion` (SSH client version string)
- `Store` (for session logging)
This lets shells build realistic prompts (`username@hostname:~$`) and log activity without needing direct access to the SSH connection.
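A possible shape for this struct is sketched below. The field names follow the list above; the `SessionStore` interface and the `Prompt` helper are assumptions added for illustration, since the plan does not spell out the storage API.

```go
package main

import "fmt"

// SessionStore is a stand-in for whatever storage interface shells log through.
type SessionStore interface {
	AppendLog(sessionID, input, output string) error
}

// SessionContext carries everything a shell needs about the current session.
type SessionContext struct {
	SessionID     string // storage UUID
	Username      string // authenticated user
	RemoteAddr    string // client IP
	ClientVersion string // SSH client version string
	Store         SessionStore
}

// Prompt builds a bash-style prompt from the session user and a configured hostname.
func (sc *SessionContext) Prompt(hostname string) string {
	return fmt.Sprintf("%s@%s:~$ ", sc.Username, hostname)
}
```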
#### Shell Configuration
- Define a `ShellConfig` sub-struct in the config with common fields: hostname, banner/MOTD, fake username
- Per-shell overrides via `map[string]map[string]any` (e.g. `[shell.bash]`, `[shell.cisco]`) so each Phase 3 shell can have its own knobs
- Shells receive the relevant config section, not the entire project config — keeps a clean boundary
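The config boundary described above might be expressed like this. The struct layout and the `Section` method are a sketch under the assumption that overrides are stored per shell name; only the `map[string]map[string]any` type comes from the plan itself.

```go
package main

// ShellConfig holds settings shared by all shells plus per-shell overrides.
type ShellConfig struct {
	Hostname string
	Banner   string
	FakeUser string
	// Shells is keyed by shell name, e.g. "bash" or "cisco".
	Shells map[string]map[string]any
}

// Section returns a copy of the overrides for one shell, so a shell can
// neither see other shells' settings nor mutate the shared config.
func (c ShellConfig) Section(name string) map[string]any {
	out := make(map[string]any, len(c.Shells[name]))
	for k, v := range c.Shells[name] {
		out[k] = v
	}
	return out
}
```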
#### Transparent I/O Recording (designed for 2.3 Session Replay)
- Wrap `ssh.Channel` in a `RecordingChannel` before passing it to the shell
- `RecordingChannel` intercepts every `Read` (client input) and `Write` (server output), logging raw byte chunks with precise timestamps to storage
- Shells don't need to know about recording — they just read/write normally
- This ensures consistent, complete capture regardless of shell implementation, and avoids needing to refactor shells when session replay is added in Phase 2.3
- The current `session_logs` schema (input/output text pairs) may need a companion `session_keystrokes` table with `(session_id, timestamp, direction, data)` for byte-level replay fidelity — evaluate when implementing
### 1.5 Minimal Web UI ✅
- Embedded static assets (Go embed)
- Dashboard: total attempts, attempts over time, unique IPs
- Tables: top usernames, top passwords, top source IPs
- List of active/recent sessions
---
## Phase 2 - Detection & Notification
Goal: Detect likely-human sessions and make the system smarter.
### 2.1 Human Detection Scoring ✅
- Keystroke timing analysis
- Track backspace, tab, arrow key usage
- Command diversity scoring
- Compute per-session human score, store in sessions table
- Flag sessions above configurable threshold
### 2.2 Notifications ✅
- Webhook support (generic HTTP POST, works with Slack/Discord/ntfy)
- Configurable triggers: human score threshold crossed, new session started
- Include session details in payload
### 2.3 Session Replay ✅
- Store keystroke-by-keystroke data with timing information
- Web UI: replay a session in a terminal-like viewer, watching commands play back in real time
- Filter/sort sessions by human score
### 2.4 Adaptive Shell Routing
- If early keystrokes suggest a bot, route to basic shell or disconnect
- If keystrokes suggest a human, route to a more interesting shell
---
## Phase 3 - Fun Shells
Goal: Add the entertaining shell implementations.
### 3.1 Bash Shell Variations
- **Infinite sudo:** always asks for password, never works, logs every attempt
- **Slow decay:** shell gets progressively slower, commands take longer and longer
- **Haunted:** commands gradually return stranger output, files appear/disappear, `whoami` returns different users
- **Bread crumbs:** fake .bash_history, id_rsa files, database configs pointing to other honeypots
### 3.2 Cisco IOS Shell ✅
- Realistic `>` and `#` prompts
- Common commands: `show running-config`, `show interfaces`, `enable`, `configure terminal`
- Fake device info that looks like a real router
### 3.3 Smart Fridge Shell ✅
- Samsung FridgeOS boot banner
- Inventory management commands
- Temperature warnings
- "WARNING: milk expires in 2 days"
- Per-credential shell routing via `shell` field in static credentials
### 3.4 Text Adventure ✅
- Zork-style dungeon crawler
- "You are in a dimly lit server room."
- Navigation, items, puzzles
- The dungeon is the oubliette itself
### 3.5 Banking TUI Shell ✅
- 80s-style green-on-black bank terminal
### 3.6 Other Shell Ideas (Future)
- **Nuclear launch terminal:** "ENTER LAUNCH AUTHORIZATION CODE"
- **ELIZA therapist:** every response is a therapy question
- **Pizza ordering terminal:** "Welcome to PizzaNet v2.3"
- **Haiku shell:** every response is a haiku
---
## Phase 4 - Polish
Goal: Make the web UI great and add operational niceties.
### 4.1 Enhanced Web UI
- GeoIP lookups and world map visualization of attack sources
- Charts: attempts over time, hourly patterns, credential trends
- Session detail view with full command log
- Filtering and search
### 4.2 Operational ✅
- Prometheus metrics endpoint ✅
- Structured logging (slog) ✅
- Graceful shutdown ✅
- Docker image (nix dockerTools) ✅
- Systemd unit file / deployment docs ✅
### 4.3 GeoIP ✅
- Embed a lightweight GeoIP database or use an API ✅
- Store country/city with each attempt ✅
- Aggregate stats by country ✅
### 4.4 Capture SSH Exec Commands
Many bots send a command directly via `ssh user@host <command>` (an SSH "exec" request) rather than requesting an interactive shell. Currently these are rejected and the command is lost. We should capture them.
- Handle `"exec"` request type in the server's request loop (alongside `"pty-req"` and `"shell"`)
- Parse the command string from the exec payload
- Add an `exec_command` column (nullable) to the `sessions` table via a new migration
- Store the command on the session record before closing the channel
- Optionally return plausible fake output for common commands (e.g. `uname`, `id`, `cat /etc/passwd`) to encourage further interaction
- Surface exec commands in the web UI (session detail view)
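For the parsing step, the exec request payload defined in RFC 4254 section 6.5 is a single SSH string: a big-endian `uint32` length followed by that many bytes. The sketch below decodes it by hand using only the standard library; `parseExecPayload` is a hypothetical helper name, and `ssh.Unmarshal` from `golang.org/x/crypto/ssh` with a struct containing one string field should be an equivalent option.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// parseExecPayload extracts the command from an SSH "exec" request payload.
// Unlike a strict decoder, it ignores any bytes after the command string.
func parseExecPayload(payload []byte) (string, error) {
	if len(payload) < 4 {
		return "", errors.New("exec payload too short")
	}
	n := binary.BigEndian.Uint32(payload[:4])
	rest := payload[4:]
	if uint64(n) > uint64(len(rest)) {
		return "", errors.New("exec payload truncated")
	}
	return string(rest[:n]), nil
}
```

In the request loop, the result would be stored on the session record before replying to the request and closing the channel.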
#### 4.4.1 Fake Exec Output
Return plausible fake output for common exec commands (e.g. `uname`, `id`, `cat /etc/passwd`) to encourage bots to interact further. Implement after collecting data on what bots commonly try to run.
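A first version could be a simple lookup table keyed by the normalized command, as in the sketch below. The table contents are invented placeholders; the actual entries would be chosen from the collected data on what bots run.

```go
package main

import "strings"

// fakeExecOutput maps normalized commands to canned output (placeholder values).
var fakeExecOutput = map[string]string{
	"uname":  "Linux\n",
	"id":     "uid=0(root) gid=0(root) groups=0(root)\n",
	"whoami": "root\n",
}

// fakeOutput returns canned output for a known command. Whitespace is
// collapsed before the lookup so "  uname " and "uname" match the same entry.
func fakeOutput(command string) (string, bool) {
	key := strings.Join(strings.Fields(command), " ")
	out, ok := fakeExecOutput[key]
	return out, ok
}
```

Unknown commands return `false`, leaving the caller free to close the channel without output.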