This repository has been archived on 2026-03-09. You can view files and clone it. You cannot open issues or pull requests or push a commit.
oubliette/PLAN.md
Torjus Håkestad 40fda3420c feat: add psql shell and username-to-shell routing
Add a PostgreSQL psql interactive terminal shell with backslash
meta-commands, SQL statement handling with multi-line buffering, and
canned responses for common queries. Add username-based shell routing
via [shell.username_routes] config (second priority after credential-
specific shell, before random selection). Bump version to 0.13.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 19:58:34 +01:00


# Oubliette - SSH Honeypot
A fun SSH honeypot that logs login attempts, presents fake shells to "successful" logins, and tries to detect when a real human is poking around.
The name comes from the medieval dungeon concept - a place you throw people into and forget about them.
## Tech Stack
- **Language:** Go
- **SSH:** golang.org/x/crypto/ssh
- **Database:** SQLite
- **Web UI:** Go templates + htmx
- **Deployment:** Single binary with embedded assets
## Core Concepts
### Shell Profiles
Logins that "succeed" are routed to a fake shell. Shells are selected by weighted random from a registry. Each shell implements a common interface, making it easy to add new ones.
```go
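// Shell is the common interface every fake shell implements.
type Shell interface {
	// Name returns the shell's name.
	Name() string
	// Description returns a short description of the shell.
	Description() string
	// Handle runs the shell on the given SSH channel.
	Handle(ctx context.Context, ch ssh.Channel) error
}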
```
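Weighted random selection from the registry could look roughly like the sketch below. The `Registry` and `entry` types, their method names, and the use of plain shell names instead of `Shell` values are assumptions made to keep the example self-contained, not the project's actual code.

```go
package main

import (
	"errors"
	"math/rand"
)

// entry pairs a shell name with its selection weight (hypothetical type).
type entry struct {
	name   string
	weight int
}

// Registry holds the registered shells and the sum of their weights.
type Registry struct {
	entries []entry
	total   int
}

// Register adds a shell name with a positive weight.
func (r *Registry) Register(name string, weight int) error {
	if weight <= 0 {
		return errors.New("weight must be positive")
	}
	r.entries = append(r.entries, entry{name, weight})
	r.total += weight
	return nil
}

// at maps a number in [0, total) to the shell whose weight range contains it.
// It returns "" when n is out of range.
func (r *Registry) at(n int) string {
	for _, e := range r.entries {
		if n < e.weight {
			return e.name
		}
		n -= e.weight
	}
	return ""
}

// Pick returns a shell name chosen with probability weight/total.
func (r *Registry) Pick() (string, error) {
	if r.total == 0 {
		return "", errors.New("registry is empty")
	}
	return r.at(rand.Intn(r.total)), nil
}
```

With weights 3 and 1, the first shell would be picked about three times out of four. Splitting `at` from `Pick` keeps the mapping deterministic and easy to test.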
### Smart Storage
To avoid the database growing unbounded on a small VPS:
- **Deduplication:** Store unique (username, password, IP) combinations with a count + first_seen/last_seen timestamps instead of one row per attempt.
- **Retention policy:** Configurable auto-pruning of records older than N days.
- **Aggregation:** Optionally roll up old raw data into daily summary tables before pruning.
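The deduplication and retention ideas can be illustrated with an in-memory stand-in for the table. This is only a sketch: the type names are hypothetical, timestamps are assumed to be Unix seconds, and a map replaces SQLite so the example runs without a database driver.

```go
package main

// attemptKey identifies one unique credential/source combination.
type attemptKey struct {
	Username, Password, IP string
}

// attemptRow mirrors a deduplicated row: a counter plus first/last seen
// timestamps (Unix seconds, an assumption of this sketch).
type attemptRow struct {
	Count     int
	FirstSeen int64
	LastSeen  int64
}

// AttemptStore is an in-memory stand-in for the login_attempts table.
type AttemptStore struct {
	rows map[attemptKey]*attemptRow
}

// NewAttemptStore returns an empty store.
func NewAttemptStore() *AttemptStore {
	return &AttemptStore{rows: make(map[attemptKey]*attemptRow)}
}

// Record inserts a new row or bumps the counter of an existing one.
func (s *AttemptStore) Record(username, password, ip string, at int64) *attemptRow {
	k := attemptKey{username, password, ip}
	row, ok := s.rows[k]
	if !ok {
		row = &attemptRow{FirstSeen: at}
		s.rows[k] = row
	}
	row.Count++
	row.LastSeen = at
	return row
}

// Prune deletes rows last seen before cutoff and reports how many were removed.
func (s *AttemptStore) Prune(cutoff int64) int {
	removed := 0
	for k, row := range s.rows {
		if row.LastSeen < cutoff {
			delete(s.rows, k)
			removed++
		}
	}
	return removed
}
```

In SQLite the same effect could plausibly come from a UNIQUE constraint on (username, password, ip) combined with `INSERT ... ON CONFLICT ... DO UPDATE`, with pruning done by a `DELETE` on `last_seen`.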
### Human Detection
Score sessions based on signals that distinguish humans from bots:
- Keystroke timing (variable delays vs instant paste)
- Typos and backspace usage
- Tab completion and arrow key usage
- Adaptive behavior (commands that respond to previous output)
- Command diversity
- Session duration
Sessions crossing a human-likelihood threshold get flagged for review and can trigger webhook notifications.
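One way to combine these signals is a weighted sum clamped to the range 0 to 1. The sketch below is illustrative only: the `Signals` struct, the weights, and the saturation points (3 backspaces, 5 distinct commands, 120 seconds) are made-up assumptions, and the adaptive-behavior signal is left out.

```go
package main

import "math"

// Signals summarises one session. Field names are hypothetical.
type Signals struct {
	KeystrokeGapsMs []float64 // delays between consecutive keystrokes
	Backspaces      int
	TabsAndArrows   int
	Commands        []string
	DurationSec     float64
}

// clamp01 limits x to the range [0, 1].
func clamp01(x float64) float64 {
	return math.Max(0, math.Min(1, x))
}

// timingVariation returns the coefficient of variation of the keystroke gaps.
// Pasted or scripted input tends to produce uniform gaps, so a low value.
func timingVariation(gaps []float64) float64 {
	if len(gaps) < 2 {
		return 0
	}
	var sum float64
	for _, g := range gaps {
		sum += g
	}
	mean := sum / float64(len(gaps))
	if mean == 0 {
		return 0
	}
	var sq float64
	for _, g := range gaps {
		sq += (g - mean) * (g - mean)
	}
	return math.Sqrt(sq/float64(len(gaps))) / mean
}

// distinct counts unique command strings.
func distinct(cmds []string) int {
	seen := make(map[string]struct{}, len(cmds))
	for _, c := range cmds {
		seen[c] = struct{}{}
	}
	return len(seen)
}

// HumanScore combines the signals into a value in [0, 1].
// The weights are illustrative and sum to 1.
func HumanScore(s Signals) float64 {
	score := 0.35 * clamp01(timingVariation(s.KeystrokeGapsMs))
	score += 0.20 * clamp01(float64(s.Backspaces)/3)
	score += 0.15 * clamp01(float64(s.TabsAndArrows)/3)
	score += 0.15 * clamp01(float64(distinct(s.Commands))/5)
	score += 0.15 * clamp01(s.DurationSec/120)
	return score
}
```

A session would then be flagged when `HumanScore` exceeds the configured threshold.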
### Login Realism
- Don't accept every attempt. Most attempts should fail. Bots commonly try thousands of combinations from a single IP (20k+ is not unusual), so the acceptance threshold should be high and configurable.
- **Credential memory:** When a credential is accepted, store it as a "valid" credential for a configurable TTL (e.g. 24-72 hours). If the same bot returns with the same username/password, it gets in immediately - making the credential appear legitimate and encouraging further interaction.
- Acceptance strategy is configurable: after N failed attempts from an IP, accept the next attempt (whatever the credentials are) and remember that combo.
- Optionally also support a static list of always-accepted credentials for testing.
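The acceptance strategy and credential memory could be sketched as follows. The `Authenticator` type is hypothetical, time is passed in as Unix seconds, and remembered credentials are keyed by username/password only (not by IP), which is one possible reading of the rules above.

```go
package main

// credential is a username/password pair.
type credential struct {
	Username, Password string
}

// Authenticator decides whether a login attempt should "succeed".
type Authenticator struct {
	Threshold  int   // failed attempts from an IP before the next one is accepted
	TTLSeconds int64 // how long an accepted credential stays valid

	failures   map[string]int       // failed attempts per IP
	remembered map[credential]int64 // accepted credentials and their expiry
}

// NewAuthenticator returns an Authenticator with empty state.
func NewAuthenticator(threshold int, ttlSeconds int64) *Authenticator {
	return &Authenticator{
		Threshold:  threshold,
		TTLSeconds: ttlSeconds,
		failures:   make(map[string]int),
		remembered: make(map[credential]int64),
	}
}

// Attempt reports whether the login is accepted. now is Unix seconds.
func (a *Authenticator) Attempt(ip, username, password string, now int64) bool {
	c := credential{username, password}

	// A remembered credential gets in immediately until its TTL expires.
	if expiry, ok := a.remembered[c]; ok {
		if now < expiry {
			return true
		}
		delete(a.remembered, c)
	}

	// After enough failures from this IP, accept whatever comes next
	// and remember that combination.
	if a.failures[ip] >= a.Threshold {
		a.failures[ip] = 0
		a.remembered[c] = now + a.TTLSeconds
		return true
	}

	a.failures[ip]++
	return false
}
```

A static always-accepted list for testing could be modelled by pre-filling `remembered` with a far-future expiry.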
---
## Phase 1 - Foundation
Goal: A working SSH honeypot that logs attempts, stores them in SQLite, and can present a basic fake shell. Minimal but functional.
### 1.1 Project Setup ✅
- Go module, directory structure, basic configuration (YAML or TOML)
- Configuration for: listen address, SSH host key path/auto-generation, database path, web UI listen address
- Nix flake with devshell and package output
- NixOS module for easy deployment (listen address, config path, state directory, etc.)
### 1.2 SSH Server ✅
- Listen for SSH connections using x/crypto/ssh
- Handle authentication callbacks
- Log all login attempts (username, password, source IP, timestamp)
- Configurable credential list that triggers "successful" login
- Basic login realism: reject first N attempts before accepting
### 1.3 SQLite Storage ✅
- Schema: login_attempts table with deduplication (username, password, ip, count, first_seen, last_seen)
- Schema: sessions table for successful logins (id, ip, username, shell_name, connected_at, disconnected_at, human_score)
- Schema: session_logs table for command logging (session_id, timestamp, input, output)
- Retention policy: background goroutine that prunes old records on a schedule
- **Database migrations:** Version-tracked migrations using embedded SQL files. Store current schema version in a `schema_version` table, apply pending migrations on startup. Keep it simple - no external migration tool, just sequential numbered `.sql` files embedded in the binary.
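Working out which migrations are pending can be sketched without touching the database. The example assumes file names of the form `NNN_description.sql` (the plan only says "sequential numbered"), and the function and type names are hypothetical. In the real binary the names would presumably come from an `embed.FS` directory listing.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// migration is one numbered SQL file, e.g. "002_add_sessions.sql".
type migration struct {
	Version int
	Name    string
}

// pendingMigrations parses the numeric prefix of each ".sql" file name and
// returns the migrations with a version above current, ordered by version.
// Files without a ".sql" suffix are ignored.
func pendingMigrations(names []string, current int) ([]migration, error) {
	var out []migration
	for _, name := range names {
		if !strings.HasSuffix(name, ".sql") {
			continue
		}
		prefix, _, _ := strings.Cut(name, "_")
		prefix = strings.TrimSuffix(prefix, ".sql")
		v, err := strconv.Atoi(prefix)
		if err != nil {
			return nil, fmt.Errorf("migration %q: bad version prefix: %w", name, err)
		}
		if v > current {
			out = append(out, migration{Version: v, Name: name})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out, nil
}
```

On startup, each pending file would be executed in order inside a transaction, followed by an update of the `schema_version` table.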
### 1.4 Shell Interface & Registry ✅
- Shell interface definition
- Registry with weighted random selection
- Basic bash-like shell:
  - Prompt that looks like `user@hostname:~$`
  - Handful of commands: `ls`, `cd`, `cat`, `pwd`, `whoami`, `uname`, `id`, `exit`
  - Fake filesystem with a few interesting-looking files
- Log all input/output to the session_logs table
#### Session Context
Shells receive a `SessionContext` struct instead of just `ssh.Channel`, providing:
- `SessionID` (storage UUID)
- `Username` (authenticated user, from `ssh.ConnMetadata`)
- `RemoteAddr` (client IP, from `ssh.ConnMetadata`)
- `ClientVersion` (SSH client version string)
- `Store` (for session logging)
This lets shells build realistic prompts (`username@hostname:~$`) and log activity without needing direct access to the SSH connection.
#### Shell Configuration
- Define a `ShellConfig` sub-struct in the config with common fields: hostname, banner/MOTD, fake username
- Per-shell overrides via `map[string]map[string]any` (e.g. `[shell.bash]`, `[shell.cisco]`) so each Phase 3 shell can have its own knobs
- Shells receive the relevant config section, not the entire project config — keeps a clean boundary
#### Transparent I/O Recording (designed for 2.3 Session Replay)
- Wrap `ssh.Channel` in a `RecordingChannel` before passing it to the shell
- `RecordingChannel` intercepts every `Read` (client input) and `Write` (server output), logging raw byte chunks with precise timestamps to storage
- Shells don't need to know about recording — they just read/write normally
- This ensures consistent, complete capture regardless of shell implementation, and avoids needing to refactor shells when session replay is added in Phase 2.3
- The current `session_logs` schema (input/output text pairs) may need a companion `session_keystrokes` table with `(session_id, timestamp, direction, data)` for byte-level replay fidelity — evaluate when implementing
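A minimal sketch of the recording wrapper follows. To stay free of external dependencies it wraps an `io.ReadWriter` (which `ssh.Channel` satisfies) and keeps chunks in memory rather than writing to storage. The `Chunk` and `Direction` types and the `memConn` stand-in are assumptions for illustration.

```go
package main

import (
	"io"
	"sync"
	"time"
)

// Direction marks which side produced a chunk.
type Direction int

const (
	Input  Direction = iota // bytes read from the client
	Output                  // bytes written to the client
)

// Chunk is one recorded read or write with its timestamp.
type Chunk struct {
	At   time.Time
	Dir  Direction
	Data []byte
}

// RecordingChannel wraps a stream and records every Read and Write.
type RecordingChannel struct {
	inner  io.ReadWriter
	mu     sync.Mutex
	chunks []Chunk
}

// NewRecordingChannel wraps inner.
func NewRecordingChannel(inner io.ReadWriter) *RecordingChannel {
	return &RecordingChannel{inner: inner}
}

// record stores a copy of p so later buffer reuse cannot alter the log.
func (r *RecordingChannel) record(dir Direction, p []byte) {
	if len(p) == 0 {
		return
	}
	c := Chunk{At: time.Now(), Dir: dir, Data: append([]byte(nil), p...)}
	r.mu.Lock()
	r.chunks = append(r.chunks, c)
	r.mu.Unlock()
}

// Read passes through to the wrapped stream and records the client input.
func (r *RecordingChannel) Read(p []byte) (int, error) {
	n, err := r.inner.Read(p)
	r.record(Input, p[:n])
	return n, err
}

// Write passes through to the wrapped stream and records the server output.
func (r *RecordingChannel) Write(p []byte) (int, error) {
	n, err := r.inner.Write(p)
	r.record(Output, p[:n])
	return n, err
}

// Chunks returns a snapshot of everything recorded so far.
func (r *RecordingChannel) Chunks() []Chunk {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]Chunk(nil), r.chunks...)
}

// memConn is a tiny in-memory stand-in for an SSH channel, useful for
// exercising the wrapper without a real connection.
type memConn struct {
	in, out []byte
}

func (m *memConn) Read(p []byte) (int, error) {
	if len(m.in) == 0 {
		return 0, io.EOF
	}
	n := copy(p, m.in)
	m.in = m.in[n:]
	return n, nil
}

func (m *memConn) Write(p []byte) (int, error) {
	m.out = append(m.out, p...)
	return len(p), nil
}
```

A real implementation would likely embed `ssh.Channel` so that `Close`, `Stderr`, and the other channel methods keep working, and would flush chunks to storage instead of holding them in a slice.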
### 1.5 Minimal Web UI ✅
- Embedded static assets (Go embed)
- Dashboard: total attempts, attempts over time, unique IPs
- Tables: top usernames, top passwords, top source IPs
- List of active/recent sessions
---
## Phase 2 - Detection & Notification
Goal: Detect likely-human sessions and make the system smarter.
### 2.1 Human Detection Scoring ✅
- Keystroke timing analysis
- Track backspace, tab, arrow key usage
- Command diversity scoring
- Compute per-session human score, store in sessions table
- Flag sessions above configurable threshold
### 2.2 Notifications ✅
- Webhook support (generic HTTP POST, works with Slack/Discord/ntfy)
- Configurable triggers: human score threshold crossed, new session started
- Include session details in payload
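A generic webhook sender might look like the sketch below. The `SessionEvent` fields and JSON key names are assumptions for illustration. Services such as Slack or Discord expect their own payload shapes, so a real implementation would probably need per-target formatting.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// SessionEvent is the notification payload (hypothetical field set).
type SessionEvent struct {
	Event      string  `json:"event"`
	SessionID  string  `json:"session_id"`
	IP         string  `json:"ip"`
	Username   string  `json:"username"`
	ShellName  string  `json:"shell_name"`
	HumanScore float64 `json:"human_score"`
}

// encodeEvent serialises the event as JSON.
func encodeEvent(e SessionEvent) ([]byte, error) {
	return json.Marshal(e)
}

// postEvent sends the event to url as an HTTP POST with a JSON body and
// treats any non-2xx response as an error.
func postEvent(ctx context.Context, client *http.Client, url string, e SessionEvent) error {
	body, err := encodeEvent(e)
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return fmt.Errorf("webhook returned %s", resp.Status)
	}
	return nil
}
```

Passing a context lets the caller bound how long a slow webhook endpoint can hold up the notifier.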
### 2.3 Session Replay ✅
- Store keystroke-by-keystroke data with timing information
- Web UI: replay a session in a terminal-like viewer, watching commands play back in real time
- Filter/sort sessions by human score
### 2.4 Adaptive Shell Routing
- If early keystrokes suggest a bot, route to basic shell or disconnect
- If keystrokes suggest a human, route to a more interesting shell
---
## Phase 3 - Fun Shells
Goal: Add the entertaining shell implementations.
### 3.1 Bash Shell Variations
- **Infinite sudo:** always asks for password, never works, logs every attempt
- **Slow decay:** shell gets progressively slower, commands take longer and longer
- **Haunted:** commands gradually return stranger output, files appear/disappear, `whoami` returns different users
- **Bread crumbs:** fake .bash_history, id_rsa files, database configs pointing to other honeypots
### 3.2 Cisco IOS Shell ✅
- Realistic `>` and `#` prompts
- Common commands: `show running-config`, `show interfaces`, `enable`, `configure terminal`
- Fake device info that looks like a real router
### 3.3 Smart Fridge Shell ✅
- Samsung FridgeOS boot banner
- Inventory management commands
- Temperature warnings
- "WARNING: milk expires in 2 days"
- Per-credential shell routing via `shell` field in static credentials
### 3.4 Text Adventure ✅
- Zork-style dungeon crawler
- "You are in a dimly lit server room."
- Navigation, items, puzzles
- The dungeon is the oubliette itself
### 3.5 Banking TUI Shell ✅
- 80s-style green-on-black bank terminal
### 3.6 PostgreSQL psql Shell ✅
- Simulates the psql interactive terminal; `db_name` and `pg_version` are configurable
- Backslash meta-commands: `\q`, `\dt`, `\d <table>`, `\l`, `\du`, `\conninfo`, `\?`, `\h`
- SQL statement handling with multi-line buffering (semicolon-terminated)
- Canned responses for common queries (SELECT version(), current_database(), etc.)
- DDL/DML acknowledgments (CREATE TABLE, INSERT, UPDATE, DELETE, etc.)
- Username-to-shell routing: configurable `[shell.username_routes]` maps usernames to shells
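The routing priority for this feature (credential-specific shell first, then username route, then random selection) can be expressed in a few lines. `RouteShell` and its parameters are hypothetical names; the random fallback is passed in as a function so the sketch stays independent of the registry.

```go
package main

// RouteShell picks the shell name for a session.
// Priority: a shell set on the matched credential, then a username route,
// then whatever pickRandom returns.
func RouteShell(credShell, username string, usernameRoutes map[string]string, pickRandom func() string) string {
	if credShell != "" {
		return credShell
	}
	if shell, ok := usernameRoutes[username]; ok {
		return shell
	}
	return pickRandom()
}
```

With a route such as `postgres` mapped to `psql`, a login as `postgres` would land in the psql shell unless the accepted credential names a different shell.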
### 3.7 Other Shell Ideas (Future)
- **Nuclear launch terminal:** "ENTER LAUNCH AUTHORIZATION CODE"
- **ELIZA therapist:** every response is a therapy question
- **Pizza ordering terminal:** "Welcome to PizzaNet v2.3"
- **Haiku shell:** every response is a haiku
---
## Phase 4 - Polish
Goal: Make the web UI great and add operational niceties.
### 4.1 Enhanced Web UI
- GeoIP lookups and world map visualization of attack sources
- Charts: attempts over time, hourly patterns, credential trends
- Session detail view with full command log
- Filtering and search
### 4.2 Operational ✅
- Prometheus metrics endpoint ✅
- Structured logging (slog) ✅
- Graceful shutdown ✅
- Docker image (nix dockerTools) ✅
- Systemd unit file / deployment docs ✅
### 4.3 GeoIP ✅
- Embed a lightweight GeoIP database or use an API ✅
- Store country/city with each attempt ✅
- Aggregate stats by country ✅
### 4.4 Capture SSH Exec Commands
Many bots send a command directly via `ssh user@host <command>` (an SSH "exec" request) rather than requesting an interactive shell. Currently these are rejected and the command is lost. We should capture them.
- Handle `"exec"` request type in the server's request loop (alongside `"pty-req"` and `"shell"`)
- Parse the command string from the exec payload
- Add an `exec_command` column (nullable) to the `sessions` table via a new migration
- Store the command on the session record before closing the channel
- Optionally return plausible fake output for common commands (e.g. `uname`, `id`, `cat /etc/passwd`) to encourage further interaction
- Surface exec commands in the web UI (session detail view)
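Parsing the exec payload is straightforward: per RFC 4254 section 6.5 the request-specific data is a single SSH string, meaning a big-endian uint32 length followed by that many bytes. The helper below is a standalone sketch with a hypothetical name. With `golang.org/x/crypto/ssh`, calling `ssh.Unmarshal` on `req.Payload` with a struct holding one string field should achieve the same.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// parseExecPayload extracts the command from an SSH "exec" request payload.
// The payload is a uint32 big-endian length followed by the command bytes.
func parseExecPayload(payload []byte) (string, error) {
	if len(payload) < 4 {
		return "", errors.New("exec payload too short")
	}
	n := binary.BigEndian.Uint32(payload[:4])
	rest := payload[4:]
	if uint64(n) != uint64(len(rest)) {
		return "", errors.New("exec payload length mismatch")
	}
	return string(rest), nil
}
```

The returned string is what would be stored in the `exec_command` column before the channel is closed.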
#### 4.4.1 Fake Exec Output
Return plausible fake output for exec commands to encourage bots to interact further.
**Approach: regex-based output assembly.** Bots typically send a single long command that chains recon commands and then echoes a summary (e.g. `echo "UNAME:$uname"`). Rather than interpreting arbitrary shell pipelines, we scan the command string for known patterns and assemble fake output.
Implementation:
- A map of common command/variable patterns to fake output strings, e.g.:
  - `uname -a` / `uname -s -v -n -m` → `"Linux ubuntu-server 5.15.0-91-generic #101-Ubuntu SMP Tue Jan 2 15:13:10 UTC 2024 x86_64"`
  - `uname -m` / `arch` → `"x86_64"`
  - `cat /proc/uptime` → `"86432.71 172801.55"`
  - `nproc` / `grep -c "^processor" /proc/cpuinfo` → `"2"`
  - `cat /proc/cpuinfo` → fake cpuinfo block
  - `lspci` → empty (no GPU, to discourage cryptominer targeting)
  - `id` → `"uid=0(root) gid=0(root) groups=0(root)"`
  - `cat /etc/passwd` → minimal fake passwd file
  - `last` → fake login entries
  - `cat --help`, `ls --help` → canned GNU coreutils help text
- Scan the exec command for `echo "KEY:$var"` patterns; for each key, look up the corresponding fake value from the variable assignment earlier in the command
- If we recognise echo patterns, assemble and return the expected output
- If we don't recognise the command at all, return empty output with exit 0 (current behaviour)
- Values should draw from the existing shell config where possible (hostname, fake_user) for consistency
- New package `internal/execfake` or a file in `internal/server/` — keep it simple
Gather more real-world bot examples before implementing to ensure good coverage of common recon patterns.
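As a starting point, the regex-based assembly might be sketched like this. It assumes bots assign variables with the `var=$(command)` form and summarise with `echo "KEY:$var"`. Other forms (backticks, pipelines) are not handled, and the function and variable names are hypothetical. The canned values are taken from the list above.

```go
package main

import (
	"regexp"
	"strings"
)

// fakeOutputs maps known recon commands to canned output.
var fakeOutputs = map[string]string{
	"uname -m":         "x86_64",
	"arch":             "x86_64",
	"nproc":            "2",
	"cat /proc/uptime": "86432.71 172801.55",
	"id":               "uid=0(root) gid=0(root) groups=0(root)",
}

var (
	// assignRe matches assignments of the form var=$(command).
	assignRe = regexp.MustCompile(`(\w+)=\$\(([^)]*)\)`)
	// echoRe matches summaries of the form echo "KEY:$var".
	echoRe = regexp.MustCompile(`echo\s+"(\w+):\$(\w+)"`)
)

// FakeExecOutput assembles the output a bot expects from its echo summaries.
// Variables assigned from unknown commands expand to an empty string, and
// the result is "" when no echo pattern is recognised.
func FakeExecOutput(command string) string {
	vars := make(map[string]string)
	for _, m := range assignRe.FindAllStringSubmatch(command, -1) {
		vars[m[1]] = fakeOutputs[strings.TrimSpace(m[2])]
	}
	var b strings.Builder
	for _, m := range echoRe.FindAllStringSubmatch(command, -1) {
		b.WriteString(m[1] + ":" + vars[m[2]] + "\n")
	}
	return b.String()
}
```

Real bot samples will show which assignment and echo forms actually occur, and the patterns can be widened accordingly.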