Native Nix Forgejo Runner on nix-cache02
Goal
Add a second Forgejo Actions runner instance on nix-cache02 that executes jobs directly on the host (no containers). This allows CI builds to populate the nix binary cache automatically, reducing reliance on manually triggered builds before deployments.
Motivation
- Nix store caching: The container-based nix label runs in ephemeral Podman containers, losing all nix store paths between jobs. Native execution uses the host's persistent store, so builds reuse cached paths automatically.
- Binary cache integration: nix-cache02 is the binary cache server (Harmonia). Paths built by CI are immediately available to all hosts.
- Faster deploy cycle: Currently, updating a flake input (e.g. nixos-exporter) requires pushing to master, then waiting for the scheduled builder or manually triggering a build. With a native runner, repos can have CI workflows that run nix build, and those derivations are in the cache by the time hosts auto-upgrade.
- NixOS config builds: Enables future workflows that build nixosConfigurations.* from this repo, populating the cache as a side effect of CI.
Design
Two Runner Instances
- actions1 (existing): Container-based, global runner available to all Forgejo repos. Unchanged.
- actions-native (new): Host-based, registered as a user-level runner under the torjus Forgejo account, so only repos owned by that user can target it.
Trusted Repos
Repos that should be allowed to use the native runner:
- torjus/nixos-servers
- torjus/nixos-exporter
- torjus/nixos (gunter/magicman configs)
- Other repos with nix builds that benefit from cache population (add as needed)
The restriction is set in the Forgejo web UI when registering the runner: create the registration token at user scope (or for specific repos) rather than at instance scope.
Label Configuration
labels = [ "native-nix:host" ];
Workflow files in trusted repos target this with runs-on: native-nix.
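As a small annotated sketch of the label format (the option path matches the instance defined later in this document; the container-style label in the comment is purely hypothetical):

```nix
{ ... }:
{
  services.gitea-actions-runner.instances.actions-native.labels = [
    # Format is "<label>:<scheme>". The "host" scheme runs job steps
    # directly on the runner host instead of inside a container.
    "native-nix:host"
    # For comparison, a container-backed label would look like
    # "some-label:docker://some/image:tag" (hypothetical, not part of this plan).
  ];
}
```

Only the part before the colon is what workflows reference in runs-on.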
Host Packages
The runner needs nix and basic tools available on the host:
hostPackages = with pkgs; [
  bash
  coreutils
  curl
  gawk
  git
  gnused
  nodejs
  wget
  nix
];
Security Analysis
What the runner CAN access
- Nix store — Can read and write derivations. This is the whole point; harmonia serves the store to all hosts.
- Network — Full network access during job execution.
- World-readable files — Standard for any process on the system.
What the runner CANNOT access
- Cache signing key: /run/secrets/cache-secret is mode 0400 and root-owned. Harmonia signs store paths when serving them, not when they are written to the store.
- Vault AppRole credentials: /var/lib/vault/approle/ is root-owned.
- Other vault secrets: All in /run/secrets/ with restrictive permissions.
Mitigations
- User-level runner: Registered to the torjus user on Forgejo (not global), so only repos owned by that user can submit jobs.
- DynamicUser: The runner uses systemd DynamicUser, so there is no persistent user account. Each invocation gets an ephemeral UID.
- Nix sandbox: Nix builds already run sandboxed by default. Non-nix run: steps execute as the runner's system user but have no special privileges.
- Separate instance: Container-based jobs (untrusted repos) remain on actions1 and never get host access.
Accepted Risks
- A compromised trusted repo could inject bad derivations into the nix store/cache. This is an accepted risk since those repos already have deploy access to production hosts.
- Jobs can consume host resources (CPU, memory, disk). The runner.capacity setting limits concurrent jobs.
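If runner.capacity alone proves too coarse, systemd resource controls on the runner unit are one possible complement. The following is a minimal sketch, assuming the unit is named gitea-runner-actions-native (the name used for the vault secret's services list in this document); the limit values are placeholders, not measured recommendations.

```nix
{ ... }:
{
  systemd.services.gitea-runner-actions-native.serviceConfig = {
    # Hypothetical limits; size them to the host.
    MemoryMax = "16G"; # hard memory cap for the service's cgroup
    CPUQuota = "400%"; # at most four cores' worth of CPU time
  };
}
```

On a multi-user Nix install, builds are carried out by the nix daemon rather than inside the runner's cgroup, so limits like these would mainly bound the runner process and non-nix run: steps.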
Implementation
1. Register runner on Forgejo and store token in Vault
- In Forgejo web UI: go to user settings > Actions > Runners, create a new runner registration token.
- Store the token in Vault via Terraform.
terraform/vault/variables.tf — add variable:
variable "forgejo_native_runner_token" {
  description = "Forgejo Actions runner token for native nix runner on nix-cache02"
  type        = string
  default     = "PLACEHOLDER"
  sensitive   = true
}
terraform/vault/secrets.tf — add secret:
"hosts/nix-cache02/forgejo-native-runner-token" = {
  auto_generate = false
  data          = { token = var.forgejo_native_runner_token }
}
2. Add NixOS configuration for native runner instance
Note: nix-cache02 already has an AppRole with access to secret/data/hosts/nix-cache02/* (defined in terraform/vault/hosts-generated.tf), so no approle changes are needed.
File: hosts/nix-cache02/actions-runner.nix
Add vault secret and runner instance alongside the existing overrides:
# Fetch native runner token from Vault
vault.secrets.forgejo-native-runner-token = {
  secretPath = "hosts/nix-cache02/forgejo-native-runner-token";
  extractKey = "token";
  mode = "0444";
  services = [ "gitea-runner-actions-native" ];
};

# Native nix runner instance
services.gitea-actions-runner.instances.actions-native = {
  enable = true;
  name = "${config.networking.hostName}-native";
  url = "https://code.t-juice.club";
  tokenFile = "/run/secrets/forgejo-native-runner-token";
  labels = [ "native-nix:host" ];
  hostPackages = with pkgs; [
    bash coreutils curl gawk git gnused nodejs wget nix
  ];
  settings = {
    runner.capacity = 4;
    cache = {
      enabled = true;
      dir = "/var/lib/gitea-runner/actions-native/cache";
    };
  };
};
3. Build and deploy
- Create feature branch
- Add the Terraform changes (variables + secrets)
- Set the actual token value in terraform.tfvars
- Run tofu apply in terraform/vault/
- Build the NixOS configuration:
  nix build .#nixosConfigurations.nix-cache02.config.system.build.toplevel
- Deploy to nix-cache02
- Verify the native runner appears as online in Forgejo UI
4. Test with a workflow
In a trusted repo (e.g. nixos-exporter):
name: Build
on: [push]
jobs:
  build:
    runs-on: native-nix
    steps:
      - uses: actions/checkout@v4
      - run: nix build
Future Work
- NixOS config CI: Workflow that builds all nixosConfigurations on push to master, populating the binary cache.
- Nix store GC policy: CI builds will accumulate store paths. Since this host is the binary cache, GC needs to be conservative: only delete paths not referenced by current system configurations. Defer to a follow-up.
- Resource limits: Consider systemd MemoryMax/CPUQuota on the native runner if resource contention becomes an issue.
- Additional host packages: Evaluate whether tools like cachix or nix-prefetch-* should be added.
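For the GC item, one possible starting point (a sketch only, not a decided policy) is to let Nix collect garbage solely under disk pressure via the min-free and max-free settings, instead of on a fixed schedule. The thresholds below are hypothetical and would need to be sized to the host's disk.

```nix
{ ... }:
{
  nix.settings = {
    # Hypothetical thresholds, in bytes.
    min-free = 50 * 1024 * 1024 * 1024; # begin collecting when free space drops below 50 GiB
    max-free = 200 * 1024 * 1024 * 1024; # stop once 200 GiB are free (or no garbage remains)
  };
}
```

Any collection still removes paths that are not reachable from a GC root, so CI outputs that must stay in the cache would need roots of their own; that part remains for the follow-up.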
Open Questions
- Should hostPackages include additional tools beyond the basics listed above?
- Do we want a separate capacity for the native runner vs container runner, or is 4 fine for both?