nixos/agents.md
Julian Sutter 7c014f6534 Add development standards and procedures for app deployment
- Add curl timeout requirement (5 seconds)
- Add comprehensive 8-step workflow for new application deployment
- Add troubleshooting procedures for domain availability and SSL certs
- Add infrastructure roadmap for Borg backup server and sops-nix
- Update README with session start protocol and infrastructure tasks
2026-02-16 22:05:18 -08:00

# NixOS Repository Quick Reference
## Quick Commands
- **Test build**: `nixos-rebuild build --flake .#<system>`
- **List systems**: `nix flake show`
- **Commit**: `git add <files> && git commit -m "<message>"`
## Systems
- **warp**: Server + nginx + forgejo
- **skip**: Server + nginx only
- **framework/aurora/labrizor**: Desktop systems
## Key Files
- `flake.nix`: System definitions
- `systems/<name>.nix`: Hardware/boot configs
- `servers/<name>.nix`: Service configs
- `users/<name>.nix`: User configs
## Testing Workflow
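1. Always run `git status` first. In a Git-backed flake, Nix evaluates only the files Git knows about, so untracked files are invisible to the build.
2. Stage changes (`git add`) before building. A new file that is not staged is not copied to the Nix store, and the build fails as if the file did not exist.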
3. Test with `nixos-rebuild build --flake .#<system>`
4. Check success message: `"Done. The new configuration is /nix/store/..."`
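
The workflow above could be scripted roughly as follows. This is a minimal sketch, not something that exists in the repository: `untracked_paths` and `preflight_build` are hypothetical helper names, and it assumes the commands run from the repository root.

```bash
# Print the untracked paths found in `git status --porcelain` output (read from stdin).
# Untracked entries are the lines that start with "?? ".
untracked_paths() {
  sed -n 's/^?? //p'
}

# Hypothetical helper: refuse to build while untracked files exist,
# then run the same test build as step 3.
# Usage: preflight_build <system>
preflight_build() {
  local system="$1" untracked
  untracked="$(git status --porcelain | untracked_paths)"
  if [ -n "$untracked" ]; then
    printf 'Untracked files (run git add first):\n%s\n' "$untracked" >&2
    return 1
  fi
  nixos-rebuild build --flake ".#${system}"
}
```

For example, `preflight_build warp` would stop early if a new file under `servers/` had not been staged yet.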
## Important
- Server configs may contain hardcoded credentials
- Always carefully inspect the NixOS wiki for instructions before adding new applications to the repo
- Do not editorialize or pass judgement. Be a robot.
- Repository root: `/home/jsutter/src/nixos`
## Development Standards
### curl Usage
When using curl, always set a 5-second timeout:
```bash
curl --max-time 5 <url>
# or
curl -m 5 <url>
```
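
One way to avoid forgetting the flag is a small wrapper. This is only a sketch: `curl5` is a hypothetical function name and `CURL_BIN` is an illustrative override for dry runs, not a curl feature.

```bash
# Hypothetical wrapper: every call gets the 5-second limit.
# CURL_BIN is an illustrative override (defaults to the real curl).
curl5() {
  "${CURL_BIN:-curl}" --max-time 5 "$@"
}
```

Usage would look like `curl5 -fsS -o /dev/null -w '%{http_code}\n' https://<domain>/`, which prints only the HTTP status code.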
## Procedures
### Adding a New Application to the Repository
1. **Gather Requirements**
   - Ask the user what server to deploy to
   - Ask the user what domain name the app will be available on
2. **Research and Planning**
   - Build a brief plan to construct the app
   - Review the NixOS manual (https://nixos.org/nixos/manual/) and the NixOS wiki to see whether a package or module is available
   - Check for existing NixOS modules or services that can be used
   - Identify dependencies and configuration requirements
3. **Implementation**
   - Follow the plan constructed in step 2
   - Add the necessary configuration to the appropriate server file in `servers/`
   - Include nginx reverse proxy configuration if the app needs to be accessible via HTTP/HTTPS
   - Add any required firewall rules, services, or users
4. **Local Testing**
   - Test the build locally: `nixos-rebuild build --flake .#<system>`
   - Refine the configuration until the build succeeds
   - Review the generated configuration for correctness
5. **Remote Deployment**
   - Push the changes to the central repository: `git push origin master`
   - SSH to the target server
   - Pull the changes: `cd ~/src/nixos && git pull origin master`
   - Build and switch to the new config: `sudo nixos-rebuild switch --flake .#` (with nothing after `#`, `nixos-rebuild` selects the configuration named after the machine's hostname)
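6. **Verification**
   - Ensure the service is available on the chosen domain
   - Ensure the certificate is issued by Let's Encrypt (check with: `openssl s_client -connect <domain>:443 -servername <domain> </dev/null 2>/dev/null | openssl x509 -noout -issuer`; the `</dev/null` lets `s_client` exit after the handshake instead of waiting on stdin)
   - Test basic functionality of the application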
7. **Troubleshooting**
   - If the app isn't available on the chosen domain:
     - Check service status: `systemctl status <service>`
     - Check nginx logs: `journalctl -u nginx -f`
     - Check application logs: `journalctl -u <service> -f`
     - Verify DNS resolution
     - Check firewall rules
     - Verify nginx configuration syntax: `nginx -t`
   - If the certificate isn't issued by Let's Encrypt:
     - Check ACME challenge configuration
     - Verify the DNS records used to prove domain ownership (A/AAAA for HTTP-01, the `_acme-challenge` TXT record for DNS-01)
     - Check the ACME client logs: `journalctl -u 'acme-*' -n 100 --no-pager` (NixOS's `security.acme` module uses lego rather than certbot, so there is normally no `certbot` unit)
     - Manually trigger certificate renewal if needed
8. **Process Improvement**
   - After successful deployment, propose 3 suggestions to add to agents.md that would help with future deployments:
     1. [Specific pattern or configuration approach discovered]
     2. [Common pitfall to avoid]
     3. [Useful command or tool discovered]
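
The verification in step 6 could be automated along these lines. This is a hedged sketch: the function names `is_letsencrypt_issuer`, `cert_issuer`, and `verify_deployment` are hypothetical, and it assumes the app answers `GET /` with a non-error status (an app that requires a login may legitimately return 401 and would need a different check).

```bash
# Succeeds if an issuer line (as printed by `openssl x509 -noout -issuer`)
# names Let's Encrypt. Staging certificates also contain that string, so
# anything marked STAGING is rejected first.
is_letsencrypt_issuer() {
  case "$1" in
    *STAGING*) return 1 ;;
    *"Let's Encrypt"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the issuer of the certificate served for a domain.
# </dev/null makes s_client exit after the handshake instead of waiting on stdin.
cert_issuer() {
  local domain="$1"
  openssl s_client -connect "${domain}:443" -servername "$domain" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer
}

# Hypothetical end-to-end check: reachability first, then the certificate issuer.
verify_deployment() {
  local domain="$1" issuer
  if ! curl --max-time 5 -fsS -o /dev/null "https://${domain}/"; then
    echo "FAIL: https://${domain}/ did not respond successfully within 5 seconds" >&2
    return 1
  fi
  issuer="$(cert_issuer "$domain")"
  if ! is_letsencrypt_issuer "$issuer"; then
    echo "FAIL: unexpected issuer: ${issuer}" >&2
    return 1
  fi
  echo "OK: ${domain} is reachable with a Let's Encrypt certificate"
}
```

Running `verify_deployment <domain>` after the switch gives a single pass/fail line; a failure points at which half of the troubleshooting list in step 7 to start with.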
### Infrastructure Tasks
#### Planned Work
1. **Borg Backup Server**
   - Set up a dedicated Borg backup server for automated backups
   - Configure backup schedules for critical systems
   - Implement retention policies and pruning rules
2. **Secrets Management with sops-nix**
   - Implement sops-nix for secrets management
   - Move all hardcoded secrets from server configs into sops-nix
   - Set up encryption keys and key rotation policies
   - Document the secrets management workflow
## Remote System Management
### Access Systems
SSH to machines using hostnames (resolve via local `/etc/hosts` or DNS):
```bash
ssh <hostname> # Replace with actual system name
```
### Make Configuration Changes
1. **Check current systems:** View `flake.nix` for available system configurations
2. **Edit local config:** `cd ~/src/nixos && vim [relevant_file]`
3. **Test build:** `nixos-rebuild build --flake .#<system-name>`
4. **Commit and push changes:**
   ```bash
   git add . && git commit -m "description"
   git push origin master
   ```
5. **Update target systems:**
   ```bash
   ssh <hostname> 'cd ~/src/nixos && git pull && sudo nixos-rebuild switch --flake .#'
   ```
### Bulk Updates
```bash
# Update multiple systems in parallel (one background ssh per host)
for host in host1 host2 host3; do
  ssh "$host" 'cd ~/src/nixos && git pull && sudo nixos-rebuild switch --flake .#' &
done
wait # Wait for all updates to complete
```
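
A bare `wait` returns success even when a background job failed, so the loop above does not show which host broke. One possible variant is sketched below under stated assumptions: `update_hosts` and the `REMOTE_RUNNER` override are hypothetical names introduced here, and non-interactive `sudo` is assumed to work on the targets.

```bash
# Hypothetical helper: update several hosts in parallel and report each result.
# REMOTE_RUNNER is an illustrative override (defaults to ssh) used for dry runs.
update_hosts() {
  local runner="${REMOTE_RUNNER:-ssh}"
  local pairs="" failed=0 host pair
  for host in "$@"; do
    "$runner" "$host" 'cd ~/src/nixos && git pull && sudo nixos-rebuild switch --flake .#' &
    pairs="$pairs $host:$!"   # remember which PID belongs to which host
  done
  for pair in $pairs; do
    host="${pair%%:*}"
    # Waiting on a specific PID returns that job's exit status.
    if wait "${pair##*:}"; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `update_hosts warp skip` would print one `OK` or `FAIL` line per host and return non-zero if any update failed, which makes the result usable in a larger script.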
### Quick Management
```bash
# Check service status
ssh <hostname> 'systemctl status <service>'
# View logs
ssh <hostname> 'journalctl -u <service> -f'
# Pull the latest config and retry after a failed rebuild
ssh <hostname> 'cd ~/src/nixos && git pull && sudo nixos-rebuild switch --flake .#'
```
### Repository
- **Central:** https://git.symbiotrip.com/jsutter/nixos
- **Config reference:** Check `flake.nix` for system names and module structure
- **Update workflow:** Local edit → Push → Remote pull → Rebuild