Commit graph

15 commits

Author SHA1 Message Date
jedarden
301a5884ce fix(firmware): bust Kaniko cache + force sdkconfig regen to fix 16MB crash loop
The 0.1.352 Docker image contained firmware compiled with CONFIG_ESPTOOLPY_FLASHSIZE=16MB
despite sdkconfig.defaults being updated to 4MB in d837598. Kaniko served a cached
firmware layer, bypassing the sdkconfig.defaults change.

Result: ESP32-S3 (4MB flash) flashed via Web Serial crashed on every boot:
  spi_flash: Detected size(4096k) smaller than binary image header(16384k). Probe failed.
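
A mismatch like this could in principle be caught before the image ships by reading the flash size recorded in the binary's header. The following is a minimal sketch, not part of the repository: it assumes the header layout described in the esptool firmware image format documentation (magic byte 0xE9 at offset 0, flash size code in the high nibble of byte 3), and it assumes the merged image starts with the bootloader at offset 0x0 as on ESP32-S3. The function name `headerFlashSizeMB` is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// imageMagic is the first byte of an ESP application or bootloader image.
const imageMagic = 0xE9

// flashSizeMB maps the high nibble of header byte 3 to a flash size in MB.
// Only codes 0-4 are covered here; larger codes are treated as unknown.
var flashSizeMB = map[byte]int{0: 1, 1: 2, 2: 4, 3: 8, 4: 16}

// headerFlashSizeMB returns the flash size (in MB) declared in the image header.
func headerFlashSizeMB(image []byte) (int, error) {
	if len(image) < 4 {
		return 0, errors.New("image too short for header")
	}
	if image[0] != imageMagic {
		return 0, fmt.Errorf("bad magic byte 0x%02x", image[0])
	}
	code := image[3] >> 4 // the low nibble is the frequency code, ignored here
	mb, ok := flashSizeMB[code]
	if !ok {
		return 0, fmt.Errorf("unknown flash size code %d", code)
	}
	return mb, nil
}

func main() {
	// Synthetic header: magic, segment count, flash mode, size/frequency byte.
	hdr := []byte{0xE9, 0x05, 0x02, 0x4F} // high nibble 4 means 16MB
	mb, err := headerFlashSizeMB(hdr)
	fmt.Println(mb, err)
}
```

A build step could compare the returned value against the expected 4 and fail the build on a mismatch, which would surface a stale cached layer before anyone flashes a device.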

Fix:
- Add FIRMWARE_CACHE_BUST ARG before COPY in firmware stage (guarantees cache miss)
- Add RUN rm -f sdkconfig sdkconfig.old so idf.py set-target regenerates from
  sdkconfig.defaults (CONFIG_ESPTOOLPY_FLASHSIZE_4MB=y) on every build

Bumps version to 0.1.354 to trigger a fresh CI build.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-03 23:14:10 -04:00
jedarden
4b8e6f793f fix(docker): pin GOOS/GOARCH to linux/amd64 in go build
The multi-arch change (2cd4410) derived GOOS/GOARCH from TARGETPLATFORM
with wrong cut field indices (-f2/-f3), yielding the invalid pair
amd64/amd64 -> `go: unsupported GOOS/GOARCH pair amd64/amd64`, failing
every CI image build since May 24. CI builds amd64 only (ESP-IDF firmware
is x86_64-only), so pin linux/amd64 explicitly.
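
For reference, TARGETPLATFORM has the form os/arch with an optional /variant suffix, so the OS is the first field and the architecture the second. The sketch below shows that split in Go rather than shell; `splitPlatform` is a hypothetical helper used only for illustration, not code from the Dockerfile.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPlatform splits a platform string such as "linux/amd64" or
// "linux/arm/v7" into its OS and architecture parts. Any variant is ignored.
func splitPlatform(p string) (goos, goarch string, err error) {
	parts := strings.Split(p, "/")
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("malformed platform %q", p)
	}
	// Index 0 is the OS and index 1 the architecture. With cut, whose
	// fields are 1-indexed, these correspond to -f1 and -f2.
	return parts[0], parts[1], nil
}

func main() {
	goos, goarch, err := splitPlatform("linux/amd64")
	fmt.Println(goos, goarch, err)
}
```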

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 13:26:26 -04:00
jedarden
3ca6e8fcd3 build: bake spaxel-sim binary into image for in-cluster simulator
Builds cmd/sim alongside the mothership and copies /spaxel-sim into the
final image so the same image can drive a synthetic-node CSI load against
a deployed mothership. Default ENTRYPOINT still runs the mothership.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 21:49:20 -04:00
jedarden
6d989918d8 feat(docker): harden Dockerfile with distroless nonroot runtime
- Change runtime from debian:12-slim to gcr.io/distroless/static-debian12:nonroot
- Remove wget health check (distroless has no shell)
- Embed dashboard via go:embed (dashboard files now part of binary)
- Add build tag support for conditional embedding (production vs development)
- Dashboard serving code supports both embedded and filesystem-based serving

The dashboard is now embedded in the Go binary using go:embed with the
'embed' build tag. Production Docker builds use -tags=embed to enable
dashboard embedding, while development builds fall back to filesystem
serving. This aligns with the plan's security requirements for non-root
distroless runtime while maintaining developer ergonomics.

Closes: bf-1chgr

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 10:22:01 -04:00
jedarden
2cd4410501 ci(docker): add multi-arch (amd64+arm64) build support
- Add ARG TARGETPLATFORM/TARGETARCH for cross-platform builds
- Cross-compile Go binary using GOOS/GOARCH from TARGETPLATFORM
- ESP32 firmware build is amd64-only (ESP-IDF is x86_64)
  - Creates placeholder on arm64 builds
  - Removes placeholder in final stage, adds README
- Supports docker buildx --platform linux/amd64,linux/arm64

Closes: bf-2bxpx

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 10:14:40 -04:00
jedarden
c3034ae9a2 Fix firmware flashing: correct merge binary, seed overwrite, inline flash UI
- Dockerfile: use --flash_size 4MB and drop OTA data from merge_bin (OTA
  data at 0xc10000 inflated binary to 12.6MB, exceeding 4MB chip flash)
- main.go: seedFirmwareDir now overwrites when source size differs, fixing
  PVC staleness where old 1.6MB app-only binary was never replaced
- onboard.js: renderFlashFirmware() rewritten so all elements (button,
  progress bar, status text, retry help, log panel) are inline in one
  container — no separate floating modal

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:41:51 -04:00
jedarden
55976c7058 Fix merge_bin flash parameters to match sdkconfig (dio/80m/16MB)
Guru Meditation/IllegalInstruction after flashing was caused by the
merged binary using default flash parameters instead of the project's
settings. esptool merge_bin flags use underscores and go after the
subcommand: --flash_mode dio --flash_freq 80m --flash_size 16MB.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 14:09:57 -04:00
jedarden
223b439c5a Use merged firmware binary for esp-web-tools flashing
esptool merge_bin combines bootloader (0x0), partition table (0x8000),
application (0x10000), and OTA data (0xc10000) into a single binary
flashable at offset 0x0 — matching the manifest address and enabling
correct initial flashing via the onboarding wizard.
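
Because a merged image covers everything from offset 0x0 to the end of the last segment, its length is set by the highest end address rather than by the sum of the parts. The sketch below works that out for the four offsets named above. The segment sizes are hypothetical placeholders (the app size echoes the 1.6MB figure mentioned elsewhere in this history, and 0x2000 for OTA data is an assumed value), and `mergedSize` is not an esptool function.

```go
package main

import "fmt"

// segment is one input to a merged image: a blob placed at a flash offset.
type segment struct {
	name   string
	offset int64
	size   int64
}

// mergedSize returns the length of a single image that contains every
// segment at its offset, with gaps padded: the highest end address.
func mergedSize(segs []segment) int64 {
	var end int64
	for _, s := range segs {
		if e := s.offset + s.size; e > end {
			end = e
		}
	}
	return end
}

func main() {
	const flash4MB = 4 * 1024 * 1024

	// Offsets come from the commit message; sizes are assumptions.
	base := []segment{
		{"bootloader", 0x0, 0x5000},
		{"partition-table", 0x8000, 0xC00},
		{"app", 0x10000, 1_600_000},
	}
	withOTA := append(append([]segment{}, base...),
		segment{"otadata", 0xc10000, 0x2000})

	for _, c := range []struct {
		label string
		segs  []segment
	}{{"without OTA data", base}, {"with OTA data", withOTA}} {
		n := mergedSize(c.segs)
		fmt.Printf("%s: %d bytes, fits 4MB: %v\n", c.label, n, n <= flash4MB)
	}
}
```

Under these assumptions a single small segment at 0xc10000 pushes the merged length past 12 million bytes, while the same image without it stays well under 4MB.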

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 13:33:01 -04:00
jedarden
b8684aad68 fix(ci): set-target esp32s3 before idf.py build
ESP-IDF 5.x requires explicit set-target even when CONFIG_IDF_TARGET
is present in sdkconfig.defaults.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 15:09:06 -04:00
jedarden
f2b609a2ef fix(ci): source IDF export.sh in firmware build stage
espressif/idf entrypoint is not invoked in multi-stage builds, so
idf.py is not in PATH. Sourcing export.sh activates the toolchain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 15:03:40 -04:00
jedarden
6b87040721 feat(ci): build ESP32 firmware in Dockerfile and seed OTA on startup
Add espressif/idf:v5.2 as a multi-stage build step so the firmware
binary is baked into the image at /firmware/spaxel-firmware.bin.
On startup the mothership copies it into /data/firmware/ (PVC) if not
already present, making it immediately available for the onboarding
wizard without a manual upload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:58:14 -04:00
jedarden
e44dd345f6 feat: implement comprehensive /healthz endpoint
Add complete health check implementation for Docker HEALTHCHECK and
Traefik health routing with:

Response fields:
- status: "ok" or "degraded"
- uptime_s: seconds since mothership boot
- version: mothership version string
- nodes_online: count of connected nodes
- db: "ok" or "failing" (SELECT 1 with 100ms timeout)
- load_level: 0-3 from load shedding state
- reason: human-readable explanation (only when degraded)

HTTP status codes:
- 200 for healthy (status="ok")
- 503 for degraded (status="degraded")

Degraded conditions:
- Database unreachable
- Load level 3 sustained for >60 seconds
- No nodes connected after 5 minutes uptime

Docker HEALTHCHECK updated to verify status="ok" response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:09:36 -04:00
jedarden
a2e00d39d4 fix(docker): use Go 1.25 to match modernc.org/sqlite v1.47 requirement
2026-04-01 22:59:13 -04:00
jedarden
46c3516941 fix(docker): remove invalid HEALTHCHECK directive for distroless image
2026-04-01 22:42:18 -04:00
jedarden
3f2962f945 feat(deploy): Docker packaging with multi-stage build and compose orchestration
- Dockerfile: golang:1.23-bookworm builder → distroless/static-debian12:nonroot
- docker-compose.yml: host networking (required for mDNS), Traefik labels, resource limits
- VERSION: 0.1.0 for image tagging
- .dockerignore: excludes docs, build artifacts, IDE files
- .gitignore: standard Go/ESP-IDF ignores

Key decisions:
- Host networking required: Docker bridge blocks mDNS multicast 224.0.0.251
- distroless/static-debian12:nonroot: no shell, minimal attack surface, UID 65532
- Firmware via volume mount: users provide their own binaries for OTA
- Traefik labels disabled by default: enable SPAXEL_TRAEFIK_ENABLE=true for TLS

Complete: Phase 1 Docker packaging — all foundation items now done
Remaining: Phase 2 signal processing (baseline, deltaRMS, Fresnel zones)
2026-03-26 07:46:15 -04:00