ai-media-hub/TODO.md
AI Assistant 6f3149a443
Add local self-test flow and fix fallback regressions
2026-03-13 18:09:32 +09:00

AI Media Hub Handover

Working Rule

  • From this point on, every meaningful change should be appended to this file so the next handoff can reconstruct:
    • what changed
    • why it changed
    • how it was verified
    • what remains risky
  • Treat this file as both backlog and handover log, not just a static TODO list.

Current Session Update (2026-03-13)

  • Added a local self-test workflow before push/container build:
    • scripts/selftest.sh
    • scripts/mock_searxng.py
  • Fixed Korean query translation fallback behavior:
    • If GEMINI_API_KEY is missing or Gemini translation fails, the code now falls back to Google Translate instead of stopping.
    • If the Google Translate fallback also fails, the dictionary replacement fallback still runs.
  • Added Go tests for translation fallback logic.
  • Fixed frontend HLS preview wiring:
    • hls.js is now loaded in frontend/index.html
    • frontend now tries hls.js first, then native HLS playback if available
  • Corrected the practical local verification note:
    • go build ./backend from the repo root fails because the default output name (backend) collides with the existing backend/ directory
    • the verified build command is therefore go build -o /tmp/... ./backend
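
The translation fallback order noted above (Gemini, then Google Translate, then dictionary replacement) can be sketched as a chain of translators tried in sequence. This is an illustrative Python sketch only: the real logic lives in the Go backend, and the function names, the sample dictionary entries, and the treatment of empty results as failures are all assumptions.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the small Korean media-term dictionary.
MEDIA_TERMS = {"드론": "drone", "야경": "night view"}


def dictionary_translate(query: str) -> str:
    """Last-resort fallback: replace known Korean terms, keep the rest."""
    for korean, english in MEDIA_TERMS.items():
        query = query.replace(korean, english)
    return query


def translate_with_fallback(
    query: str,
    gemini: Optional[Callable[[str], str]],
    google: Optional[Callable[[str], str]],
) -> str:
    """Try each translator in order; any failure falls through to the next."""
    for translator in (gemini, google):
        if translator is None:  # e.g. GEMINI_API_KEY is missing
            continue
        try:
            result = translator(query)
        except Exception:
            continue  # network or API error: try the next fallback
        if result.strip():
            return result
    # Both remote translators were unavailable or failed.
    return dictionary_translate(query)
```

The key property is that a missing or failing first stage never short-circuits the later stages, which is the regression the session update describes fixing.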

Local Self-Test Workflow

  • Primary command:
    • bash scripts/selftest.sh
  • What it currently verifies:
    • Go formatting for touched backend files
    • Python syntax for worker + mock SearXNG
    • go test ./...
    • backend binary build
    • local app boot with temp SQLite/download dirs
    • /healthz
    • /api/search using a local mock SearXNG server
    • /api/upload
  • Purpose:
    • allow safe local regression checks before push or container build without depending on real SearXNG, Gemini, or browser interaction
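
A mock SearXNG server for this kind of self-test could look roughly like the following. It is a minimal sketch, not the contents of scripts/mock_searxng.py: the handler name, the canned result, and the helper functions are assumptions, and only the general shape of a SearXNG JSON response (a "query" string plus a "results" list with "url", "title", and "content") is relied on.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def build_payload(query: str) -> dict:
    """Return one canned result shaped like a SearXNG JSON response."""
    return {
        "query": query,
        "results": [{
            "url": "https://www.youtube.com/watch?v=mock",  # placeholder URL
            "title": f"Mock result for {query}",
            "content": "canned snippet",
        }],
    }


class MockSearxHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        if parsed.path != "/search":
            self.send_error(404)
            return
        query = parse_qs(parsed.query).get("q", [""])[0]
        body = json.dumps(build_payload(query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep self-test output quiet


def start_mock(port: int = 0) -> HTTPServer:
    """Serve on localhost in a daemon thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MockSearxHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With something like this running, the app under test would be started with SEARXNG_BASE_URL pointing at http://127.0.0.1:<port>, where the port comes from server.server_address[1].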

Project Summary

  • Project: ai-media-hub
  • Goal: AI-assisted media discovery + ingest dashboard for Unraid
  • Backend: Go
  • Worker: Python + yt-dlp + ffmpeg
  • Frontend: HTML + Vanilla JS + Tailwind CDN
  • Database: SQLite
  • Current search backend: SearXNG
  • Current vision/ranking backend: Gemini 2.5 Flash
  • Deployment target: single Docker container on Unraid
  • Git remote: https://git.savethenurse.com/savethenurse/ai-media-hub.git

Current Architecture

  • backend/main.go: App bootstrap, env loading, static frontend serving, route registration
  • backend/handlers/api.go: Upload/download/search APIs, WebSocket progress broadcast, debug event broadcast
  • backend/services/cse.go: Actual search backend service. Despite the filename, this is no longer Google CSE logic; it now wraps SearXNG search, source filtering, result enrichment, and preview asset parsing.
  • backend/services/gemini.go: Query translation, deterministic query expansion helper, Gemini vision scoring. Also extracts the first video frame with ffmpeg when no thumbnail exists.
  • backend/models/db.go: SQLite init + download history
  • worker/downloader.py: yt-dlp probe/download + ffmpeg clip extraction
  • frontend/index.html: Main dashboard UI, preview modal, debug log panel
  • frontend/app.js: API calls, WebSocket status bar, hover preview playback, debug logger panel, platform toggles
  • frontend/style.css: Custom styles, clamp helpers, slider thumb styles, debug panel scrollbar styles
  • unraid-template.xml: Unraid template for current git.savethenurse.com image source

Search Flow: Current Implementation

  1. User enters a query in Zone A.
  2. Frontend sends /api/search with:
    • query
    • selected platforms
  3. Backend translates the query to English in GeminiService.TranslateQuery. Fallback order:
    • Gemini translation
    • Google Translate HTTP fallback
    • small Korean media-term dictionary replacement
  4. Backend builds deterministic English search variants in GeminiService.ExpandQuery.
  5. Backend calls SearchService.SearchMedia(...).
  6. Search service queries SearXNG for:
    • Envato
    • Artgrid
    • Google Video
  7. Search service filters source URLs aggressively:
    • Google Video: YouTube-only
    • Envato: elements.envato.com item URLs only
    • Artgrid: artgrid.io/clip/... only
  8. Search service enriches results:
    • Envato: parses item page HTML for og:image and preview video URL
    • Artgrid: attempts clip API + HTML parsing for thumbnails and preview sources
  9. Backend ranks all results locally.
  10. Backend evaluates all ranked results with Gemini vision in batches.
  11. Backend merges Gemini recommendations + fallback ranked items and returns JSON to frontend.
  12. Frontend renders cards and hover previews.
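
The per-source URL filtering in step 7 can be illustrated with a small allow-list check. This is a Python sketch of the idea rather than the Go code in backend/services/cse.go: the source keys, the set of accepted YouTube hosts, and the rule used to recognise an Envato item URL are assumptions.

```python
from urllib.parse import urlparse


def _host(url: str) -> str:
    """Lower-cased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_allowed(source: str, url: str) -> bool:
    """Return True if url is acceptable for the given (hypothetical) source key."""
    host = _host(url)
    path = urlparse(url).path
    if source == "google_video":
        # YouTube-only, including the mobile and short-link hosts (assumed set).
        return host in {"youtube.com", "m.youtube.com", "youtu.be"}
    if source == "envato":
        # Assumption: any non-root path on elements.envato.com is an item URL.
        return host == "elements.envato.com" and bool(path.strip("/"))
    if source == "artgrid":
        # Only artgrid.io/clip/... pages, with something after /clip/.
        return host == "artgrid.io" and path.startswith("/clip/") and len(path) > len("/clip/")
    return False
```

Rejecting unknown sources by default keeps a newly added platform toggle from silently passing unfiltered results through.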

Direct Downloader Flow: Current Implementation

  1. User enters URL in Zone C.
  2. Frontend checks duplicate history via /api/history/check.
  3. Frontend loads preview metadata via /api/download/preview.
  4. Preview modal opens with:
    • media preview
    • duration
    • crop dual-thumb slider
    • quality select
  5. User confirms download.
  6. Backend launches Python worker.
  7. Worker downloads source with yt-dlp, clips with ffmpeg, emits JSON progress lines.
  8. Backend rebroadcasts progress over WebSocket.
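
Steps 7 and 8 rely on the worker writing one JSON object per line so the backend can parse and rebroadcast each update. The sketch below shows one way that could look, together with an ffmpeg clip command builder. It is not the code in worker/downloader.py: the function names and the JSON field names ("type", "stage", "percent") are assumptions, and stream copy is just one possible clipping mode.

```python
import json
import sys
from typing import List, TextIO


def build_clip_command(src: str, dst: str, start: float, end: float) -> List[str]:
    """Build an ffmpeg argv that cuts the range [start, end) seconds out of src."""
    if end <= start:
        raise ValueError("end must be greater than start")
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",        # seek on the input side
        "-i", src,
        "-t", f"{end - start:.3f}",   # clip duration in seconds
        "-c", "copy",                 # stream copy: fast, but cuts snap to keyframes
        dst,
    ]


def emit_progress(stage: str, percent: float, out: TextIO = sys.stdout) -> None:
    """Write one JSON object per line and flush so the parent process sees it promptly."""
    record = {"type": "progress", "stage": stage, "percent": round(percent, 1)}
    out.write(json.dumps(record) + "\n")
    out.flush()
```

Flushing after every line matters because stdout is block-buffered when it is a pipe, which would otherwise delay progress updates in the WebSocket status bar.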

Current Features Implemented

  • Project folder structure
  • Dockerfile
  • Gitea workflow
  • Unraid template
  • SQLite download history
  • File upload
  • yt-dlp direct downloader
  • Preview modal for direct download
  • Crop selection slider
  • Quality selection
  • WebSocket realtime progress
  • Search source toggles
  • Search card hover preview support
  • Debug log panel in frontend
  • .log download from debug panel

Important Current Constraints / Known Problems

  • Search backend has been rewritten multiple times and is still the main unstable area.
  • Envato previews are parsed mainly from page HTML metadata / structured data.
  • Artgrid previews are partially inferred from:
    • clip page HTML
    • clip API attempts
    • HLS preview handling in frontend
  • Search relevance is still not considered stable enough.
  • Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy.
  • Frontend JavaScript was not linted with Node tooling in this environment because node is not installed here.
  • Full browser-level preview validation is still not covered by the local self-test script.
  • Search cards still render recommendation reason text, not a robust asset description/snippet mapping.
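
Since Envato previews come mainly from page HTML metadata, the extraction step can be pictured as collecting Open Graph meta tags from the item page. The following is a standard-library Python sketch, not the Go implementation: the class and function names are hypothetical, and treating og:video as the preview video source is an assumption, since the source only confirms og:image.

```python
from html.parser import HTMLParser
from typing import Dict


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> values, keeping the first of each."""

    def __init__(self) -> None:
        super().__init__()
        self.properties: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content and prop not in self.properties:
            self.properties[prop] = content


def extract_open_graph(html: str) -> Dict[str, str]:
    """Return a mapping such as {"og:image": "..."} for the given page HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties
```

A page without the expected tags yields an empty mapping, so callers can fall back to another preview source instead of failing.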

Frontend Debug Logger

  • UI button: bottom-right Logs
  • Files:
    • frontend/index.html
    • frontend/app.js
    • frontend/style.css
  • Logs currently capture:
    • API request / response
    • WebSocket progress messages
    • ignored WS debug messages
    • status updates
    • platform toggle state
    • preview source attach / detach
    • hover start / hover end
    • modal preview open / close
    • browser errors
    • promise rejections
    • backend debug broadcasts

Current Environment Variables

  • APP_ROOT
  • APP_ADDR
  • SQLITE_PATH
  • DOWNLOADS_DIR
  • FRONTEND_DIR
  • WORKER_SCRIPT
  • SEARXNG_BASE_URL
  • SEARXNG_GOOGLE_VIDEO_ENGINE
  • SEARXNG_WEB_ENGINE
  • GEMINI_API_KEY

Unraid Template Notes

  • Current image repository in template: git.savethenurse.com/savethenurse/ai-media-hub:latest
  • Current registry in template: https://git.savethenurse.com

Docker / Build Notes

  • Dockerfile uses:
    • Go build stage
    • static ffmpeg image stage
    • Python runtime stage
  • Heavy apt ffmpeg install path was removed earlier to reduce build time.

Git / Push Workflow Used So Far

  • Branch: main
  • Remote: origin
  • All requested changes were committed and pushed incrementally to: https://git.savethenurse.com/savethenurse/ai-media-hub.git

Recent Relevant Commits

  • 8ed1e84 Add in-app debug log panel
  • 823bf12 Reflect selected platforms in search status
  • cceb040 Update platform status and HLS previews
  • ad8afd5 Tighten source filters and add platform toggles
  • 27000db Hide overlays during hover preview
  • b78865d Rewrite search flow and enrich preview assets
  • de24886 Filter non-English expansions and prefer stock sources
  • 0bd458d Boost translated search fallback and source priority

Next Priority Areas

  • Search backend quality stabilization: the search service is the main unresolved area.
  • Envato / Artgrid preview extraction hardening
  • Search result relevance validation against real user queries
  • Better matching between rendered description and actual linked asset
  • Add browser-level verification for preview/HLS behavior
  • Add more automated coverage for search ranking / filtering logic
  • Add proper frontend build/lint step if Node becomes available

Verified Locally In This Environment

  • go build -o /tmp/ai-media-hub ./backend
  • go test ./... (currently no broad test suite beyond the added fallback tests)
  • Python syntax check for worker + self-test helper
  • local app boot / /healthz through scripts/selftest.sh
  • local /api/search against mock SearXNG through scripts/selftest.sh
  • local /api/upload through scripts/selftest.sh
  • Not verified: full browser-level validation could not be fully reproduced in this environment

Short Handover Summary

  • The codebase exists and runs.
  • Upload/download features mostly exist.
  • Search is implemented but is still the most fragile subsystem.
  • A visible debug logging panel now exists in the web UI and should be used first when continuing work.