AI Media Hub Handover
Working Rule
- From this point on, every meaningful change should be appended to this file so the next handoff can reconstruct:
- what changed
- why it changed
- how it was verified
- what remains risky
- Treat this file as both backlog and handover log, not just a static TODO list.
Current Session Update (2026-03-13)
- Added a local self-test workflow before push/container build:
- scripts/selftest.sh
- scripts/mock_searxng.py
- Fixed Korean query translation fallback behavior:
- If GEMINI_API_KEY is missing or Gemini translation fails, the code now still attempts the Google Translate fallback.
- If the Google Translate fallback fails, the dictionary replacement fallback still runs.
- Added Go tests for translation fallback logic.
- Fixed frontend HLS preview wiring:
- hls.js is now loaded in frontend/index.html
- frontend now tries hls.js first, then native HLS playback if available
- Corrected the practical local verification note:
- go build ./backend from repo root conflicts with the existing backend/ directory name
- verified build command is now treated as go build -o /tmp/... ./backend
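The translation fallback order described in this update can be sketched as a chain of translators tried in sequence. This is illustrative Python, not the Go code in the backend; the function names and the two-entry dictionary are hypothetical stand-ins.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for the small Korean media-term dictionary.
TERM_DICTIONARY = {"바다": "ocean", "일몰": "sunset"}


def dictionary_translate(query: str) -> str:
    """Last-resort fallback: replace known terms, leave everything else untouched."""
    for korean, english in TERM_DICTIONARY.items():
        query = query.replace(korean, english)
    return query


def translate_with_fallback(
    query: str,
    translators: Sequence[Callable[[str], Optional[str]]],
) -> str:
    """Try each translator in order; an exception or empty result moves to the next one."""
    for translate in translators:
        try:
            result = translate(query)
        except Exception:
            continue  # e.g. missing API key or HTTP failure
        if result and result.strip():
            return result.strip()
    # Every remote translator failed, so the dictionary replacement still runs.
    return dictionary_translate(query)
```

Passing the Gemini translator first and the Google Translate fallback second reproduces the order above; a missing key simply surfaces as a failure of the first entry.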
Current Session Update (2026-03-13, Search/Preview Follow-up)
- Investigated a production search failure using downloaded frontend logs.
- Identified the main timeout cause:
- too many search results were being collected
- too many Gemini Vision batches were being evaluated sequentially
- backend debug messages were broadcasting oversized result payloads
- Applied search pipeline optimization:
- reduced per-source result caps
- reduced query fan-out for Google Video
- reduced enrichment cap
- limited Gemini Vision evaluation to top-ranked candidates only
- Improved Google Video filtering:
- added bans for music/BGM/trailer-style noise results
- Improved Envato enrichment fidelity:
- source page metadata is now preferred over search-engine proxy thumbnails
- source snippet/title are now taken from page metadata when available
- preview mp4 extraction now works via HTML/JSON-LD parsing
- added Python HTML fetch fallback for Cloudflare-challenged Envato pages because Go HTTP alone was receiving 403 challenge pages in testing
- Improved Artgrid fidelity:
- source page title/description/thumbnail are now preferred over search-engine snippets when available
- preview extraction is still not considered solved for all Artgrid clips because public HTML tested here did not expose a stable mp4/m3u8 URL
- Improved logging:
- backend search debug events now emit summaries, timings, source counts, preview counts, and Gemini batch stats instead of giant raw arrays
- frontend now logs raw non-JSON error bodies instead of collapsing them to {} on gateway/proxy failures
- Improved result rendering:
- search cards now show source snippet/description separately from AI reason to reduce confusion between asset metadata and Gemini commentary
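A minimal sketch of the "summaries instead of giant raw arrays" idea for debug events, written in Python for illustration only. The field names (source, preview_url, elapsed_ms) are assumptions and may not match the real backend payload.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_results(results: List[Dict[str, Any]], elapsed_ms: int) -> Dict[str, Any]:
    """Build a compact debug event instead of broadcasting the raw result list."""
    # Count results per source; a missing source key is grouped under "unknown".
    source_counts = Counter(r.get("source", "unknown") for r in results)
    # Only non-empty preview URLs count as usable previews.
    preview_count = sum(1 for r in results if r.get("preview_url"))
    return {
        "total": len(results),
        "elapsed_ms": elapsed_ms,
        "sources": dict(source_counts),
        "with_preview": preview_count,
    }
```

The summary size stays roughly constant regardless of how many results were collected, which is the property that matters for WebSocket debug broadcasts.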
Current Session Update (2026-03-13, Regression Fix)
- A regression was found after search optimization:
- Envato and Artgrid disappeared entirely for some real searches while Google Video still returned results
- Root cause:
- the first optimization reduced query-variant breadth too aggressively
- the first 3 query variants were not enough to recover Envato/Artgrid in some real SearXNG result sets
- Fix applied:
- search now runs in two stages
- stage 1 searches only the first few variants for speed
- stage 2 searches additional variants only for sources that still returned zero results
- Intent:
- keep the anti-timeout optimization
- recover Envato/Artgrid recall when the early pass is too narrow
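The two-stage approach can be sketched as follows. This is illustrative Python rather than the Go search service; the search callable, the source names, and the first-stage size of 3 are assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical search callable: (source, query) -> list of result URLs.
SearchFn = Callable[[str, str], List[str]]


def two_stage_search(
    sources: Sequence[str],
    variants: Sequence[str],
    search: SearchFn,
    first_stage_size: int = 3,
) -> Dict[str, List[str]]:
    results: Dict[str, List[str]] = {source: [] for source in sources}

    # Stage 1: every source, but only the first few variants (keeps latency bounded).
    for variant in variants[:first_stage_size]:
        for source in sources:
            results[source].extend(search(source, variant))

    # Stage 2: remaining variants, only for sources that came back empty in stage 1.
    empty_sources = [source for source in sources if not results[source]]
    for variant in variants[first_stage_size:]:
        for source in empty_sources:
            results[source].extend(search(source, variant))

    return results
```

A source that already returned results in stage 1 costs no extra requests, so the anti-timeout behavior is preserved for the common case.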
Current Session Update (2026-03-13, HTML Snapshot Analysis)
- Used saved HTML snapshots supplied by the user for:
- Envato item page
- Artgrid clip page
- Findings:
- Envato page exposes clean VideoObject JSON-LD with:
- exact asset title
- rich description
- thumbnail URL
- preview mp4 URL
- Artgrid page exposes reliable meta fields for:
- title
- description
- thumbnail
- canonical URL
- Artgrid snapshot still does not expose a stable preview mp4 or m3u8 in the saved HTML or downloaded asset bundle inspected here
- Fixes applied from the snapshots:
- Envato enrichment now prefers VideoObject JSON-LD over generic meta tags
- Envato search cards should now align much better with the actual source asset and preview
- Artgrid title/description are now cleaned so Gemini/source text is less polluted by site suffixes and generic boilerplate
- Remaining limitation:
- Artgrid hover-video preview cannot be derived reliably from the provided snapshot alone
- if Artgrid preview video is still required, the next useful artifact is a browser HAR or DevTools network capture from an opened clip page
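A rough sketch of pulling VideoObject JSON-LD out of a saved item page, using only the Python standard library. The property names (name, description, thumbnailUrl, contentUrl) are standard schema.org VideoObject properties; that the preview mp4 lives in contentUrl is an assumption, not something confirmed from the snapshot here.

```python
import json
from html.parser import HTMLParser
from typing import Any, Dict, List, Optional


class _JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: List[str] = []
        self.blocks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_video_object(html: str) -> Optional[Dict[str, Any]]:
    """Return title/description/thumbnail/preview from the first VideoObject, if any."""
    collector = _JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD instead of failing the whole page
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == "VideoObject":
                return {
                    "title": item.get("name"),
                    "description": item.get("description"),
                    "thumbnail": item.get("thumbnailUrl"),
                    "preview": item.get("contentUrl"),  # assumed location of the preview mp4
                }
    return None
```

When the function returns None, falling back to generic meta tags keeps the older behavior for pages without structured data.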
Local Self-Test Workflow
- Primary command:
bash scripts/selftest.sh
- What it currently verifies:
- Go formatting for touched backend files
- Python syntax for worker + mock SearXNG
- go test ./...
- backend binary build
- local app boot with temp SQLite/download dirs
- /healthz
- /api/search using a local mock SearXNG server
- /api/upload
- Purpose:
- allow safe local regression checks before push or container build without depending on real SearXNG, Gemini, or browser interaction
Project Summary
- Project: ai-media-hub
- Goal: AI-assisted media discovery + ingest dashboard for Unraid
- Backend: Go
- Worker: Python + yt-dlp + ffmpeg
- Frontend: HTML + Vanilla JS + Tailwind CDN
- Database: SQLite
- Current search backend: SearXNG
- Current vision/ranking backend: Gemini 2.5 Flash
- Deployment target: single Docker container on Unraid
- Git remote: https://git.savethenurse.com/savethenurse/ai-media-hub.git
Current Architecture
- backend/main.go: App bootstrap, env loading, static frontend serving, route registration
- backend/handlers/api.go: Upload/download/search APIs, WebSocket progress broadcast, debug event broadcast
- backend/services/cse.go: Actual search backend service. Despite the filename, this is no longer Google CSE logic. It now wraps SearXNG search, source filtering, result enrichment, and preview asset parsing
- backend/services/gemini.go: Query translation, deterministic query expansion helper, Gemini vision scoring. Also extracts the first video frame with ffmpeg when no thumbnail exists
- backend/models/db.go: SQLite init + download history
- worker/downloader.py: yt-dlp probe/download + ffmpeg clip extraction
- frontend/index.html: Main dashboard UI, preview modal, debug log panel
- frontend/app.js: API calls, WebSocket status bar, hover preview playback, debug logger panel, platform toggles
- frontend/style.css: Custom styles, clamp helpers, slider thumb styles, debug panel scrollbar styles
- unraid-template.xml: Unraid template for current git.savethenurse.com image source
Search Flow: Current Implementation
- User enters a query in Zone A.
- Frontend sends /api/search with:
- query
- selected platforms
- Backend translates the query to English in GeminiService.TranslateQuery. Fallback order:
- Gemini translation
- Google Translate HTTP fallback
- small Korean media-term dictionary replacement
- Backend builds deterministic English search variants in GeminiService.ExpandQuery.
- Backend calls SearchService.SearchMedia(...).
- Search service queries SearXNG for:
- Envato
- Artgrid
- Google Video
- Search service filters source URLs aggressively:
- Google Video: YouTube-only
- Envato: elements.envato.com item URLs only
- Artgrid: artgrid.io/clip/... only
- Search service enriches results:
- Envato: parses item page HTML for og:image and preview video URL
- Artgrid: attempts clip API + HTML parsing for thumbnails and preview sources
- Backend ranks all results locally.
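- Backend evaluates the top-ranked candidates with Gemini vision in batches (evaluation was limited to top-ranked candidates only in the 2026-03-13 search/preview follow-up).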
- Backend merges Gemini recommendations + fallback ranked items and returns JSON to frontend.
- Frontend renders cards and hover previews.
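The per-source URL filtering step in this flow can be sketched like this. It is illustrative Python, not the Go search service: the source keys, the accepted YouTube hostnames, and the "any non-root path" stand-in for an Envato item URL are assumptions, since the exact item-URL pattern is not stated in this document.

```python
from urllib.parse import urlparse

# Assumed set of hostnames treated as YouTube for the "YouTube-only" rule.
YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "youtu.be"}


def is_allowed_source_url(source: str, url: str) -> bool:
    """Return True only when the URL belongs to the expected site and path shape."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if source == "google_video":
        return host in YOUTUBE_HOSTS
    if source == "envato":
        # Placeholder check: the real item-URL pattern is likely stricter.
        return host == "elements.envato.com" and parsed.path.strip("/") != ""
    if source == "artgrid":
        return host == "artgrid.io" and parsed.path.startswith("/clip/")
    return False  # unknown sources are rejected
```

Comparing the parsed hostname rather than doing a substring match avoids accepting look-alike URLs that merely contain the site name in their path or query string.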
Direct Downloader Flow: Current Implementation
- User enters URL in Zone C.
- Frontend checks duplicate history via /api/history/check.
- Frontend loads preview metadata via /api/download/preview.
/api/download/preview. - Preview modal opens with:
- media preview
- duration
- crop dual-thumb slider
- quality select
- User confirms download.
- Backend launches Python worker.
- Worker downloads source with yt-dlp, clips with ffmpeg, emits JSON progress lines.
- Backend rebroadcasts progress over WebSocket.
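A minimal sketch of the JSON-progress-line convention between worker and backend. The payload fields (stage, percent) are hypothetical; the actual worker/downloader.py may emit a different shape.

```python
import json
import sys
from typing import IO, Any, Dict, Optional


def emit_progress(stage: str, percent: float, stream: IO[str] = sys.stdout) -> None:
    """Write one JSON object per line and flush so the parent process sees it immediately."""
    stream.write(json.dumps({"stage": stage, "percent": round(percent, 1)}) + "\n")
    stream.flush()


def parse_progress_line(line: str) -> Optional[Dict[str, Any]]:
    """Parent-side counterpart: ignore any line that is not a JSON object."""
    try:
        payload = json.loads(line)
    except json.JSONDecodeError:
        return None  # e.g. stray tool output mixed into stdout
    return payload if isinstance(payload, dict) else None
```

One object per line lets the parent read stdout line by line and forward each parsed payload over WebSocket without buffering the whole stream.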
Current Features Implemented
- Project folder structure
- Dockerfile
- Gitea workflow
- Unraid template
- SQLite download history
- File upload
- yt-dlp direct downloader
- Preview modal for direct download
- Crop selection slider
- Quality selection
- WebSocket realtime progress
- Search source toggles
- Search card hover preview support
- Debug log panel in frontend
- .log download from debug panel
Important Current Constraints / Known Problems
- Search backend has been rewritten multiple times and is still the main unstable area.
- Envato previews are parsed mainly from page HTML metadata / structured data.
- Artgrid previews are partially inferred from:
- clip page HTML
- clip API attempts
- HLS preview handling in frontend
- Search relevance is still not considered stable enough.
- Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy.
- Frontend JavaScript was not linted with Node tooling in this environment because node is not installed here.
- Full browser-level preview validation is still not covered by the local self-test script.
- Search cards now separate source snippet from AI reason, but metadata fidelity still depends on source enrichment quality.
- Artgrid public pages inspected from this environment still did not expose a stable public preview video URL in HTML, so Artgrid hover-video support may remain partial until a browser-captured HTML/HAR sample reveals the real preview source pattern.
Frontend Debug Logger
- UI button: bottom-right Logs
- Files:
- frontend/index.html
- frontend/app.js
- frontend/style.css
- Logs currently capture:
- API request / response
- WebSocket progress messages
- ignored WS debug messages
- status updates
- platform toggle state
- preview source attach / detach
- hover start / hover end
- modal preview open / close
- browser errors
- promise rejections
- backend debug broadcasts
Current Environment Variables
- APP_ROOT
- APP_ADDR
- SQLITE_PATH
- DOWNLOADS_DIR
- FRONTEND_DIR
- WORKER_SCRIPT
- SEARXNG_BASE_URL
- SEARXNG_GOOGLE_VIDEO_ENGINE
- SEARXNG_WEB_ENGINE
- GEMINI_API_KEY
Unraid Template Notes
- Current image repository in template: git.savethenurse.com/savethenurse/ai-media-hub:latest
- Current registry in template: https://git.savethenurse.com
Docker / Build Notes
- Dockerfile uses:
- Go build stage
- static ffmpeg image stage
- Python runtime stage
- Heavy apt ffmpeg install path was removed earlier to reduce build time.
Git / Push Workflow Used So Far
- Branch: main
- Remote: origin
- All requested changes were committed and pushed incrementally to: https://git.savethenurse.com/savethenurse/ai-media-hub.git
Recent Relevant Commits
- 8ed1e84 Add in-app debug log panel
- 823bf12 Reflect selected platforms in search status
- cceb040 Update platform status and HLS previews
- ad8afd5 Tighten source filters and add platform toggles
- 27000db Hide overlays during hover preview
- b78865d Rewrite search flow and enrich preview assets
- de24886 Filter non-English expansions and prefer stock sources
- 0bd458d Boost translated search fallback and source priority
Next Priority Areas
- Search backend quality stabilization: the search service is the main unresolved area.
- Envato / Artgrid preview extraction hardening
- Search result relevance validation against real user queries
- Better matching between rendered description and actual linked asset
- Add browser-level verification for preview/HLS behavior
- Add more automated coverage for search ranking / filtering logic
- If Artgrid hover preview is still required, collect one real clip HTML/HAR from a browser session and derive a stable preview URL parser
- Add proper frontend build/lint step if Node becomes available
Verified Locally In This Environment
- go build -o /tmp/ai-media-hub ./backend
- go test ./... (currently no broad test suite beyond the added fallback tests)
- Python syntax check for worker + self-test helper
- local app boot and /healthz through scripts/selftest.sh
- local /api/search against mock SearXNG through scripts/selftest.sh
- local /api/upload through scripts/selftest.sh
/api/uploadthroughscripts/selftest.sh - full browser-level validation was not fully reproducible in this environment
Short Handover Summary
- The codebase exists and runs.
- Upload/download features mostly exist.
- Search is implemented but is still the most fragile subsystem.
- A visible debug logging panel now exists in the web UI and should be used first when continuing work.