# AI Media Hub Handover

## Working Rule
- From this point on, every meaningful change should be appended to this file so the next handoff can reconstruct:
  - what changed
  - why it changed
  - how it was verified
  - what remains risky
- Treat this file as both backlog and handover log, not just a static TODO list.

## Current Session Update (2026-03-13)
- Added a local self-test workflow before push/container build:
  - `scripts/selftest.sh`
  - `scripts/mock_searxng.py`
- Fixed Korean query translation fallback behavior:
  - If `GEMINI_API_KEY` is missing or Gemini translation fails, the code now still attempts Google Translate fallback.
  - If Google Translate fallback fails, dictionary replacement fallback still runs.
- Added Go tests for translation fallback logic.
- Fixed frontend HLS preview wiring:
  - `hls.js` is now loaded in `frontend/index.html`
  - frontend now tries `hls.js` first, then native HLS playback if available
- Corrected the practical local verification note:
  - `go build ./backend` from repo root fails because the default output binary name `backend` conflicts with the existing `backend/` directory
  - the verified build command is now `go build -o /tmp/... ./backend`

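The fallback order described above can be sketched as follows. This is a minimal Python illustration, not the Go implementation: `translate_query`, `dictionary_translate`, and the `MEDIA_TERMS` entries are hypothetical names and sample data, and the two translator callables stand in for the Gemini and Google Translate calls.

```python
from typing import Callable, Optional

# Hypothetical sample dictionary; the real term list lives in the Go backend.
MEDIA_TERMS = {"사이버 펑크": "cyberpunk", "도시": "city", "야경": "night view"}


def dictionary_translate(query: str) -> str:
    """Last-resort fallback: replace known Korean media terms in place."""
    # Longest keys first so multi-word terms win over their substrings.
    for korean in sorted(MEDIA_TERMS, key=len, reverse=True):
        query = query.replace(korean, MEDIA_TERMS[korean])
    return query


def translate_query(
    query: str,
    gemini: Optional[Callable[[str], str]],
    google: Optional[Callable[[str], str]],
) -> str:
    """Try each translator in order; a missing or failing step never blocks the next."""
    for step in (gemini, google):
        if step is None:  # e.g. GEMINI_API_KEY is not set
            continue
        try:
            result = step(query).strip()
        except Exception:
            continue  # fall through to the next translator
        if result:
            return result
    return dictionary_translate(query)
```

Passing `None` for `gemini` models the missing-key case; the dictionary step always runs last, so the function returns a usable query even when both remote translators fail.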
## Current Session Update (2026-03-13, Search/Preview Follow-up)
- Investigated a production search failure using downloaded frontend logs.
- Identified the main timeout cause:
  - too many search results were being collected
  - too many Gemini Vision batches were being evaluated sequentially
  - backend debug messages were broadcasting oversized result payloads
- Applied search pipeline optimization:
  - reduced per-source result caps
  - reduced query fan-out for Google Video
  - reduced enrichment cap
  - limited Gemini Vision evaluation to top-ranked candidates only
- Improved Google Video filtering:
  - added bans for music/BGM/trailer-style noise results
- Improved Envato enrichment fidelity:
  - source page metadata is now preferred over search-engine proxy thumbnails
  - source snippet/title are now taken from page metadata when available
  - preview mp4 extraction now works via HTML/JSON-LD parsing
  - added Python HTML fetch fallback for Cloudflare-challenged Envato pages because Go HTTP alone was receiving 403 challenge pages in testing
- Improved Artgrid fidelity:
  - source page title/description/thumbnail are now preferred over search-engine snippets when available
  - preview extraction is still not considered solved for all Artgrid clips because public HTML tested here did not expose a stable mp4/m3u8 URL
- Improved logging:
  - backend search debug events now emit summaries, timings, source counts, preview counts, and Gemini batch stats instead of giant raw arrays
  - frontend now logs raw non-JSON error bodies instead of collapsing them to `{}` on gateway/proxy failures
- Improved result rendering:
  - search cards now show source snippet/description separately from AI reason to reduce confusion between asset metadata and Gemini commentary

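To illustrate the logging change, a summary-style debug event might look like the sketch below. The function name and every field name (`source`, `preview_url`, `elapsed_ms`, and so on) are assumptions made for this example; the actual Go event schema is not documented here.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_search_debug(
    results: List[Dict[str, Any]], elapsed_ms: int, gemini_batches: int
) -> Dict[str, Any]:
    """Build a compact debug payload: counts and timings, never the raw result array."""
    return {
        "total": len(results),
        "elapsed_ms": elapsed_ms,
        # Count results per source instead of shipping every result object.
        "by_source": dict(Counter(r.get("source", "unknown") for r in results)),
        "with_preview": sum(1 for r in results if r.get("preview_url")),
        "gemini_batches": gemini_batches,
    }
```

The payload size stays constant regardless of how many results were collected, which is the property that matters for WebSocket broadcasts.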
## Current Session Update (2026-03-13, Regression Fix)
- A regression was found after search optimization:
  - Envato and Artgrid disappeared entirely for some real searches while Google Video still returned results
- Root cause:
  - the first optimization reduced query-variant breadth too aggressively
  - the first 3 query variants were not enough to recover Envato/Artgrid in some real SearXNG result sets
- Fix applied:
  - search now runs in two stages
  - stage 1 searches only the first few variants for speed
  - stage 2 searches additional variants only for sources that still returned zero results
- Intent:
  - keep the anti-timeout optimization
  - recover Envato/Artgrid recall when the early pass is too narrow

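The two-stage behavior can be sketched roughly as below. This is an illustrative Python outline under assumptions: `two_stage_search`, the `search` callable, and the default `first_pass=3` are hypothetical (the note above only says the first 3 variants were insufficient on their own).

```python
from typing import Callable, Dict, List


def two_stage_search(
    sources: List[str],
    variants: List[str],
    search: Callable[[str, str], List[str]],
    first_pass: int = 3,
) -> Dict[str, List[str]]:
    """Run a narrow first pass, then widen only for sources that returned nothing."""
    results: Dict[str, List[str]] = {source: [] for source in sources}

    # Stage 1: every source gets only the first few variants, for speed.
    for source in sources:
        for query in variants[:first_pass]:
            results[source].extend(search(source, query))

    # Stage 2: remaining variants, but only for sources that are still empty.
    for source in sources:
        if results[source]:
            continue
        for query in variants[first_pass:]:
            results[source].extend(search(source, query))

    return results
```

Sources that already have results never pay for the extra variants, so the timeout protection from the earlier optimization is preserved.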
## Current Session Update (2026-03-13, HTML Snapshot Analysis)
- Used saved HTML snapshots supplied by the user for:
  - Envato item page
  - Artgrid clip page
- Findings:
  - Envato page exposes clean `VideoObject` JSON-LD with:
    - exact asset title
    - rich description
    - thumbnail URL
    - preview mp4 URL
  - Artgrid page exposes reliable meta fields for:
    - title
    - description
    - thumbnail
    - canonical URL
  - Artgrid snapshot still does **not** expose a stable preview mp4 or m3u8 in the saved HTML or downloaded asset bundle inspected here
- Fixes applied from the snapshots:
  - Envato enrichment now prefers `VideoObject` JSON-LD over generic meta tags
  - Envato search cards should now align much better with the actual source asset and preview
  - Artgrid title/description are now cleaned so Gemini/source text is less polluted by site suffixes and generic boilerplate
- Remaining limitation:
  - Artgrid hover-video preview cannot be derived reliably from the provided snapshot alone
  - if Artgrid preview video is still required, the next useful artifact is a browser HAR or DevTools network capture from an opened clip page

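As an illustration of the JSON-LD approach, the sketch below pulls a `VideoObject` out of page HTML using only the Python standard library. `name`, `description`, `thumbnailUrl`, and `contentUrl` are standard schema.org `VideoObject` properties; that the Envato preview mp4 sits in `contentUrl` is an assumption, and `extract_video_object` is a hypothetical helper name.

```python
import json
from html.parser import HTMLParser
from typing import Any, Dict, List, Optional


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._chunks: List[str] = []
        self.blocks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._chunks))


def extract_video_object(html: str) -> Optional[Dict[str, Any]]:
    """Return title/description/thumbnail/preview from the first VideoObject, if any."""
    parser = _JsonLdCollector()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing enrichment
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "VideoObject":
                return {
                    "title": item.get("name"),
                    "description": item.get("description"),
                    "thumbnail": item.get("thumbnailUrl"),
                    "preview": item.get("contentUrl"),  # assumed preview field
                }
    return None
```

Returning `None` when no `VideoObject` is present lets the caller fall back to generic meta tags, matching the stated preference order.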
## Current Session Update (2026-03-13, Collector Refactor)
- Refactored the search pipeline into source-specific collectors:
  - `envatoCollector`
  - `artgridCollector`
  - `googleVideoCollector`
- `SearchService` now acts mainly as:
  - collector orchestration
  - query-pass control
  - dedupe
  - cross-source enrichment scheduling
- Goal of the refactor:
  - reduce cross-source coupling
  - make future source-specific fixes safer
  - make it easier to replace or disable one source without destabilizing the others
- Current implementation note:
  - collectors are still in Go code under backend services, but the responsibilities are now separated by source instead of one monolithic search loop

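The separation of responsibilities can be pictured with the sketch below. It is a Python analogy for the Go design, not a port: the `Collector` protocol, `StaticCollector`, `FailingCollector`, and `run_collectors` are hypothetical, and swallowing a failing collector's exception is an assumption about how isolation could work.

```python
from typing import Dict, Iterable, List, Protocol, Set


class Collector(Protocol):
    """Anything with a name and a collect() method can act as a source."""

    name: str

    def collect(self, query: str) -> List[Dict[str, str]]: ...


class StaticCollector:
    """Stand-in collector returning canned URLs; a real one would query SearXNG."""

    def __init__(self, name: str, urls: List[str]) -> None:
        self.name = name
        self._urls = urls

    def collect(self, query: str) -> List[Dict[str, str]]:
        return [{"source": self.name, "url": url} for url in self._urls]


class FailingCollector:
    """Stand-in for a source whose upstream is down."""

    name = "broken"

    def collect(self, query: str) -> List[Dict[str, str]]:
        raise RuntimeError("upstream error")


def run_collectors(
    collectors: Iterable[Collector], query: str, enabled: Set[str]
) -> List[Dict[str, str]]:
    """Orchestrate collectors: skip disabled ones, isolate failures, dedupe by URL."""
    seen: Set[str] = set()
    merged: List[Dict[str, str]] = []
    for collector in collectors:
        if collector.name not in enabled:
            continue
        try:
            items = collector.collect(query)
        except Exception:
            continue  # one failing source should not take the others down
        for item in items:
            if item["url"] not in seen:
                seen.add(item["url"])
                merged.append(item)
    return merged
```

Disabling a source is then just a matter of leaving its name out of `enabled`, without touching the other collectors.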
## Current Session Update (2026-03-13, Artgrid Collector Fix + Ranker Split)
- Artgrid collector regression fixed:
  - real search results can come back as `artlist.io/stock-footage/clip/.../<id>` instead of only `artgrid.io/clip/<id>/...`
  - renderable filtering was rejecting those URLs, which caused `SearXNG returned no renderable results.` for Artgrid-only searches
- Fix applied:
  - Artgrid renderability now accepts both `artgrid.io` and `artlist.io/stock-footage/clip/...` clip URLs
  - Artgrid result links are normalized into `https://artgrid.io/clip/<id>/<slug>` inside the collector flow before filtering/enrichment
- Refactor continued:
  - ranking / Gemini candidate evaluation / recommendation merge logic moved out of `handlers/api.go`
  - new service layer file: `backend/services/ranker.go`
  - handler is now thinner and less coupled to search internals

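A normalizer for the two URL shapes could look like the sketch below. It assumes the clip id is numeric and that the elided Artlist path segment is a single slug (`/stock-footage/clip/<slug>/<id>`); neither detail is confirmed by the notes above, and `normalize_artgrid_url` is a hypothetical name.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumed path shapes: numeric id, single slug segment.
_ARTGRID_PATH = re.compile(r"^/clip/(?P<id>\d+)/(?P<slug>[^/]+)/?$")
_ARTLIST_PATH = re.compile(r"^/stock-footage/clip/(?P<slug>[^/]+)/(?P<id>\d+)/?$")


def normalize_artgrid_url(url: str) -> Optional[str]:
    """Map either URL shape to https://artgrid.io/clip/<id>/<slug>, or None if not a clip."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "artgrid.io":
        match = _ARTGRID_PATH.match(parsed.path)
    elif host == "artlist.io":
        match = _ARTLIST_PATH.match(parsed.path)
    else:
        return None

    if match is None:
        return None
    return f"https://artgrid.io/clip/{match.group('id')}/{match.group('slug')}"
```

Running normalization before the renderable filter means the filter only needs to understand one canonical shape.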
## Current Session Update (2026-03-13, 500 Fix)
- A server-side `request failed (500)` regression was found after the ranker split.
- Root cause:
  - Gemini candidate cap logic returned `12` even when only `9` ranked candidates existed
  - Gemini batch slicing then attempted to read beyond the available slice bounds
- Fix applied:
  - `GeminiCandidateLimit` now never exceeds the real candidate count for totals up to 12
  - Gemini evaluation now stays within valid ranked slice bounds
- Effect:
  - avoids backend 500 during the Gemini Vision evaluation stage for mid-sized result sets

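The clamp can be expressed as in the sketch below. The cap of 12 comes from the note above; the function names and the batch size of 4 are hypothetical. Python slicing silently truncates, so this version cannot reproduce the original failure, whereas a Go slice expression past the slice's capacity panics at runtime, which is consistent with the 500 described.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

GEMINI_CAP = 12  # cap value taken from the 500 fix note


def gemini_candidate_limit(total: int) -> int:
    """Never ask for more candidates than actually exist."""
    return max(0, min(total, GEMINI_CAP))


def gemini_batches(ranked: Sequence[T], batch_size: int = 4) -> List[List[T]]:
    """Split the capped candidate list into batches that stay within bounds."""
    limit = gemini_candidate_limit(len(ranked))
    candidates = list(ranked[:limit])
    return [
        candidates[start:start + batch_size]
        for start in range(0, len(candidates), batch_size)
    ]
```

With 9 ranked candidates this yields batches of 4, 4, and 1 rather than assuming a full set of 12.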
## Current Session Update (2026-03-13, Artgrid Query Coverage Fix)
- Another Artgrid no-results regression was found even after the collector URL matcher was widened.
- Root cause:
  - Artgrid collector query generation still leaned on `site:artgrid.io/clip/`
  - in practice, canonical clip pages can surface under `artlist.io/stock-footage/clip/...`
  - so some Artgrid-only searches still returned zero renderable results even though the accept filter had been fixed
- Fix applied:
  - Artgrid query generation now searches both:
    - `site:artgrid.io/clip/`
    - `site:artlist.io/stock-footage/clip/`
- Effect:
  - improves Artgrid recall in SearXNG result sets that favor canonical Artlist URLs over Artgrid URLs

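Query generation over both site filters might be sketched as follows. The two `site:` prefixes are taken from the note above; the function name and the "filter first, then terms" query layout are assumptions.

```python
from typing import List

ARTGRID_SITE_FILTERS = (
    "site:artgrid.io/clip/",
    "site:artlist.io/stock-footage/clip/",
)


def artgrid_queries(variants: List[str]) -> List[str]:
    """Emit one query per (variant, site filter) pair, keeping variant order."""
    return [
        f"{site} {variant}"
        for variant in variants
        for site in ARTGRID_SITE_FILTERS
    ]
```

Each variant now costs two SearXNG queries for Artgrid, so the per-source variant cap may need to account for the doubled fan-out.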
## Current Session Update (2026-03-16, Query / Preview Follow-up)
- Search intent translation was updated to better preserve compound media phrases:
  - added explicit normalization for terms like `사이버 펑크` -> `cyberpunk`
  - added a guard that rejects over-compressed translations when the original query contains a richer multi-word intent
- Artgrid page parsing was tightened:
  - generic Artgrid homepage / challenge HTML should no longer be mistaken for a real clip page during enrichment
  - this prevents homepage thumbnails/descriptions from overwriting real search result metadata
- Hover preview playback was changed to lazy attach on hover:
  - preview source is now attached on mouseenter
  - playback waits for media readiness instead of trying to play immediately from the render path
  - source is detached again on mouseleave
- Self-test script search step now retries to reduce flaky startup timing failures during local smoke tests

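One way to picture the over-compression guard is the word-count heuristic below. The `사이버 펑크` -> `cyberpunk` normalization comes from the note above; the ratio threshold, the three-word minimum, and both function names are assumptions for illustration and may differ from the real guard.

```python
NORMALIZED_TERMS = {"사이버 펑크": "cyberpunk"}  # example taken from the note above


def normalize_terms(query: str) -> str:
    """Collapse known compound phrases so they count as a single concept."""
    for phrase, replacement in NORMALIZED_TERMS.items():
        query = query.replace(phrase, replacement)
    return query


def is_over_compressed(original: str, translated: str, min_ratio: float = 0.5) -> bool:
    """Flag translations that keep far fewer words than the original intent had."""
    source_words = normalize_terms(original).split()
    translated_words = translated.split()
    if len(source_words) < 3:
        return False  # short queries have no rich multi-word intent to lose
    return len(translated_words) < len(source_words) * min_ratio
```

A rejected translation would then fall through to the next step of the fallback chain instead of being used as the search query.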
## Local Self-Test Workflow
- Primary command:
  - `bash scripts/selftest.sh`
- What it currently verifies:
  - Go formatting for touched backend files
  - Python syntax for worker + mock SearXNG
  - `go test ./...`
  - backend binary build
  - local app boot with temp SQLite/download dirs
  - `/healthz`
  - `/api/search` using a local mock SearXNG server
  - `/api/upload`
- Purpose:
  - allow safe local regression checks before push or container build without depending on real SearXNG, Gemini, or browser interaction

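For orientation, a mock SearXNG endpoint can be as small as the sketch below. This is not the content of `scripts/mock_searxng.py`; the handler, the `/search` path handling, port 8888, and the canned result are assumptions, with the response shaped after the `query`/`results` layout of SearXNG's JSON output.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict
from urllib.parse import parse_qs, urlparse


def build_payload(query: str) -> Dict[str, Any]:
    """Return one canned result in a SearXNG-like JSON shape (hypothetical data)."""
    return {
        "query": query,
        "results": [
            {
                "url": "https://elements.envato.com/mock-item-ABC123",
                "title": f"Mock result for {query}",
                "content": "Canned snippet for local smoke tests",
                "engine": "mock",
            }
        ],
    }


class MockSearxHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        parsed = urlparse(self.path)
        if parsed.path != "/search":
            self.send_error(404)
            return
        query = parse_qs(parsed.query).get("q", [""])[0]
        body = json.dumps(build_payload(query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: Any) -> None:
        pass  # keep self-test output quiet


def serve(port: int = 8888) -> None:
    """Block forever serving mock results on localhost."""
    HTTPServer(("127.0.0.1", port), MockSearxHandler).serve_forever()
```

Pointing `SEARXNG_BASE_URL` at such a server keeps the `/api/search` smoke test deterministic and offline.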
## Project Summary
- Project: `ai-media-hub`
- Goal: AI-assisted media discovery + ingest dashboard for Unraid
- Backend: Go
- Worker: Python + `yt-dlp` + `ffmpeg`
- Frontend: HTML + Vanilla JS + Tailwind CDN
- Database: SQLite
- Current search backend: `SearXNG`
- Current vision/ranking backend: `Gemini 2.5 Flash`
- Deployment target: single Docker container on Unraid
- Git remote: `https://git.savethenurse.com/savethenurse/ai-media-hub.git`

## Current Architecture
- `backend/main.go`
  App bootstrap, env loading, static frontend serving, route registration.
- `backend/handlers/api.go`
  Upload/download/search APIs, WebSocket progress broadcast, debug event broadcast.
- `backend/services/cse.go`
  Actual search backend service.
  Despite the filename, this is no longer Google CSE logic.
  It now wraps SearXNG search, source filtering, result enrichment, and preview asset parsing.
- `backend/services/ranker.go`
  Ranking, Gemini candidate evaluation, and recommendation merge logic (moved out of `handlers/api.go`).
- `backend/services/gemini.go`
  Query translation, deterministic query expansion helper, Gemini vision scoring.
  Also extracts the first video frame with `ffmpeg` when no thumbnail exists.
- `backend/models/db.go`
  SQLite init + download history.
- `worker/downloader.py`
  `yt-dlp` probe/download + ffmpeg clip extraction.
- `frontend/index.html`
  Main dashboard UI, preview modal, debug log panel.
- `frontend/app.js`
  API calls, WebSocket status bar, hover preview playback, debug logger panel, platform toggles.
- `frontend/style.css`
  Custom styles, clamp helpers, slider thumb styles, debug panel scrollbar styles.
- `unraid-template.xml`
  Unraid template for the current `git.savethenurse.com` image source.

## Search Flow: Current Implementation
1. User enters a query in Zone A.
2. Frontend sends `/api/search` with:
   - `query`
   - selected `platforms`
3. Backend translates the query to English in `GeminiService.TranslateQuery`.
   Fallback order:
   - Gemini translation
   - Google Translate HTTP fallback
   - small Korean media-term dictionary replacement
4. Backend builds deterministic English search variants in `GeminiService.ExpandQuery`.
5. Backend calls `SearchService.SearchMedia(...)`.
6. Search service queries SearXNG for:
   - `Envato`
   - `Artgrid`
   - `Google Video`
7. Search service filters source URLs aggressively:
   - Google Video: YouTube-only
   - Envato: `elements.envato.com` item URLs only
   - Artgrid: `artgrid.io/clip/...` and `artlist.io/stock-footage/clip/...` clip URLs only, normalized to `https://artgrid.io/clip/<id>/<slug>`
8. Search service enriches results:
   - Envato: parses item page `VideoObject` JSON-LD first, then meta tags such as `og:image`, for thumbnail and preview video URL
   - Artgrid: attempts clip API + HTML parsing for thumbnails and preview sources
9. Backend ranks all results locally.
10. Backend evaluates the top-ranked candidates with Gemini vision in batches.
11. Backend merges Gemini recommendations + fallback ranked items and returns JSON to frontend.
12. Frontend renders cards and hover previews.

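The merge in step 11 could work along the lines of the sketch below: Gemini recommendations first, then locally ranked items as filler, deduplicated by URL. The function name, the `url` key, and the result limit of 20 are assumptions; the actual merge rules live in `backend/services/ranker.go`.

```python
from typing import Dict, List, Set


def merge_recommendations(
    recommended: List[Dict[str, str]],
    ranked: List[Dict[str, str]],
    limit: int = 20,
) -> List[Dict[str, str]]:
    """Put Gemini picks first, then fill with ranked fallbacks, skipping duplicates."""
    merged: List[Dict[str, str]] = []
    seen: Set[str] = set()
    for item in list(recommended) + list(ranked):
        url = item["url"]
        if url in seen:
            continue
        seen.add(url)
        merged.append(item)
        if len(merged) >= limit:
            break
    return merged
```

If Gemini returns nothing (for example when the API key is missing), the result is simply the locally ranked list, so the search response never comes back empty for that reason alone.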
## Direct Downloader Flow: Current Implementation
1. User enters URL in Zone C.
2. Frontend checks duplicate history via `/api/history/check`.
3. Frontend loads preview metadata via `/api/download/preview`.
4. Preview modal opens with:
   - media preview
   - duration
   - crop dual-thumb slider
   - quality select
5. User confirms download.
6. Backend launches Python worker.
7. Worker downloads source with `yt-dlp`, clips with `ffmpeg`, emits JSON progress lines.
8. Backend rebroadcasts progress over WebSocket.

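The JSON-lines contract between worker and backend in steps 7 and 8 can be sketched as below. The `stage` and `percent` field names and both helper names are assumptions; the real schema is defined by `worker/downloader.py` and the Go handler.

```python
import json
from typing import Any, Dict, Optional


def emit_progress(stage: str, percent: float) -> str:
    """Worker side: serialize one progress event as a single JSON line."""
    return json.dumps({"stage": stage, "percent": round(percent, 1)})


def parse_progress_line(line: str) -> Optional[Dict[str, Any]]:
    """Backend side: accept only JSON objects, ignore any other stdout noise."""
    line = line.strip()
    if not line:
        return None
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # e.g. plain-text output from yt-dlp or ffmpeg
    return event if isinstance(event, dict) else None
```

Ignoring non-JSON lines keeps stray tool output from breaking the WebSocket rebroadcast.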
## Current Features Implemented
- [x] Project folder structure
- [x] Dockerfile
- [x] Gitea workflow
- [x] Unraid template
- [x] SQLite download history
- [x] File upload
- [x] yt-dlp direct downloader
- [x] Preview modal for direct download
- [x] Crop selection slider
- [x] Quality selection
- [x] WebSocket realtime progress
- [x] Search source toggles
- [x] Search card hover preview support
- [x] Debug log panel in frontend
- [x] `.log` download from debug panel

## Important Current Constraints / Known Problems
- Search backend has been rewritten multiple times and is still the main unstable area.
- Envato previews are parsed mainly from page HTML metadata / structured data.
- Artgrid previews are partially inferred from:
  - clip page HTML
  - clip API attempts
  - HLS preview handling in frontend
- Search relevance is still not considered stable enough.
- Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy.
- Frontend JavaScript was not linted with Node tooling in this environment because `node` is not installed here.
- Full browser-level preview validation is still not covered by the local self-test script.
- Search cards now separate source snippet from AI reason, but metadata fidelity still depends on source enrichment quality.
- Artgrid public pages inspected from this environment still did not expose a stable public preview video URL in HTML, so Artgrid hover-video support may remain partial until a browser-captured HTML/HAR sample reveals the real preview source pattern.

## Frontend Debug Logger
- UI button: bottom-right `Logs`
- Files:
  - `frontend/index.html`
  - `frontend/app.js`
  - `frontend/style.css`
- Logs currently capture:
  - API request / response
  - WebSocket progress messages
  - ignored WS debug messages
  - status updates
  - platform toggle state
  - preview source attach / detach
  - hover start / hover end
  - modal preview open / close
  - browser errors
  - promise rejections
  - backend debug broadcasts

## Current Environment Variables
- `APP_ROOT`
- `APP_ADDR`
- `SQLITE_PATH`
- `DOWNLOADS_DIR`
- `FRONTEND_DIR`
- `WORKER_SCRIPT`
- `SEARXNG_BASE_URL`
- `SEARXNG_GOOGLE_VIDEO_ENGINE`
- `SEARXNG_WEB_ENGINE`
- `GEMINI_API_KEY`

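A typed loader for a subset of these variables might look like the sketch below. The variable names come from the list above, but every default value shown is a placeholder assumption, and `Config`/`load_config` are hypothetical; the Go backend's actual defaults are not recorded in this file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    app_addr: str
    sqlite_path: str
    downloads_dir: str
    searxng_base_url: str
    gemini_api_key: str

    @property
    def gemini_enabled(self) -> bool:
        """Gemini steps are skipped when no key is configured."""
        return bool(self.gemini_api_key)


def load_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Read settings from the environment; all defaults here are placeholders."""
    env = os.environ if env is None else env
    return Config(
        app_addr=env.get("APP_ADDR", ":8080"),
        sqlite_path=env.get("SQLITE_PATH", "data/app.db"),
        downloads_dir=env.get("DOWNLOADS_DIR", "downloads"),
        # Strip a trailing slash so paths can be appended safely.
        searxng_base_url=env.get("SEARXNG_BASE_URL", "http://localhost:8888").rstrip("/"),
        gemini_api_key=env.get("GEMINI_API_KEY", ""),
    )
```

Accepting an explicit mapping makes the loader easy to exercise in tests without mutating the process environment.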
## Unraid Template Notes
- Current image repository in template:
  `git.savethenurse.com/savethenurse/ai-media-hub:latest`
- Current registry in template:
  `https://git.savethenurse.com`

## Docker / Build Notes
- Dockerfile uses:
  - Go build stage
  - static ffmpeg image stage
  - Python runtime stage
- Heavy apt ffmpeg install path was removed earlier to reduce build time.

## Git / Push Workflow Used So Far
- Branch: `main`
- Remote: `origin`
- All requested changes were committed and pushed incrementally to:
  `https://git.savethenurse.com/savethenurse/ai-media-hub.git`

## Recent Relevant Commits
- `8ed1e84` Add in-app debug log panel
- `823bf12` Reflect selected platforms in search status
- `cceb040` Update platform status and HLS previews
- `ad8afd5` Tighten source filters and add platform toggles
- `27000db` Hide overlays during hover preview
- `b78865d` Rewrite search flow and enrich preview assets
- `de24886` Filter non-English expansions and prefer stock sources
- `0bd458d` Boost translated search fallback and source priority

## Next Priority Areas
- [ ] Search backend quality stabilization
  The search service is the main unresolved area.
- [ ] Envato / Artgrid preview extraction hardening
- [ ] Search result relevance validation against real user queries
- [ ] Better matching between rendered description and actual linked asset
- [ ] Add browser-level verification for preview/HLS behavior
- [ ] Add more automated coverage for search ranking / filtering logic
- [ ] If Artgrid hover preview is still required, collect one real clip HTML/HAR from a browser session and derive a stable preview URL parser
- [ ] Add proper frontend build/lint step if Node becomes available

## Verified Locally In This Environment
- [x] `go build -o /tmp/ai-media-hub ./backend`
- [x] `go test ./...` (currently no broad test suite beyond the added fallback tests)
- [x] Python syntax check for worker + self-test helper
- [x] local app boot / `/healthz` through `scripts/selftest.sh`
- [x] local `/api/search` against mock SearXNG through `scripts/selftest.sh`
- [x] local `/api/upload` through `scripts/selftest.sh`
- [ ] full browser-level validation was not fully reproducible in this environment

## Short Handover Summary
- The codebase exists and runs.
- Upload/download features mostly exist.
- Search is implemented but is still the most fragile subsystem.
- A visible debug logging panel now exists in the web UI and should be used first when continuing work.