From b43886e950b221d4eaf64976f546a2b569db9ef4 Mon Sep 17 00:00:00 2001 From: AI Assistant Date: Mon, 16 Mar 2026 10:12:12 +0900 Subject: [PATCH] Add in-app result viewer and expand Gemini review --- TODO.md | 482 ++++++++++++++++++------------------- backend/handlers/api.go | 6 +- backend/services/gemini.go | 23 +- backend/services/ranker.go | 69 ++++-- frontend/app.js | 99 ++++++-- frontend/index.html | 41 +++- 6 files changed, 414 insertions(+), 306 deletions(-) diff --git a/TODO.md b/TODO.md index 2f90dc5..bba65fd 100644 --- a/TODO.md +++ b/TODO.md @@ -1,238 +1,138 @@ # AI Media Hub Handover ## Working Rule -- From this point on, every meaningful change should be appended to this file so the next handoff can reconstruct: +- This file is both backlog and handover log. +- Every meaningful change should record: - what changed - why it changed - how it was verified - - what remains risky -- Treat this file as both backlog and handover log, not just a static TODO list. + - what is still risky or incomplete +- If a push fails or a change remains local-only, that must be written here explicitly. -## Current Session Update (2026-03-13) -- Added a local self-test workflow before push/container build: - - `scripts/selftest.sh` - - `scripts/mock_searxng.py` -- Fixed Korean query translation fallback behavior: - - If `GEMINI_API_KEY` is missing or Gemini translation fails, the code now still attempts Google Translate fallback. - - If Google Translate fallback fails, dictionary replacement fallback still runs. -- Added Go tests for translation fallback logic. -- Fixed frontend HLS preview wiring: - - `hls.js` is now loaded in `frontend/index.html` - - frontend now tries `hls.js` first, then native HLS playback if available -- Corrected the practical local verification note: - - `go build ./backend` from repo root conflicts with the existing `backend/` directory name - - verified build command is now treated as `go build -o /tmp/... 
./backend` - -## Current Session Update (2026-03-13, Search/Preview Follow-up) -- Investigated a production search failure using downloaded frontend logs. -- Identified the main timeout cause: - - too many search results were being collected - - too many Gemini Vision batches were being evaluated sequentially - - backend debug messages were broadcasting oversized result payloads -- Applied search pipeline optimization: - - reduced per-source result caps - - reduced query fan-out for Google Video - - reduced enrichment cap - - limited Gemini Vision evaluation to top-ranked candidates only -- Improved Google Video filtering: - - added bans for music/BGM/trailer-style noise results -- Improved Envato enrichment fidelity: - - source page metadata is now preferred over search-engine proxy thumbnails - - source snippet/title are now taken from page metadata when available - - preview mp4 extraction now works via HTML/JSON-LD parsing - - added Python HTML fetch fallback for Cloudflare-challenged Envato pages because Go HTTP alone was receiving 403 challenge pages in testing -- Improved Artgrid fidelity: - - source page title/description/thumbnail are now preferred over search-engine snippets when available - - preview extraction is still not considered solved for all Artgrid clips because public HTML tested here did not expose a stable mp4/m3u8 URL -- Improved logging: - - backend search debug events now emit summaries, timings, source counts, preview counts, and Gemini batch stats instead of giant raw arrays - - frontend now logs raw non-JSON error bodies instead of collapsing them to `{}` on gateway/proxy failures -- Improved result rendering: - - search cards now show source snippet/description separately from AI reason to reduce confusion between asset metadata and Gemini commentary - -## Current Session Update (2026-03-13, Regression Fix) -- A regression was found after search optimization: - - Envato and Artgrid disappeared entirely for some real searches while 
Google Video still returned results -- Root cause: - - the first optimization reduced query-variant breadth too aggressively - - the first 3 query variants were not enough to recover Envato/Artgrid in some real SearXNG result sets -- Fix applied: - - search now runs in two stages - - stage 1 searches only the first few variants for speed - - stage 2 searches additional variants only for sources that still returned zero results -- Intent: - - keep the anti-timeout optimization - - recover Envato/Artgrid recall when the early pass is too narrow - -## Current Session Update (2026-03-13, HTML Snapshot Analysis) -- Used saved HTML snapshots supplied by the user for: - - Envato item page - - Artgrid clip page -- Findings: - - Envato page exposes clean `VideoObject` JSON-LD with: - - exact asset title - - rich description - - thumbnail URL - - preview mp4 URL - - Artgrid page exposes reliable meta fields for: - - title - - description - - thumbnail - - canonical URL - - Artgrid snapshot still does **not** expose a stable preview mp4 or m3u8 in the saved HTML or downloaded asset bundle inspected here -- Fixes applied from the snapshots: - - Envato enrichment now prefers `VideoObject` JSON-LD over generic meta tags - - Envato search cards should now align much better with the actual source asset and preview - - Artgrid title/description are now cleaned so Gemini/source text is less polluted by site suffixes and generic boilerplate -- Remaining limitation: - - Artgrid hover-video preview cannot be derived reliably from the provided snapshot alone - - if Artgrid preview video is still required, the next useful artifact is a browser HAR or DevTools network capture from an opened clip page - -## Current Session Update (2026-03-13, Collector Refactor) -- Refactored the search pipeline into source-specific collectors: - - `envatoCollector` - - `artgridCollector` - - `googleVideoCollector` -- `SearchService` now acts mainly as: - - collector orchestration - - query-pass control - 
- dedupe - - cross-source enrichment scheduling -- Goal of the refactor: - - reduce cross-source coupling - - make future source-specific fixes safer - - make it easier to replace or disable one source without destabilizing the others -- Current implementation note: - - collectors are still in Go code under backend services, but the responsibilities are now separated by source instead of one monolithic search loop - -## Current Session Update (2026-03-13, Artgrid Collector Fix + Ranker Split) -- Artgrid collector regression fixed: - - real search results can come back as `artlist.io/stock-footage/clip/.../` instead of only `artgrid.io/clip//...` - - renderable filtering was rejecting those URLs, which caused `SearXNG returned no renderable results.` for Artgrid-only searches -- Fix applied: - - Artgrid renderability now accepts both `artgrid.io` and `artlist.io/stock-footage/clip/...` clip URLs - - Artgrid result links are normalized into `https://artgrid.io/clip//` inside the collector flow before filtering/enrichment -- Refactor continued: - - ranking / Gemini candidate evaluation / recommendation merge logic moved out of `handlers/api.go` - - new service layer file: `backend/services/ranker.go` - - handler is now thinner and less coupled to search internals - -## Current Session Update (2026-03-13, 500 Fix) -- A server-side `request failed (500)` regression was found after the ranker split. 
-- Root cause: - - Gemini candidate cap logic returned `12` even when only `9` ranked candidates existed - - Gemini batch slicing then attempted to read beyond the available slice bounds -- Fix applied: - - `GeminiCandidateLimit` now never exceeds the real candidate count for totals up to 12 - - Gemini evaluation now stays within valid ranked slice bounds -- Effect: - - avoids backend 500 during the Gemini Vision evaluation stage for mid-sized result sets - -## Current Session Update (2026-03-13, Artgrid Query Coverage Fix) -- Another Artgrid no-results regression was found even after the collector URL matcher was widened. -- Root cause: - - Artgrid collector query generation still leaned on `site:artgrid.io/clip/` - - in practice, canonical clip pages can surface under `artlist.io/stock-footage/clip/...` - - so some Artgrid-only searches still returned zero renderable results even though the accept filter had been fixed -- Fix applied: - - Artgrid query generation now searches both: - - `site:artgrid.io/clip/` - - `site:artlist.io/stock-footage/clip/` -- Effect: - - improves Artgrid recall in SearXNG result sets that favor canonical Artlist URLs over Artgrid URLs - -## Current Session Update (2026-03-16, Query / Preview Follow-up) -- Search intent translation was updated to better preserve compound media phrases: - - added explicit normalization for terms like `사이버 펑크` -> `cyberpunk` - - added a guard that rejects over-compressed translations when the original query contains a richer multi-word intent -- Artgrid page parsing was tightened: - - generic Artgrid homepage / challenge HTML should no longer be mistaken for a real clip page during enrichment - - this prevents homepage thumbnails/descriptions from overwriting real search result metadata -- Hover preview playback was changed to lazy attach on hover: - - preview source is now attached on mouseenter - - playback waits for media readiness instead of trying to play immediately from the render path - - source 
is detached again on mouseleave -- Self-test script search step now retries to reduce flaky startup timing failures during local smoke tests - -## Local Self-Test Workflow -- Primary command: - - `bash scripts/selftest.sh` -- What it currently verifies: - - Go formatting for touched backend files - - Python syntax for worker + mock SearXNG - - `go test ./...` - - backend binary build - - local app boot with temp SQLite/download dirs - - `/healthz` - - `/api/search` using a local mock SearXNG server - - `/api/upload` -- Purpose: - - allow safe local regression checks before push or container build without depending on real SearXNG, Gemini, or browser interaction - -## Project Summary +## Current State At A Glance - Project: `ai-media-hub` - Goal: AI-assisted media discovery + ingest dashboard for Unraid - Backend: Go - Worker: Python + `yt-dlp` + `ffmpeg` - Frontend: HTML + Vanilla JS + Tailwind CDN - Database: SQLite -- Current search backend: `SearXNG` -- Current vision/ranking backend: `Gemini 2.5 Flash` +- Search backend: `SearXNG` +- AI translation / visual ranking: `Gemini 2.5 Flash` - Deployment target: single Docker container on Unraid - Git remote: `https://git.savethenurse.com/savethenurse/ai-media-hub.git` +## Current Status Summary +- Upload / direct download flow is implemented and broadly usable. +- Search is implemented end-to-end and now refactored into source-specific collectors. +- Search remains the main unstable subsystem. +- Envato metadata and preview extraction are much stronger than before. +- Artgrid metadata fidelity is improved, but stable public hover-video preview extraction is still not solved. +- Frontend now logs more useful API and debug information than earlier versions. +- A local self-test workflow now exists and should be run before container builds or pushes. 
+ ## Current Architecture - `backend/main.go` - App bootstrap, env loading, static frontend serving, route registration + - app bootstrap + - env loading + - static frontend serving + - route registration - `backend/handlers/api.go` - Upload/download/search APIs, WebSocket progress broadcast, debug event broadcast + - upload / download / search APIs + - WebSocket progress broadcast + - debug event broadcast + - search request orchestration only, with ranking/Gemini logic mostly moved out - `backend/services/cse.go` - Actual search backend service - Despite filename, this is no longer Google CSE logic - It now wraps SearXNG search, source filtering, result enrichment, preview asset parsing + - SearXNG querying + - shared search helpers + - source-specific enrich helpers + - URL filtering / parsing utilities +- `backend/services/search_collectors.go` + - source-specific collectors: + - `envatoCollector` + - `artgridCollector` + - `googleVideoCollector` +- `backend/services/ranker.go` + - ranking + - Gemini candidate cap logic + - Gemini batch evaluation wrapper + - recommendation merge logic - `backend/services/gemini.go` - Query translation, deterministic query expansion helper, Gemini vision scoring - Also extracts first video frame with `ffmpeg` when no thumbnail exists + - query translation + - deterministic query expansion + - Gemini vision scoring + - video frame extraction via `ffmpeg` when needed - `backend/models/db.go` - SQLite init + download history + - SQLite init + - download history - `worker/downloader.py` - `yt-dlp` probe/download + ffmpeg clip extraction + - `yt-dlp` probe / download + - `ffmpeg` clip extraction - `frontend/index.html` - Main dashboard UI, preview modal, debug log panel + - main dashboard UI + - result viewer modal + - preview modal + - debug log panel - `frontend/app.js` - API calls, WebSocket status bar, hover preview playback, debug logger panel, platform toggles + - API calls + - WebSocket status bar + - result viewer modal + - 
hover preview playback + - direct download handoff for Google Video results + - debug logger panel + - platform toggles - `frontend/style.css` - Custom styles, clamp helpers, slider thumb styles, debug panel scrollbar styles + - custom styles + - clamp helpers + - slider thumb styles + - debug panel scrollbar styles +- `scripts/selftest.sh` + - local smoke test flow +- `scripts/mock_searxng.py` + - local mock SearXNG used by self-test - `unraid-template.xml` - Unraid template for current `git.savethenurse.com` image source + - Unraid template for current image source ## Search Flow: Current Implementation 1. User enters a query in Zone A. 2. Frontend sends `/api/search` with: - `query` - selected `platforms` -3. Backend translates the query to English in `GeminiService.TranslateQuery`. - Fallback order: - - Gemini translation +3. Backend translates the query in `GeminiService.TranslateQuery`. + - Gemini translation if available - Google Translate HTTP fallback - - small Korean media-term dictionary replacement + - Korean media-term dictionary fallback + - explicit normalization for known compound phrases such as `사이버 펑크` -> `cyberpunk` 4. Backend builds deterministic English search variants in `GeminiService.ExpandQuery`. -5. Backend calls `SearchService.SearchMedia(...)`. -6. Search service queries SearXNG for: - - `Envato` - - `Artgrid` - - `Google Video` -7. Search service filters source URLs aggressively: - - Google Video: YouTube-only +5. `SearchService.SearchMedia(...)` orchestrates source-specific collectors. +6. Collectors query SearXNG separately for: + - Envato + - Artgrid + - Google Video +7. Each collector applies source-specific acceptance logic. + - Google Video: YouTube-only plus noise filtering - Envato: `elements.envato.com` item URLs only - - Artgrid: `artgrid.io/clip/...` only -8. 
Search service enriches results: - - Envato: parses item page HTML for `og:image` and preview video URL - - Artgrid: attempts clip API + HTML parsing for thumbnails and preview sources -9. Backend ranks all results locally. -10. Backend evaluates all ranked results with Gemini vision in batches. -11. Backend merges Gemini recommendations + fallback ranked items and returns JSON to frontend. -12. Frontend renders cards and hover previews. + - Artgrid: accepts both: + - `artgrid.io/clip/...` + - `artlist.io/stock-footage/clip/...` +8. Artgrid canonical links are normalized to: + - `https://artgrid.io/clip//` +9. Results are enriched source-by-source. + - Envato: + - `VideoObject` JSON-LD preferred + - page meta preferred over search-engine proxy thumbnail + - preview mp4 extraction via JSON-LD / HTML parsing + - Python HTML fetch fallback used when Go HTTP fetch gets Cloudflare challenge pages + - Artgrid: + - page title / description / thumbnail cleaning + - homepage / challenge HTML is now rejected so generic site metadata does not overwrite clip metadata + - preview video extraction still not stable +10. Ranked results are passed through the shared ranker. +11. All ranked candidates are evaluated with Gemini Vision in batches. +12. Merge order now prefers: + - Gemini recommended items + - Gemini-reviewed non-recommended items + - keyword fallback items only if Gemini output is incomplete +13. Frontend renders cards, result viewer modal, and hover previews. ## Direct Downloader Flow: Current Implementation 1. User enters URL in Zone C. @@ -248,6 +148,45 @@ 7. Worker downloads source with `yt-dlp`, clips with `ffmpeg`, emits JSON progress lines. 8. Backend rebroadcasts progress over WebSocket. +## Major Work Completed So Far +- Added local self-test workflow: + - `scripts/selftest.sh` + - `scripts/mock_searxng.py` +- Fixed translation fallback when Gemini key is missing. +- Added tests for translation fallback logic. 
+- Added HLS frontend wiring: + - `hls.js` script + - native HLS fallback +- Reduced search timeout risk by: + - limiting collector result caps + - limiting enrichment scope + - limiting Gemini Vision evaluation scope + - replacing oversized raw debug result payloads with summaries +- Improved Google Video filtering: + - rejects more music / trailer / BGM style noise +- Improved Envato fidelity: + - real title / description / thumbnail / preview from source page +- Improved Artgrid fidelity: + - accepts canonical Artlist URLs + - normalizes Artgrid clip URLs + - cleans title / description better +- Refactored search into source-specific collectors. +- Moved ranking and Gemini batch handling into `backend/services/ranker.go`. +- Fixed server-side 500 caused by Gemini candidate cap exceeding available ranked candidates. +- Improved frontend logging: + - raw non-JSON error body logging + - more compact debug payload rendering +- Changed hover preview playback to lazy attach on hover: + - attach source on `mouseenter` + - wait for readiness before `play()` + - detach source on `mouseleave` +- Added in-app result viewer modal for search results: + - results now open in a modal instead of directly opening a new tab + - modal shows embedded site iframe, external open button, source summary, and full AI note +- Google Video results can now jump directly into the existing direct-download preview / crop flow from the result viewer +- Gemini reason generation is now intended to be Korean-first for readability +- Gemini Vision evaluation now covers all ranked results instead of only a top subset + ## Current Features Implemented - [x] Project folder structure - [x] Dockerfile @@ -262,22 +201,34 @@ - [x] WebSocket realtime progress - [x] Search source toggles - [x] Search card hover preview support +- [x] Result viewer modal for search results +- [x] Google Video direct-download handoff from search results - [x] Debug log panel in frontend - [x] `.log` download from debug panel 
+- [x] Local self-test workflow +- [x] Source-specific search collectors +- [x] Shared ranker service layer ## Important Current Constraints / Known Problems -- Search backend has been rewritten multiple times and is still the main unstable area. -- Envato previews are parsed mainly from page HTML metadata / structured data. -- Artgrid previews are partially inferred from: - - clip page HTML - - clip API attempts - - HLS preview handling in frontend -- Search relevance is still not considered stable enough. -- Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy. -- Frontend JavaScript was not linted with Node tooling in this environment because `node` is not installed here. -- Full browser-level preview validation is still not covered by the local self-test script. +- Search backend quality is still the most fragile subsystem. +- Search relevance is still heuristic-heavy and not yet benchmarked against a durable real-query set. +- Embedded result viewer uses an iframe, so some third-party sites may still block embedding with `X-Frame-Options` / CSP. +- Artgrid hover-video preview is still partial / unresolved: + - provided Artgrid HTML snapshots and downloaded asset bundles did not expose a stable public preview mp4/m3u8 URL + - public HTML often only exposes title / description / thumbnail / canonical URL +- Artgrid can still be sensitive to how SearXNG indexes canonical domains. +- Full browser-level validation is still not covered by local self-test. +- Frontend JavaScript still has no Node-based lint/build step in this environment. - Search cards now separate source snippet from AI reason, but metadata fidelity still depends on source enrichment quality. -- Artgrid public pages inspected from this environment still did not expose a stable public preview video URL in HTML, so Artgrid hover-video support may remain partial until a browser-captured HTML/HAR sample reveals the real preview source pattern. 
+- Gemini notes are now intended to be Korean, but final output quality still depends on Gemini response consistency. +- The local self-test script is better than before, but it is still a smoke test, not full integration coverage. + +## Current Risks Around Search Quality +- Upstream SearXNG quality still controls the candidate pool. +- Gemini Vision can only rerank the candidates it receives. +- If source enrichment fails, Gemini may still judge a weaker proxy thumbnail or fallback image. +- Compound Korean intents are better handled now, but the translation path is still heuristic and can drift on niche concepts. +- Running Gemini Vision across all ranked results increases latency and token usage compared with the earlier capped approach. ## Frontend Debug Logger - UI button: bottom-right `Logs` @@ -291,8 +242,10 @@ - ignored WS debug messages - status updates - platform toggle state + - result viewer modal open / close - preview source attach / detach - hover start / hover end + - hover play errors - modal preview open / close - browser errors - promise rejections @@ -310,57 +263,80 @@ - `SEARXNG_WEB_ENGINE` - `GEMINI_API_KEY` -## Unraid Template Notes -- Current image repository in template: - `git.savethenurse.com/savethenurse/ai-media-hub:latest` -- Current registry in template: - `https://git.savethenurse.com` - -## Docker / Build Notes -- Dockerfile uses: - - Go build stage - - static ffmpeg image stage - - Python runtime stage -- Heavy apt ffmpeg install path was removed earlier to reduce build time. 
- -## Git / Push Workflow Used So Far -- Branch: `main` -- Remote: `origin` -- All requested changes were committed and pushed incrementally to: - `https://git.savethenurse.com/savethenurse/ai-media-hub.git` - -## Recent Relevant Commits -- `8ed1e84` Add in-app debug log panel -- `823bf12` Reflect selected platforms in search status -- `cceb040` Update platform status and HLS previews -- `ad8afd5` Tighten source filters and add platform toggles -- `27000db` Hide overlays during hover preview -- `b78865d` Rewrite search flow and enrich preview assets -- `de24886` Filter non-English expansions and prefer stock sources -- `0bd458d` Boost translated search fallback and source priority - -## Next Priority Areas -- [ ] Search backend quality stabilization - The search service is the main unresolved area. -- [ ] Envato / Artgrid preview extraction hardening -- [ ] Search result relevance validation against real user queries -- [ ] Better matching between rendered description and actual linked asset -- [ ] Add browser-level verification for preview/HLS behavior -- [ ] Add more automated coverage for search ranking / filtering logic -- [ ] If Artgrid hover preview is still required, collect one real clip HTML/HAR from a browser session and derive a stable preview URL parser -- [ ] Add proper frontend build/lint step if Node becomes available +## Local Self-Test Workflow +- Primary command: + - `bash scripts/selftest.sh` +- What it currently verifies: + - Go formatting for touched backend files + - Python syntax for worker + mock SearXNG + - `go test ./...` + - backend binary build + - local app boot with temp SQLite/download dirs + - `/healthz` + - `/api/search` using local mock SearXNG + - `/api/upload` +- Notes: + - search step now retries to reduce startup timing flakiness + - this is a smoke test, not a browser-level verification suite ## Verified Locally In This Environment - [x] `go build -o /tmp/ai-media-hub ./backend` -- [x] `go test ./...` (currently no broad test 
suite beyond the added fallback tests) +- [x] `go test ./...` - [x] Python syntax check for worker + self-test helper - [x] local app boot / `/healthz` through `scripts/selftest.sh` - [x] local `/api/search` against mock SearXNG through `scripts/selftest.sh` - [x] local `/api/upload` through `scripts/selftest.sh` - [ ] full browser-level validation was not fully reproducible in this environment +## Unraid / Docker / CI Notes +- Dockerfile uses: + - Go build stage + - static ffmpeg image stage + - Python runtime stage +- Heavy apt ffmpeg install path was removed earlier to reduce build time. +- Gitea workflow builds and pushes: + - `git.savethenurse.com/savethenurse/ai-media-hub:latest` + - `git.savethenurse.com/savethenurse/ai-media-hub:${{ github.sha }}` + +## Recent Relevant Commits +- `9637b76` Improve query intent handling and preview playback +- `6d9391b` Expand Artgrid query coverage to artlist canonical URLs +- `d8cc32e` Fix Gemini candidate cap causing search 500s +- `e426261` Fix Artgrid collector matching and split ranker +- `5aebbef` Refactor search into source-specific collectors +- `ae091c5` Improve source parsing from Envato and Artgrid HTML +- `06ea4f3` Restore Envato and Artgrid fallback search breadth +- `7dfb1ad` Stabilize search pipeline and improve preview diagnostics +- `6f3149a` Add local self-test flow and fix fallback regressions +- `f968458` Rewrite TODO as project handover + +## Git / Push Status +- Last pushed commit known in earlier work: + - `6d9391b` was pushed successfully +- Local-only work currently exists: + - `9637b76 Improve query intent handling and preview playback` +- Push status for `9637b76`: + - not pushed + - remote rejected the push with: + - `remote unpack failed: unable to create temporary object directory` + - `remote rejected main -> main (unpacker error)` +- Interpretation: + - current blocker appears to be on the remote git server side, not a local git history issue + +## Highest-Value Next Steps +- [ ] Re-try push 
of local commit once remote git storage/unpacker issue is resolved +- [ ] Build collector-specific integration tests with recorded SearXNG samples +- [ ] Separate source enrichment tests from live network behavior using local fixtures +- [ ] Add a browser-level preview validation path, especially for hover video +- [ ] If Artgrid hover preview is still required, obtain one real clip HAR / DevTools network export and derive a stable preview URL parser +- [ ] Build a small fixed real-query benchmark set to evaluate search quality before further tuning +- [ ] If frontend tooling becomes available, add lint/build checks + ## Short Handover Summary -- The codebase exists and runs. +- The codebase runs. - Upload/download features mostly exist. -- Search is implemented but is still the most fragile subsystem. -- A visible debug logging panel now exists in the web UI and should be used first when continuing work. +- Search has been significantly refactored and is in better shape than before, but is still the main unstable area. +- Envato source fidelity is much better than earlier. +- Artgrid source fidelity is better, but preview-video extraction is still incomplete. +- There is now a local self-test workflow. +- There is one known local commit that has not been pushed because the remote repo reported an unpacker error. 
diff --git a/backend/handlers/api.go b/backend/handlers/api.go index af5b5b5..a6f070e 100644 --- a/backend/handlers/api.go +++ b/backend/handlers/api.go @@ -320,7 +320,7 @@ func (a *App) searchMedia(c *gin.Context) { } scored := services.RankSearchResults(rankQuery, results) a.debug("search ranked summary", summarizeSearchResults(scored, time.Since(started), services.GeminiCandidateLimit(len(scored)), "")) - a.Hub.Broadcast("progress", gin.H{"type": "search", "status": "analyzing top candidate visuals with Gemini Vision", "progress": 75}) + a.Hub.Broadcast("progress", gin.H{"type": "search", "status": "analyzing all candidate visuals with Gemini Vision", "progress": 75}) recommended, geminiStats := services.EvaluateAllCandidatesWithGemini(a.GeminiService, req.Query, scored) a.debug("search gemini evaluation", geminiStats) err = nil @@ -337,8 +337,8 @@ func (a *App) searchMedia(c *gin.Context) { ThumbnailURL: result.ThumbnailURL, PreviewVideoURL: result.PreviewVideoURL, Source: result.Source, - Reason: "Keyword-ranked result added without extra Gemini vision tokens.", - Recommended: true, + Reason: "Gemini Vision 응답이 부족해 키워드 기준으로 보강된 결과입니다.", + Recommended: false, }) } warning := err.Error() diff --git a/backend/services/gemini.go b/backend/services/gemini.go index 06af951..32e4192 100644 --- a/backend/services/gemini.go +++ b/backend/services/gemini.go @@ -154,7 +154,8 @@ func (g *GeminiService) Recommend(query string, candidates []SearchResult) ([]AI { "text": `Analyze the provided images for the user's search intent. Return JSON only in this shape: {"recommendations":[{"index":0,"reason":"short reason","recommended":true}]} -Mark only the best matches as recommended=true. Keep reasons concise. Recommend up to 8 items. +Return one entry for every analyzed candidate. Use Korean for every reason. Keep reasons concise but specific enough to explain usefulness. +Mark the strongest matches as recommended=true and weaker matches as recommended=false. 
Prefer cinematic b-roll, stock footage, editorial footage, clean composition, usable establishing shots, and professional media thumbnails. Avoid clickbait faces, exaggerated expressions, meme aesthetics, low-information thumbnails, sensational text overlays, or gossip-style imagery. Favor thumbnails that look directly useful for media editing and footage sourcing. @@ -230,7 +231,7 @@ User query: ` + query, recommendations := make([]AIRecommendation, 0, len(parsed.Recommendations)) for _, rec := range parsed.Recommendations { - if rec.Index < 0 || rec.Index >= len(candidates) || !rec.Recommended { + if rec.Index < 0 || rec.Index >= len(candidates) { continue } src := candidates[rec.Index] @@ -241,13 +242,13 @@ User query: ` + query, ThumbnailURL: src.ThumbnailURL, PreviewVideoURL: src.PreviewVideoURL, Source: src.Source, - Reason: rec.Reason, - Recommended: true, + Reason: normalizeKoreanReason(rec.Reason), + Recommended: rec.Recommended, }) } if len(recommendations) == 0 { - for _, candidate := range candidates[:min(4, len(candidates))] { + for _, candidate := range candidates[:min(8, len(candidates))] { recommendations = append(recommendations, AIRecommendation{ Title: candidate.Title, Link: candidate.Link, @@ -255,8 +256,8 @@ User query: ` + query, ThumbnailURL: candidate.ThumbnailURL, PreviewVideoURL: candidate.PreviewVideoURL, Source: candidate.Source, - Reason: "Fallback result because Gemini returned no recommended items.", - Recommended: true, + Reason: "Gemini Vision 평가를 받지 못해 키워드 기준으로 보강된 결과입니다.", + Recommended: false, }) } } @@ -412,6 +413,14 @@ func truncateForError(text string, limit int) string { return trimmed[:limit] + "..." } +func normalizeKoreanReason(reason string) string { + trimmed := strings.TrimSpace(reason) + if trimmed == "" { + return "시각 정보가 제한적이지만 검색 의도와의 관련성을 기준으로 평가했습니다." 
+ } + return trimmed +} + func buildSearchQueries(originalQuery, englishQuery string) []string { base := strings.TrimSpace(englishQuery) if base == "" { diff --git a/backend/services/ranker.go b/backend/services/ranker.go index b8dd79d..eed5f0f 100644 --- a/backend/services/ranker.go +++ b/backend/services/ranker.go @@ -3,6 +3,7 @@ package services import ( "sort" "strings" + "sync" ) type GeminiBatchStats struct { @@ -80,42 +81,63 @@ func RankSearchResults(query string, results []SearchResult) []SearchResult { } func GeminiCandidateLimit(total int) int { - switch { - case total <= 12: - return total - case total <= 16: - return 12 - default: - return 16 - } + return total } func EvaluateAllCandidatesWithGemini(service *GeminiService, query string, ranked []SearchResult) ([]AIRecommendation, GeminiBatchStats) { const chunkSize = 8 + const maxConcurrentBatches = 2 limit := GeminiCandidateLimit(len(ranked)) stats := GeminiBatchStats{ CandidateCap: limit, Requested: min(limit, len(ranked)), } - merged := make([]AIRecommendation, 0, len(ranked)) - seen := map[string]bool{} + type batchResult struct { + index int + recommendations []AIRecommendation + err error + } + batches := make([][]SearchResult, 0, (limit+chunkSize-1)/chunkSize) for start := 0; start < limit; start += chunkSize { end := start + chunkSize if end > limit { end = limit } - batch := ranked[start:end] - stats.Batches++ - recommended, err := service.Recommend(query, batch) - if err != nil { + batches = append(batches, ranked[start:end]) + } + stats.Batches = len(batches) + + results := make([]batchResult, len(batches)) + var wg sync.WaitGroup + sem := make(chan struct{}, maxConcurrentBatches) + for idx, batch := range batches { + wg.Add(1) + go func(batchIndex int, candidates []SearchResult) { + defer wg.Done() + sem <- struct{}{} + defer func() { <-sem }() + recommended, err := service.Recommend(query, candidates) + results[batchIndex] = batchResult{ + index: batchIndex, + recommendations: recommended, 
+ err: err, + } + }(idx, batch) + } + wg.Wait() + + merged := make([]AIRecommendation, 0, len(ranked)) + seen := map[string]bool{} + for _, batch := range results { + if batch.err != nil { stats.Failed++ if len(stats.Errors) < 5 { - stats.Errors = append(stats.Errors, err.Error()) + stats.Errors = append(stats.Errors, batch.err.Error()) } continue } stats.Succeeded++ - for _, item := range recommended { + for _, item := range batch.recommendations { if item.Link == "" || seen[item.Link] { continue } @@ -132,6 +154,9 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult, seen := map[string]bool{} for _, item := range recommended { + if !item.Recommended { + continue + } if item.Link == "" || seen[item.Link] { continue } @@ -139,6 +164,14 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult, merged = append(merged, item) } + for _, item := range recommended { + if item.Recommended || item.Link == "" || seen[item.Link] || len(merged) >= limit { + continue + } + seen[item.Link] = true + merged = append(merged, item) + } + for _, item := range ranked { if len(merged) >= limit || item.Link == "" || seen[item.Link] { continue @@ -151,8 +184,8 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult, ThumbnailURL: item.ThumbnailURL, PreviewVideoURL: item.PreviewVideoURL, Source: item.Source, - Reason: "Keyword-ranked result added without extra Gemini vision tokens.", - Recommended: true, + Reason: "Gemini Vision 응답이 부족해 키워드 기준으로 보강된 결과입니다.", + Recommended: false, }) } return merged diff --git a/frontend/app.js b/frontend/app.js index f1fb8ea..075fc03 100644 --- a/frontend/app.js +++ b/frontend/app.js @@ -37,12 +37,22 @@ const clearLogs = document.getElementById("clearLogs"); const downloadLogs = document.getElementById("downloadLogs"); const debugLogList = document.getElementById("debugLogList"); const debugSummary = document.getElementById("debugSummary"); +const resultModal = 
document.getElementById("resultModal"); +const resultModalTitle = document.getElementById("resultModalTitle"); +const resultModalSource = document.getElementById("resultModalSource"); +const resultModalSnippet = document.getElementById("resultModalSnippet"); +const resultModalReason = document.getElementById("resultModalReason"); +const resultModalFrame = document.getElementById("resultModalFrame"); +const resultModalOpenExternal = document.getElementById("resultModalOpenExternal"); +const resultModalDownload = document.getElementById("resultModalDownload"); +const closeResultModal = document.getElementById("closeResultModal"); let pendingDownload = null; let cropStart = 0; let cropEnd = 0; let cropMax = 0; let activeThumb = null; +let activeResultItem = null; const activePlatforms = new Set(["envato", "artgrid", "google video"]); const hlsInstances = new WeakMap(); const debugEntries = []; @@ -319,13 +329,13 @@ function renderResults(results) { const image = node.querySelector("img"); const previewVideo = node.querySelector(".preview-hover"); const overlays = node.querySelectorAll(".preview-overlay"); - node.href = item.link; image.src = item.thumbnailUrl || "https://placehold.co/1280x720/0a0a0a/ffffff?text=Preview"; image.alt = item.title; node.querySelector("h3").textContent = item.title; node.querySelector(".result-snippet").textContent = item.snippet || item.reason || item.source || ""; - node.querySelector(".result-reason").textContent = item.reason ? `AI note: ${item.reason}` : ""; + node.querySelector(".result-reason").textContent = item.reason ? 
`AI 노트: ${item.reason}` : ""; node.querySelector(".source-badge").textContent = item.source; + node.addEventListener("click", () => openResultModal(item)); previewVideo.poster = item.thumbnailUrl || ""; if (item.previewVideoUrl) { const mediaArea = node.querySelector(".relative"); @@ -347,6 +357,53 @@ function renderResults(results) { } } +async function prepareDirectDownload(targetUrl) { + downloadResult.textContent = "checking duplicate history..."; + const dup = await api(`/api/history/check?url=${encodeURIComponent(targetUrl)}`); + let force = false; + if (dup.exists) { + force = window.confirm("동일 URL 다운로드 이력이 있습니다. 계속 진행할까요?"); + if (!force) { + downloadResult.textContent = "cancelled"; + return; + } + } + pendingDownload = { url: targetUrl, force }; + downloadResult.textContent = "loading preview..."; + const preview = await api("/api/download/preview", { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ url: targetUrl }), + }); + openPreviewModal(preview); + downloadResult.textContent = "preview loaded"; +} + +function openResultModal(item) { + activeResultItem = item; + resultModalTitle.textContent = item.title || "Untitled"; + resultModalSource.textContent = item.source || ""; + resultModalSnippet.textContent = item.snippet || "원본 페이지에서 사용할 수 있는 설명이 없습니다."; + resultModalReason.textContent = item.reason || "AI 노트가 없습니다."; + resultModalFrame.src = item.link || "about:blank"; + resultModalOpenExternal.href = item.link || "#"; + const canDirectDownload = item.source === "Google Video" && item.link; + resultModalDownload.classList.toggle("hidden", !canDirectDownload); + resultModal.classList.remove("hidden"); + resultModal.classList.add("flex"); + logEvent("result:modal:open", { title: item.title, source: item.source, link: item.link }); +} + +function closeResultViewer() { + if (!resultModal.classList.contains("hidden")) { + logEvent("result:modal:close", { title: activeResultItem?.title || "" }); + } + 
activeResultItem = null; + resultModalFrame.src = "about:blank"; + resultModal.classList.add("hidden"); + resultModal.classList.remove("flex"); +} + searchForm.addEventListener("submit", async (event) => { event.preventDefault(); setStatus("preparing search", 5); @@ -458,26 +515,8 @@ fileInput.addEventListener("change", async () => { downloadForm.addEventListener("submit", async (event) => { event.preventDefault(); - downloadResult.textContent = "checking duplicate history..."; try { - const dup = await api(`/api/history/check?url=${encodeURIComponent(downloadUrl.value)}`); - let force = false; - if (dup.exists) { - force = window.confirm("동일 URL 다운로드 이력이 있습니다. 계속 진행할까요?"); - if (!force) { - downloadResult.textContent = "cancelled"; - return; - } - } - pendingDownload = { url: downloadUrl.value, force }; - downloadResult.textContent = "loading preview..."; - const preview = await api("/api/download/preview", { - method: "POST", - headers: { "Content-Type": "application/json" }, - body: JSON.stringify({ url: downloadUrl.value }), - }); - openPreviewModal(preview); - downloadResult.textContent = "preview loaded"; + await prepareDirectDownload(downloadUrl.value); } catch (error) { downloadResult.textContent = error.message; logEvent("download:preview:error", { message: error.message, data: error.data || null }); @@ -509,6 +548,24 @@ confirmDownload.addEventListener("click", async () => { }); closePreviewModal.addEventListener("click", closeModal); +closeResultModal.addEventListener("click", closeResultViewer); +resultModal.addEventListener("click", (event) => { + if (event.target === resultModal) { + closeResultViewer(); + } +}); +resultModalDownload.addEventListener("click", async () => { + // closeResultViewer() resets activeResultItem to null, so capture the link first. + const link = activeResultItem?.link; + if (!link) return; + try { + closeResultViewer(); + await prepareDirectDownload(link); + } catch (error) { + downloadResult.textContent = error.message; + logEvent("download:preview:error", { message: error.message, data: error.data ||
null, source: activeResultItem?.source || "" }); + } +}); previewModal.addEventListener("click", (event) => { if (event.target === previewModal) { closeModal(); diff --git a/frontend/index.html b/frontend/index.html index 31b7338..2f5a833 100644 --- a/frontend/index.html +++ b/frontend/index.html @@ -149,8 +149,41 @@ + + - +