Add in-app result viewer and expand Gemini review

2026-03-16 10:12:12 +09:00
parent 9637b761bd
commit b43886e950
6 changed files with 414 additions and 306 deletions
@@ -1,238 +1,138 @@
 # AI Media Hub Handover

 ## Working Rule
- From this point on, every meaningful change should be appended to this file so the next handoff can reconstruct:
+- This file is both backlog and handover log.
+- Every meaningful change should record:
  - what changed
  - why it changed
  - how it was verified
-  - what remains risky
- Treat this file as both backlog and handover log, not just a static TODO list.
+  - what is still risky or incomplete
+- If a push fails or a change remains local-only, that must be written here explicitly.

-## Current Session Update (2026-03-13)
- Added a local self-test workflow before push/container build:
-  - `scripts/selftest.sh`
-  - `scripts/mock_searxng.py`
- Fixed Korean query translation fallback behavior:
-  - If `GEMINI_API_KEY` is missing or Gemini translation fails, the code now still attempts Google Translate fallback.
-  - If Google Translate fallback fails, dictionary replacement fallback still runs.
- Added Go tests for translation fallback logic.
- Fixed frontend HLS preview wiring:
-  - `hls.js` is now loaded in `frontend/index.html`
-  - frontend now tries `hls.js` first, then native HLS playback if available
- Corrected the practical local verification note:
-  - `go build ./backend` from repo root conflicts with the existing `backend/` directory name
-  - verified build command is now treated as `go build -o /tmp/... ./backend`
-
-## Current Session Update (2026-03-13, Search/Preview Follow-up)
- Investigated a production search failure using downloaded frontend logs.
- Identified the main timeout cause:
-  - too many search results were being collected
-  - too many Gemini Vision batches were being evaluated sequentially
-  - backend debug messages were broadcasting oversized result payloads
- Applied search pipeline optimization:
-  - reduced per-source result caps
-  - reduced query fan-out for Google Video
-  - reduced enrichment cap
-  - limited Gemini Vision evaluation to top-ranked candidates only
- Improved Google Video filtering:
-  - added bans for music/BGM/trailer-style noise results
- Improved Envato enrichment fidelity:
-  - source page metadata is now preferred over search-engine proxy thumbnails
-  - source snippet/title are now taken from page metadata when available
-  - preview mp4 extraction now works via HTML/JSON-LD parsing
-  - added Python HTML fetch fallback for Cloudflare-challenged Envato pages because Go HTTP alone was receiving 403 challenge pages in testing
- Improved Artgrid fidelity:
-  - source page title/description/thumbnail are now preferred over search-engine snippets when available
-  - preview extraction is still not considered solved for all Artgrid clips because public HTML tested here did not expose a stable mp4/m3u8 URL
- Improved logging:
-  - backend search debug events now emit summaries, timings, source counts, preview counts, and Gemini batch stats instead of giant raw arrays
-  - frontend now logs raw non-JSON error bodies instead of collapsing them to `{}` on gateway/proxy failures
- Improved result rendering:
-  - search cards now show source snippet/description separately from AI reason to reduce confusion between asset metadata and Gemini commentary
-
-## Current Session Update (2026-03-13, Regression Fix)
- A regression was found after search optimization:
-  - Envato and Artgrid disappeared entirely for some real searches while Google Video still returned results
- Root cause:
-  - the first optimization reduced query-variant breadth too aggressively
-  - the first 3 query variants were not enough to recover Envato/Artgrid in some real SearXNG result sets
- Fix applied:
-  - search now runs in two stages
-  - stage 1 searches only the first few variants for speed
-  - stage 2 searches additional variants only for sources that still returned zero results
- Intent:
-  - keep the anti-timeout optimization
-  - recover Envato/Artgrid recall when the early pass is too narrow
-
-## Current Session Update (2026-03-13, HTML Snapshot Analysis)
- Used saved HTML snapshots supplied by the user for:
-  - Envato item page
-  - Artgrid clip page
- Findings:
-  - Envato page exposes clean `VideoObject` JSON-LD with:
-    - exact asset title
-    - rich description
-    - thumbnail URL
-    - preview mp4 URL
-  - Artgrid page exposes reliable meta fields for:
-    - title
-    - description
-    - thumbnail
-    - canonical URL
-  - Artgrid snapshot still does **not** expose a stable preview mp4 or m3u8 in the saved HTML or downloaded asset bundle inspected here
- Fixes applied from the snapshots:
-  - Envato enrichment now prefers `VideoObject` JSON-LD over generic meta tags
-  - Envato search cards should now align much better with the actual source asset and preview
-  - Artgrid title/description are now cleaned so Gemini/source text is less polluted by site suffixes and generic boilerplate
- Remaining limitation:
-  - Artgrid hover-video preview cannot be derived reliably from the provided snapshot alone
-  - if Artgrid preview video is still required, the next useful artifact is a browser HAR or DevTools network capture from an opened clip page
-
-## Current Session Update (2026-03-13, Collector Refactor)
- Refactored the search pipeline into source-specific collectors:
-  - `envatoCollector`
-  - `artgridCollector`
-  - `googleVideoCollector`
- `SearchService` now acts mainly as:
-  - collector orchestration
-  - query-pass control
-  - dedupe
-  - cross-source enrichment scheduling
- Goal of the refactor:
-  - reduce cross-source coupling
-  - make future source-specific fixes safer
-  - make it easier to replace or disable one source without destabilizing the others
- Current implementation note:
-  - collectors are still in Go code under backend services, but the responsibilities are now separated by source instead of one monolithic search loop
-
-## Current Session Update (2026-03-13, Artgrid Collector Fix + Ranker Split)
- Artgrid collector regression fixed:
-  - real search results can come back as `artlist.io/stock-footage/clip/.../<id>` instead of only `artgrid.io/clip/<id>/...`
-  - renderable filtering was rejecting those URLs, which caused `SearXNG returned no renderable results.` for Artgrid-only searches
- Fix applied:
-  - Artgrid renderability now accepts both `artgrid.io` and `artlist.io/stock-footage/clip/...` clip URLs
-  - Artgrid result links are normalized into `https://artgrid.io/clip/<id>/<slug>` inside the collector flow before filtering/enrichment
- Refactor continued:
-  - ranking / Gemini candidate evaluation / recommendation merge logic moved out of `handlers/api.go`
-  - new service layer file: `backend/services/ranker.go`
-  - handler is now thinner and less coupled to search internals
-
-## Current Session Update (2026-03-13, 500 Fix)
- A server-side `request failed (500)` regression was found after the ranker split.
- Root cause:
-  - Gemini candidate cap logic returned `12` even when only `9` ranked candidates existed
-  - Gemini batch slicing then attempted to read beyond the available slice bounds
- Fix applied:
-  - `GeminiCandidateLimit` now never exceeds the real candidate count for totals up to 12
-  - Gemini evaluation now stays within valid ranked slice bounds
- Effect:
-  - avoids backend 500 during the Gemini Vision evaluation stage for mid-sized result sets
-
-## Current Session Update (2026-03-13, Artgrid Query Coverage Fix)
- Another Artgrid no-results regression was found even after the collector URL matcher was widened.
- Root cause:
-  - Artgrid collector query generation still leaned on `site:artgrid.io/clip/`
-  - in practice, canonical clip pages can surface under `artlist.io/stock-footage/clip/...`
-  - so some Artgrid-only searches still returned zero renderable results even though the accept filter had been fixed
- Fix applied:
-  - Artgrid query generation now searches both:
-    - `site:artgrid.io/clip/`
-    - `site:artlist.io/stock-footage/clip/`
- Effect:
-  - improves Artgrid recall in SearXNG result sets that favor canonical Artlist URLs over Artgrid URLs
-
-## Current Session Update (2026-03-16, Query / Preview Follow-up)
- Search intent translation was updated to better preserve compound media phrases:
-  - added explicit normalization for terms like `사이버 펑크` -> `cyberpunk`
-  - added a guard that rejects over-compressed translations when the original query contains a richer multi-word intent
- Artgrid page parsing was tightened:
-  - generic Artgrid homepage / challenge HTML should no longer be mistaken for a real clip page during enrichment
-  - this prevents homepage thumbnails/descriptions from overwriting real search result metadata
- Hover preview playback was changed to lazy attach on hover:
-  - preview source is now attached on mouseenter
-  - playback waits for media readiness instead of trying to play immediately from the render path
-  - source is detached again on mouseleave
- Self-test script search step now retries to reduce flaky startup timing failures during local smoke tests
-
-## Local Self-Test Workflow
- Primary command:
-  - `bash scripts/selftest.sh`
- What it currently verifies:
-  - Go formatting for touched backend files
-  - Python syntax for worker + mock SearXNG
-  - `go test ./...`
-  - backend binary build
-  - local app boot with temp SQLite/download dirs
-  - `/healthz`
-  - `/api/search` using a local mock SearXNG server
-  - `/api/upload`
- Purpose:
-  - allow safe local regression checks before push or container build without depending on real SearXNG, Gemini, or browser interaction
-
-## Project Summary
+## Current State At A Glance
 - Project: `ai-media-hub`
 - Goal: AI-assisted media discovery + ingest dashboard for Unraid
 - Backend: Go
 - Worker: Python + `yt-dlp` + `ffmpeg`
 - Frontend: HTML + Vanilla JS + Tailwind CDN
 - Database: SQLite
- Current search backend: `SearXNG`
- Current vision/ranking backend: `Gemini 2.5 Flash`
+- Search backend: `SearXNG`
+- AI translation / visual ranking: `Gemini 2.5 Flash`
 - Deployment target: single Docker container on Unraid
 - Git remote: `https://git.savethenurse.com/savethenurse/ai-media-hub.git`

+## Current Status Summary
+- Upload / direct download flow is implemented and broadly usable.
+- Search is implemented end-to-end and is now refactored into source-specific collectors.
+- Search remains the main unstable subsystem.
+- Envato metadata and preview extraction are much stronger than before.
+- Artgrid metadata fidelity is improved, but stable public hover-video preview extraction is still not solved.
+- Frontend now logs more useful API and debug information than earlier versions.
+- A local self-test workflow now exists and should be run before container builds or pushes.
+
 ## Current Architecture
 - `backend/main.go`
-  App bootstrap, env loading, static frontend serving, route registration
+  - app bootstrap
+  - env loading
+  - static frontend serving
+  - route registration
 - `backend/handlers/api.go`
-  Upload/download/search APIs, WebSocket progress broadcast, debug event broadcast
+  - upload / download / search APIs
+  - WebSocket progress broadcast
+  - debug event broadcast
+  - search request orchestration only, with ranking/Gemini logic mostly moved out
 - `backend/services/cse.go`
-  Actual search backend service
-  Despite filename, this is no longer Google CSE logic
-  It now wraps SearXNG search, source filtering, result enrichment, preview asset parsing
+  - SearXNG querying
+  - shared search helpers
+  - source-specific enrich helpers
+  - URL filtering / parsing utilities
+- `backend/services/search_collectors.go`
+  - source-specific collectors:
+    - `envatoCollector`
+    - `artgridCollector`
+    - `googleVideoCollector`
+- `backend/services/ranker.go`
+  - ranking
+  - Gemini candidate cap logic
+  - Gemini batch evaluation wrapper
+  - recommendation merge logic
 - `backend/services/gemini.go`
-  Query translation, deterministic query expansion helper, Gemini vision scoring
-  Also extracts first video frame with `ffmpeg` when no thumbnail exists
+  - query translation
+  - deterministic query expansion
+  - Gemini vision scoring
+  - video frame extraction via `ffmpeg` when needed
 - `backend/models/db.go`
-  SQLite init + download history
+  - SQLite init
+  - download history
 - `worker/downloader.py`
-  `yt-dlp` probe/download + ffmpeg clip extraction
+  - `yt-dlp` probe / download
+  - `ffmpeg` clip extraction
 - `frontend/index.html`
-  Main dashboard UI, preview modal, debug log panel
+  - main dashboard UI
+  - result viewer modal
+  - preview modal
+  - debug log panel
 - `frontend/app.js`
-  API calls, WebSocket status bar, hover preview playback, debug logger panel, platform toggles
+  - API calls
+  - WebSocket status bar
+  - result viewer modal
+  - hover preview playback
+  - direct download handoff for Google Video results
+  - debug logger panel
+  - platform toggles
 - `frontend/style.css`
-  Custom styles, clamp helpers, slider thumb styles, debug panel scrollbar styles
+  - custom styles
+  - clamp helpers
+  - slider thumb styles
+  - debug panel scrollbar styles
+- `scripts/selftest.sh`
+  - local smoke test flow
+- `scripts/mock_searxng.py`
+  - local mock SearXNG used by self-test
 - `unraid-template.xml`
-  Unraid template for current `git.savethenurse.com` image source
+  - Unraid template for current image source

 ## Search Flow: Current Implementation
 1. User enters a query in Zone A.
 2. Frontend sends `/api/search` with:
   - `query`
   - selected `platforms`
-3. Backend translates the query to English in `GeminiService.TranslateQuery`.
-   Fallback order:
-   - Gemini translation
+3. Backend translates the query in `GeminiService.TranslateQuery`.
+   - Gemini translation if available
   - Google Translate HTTP fallback
-   - small Korean media-term dictionary replacement
+   - Korean media-term dictionary fallback
+   - explicit normalization for known compound phrases such as `사이버 펑크` -> `cyberpunk`
 4. Backend builds deterministic English search variants in `GeminiService.ExpandQuery`.
-5. Backend calls `SearchService.SearchMedia(...)`.
-6. Search service queries SearXNG for:
-   - `Envato`
-   - `Artgrid`
-   - `Google Video`
-7. Search service filters source URLs aggressively:
-   - Google Video: YouTube-only
+5. `SearchService.SearchMedia(...)` orchestrates source-specific collectors.
+6. Collectors query SearXNG separately for:
+   - Envato
+   - Artgrid
+   - Google Video
+7. Each collector applies source-specific acceptance logic.
+   - Google Video: YouTube-only plus noise filtering
   - Envato: `elements.envato.com` item URLs only
-   - Artgrid: `artgrid.io/clip/...` only
-8. Search service enriches results:
-   - Envato: parses item page HTML for `og:image` and preview video URL
-   - Artgrid: attempts clip API + HTML parsing for thumbnails and preview sources
-9. Backend ranks all results locally.
-10. Backend evaluates all ranked results with Gemini vision in batches.
-11. Backend merges Gemini recommendations + fallback ranked items and returns JSON to frontend.
-12. Frontend renders cards and hover previews.
+   - Artgrid: accepts both:
+     - `artgrid.io/clip/...`
+     - `artlist.io/stock-footage/clip/...`
+8. Artgrid canonical links are normalized to:
+   - `https://artgrid.io/clip/<id>/<slug>`
+9. Results are enriched source-by-source.
+   - Envato:
+     - `VideoObject` JSON-LD preferred
+     - page meta preferred over search-engine proxy thumbnail
+     - preview mp4 extraction via JSON-LD / HTML parsing
+     - Python HTML fetch fallback used when Go HTTP fetch gets Cloudflare challenge pages
+   - Artgrid:
+     - page title / description / thumbnail cleaning
+     - homepage / challenge HTML is now rejected so generic site metadata does not overwrite clip metadata
+     - preview video extraction still not stable
+10. Ranked results are passed through the shared ranker.
+11. All ranked candidates are evaluated with Gemini Vision in batches.
+12. Merge order now prefers:
+   - Gemini recommended items
+   - Gemini-reviewed non-recommended items
+   - keyword fallback items only if Gemini output is incomplete
+13. Frontend renders cards, result viewer modal, and hover previews.

 ## Direct Downloader Flow: Current Implementation
 1. User enters URL in Zone C.
@@ -248,6 +148,45 @@
 7. Worker downloads source with `yt-dlp`, clips with `ffmpeg`, emits JSON progress lines.
 8. Backend rebroadcasts progress over WebSocket.

+## Major Work Completed So Far
+- Added local self-test workflow:
+  - `scripts/selftest.sh`
+  - `scripts/mock_searxng.py`
+- Fixed translation fallback when Gemini key is missing.
+- Added tests for translation fallback logic.
+- Added HLS frontend wiring:
+  - `hls.js` script
+  - native HLS fallback
+- Reduced search timeout risk by:
+  - limiting collector result caps
+  - limiting enrichment scope
+  - limiting Gemini Vision evaluation scope
+  - replacing oversized raw debug result payloads with summaries
+- Improved Google Video filtering:
+  - rejects more music / trailer / BGM style noise
+- Improved Envato fidelity:
+  - real title / description / thumbnail / preview from source page
+- Improved Artgrid fidelity:
+  - accepts canonical Artlist URLs
+  - normalizes Artgrid clip URLs
+  - cleans title / description better
+- Refactored search into source-specific collectors.
+- Moved ranking and Gemini batch handling into `backend/services/ranker.go`.
+- Fixed server-side 500 caused by Gemini candidate cap exceeding available ranked candidates.
+- Improved frontend logging:
+  - raw non-JSON error body logging
+  - more compact debug payload rendering
+- Changed hover preview playback to lazy attach on hover:
+  - attach source on `mouseenter`
+  - wait for readiness before `play()`
+  - detach source on `mouseleave`
+- Added in-app result viewer modal for search results:
+  - results now open in a modal instead of directly opening a new tab
+  - modal shows embedded site iframe, external open button, source summary, and full AI note
+- Google Video results can now jump directly into the existing direct-download preview / crop flow from the result viewer
+- Gemini reason generation is now intended to be Korean-first for readability
+- Gemini Vision evaluation now covers all ranked results instead of only a top subset
+
 ## Current Features Implemented
 - [x] Project folder structure
 - [x] Dockerfile
@@ -262,22 +201,34 @@
 - [x] WebSocket realtime progress
 - [x] Search source toggles
 - [x] Search card hover preview support
+- [x] Result viewer modal for search results
+- [x] Google Video direct-download handoff from search results
 - [x] Debug log panel in frontend
 - [x] `.log` download from debug panel
+- [x] Local self-test workflow
+- [x] Source-specific search collectors
+- [x] Shared ranker service layer

 ## Important Current Constraints / Known Problems
- Search backend has been rewritten multiple times and is still the main unstable area.
- Envato previews are parsed mainly from page HTML metadata / structured data.
- Artgrid previews are partially inferred from:
-  - clip page HTML
-  - clip API attempts
-  - HLS preview handling in frontend
- Search relevance is still not considered stable enough.
- Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy.
- Frontend JavaScript was not linted with Node tooling in this environment because `node` is not installed here.
- Full browser-level preview validation is still not covered by the local self-test script.
+- Search backend quality is still the most fragile subsystem.
+- Search relevance is still heuristic-heavy and not yet benchmarked against a durable real-query set.
+- The embedded result viewer uses an iframe, so third-party sites can still block embedding via `X-Frame-Options` or a CSP `frame-ancestors` directive.
+- Artgrid hover-video preview is still partial / unresolved:
+  - provided Artgrid HTML snapshots and downloaded asset bundles did not expose a stable public preview mp4/m3u8 URL
+  - public HTML often only exposes title / description / thumbnail / canonical URL
+- Artgrid can still be sensitive to how SearXNG indexes canonical domains.
+- Full browser-level validation is still not covered by local self-test.
+- Frontend JavaScript still has no Node-based lint/build step in this environment.
 - Search cards now separate source snippet from AI reason, but metadata fidelity still depends on source enrichment quality.
- Artgrid public pages inspected from this environment still did not expose a stable public preview video URL in HTML, so Artgrid hover-video support may remain partial until a browser-captured HTML/HAR sample reveals the real preview source pattern.
+- Gemini notes are now intended to be Korean, but final output quality still depends on Gemini response consistency.
+- The local self-test script is better than before, but it is still a smoke test, not full integration coverage.
+
+## Current Risks Around Search Quality
+- Upstream SearXNG quality still controls the candidate pool.
+- Gemini Vision can only rerank the candidates it receives.
+- If source enrichment fails, Gemini may still judge a weaker proxy thumbnail or fallback image.
+- Compound Korean intents are better handled now, but the translation path is still heuristic and can drift on niche concepts.
+- Running Gemini Vision across all ranked results increases latency and token usage compared with the earlier capped approach.

 ## Frontend Debug Logger
 - UI button: bottom-right `Logs`
@@ -291,8 +242,10 @@
  - ignored WS debug messages
  - status updates
  - platform toggle state
+  - result viewer modal open / close
  - preview source attach / detach
  - hover start / hover end
+  - hover play errors
  - modal preview open / close
  - browser errors
  - promise rejections
@@ -310,57 +263,80 @@
 - `SEARXNG_WEB_ENGINE`
 - `GEMINI_API_KEY`

-## Unraid Template Notes
- Current image repository in template:
-  `git.savethenurse.com/savethenurse/ai-media-hub:latest`
- Current registry in template:
-  `https://git.savethenurse.com`
-
-## Docker / Build Notes
- Dockerfile uses:
-  - Go build stage
-  - static ffmpeg image stage
-  - Python runtime stage
- Heavy apt ffmpeg install path was removed earlier to reduce build time.
-
-## Git / Push Workflow Used So Far
- Branch: `main`
- Remote: `origin`
- All requested changes were committed and pushed incrementally to:
-  `https://git.savethenurse.com/savethenurse/ai-media-hub.git`
-
-## Recent Relevant Commits
- `8ed1e84` Add in-app debug log panel
- `823bf12` Reflect selected platforms in search status
- `cceb040` Update platform status and HLS previews
- `ad8afd5` Tighten source filters and add platform toggles
- `27000db` Hide overlays during hover preview
- `b78865d` Rewrite search flow and enrich preview assets
- `de24886` Filter non-English expansions and prefer stock sources
- `0bd458d` Boost translated search fallback and source priority
-
-## Next Priority Areas
- [ ] Search backend quality stabilization
-  The search service is the main unresolved area.
- [ ] Envato / Artgrid preview extraction hardening
- [ ] Search result relevance validation against real user queries
- [ ] Better matching between rendered description and actual linked asset
- [ ] Add browser-level verification for preview/HLS behavior
- [ ] Add more automated coverage for search ranking / filtering logic
- [ ] If Artgrid hover preview is still required, collect one real clip HTML/HAR from a browser session and derive a stable preview URL parser
- [ ] Add proper frontend build/lint step if Node becomes available
+## Local Self-Test Workflow
+- Primary command:
+  - `bash scripts/selftest.sh`
+- What it currently verifies:
+  - Go formatting for touched backend files
+  - Python syntax for worker + mock SearXNG
+  - `go test ./...`
+  - backend binary build
+  - local app boot with temp SQLite/download dirs
+  - `/healthz`
+  - `/api/search` using local mock SearXNG
+  - `/api/upload`
+- Notes:
+  - search step now retries to reduce startup timing flakiness
+  - this is a smoke test, not a browser-level verification suite

 ## Verified Locally In This Environment
 - [x] `go build -o /tmp/ai-media-hub ./backend`
- [x] `go test ./...` (currently no broad test suite beyond the added fallback tests)
+- [x] `go test ./...`
 - [x] Python syntax check for worker + self-test helper
 - [x] local app boot / `/healthz` through `scripts/selftest.sh`
 - [x] local `/api/search` against mock SearXNG through `scripts/selftest.sh`
 - [x] local `/api/upload` through `scripts/selftest.sh`
 - [ ] full browser-level validation was not fully reproducible in this environment

+## Unraid / Docker / CI Notes
+- Dockerfile uses:
+  - Go build stage
+  - static ffmpeg image stage
+  - Python runtime stage
+- Heavy apt ffmpeg install path was removed earlier to reduce build time.
+- Gitea workflow builds and pushes:
+  - `git.savethenurse.com/savethenurse/ai-media-hub:latest`
+  - `git.savethenurse.com/savethenurse/ai-media-hub:${{ github.sha }}`
+
+## Recent Relevant Commits
+- `9637b76` Improve query intent handling and preview playback
+- `6d9391b` Expand Artgrid query coverage to artlist canonical URLs
+- `d8cc32e` Fix Gemini candidate cap causing search 500s
+- `e426261` Fix Artgrid collector matching and split ranker
+- `5aebbef` Refactor search into source-specific collectors
+- `ae091c5` Improve source parsing from Envato and Artgrid HTML
+- `06ea4f3` Restore Envato and Artgrid fallback search breadth
+- `7dfb1ad` Stabilize search pipeline and improve preview diagnostics
+- `6f3149a` Add local self-test flow and fix fallback regressions
+- `f968458` Rewrite TODO as project handover
+
+## Git / Push Status
+- Last commit known to have been pushed:
+  - `6d9391b` was pushed successfully
+- Local-only work currently exists:
+  - `9637b76 Improve query intent handling and preview playback`
+- Push status for `9637b76`:
+  - not pushed
+  - remote rejected the push with:
+    - `remote unpack failed: unable to create temporary object directory`
+    - `remote rejected main -> main (unpacker error)`
+- Interpretation:
+  - current blocker appears to be on the remote git server side, not a local git history issue
+
+## Highest-Value Next Steps
+- [ ] Retry the push of the local commit once the remote git storage/unpacker issue is resolved
+- [ ] Build collector-specific integration tests with recorded SearXNG samples
+- [ ] Separate source enrichment tests from live network behavior using local fixtures
+- [ ] Add a browser-level preview validation path, especially for hover video
+- [ ] If Artgrid hover preview is still required, obtain one real clip HAR / DevTools network export and derive a stable preview URL parser
+- [ ] Build a small fixed real-query benchmark set to evaluate search quality before further tuning
+- [ ] If frontend tooling becomes available, add lint/build checks
+
 ## Short Handover Summary
- The codebase exists and runs.
+- The codebase runs.
 - Upload/download features mostly exist.
- Search is implemented but is still the most fragile subsystem.
- A visible debug logging panel now exists in the web UI and should be used first when continuing work.
+- Search has been significantly refactored and is in better shape than before, but is still the main unstable area.
+- Envato source fidelity is much better than earlier.
+- Artgrid source fidelity is better, but preview-video extraction is still incomplete.
+- There is now a local self-test workflow.
+- There is one known local commit that has not been pushed because the remote repo reported an unpacker error.
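
The translation fallback order listed in step 3 of the search flow (Gemini, then Google Translate, then the Korean media-term dictionary) can be read as a chain of stages tried in order. The following is a minimal sketch of that idea, not the code in `GeminiService.TranslateQuery`. The names `translator`, `translateWithFallback`, `unavailableStage`, and `dictionaryStage` are hypothetical, and the one-entry dictionary only reuses the `사이버 펑크` -> `cyberpunk` example from the handover notes.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// translator is a hypothetical stage signature: it returns an English query
// or an error when the stage cannot produce one.
type translator func(query string) (string, error)

// translateWithFallback tries each stage in order and returns the first
// non-empty result together with the index of the stage that produced it.
func translateWithFallback(query string, stages ...translator) (string, int, error) {
	lastErr := errors.New("no translation stage configured")
	for i, stage := range stages {
		out, err := stage(query)
		if err != nil {
			lastErr = err
			continue
		}
		if trimmed := strings.TrimSpace(out); trimmed != "" {
			return trimmed, i, nil
		}
		lastErr = errors.New("stage returned an empty translation")
	}
	return "", -1, lastErr
}

// unavailableStage stands in for a remote translator that is down or has no API key.
func unavailableStage(string) (string, error) {
	return "", errors.New("translator unavailable")
}

// dictionaryStage stands in for the dictionary fallback (assumed contents).
func dictionaryStage(query string) (string, error) {
	return strings.NewReplacer("사이버 펑크", "cyberpunk").Replace(query), nil
}

func main() {
	out, stage, err := translateWithFallback("사이버 펑크 도시", unavailableStage, unavailableStage, dictionaryStage)
	fmt.Println(out, stage, err)
}
```

Returning the stage index is a design choice for this sketch: it would let a debug event record which fallback actually answered.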
@@ -320,7 +320,7 @@ func (a *App) searchMedia(c *gin.Context) {
 	}
 	scored := services.RankSearchResults(rankQuery, results)
 	a.debug("search ranked summary", summarizeSearchResults(scored, time.Since(started), services.GeminiCandidateLimit(len(scored)), ""))
-	a.Hub.Broadcast("progress", gin.H{"type": "search", "status": "analyzing top candidate visuals with Gemini Vision", "progress": 75})
+	a.Hub.Broadcast("progress", gin.H{"type": "search", "status": "analyzing all candidate visuals with Gemini Vision", "progress": 75})
 	recommended, geminiStats := services.EvaluateAllCandidatesWithGemini(a.GeminiService, req.Query, scored)
 	a.debug("search gemini evaluation", geminiStats)
 	err = nil
@@ -337,8 +337,8 @@ func (a *App) searchMedia(c *gin.Context) {
 				ThumbnailURL:    result.ThumbnailURL,
 				PreviewVideoURL: result.PreviewVideoURL,
 				Source:          result.Source,
-				Reason:          "Keyword-ranked result added without extra Gemini vision tokens.",
-				Recommended:     true,
+				Reason:          "Gemini Vision 응답이 부족해 키워드 기준으로 보강된 결과입니다.", // "Supplemented by keyword ranking because the Gemini Vision response was insufficient."
+				Recommended:     false,
 			})
 		}
 		warning := err.Error()
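
The merge order this handler now aims for (Gemini-recommended items, then Gemini-reviewed non-recommended items, then keyword fallback only for items Gemini did not cover) can be sketched as below. This is an illustration under assumptions, not the handler's code: the `item` type and `mergeResults` function are hypothetical, and deduplication by link is assumed from the `seen` map used in the ranker.

```go
package main

import "fmt"

// item is a hypothetical, trimmed-down stand-in for AIRecommendation.
type item struct {
	Link        string
	Reason      string
	Recommended bool
}

// mergeResults orders results as: recommended reviewed items, other reviewed
// items, then ranked items that Gemini never returned (marked as fallback).
func mergeResults(reviewed, ranked []item) []item {
	merged := make([]item, 0, len(ranked))
	seen := map[string]bool{}
	add := func(it item) {
		if it.Link == "" || seen[it.Link] {
			return
		}
		seen[it.Link] = true
		merged = append(merged, it)
	}
	for _, it := range reviewed {
		if it.Recommended {
			add(it)
		}
	}
	for _, it := range reviewed {
		if !it.Recommended {
			add(it)
		}
	}
	for _, it := range ranked {
		// it is a copy, so the caller's slice is not modified.
		it.Recommended = false
		it.Reason = "keyword fallback"
		add(it)
	}
	return merged
}

func main() {
	reviewed := []item{{Link: "b", Reason: "weak match"}, {Link: "a", Reason: "strong match", Recommended: true}}
	ranked := []item{{Link: "a"}, {Link: "b"}, {Link: "c"}}
	for _, it := range mergeResults(reviewed, ranked) {
		fmt.Printf("%s recommended=%v reason=%q\n", it.Link, it.Recommended, it.Reason)
	}
}
```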
@@ -154,7 +154,8 @@ func (g *GeminiService) Recommend(query string, candidates []SearchResult) ([]AI
 		{
 			"text": `Analyze the provided images for the user's search intent. Return JSON only in this shape:
 {"recommendations":[{"index":0,"reason":"short reason","recommended":true}]}
-Mark only the best matches as recommended=true. Keep reasons concise. Recommend up to 8 items.
+Return one entry for every analyzed candidate. Use Korean for every reason. Keep reasons concise but specific enough to explain usefulness.
+Mark the strongest matches as recommended=true and weaker matches as recommended=false.
 Prefer cinematic b-roll, stock footage, editorial footage, clean composition, usable establishing shots, and professional media thumbnails.
 Avoid clickbait faces, exaggerated expressions, meme aesthetics, low-information thumbnails, sensational text overlays, or gossip-style imagery.
 Favor thumbnails that look directly useful for media editing and footage sourcing.
@@ -230,7 +231,7 @@ User query: ` + query,

 	recommendations := make([]AIRecommendation, 0, len(parsed.Recommendations))
 	for _, rec := range parsed.Recommendations {
-		if rec.Index < 0 || rec.Index >= len(candidates) || !rec.Recommended {
+		if rec.Index < 0 || rec.Index >= len(candidates) {
 			continue
 		}
 		src := candidates[rec.Index]
@@ -241,13 +242,13 @@ User query: ` + query,
 			ThumbnailURL:    src.ThumbnailURL,
 			PreviewVideoURL: src.PreviewVideoURL,
 			Source:          src.Source,
-			Reason:          rec.Reason,
-			Recommended:     true,
+			Reason:          normalizeKoreanReason(rec.Reason),
+			Recommended:     rec.Recommended,
 		})
 	}

 	if len(recommendations) == 0 {
-		for _, candidate := range candidates[:min(4, len(candidates))] {
+		for _, candidate := range candidates[:min(8, len(candidates))] {
 			recommendations = append(recommendations, AIRecommendation{
 				Title:           candidate.Title,
 				Link:            candidate.Link,
@@ -255,8 +256,8 @@ User query: ` + query,
 				ThumbnailURL:    candidate.ThumbnailURL,
 				PreviewVideoURL: candidate.PreviewVideoURL,
 				Source:          candidate.Source,
-				Reason:          "Fallback result because Gemini returned no recommended items.",
-				Recommended:     true,
+				Reason:          "Gemini Vision 평가를 받지 못해 키워드 기준으로 보강된 결과입니다.", // "Supplemented by keyword ranking because no Gemini Vision evaluation was received."
+				Recommended:     false,
 			})
 		}
 	}
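
The behavioural change in this hunk is that entries with `recommended=false` are no longer discarded; only entries whose `index` falls outside the candidate list are skipped. A minimal sketch of that filtering, assuming the JSON shape quoted in the prompt above, is shown here. The `recommendation` and `response` types and the `parseRecommendations` function are hypothetical names for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// recommendation mirrors the JSON shape requested in the prompt.
type recommendation struct {
	Index       int    `json:"index"`
	Reason      string `json:"reason"`
	Recommended bool   `json:"recommended"`
}

type response struct {
	Recommendations []recommendation `json:"recommendations"`
}

// parseRecommendations decodes the reply and drops entries whose index does
// not point at a real candidate. Entries with recommended=false are kept.
func parseRecommendations(raw string, candidateCount int) ([]recommendation, error) {
	var parsed response
	if err := json.Unmarshal([]byte(raw), &parsed); err != nil {
		return nil, fmt.Errorf("decode recommendations: %w", err)
	}
	kept := make([]recommendation, 0, len(parsed.Recommendations))
	for _, rec := range parsed.Recommendations {
		if rec.Index < 0 || rec.Index >= candidateCount {
			continue
		}
		kept = append(kept, rec)
	}
	return kept, nil
}

func main() {
	raw := `{"recommendations":[{"index":0,"reason":"a","recommended":true},{"index":7,"reason":"b","recommended":false}]}`
	kept, err := parseRecommendations(raw, 2)
	fmt.Println(len(kept), err)
}
```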
@@ -412,6 +413,14 @@ func truncateForError(text string, limit int) string {
 	return trimmed[:limit] + "..."
 }

+func normalizeKoreanReason(reason string) string {
+	trimmed := strings.TrimSpace(reason)
+	if trimmed == "" {
+		return "시각 정보가 제한적이지만 검색 의도와의 관련성을 기준으로 평가했습니다." // "Visual information is limited, but this was evaluated on relevance to the search intent."
+	}
+	return trimmed
+}
+
 func buildSearchQueries(originalQuery, englishQuery string) []string {
 	base := strings.TrimSpace(englishQuery)
 	if base == "" {
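
The ranker change in this commit splits the ranked candidates into batches of 8 and evaluates at most 2 batches at a time, writing each result into a slot indexed by batch so the merge order stays deterministic. The pattern can be sketched in isolation as follows. The names `splitBatches` and `runBatches` are hypothetical, and the work function here only counts items instead of calling Gemini.

```go
package main

import (
	"fmt"
	"sync"
)

// splitBatches cuts items into consecutive slices of at most size elements.
func splitBatches(items []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	batches := make([][]string, 0, (len(items)+size-1)/size)
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

// runBatches runs work on every batch with at most maxConcurrent in flight.
// Each goroutine writes only to its own index, so no mutex is needed and the
// results keep the original batch order.
func runBatches(batches [][]string, maxConcurrent int, work func([]string) int) []int {
	if maxConcurrent < 1 {
		maxConcurrent = 1 // a zero-capacity semaphore would block forever
	}
	results := make([]int, len(batches))
	sem := make(chan struct{}, maxConcurrent)
	var wg sync.WaitGroup
	for idx, batch := range batches {
		wg.Add(1)
		go func(i int, b []string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			results[i] = work(b)
		}(idx, batch)
	}
	wg.Wait()
	return results
}

func main() {
	items := make([]string, 19)
	sizes := runBatches(splitBatches(items, 8), 2, func(b []string) int { return len(b) })
	fmt.Println(sizes)
}
```

Collecting into a pre-sized slice rather than a channel is what keeps the later merge loop independent of goroutine completion order.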
@@ -3,6 +3,7 @@ package services
 import (
 	"sort"
 	"strings"
+	"sync"
 )

 type GeminiBatchStats struct {
@@ -80,42 +81,63 @@ func RankSearchResults(query string, results []SearchResult) []SearchResult {
 }

 func GeminiCandidateLimit(total int) int {
-	switch {
-	case total <= 12:
-		return total
-	case total <= 16:
-		return 12
-	default:
-		return 16
-	}
+	return total // no cap: every ranked candidate is sent to Gemini for review
 }

 func EvaluateAllCandidatesWithGemini(service *GeminiService, query string, ranked []SearchResult) ([]AIRecommendation, GeminiBatchStats) {
 	const chunkSize = 8
+	const maxConcurrentBatches = 2
 	limit := GeminiCandidateLimit(len(ranked))
 	stats := GeminiBatchStats{
 		CandidateCap: limit,
 		Requested:    min(limit, len(ranked)),
 	}
-	merged := make([]AIRecommendation, 0, len(ranked))
-	seen := map[string]bool{}
+	type batchResult struct {
+		index           int
+		recommendations []AIRecommendation
+		err             error
+	}
+	batches := make([][]SearchResult, 0, (limit+chunkSize-1)/chunkSize)
 	for start := 0; start < limit; start += chunkSize {
 		end := start + chunkSize
 		if end > limit {
 			end = limit
 		}
-		batch := ranked[start:end]
-		stats.Batches++
-		recommended, err := service.Recommend(query, batch)
-		if err != nil {
+		batches = append(batches, ranked[start:end])
+	}
+	stats.Batches = len(batches)
+
+	results := make([]batchResult, len(batches))
+	var wg sync.WaitGroup
+	sem := make(chan struct{}, maxConcurrentBatches)
+	for idx, batch := range batches {
+		wg.Add(1)
+		go func(batchIndex int, candidates []SearchResult) {
+			defer wg.Done()
+			sem <- struct{}{}
+			defer func() { <-sem }()
+			recommended, err := service.Recommend(query, candidates)
+			results[batchIndex] = batchResult{
+				index:           batchIndex,
+				recommendations: recommended,
+				err:             err,
+			}
+		}(idx, batch)
+	}
+	wg.Wait()
+
+	merged := make([]AIRecommendation, 0, len(ranked))
+	seen := map[string]bool{}
+	for _, batch := range results {
+		if batch.err != nil {
 			stats.Failed++
 			if len(stats.Errors) < 5 {
-				stats.Errors = append(stats.Errors, err.Error())
+				stats.Errors = append(stats.Errors, batch.err.Error())
 			}
 			continue
 		}
 		stats.Succeeded++
-		for _, item := range recommended {
+		for _, item := range batch.recommendations {
 			if item.Link == "" || seen[item.Link] {
 				continue
 			}
@@ -132,6 +154,9 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult,
 	seen := map[string]bool{}

 	for _, item := range recommended {
+		if !item.Recommended {
+			continue
+		}
 		if item.Link == "" || seen[item.Link] {
 			continue
 		}
@@ -139,6 +164,14 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult,
 		merged = append(merged, item)
 	}

+	for _, item := range recommended {
+		if item.Recommended || item.Link == "" || seen[item.Link] || len(merged) >= limit {
+			continue
+		}
+		seen[item.Link] = true
+		merged = append(merged, item)
+	}
+
 	for _, item := range ranked {
 		if len(merged) >= limit || item.Link == "" || seen[item.Link] {
 			continue
@@ -151,8 +184,8 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult,
 			ThumbnailURL:    item.ThumbnailURL,
 			PreviewVideoURL: item.PreviewVideoURL,
 			Source:          item.Source,
-			Reason:          "Keyword-ranked result added without extra Gemini vision tokens.",
-			Recommended:     true,
+			Reason:          "Gemini Vision 응답이 부족해 키워드 기준으로 보강된 결과입니다.",
+			Recommended:     false,
 		})
 	}
 	return merged
@@ -37,12 +37,22 @@ const clearLogs = document.getElementById("clearLogs");
 const downloadLogs = document.getElementById("downloadLogs");
 const debugLogList = document.getElementById("debugLogList");
 const debugSummary = document.getElementById("debugSummary");
+const resultModal = document.getElementById("resultModal");
+const resultModalTitle = document.getElementById("resultModalTitle");
+const resultModalSource = document.getElementById("resultModalSource");
+const resultModalSnippet = document.getElementById("resultModalSnippet");
+const resultModalReason = document.getElementById("resultModalReason");
+const resultModalFrame = document.getElementById("resultModalFrame");
+const resultModalOpenExternal = document.getElementById("resultModalOpenExternal");
+const resultModalDownload = document.getElementById("resultModalDownload");
+const closeResultModal = document.getElementById("closeResultModal");

 let pendingDownload = null;
 let cropStart = 0;
 let cropEnd = 0;
 let cropMax = 0;
 let activeThumb = null;
+let activeResultItem = null;
 const activePlatforms = new Set(["envato", "artgrid", "google video"]);
 const hlsInstances = new WeakMap();
 const debugEntries = [];
@@ -319,13 +329,13 @@ function renderResults(results) {
    const image = node.querySelector("img");
    const previewVideo = node.querySelector(".preview-hover");
    const overlays = node.querySelectorAll(".preview-overlay");
-    node.href = item.link;
    image.src = item.thumbnailUrl || "https://placehold.co/1280x720/0a0a0a/ffffff?text=Preview";
    image.alt = item.title;
    node.querySelector("h3").textContent = item.title;
    node.querySelector(".result-snippet").textContent = item.snippet || item.reason || item.source || "";
-    node.querySelector(".result-reason").textContent = item.reason ? `AI note: ${item.reason}` : "";
+    node.querySelector(".result-reason").textContent = item.reason ? `AI 노트: ${item.reason}` : "";
    node.querySelector(".source-badge").textContent = item.source;
+    node.addEventListener("click", () => openResultModal(item));
    previewVideo.poster = item.thumbnailUrl || "";
    if (item.previewVideoUrl) {
      const mediaArea = node.querySelector(".relative");
@@ -347,6 +357,53 @@ function renderResults(results) {
  }
 }

+async function prepareDirectDownload(targetUrl) {
+  downloadResult.textContent = "checking duplicate history...";
+  const dup = await api(`/api/history/check?url=${encodeURIComponent(targetUrl)}`);
+  let force = false;
+  if (dup.exists) {
+    force = window.confirm("동일 URL 다운로드 이력이 있습니다. 계속 진행할까요?");
+    if (!force) {
+      downloadResult.textContent = "cancelled";
+      return;
+    }
+  }
+  pendingDownload = { url: targetUrl, force };
+  downloadResult.textContent = "loading preview...";
+  const preview = await api("/api/download/preview", {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({ url: targetUrl }),
+  });
+  openPreviewModal(preview);
+  downloadResult.textContent = "preview loaded";
+}
+
+function openResultModal(item) {
+  activeResultItem = item;
+  resultModalTitle.textContent = item.title || "Untitled";
+  resultModalSource.textContent = item.source || "";
+  resultModalSnippet.textContent = item.snippet || "원본 페이지에서 사용할 수 있는 설명이 없습니다.";
+  resultModalReason.textContent = item.reason || "AI 노트가 없습니다.";
+  resultModalFrame.src = item.link || "about:blank";
+  resultModalOpenExternal.href = item.link || "#";
+  const canDirectDownload = item.source === "Google Video" && item.link;
+  resultModalDownload.classList.toggle("hidden", !canDirectDownload);
+  resultModal.classList.remove("hidden");
+  resultModal.classList.add("flex");
+  logEvent("result:modal:open", { title: item.title, source: item.source, link: item.link });
+}
+
+function closeResultViewer() {
+  if (!resultModal.classList.contains("hidden")) {
+    logEvent("result:modal:close", { title: activeResultItem?.title || "" });
+  }
+  activeResultItem = null;
+  resultModalFrame.src = "about:blank";
+  resultModal.classList.add("hidden");
+  resultModal.classList.remove("flex");
+}
+
 searchForm.addEventListener("submit", async (event) => {
  event.preventDefault();
  setStatus("preparing search", 5);
@@ -458,26 +515,8 @@ fileInput.addEventListener("change", async () => {

 downloadForm.addEventListener("submit", async (event) => {
  event.preventDefault();
-  downloadResult.textContent = "checking duplicate history...";
  try {
-    const dup = await api(`/api/history/check?url=${encodeURIComponent(downloadUrl.value)}`);
-    let force = false;
-    if (dup.exists) {
-      force = window.confirm("동일 URL 다운로드 이력이 있습니다. 계속 진행할까요?");
-      if (!force) {
-        downloadResult.textContent = "cancelled";
-        return;
-      }
-    }
-    pendingDownload = { url: downloadUrl.value, force };
-    downloadResult.textContent = "loading preview...";
-    const preview = await api("/api/download/preview", {
-      method: "POST",
-      headers: { "Content-Type": "application/json" },
-      body: JSON.stringify({ url: downloadUrl.value }),
-    });
-    openPreviewModal(preview);
-    downloadResult.textContent = "preview loaded";
+    await prepareDirectDownload(downloadUrl.value);
  } catch (error) {
    downloadResult.textContent = error.message;
    logEvent("download:preview:error", { message: error.message, data: error.data || null });
@@ -509,6 +548,24 @@ confirmDownload.addEventListener("click", async () => {
 });

 closePreviewModal.addEventListener("click", closeModal);
+closeResultModal.addEventListener("click", closeResultViewer);
+resultModal.addEventListener("click", (event) => {
+  if (event.target === resultModal) {
+    closeResultViewer();
+  }
+});
+resultModalDownload.addEventListener("click", async () => {
+  // closeResultViewer() resets activeResultItem, so capture the item first.
+  const item = activeResultItem;
+  if (!item?.link) return;
+  try {
+    closeResultViewer();
+    await prepareDirectDownload(item.link);
+  } catch (error) {
+    downloadResult.textContent = error.message;
+    logEvent("download:preview:error", { message: error.message, data: error.data || null, source: item.source || "" });
+  }
+});
 previewModal.addEventListener("click", (event) => {
  if (event.target === previewModal) {
    closeModal();
@@ -149,8 +149,41 @@
      </div>
    </div>

+    <div id="resultModal" class="fixed inset-0 z-50 hidden items-center justify-center bg-black/80 px-4">
+      <div class="flex h-[88vh] w-full max-w-7xl flex-col overflow-hidden rounded-3xl border border-white/10 bg-zinc-950 shadow-2xl">
+        <div class="flex items-center justify-between border-b border-white/10 px-5 py-4">
+          <div class="min-w-0">
+            <p id="resultModalSource" class="text-xs uppercase tracking-[0.25em] text-zinc-500"></p>
+            <h3 id="resultModalTitle" class="mt-1 truncate text-xl font-semibold text-white"></h3>
+          </div>
+          <div class="flex items-center gap-2">
+            <a id="resultModalOpenExternal" target="_blank" rel="noreferrer" class="rounded-full border border-white/10 px-3 py-2 text-xs uppercase tracking-[0.2em] text-zinc-300">Open</a>
+            <button id="resultModalDownload" type="button" class="hidden rounded-full border border-white/10 px-3 py-2 text-xs uppercase tracking-[0.2em] text-zinc-300">Direct Download</button>
+            <button id="closeResultModal" type="button" class="rounded-full border border-white/10 px-3 py-2 text-xs uppercase tracking-[0.2em] text-zinc-300">Close</button>
+          </div>
+        </div>
+        <div class="grid min-h-0 flex-1 gap-0 lg:grid-cols-[1.5fr_0.85fr]">
+          <div class="min-h-0 border-b border-white/10 lg:border-b-0 lg:border-r">
+            <iframe id="resultModalFrame" class="h-full w-full bg-white" referrerpolicy="no-referrer"></iframe>
+          </div>
+          <div class="min-h-0 overflow-auto px-5 py-5">
+            <div class="space-y-5">
+              <div>
+                <p class="text-xs uppercase tracking-[0.25em] text-zinc-500">Source Summary</p>
+                <p id="resultModalSnippet" class="mt-2 text-sm leading-7 text-zinc-300"></p>
+              </div>
+              <div>
+                <p class="text-xs uppercase tracking-[0.25em] text-zinc-500">AI Note</p>
+                <p id="resultModalReason" class="mt-2 whitespace-pre-wrap text-sm leading-7 text-zinc-200"></p>
+              </div>
+            </div>
+          </div>
+        </div>
+      </div>
+    </div>
+
    <template id="searchCardTemplate">
-      <a target="_blank" rel="noreferrer" class="group overflow-hidden rounded-3xl border border-white/10 bg-black/30 transition hover:border-white/30">
+      <button type="button" class="group overflow-hidden rounded-3xl border border-white/10 bg-black/30 text-left transition hover:border-white/30">
        <div class="relative aspect-video overflow-hidden bg-zinc-900">
          <img class="h-full w-full object-cover transition duration-500 group-hover:scale-105" alt="" />
          <video class="preview-hover absolute inset-0 hidden h-full w-full object-cover" muted loop playsinline preload="none"></video>
@@ -160,11 +193,11 @@
        <div class="space-y-2 p-5">
          <h3 class="line-clamp-2 text-base font-medium text-white"></h3>
          <p class="result-snippet line-clamp-3 text-sm text-zinc-400"></p>
-          <p class="result-reason line-clamp-2 text-xs uppercase tracking-[0.12em] text-zinc-500"></p>
+          <p class="result-reason line-clamp-2 text-xs tracking-[0.02em] text-zinc-500"></p>
        </div>
-      </a>
+      </button>
    </template>

-    <script src="/app.js?v=20260313i" defer></script>
+    <script src="/app.js?v=20260316a" defer></script>
  </body>
 </html>
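
Review note: the bounded-concurrency batching that this commit adds to `EvaluateAllCandidatesWithGemini` can be sketched in isolation as below. This is a minimal sketch, not the project's code: `runBatches` and `chunk` are hypothetical helper names, and the `work` callback stands in for `service.Recommend`. It shows why each goroutine writing to its own slice index keeps results in batch order without a mutex, while the buffered channel caps how many calls run at once.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk splits items into consecutive slices of at most size elements.
// Hypothetical helper; the commit builds its batches inline.
func chunk(items []string, size int) [][]string {
	batches := make([][]string, 0, (len(items)+size-1)/size)
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

// runBatches calls work once per batch with at most maxConcurrent calls in
// flight. Each goroutine writes only to its own index, so the output slices
// stay in batch order and need no extra locking.
func runBatches(batches [][]string, maxConcurrent int, work func([]string) (int, error)) ([]int, []error) {
	results := make([]int, len(batches))
	errs := make([]error, len(batches))
	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var wg sync.WaitGroup
	for idx, batch := range batches {
		wg.Add(1)
		go func(i int, b []string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when the call returns
			results[i], errs[i] = work(b)
		}(idx, batch)
	}
	wg.Wait()
	return results, errs
}

func main() {
	batches := chunk([]string{"a", "b", "c", "d", "e"}, 2)
	// The callback here just counts items; a real caller would invoke the API.
	results, errs := runBatches(batches, 2, func(b []string) (int, error) {
		return len(b), nil
	})
	fmt.Println(results, errs)
}
```

Collecting a per-batch error instead of returning on the first failure mirrors the commit's behavior of counting failed batches and continuing with the ones that succeeded.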