Add in-app result viewer and expand Gemini review
build-push / docker (push) Successful in 4m52s

AI Assistant
2026-03-16 10:12:12 +09:00
parent 9637b761bd
commit b43886e950
6 changed files with 414 additions and 306 deletions
+229 -253
@@ -1,238 +1,138 @@
# AI Media Hub Handover
## Working Rule
- From this point on, every meaningful change should be appended to this file so the next handoff can reconstruct:
- This file is both backlog and handover log.
- Every meaningful change should record:
- what changed
- why it changed
- how it was verified
- what remains risky
- Treat this file as both backlog and handover log, not just a static TODO list.
- what is still risky or incomplete
- If a push fails or a change remains local-only, that must be written here explicitly.
## Current Session Update (2026-03-13)
- Added a local self-test workflow before push/container build:
- `scripts/selftest.sh`
- `scripts/mock_searxng.py`
- Fixed Korean query translation fallback behavior:
- If `GEMINI_API_KEY` is missing or Gemini translation fails, the code now still attempts Google Translate fallback.
- If Google Translate fallback fails, dictionary replacement fallback still runs.
- Added Go tests for translation fallback logic.
- Fixed frontend HLS preview wiring:
- `hls.js` is now loaded in `frontend/index.html`
- frontend now tries `hls.js` first, then native HLS playback if available
- Corrected the practical local verification note:
- `go build ./backend` from repo root conflicts with the existing `backend/` directory name
- verified build command is now treated as `go build -o /tmp/... ./backend`
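The translation fallback order described in this update (Gemini, then Google Translate, then dictionary replacement) can be pictured as a simple chain. This is a minimal sketch, not the code in `backend/services/gemini.go`: the `translator` type, `translateWithFallback`, the three stubs, and the `도시` -> `city` dictionary entry are all hypothetical names and data introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// translator is a hypothetical signature for one translation strategy.
type translator func(query string) (string, error)

// translateWithFallback tries each strategy in order and returns the first
// non-empty result plus the index of the strategy that produced it.
// If every strategy fails, it returns the original query and -1.
func translateWithFallback(query string, chain ...translator) (string, int) {
	for i, translate := range chain {
		out, err := translate(query)
		if out = strings.TrimSpace(out); err == nil && out != "" {
			return out, i
		}
	}
	return query, -1
}

// geminiStub simulates a missing GEMINI_API_KEY (assumption for the demo).
func geminiStub(string) (string, error) {
	return "", errors.New("GEMINI_API_KEY is missing")
}

// googleStub simulates a failed Google Translate HTTP fallback.
func googleStub(string) (string, error) {
	return "", errors.New("google translate unavailable")
}

// dictionaryStub replaces a few known Korean media terms (illustrative entries).
func dictionaryStub(q string) (string, error) {
	replacer := strings.NewReplacer("사이버 펑크", "cyberpunk", "도시", "city")
	return replacer.Replace(q), nil
}

func main() {
	out, used := translateWithFallback("사이버 펑크 도시", geminiStub, googleStub, dictionaryStub)
	fmt.Printf("translated=%q strategy=%d\n", out, used)
}
```

Returning the index of the winning strategy makes it easy to log which fallback actually ran, which is useful when debugging translation drift.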
## Current Session Update (2026-03-13, Search/Preview Follow-up)
- Investigated a production search failure using downloaded frontend logs.
- Identified the main timeout cause:
- too many search results were being collected
- too many Gemini Vision batches were being evaluated sequentially
- backend debug messages were broadcasting oversized result payloads
- Applied search pipeline optimization:
- reduced per-source result caps
- reduced query fan-out for Google Video
- reduced enrichment cap
- limited Gemini Vision evaluation to top-ranked candidates only
- Improved Google Video filtering:
- added bans for music/BGM/trailer-style noise results
- Improved Envato enrichment fidelity:
- source page metadata is now preferred over search-engine proxy thumbnails
- source snippet/title are now taken from page metadata when available
- preview mp4 extraction now works via HTML/JSON-LD parsing
- added Python HTML fetch fallback for Cloudflare-challenged Envato pages because Go HTTP alone was receiving 403 challenge pages in testing
- Improved Artgrid fidelity:
- source page title/description/thumbnail are now preferred over search-engine snippets when available
- preview extraction is still not considered solved for all Artgrid clips because public HTML tested here did not expose a stable mp4/m3u8 URL
- Improved logging:
- backend search debug events now emit summaries, timings, source counts, preview counts, and Gemini batch stats instead of giant raw arrays
- frontend now logs raw non-JSON error bodies instead of collapsing them to `{}` on gateway/proxy failures
- Improved result rendering:
- search cards now show source snippet/description separately from AI reason to reduce confusion between asset metadata and Gemini commentary
## Current Session Update (2026-03-13, Regression Fix)
- A regression was found after search optimization:
- Envato and Artgrid disappeared entirely for some real searches while Google Video still returned results
- Root cause:
- the first optimization reduced query-variant breadth too aggressively
- the first 3 query variants were not enough to recover Envato/Artgrid in some real SearXNG result sets
- Fix applied:
- search now runs in two stages
- stage 1 searches only the first few variants for speed
- stage 2 searches additional variants only for sources that still returned zero results
- Intent:
- keep the anti-timeout optimization
- recover Envato/Artgrid recall when the early pass is too narrow
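The two-stage idea above can be sketched roughly as follows. This is an illustration under assumptions, not the actual `SearchService` code: `searchFunc`, `twoStageSearch`, and the `firstPass` parameter are hypothetical names, and the real implementation may select variants differently.

```go
package main

import "fmt"

// searchFunc is a hypothetical stand-in for one SearXNG query
// for a single source and a single query variant.
type searchFunc func(source, variant string) []string

// twoStageSearch runs the first firstPass variants for every source, then
// runs the remaining variants only for sources that still have zero results.
func twoStageSearch(sources, variants []string, firstPass int, search searchFunc) map[string][]string {
	if firstPass < 0 {
		firstPass = 0
	}
	if firstPass > len(variants) {
		firstPass = len(variants)
	}
	results := make(map[string][]string, len(sources))

	// Stage 1: narrow, fast pass over the first few variants.
	for _, src := range sources {
		for _, v := range variants[:firstPass] {
			results[src] = append(results[src], search(src, v)...)
		}
	}

	// Stage 2: widen only for sources that came back empty.
	for _, src := range sources {
		if len(results[src]) > 0 {
			continue
		}
		for _, v := range variants[firstPass:] {
			results[src] = append(results[src], search(src, v)...)
		}
	}
	return results
}

func main() {
	// Fake backend: Google Video always matches, Envato only on the 4th variant.
	fake := func(source, variant string) []string {
		if source == "Google Video" || variant == "v4" {
			return []string{source + ":" + variant}
		}
		return nil
	}
	got := twoStageSearch([]string{"Google Video", "Envato"}, []string{"v1", "v2", "v3", "v4"}, 3, fake)
	fmt.Println(len(got["Google Video"]), got["Envato"])
}
```

The point of the second stage is that the extra queries are paid only by sources that need them, so the anti-timeout budget of the first pass is preserved for the common case.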
## Current Session Update (2026-03-13, HTML Snapshot Analysis)
- Used saved HTML snapshots supplied by the user for:
- Envato item page
- Artgrid clip page
- Findings:
- Envato page exposes clean `VideoObject` JSON-LD with:
- exact asset title
- rich description
- thumbnail URL
- preview mp4 URL
- Artgrid page exposes reliable meta fields for:
- title
- description
- thumbnail
- canonical URL
- Artgrid snapshot still does **not** expose a stable preview mp4 or m3u8 in the saved HTML or downloaded asset bundle inspected here
- Fixes applied from the snapshots:
- Envato enrichment now prefers `VideoObject` JSON-LD over generic meta tags
- Envato search cards should now align much better with the actual source asset and preview
- Artgrid title/description are now cleaned so Gemini/source text is less polluted by site suffixes and generic boilerplate
- Remaining limitation:
- Artgrid hover-video preview cannot be derived reliably from the provided snapshot alone
- if Artgrid preview video is still required, the next useful artifact is a browser HAR or DevTools network capture from an opened clip page
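As a rough illustration of the `VideoObject` JSON-LD preference described above, the sketch below pulls the first `VideoObject` out of a page's `application/ld+json` script tags. It is not the project's parser. The field names are standard schema.org properties, but treating `contentUrl` as the preview mp4, assuming `thumbnailUrl` is a single string, and the names `videoObject` and `findVideoObject` are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// videoObject holds the schema.org VideoObject fields this sketch cares about.
// Assumption: thumbnailUrl is a plain string (schema.org also allows arrays).
type videoObject struct {
	Type         string `json:"@type"`
	Name         string `json:"name"`
	Description  string `json:"description"`
	ThumbnailURL string `json:"thumbnailUrl"`
	ContentURL   string `json:"contentUrl"`
}

// jsonLDScript matches the body of <script type="application/ld+json"> tags.
var jsonLDScript = regexp.MustCompile(`(?is)<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>`)

// findVideoObject returns the first JSON-LD block whose @type is VideoObject.
// Blocks that fail to decode into the struct (arrays, other shapes) are skipped.
func findVideoObject(html string) (videoObject, bool) {
	for _, m := range jsonLDScript.FindAllStringSubmatch(html, -1) {
		var vo videoObject
		if err := json.Unmarshal([]byte(m[1]), &vo); err != nil {
			continue
		}
		if vo.Type == "VideoObject" {
			return vo, true
		}
	}
	return videoObject{}, false
}

func main() {
	// Sample markup with placeholder example.com URLs, not real Envato output.
	page := `<html><head><script type="application/ld+json">
	{"@type":"VideoObject","name":"Neon City","description":"Night drive",
	 "thumbnailUrl":"https://example.com/t.jpg","contentUrl":"https://example.com/p.mp4"}
	</script></head></html>`
	if vo, ok := findVideoObject(page); ok {
		fmt.Println(vo.Name, vo.ThumbnailURL, vo.ContentURL)
	}
}
```

A regex is enough for a sketch because JSON-LD script bodies do not nest. A production parser would likely walk the HTML properly and handle `@graph` wrappers and array-valued fields.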
## Current Session Update (2026-03-13, Collector Refactor)
- Refactored the search pipeline into source-specific collectors:
- `envatoCollector`
- `artgridCollector`
- `googleVideoCollector`
- `SearchService` now acts mainly as:
- collector orchestration
- query-pass control
- dedupe
- cross-source enrichment scheduling
- Goal of the refactor:
- reduce cross-source coupling
- make future source-specific fixes safer
- make it easier to replace or disable one source without destabilizing the others
- Current implementation note:
- collectors are still in Go code under backend services, but the responsibilities are now separated by source instead of one monolithic search loop
## Current Session Update (2026-03-13, Artgrid Collector Fix + Ranker Split)
- Artgrid collector regression fixed:
- real search results can come back as `artlist.io/stock-footage/clip/.../<id>` instead of only `artgrid.io/clip/<id>/...`
- renderable filtering was rejecting those URLs, which caused `SearXNG returned no renderable results.` for Artgrid-only searches
- Fix applied:
- Artgrid renderability now accepts both `artgrid.io` and `artlist.io/stock-footage/clip/...` clip URLs
- Artgrid result links are normalized into `https://artgrid.io/clip/<id>/<slug>` inside the collector flow before filtering/enrichment
- Refactor continued:
- ranking / Gemini candidate evaluation / recommendation merge logic moved out of `handlers/api.go`
- new service layer file: `backend/services/ranker.go`
- handler is now thinner and less coupled to search internals
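The link normalization described above might look something like the sketch below. It assumes, purely for illustration, that the elided part of the Artlist path is a single slug segment and that clip IDs are numeric. The real URL shapes may differ, and `normalizeArtgridLink`, `buildArtgridURL`, and `isDigits` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isDigits reports whether s is a non-empty run of ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// buildArtgridURL assembles the canonical form used by this sketch.
func buildArtgridURL(id, slug string) string {
	if slug == "" {
		return "https://artgrid.io/clip/" + id
	}
	return "https://artgrid.io/clip/" + id + "/" + slug
}

// normalizeArtgridLink accepts artgrid.io/clip/<id>/<slug> and, by assumption,
// artlist.io/stock-footage/clip/<slug>/<id>, and returns one canonical URL.
func normalizeArtgridLink(raw string) (string, bool) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", false
	}
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")

	switch {
	case host == "artgrid.io" && len(parts) >= 2 && parts[0] == "clip" && isDigits(parts[1]):
		slug := ""
		if len(parts) >= 3 {
			slug = parts[2]
		}
		return buildArtgridURL(parts[1], slug), true
	case host == "artlist.io" && len(parts) >= 4 && parts[0] == "stock-footage" &&
		parts[1] == "clip" && isDigits(parts[len(parts)-1]):
		return buildArtgridURL(parts[len(parts)-1], parts[2]), true
	}
	return "", false // homepage, challenge page, or unknown shape
}

func main() {
	for _, raw := range []string{
		"https://artlist.io/stock-footage/clip/neon-city-street/123456",
		"https://www.artgrid.io/clip/123456/neon-city-street?ref=search",
		"https://artgrid.io/",
	} {
		link, ok := normalizeArtgridLink(raw)
		fmt.Println(ok, link)
	}
}
```

Normalizing before filtering means dedupe and enrichment only ever see one URL shape per clip, regardless of which domain SearXNG indexed.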
## Current Session Update (2026-03-13, 500 Fix)
- A server-side `request failed (500)` regression was found after the ranker split.
- Root cause:
- Gemini candidate cap logic returned `12` even when only `9` ranked candidates existed
- Gemini batch slicing then attempted to read beyond the available slice bounds
- Fix applied:
- `GeminiCandidateLimit` now never exceeds the real candidate count for totals up to 12
- Gemini evaluation now stays within valid ranked slice bounds
- Effect:
- avoids backend 500 during the Gemini Vision evaluation stage for mid-sized result sets
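The class of bug described here (a cap larger than the slice being batched) can be avoided by clamping the cap and deriving batch bounds from the clamped value. The sketch below is illustrative only: `clampCandidateLimit` and `batchBounds` are hypothetical helpers, not the functions in `backend/services/ranker.go`.

```go
package main

import "fmt"

// clampCandidateLimit never returns more than the number of real candidates.
func clampCandidateLimit(total, maxCandidates int) int {
	if total < 0 {
		return 0
	}
	if total < maxCandidates {
		return total
	}
	return maxCandidates
}

// batchBounds returns half-open [start, end) index pairs covering 0..limit,
// with the final batch shortened so it never runs past limit.
func batchBounds(limit, chunkSize int) [][2]int {
	if limit <= 0 || chunkSize <= 0 {
		return nil
	}
	bounds := make([][2]int, 0, (limit+chunkSize-1)/chunkSize)
	for start := 0; start < limit; start += chunkSize {
		end := start + chunkSize
		if end > limit {
			end = limit
		}
		bounds = append(bounds, [2]int{start, end})
	}
	return bounds
}

func main() {
	ranked := make([]string, 9) // 9 ranked candidates, cap of 12
	limit := clampCandidateLimit(len(ranked), 12)
	for _, b := range batchBounds(limit, 8) {
		batch := ranked[b[0]:b[1]] // always within bounds
		fmt.Printf("batch [%d:%d] size=%d\n", b[0], b[1], len(batch))
	}
}
```

With 9 candidates and a cap of 12, the clamped limit is 9, so the second batch covers indices 8 to 9 instead of slicing up to 12.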
## Current Session Update (2026-03-13, Artgrid Query Coverage Fix)
- Another Artgrid no-results regression was found even after the collector URL matcher was widened.
- Root cause:
- Artgrid collector query generation still leaned on `site:artgrid.io/clip/`
- in practice, canonical clip pages can surface under `artlist.io/stock-footage/clip/...`
- so some Artgrid-only searches still returned zero renderable results even though the accept filter had been fixed
- Fix applied:
- Artgrid query generation now searches both:
- `site:artgrid.io/clip/`
- `site:artlist.io/stock-footage/clip/`
- Effect:
- improves Artgrid recall in SearXNG result sets that favor canonical Artlist URLs over Artgrid URLs
## Current Session Update (2026-03-16, Query / Preview Follow-up)
- Search intent translation was updated to better preserve compound media phrases:
- added explicit normalization for terms like `사이버 펑크` -> `cyberpunk`
- added a guard that rejects over-compressed translations when the original query contains a richer multi-word intent
- Artgrid page parsing was tightened:
- generic Artgrid homepage / challenge HTML should no longer be mistaken for a real clip page during enrichment
- this prevents homepage thumbnails/descriptions from overwriting real search result metadata
- Hover preview playback was changed to lazy attach on hover:
- preview source is now attached on mouseenter
- playback waits for media readiness instead of trying to play immediately from the render path
- source is detached again on mouseleave
- Self-test script search step now retries to reduce flaky startup timing failures during local smoke tests
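The over-compression guard mentioned above could be approximated with a word-count heuristic. This is only a guess at the shape of such a check, not the project's logic: `isOverCompressed`, `pickSearchQuery`, and the thresholds are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isOverCompressed flags a translation that collapsed a multi-word query
// (at least minOriginalWords words) into a single word or nothing.
// Short originals are exempt, so a compound such as "사이버 펑크" may still
// legitimately become the single word "cyberpunk".
func isOverCompressed(original, translated string, minOriginalWords int) bool {
	originalWords := len(strings.Fields(original))
	translatedWords := len(strings.Fields(translated))
	return originalWords >= minOriginalWords && translatedWords < 2
}

// pickSearchQuery keeps the translation unless the guard rejects it,
// in which case the caller-supplied fallback is used instead.
func pickSearchQuery(original, translated, fallback string) string {
	if isOverCompressed(original, translated, 3) {
		return fallback
	}
	return translated
}

func main() {
	fmt.Println(pickSearchQuery("네온 불빛 사이버 펑크 도시", "city", "neon light cyberpunk city"))
	fmt.Println(pickSearchQuery("사이버 펑크", "cyberpunk", "cyber punk"))
}
```

A word-count check is crude, but it is cheap and deterministic, which fits a guard that runs before every search.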
## Local Self-Test Workflow
- Primary command:
- `bash scripts/selftest.sh`
- What it currently verifies:
- Go formatting for touched backend files
- Python syntax for worker + mock SearXNG
- `go test ./...`
- backend binary build
- local app boot with temp SQLite/download dirs
- `/healthz`
- `/api/search` using a local mock SearXNG server
- `/api/upload`
- Purpose:
- allow safe local regression checks before push or container build without depending on real SearXNG, Gemini, or browser interaction
## Project Summary
## Current State At A Glance
- Project: `ai-media-hub`
- Goal: AI-assisted media discovery + ingest dashboard for Unraid
- Backend: Go
- Worker: Python + `yt-dlp` + `ffmpeg`
- Frontend: HTML + Vanilla JS + Tailwind CDN
- Database: SQLite
- Current search backend: `SearXNG`
- Current vision/ranking backend: `Gemini 2.5 Flash`
- Search backend: `SearXNG`
- AI translation / visual ranking: `Gemini 2.5 Flash`
- Deployment target: single Docker container on Unraid
- Git remote: `https://git.savethenurse.com/savethenurse/ai-media-hub.git`
## Current Status Summary
- Upload / direct download flow is implemented and broadly usable.
- Search is implemented end-to-end and now refactored into source-specific collectors.
- Search remains the main unstable subsystem.
- Envato metadata and preview extraction are much stronger than before.
- Artgrid metadata fidelity is improved, but stable public hover-video preview extraction is still not solved.
- Frontend now logs more useful API and debug information than earlier versions.
- A local self-test workflow now exists and should be run before container builds or pushes.
## Current Architecture
- `backend/main.go`
App bootstrap, env loading, static frontend serving, route registration
- app bootstrap
- env loading
- static frontend serving
- route registration
- `backend/handlers/api.go`
Upload/download/search APIs, WebSocket progress broadcast, debug event broadcast
- upload / download / search APIs
- WebSocket progress broadcast
- debug event broadcast
- search request orchestration only, with ranking/Gemini logic mostly moved out
- `backend/services/cse.go`
Actual search backend service
Despite filename, this is no longer Google CSE logic
It now wraps SearXNG search, source filtering, result enrichment, preview asset parsing
- SearXNG querying
- shared search helpers
- source-specific enrich helpers
- URL filtering / parsing utilities
- `backend/services/search_collectors.go`
- source-specific collectors:
- `envatoCollector`
- `artgridCollector`
- `googleVideoCollector`
- `backend/services/ranker.go`
- ranking
- Gemini candidate cap logic
- Gemini batch evaluation wrapper
- recommendation merge logic
- `backend/services/gemini.go`
Query translation, deterministic query expansion helper, Gemini vision scoring
Also extracts first video frame with `ffmpeg` when no thumbnail exists
- query translation
- deterministic query expansion
- Gemini vision scoring
- video frame extraction via `ffmpeg` when needed
- `backend/models/db.go`
SQLite init + download history
- SQLite init
- download history
- `worker/downloader.py`
`yt-dlp` probe/download + ffmpeg clip extraction
- `yt-dlp` probe / download
- `ffmpeg` clip extraction
- `frontend/index.html`
Main dashboard UI, preview modal, debug log panel
- main dashboard UI
- result viewer modal
- preview modal
- debug log panel
- `frontend/app.js`
API calls, WebSocket status bar, hover preview playback, debug logger panel, platform toggles
- API calls
- WebSocket status bar
- result viewer modal
- hover preview playback
- direct download handoff for Google Video results
- debug logger panel
- platform toggles
- `frontend/style.css`
Custom styles, clamp helpers, slider thumb styles, debug panel scrollbar styles
- custom styles
- clamp helpers
- slider thumb styles
- debug panel scrollbar styles
- `scripts/selftest.sh`
- local smoke test flow
- `scripts/mock_searxng.py`
- local mock SearXNG used by self-test
- `unraid-template.xml`
Unraid template for current `git.savethenurse.com` image source
- Unraid template for current image source
## Search Flow: Current Implementation
1. User enters a query in Zone A.
2. Frontend sends `/api/search` with:
- `query`
- selected `platforms`
3. Backend translates the query to English in `GeminiService.TranslateQuery`.
Fallback order:
- Gemini translation
3. Backend translates the query in `GeminiService.TranslateQuery`.
- Gemini translation if available
- Google Translate HTTP fallback
- small Korean media-term dictionary replacement
- Korean media-term dictionary fallback
- explicit normalization for known compound phrases such as `사이버 펑크` -> `cyberpunk`
4. Backend builds deterministic English search variants in `GeminiService.ExpandQuery`.
5. Backend calls `SearchService.SearchMedia(...)`.
6. Search service queries SearXNG for:
- `Envato`
- `Artgrid`
- `Google Video`
7. Search service filters source URLs aggressively:
- Google Video: YouTube-only
5. `SearchService.SearchMedia(...)` orchestrates source-specific collectors.
6. Collectors query SearXNG separately for:
- Envato
- Artgrid
- Google Video
7. Each collector applies source-specific acceptance logic.
- Google Video: YouTube-only plus noise filtering
- Envato: `elements.envato.com` item URLs only
- Artgrid: `artgrid.io/clip/...` only
8. Search service enriches results:
- Envato: parses item page HTML for `og:image` and preview video URL
- Artgrid: attempts clip API + HTML parsing for thumbnails and preview sources
9. Backend ranks all results locally.
10. Backend evaluates all ranked results with Gemini vision in batches.
11. Backend merges Gemini recommendations + fallback ranked items and returns JSON to frontend.
12. Frontend renders cards and hover previews.
- Artgrid: accepts both:
- `artgrid.io/clip/...`
- `artlist.io/stock-footage/clip/...`
8. Artgrid canonical links are normalized to:
- `https://artgrid.io/clip/<id>/<slug>`
9. Results are enriched source-by-source.
- Envato:
- `VideoObject` JSON-LD preferred
- page meta preferred over search-engine proxy thumbnail
- preview mp4 extraction via JSON-LD / HTML parsing
- Python HTML fetch fallback used when Go HTTP fetch gets Cloudflare challenge pages
- Artgrid:
- page title / description / thumbnail cleaning
- homepage / challenge HTML is now rejected so generic site metadata does not overwrite clip metadata
- preview video extraction still not stable
10. Ranked results are passed through the shared ranker.
11. All ranked candidates are evaluated with Gemini Vision in batches.
12. Merge order now prefers:
- Gemini recommended items
- Gemini-reviewed non-recommended items
- keyword fallback items only if Gemini output is incomplete
13. Frontend renders cards, result viewer modal, and hover previews.
## Direct Downloader Flow: Current Implementation
1. User enters URL in Zone C.
@@ -248,6 +148,45 @@
7. Worker downloads source with `yt-dlp`, clips with `ffmpeg`, emits JSON progress lines.
8. Backend rebroadcasts progress over WebSocket.
## Major Work Completed So Far
- Added local self-test workflow:
- `scripts/selftest.sh`
- `scripts/mock_searxng.py`
- Fixed translation fallback when Gemini key is missing.
- Added tests for translation fallback logic.
- Added HLS frontend wiring:
- `hls.js` script
- native HLS fallback
- Reduced search timeout risk by:
- limiting collector result caps
- limiting enrichment scope
- limiting Gemini Vision evaluation scope
- replacing oversized raw debug result payloads with summaries
- Improved Google Video filtering:
- rejects more music / trailer / BGM style noise
- Improved Envato fidelity:
- real title / description / thumbnail / preview from source page
- Improved Artgrid fidelity:
- accepts canonical Artlist URLs
- normalizes Artgrid clip URLs
- cleans title / description better
- Refactored search into source-specific collectors.
- Moved ranking and Gemini batch handling into `backend/services/ranker.go`.
- Fixed server-side 500 caused by Gemini candidate cap exceeding available ranked candidates.
- Improved frontend logging:
- raw non-JSON error body logging
- more compact debug payload rendering
- Changed hover preview playback to lazy attach on hover:
- attach source on `mouseenter`
- wait for readiness before `play()`
- detach source on `mouseleave`
- Added in-app result viewer modal for search results:
- results now open in a modal instead of directly opening a new tab
- modal shows embedded site iframe, external open button, source summary, and full AI note
- Google Video results can now jump directly into the existing direct-download preview / crop flow from the result viewer
- Gemini reason generation is now intended to be Korean-first for readability
- Gemini Vision evaluation now covers all ranked results instead of only a top subset
## Current Features Implemented
- [x] Project folder structure
- [x] Dockerfile
@@ -262,22 +201,34 @@
- [x] WebSocket realtime progress
- [x] Search source toggles
- [x] Search card hover preview support
- [x] Result viewer modal for search results
- [x] Google Video direct-download handoff from search results
- [x] Debug log panel in frontend
- [x] `.log` download from debug panel
- [x] Local self-test workflow
- [x] Source-specific search collectors
- [x] Shared ranker service layer
## Important Current Constraints / Known Problems
- Search backend has been rewritten multiple times and is still the main unstable area.
- Envato previews are parsed mainly from page HTML metadata / structured data.
- Artgrid previews are partially inferred from:
- clip page HTML
- clip API attempts
- HLS preview handling in frontend
- Search relevance is still not considered stable enough.
- Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy.
- Frontend JavaScript was not linted with Node tooling in this environment because `node` is not installed here.
- Full browser-level preview validation is still not covered by the local self-test script.
- Search backend quality is still the most fragile subsystem.
- Search relevance is still heuristic-heavy and not yet benchmarked against a durable real-query set.
- Embedded result viewer uses an iframe, so some third-party sites may still block embedding with `X-Frame-Options` / CSP.
- Artgrid hover-video preview is still partial / unresolved:
- provided Artgrid HTML snapshots and downloaded asset bundles did not expose a stable public preview mp4/m3u8 URL
- public HTML often only exposes title / description / thumbnail / canonical URL
- Artgrid can still be sensitive to how SearXNG indexes canonical domains.
- Full browser-level validation is still not covered by local self-test.
- Frontend JavaScript still has no Node-based lint/build step in this environment.
- Search cards now separate source snippet from AI reason, but metadata fidelity still depends on source enrichment quality.
- Artgrid public pages inspected from this environment still did not expose a stable public preview video URL in HTML. Artgrid hover-video support may remain partial until a browser-captured HTML/HAR sample reveals the real preview source pattern.
- Gemini notes are now intended to be Korean, but final output quality still depends on Gemini response consistency.
- The local self-test script is better than before, but it is still a smoke test, not full integration coverage.
## Current Risks Around Search Quality
- Upstream SearXNG quality still controls the candidate pool.
- Gemini Vision can only rerank the candidates it receives.
- If source enrichment fails, Gemini may still judge a weaker proxy thumbnail or fallback image.
- Compound Korean intents are better handled now, but the translation path is still heuristic and can drift on niche concepts.
- Running Gemini Vision across all ranked results increases latency and token usage compared with the earlier capped approach.
## Frontend Debug Logger
- UI button: bottom-right `Logs`
@@ -291,8 +242,10 @@
- ignored WS debug messages
- status updates
- platform toggle state
- result viewer modal open / close
- preview source attach / detach
- hover start / hover end
- hover play errors
- modal preview open / close
- browser errors
- promise rejections
@@ -310,57 +263,80 @@
- `SEARXNG_WEB_ENGINE`
- `GEMINI_API_KEY`
## Unraid Template Notes
- Current image repository in template:
`git.savethenurse.com/savethenurse/ai-media-hub:latest`
- Current registry in template:
`https://git.savethenurse.com`
## Docker / Build Notes
- Dockerfile uses:
- Go build stage
- static ffmpeg image stage
- Python runtime stage
- Heavy apt ffmpeg install path was removed earlier to reduce build time.
## Git / Push Workflow Used So Far
- Branch: `main`
- Remote: `origin`
- All requested changes were committed and pushed incrementally to:
`https://git.savethenurse.com/savethenurse/ai-media-hub.git`
## Recent Relevant Commits
- `8ed1e84` Add in-app debug log panel
- `823bf12` Reflect selected platforms in search status
- `cceb040` Update platform status and HLS previews
- `ad8afd5` Tighten source filters and add platform toggles
- `27000db` Hide overlays during hover preview
- `b78865d` Rewrite search flow and enrich preview assets
- `de24886` Filter non-English expansions and prefer stock sources
- `0bd458d` Boost translated search fallback and source priority
## Next Priority Areas
- [ ] Search backend quality stabilization
The search service is the main unresolved area.
- [ ] Envato / Artgrid preview extraction hardening
- [ ] Search result relevance validation against real user queries
- [ ] Better matching between rendered description and actual linked asset
- [ ] Add browser-level verification for preview/HLS behavior
- [ ] Add more automated coverage for search ranking / filtering logic
- [ ] If Artgrid hover preview is still required, collect one real clip HTML/HAR from a browser session and derive a stable preview URL parser
- [ ] Add proper frontend build/lint step if Node becomes available
## Local Self-Test Workflow
- Primary command:
- `bash scripts/selftest.sh`
- What it currently verifies:
- Go formatting for touched backend files
- Python syntax for worker + mock SearXNG
- `go test ./...`
- backend binary build
- local app boot with temp SQLite/download dirs
- `/healthz`
- `/api/search` using local mock SearXNG
- `/api/upload`
- Notes:
- search step now retries to reduce startup timing flakiness
- this is a smoke test, not a browser-level verification suite
## Verified Locally In This Environment
- [x] `go build -o /tmp/ai-media-hub ./backend`
- [x] `go test ./...` (currently no broad test suite beyond the added fallback tests)
- [x] `go test ./...`
- [x] Python syntax check for worker + self-test helper
- [x] local app boot / `/healthz` through `scripts/selftest.sh`
- [x] local `/api/search` against mock SearXNG through `scripts/selftest.sh`
- [x] local `/api/upload` through `scripts/selftest.sh`
- [ ] full browser-level validation was not fully reproducible in this environment
## Unraid / Docker / CI Notes
- Dockerfile uses:
- Go build stage
- static ffmpeg image stage
- Python runtime stage
- Heavy apt ffmpeg install path was removed earlier to reduce build time.
- Gitea workflow builds and pushes:
- `git.savethenurse.com/savethenurse/ai-media-hub:latest`
- `git.savethenurse.com/savethenurse/ai-media-hub:${{ github.sha }}`
## Recent Relevant Commits
- `9637b76` Improve query intent handling and preview playback
- `6d9391b` Expand Artgrid query coverage to artlist canonical URLs
- `d8cc32e` Fix Gemini candidate cap causing search 500s
- `e426261` Fix Artgrid collector matching and split ranker
- `5aebbef` Refactor search into source-specific collectors
- `ae091c5` Improve source parsing from Envato and Artgrid HTML
- `06ea4f3` Restore Envato and Artgrid fallback search breadth
- `7dfb1ad` Stabilize search pipeline and improve preview diagnostics
- `6f3149a` Add local self-test flow and fix fallback regressions
- `f968458` Rewrite TODO as project handover
## Git / Push Status
- Last pushed commit known in earlier work:
- `6d9391b` was pushed successfully
- Local-only work currently exists:
- `9637b76 Improve query intent handling and preview playback`
- Push status for `9637b76`:
- not pushed
- remote rejected the push with:
- `remote unpack failed: unable to create temporary object directory`
- `remote rejected main -> main (unpacker error)`
- Interpretation:
- current blocker appears to be on the remote git server side, not a local git history issue
## Highest-Value Next Steps
- [ ] Re-try push of local commit once remote git storage/unpacker issue is resolved
- [ ] Build collector-specific integration tests with recorded SearXNG samples
- [ ] Separate source enrichment tests from live network behavior using local fixtures
- [ ] Add a browser-level preview validation path, especially for hover video
- [ ] If Artgrid hover preview is still required, obtain one real clip HAR / DevTools network export and derive a stable preview URL parser
- [ ] Build a small fixed real-query benchmark set to evaluate search quality before further tuning
- [ ] If frontend tooling becomes available, add lint/build checks
## Short Handover Summary
- The codebase exists and runs.
- The codebase runs.
- Upload/download features mostly exist.
- Search is implemented but is still the most fragile subsystem.
- A visible debug logging panel now exists in the web UI and should be used first when continuing work.
- Search has been significantly refactored and is in better shape than before, but it is still the main unstable area.
- Envato source fidelity is much better than earlier.
- Artgrid source fidelity is better, but preview-video extraction is still incomplete.
- There is now a local self-test workflow.
- There is one known local commit that has not been pushed because the remote repo reported an unpacker error.
+3 -3
@@ -320,7 +320,7 @@ func (a *App) searchMedia(c *gin.Context) {
}
scored := services.RankSearchResults(rankQuery, results)
a.debug("search ranked summary", summarizeSearchResults(scored, time.Since(started), services.GeminiCandidateLimit(len(scored)), ""))
a.Hub.Broadcast("progress", gin.H{"type": "search", "status": "analyzing top candidate visuals with Gemini Vision", "progress": 75})
a.Hub.Broadcast("progress", gin.H{"type": "search", "status": "analyzing all candidate visuals with Gemini Vision", "progress": 75})
recommended, geminiStats := services.EvaluateAllCandidatesWithGemini(a.GeminiService, req.Query, scored)
a.debug("search gemini evaluation", geminiStats)
err = nil
@@ -337,8 +337,8 @@ func (a *App) searchMedia(c *gin.Context) {
ThumbnailURL: result.ThumbnailURL,
PreviewVideoURL: result.PreviewVideoURL,
Source: result.Source,
Reason: "Keyword-ranked result added without extra Gemini vision tokens.",
Recommended: true,
Reason: "Gemini Vision 응답이 부족해 키워드 기준으로 보강된 결과입니다.",
Recommended: false,
})
}
warning := err.Error()
+16 -7
@@ -154,7 +154,8 @@ func (g *GeminiService) Recommend(query string, candidates []SearchResult) ([]AI
{
"text": `Analyze the provided images for the user's search intent. Return JSON only in this shape:
{"recommendations":[{"index":0,"reason":"short reason","recommended":true}]}
Mark only the best matches as recommended=true. Keep reasons concise. Recommend up to 8 items.
Return one entry for every analyzed candidate. Use Korean for every reason. Keep reasons concise but specific enough to explain usefulness.
Mark the strongest matches as recommended=true and weaker matches as recommended=false.
Prefer cinematic b-roll, stock footage, editorial footage, clean composition, usable establishing shots, and professional media thumbnails.
Avoid clickbait faces, exaggerated expressions, meme aesthetics, low-information thumbnails, sensational text overlays, or gossip-style imagery.
Favor thumbnails that look directly useful for media editing and footage sourcing.
@@ -230,7 +231,7 @@ User query: ` + query,
recommendations := make([]AIRecommendation, 0, len(parsed.Recommendations))
for _, rec := range parsed.Recommendations {
if rec.Index < 0 || rec.Index >= len(candidates) || !rec.Recommended {
if rec.Index < 0 || rec.Index >= len(candidates) {
continue
}
src := candidates[rec.Index]
@@ -241,13 +242,13 @@ User query: ` + query,
ThumbnailURL: src.ThumbnailURL,
PreviewVideoURL: src.PreviewVideoURL,
Source: src.Source,
Reason: rec.Reason,
Recommended: true,
Reason: normalizeKoreanReason(rec.Reason),
Recommended: rec.Recommended,
})
}
if len(recommendations) == 0 {
for _, candidate := range candidates[:min(4, len(candidates))] {
for _, candidate := range candidates[:min(8, len(candidates))] {
recommendations = append(recommendations, AIRecommendation{
Title: candidate.Title,
Link: candidate.Link,
@@ -255,8 +256,8 @@ User query: ` + query,
ThumbnailURL: candidate.ThumbnailURL,
PreviewVideoURL: candidate.PreviewVideoURL,
Source: candidate.Source,
Reason: "Fallback result because Gemini returned no recommended items.",
Recommended: true,
Reason: "Gemini Vision 평가를 받지 못해 키워드 기준으로 보강된 결과입니다.",
Recommended: false,
})
}
}
@@ -412,6 +413,14 @@ func truncateForError(text string, limit int) string {
return trimmed[:limit] + "..."
}
func normalizeKoreanReason(reason string) string {
trimmed := strings.TrimSpace(reason)
if trimmed == "" {
return "시각 정보가 제한적이지만 검색 의도와의 관련성을 기준으로 평가했습니다."
}
return trimmed
}
func buildSearchQueries(originalQuery, englishQuery string) []string {
base := strings.TrimSpace(englishQuery)
if base == "" {
+51 -18
@@ -3,6 +3,7 @@ package services
import (
"sort"
"strings"
"sync"
)
type GeminiBatchStats struct {
@@ -80,42 +81,63 @@ func RankSearchResults(query string, results []SearchResult) []SearchResult {
}
func GeminiCandidateLimit(total int) int {
switch {
case total <= 12:
return total
case total <= 16:
return 12
default:
return 16
}
return total
}
func EvaluateAllCandidatesWithGemini(service *GeminiService, query string, ranked []SearchResult) ([]AIRecommendation, GeminiBatchStats) {
const chunkSize = 8
const maxConcurrentBatches = 2
limit := GeminiCandidateLimit(len(ranked))
stats := GeminiBatchStats{
CandidateCap: limit,
Requested: min(limit, len(ranked)),
}
merged := make([]AIRecommendation, 0, len(ranked))
seen := map[string]bool{}
type batchResult struct {
index int
recommendations []AIRecommendation
err error
}
batches := make([][]SearchResult, 0, (limit+chunkSize-1)/chunkSize)
for start := 0; start < limit; start += chunkSize {
end := start + chunkSize
if end > limit {
end = limit
}
batch := ranked[start:end]
stats.Batches++
recommended, err := service.Recommend(query, batch)
if err != nil {
batches = append(batches, ranked[start:end])
}
stats.Batches = len(batches)
results := make([]batchResult, len(batches))
var wg sync.WaitGroup
sem := make(chan struct{}, maxConcurrentBatches)
for idx, batch := range batches {
wg.Add(1)
go func(batchIndex int, candidates []SearchResult) {
defer wg.Done()
sem <- struct{}{}
defer func() { <-sem }()
recommended, err := service.Recommend(query, candidates)
results[batchIndex] = batchResult{
index: batchIndex,
recommendations: recommended,
err: err,
}
}(idx, batch)
}
wg.Wait()
merged := make([]AIRecommendation, 0, len(ranked))
seen := map[string]bool{}
for _, batch := range results {
if batch.err != nil {
stats.Failed++
if len(stats.Errors) < 5 {
stats.Errors = append(stats.Errors, err.Error())
stats.Errors = append(stats.Errors, batch.err.Error())
}
continue
}
stats.Succeeded++
for _, item := range recommended {
for _, item := range batch.recommendations {
if item.Link == "" || seen[item.Link] {
continue
}
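The hunk above replaces sequential Gemini calls with a bounded fan-out: batches are built first, each one runs in its own goroutine, a buffered channel caps how many calls are in flight, and each goroutine writes only to its own slot in a pre-sized slice so the merge step sees results in batch order. The following is a minimal standalone sketch of that pattern, not the project's code. `batchOutcome`, `splitBatches`, `runBatches`, and the `work` callback are hypothetical names standing in for `batchResult`, the chunking loop, and `service.Recommend`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// batchOutcome is a hypothetical stand-in for the diff's batchResult.
type batchOutcome struct {
	index int
	items []string
	err   error
}

// splitBatches cuts items into consecutive chunks of at most size elements.
// It assumes size > 0.
func splitBatches(items []string, size int) [][]string {
	batches := make([][]string, 0, (len(items)+size-1)/size)
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

// runBatches calls work once per batch with at most maxConcurrent calls in flight.
// Each goroutine writes only to outcomes[i], so no mutex is needed and the
// returned slice keeps the original batch order.
func runBatches(batches [][]string, maxConcurrent int, work func([]string) ([]string, error)) []batchOutcome {
	outcomes := make([]batchOutcome, len(batches))
	sem := make(chan struct{}, maxConcurrent)
	var wg sync.WaitGroup
	for idx, batch := range batches {
		wg.Add(1)
		go func(i int, b []string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot; blocks while maxConcurrent calls are running
			defer func() { <-sem }() // release the slot when this batch finishes
			items, err := work(b)
			outcomes[i] = batchOutcome{index: i, items: items, err: err}
		}(idx, batch)
	}
	wg.Wait()
	return outcomes
}

func main() {
	links := []string{"a", "b", "c", "d", "e"}
	// The callback simulates one failing batch; a real caller would invoke the Gemini service here.
	outcomes := runBatches(splitBatches(links, 2), 2, func(b []string) ([]string, error) {
		if b[0] == "c" {
			return nil, errors.New("simulated failure")
		}
		return b, nil
	})
	for _, o := range outcomes {
		fmt.Println(o.index, o.items, o.err)
	}
}
```

Acquiring the semaphore inside the goroutine, as the diff does, starts every goroutine immediately and only limits the calls themselves. That is cheap for a handful of batches; acquiring before `go` would be the alternative if the batch count could grow large.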
@@ -132,6 +154,9 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult,
seen := map[string]bool{}
for _, item := range recommended {
if !item.Recommended {
continue
}
if item.Link == "" || seen[item.Link] {
continue
}
@@ -139,6 +164,14 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult,
merged = append(merged, item)
}
for _, item := range recommended {
if item.Recommended || item.Link == "" || seen[item.Link] || len(merged) >= limit {
continue
}
seen[item.Link] = true
merged = append(merged, item)
}
for _, item := range ranked {
if len(merged) >= limit || item.Link == "" || seen[item.Link] {
continue
@@ -151,8 +184,8 @@ func MergeRecommendations(recommended []AIRecommendation, ranked []SearchResult,
ThumbnailURL: item.ThumbnailURL,
PreviewVideoURL: item.PreviewVideoURL,
Source: item.Source,
Reason: "Keyword-ranked result added without extra Gemini vision tokens.",
Recommended: true,
Reason: "Gemini Vision 응답이 부족해 키워드 기준으로 보강된 결과입니다.",
Recommended: false,
})
}
return merged
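After this change `MergeRecommendations` appears to fill the result list in three tiers: items Gemini marked as recommended, then items Gemini evaluated but did not recommend, then keyword-ranked results that Gemini never returned, with duplicates dropped by link and the total capped at `limit`. The sketch below illustrates that ordering under simplified assumptions. `rec` and `mergeTiers` are hypothetical names, the struct carries only two of the real fields, and applying the cap to the first tier is an assumption because that part of the function is outside the hunk.

```go
package main

import "fmt"

// rec is a hypothetical, trimmed-down stand-in for AIRecommendation.
type rec struct {
	Link        string
	Recommended bool
}

// mergeTiers returns up to limit unique items: recommended first, then
// evaluated-but-not-recommended, then keyword-ranked fallbacks.
func mergeTiers(evaluated []rec, rankedLinks []string, limit int) []rec {
	merged := make([]rec, 0)
	seen := map[string]bool{}
	// add appends r unless the cap is reached, the link is empty, or it was already added.
	add := func(r rec) {
		if len(merged) >= limit || r.Link == "" || seen[r.Link] {
			return
		}
		seen[r.Link] = true
		merged = append(merged, r)
	}
	for _, r := range evaluated {
		if r.Recommended {
			add(r)
		}
	}
	for _, r := range evaluated {
		if !r.Recommended {
			add(r)
		}
	}
	for _, link := range rankedLinks {
		// Fallback entries are marked as not recommended, matching the diff's new behavior.
		add(rec{Link: link, Recommended: false})
	}
	return merged
}

func main() {
	evaluated := []rec{{"x", false}, {"y", true}, {"y", true}, {"", true}}
	for _, r := range mergeTiers(evaluated, []string{"y", "z", "w"}, 3) {
		fmt.Println(r.Link, r.Recommended)
	}
}
```

Marking fallback entries as not recommended keeps the flag meaningful for the frontend: only items Gemini actually endorsed carry `Recommended: true`.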
+78 -21
@@ -37,12 +37,22 @@ const clearLogs = document.getElementById("clearLogs");
const downloadLogs = document.getElementById("downloadLogs");
const debugLogList = document.getElementById("debugLogList");
const debugSummary = document.getElementById("debugSummary");
const resultModal = document.getElementById("resultModal");
const resultModalTitle = document.getElementById("resultModalTitle");
const resultModalSource = document.getElementById("resultModalSource");
const resultModalSnippet = document.getElementById("resultModalSnippet");
const resultModalReason = document.getElementById("resultModalReason");
const resultModalFrame = document.getElementById("resultModalFrame");
const resultModalOpenExternal = document.getElementById("resultModalOpenExternal");
const resultModalDownload = document.getElementById("resultModalDownload");
const closeResultModal = document.getElementById("closeResultModal");
let pendingDownload = null;
let cropStart = 0;
let cropEnd = 0;
let cropMax = 0;
let activeThumb = null;
let activeResultItem = null;
const activePlatforms = new Set(["envato", "artgrid", "google video"]);
const hlsInstances = new WeakMap();
const debugEntries = [];
@@ -319,13 +329,13 @@ function renderResults(results) {
const image = node.querySelector("img");
const previewVideo = node.querySelector(".preview-hover");
const overlays = node.querySelectorAll(".preview-overlay");
node.href = item.link;
image.src = item.thumbnailUrl || "https://placehold.co/1280x720/0a0a0a/ffffff?text=Preview";
image.alt = item.title;
node.querySelector("h3").textContent = item.title;
node.querySelector(".result-snippet").textContent = item.snippet || item.reason || item.source || "";
node.querySelector(".result-reason").textContent = item.reason ? `AI note: ${item.reason}` : "";
node.querySelector(".result-reason").textContent = item.reason ? `AI 노트: ${item.reason}` : "";
node.querySelector(".source-badge").textContent = item.source;
node.addEventListener("click", () => openResultModal(item));
previewVideo.poster = item.thumbnailUrl || "";
if (item.previewVideoUrl) {
const mediaArea = node.querySelector(".relative");
@@ -347,6 +357,53 @@ function renderResults(results) {
}
}
async function prepareDirectDownload(targetUrl) {
downloadResult.textContent = "checking duplicate history...";
const dup = await api(`/api/history/check?url=${encodeURIComponent(targetUrl)}`);
let force = false;
if (dup.exists) {
force = window.confirm("동일 URL 다운로드 이력이 있습니다. 계속 진행할까요?");
if (!force) {
downloadResult.textContent = "cancelled";
return;
}
}
pendingDownload = { url: targetUrl, force };
downloadResult.textContent = "loading preview...";
const preview = await api("/api/download/preview", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ url: targetUrl }),
});
openPreviewModal(preview);
downloadResult.textContent = "preview loaded";
}
function openResultModal(item) {
activeResultItem = item;
resultModalTitle.textContent = item.title || "Untitled";
resultModalSource.textContent = item.source || "";
resultModalSnippet.textContent = item.snippet || "원본 페이지에서 사용할 수 있는 설명이 없습니다.";
resultModalReason.textContent = item.reason || "AI 노트가 없습니다.";
resultModalFrame.src = item.link || "about:blank";
resultModalOpenExternal.href = item.link || "#";
const canDirectDownload = item.source === "Google Video" && item.link;
resultModalDownload.classList.toggle("hidden", !canDirectDownload);
resultModal.classList.remove("hidden");
resultModal.classList.add("flex");
logEvent("result:modal:open", { title: item.title, source: item.source, link: item.link });
}
function closeResultViewer() {
if (!resultModal.classList.contains("hidden")) {
logEvent("result:modal:close", { title: activeResultItem?.title || "" });
}
activeResultItem = null;
resultModalFrame.src = "about:blank";
resultModal.classList.add("hidden");
resultModal.classList.remove("flex");
}
searchForm.addEventListener("submit", async (event) => {
event.preventDefault();
setStatus("preparing search", 5);
@@ -458,26 +515,8 @@ fileInput.addEventListener("change", async () => {
downloadForm.addEventListener("submit", async (event) => {
event.preventDefault();
downloadResult.textContent = "checking duplicate history...";
try {
const dup = await api(`/api/history/check?url=${encodeURIComponent(downloadUrl.value)}`);
let force = false;
if (dup.exists) {
force = window.confirm("동일 URL 다운로드 이력이 있습니다. 계속 진행할까요?");
if (!force) {
downloadResult.textContent = "cancelled";
return;
}
}
pendingDownload = { url: downloadUrl.value, force };
downloadResult.textContent = "loading preview...";
const preview = await api("/api/download/preview", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ url: downloadUrl.value }),
});
openPreviewModal(preview);
downloadResult.textContent = "preview loaded";
await prepareDirectDownload(downloadUrl.value);
} catch (error) {
downloadResult.textContent = error.message;
logEvent("download:preview:error", { message: error.message, data: error.data || null });
@@ -509,6 +548,24 @@ confirmDownload.addEventListener("click", async () => {
});
closePreviewModal.addEventListener("click", closeModal);
closeResultModal.addEventListener("click", closeResultViewer);
resultModal.addEventListener("click", (event) => {
if (event.target === resultModal) {
closeResultViewer();
}
});
resultModalDownload.addEventListener("click", async () => {
// Capture the item first: closeResultViewer() resets activeResultItem to null.
const item = activeResultItem;
if (!item?.link) {
return;
}
try {
closeResultViewer();
await prepareDirectDownload(item.link);
} catch (error) {
downloadResult.textContent = error.message;
logEvent("download:preview:error", { message: error.message, data: error.data || null, source: item.source || "" });
}
});
previewModal.addEventListener("click", (event) => {
if (event.target === previewModal) {
closeModal();
+37 -4
@@ -149,8 +149,41 @@
</div>
</div>
<div id="resultModal" class="fixed inset-0 z-50 hidden items-center justify-center bg-black/80 px-4">
<div class="flex h-[88vh] w-full max-w-7xl flex-col overflow-hidden rounded-3xl border border-white/10 bg-zinc-950 shadow-2xl">
<div class="flex items-center justify-between border-b border-white/10 px-5 py-4">
<div class="min-w-0">
<p id="resultModalSource" class="text-xs uppercase tracking-[0.25em] text-zinc-500"></p>
<h3 id="resultModalTitle" class="mt-1 truncate text-xl font-semibold text-white"></h3>
</div>
<div class="flex items-center gap-2">
<a id="resultModalOpenExternal" target="_blank" rel="noreferrer" class="rounded-full border border-white/10 px-3 py-2 text-xs uppercase tracking-[0.2em] text-zinc-300">Open</a>
<button id="resultModalDownload" type="button" class="hidden rounded-full border border-white/10 px-3 py-2 text-xs uppercase tracking-[0.2em] text-zinc-300">Direct Download</button>
<button id="closeResultModal" type="button" class="rounded-full border border-white/10 px-3 py-2 text-xs uppercase tracking-[0.2em] text-zinc-300">Close</button>
</div>
</div>
<div class="grid min-h-0 flex-1 gap-0 lg:grid-cols-[1.5fr_0.85fr]">
<div class="min-h-0 border-b border-white/10 lg:border-b-0 lg:border-r">
<iframe id="resultModalFrame" class="h-full w-full bg-white" referrerpolicy="no-referrer"></iframe>
</div>
<div class="min-h-0 overflow-auto px-5 py-5">
<div class="space-y-5">
<div>
<p class="text-xs uppercase tracking-[0.25em] text-zinc-500">Source Summary</p>
<p id="resultModalSnippet" class="mt-2 text-sm leading-7 text-zinc-300"></p>
</div>
<div>
<p class="text-xs uppercase tracking-[0.25em] text-zinc-500">AI Note</p>
<p id="resultModalReason" class="mt-2 whitespace-pre-wrap text-sm leading-7 text-zinc-200"></p>
</div>
</div>
</div>
</div>
</div>
</div>
<template id="searchCardTemplate">
<a target="_blank" rel="noreferrer" class="group overflow-hidden rounded-3xl border border-white/10 bg-black/30 transition hover:border-white/30">
<button type="button" class="group overflow-hidden rounded-3xl border border-white/10 bg-black/30 text-left transition hover:border-white/30">
<div class="relative aspect-video overflow-hidden bg-zinc-900">
<img class="h-full w-full object-cover transition duration-500 group-hover:scale-105" alt="" />
<video class="preview-hover absolute inset-0 hidden h-full w-full object-cover" muted loop playsinline preload="none"></video>
@@ -160,11 +193,11 @@
<div class="space-y-2 p-5">
<h3 class="line-clamp-2 text-base font-medium text-white"></h3>
<p class="result-snippet line-clamp-3 text-sm text-zinc-400"></p>
<p class="result-reason line-clamp-2 text-xs uppercase tracking-[0.12em] text-zinc-500"></p>
<p class="result-reason line-clamp-2 text-xs tracking-[0.02em] text-zinc-500"></p>
</div>
</a>
</button>
</template>
<script src="/app.js?v=20260313i" defer></script>
<script src="/app.js?v=20260316a" defer></script>
</body>
</html>