Stabilize search pipeline and improve preview diagnostics
build-push / docker (push) Successful in 4m14s
@@ -23,6 +23,33 @@
- `go build ./backend` from repo root conflicts with the existing `backend/` directory name
- verified build command is now treated as `go build -o /tmp/... ./backend`

## Current Session Update (2026-03-13, Search/Preview Follow-up)
- Investigated a production search failure using downloaded frontend logs.
- Identified the main timeout causes:
  - too many search results were being collected
  - too many Gemini Vision batches were being evaluated sequentially
  - backend debug messages were broadcasting oversized result payloads
- Applied search pipeline optimizations:
  - reduced per-source result caps
  - reduced query fan-out for Google Video
  - reduced enrichment cap
  - limited Gemini Vision evaluation to top-ranked candidates only
- Improved Google Video filtering:
  - added bans for music/BGM/trailer-style noise results
- Improved Envato enrichment fidelity:
  - source page metadata is now preferred over search-engine proxy thumbnails
  - source snippet/title are now taken from page metadata when available
  - preview mp4 extraction now works via HTML/JSON-LD parsing
  - added Python HTML fetch fallback for Cloudflare-challenged Envato pages because Go HTTP alone was receiving 403 challenge pages in testing
- Improved Artgrid fidelity:
  - source page title/description/thumbnail are now preferred over search-engine snippets when available
  - preview extraction is still not considered solved for all Artgrid clips because public HTML tested here did not expose a stable mp4/m3u8 URL
- Improved logging:
  - backend search debug events now emit summaries, timings, source counts, preview counts, and Gemini batch stats instead of giant raw arrays
  - frontend now logs raw non-JSON error bodies instead of collapsing them to `{}` on gateway/proxy failures
- Improved result rendering:
  - search cards now show source snippet/description separately from AI reason to reduce confusion between asset metadata and Gemini commentary
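
The trimming steps listed above (noise bans, per-source caps, top-ranked-only Gemini Vision evaluation, and summary-style debug events) can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the `Result` and `Summary` types, the function names, the ban list, and the limits used in `main` are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Result is a hypothetical search hit; the field names are illustrative.
type Result struct {
	Source string
	Title  string
	Score  float64
}

// bannedTerms is an assumed ban list, not the project's real one.
var bannedTerms = []string{"music", "bgm", "trailer"}

// isNoise reports whether a title contains a banned term (case-insensitive substring match).
func isNoise(title string) bool {
	t := strings.ToLower(title)
	for _, term := range bannedTerms {
		if strings.Contains(t, term) {
			return true
		}
	}
	return false
}

// capPerSource drops noise and keeps at most limit results per source, preserving input order.
func capPerSource(in []Result, limit int) []Result {
	seen := map[string]int{}
	out := make([]Result, 0, len(in))
	for _, r := range in {
		if isNoise(r.Title) || seen[r.Source] >= limit {
			continue
		}
		seen[r.Source]++
		out = append(out, r)
	}
	return out
}

// topRanked returns up to n highest-scoring results without mutating the input slice.
func topRanked(in []Result, n int) []Result {
	sorted := append([]Result(nil), in...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Score > sorted[j].Score })
	if n >= 0 && len(sorted) > n {
		sorted = sorted[:n]
	}
	return sorted
}

// Summary is a hypothetical compact debug payload: counts only, no raw result arrays.
type Summary struct {
	Collected        int            `json:"collected"`
	Kept             int            `json:"kept"`
	VisionCandidates int            `json:"vision_candidates"`
	SourceCounts     map[string]int `json:"source_counts"`
}

// summarize builds the compact payload from the three pipeline stages.
func summarize(collected, kept, vision []Result) Summary {
	counts := map[string]int{}
	for _, r := range kept {
		counts[r.Source]++
	}
	return Summary{len(collected), len(kept), len(vision), counts}
}

func main() {
	// Made-up sample data for the sketch.
	hits := []Result{
		{"envato", "Aerial city at night", 0.91},
		{"envato", "City skyline timelapse", 0.74},
		{"envato", "City street crowd", 0.66},
		{"google_video", "City BGM mix 1 hour", 0.80},
		{"google_video", "Drone over city", 0.58},
		{"artgrid", "Night traffic", 0.88},
	}
	kept := capPerSource(hits, 2)  // assumed per-source cap
	vision := topRanked(kept, 2)   // assumed number of Vision candidates
	payload, err := json.Marshal(summarize(hits, kept, vision))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
}
```

Plain substring matching is the simplest possible ban rule and will also reject legitimate titles such as "Musician portrait"; a real filter would likely need word boundaries or per-source rules.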

## Local Self-Test Workflow
- Primary command:
  - `bash scripts/selftest.sh`
@@ -145,7 +172,8 @@
- Gemini batch evaluation exists, but search quality can still degrade if upstream SearXNG results are noisy.
- Frontend JavaScript was not linted with Node tooling in this environment because `node` is not installed here.
- Full browser-level preview validation is still not covered by the local self-test script.
- Search cards still render recommendation reason text, not a robust asset description/snippet mapping.
- Search cards now separate source snippet from AI reason, but metadata fidelity still depends on source enrichment quality.
- Artgrid public pages inspected from this environment still did not expose a stable public preview video URL in HTML, so Artgrid hover-video support may remain partial until a browser-captured HTML/HAR sample reveals the real preview source pattern.
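
One way to check whether a saved page (for example a browser-captured HTML sample) exposes a preview URL in its markup is to scan its JSON-LD blocks. The sketch below is an assumption-heavy illustration, not the backend's parser: it assumes the preview is published under the schema.org `contentUrl` property, uses a regexp instead of a real HTML parser, and the names `previewURLs` and `findContentURLs` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// ldScript matches <script type="application/ld+json"> blocks.
// A regexp is a simplification; a real HTML parser would be more robust.
var ldScript = regexp.MustCompile(`(?is)<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>`)

// findContentURLs walks decoded JSON and collects every string stored under a "contentUrl" key.
func findContentURLs(v any, out *[]string) {
	switch t := v.(type) {
	case map[string]any:
		for k, val := range t {
			if s, ok := val.(string); ok && k == "contentUrl" {
				*out = append(*out, s)
				continue
			}
			findContentURLs(val, out)
		}
	case []any:
		for _, item := range t {
			findContentURLs(item, out)
		}
	}
}

// previewURLs returns contentUrl values whose path ends in .mp4 or .m3u8.
func previewURLs(html string) []string {
	var urls []string
	for _, m := range ldScript.FindAllStringSubmatch(html, -1) {
		var doc any
		if err := json.Unmarshal([]byte(m[1]), &doc); err != nil {
			continue // skip malformed JSON-LD blocks
		}
		var found []string
		findContentURLs(doc, &found)
		for _, u := range found {
			// Ignore any query string when checking the extension.
			path := strings.ToLower(strings.SplitN(u, "?", 2)[0])
			if strings.HasSuffix(path, ".mp4") || strings.HasSuffix(path, ".m3u8") {
				urls = append(urls, u)
			}
		}
	}
	return urls
}

func main() {
	// Made-up page fragment; example.com is a placeholder, not a real asset URL.
	page := `<html><head><script type="application/ld+json">
	{"@type":"VideoObject","name":"Sample clip","contentUrl":"https://example.com/preview.mp4"}
	</script></head></html>`
	fmt.Println(previewURLs(page))
}
```

An empty result only means nothing was found in static JSON-LD; pages that load the preview source via script at runtime would still need the HAR-based inspection described above.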

## Frontend Debug Logger
- UI button: bottom-right `Logs`
@@ -215,6 +243,7 @@
- [ ] Better matching between rendered description and actual linked asset
- [ ] Add browser-level verification for preview/HLS behavior
- [ ] Add more automated coverage for search ranking / filtering logic
- [ ] If Artgrid hover preview is still required, collect one real clip HTML/HAR from a browser session and derive a stable preview URL parser
- [ ] Add proper frontend build/lint step if Node becomes available

## Verified Locally In This Environment