Refresh TODO with current system state
build-push / docker (push) Successful in 4m14s

AI Assistant
2026-03-16 15:41:42 +09:00
parent b3db1c89e5
commit c5f6c611ec
+62 -18
@@ -429,35 +429,79 @@
- `f968458` Rewrite TODO as project handover
## Git / Push Status
- Last pushed commit known in earlier work:
- `6d9391b` was pushed successfully
- Local-only work currently exists:
- `9637b76 Improve query intent handling and preview playback`
- Push status for `9637b76`:
- not pushed
- remote rejected the push with:
- `remote unpack failed: unable to create temporary object directory`
- `remote rejected main -> main (unpacker error)`
- Interpretation:
- current blocker appears to be on the remote git server side, not a local git history issue
- Current branch in ongoing work: `main`
- Current state:
- latest work in this environment has been pushed successfully multiple times after the earlier remote unpacker issue
- the older push failure note is historical context only and should not be treated as the current repo state
- Operational note:
- because the frontend is static and aggressively cached, browser hard refreshes are often required after UI / modal / preview changes
## Highest-Value Next Steps
- [ ] Re-try push of local commit once remote git storage/unpacker issue is resolved
- [ ] Reduce `/api/search` latency further without collapsing result count
- [ ] Improve Envato / Artgrid preview acquisition reliability so Gemini Vision sees real frames more often
- [ ] Revisit Google Video UX:
- current YouTube embed was abandoned due to error `153`
- current in-app panel is more reliable but less rich than a true embedded watch page
- [ ] Build collector-specific integration tests with recorded SearXNG samples
- [ ] Separate source enrichment tests from live network behavior using local fixtures
- [ ] Add a browser-level preview validation path, especially for hover video
- [ ] Add a browser-level preview validation path, especially for hover video and preview proxy routing
- [ ] If Artgrid hover preview is still required, obtain one real clip HAR / DevTools network export and derive a stable preview URL parser
- [ ] Build a small fixed real-query benchmark set to evaluate search quality before further tuning
- [ ] If frontend tooling becomes available, add lint/build checks
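One way the "recorded SearXNG samples" idea could look is sketched below. It assumes the samples are saved SearXNG JSON responses with a top-level `results` list whose items carry a `url` field. The `SOURCE_HOSTS` mapping and the `count_sources` helper are hypothetical names for illustration, not code from this repo.

```python
import json
from collections import Counter
from urllib.parse import urlparse

# Hypothetical host-to-source mapping; adjust to the hosts the collectors actually target.
SOURCE_HOSTS = {
    "elements.envato.com": "envato",
    "artgrid.io": "artgrid",
    "www.youtube.com": "google_video",
}


def count_sources(sample_json: str) -> Counter:
    """Count results per source in one recorded SearXNG JSON response."""
    payload = json.loads(sample_json)
    counts: Counter = Counter()
    for item in payload.get("results", []):
        host = urlparse(item.get("url", "")).netloc.lower()
        counts[SOURCE_HOSTS.get(host, "other")] += 1
    return counts
```

A fixture test built on this could assert that a recorded sample still yields at least one Envato or Artgrid hit, which would make Google Video-only result sets visible without any live network calls.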
## Short Handover Summary
- The codebase runs.
- Upload/download features mostly exist.
- Search has been significantly refactored and is in a better shape than before, but is still the main unstable area.
- Envato source fidelity is much better than earlier.
- Artgrid source fidelity is better, but preview-video extraction is still incomplete.
- Upload / direct-download features mostly exist and are broadly usable.
- Search is functional but still the least stable subsystem by a wide margin.
- Envato source fidelity is better than before, but Cloudflare / fetch failures still affect enrichment and preview acquisition.
- Artgrid source fidelity is improved, but query coverage and preview extraction are still unreliable.
- There is now a local self-test workflow.
- There is one known local commit that has not been pushed because the remote repo reported an unpacker error.
- Backend debug logging is now much more detailed and intended to support exported log-file analysis from the in-app `Logs` panel.
## Current Reality Check
- Search request flow is now heavily instrumented.
- The frontend `Logs` panel can capture:
- API request start / completion
- SearXNG request / response counts
- collector query expansion
- enrichment start / finish
- Gemini translation / vision preparation / batch failures
- preview proxy fetch / cache events
- The latest broad issue pattern observed in logs is:
- too many SearXNG calls for a single request can still push total latency too high
- Envato / Artgrid often fail to provide enough preview-capable candidates for Gemini Vision
- Google Video is frequently the easiest source to retrieve and therefore can dominate final results
- YouTube embed error `153` made the prior Google modal approach unreliable
## Active Problems
- `504 Gateway Time-out`
- Root cause: `/api/search` can still become too expensive when query expansion, source collectors, enrichment, and Gemini batch retries stack together.
- Current mitigation: request-level time budget and partial-result return path.
- Residual risk: fewer reviewed results can be returned when the budget is exhausted.
- Too many Google Video-only result sets
- Root cause: Envato / Artgrid queries can still produce repeated `rawCount: 0` responses from SearXNG.
- Current mitigation: looser unquoted query variants for both sources.
- Residual risk: upstream SearXNG quality still dominates discovery.
- Gemini Vision partial or weak evaluation
- Root cause: many candidates still lack usable thumbnails / preview frames, so Gemini sees fewer real visuals than the raw result count suggests.
- Current mitigation: more verbose visual-fetch logging, preview-video-first strategy for Envato / Artgrid, and partial backfill from ranked candidates.
- Residual risk: if source media cannot be fetched, Gemini quality still degrades sharply.
- Envato / Artgrid preview instability
- Root cause: source HTML can be incomplete, fetches can fail, and some previews may only appear after client-side rendering or protected media access paths.
- Current mitigation: JSON-LD/meta/hydration parsing, delayed retry, preview proxy route, MP4 cache, and HLS playlist rewriting.
- Residual risk: a real browser-rendered fetch path may still be needed later for some pages.
- Google Video popup UX
- Root cause: YouTube embed error `153`.
- Current mitigation: dedicated in-app Google panel instead of direct embed.
- Residual risk: this is reliable but not as rich as showing the live watch page.
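The request-level time budget with a partial-result return path, listed above as the mitigation for `504 Gateway Time-out`, can be illustrated with a minimal sketch. The `run_with_budget` function and its stage callables are hypothetical and not the actual `/api/search` implementation. The sketch only checks the budget between stages, so a single slow stage can still overrun it.

```python
import time
from typing import Callable, List, Sequence, Tuple


def run_with_budget(
    stages: Sequence[Callable[[], List[str]]],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[str], bool]:
    """Run stages in order and stop starting new ones once the budget is spent.

    Returns (results, partial). partial is True when at least one stage was
    skipped because the deadline had already passed.
    """
    deadline = clock() + budget_seconds
    results: List[str] = []
    for stage in stages:
        if clock() >= deadline:
            # Budget exhausted: return what has been collected so far.
            return results, True
        results.extend(stage())
    return results, False
```

Returning the `partial` flag alongside the results lets the caller log or surface the fact that fewer reviewed results came back, which matches the residual risk noted above.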
## Current Technical Notes
- Preview proxy route:
- `/api/preview/stream`
- MP4 responses can be cached to disk
- HLS playlists are rewritten so segment fetches also flow through the backend
- Frontend cache busting is done via `/app.js?v=...`
- If behavior in the browser does not match the latest backend/frontend code, the first assumption should be stale frontend assets until proven otherwise
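The HLS rewriting noted above can be sketched roughly as follows. This is an illustration under assumptions, not the backend's actual code: the `url` query parameter name and the `rewrite_playlist` helper are hypothetical, and only plain URI lines are rewritten (URIs embedded in tags such as `#EXT-X-KEY` or `#EXT-X-MAP` are left untouched).

```python
from urllib.parse import quote, urljoin

# Route named in the notes above; the "url" query parameter is an assumption.
PROXY_ROUTE = "/api/preview/stream"


def rewrite_playlist(playlist_text: str, playlist_url: str) -> str:
    """Point every segment or child-playlist line at the backend proxy."""
    rewritten = []
    for line in playlist_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            # Resolve relative URIs against the playlist's own URL first.
            absolute = urljoin(playlist_url, stripped)
            rewritten.append(f"{PROXY_ROUTE}?url={quote(absolute, safe='')}")
        else:
            # Tags, comments, and blank lines pass through unchanged.
            rewritten.append(line)
    return "\n".join(rewritten)
```

Resolving each URI to an absolute URL before encoding means the proxy does not need to remember which playlist a segment request came from.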
## Recent Change Log
- Date: `2026-03-16`