uap-analyzer
The analyst's tool for working through declassified UAP releases.
An MCP server that ingests the U.S. Department of War's PURSUE-era UAP corpus of videos, PDFs, and photos, and exposes the heavy work (frame extraction, vision description of FLIR frames, PDF OCR, FTS5 search) as fast, typed tools your LLM session can call. No cloud APIs in the analysis path. The chat never has to load a 274 MB MP4 to know what's in it.
The war.gov release pages drop hundreds of files per tranche: FLIR clips, redacted mission reports, scanned 1940s memos, FBI photo packets. The default interactive workflow is to download everything, open files one at a time, ask an LLM "what's in this?", and burn tens of thousands of context tokens loading raw bytes. The chat shouldn't be the substrate that holds raw media. The analyzer should be.
/// THREE PRIMITIVES
[ P-01 ] CORPUS
Filesystem-rooted corpus
Drop releases under ~/uap-data/Release_N/. The server indexes everything on rescan: videos go to ffprobe, PDFs to pdfplumber with a Tesseract OCR fallback, photos to Pillow. SQLite caches the results.
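The rescan step can be pictured roughly as below. This is a minimal sketch, not the server's actual code: the `HANDLERS` map, the `items` table, and the `rescan` function are hypothetical names chosen for illustration, and the handler values are only labels (nothing here calls ffprobe, pdfplumber, or Pillow).

```python
import sqlite3
from pathlib import Path

# Hypothetical extension-to-handler routing; the real server's map may differ.
HANDLERS = {
    ".mp4": "ffprobe",
    ".mov": "ffprobe",
    ".pdf": "pdfplumber+tesseract",
    ".jpg": "pillow",
    ".jpeg": "pillow",
    ".png": "pillow",
}


def rescan(data_dir: Path, db: sqlite3.Connection) -> int:
    """Walk data_dir, record each recognised file once, return the row count."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "path TEXT PRIMARY KEY, handler TEXT, size INTEGER, mtime REAL)"
    )
    for file in sorted(data_dir.rglob("*")):
        handler = HANDLERS.get(file.suffix.lower())
        if handler is None or not file.is_file():
            continue  # skip directories and unrecognised file types
        stat = file.stat()
        # Keyed on the relative path, so a second rescan refreshes rows
        # instead of duplicating them.
        db.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?)",
            (file.relative_to(data_dir).as_posix(), handler,
             stat.st_size, stat.st_mtime),
        )
    db.commit()
    return db.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Calling `rescan(Path.home() / "uap-data", sqlite3.connect("cache.db"))` would populate the cache; the database filename is likewise an assumption.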
[ P-02 ] INFERENCE
Local-only inference
Vision-describe runs through a local ollama instance: llama3.2-vision:11b for frames, qwq:32b for text. No cloud APIs in the analysis path. The corpus and your analyst notes stay on your LAN.
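As a rough illustration of what a local vision call looks like, the sketch below builds a request for ollama's `/api/generate` endpoint with a base64-encoded frame. The URL assumes ollama's default port on the same machine, and `build_describe_request` and `describe` are hypothetical helper names, not the analyzer's own functions.

```python
import base64
import json
import urllib.request

# Assumption: ollama listening on its default port on this host.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_describe_request(jpeg_bytes: bytes, prompt: str,
                           model: str = "llama3.2-vision:11b") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request with one image."""
    payload = {
        "model": model,
        "prompt": prompt,
        # ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(jpeg_bytes).decode("ascii")],
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def describe(jpeg_bytes: bytes, prompt: str) -> str:
    """Send the request to the local instance and return the generated text."""
    with urllib.request.urlopen(build_describe_request(jpeg_bytes, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Only `describe` touches the network, and only the local endpoint, which is the point of the design: the frame bytes never leave the LAN.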
[ P-03 ] RETRIEVAL
FTS5 full-text search
Once index_corpus has run, search_corpus("range fouler") spans every PDF in the tranche in milliseconds. bm25-ranked, with snippets. Find the mission report that matches the FLIR clip.
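A minimal sketch of the kind of FTS5 query this implies, using Python's built-in `sqlite3` module. It assumes the SQLite build has FTS5 enabled. The `pages` table, its columns, and the `build_index` and `search` helpers are hypothetical; the real schema is not documented here.

```python
import sqlite3


def build_index(db: sqlite3.Connection, docs: list[tuple[str, str]]) -> None:
    """Create a hypothetical FTS5 table and load (path, body) pairs into it."""
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(path UNINDEXED, body)"
    )
    db.executemany("INSERT INTO pages (path, body) VALUES (?, ?)", docs)
    db.commit()


def search(db: sqlite3.Connection, query: str, k: int = 5) -> list[dict]:
    """Phrase-search the body column and return the k best hits with snippets."""
    # Wrap the query in double quotes so FTS5 treats it as a single phrase.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = db.execute(
        "SELECT path, -bm25(pages) AS score, "
        "       snippet(pages, 1, '[', ']', '...', 12) AS snippet "
        "FROM pages WHERE pages MATCH ? "
        "ORDER BY bm25(pages) LIMIT ?",
        (phrase, k),
    ).fetchall()
    return [{"path": p, "score": s, "snippet": sn} for p, s, sn in rows]
```

FTS5's `bm25()` returns lower values for better matches, so the sketch sorts ascending and negates the value for display; whether the analyzer reports scores the same way is an assumption.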
The chat session is the wrong place to hold raw media; the analyzer is the right one. Every tool returns text or structured JSON; raw video stays in the container, raw OCR stays in the cache, raw frames sit on disk until you ask for one.
When you read the analyst notes in Findings, you're reading distilled output. The mile-deep media that produced them sits on the LAN, not in your transcript.
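To make the "return a path, not the bytes" idea concrete, here is a hedged sketch of how a frame-extraction tool could plan its work. `frame_job` is a hypothetical helper, and the cache directory and `<stem>_t<seconds>.jpg` naming are inferred from the example paths in the tour rather than confirmed. The ffmpeg flags themselves (`-ss`, `-i`, `-frames:v`, `-vf scale`) are standard.

```python
from pathlib import Path


def frame_job(video: str, timestamp: float, width: int,
              cache_dir: str = "reports/frames") -> tuple[list[str], dict]:
    """Return the ffmpeg argv for one frame plus the small JSON-able result."""
    # Assumed naming scheme: <video stem>_t<whole seconds>.jpg
    out = Path(cache_dir) / f"{Path(video).stem}_t{int(timestamp)}.jpg"
    argv = [
        "ffmpeg", "-y",                 # overwrite a stale cached frame
        "-ss", str(timestamp),          # seek before decoding the input
        "-i", video,
        "-frames:v", "1",               # emit exactly one video frame
        "-vf", f"scale={width}:-2",     # fix the width, keep aspect, even height
        str(out),
    ]
    # Only this small dict would go back to the chat; the JPEG stays on disk.
    return argv, {"path": out.as_posix(), "width": width}
```

Running the command is then a matter of `subprocess.run(argv, check=True)` on a host with ffmpeg installed; the caller receives a few dozen bytes of JSON instead of the image.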
/// WHERE IT SLOTS IN
One MCP server, registered with Claude Code via claude mcp add --transport http. From there it's just a set of typed tools the LLM session can call by name.
// REFERENCE
7 tools, ready
list_corpus, analyze_video, extract_frame, describe_image, analyze_pdf, search_corpus, index_corpus.
// WORKED EXAMPLES
Four findings from Release 1
A maritime tracking clip with a wake-event. A NASA-logo placeholder file revealing a redaction-by-substitution pattern. A coastal-recon sensor anomaly. Federal LE witness statements from a Western US restricted zone.
// DOCS
Architecture
Container + ollama + corpus bind-mount + SQLite/FTS5 cache. The "shape of the box" walk-through.
// DOCS
Deployment
Run locally or on a LAN box. Docker compose + bootstrap script + the FastMCP transport-security gotcha you'll hit on first run.
/// 60-SECOND TOUR
# Once the analyzer is registered as an MCP server, your LLM session
# calls these tools by name — e.g. inside Claude Code:
> list_corpus(rescan=True)
→ {data_dir: "/srv/uap-data", count: 157, fts_indexed: 88, items: [...]}
> search_corpus(query="range fouler", k=5)
→ [{path: "Release_1/DOW-UAP-D44-...pdf", score: 1.84, snippet: "..."}]
> extract_frame(path="videos/DOD_111689090.mp4", timestamp=260, width=1600)
→ writes frame JPG to cache, returns path + dimensions
> describe_image(path="reports/frames/DOD_111689090_t260.jpg", prompt="FLIR")
→ 3-5 sentence description with FLIR-anchored prompting
// OPERATOR NOTE
"Disclosure doesn't end at the download button. It ends when someone has actually read the corpus and can tell you what's in it."
The analyzer closes the gap between the two.