// SYSTEM ONLINE // CORPUS: 157 ITEMS INDEXED · FTS5 LIVE // 37.2431°N 115.7930°W XXXX

// INVESTIGATION · v0.1 · α CONFIDENTIAL UNCLASSIFIED // FOR ANALYST USE

uap-analyzer

The analyst's tool for working through declassified UAP releases.

An MCP server that ingests the U.S. Department of War's PURSUE-era UAP corpus — videos, PDFs, photos — and exposes the heavy work (frame extraction, FLIR vision describe, PDF OCR, FTS5 search) as fast typed tools your LLM session can call. No cloud APIs in the analysis path. The chat never has to load a 274 MB MP4 to know what's in it.


// SECTION 01 · THE FAILURE MODE

The war.gov release pages drop hundreds of files per tranche — FLIR clips, redacted mission reports, scanned 1940s memos, FBI photo packets. The honest interactive workflow is: download everything, open files one at a time, ask an LLM "what's in this?" — and burn tens of thousands of context tokens loading raw bytes. The chat shouldn't be the substrate that holds raw media. The analyzer is.

/// THREE PRIMITIVES

SECTION 02

[ P-01 ] CORPUS

Filesystem-rooted corpus

Drop releases under ~/uap-data/Release_N/. On rescan the server indexes each file by type: videos through ffprobe, PDFs through pdfplumber with a Tesseract OCR fallback, photos through Pillow. The results are cached in SQLite.
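A rescan of this kind can be sketched with the standard library alone. The sketch below is an illustration under stated assumptions, not the analyzer's actual code: the `items` table, the `KIND_BY_EXT` map, and the `rescan` function are hypothetical names, and the real extractors (ffprobe, pdfplumber, Tesseract, Pillow) are left out so the example stays self-contained. It shows only the caching idea: a file is re-indexed when its size or modification time has changed.

```python
import sqlite3
from pathlib import Path

# Hypothetical extension map; the real server may recognise more types.
KIND_BY_EXT = {
    ".mp4": "video", ".mov": "video",
    ".pdf": "pdf",
    ".jpg": "photo", ".jpeg": "photo", ".png": "photo",
}

def rescan(data_dir, conn):
    """Walk data_dir and record new or changed media files in SQLite.

    Returns the relative paths that were (re)indexed on this pass.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "path TEXT PRIMARY KEY, kind TEXT, size INTEGER, mtime_ns INTEGER)"
    )
    root = Path(data_dir)
    changed = []
    for f in sorted(root.rglob("*")):
        kind = KIND_BY_EXT.get(f.suffix.lower())
        if kind is None or not f.is_file():
            continue  # directories and unknown file types are skipped
        st = f.stat()
        rel = f.relative_to(root).as_posix()
        row = conn.execute(
            "SELECT size, mtime_ns FROM items WHERE path = ?", (rel,)
        ).fetchone()
        if row == (st.st_size, st.st_mtime_ns):
            continue  # cache hit: file unchanged since the last rescan
        # A real indexer would run the per-kind extractor here
        # and store its output alongside the row.
        conn.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?)",
            (rel, kind, st.st_size, st.st_mtime_ns),
        )
        changed.append(rel)
    conn.commit()
    return changed
```

Calling `rescan` twice on an unchanged directory returns an empty list the second time, which is what makes a rescan cheap once the first pass is done.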

[ P-02 ] INFERENCE

Local-only inference

Vision-describe runs through a local ollama instance — llama3.2-vision:11b for frames, qwq:32b for text. No cloud APIs in the analysis path. The corpus + your analyst notes stay on your LAN.
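For orientation, a vision-describe call against a local ollama instance might look roughly like this. It is a minimal sketch, assuming ollama's default port (11434) and its `/api/generate` endpoint with base64-encoded images; `build_describe_request` and `describe_frame` are hypothetical helper names, not the analyzer's internals.

```python
import base64
import json
import urllib.request

# Assumption: ollama listening on its default local port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_describe_request(image_bytes, prompt, model="llama3.2-vision:11b"):
    """Build the JSON body for a single non-streaming vision request."""
    return {
        "model": model,
        "prompt": prompt,
        # ollama expects images as base64-encoded strings
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

def describe_frame(image_path, prompt, url=OLLAMA_URL, timeout=300):
    """Send one frame to the local model and return its text description."""
    with open(image_path, "rb") as fh:
        payload = build_describe_request(fh.read(), prompt)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

The only network hop is to localhost, so the frame bytes never leave the machine that holds the corpus.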

[ P-03 ] RETRIEVAL

FTS5 full-text search

Once index_corpus has run, search_corpus("range fouler") spans every PDF in the tranche in milliseconds. bm25-ranked, with snippets. Find the mission report that matches the FLIR clip.
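The retrieval step can be illustrated with SQLite's FTS5 extension directly. This is a sketch under assumptions: the `pages` table, its columns, and the phrase-quoting of the query are illustrative choices, not the analyzer's schema, and it requires a Python build whose SQLite includes FTS5 (most do).

```python
import sqlite3

def build_index(docs):
    """Create an in-memory FTS5 index from (path, text) pairs."""
    conn = sqlite3.connect(":memory:")
    # path is stored but not tokenised; body is the searchable column
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO pages (path, body) VALUES (?, ?)", docs)
    return conn

def search(conn, query, k=5):
    """Return up to k hits ranked by bm25, each with a highlighted snippet."""
    # Quote the query so multi-word input is matched as a phrase.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = conn.execute(
        """
        SELECT path,
               bm25(pages) AS score,
               snippet(pages, 1, '[', ']', '...', 8) AS snip
        FROM pages
        WHERE pages MATCH ?
        ORDER BY score  -- SQLite's bm25() is lower (more negative) for better matches
        LIMIT ?
        """,
        (phrase, k),
    ).fetchall()
    return [{"path": p, "score": s, "snippet": sn} for p, s, sn in rows]
```

Usage: `search(build_index(docs), "range fouler")` returns a list of dicts shaped like the `search_corpus` results in the tour, with the matched terms bracketed inside each snippet.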

// RULE · CORPUS-FIRST

The chat session is the wrong place to hold raw media; the analyzer is the right one. Every tool returns text or structured JSON: raw video stays in the container, raw OCR stays in the cache, and raw frames sit on disk until you ask for one.

When you read the analyst notes in Findings, you're reading distilled output. The mile-deep media that produced them sits on the LAN, not in your transcript.
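The corpus-first rule is easiest to see in a frame extractor that hands back a path, never pixels. The sketch below assumes ffmpeg is on the PATH; `frame_command` and `extract_frame_to_cache` are hypothetical names, and the `<stem>_t<seconds>.jpg` naming is inferred from the frame path shown in the tour rather than confirmed.

```python
import subprocess
from pathlib import Path

def frame_command(video, timestamp, width, cache_dir):
    """Build the ffmpeg argv and output path for one still frame."""
    out = Path(cache_dir) / f"{Path(video).stem}_t{int(timestamp)}.jpg"
    argv = [
        "ffmpeg", "-y",
        "-ss", str(timestamp),        # seek before decoding the input
        "-i", str(video),
        "-frames:v", "1",             # emit exactly one video frame
        "-vf", f"scale={width}:-2",   # fix the width, keep aspect, even height
        str(out),
    ]
    return argv, out

def extract_frame_to_cache(video, timestamp, width, cache_dir):
    """Write the frame to disk and return only small, JSON-safe metadata."""
    argv, out = frame_command(video, timestamp, width, cache_dir)
    out.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(argv, check=True, capture_output=True)
    return {"path": out.as_posix(), "width": width}
```

The return value is a few dozen bytes of JSON regardless of how large the source video is, which is the point of the rule.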

/// WHERE IT SLOTS IN

SECTION 03

One MCP server, registered with Claude Code via claude mcp add --transport http. From there it's just a set of typed tools the LLM session can call by name.

// REFERENCE

/// 60-SECOND TOUR

SECTION 04 · LIVE READOUT

TERMINAL · MCP TOOL CALLS

# Once the analyzer is registered as an MCP server, your LLM session
# calls these tools by name — e.g. inside Claude Code:

> list_corpus(rescan=True)
→ {data_dir: "/srv/uap-data", count: 157, fts_indexed: 88, items: [...]}

> search_corpus(query="range fouler", k=5)
→ [{path: "Release_1/DOW-UAP-D44-...pdf", score: 1.84, snippet: "..."}]

> extract_frame(path="videos/DOD_111689090.mp4", timestamp=260, width=1600)
→ writes frame JPG to cache, returns path + dimensions

> describe_image(path="reports/frames/DOD_111689090_t260.jpg", prompt="FLIR")
→ 3-5 sentence description with FLIR-anchored prompting
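Under the hood, a call like the ones above travels as an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below builds that envelope by hand to show its shape; `tool_call` is a hypothetical helper, and in practice the MCP client in your LLM session constructs these messages for you.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request ids, unique per session

def tool_call(name, **arguments):
    """Build the JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# The search from the tour, expressed as a wire-level request.
request = tool_call("search_corpus", query="range fouler", k=5)
print(json.dumps(request, indent=2))
```

Only the tool name and its typed arguments cross the wire in the request; the PDF text being searched stays in the server's index.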

// OPERATOR NOTE

"Disclosure doesn't end at the download button. It ends when someone has actually read the corpus and can tell you what's in it."

— the analyzer is the gap between.

