LINKS
Official anchors first, then mirrors, enrichment projects, convenience UIs, datasets, and preservation/community infrastructure.
Internal Site Links
Quick jumps to major pages on this site, including the gallery pages.
- Image & Photo Analysis Main image analysis hub page with extraction/classification methodology and the full image-content breakdown.
- Gallery Picks Curated visual highlights selected from the broader extracted image archive.
- Epstein Face Matches Face-match output view for extracted imagery, intended for review and validation workflows.
- Handwritten Notes Focused page for handwritten-page review and related note content.
- Category Gallery: Person One-click entry into the category gallery system; use it to branch into other gallery categories.
Official / Government
Primary provenance and legal anchors.
- DOJ Epstein Hub The DOJ landing page for the release program. Use this as the first provenance checkpoint before relying on mirrors or reposts.
- DOJ Epstein Search Official search interface across DOJ-hosted Epstein materials. Best for quick term lookups when you need a government-hosted result URL.
- DOJ Disclosures (Datasets 1-12) Canonical dataset index for the DOJ disclosure batches. This is the main source of truth for published files, notices, and release structure.
- DOJ OPA Press Release (3.5M Pages) DOJ narrative statement about scope, compliance, and release framing. Useful context when citing official claims about counts, exceptions, and process.
- DOJ OPA Video Page Video companion page from DOJ public affairs. Helpful for timeline context and official presentation language around the release.
- Congress: H.R. 4405 (Overview) Bill summary view on Congress.gov with status, sponsors, and actions. Use this when you want quick legislative context without reading full bill text.
- Congress: H.R. 4405 Text (Public Law 119-38) Primary legal text for the act language and requirements. This is the citation target for release obligations and statutory wording.
- House Oversight: DOJ Record Release Committee release page for records provided through oversight channels. Useful for tracking congressional framing and packet-specific drops.
- House Oversight: Additional Estate Documents Follow-on committee publication for additional estate-related materials. Use alongside the earlier oversight release page to compare what changed.
- FBI Vault: Jeffrey Epstein FBI reading-room-style FOIA publication page for Epstein-related records. Separate from DOJ library batches, but important for historical federal sourcing.
- CBP FOIA Library CBP-hosted FOIA library page for records released by that agency. Useful when you need agency-specific provenance outside DOJ/FBI repositories.
Court / FOIA Ecosystem
Court systems are authoritative; mirrors and hosts are convenience layers.
- PACER The official federal court docket and filing system. Use this for authoritative case metadata and filing provenance, even when a mirrored copy exists elsewhere.
- CourtListener RECAP Searchable public mirror of many PACER filings via the RECAP ecosystem. Great for discovery and free access; confirm sensitive citations against the official docket before relying on them.
- Free Law Project RECAP Project home for RECAP, including mission and infrastructure details. Useful if you want to understand where mirrored dockets originate and how coverage is built.
- DocumentCloud: Epstein Documents Hosted document collection with text layers, pages, and shareable links. Handy for quick review and collaborative annotation workflows.
- MuckRock FOIA Project FOIA request tracking and publication hub around this topic cluster. Helpful for seeing pending requests, releases, and agency response history.
Trackers / Indexes / Mirrors
High-value handoff nodes for preservation and verification.
- WikiEpstein Community tracker that centralizes release links and navigation paths. Good as a map layer, but validate any critical claim against original source artifacts.
- yung-megafone/Epstein-Files Major mirror/index hub with a focus on integrity and distribution. Strong handoff target for cross-linking, preservation coordination, and ingestion of derived indexes.
- Internet Archive Combined Mirror Bulk convenience mirror on Internet Archive for resilient access. Useful for availability and download continuity when upstream endpoints are unstable.
- FULL_EPSTEIN_INDEX Community-maintained indexing project intended to improve browseability. Useful for alternate indexing logic and cross-checking coverage gaps.
- Surebob/epstein-files-downloader Downloader tooling to automate retrieval and local syncing of file sets. Helpful for reproducible local archives and refresh workflows.
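For the verification side of a mirror or downloader workflow, a local copy can be checked against a list of expected checksums. The sketch below is a minimal illustration, not the method of any project listed above. It assumes a `sha256sum`-style manifest (one `digest  relative/path` pair per line), which a given mirror may or may not publish, and the function names `sha256_of`, `parse_manifest`, and `verify_tree` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse sha256sum-style lines into {relative_path: digest}.

    Assumed format: '<hex digest>  <path>' (an optional '*' before the
    path is accepted). Blank lines and '#' comments are skipped.
    """
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_tree(root: Path, expected: dict[str, str]) -> dict[str, str]:
    """Return {path: status}; status is 'ok', 'mismatch', or 'missing'."""
    report = {}
    for name, digest in expected.items():
        target = root / name
        if not target.is_file():
            report[name] = "missing"
        elif sha256_of(target) == digest:
            report[name] = "ok"
        else:
            report[name] = "mismatch"
    return report
```

A typical call would be `verify_tree(Path("mirror/"), parse_manifest(Path("SHA256SUMS").read_text()))`, where both paths are placeholders. Anything reported as `mismatch` or `missing` is a candidate for re-download from an official source rather than from the same mirror.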
OCR / Enrichment / Visualization Projects
Projects converting raw drops into searchable entities, graphs, and timelines.
- epstein-docs (Repository) Source code and pipeline implementation for one of the better-known OCR/index builds. Use this to inspect methods, issues, and contribution pathways.
- epstein-docs (Site) Public-facing output from the epstein-docs processing pipeline. Useful for quick searchability and entity-oriented browsing.
- SvetimFM Visualizations (Repository) Repository for a visualization-heavy analysis stack (entities, timelines, and links). Good reference for UI patterns and enrichment approaches.
- SvetimFM Visualizations (Demo) Live demo of the visualization output. Best for exploratory pattern recognition before deep source validation.
- maxandrews/Epstein-doc-explorer Graph-oriented explorer focused on relationships and document linkages. Useful when mapping network-style connections across records.
- markramm/EpsteinFiles Community codebase for ingestion and analysis workflows. Useful as an alternate implementation path for extraction and indexing.
- stonesalltheway1/Epstein-Pipeline Pipeline-centric project for processing and organizing records into queryable forms. Helpful for reproducible ETL-style analysis runs.
- paulgp/epstein-document-search Search-focused tooling for document retrieval and indexing. Useful if you want a leaner search implementation rather than a full portal stack.
Third-Party Portals (Convenience UIs)
Useful for navigation and discovery. Validate claims against official artifacts.
- epsteinsimages.com Image-centric browsing portal for visual artifacts from the broader corpus. Useful when your workflow starts with photos or scanned-image discovery.
- EpsteinExposed Third-party aggregation and navigation layer with searchable presentation. Treat as a convenience UI and confirm key claims from primary records.
- Sifter Labs / Epstein Document Search Search-forward interface focused on retrieval over raw archives. Helpful for fast term discovery, with provenance checks done downstream.
- Epstein File Explorer Portal with searchable views and legal/methodology framing pages. Useful for quick orientation and cross-referencing surfaced records.
- Epsteingate Community-facing portal used for discovery and aggregation. Works well as a secondary browse layer when triangulating between tools.
- Librarius (Site) Hosted UI for the Librarius project by BoltzmannEntropy. Useful for exploring their curated structure and outputs before inspecting the codebase.
- Librarius (Repository) Source repository behind the Librarius site. Useful for implementation details, reproducibility, and contribution paths.
- jmail.world Convenience viewer layer used by some researchers for navigation. Use with caution and verify critical items against official source artifacts.
- jmail.world / JeffTube Video-oriented subsection of jmail focused on media discovery. Useful as an exploratory index, not as primary evidence provenance.
- Jmail Privacy Policy Policy page describing data and privacy posture for the jmail platform. Listed for transparency alongside the other links to that ecosystem.
Derived Datasets (Not Authoritative)
Great for search and enrichment workflows; always trace key claims back to primary records.
- HF: teyler/epstein-files-20k Derived Hugging Face subset designed for faster experimentation than full archives. Useful for prototyping search, parsing, and model pipelines.
- HF: tensonaut/EPSTEIN_FILES_20K Alternative processed 20k-style corpus with different preparation assumptions. Good for cross-checking behavior across independently prepared subsets.
- HF: svetfm post-OCR embeddings Embedding-focused dataset for semantic retrieval on OCR output. Useful when building vector search and nearest-neighbor exploration tools.
- HF: ChromaDB vector embeddings Vector dataset aligned with ChromaDB workflows for local semantic indexing. Helpful for standing up fast RAG-style experiments.
- HF: epstractor-raw Raw extracted corpus intended as a base material layer. Good if you prefer to run your own cleaning, OCR correction, and enrichment stack.
Pinpoint / Searchable Newsroom Collections
Pinpoint collections are useful exploration layers and may require Google account access.
- Google Pinpoint: Estate Collection Pinpoint-hosted corpus tuned for newsroom-style document search and entity extraction. Useful for high-speed triage and quote-location workflows.
- Google Pinpoint Collection A Additional Pinpoint collection for cross-comparing coverage and indexing behavior. Good for finding documents that may be harder to surface elsewhere.
- Google Pinpoint Collection B Companion collection with potentially different ingest slices. Use it as an alternate retrieval path when validating search misses.
Preservation Tooling
General infrastructure tools for resilient archiving and distribution workflows.
- ArchiveBox Open-source self-hosted web archiving stack. Useful for capturing source pages and retaining reproducible snapshots over time.
- IPFS Documentation Core docs for content-addressed distributed storage workflows. Useful for resilient distribution and hash-addressed file verification practices.
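The idea behind hash-addressed verification can be shown without any IPFS tooling: a file is stored under a name derived from its own digest, so any later change to the bytes is detectable on read. The sketch below is a simplified illustration only. It uses a plain SHA-256 hex digest as the address, which is not the CID format IPFS actually uses, and the `put` and `get` helpers are hypothetical names.

```python
import hashlib
from pathlib import Path


def put(store: Path, data: bytes) -> str:
    """Write data under its own SHA-256 hex digest and return that digest."""
    key = hashlib.sha256(data).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    target = store / key
    if not target.exists():  # identical content is stored only once
        target.write_bytes(data)
    return key


def get(store: Path, key: str) -> bytes:
    """Read data by digest, re-hashing it to detect corruption or tampering."""
    data = (store / key).read_bytes()
    if hashlib.sha256(data).hexdigest() != key:
        raise ValueError(f"content does not match address {key}")
    return data
```

Because the address is computed from the content, two archives that hold the same file will agree on its key, and a copy fetched from an untrusted host can be checked against a key obtained from a trusted one.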
Community Hubs / Threads
Useful for coordination and issue discovery. Apply privacy rules and verify claims before reposting.
- Data Horde Archiving community entrypoint with current social and coordination links. Useful when you need help with mirrors, storage, and long-term availability.
- Reddit /r/Epstein Discussion forum for ongoing document and case conversation. Useful for finding emerging leads; validate everything against source documents before citing it.
- Reddit /r/DataHoarder Large general community focused on preservation workflows and tooling. Useful for replication strategies, storage tactics, and mirror resilience.
- Osintly Community Listing Listing page for an OSINT-focused Discord community. Useful for networking with investigators and tool users across adjacent projects.
- Reddit Hidden-File Recovery Thread Thread documenting community scripts and methods for hidden file extension recovery. Useful for reproduction notes, caveats, and follow-up experiments.
- tommycarstensen.com/epstein Third-party research hub aggregating media, records, and project-specific navigation for this corpus. Useful as a discovery layer before validating against primary artifacts.
- Epstein Video Gallery Third-party video gallery interface for exploring media artifacts. Useful as a browsing layer, with all critical items traced back to primary files.