Epstein Files Jan 30, 2026

Data hoarders on Reddit have been hard at work archiving the latest Epstein Files release from the U.S. Department of Justice. Below is a compilation of their work with download links.

Please seed all torrent files to distribute and preserve this data.

Ref: https://old.reddit.com/r/DataHoarder/comments/1qrk3qk/epstein_files_datasets_9_10_11_300_gb_lets_keep/

Epstein Files Data Sets 1-8: INTERNET ARCHIVE LINK

Epstein Files Data Set 1 (2.47 GB): TORRENT MAGNET LINK
Epstein Files Data Set 2 (631.6 MB): TORRENT MAGNET LINK
Epstein Files Data Set 3 (599.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 4 (358.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 5 (61.5 MB): TORRENT MAGNET LINK
Epstein Files Data Set 6 (53.0 MB): TORRENT MAGNET LINK
Epstein Files Data Set 7 (98.2 MB): TORRENT MAGNET LINK
Epstein Files Data Set 8 (10.67 GB): TORRENT MAGNET LINK


Epstein Files Data Set 9 (Incomplete). Only contains 49 GB of 180 GB. Multiple reports of cutoff from DOJ server at offset 48995762176.

ORIGINAL JUSTICE DEPARTMENT LINK

  • TORRENT MAGNET LINK (removed due to reports of CSAM)

/u/susadmin’s More Complete Data Set 9 (96.25 GB)
De-duplicated merger of (45.63 GB + 86.74 GB) versions

  • TORRENT MAGNET LINK (removed due to reports of CSAM)

Epstein Files Data Set 10 (78.64 GB)

ORIGINAL JUSTICE DEPARTMENT LINK

  • TORRENT MAGNET LINK (removed due to reports of CSAM)
  • INTERNET ARCHIVE FOLDER (removed due to reports of CSAM)
  • INTERNET ARCHIVE DIRECT LINK (removed due to reports of CSAM)

Epstein Files Data Set 11 (25.55 GB)

ORIGINAL JUSTICE DEPARTMENT LINK

SHA1: 574950c0f86765e897268834ac6ef38b370cad2a


Epstein Files Data Set 12 (114.1 MB)

ORIGINAL JUSTICE DEPARTMENT LINK

SHA1: 20f804ab55687c957fd249cd0d417d5fe7438281
MD5: b1206186332bb1af021e86d68468f9fe
SHA256: b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2
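
A published checksum is only useful if the downloaded file is actually hashed and compared against it. The following is a minimal sketch of that check, not the method used by anyone in this thread; the function names `file_digest` and `verify` are made up for illustration, and the file path and expected value are whatever you supply.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so a large archive never has to fit in memory."""
    h = hashlib.new(algorithm)  # accepts names such as "sha1", "md5", "sha256"
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """Return True when the file's digest equals the published hex value."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Calling `verify("some-download.zip", "<published sha256>")` returns `True` only when the bytes on disk match the published value. SHA256 is the stronger of the three algorithms listed above, so prefer it when it is available.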


This list will be edited as more data becomes available, particularly with regard to Data Set 9 (EDIT: NOT ANYMORE)


EDIT [2026-02-02]: After being made aware of potential CSAM in the original Data Set 9 releases and seeing confirmation in the New York Times, I will no longer support any effort to maintain links to archives of it. There is suspicion of CSAM in Data Set 10 as well. I am removing links to both archives.

Some in this thread may be upset by this action. It is right to be distrustful of a government that has not shown signs of integrity. However, I do trust journalists who hold the government accountable.

I am abandoning this project and removing any links to content that commenters here and on reddit have suggested may contain CSAM.

Ref 1: https://www.nytimes.com/2026/02/01/us/nude-photos-epstein-files.html
Ref 2: https://www.404media.co/doj-released-unredacted-nude-images-in-epstein-files

    • DigitalForensick@lemmy.world
      2 days ago

      I can work on a script that crawls the files and detects which of them are like this. I looked at the hexadecimal data of one of the examples from the Reddit thread and there might be some indicators. I wasn’t able to get a video to play, but

      Just from this file I can see that they created/edited these PDFs using something called reportlab.com, edited on 12/23/2025. There’s also some text referring to Kids? “Count 1/ Kids [ 3 0 R ]” This is odd. screenshot

      File EFTS0024813.pdf/mp4? (Dataset 8, folder 0005)
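
For context on the quoted text: in the PDF file format, `/Kids` and `/Count` are standard keys of a page-tree node. `/Kids` lists the child page objects and `/Count` gives the number of pages beneath the node, so a string like this appears in ordinary PDFs (ReportLab is a widely used PDF-generation library). A minimal sketch of reading those two keys from a raw dump follows; the `SAMPLE` bytes are a made-up example, not taken from any file in the datasets, and `parse_page_tree` is a hypothetical helper.

```python
import re

# Hypothetical page-tree node as it might appear in a hex/text dump of a PDF.
SAMPLE = b"2 0 obj\n<< /Type /Pages /Count 1 /Kids [ 3 0 R ] >>\nendobj\n"


def parse_page_tree(raw):
    """Extract /Count and the /Kids object references from a /Pages dictionary."""
    count = re.search(rb"/Count\s+(\d+)", raw)
    kids = re.search(rb"/Kids\s*\[(.*?)\]", raw, re.S)
    # Each child is an indirect reference of the form "<object> <generation> R".
    refs = re.findall(rb"(\d+)\s+(\d+)\s+R", kids.group(1)) if kids else []
    return {
        "count": int(count.group(1)) if count else None,
        "kids": [(int(obj), int(gen)) for obj, gen in refs],
    }


print(parse_page_tree(SAMPLE))  # {'count': 1, 'kids': [(3, 0)]}
```

Read this way, “Count 1 / Kids [ 3 0 R ]” describes a one-page document whose single page is object 3.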

      • StinkyFred@lemmy.world
        1 day ago

        I assume you mean EFTA00024813?

        If so, while it doesn’t show on the website, it is included in dataset 8 if you just download the complete set.

        It is a .xlsx file.

        It’s in DataSet 8\VOL00008\NATIVES\0001

        This is the case for most of these files; dataset 9 might be missing a few, though.

  • internauta@lemmy.world
    4 days ago

    someone on reddit ( u/FuckThisSite3 ) posted a more complete DataSet 9:

    I assembled a tar file with all I got from dataset-9.

    Magnet link: magnet:?xt=urn:btih:5b50564ee995a54009fec387c97f9465eb18ba00&dn=dataset-9_by_fuckthissite3.tar&xl=148072017920

    SHA256: 5adc043bcf94304024d718e57267c1aa009d782835f6adbe6ad7fdbb763f15c5

    The tar contains 254,477 files which is 148072017920 bytes (148.07GB/137.9GiB)

    • locke1@lemmy.world
      3 days ago

      While it’s bigger in size, this one seems to be missing a ton of files? I grabbed the earlier collections and my file count is at 531,256. I’ll have to compare when I finish downloading.

    • ZaInT@lemmy.world
      4 days ago

      Seeding 1 node, 3 on the way. EDIT: 3 running; the 4th one was planned to be temporary but should be up soon.

    • ZaInT@lemmy.world
      5 days ago

      I am seeding 8, 10, 11, 12 (the ones only available as .zip on justice.gov) for the foreseeable future (as well as the partials of set 9). I’m looking “everywhere” hoping for some success on part 9 and will be pushing that one until bandwidth dies or until a dozen or so seeders are on, whenever the complete bundle is assembled. Hoping for some good news soon; things seem to be nuked very rapidly now.

      I also read that the court documents and one other page were taken down. I have those files, but they are not sorted by page, just thrown in a bulk download directory, as I had a feeling this would happen and I wanted to pull them quickly. In case they are of any use, I put them on Mega and Gofile a few days ago and they’ve not been taken down so far:

      https://gofile.io/d/dff931d5-a646-46f1-b34e-079798f508a2 https://mega.nz/folder/XVMCgLLR#EKVS8Sfiry-VtVAxZ7q_Ig

      It’s most likely files that “everyone” already has, but better one mirror too many than one too few.

      • ZaInT@lemmy.world
        5 days ago

        Also seeding (but will probably not for very long unless seeders start dropping, they are all at 300-1200 ATM) 59975667f8bdd5baf9945b0e2db8a57d52d32957 0a3d4b84a77bd982c9c2761f40944402b94f9c64 7ac8f771678d19c75a26ea6c14e7d4c003fbf9b6 c3a522d6810ee717a2c7e2ef705163e297d34b72 d509cc4ca1a415a9ba3b6cb920f67c44aed7fe1f e618654607f2c34a41c88458bf2fcdfa86a52174 acb9cb1741502c7dc09460e4fb7b44eac8022906

        Trying to pull c100b1b7c4b1e662dd8adc79ae3e42eef6080aee (reduntant limited dataset for that GitHub relations chart)

        Pulling f5cbe5026b1f86617c520d0a9cd610d6254cbe85 (just listed on the GitHub repo that lists the same magnets as here - will probably become 2nd seeder in an hour or two and will stay seeding on that one for at least a week or until the swarm looks healthy by the dozens or so.)

        Will continue to monitor whatever progress is being made here. I should also have a small subset of DS9 but it will likely only be the first 200 files or so at most. Needless to say I will compare against the existing torrents just in case.

        Thanks everyone for your hard work, this is exactly why I started hoarding :)

        EDIT: The last magnet ID I listed is the summarized torrent from the repo linked by Nomad64.

  • ArzymKoteyko@lemmy.world
    6 days ago

    Hi everyone, maybe I’m a bit late to this, but I wanted to share my findings. I parsed every page up to 40k in DS9 three times, and the results matched PeoplesElbow’s findings in distribution (no content after page 14k and a lot of duplication), BUT I parsed 4 times more unique URLs: 246,079 (still 2x short of the official size). A strange thing is that on the second pass (one day after the first one) I started receiving new URLs on old pages.

    Here are the stats by file type:

     count  | file type
    --------+-----------
          1 | ts
          8 | mov
        236 | mp4
     244326 | pdf
         73 | m4a
          1 | vob
          1 | docx
          1 | doc
          9 | m4v
       1422 | avi
          1 | wmv
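
A tally like the one above can be produced from any list of URLs by de-duplicating first and then counting extensions. This is a minimal sketch under the assumption that the URLs are already in memory as strings; `count_extensions` is a hypothetical helper, not the script used for the table.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse


def count_extensions(urls):
    """Tally unique URLs by lowercase file extension."""
    counts = Counter()
    for url in set(urls):  # de-duplicate before counting
        suffix = PurePosixPath(urlparse(url).path).suffix
        counts[suffix.lstrip(".").lower() or "(none)"] += 1
    return counts
```

Summing the values of the returned counter gives the number of unique URLs, which is a quick sanity check against the total (the rows in the table above add up to 246,079).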
    
    • DigitalForensick@lemmy.world
      6 days ago

      Nice work man! I also discovered something yesterday that I think is worth pointing out.

      DUPLICATE FILES: Within the datasets, there are often emails, doc scans, etc. that are duplicate entries. (I’m not talking about multi-torrent stitching, but actual duplicate documents within the raw dataset.) **These duplicates must be preserved.** When looking at two copies of the same duplicate file, I found that sometimes the redactions are in different places! This can be used to extract more info later down the road.

      • ArzymKoteyko@lemmy.world
        5 days ago

        Finally got my hands on the original DS9 OPT file and I have started downloading files from it. I don’t know how long it will take. I also made a git repo with stats and index files from the DOJ website and the OPT from the archive: https://github.com/ArzymKoteyko/JEDatasets In short, the only difference is that I got an additional 1753 links to video files and a strange .docx file with a size of 0 bytes [EFTA00335487.docx].

  • DigitalForensick@lemmy.world
    6 days ago

    For anyone looking into doing some OSINT work, this is an epic file EFTA00809187

    It contains lists of ALL known JE emails, usernames, websites, social media accounts, etc. from that time

  • Xenom0rph@lemmy.world
    6 days ago

    I’m still seeding the partial Dataset 9 (45.63GB and 89.54GB) and all the other datasets. Is there a newer dataset 9 available?

  • Dhoard@lemmy.world
    6 days ago

    Theoretically speaking, if a website has the archives, what is stopping people from downloading each file on a page-by-page basis from the archive?

    Edit: Never mind, I saw a full list of URLs that the archive managed to save and it is missing a lot.

    • DigitalForensick@lemmy.world
      6 days ago

      Nothing, but even the archived pages aren’t 100% complete, because some of the files were “faked” in the paginated file lists on the DOJ site. It does work well enough, though. I did this to recover all the court records and FOIA files.

  • susadmin@lemmy.world
    10 days ago

    I’m in the process of downloading both dataset 9 torrents (45.63 GB + 86.74 GB). I will then compare the filenames in both versions (the 45.63GB version has 201,358 files alone), note any duplicates, and merge all unique files into one folder. I’ll upload that as a torrent once it’s done so we can get closer to a complete dataset 9 as one file.

    • Edit 31Jan2026 816pm EST - Making progress. I finished downloading both dataset 9s (45.6 GB and the 86.74 GB). The 45.6GB set is 200,000 files and the 86GB set is 500,000 files. I have a .csv of the filenames and sizes of all files in the 45.6GB version. I’m creating the same .csv for the 86GB version now.

    • Edit 31Jan2026 845pm EST -

      • dataset 9 (45.63 GB) = 201357 files
      • dataset 9 (86.74 GB) = 531257 files

      I did an exact filename combined with an exact file size comparison between the two dataset9 versions. I also did an exact filename combined with a fuzzy file size comparison (tolerance of +/- 1KB) between the two dataset9 versions. There were:

      • 201330 exact matches
      • 201330 fuzzy matches (+/- 1KB)

      Meaning there are 201330 duplicate files between the two dataset9 versions.

      These matches were written to a duplicates file. Then, from each dataset9 version, all files/sizes matching the file and size listed in the duplicates file will be moved to a subfolder. Then I’ll merge both parent folders into one enormous folder containing all unique files and a folder of duplicates. Finally, compress it, make a torrent, and upload it.
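
The filename-plus-size comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the script actually used: the manifest .csv is assumed to have `filename` and `size` columns, and `load_manifest` and `find_matches` are hypothetical names.

```python
import csv


def load_manifest(csv_path):
    """Read a manifest with assumed columns `filename` and `size` (bytes)."""
    with open(csv_path, newline="") as f:
        return {row["filename"]: int(row["size"]) for row in csv.DictReader(f)}


def find_matches(manifest_a, manifest_b, tolerance=0):
    """Return filenames present in both manifests whose sizes differ by at
    most `tolerance` bytes (0 = exact match, 1024 = the +/- 1KB fuzzy match)."""
    return sorted(
        name
        for name, size in manifest_a.items()
        if name in manifest_b and abs(manifest_b[name] - size) <= tolerance
    )
```

If `find_matches(a, b)` and `find_matches(a, b, tolerance=1024)` return the same list, as in the counts above, every shared filename also has an identical size. That still says nothing about whether the contents are identical.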


    • Edit 31Jan2026 945pm EST -

      Still moving duplicates into subfolders.


    • Edit 31Jan2026 1027pm EST -

      Going off of xodoh74984’s comment (https://lemmy.world/post/42440468/21884588), I’m increasing the rigor of my determination of whether the files that share a filename and size between both versions of dataset9 are in fact duplicates. Like rsync --checksum, this verifies that the files have identical content by calculating and comparing their MD5 hashes. This will take a while but is the best way.


    • Edit 01Feb2026 1227am EST -

      Checksum comparison complete. 73 files found that have the same file name and size but different content. Total number of duplicate files = 201257. Merging both dataset versions now, while keeping one subfolder of the duplicates, so nothing is deleted.
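
The checksum pass can be sketched like this: for every filename that matched on name and size, hash both copies and split the list into true duplicates and same-name files whose bytes differ. This is a minimal sketch rather than the actual script; `md5_of` and `split_by_content` are hypothetical helpers, and MD5 is used only because it is the hash named above (SHA-256 can be swapped in via `hashlib.sha256`).

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def split_by_content(dir_a, dir_b, candidate_names):
    """Split name/size matches into (identical, different) lists by content hash."""
    identical, different = [], []
    for name in candidate_names:
        same = md5_of(Path(dir_a) / name) == md5_of(Path(dir_b) / name)
        (identical if same else different).append(name)
    return identical, different
```

With the numbers reported above, `different` would hold the 73 files that share a name and size but not content, and `identical` the remaining 201257.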


    • Edit 01Feb2026 1258am EST -

      Creating the .tar.zst file now. 531285 total files, which includes all unique files between dataset9 (45.6GB) and dataset9 (86.7GB), as well as a subfolder containing the files that were found in both dataset9 versions.


    • Edit 01Feb2026 215am EST -

      I was using way too high a compression level for no reason (zstd --ultra -22). Restarted the .tar.zst file creation (with zstd -12) and it’s going 100x faster now. Should be finished within the hour.


    • Edit 01Feb2026 311am EST -

      .tar.zst file creation is taking very long. I’m going to let it run overnight - will check back in a few hours. I’m tired boss.


    • EDIT 01Feb2026 831am EST -

    COMPLETE!

    And then I doxxed myself in the torrent. One moment please while I fix that…


    Final magnet link is HERE. GO GO GOOOOOO

    I’m seeding @ 55 MB/s. I’m also trying to get into the new r/EpsteinPublicDatasets subreddit to share the torrent there.

    • epstein_files_guy@lemmy.world
      10 days ago

      looking forward to your torrent, will seed.

      I have several incomplete sets of files from dataset 9 that I downloaded with a scraped set of urls - should I try to get them to you to compare as well?

      • susadmin@lemmy.world
        10 days ago

        Yes! I’m not sure the best way to do that - upload them to MEGA and message me a download link?

        • epstein_files_guy@lemmy.world
          10 days ago

          maybe archive.org? that way they can be torrented if others want to attempt their own merging techniques? either way it will be a long upload, my speed is not especially good. I’m still churning through one set of urls that is 1.2M lines, most are failing but I have 65k from that batch so far.

            • epstein_files_guy@lemmy.world
              10 days ago

              I’ll get the first set (42k files in 31G) uploading as soon as I get it zipped up. it’s the one least likely to have any new files in it since I started at the beginning like others but it’s worth a shot

              edit 01FEB2026 1208AM EST - 6.4/30gb uploaded to archive.org

              edit 01FEB2026 0430AM EST - 13/30gb uploaded to archive.org; scrape using a different url set going backwards is currently at 75.4k files

              edit 01FEB2026 1233PM EST - had an internet outage overnight and lost all progress on the archive.org upload, currently back to 11/30gb. the scrape using a previous url set seems to be getting very few new files now, sitting at 77.9k at the moment

    • thetrekkersparky@startrek.website
      10 days ago

      I’m downloading 8-11 now and seeding 1-7 and 12. I’ve tried checking up on Reddit, but every other time I check in, the post is nuked or something. My home server never goes down and I’m outside the USA. I’m working on the 100GB+ #9 right now and I’ll seed whatever you can get up here too.

    • helpingidiot@lemmy.world
      10 days ago

      Have a good night. I’ll be waiting to download it, seed it, make hardcopies and redistribute it.

      Please check back in with us

    • xodoh74984@lemmy.world (OP)
      10 days ago

      When merging versions of Data Set 9, is there any risk of loss with simply using rsync --checksum to dump all files into one directory?

      • susadmin@lemmy.world
        10 days ago

        rsync --checksum is better than my file name + file size comparison, since it calculates the hash of each file and compares it to the hash of the file with the same path on the other side. For example, if there is a file called data1.pdf with size 1024 bytes in dataset9-v1, and another file called data1.pdf with size 1024 bytes in dataset9-v2, but their content is different, my method will still treat them as identical files.

        I’m going to modify my script to calculate and compare the hashes of all files that I previously determined to be duplicates. If the hashes of the duplicates in dataset9 (45GB torrent) match the hashes of the duplicates in dataset9 (86GB torrent), then they are in fact duplicates between the two datasets.

        • xodoh74984@lemmy.world (OP)
          10 days ago

          Amazing, thank you. That was my thought, check hashes while merging the files to keep any copies that might have been modified by DOJ and discard duplicates even if the duplicates have different metadata, e.g. timestamps.
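
The merge rule described here (skip true duplicates regardless of timestamps, keep any copy whose content differs) can be sketched as below. This is one possible approach under stated assumptions, not what was actually run: `merge_into` is a hypothetical helper, SHA-256 stands in for whatever hash is preferred, and the choice to store a differing copy under a name suffixed with part of its hash is purely illustrative.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def merge_into(src_dir, dest_dir):
    """Copy files from src_dir into dest_dir without overwriting differing content.

    Identical content is skipped whatever its timestamps; a file whose name
    already exists with different bytes is kept under a hash-suffixed name.
    Returns the names of the variant copies that were kept.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    kept_variants = []
    for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        dest = dest_dir / src.relative_to(src_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            digest = sha256_of(src)
            if sha256_of(dest) == digest:
                continue  # true duplicate: nothing to copy
            dest = dest.with_name(f"{dest.stem}.{digest[:8]}{dest.suffix}")
            kept_variants.append(dest.name)
        shutil.copy2(src, dest)
    return kept_variants
```

Nothing in the destination is ever overwritten with different content, so a possibly modified copy always survives next to the one that was already there.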

      • GorillaCall@lemmy.world
        7 days ago

        anyone have the original 186gb magnet link from that thread? someone said reddit keeps nuking it because it implicates reddit admins like spez

        • idiomaddict@lemmy.world
          7 days ago

          This is it, encoded in base 64 format, according to the comment:

          bWFnbmV0Oj94dD11cm46YnRpaDo3YWM4Zjc3MTY3OGQxOWM3NWEyNmVhNmMxNGU3ZDRjMDAzZmJmOWI2JmRuPWRhdGFzZXQ5LW1vcmUtY29tcGxldGUudGFyLnpzdCZ4bD05NjE0ODcyNDgzNyZ0cj11ZHAlM0ElMkYlMkZ0cmFja2VyLm9wZW50cmFja3Iub3JnJTNBMTMzNyUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRm9wZW4uZGVtb25paS5jb20lM0ExMzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGZXhvZHVzLmRlc3luYy5jb20lM0E2OTY5JTJGYW5ub3VuY2UmdHI9aHR0cCUzQSUyRiUyRm9wZW4udHJhY2tlci5jbCUzQTEzMzclMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZvcGVuLnN0ZWFsdGguc2klM0E4MCUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnplcjBkYXkuY2glM0ExMzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGd2Vwem9uZS5uZXQlM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlcjEubXlwb3JuLmNsdWIlM0E5MzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci50b3JyZW50LmV1Lm9yZyUzQTQ1MSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnRyYWNrZXIudGhlb2tzLm5ldCUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZ0cmFja2VyLnNydjAwLmNvbSUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZ0cmFja2VyLnF1LmF4JTNBNjk2OSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnRyYWNrZXIuZGxlci5vcmclM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci5iaXR0b3IucHclM0ExMzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci5hbGFza2FudGYuY29tJTNBNjk2OSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnRyYWNrZXItdWRwLmdiaXR0LmluZm8lM0E4MCUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnJ1bi5wdWJsaWN0cmFja2VyLnh5eiUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZvcGVudHJhY2tlci5pbyUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZvcGVuLmRzdHVkLmlvJTNBNjk2OSUyRmFubm91bmNlJnRyPWh0dHBzJTNBJTJGJTJGdHJhY2tlci56aHVxaXkuY29tJTNBNDQzJTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci5maWxlbWFpbC5jb20lM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdC5vdmVyZmxvdy5iaXolM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGbWFydGluLWdlYmhhcmR0LmV1JTNBMjUlMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZldmFuLmltJTNBNjk2OSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRmQ0MDk2OS5hY29kLnJlZ3J1Y29sby5ydSUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkY2YWhkZHV0YjF1Y2MzY3AucnUlM0E2OTY5JTJGYW5ub3VuY2U

    • ModernSimian@lemmy.world
      10 days ago

      Be prepared to wait a while… idk why this person chose xz, it is so slow. I’ve been just trying to get the tarball out for an hour.

  • jankscripts@lemmy.world
    10 days ago

    Heads up that the DOJ site is a tar pit: it’s going to return 50 files on the page regardless of the page number you’re on. It seems that somewhere between 2k and 5k pages it just wraps around right now.

    Testing page 2000... ✓ 50 new files (out of 50)
    Testing page 5000... ○ 0 new files - all duplicates
    Testing page 10000... ○ 0 new files - all duplicates
    Testing page 20000... ○ 0 new files - all duplicates
    Testing page 50000... ○ 0 new files - all duplicates
    Testing page 100000... ○ 0 new files - all duplicates

    • WorldlyBasis9838@lemmy.world
      10 days ago

      I saw this too; yesterday I tried manually accessing the page to explore just how many there are. It seems some of the pages are duplicates (I was simply comparing the last listed file name and content between some of the first 10 pages, and even there found 1-2 duplications).

      As far as the maximum page number goes, if you use the query parameter ?page=200000000 it will still resolve a list of files. Actually crazy.

      https://www.justice.gov/epstein/doj-disclosures/data-set-9-files?page=200000000

    • jankscripts@lemmy.world
      10 days ago

      The last page I got a non-duplicate URL from was 10853, which curiously had only 36 URLs on the page. When I browsed directly to page 10853, 36 URLs were displayed, but after moving back and forth in the page count the tar pit logic must have re-looped, and it went back to 50 displayed. I ended with 224751 URLs.

  • hYcG68caGB7WvLX67@lemmy.world
    10 days ago

    I was quick to download dataset 12 after it was discovered to exist, and apparently my dataset 12 contains some files that were later removed. Uploaded to IA in case it contains anything that later archivists missed. https://archive.org/details/data-set-12_202602

    Specifically doc number 2731361 and others around it were at some point later removed from DoJ, but are still within this early-download DS12. Maybe more, unsure

    • susadmin@lemmy.world
      10 days ago

      The files in this (early) dataset 12 are identical to the dataset 12 here, which is the link in the OP. The MD5 hashes are identical.

      I shared a .csv file of the calculated MD5 hashes here