Epstein Files Jan 30, 2026
Data hoarders on reddit have been hard at work archiving the latest Epstein Files release from the U.S. Department of Justice. Below is a compilation of their work with download links.
Please seed all torrent files to distribute and preserve this data.
Epstein Files Data Sets 1-8: INTERNET ARCHIVE LINK
Epstein Files Data Set 1 (2.47 GB): TORRENT MAGNET LINK
Epstein Files Data Set 2 (631.6 MB): TORRENT MAGNET LINK
Epstein Files Data Set 3 (599.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 4 (358.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 5 (61.5 MB): TORRENT MAGNET LINK
Epstein Files Data Set 6 (53.0 MB): TORRENT MAGNET LINK
Epstein Files Data Set 7 (98.2 MB): TORRENT MAGNET LINK
Epstein Files Data Set 8 (10.67 GB): TORRENT MAGNET LINK
Epstein Files Data Set 9 (Incomplete). Only contains 49 GB of 180 GB. Multiple reports of cutoff from DOJ server at offset 48995762176.
ORIGINAL JUSTICE DEPARTMENT LINK
- TORRENT MAGNET LINK (removed due to reports of CSAM)
/u/susadmin’s More Complete Data Set 9 (96.25 GB)
De-duplicated merger of (45.63 GB + 86.74 GB) versions
- TORRENT MAGNET LINK (removed due to reports of CSAM)
Epstein Files Data Set 10 (78.64GB)
ORIGINAL JUSTICE DEPARTMENT LINK
- TORRENT MAGNET LINK (removed due to reports of CSAM)
- INTERNET ARCHIVE FOLDER (removed due to reports of CSAM)
- INTERNET ARCHIVE DIRECT LINK (removed due to reports of CSAM)
Epstein Files Data Set 11 (25.55GB)
ORIGINAL JUSTICE DEPARTMENT LINK
SHA1: 574950c0f86765e897268834ac6ef38b370cad2a
Epstein Files Data Set 12 (114.1 MB)
ORIGINAL JUSTICE DEPARTMENT LINK
SHA1: 20f804ab55687c957fd249cd0d417d5fe7438281
MD5: b1206186332bb1af021e86d68468f9fe
SHA256: b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2
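The digests listed above can be checked against a local copy before seeding. A minimal sketch, assuming the archive has already been downloaded; the helper names `file_digests` and `matches_published` are hypothetical, and the `DATA_SET_12` values are copied from the listing above.

```python
import hashlib

# Digests copied from the Data Set 12 entry in this post.
DATA_SET_12 = {
    "sha1": "20f804ab55687c957fd249cd0d417d5fe7438281",
    "md5": "b1206186332bb1af021e86d68468f9fe",
    "sha256": "b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2",
}


def file_digests(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for one file, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_published(path, published):
    """True only if every published digest matches the local file."""
    actual = file_digests(path, algorithms=tuple(published))
    return all(actual[name] == value.lower() for name, value in published.items())
```

Calling `matches_published("path/to/local-archive.zip", DATA_SET_12)` (the path is a placeholder) returns `True` only when all three digests agree, so a truncated or altered download is caught before it spreads through the swarm.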
This list will be edited as more data becomes available, particularly with regard to Data Set 9 (EDIT: NOT ANYMORE)
EDIT [2026-02-02]: After being made aware of potential CSAM in the original Data Set 9 releases and seeing confirmation in the New York Times, I will no longer support any effort to maintain links to archives of it. There is suspicion of CSAM in Data Set 10 as well. I am removing links to both archives.
Some in this thread may be upset by this action. It is right to be distrustful of a government that has not shown signs of integrity. However, I do trust journalists who hold the government accountable.
I am abandoning this project and removing any links to content that commenters here and on reddit have suggested may contain CSAM.
Ref 1: https://www.nytimes.com/2026/02/01/us/nude-photos-epstein-files.html
Ref 2: https://www.404media.co/doj-released-unredacted-nude-images-in-epstein-files
So it turns out there’s a pile of videos if you rename files from pdf to mp4. There’s some video torrents for anyone who wants to back them up (datasets 8-12)
Reddit post: https://www.reddit.com/r/Epstein/comments/1qx81dj/type_this_in_and_change_pdf_to_mp4 Fediverse post: https://lemmy.world/post/42756746
I can work on a script that crawls/detects which of these files are like this. I looked at the hexadecimal data of one of the examples from the reddit thread and there might be some indicators. I wasn’t able to get a video to play, but
Just from this file I can see that they created/edited these PDFs using something called reportlab.com, edited on 12/23/2025. There’s also some text referring to Kids? “Count 1/ Kids [ 3 0 R ]” This is odd. screenshot
File EFTS0024813.pdf/mp4? (Dataset 8, folder 0005)
I assume you mean EFTA00024813?
If so, while it doesn’t show on the website, it is included in dataset 8 if you just download the complete set.
It is a .xlsx file.
It’s in DataSet 8\VOL00008\NATIVES\0001
This is the case for most of these files, dataset 9 might be missing a few though.
Fediverse post I linked has a similar script already made
someone on reddit ( u/FuckThisSite3 ) posted a more complete DataSet 9:
I assembled a tar file with all I got from dataset-9.
Magnet link:
magnet:?xt=urn:btih:5b50564ee995a54009fec387c97f9465eb18ba00&dn=dataset-9_by_fuckthissite3.tar&xl=148072017920SHA256: 5adc043bcf94304024d718e57267c1aa009d782835f6adbe6ad7fdbb763f15c5
The tar contains 254,477 files which is 148072017920 bytes (148.07GB/137.9GiB)
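The two size figures quoted above differ only in the unit: GB divides by 10^9 and GiB by 2^30. A small sketch of that arithmetic; `describe_size` is a hypothetical helper name.

```python
def describe_size(num_bytes):
    """Format a byte count in decimal gigabytes (GB) and binary gibibytes (GiB)."""
    gb = num_bytes / 10**9   # 1 GB  = 1,000,000,000 bytes
    gib = num_bytes / 2**30  # 1 GiB = 1,073,741,824 bytes
    return f"{gb:.2f} GB / {gib:.1f} GiB"


print(describe_size(148072017920))  # -> 148.07 GB / 137.9 GiB
```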
While it’s bigger in size, this one seems to be missing a ton of files? I grabbed the earlier collections and my file count is at 531,256. I’ll have to compare when I finish downloading.
Seeding 1 node, 3 on the way EDIT: 3 running, 4th one planned to be temporary but should soon be up
A consolidated (and structured) torrent file has been released: https://github.com/yung-megafone/Epstein-Files/issues/1#issuecomment-3860836655
Currently clearing data from my seedbox to get this added.
In it with 4 nodes still :)
https://archive.org/details/ds-9-efta-gap-repair
Repaired gaps in Partial Dataset 9:
- EFTA00593870.pdf
- EFTA00595160.pdf
- EFTA00595410.pdf
- EFTA00595694.pdf
- EFTA00595820.pdf
- EFTA00597207.pdf
- EFTA00605675.pdf
- EFTA00645624.pdf
- EFTA00774768.pdf
- EFTA01175426.pdf
- EFTA01220934.pdf
I don’t see this posted here yet. Below is the Github repo link for an index of torrent files, DOJ links, and mirrors, for every dataset. https://github.com/yung-megafone/Epstein-Files
I am seeding 8, 10, 11, 12 (the ones only available as .zip on justice.gov) for the foreseeable future (as well as the partials of set 9). I’m looking “everywhere” hoping for some success on part 9 and will be pushing that one until bandwidth dies or until a dozen or so seeders are on - whenever the complete bundle is assembled. Hoping for some good news soon, things seem to be nuked very rapidly now.
I also read that the court documents and one other page was taken down - I have those files but they are not sorted by page, just thrown in a bulk download directory as I had a feeling this would happen and I wanted to pull them quickly. If there’s any use for them anyway I put them on Mega and Gofile a few days ago and they’ve not been taken down so far;
https://gofile.io/d/dff931d5-a646-46f1-b34e-079798f508a2 https://mega.nz/folder/XVMCgLLR#EKVS8Sfiry-VtVAxZ7q_Ig
It’s most likely files that “everyone” already has, but better one mirror too many than one too few.
Thanks for the links. I downloaded the docs and will add them to the pile.
Also seeding (but will probably not for very long unless seeders start dropping, they are all at 300-1200 ATM) 59975667f8bdd5baf9945b0e2db8a57d52d32957 0a3d4b84a77bd982c9c2761f40944402b94f9c64 7ac8f771678d19c75a26ea6c14e7d4c003fbf9b6 c3a522d6810ee717a2c7e2ef705163e297d34b72 d509cc4ca1a415a9ba3b6cb920f67c44aed7fe1f e618654607f2c34a41c88458bf2fcdfa86a52174 acb9cb1741502c7dc09460e4fb7b44eac8022906
Trying to pull c100b1b7c4b1e662dd8adc79ae3e42eef6080aee (reduntant limited dataset for that GitHub relations chart)
Pulling f5cbe5026b1f86617c520d0a9cd610d6254cbe85 (just listed on the GitHub repo that lists the same magnets as here - will probably become 2nd seeder in an hour or two and will stay seeding on that one for at least a week or until the swarm looks healthy by the dozens or so.)
Will continue to monitor whatever progress is being made here. I should also have a small subset of DS9 but it will likely only be the first 200 files or so at most. Needless to say I will compare against the existing torrents just in case.
Thanks everyone for your hard work, this is exactly why I started hoarding :)
EDIT: The last magnet ID I listed is the summarized torrent from the repo linked by Nomad64.
Hi everyone, maybe I’m a bit late to this, but I wanted to share my findings. I parsed every page up to 40k in DS9 3 times, and the results matched PeoplesElbow’s findings by distribution (no content after page 14k and a lot of duplications), BUT I parsed 4 times more unique URLs: 246_079 (still 2x short of the official size). A strange thing is that on the second pass (one day after the first one) I started receiving new URLs on old pages.
Here is stat by file type:
count | file type
--------+------
1 | ts
8 | mov
236 | mp4
244326 | pdf
73 | m4a
1 | vob
1 | docx
1 | doc
9 | m4v
1422 | avi
1 | wmv
Nice work man! I also discovered something yesterday that I think is worth pointing out.
DUPLICATE FILES: Within the datasets, there are often emails, doc scans, etc. that are duplicate entries. (I’m not talking about multi-torrent stitching, but actual duplicate documents within the raw dataset.) **These duplicates must be preserved.** When looking at two copies of the same duplicate file, I found that sometimes the redactions are in different places! This can be used to extract more info later down the road.
Can you make a torrent of the new files if you find any?
Finally got my hands on original DS9 OPT file and I have started downloading files from it. Don’t know how long it will take. Also made a git with stats and index files from doj website and opt from archive: https://github.com/ArzymKoteyko/JEDatasets In short the only difference is that I got additional 1753 links to video files and a strange .docx file with size of 0 bytes [EFTA00335487.docx].
For anyone looking into doing some OSINT work, this is an epic file EFTA00809187
It contains lists of ALL known JE emails, usernames, websites, social media accounts, etc. from that time
nah, i didn’t hear anything back
EFTA00809187 Did that guy from pastebin with the complete file DS9file ever answer you?
Here is the download link for a text file that has all the original URL’s https://wormhole.app/PpjJ3P#SFfAOKm1bnCyi-h2YroRyA The link will only last for 24 hours.
I have never made a torrent file before so feel free to correct me if it doesn’t work. Here is the magnet link for this as a torrent file so its up for more than an hour magnet:?xt=urn:btih:694535d1e3879e899a53647769f1975276723db7&xt=urn:btmh:12207cf818f0f0110ca5e44614f2c65e016eca2fe7bc569810f9fb25e80ff608fc9b&dn=DOJ%20Epstein%20file%20urls.txt&xl=81991719&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
What does this contain? anything new?
It’s the URLs of the original dataset 9. It was posted on the original reddit post.
please post again. thank you.
its a file list but not the actual files tho.
I’m still seeding the partial Dataset 9 (45.63GB and 89.54GB) and all the other datasets. Is there a newer dataset 9 available?
Theoretically speaking, if a website has the archives, what is stopping people from downloading each file on a page-by-page basis from the archive?
Edit: Never mind, I saw a full list of URLs that archive managed to save and it is missing a lot.
nothing, but even the archived pages aren’t 100% because some of the files were “faked” in the paginated file lists on the DOJ site. it does work well enough though. I did this to recover all the court records and FOIA files
I’m in the process of downloading both dataset 9 torrents (45.63 GB + 86.74 GB). I will then compare the filenames in both versions (the 45.63GB version has 201,358 files alone), note any duplicates, and merge all unique files into one folder. I’ll upload that as a torrent once it’s done so we can get closer to a complete dataset 9 as one file.
- Edit 31Jan2026 816pm EST - Making progress. I finished downloading both dataset 9s (45.6 GB and the 86.74 GB). The 45.6GB set is 200,000 files and the 86GB set is 500,000 files. I have a .csv of the filenames and sizes of all files in the 45.6GB version. I’m creating the same .csv for the 86GB version now.
-
Edit 31Jan2026 845pm EST -
- dataset 9 (45.63 GB) = 201357 files
- dataset 9 (86.74 GB) = 531257 files
I did an exact filename combined with an exact file size comparison between the two dataset9 versions. I also did an exact filename combined with a fuzzy file size comparison (tolerance of +/- 1KB) between the two dataset9 versions. There were:
- 201330 exact matches
- 201330 fuzzy matches (+/- 1KB)
Meaning there are 201330 duplicate files between the two dataset9 versions.
These matches were written to a duplicates file. Then, from each dataset9 version, all files/sizes matching the file and size listed in the duplicates file will be moved to a subfolder. Then I’ll merge both parent folders into one enormous folder containing all unique files and a folder of duplicates. Finally, compress it, make a torrent, and upload it.
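The name-plus-size comparison described above can be sketched roughly as follows. This is an illustration under assumptions, not the script actually used: it keeps manifests as in-memory dicts rather than .csv files, assumes file names are unique within each version, and the names `build_manifest` and `find_matches` are hypothetical.

```python
from pathlib import Path


def build_manifest(root):
    """Map each file name under root to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(root).rglob("*") if p.is_file()}


def find_matches(manifest_a, manifest_b, tolerance=0):
    """Names present in both manifests whose sizes differ by at most `tolerance` bytes.

    tolerance=0 is the exact comparison; tolerance=1024 is the fuzzy (+/- 1KB) one.
    """
    shared = manifest_a.keys() & manifest_b.keys()
    return sorted(
        name for name in shared
        if abs(manifest_a[name] - manifest_b[name]) <= tolerance
    )
```

A match here only means the name and size agree. Two files with different content can still pass this test, which is why a checksum pass is the stricter follow-up.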
-
Edit 31Jan2026 945pm EST -
Still moving duplicates into subfolders.
-
Edit 31Jan2026 1027pm EST -
Going off of xodoh74984’s comment (https://lemmy.world/post/42440468/21884588), I’m increasing the rigor of my determination of whether the files that share a filename and size between both versions of dataset9 are in fact duplicates. This will be identical to `rsync --checksum` to verify bit-for-bit that the files are the same by calculating their MD5 hash. This will take a while but is the best way.
-
Edit 01Feb2026 1227am EST -
Checksum comparison complete. 73 files found that have the same file name and size but different content. Total number of duplicate files = 201257. Merging both dataset versions now, while keeping one subfolder of the duplicates, so nothing is deleted.
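The checksum pass can be sketched like this. It is a stand-in for the approach described, not the original script: it swaps in SHA-256 where the comment mentions MD5, assumes both versions sit in flat directories, and uses the hypothetical names `sha256_of` and `split_by_content`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file, read in chunks to keep memory use flat."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def split_by_content(names, dir_a, dir_b):
    """Partition candidate duplicate names into (identical, differing) by content hash."""
    identical, differing = [], []
    for name in names:
        if sha256_of(Path(dir_a) / name) == sha256_of(Path(dir_b) / name):
            identical.append(name)
        else:
            differing.append(name)
    return identical, differing
```

Files that land in `differing` share a name and size but not content, which corresponds to the 73 cases reported above.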
-
Edit 01Feb2026 1258am EST -
Creating the `.tar.zst` file now. 531285 total files, which includes all unique files between dataset9 (45.6GB) and dataset9 (86.7GB), as well as a subfolder containing the files that were found in both dataset9 versions.
-
Edit 01Feb2026 215am EST -
I was using wayyyy too high a compression value for no reason (`zstd --ultra -22`). Restarted the `.tar.zst` file creation (with `zstd -12`) and it’s going 100x faster now. Should be finished within the hour
-
Edit 01Feb2026 311am EST -
`.tar.zst` file creation is taking very long. I’m going to let it run overnight - will check back in a few hours. I’m tired boss.
- EDIT 01Feb2026 831am EST -
COMPLETE!
And then I doxxed myself in the torrent. One moment please while I fix that…
Final magnet link is HERE. GO GO GOOOOOO
I’m seeding @ 55 MB/s. I’m also trying to get into the new r/EpsteinPublicDatasets subreddit to share the torrent there.
deleted by creator
deleted by creator
looking forward to your torrent, will seed.
I have several incomplete sets of files from dataset 9 that I downloaded with a scraped set of urls - should I try to get them to you to compare as well?
Yes! I’m not sure the best way to do that - upload them to MEGA and message me a download link?
maybe archive.org? that way they can be torrented if others want to attempt their own merging techniques? either way it will be a long upload, my speed is not especially good. I’m still churning through one set of urls that is 1.2M lines, most are failing but I have 65k from that batch so far.
archive.org is a great idea. Post the link here when you can!
I’ll get the first set (42k files in 31G) uploading as soon as I get it zipped up. it’s the one least likely to have any new files in it since I started at the beginning like others but it’s worth a shot
edit 01FEB2026 1208AM EST - 6.4/30gb uploaded to archive.org
edit 01FEB2026 0430AM EST - 13/30gb uploaded to archive.org; scrape using a different url set going backwards is currently at 75.4k files
edit 01FEB2026 1233PM EST - had an internet outage overnight and lost all progress on the archive.org upload, currently back to 11/30gb. the scrape using a previous url set seems to be getting very few new files now, sitting at 77.9k at the moment
I’m downloading 8-11 now, I’m seeding 1-7+12 now. I’ve tried checking up on reddit, but every other time i check in the post is nuked or something. My home server never goes down and I’m outside USA. I’m working on the 100GB+ #9 right now and I’ll seed whatever you can get up here too.
Thank you so much for keeping us updated!!
Have a good night. I’ll be waiting to download it, seed it, make hardcopies and redistribute it.
Please check back in with us
When merging versions of Data Set 9, is there any risk of loss with simply using `rsync --checksum` to dump all files into one directory?
`rsync --checksum` is better than my file name + file size comparison, since you are calculating the hash of each file and comparing it to the hash of all other files. For example, if there is a file called data1.pdf with size 1024 bytes in dataset9-v1, and another file called data1.pdf with size 1024 bytes in dataset9-v2, but their content is different, my method will still detect them as identical files.
I’m going to modify my script to calculate and compare the hashes of all files that I previously determined to be duplicates. If the hashes of the duplicates in dataset9 (45GB torrent) match the hashes of the duplicates in dataset9 (86GB torrent), then they are in fact duplicates between the two datasets.
Amazing, thank you. That was my thought, check hashes while merging the files to keep any copies that might have been modified by DOJ and discard duplicates even if the duplicates have different metadata, e.g. timestamps.
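One way to merge while keeping modified copies and ignoring timestamps is to decide purely on content hashes. A rough sketch under assumptions: `merge_into` and the `.variant-<hash>` naming are hypothetical choices for illustration, not what anyone in the thread actually ran.

```python
import hashlib
import shutil
from pathlib import Path


def digest(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file's content (metadata such as mtime is ignored)."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def merge_into(source_dir, target_dir):
    """Copy source files into target; same-name files with different content are kept as variants."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    kept_variants = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        dest = target_dir / src.relative_to(source_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if not dest.exists():
            shutil.copy2(src, dest)  # new file: copy as-is
            continue
        src_hash = digest(src)
        if src_hash == digest(dest):
            continue  # identical content: discard the duplicate, whatever its timestamps
        # Same name, different content: keep both, tagging the newcomer with its hash.
        variant = dest.with_name(f"{dest.stem}.variant-{src_hash[:12]}{dest.suffix}")
        if not variant.exists():
            shutil.copy2(src, variant)
        kept_variants.append(variant)
    return kept_variants
```

Nothing in the target is overwritten, so a copy with different redactions survives next to the original instead of replacing it.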
here is the file contents w/ SHA-256 hashes: deleted this
the original post on reddit was deleted after sharing this https://old.reddit.com/r/DataHoarder/comments/1qsfv3j/epstein_9_10_11_12_reddit_keeps_nuking_thread_we/o2vqgoc/
anyone have the original 186gb magnet link from that thread? someone said reddit keeps nuking it because it implicates reddit admins like spez
This is it, encoded in base 64 format, according to the comment:
bWFnbmV0Oj94dD11cm46YnRpaDo3YWM4Zjc3MTY3OGQxOWM3NWEyNmVhNmMxNGU3ZDRjMDAzZmJmOWI2JmRuPWRhdGFzZXQ5LW1vcmUtY29tcGxldGUudGFyLnpzdCZ4bD05NjE0ODcyNDgzNyZ0cj11ZHAlM0ElMkYlMkZ0cmFja2VyLm9wZW50cmFja3Iub3JnJTNBMTMzNyUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRm9wZW4uZGVtb25paS5jb20lM0ExMzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGZXhvZHVzLmRlc3luYy5jb20lM0E2OTY5JTJGYW5ub3VuY2UmdHI9aHR0cCUzQSUyRiUyRm9wZW4udHJhY2tlci5jbCUzQTEzMzclMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZvcGVuLnN0ZWFsdGguc2klM0E4MCUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnplcjBkYXkuY2glM0ExMzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGd2Vwem9uZS5uZXQlM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlcjEubXlwb3JuLmNsdWIlM0E5MzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci50b3JyZW50LmV1Lm9yZyUzQTQ1MSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnRyYWNrZXIudGhlb2tzLm5ldCUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZ0cmFja2VyLnNydjAwLmNvbSUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZ0cmFja2VyLnF1LmF4JTNBNjk2OSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnRyYWNrZXIuZGxlci5vcmclM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci5iaXR0b3IucHclM0ExMzM3JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci5hbGFza2FudGYuY29tJTNBNjk2OSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnRyYWNrZXItdWRwLmdiaXR0LmluZm8lM0E4MCUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRnJ1bi5wdWJsaWN0cmFja2VyLnh5eiUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZvcGVudHJhY2tlci5pbyUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZvcGVuLmRzdHVkLmlvJTNBNjk2OSUyRmFubm91bmNlJnRyPWh0dHBzJTNBJTJGJTJGdHJhY2tlci56aHVxaXkuY29tJTNBNDQzJTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdHJhY2tlci5maWxlbWFpbC5jb20lM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGdC5vdmVyZmxvdy5iaXolM0E2OTY5JTJGYW5ub3VuY2UmdHI9dWRwJTNBJTJGJTJGbWFydGluLWdlYmhhcmR0LmV1JTNBMjUlMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkZldmFuLmltJTNBNjk2OSUyRmFubm91bmNlJnRyPXVkcCUzQSUyRiUyRmQ0MDk2OS5hY29kLnJlZ3J1Y29sby5ydSUzQTY5NjklMkZhbm5vdW5jZSZ0cj11ZHAlM0ElMkYlMkY2YWhkZHV0YjF1Y2MzY3AucnUlM0E2OTY5JTJGYW5ub3VuY2U
Thank you so much for re-archiving it in a better format
Be prepared to wait a while… idk why this person chose xz, it is so slow. I’ve been just trying to get the tarball out for an hour.
Thank you for the final link, downloading now. Will seed forever if needed.
deleted by creator
this method is not working for me anymore
deleted by creator
I messaged you on the other site; I’m currently getting a `Could not determine Content-Length (got None)` error
deleted by creator
I also was getting the same error. Going to the link successfully downloads.
Updating the cookies fixed the issue.
Can also confirm, receiving more chunks again.
EDIT: Someone should play around with the retry and backoff settings to see if a certain configuration can avoid being blocked for a longer period of time. IP rotating is too much trouble.
deleted by creator
age gate > page not found
deleted by creator
alrighty, I’m currently in the middle of the archive.org upload but I can transfer the chunks I already have over to a different machine and do it there with a new IP
deleted by creator
Nor I. I got a single chunk back before never getting anything again.
I’m using a partial download I already had and not the 48gb version but I will be gathering as many chunks as I can as well. Thanks for making this
deleted by creator
about 25gb
deleted by creator
Is anyone able to get this working again? It seemed to stop. I have updated cookies. If I remove the chunks it seems to start connecting again but when I put them back it runs for a few mins and then kicks the bucket.
Funny how a rag-tag ad-hoc group can seed data so much better than the DOJ. Beautiful to see in action.
The doj could do better, they are ordered not to.
Heads up that the DOJ site is a tar pit: it’s going to return 50 files on the page regardless of the page number you’re on. It seems like somewhere between 2k-5k pages it just wraps around right now.
Testing page 2000... ✓ 50 new files (out of 50)
Testing page 5000... ○ 0 new files - all duplicates
Testing page 10000... ○ 0 new files - all duplicates
Testing page 20000... ○ 0 new files - all duplicates
Testing page 50000... ○ 0 new files - all duplicates
Testing page 100000... ○ 0 new files - all duplicates
I saw this too; yesterday I tried manually accessing the page to explore just how many there are. Seems like some of the pages are duplicates (I was simply comparing the last listed file name and content between some of the first 10 pages, and even had 1-2 duplications.)
As far as the maximum page number goes, if you use the query parameter `?page=200000000` it will still resolve a list of files. Actually crazy.
https://www.justice.gov/epstein/doj-disclosures/data-set-9-files?page=200000000
The last page I got a non-duplicate URL from was 10853 which curiously only had 36 URLs on page. When I browsed directly to page 10853 36 URLs were displayed but then moving back and forth in the page count the tar pit logic must have re-looped there and it went back to 50 Displayed. I ended with 224751 URLs
I was quick to download dataset 12 after it was discovered to exist, and apparently my dataset 12 contains some files that were later removed. Uploaded to IA in case it contains anything that later archivists missed. https://archive.org/details/data-set-12_202602
Specifically doc number 2731361 and others around it were at some point later removed from DoJ, but are still within this early-download DS12. Maybe more, unsure
I’ve got that one too, maybe we should compare dataset 12 versions too

