this. ⬆️
This ‘AI’ all the things stuff really reminds me of the ‘smart’ all the things trend from a few years back… sooner or later people will realize exactly what should and should not be connected like this. ‘Smart/AI’ rice cookers, washing machines, toothbrushes… like WHY??
I like to store useful medical and survival information in my ArchiveBox and/or Kiwix containers. I’ve also been meaning to put together a local Wiki.js with more custom info. TubeArchivist is really good too for storing instructional videos for offline use, or for saving videos I find incredibly useful before YouTube takes them down, which seems likely for a lot of them.
For the metadata, LLMs may not prove so great. Use MusicBrainz Picard or Beets.
I use local AI for coding (more recently), plus ML facial recognition for photo storage and object detection for security cameras. I’ve been using the latter two for years now, actually; I don’t want that kind of info out on someone else’s cloud training on my images.
I run dozens of Proxmox LXCs, most with Docker, and can confirm that while Proxmox is fantastic for hosting NFS shares, it is near impossible to mount external ones in LXCs without doing some weirdness on the host. Best practice would be to turn the NFS share into Samba or something and configure the Docker VMs and LXCs to use that (45drives has awesome repos on GitHub that work really well; that’s how I worked around the issue). The only downside is that you usually need to offload everything as a backup first before the switch.
I’m a big privacy and FOSS advocate so my list is kinda long, but the main ones are:
-> Google (I use GrapheneOS)
-> TikTok
-> Tesla (too much data collection)
-> Microsoft (self-explanatory, however for some things I need to keep a W10 LTSC VM configured)
-> Adobe (same reasons as Michaelsoft)
-> OpenAI (same reasons as Michaelsoft, but I do use it inside a VM in no-account mode for some work-related things)
-> Uber (oh man that app is digital herpes)
-> Spotify
-> Facebook/Meta
-> Dropbox
RedLib and Invidious hoster here;
I can confirm they do not use any backend API. However, this means that after a while YouTube and Reddit kick in automatic rate limiting, and I have to switch up the VPN connection on my server. It’s annoying, but it works if you know a thing or two about proxies and web scraping (the knowledge from scraping carries over to setting up a suitable proxy config).
That said, RedLib’s backend token spoofing works a lot better than the Invidious method (Invidious emulates web traffic from Android mobile devices and gets the videos from Google Videos directly, bypassing YouTube for the heavy lifting).
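For anyone wondering what "switching up the connection" looks like when scripted, here is a rough sketch of the general idea: try a request through one proxy, and if the site answers with a rate-limit status, move on to the next. This is not how RedLib or Invidious do it internally, just the scraping-side pattern. The names `fetch_with_rotation` and `urllib_fetch`, the proxy URLs, and the choice of HTTP 429 as the only rate-limit signal are all my own assumptions.

```python
import itertools
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence, Tuple

# Assumption: the upstream site signals rate limiting with HTTP 429.
RATE_LIMIT_STATUS = 429

# A fetcher takes (url, proxy) and returns (status_code, body).
Fetcher = Callable[[str, str], Tuple[int, bytes]]


def urllib_fetch(url: str, proxy: str, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Fetch a URL through one proxy using only the standard library."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTP error responses (including 429) arrive as exceptions.
        return err.code, b""


def fetch_with_rotation(
    url: str,
    proxies: Sequence[str],
    fetch: Fetcher = urllib_fetch,
    max_attempts: int = 5,
) -> Optional[Tuple[str, bytes]]:
    """Cycle through proxies until one is not rate limited.

    Returns (proxy_used, body), or None if every attempt was rate limited.
    """
    if not proxies:
        raise ValueError("need at least one proxy")
    pool = itertools.cycle(proxies)
    for _ in range(max_attempts):
        proxy = next(pool)
        status, body = fetch(url, proxy)
        if status != RATE_LIMIT_STATUS:
            return proxy, body
    return None
```

The fetcher is passed in as a parameter so the rotation logic can be tried out with a fake before pointing it at real proxies. In practice you would probably also want a delay between attempts so you don't burn through every exit in a few seconds.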
I’ve been using Libreddit, and now its fork RedLib, side by side with Lemmy. Haven’t had too many problems; it does seem to be a good way to access the knowledge you mentioned without specific API use (it runs in Docker too): https://github.com/redlib-org/redlib
I use AnonymousOverflow. It works better than StackOverflow in my opinion, and it removes all ads, tracking and scripts. It works with Stack Exchange too.
SearXNG solves this (and many other problems)