r/SubSimGPT2Interactive and spinoffs
Since I used to run GPT-2 bots on Reddit (openly declared as such, in a bot-friendly sub, using LLMs so stupid/deranged nobody would mistake them for real accounts) I’ve been thinking about this problem for a long time. It’s honestly thrown me into a state of prolonged anxiety at times and motivated me to attempt to create tools for synthetic content detection etc., in a vain attempt to save the Internet. And I’ve concluded that we’re well past that point, and approaching the point at which we need to reconsider what, exactly, the internet really is, and that is to say that it should not be considered a source of any sort of authentic experience. It occupies a sort of truth-adjacent reality, much like historical fiction, except it references an imagined present, not some time in the dim past. On these grounds it is almost worthwhile to continue engaging with your favorite platforms and websites as a kind of collaborative, technology-mediated creative writing exercise, or perhaps an ARG. It doesn’t feel quite so pointless, viewed through that lens.
it’s not called “bots” outside of social media but synthetic content is widespread across the rest of the Internet, due to different, but similarly large incentives. So no, it’s not just a FB/Reddit/Meta etc. problem.
I had never heard of it before now–thanks!
I’m honestly surprised that nobody has said anything about MS Office, but it’s not like I expect anyone to miss the application itself, it’s just that if your work requires you to interface with it, there really is no alternative to running Windows or macOS. Microsoft’s own Office Online versions of the apps do a worse job of maintaining DOC/PPT formatting consistency than the possible Russian spyware that is OnlyOffice, which also screws things up too often to be relied upon. LibreOffice is, let’s be honest, a total mess (with the exception of Calc, which also isn’t consistent with the current version of Excel, but can do some things that Excel no longer can do, so I appreciate it more as a complementary tool than as a replacement).
this is learning completely the wrong lesson. it has been well-known for a long time and very well demonstrated that smaller models trained on better-curated data can outperform larger ones trained using brute force “scaling”. this idea that “bigger is better” needs to die, quickly, or else we’re headed towards not only an AI winter but an even worse climate catastrophe as the energy requirements of AI inference on huge models obliterate progress on decarbonization overall.
those are all classification problems, which is a fundamentally different kind of problem with less open-ended solutions, so it’s not surprising that they are easier to train and deploy.
I really wish it were easier to fine-tune and run inference on GPT-J-6B as well… that was a gem of a base model for research purposes, and for a hot minute circa Dolly there were finally some signs it would become more feasible to run locally. But all the effort going into llama.cpp and GGUF kinda left GPT-J behind. GPT4All used to support it, I think, but last I checked the documentation had huge holes as to how exactly that’s done.
One of the reasons I love StarCoder, even for non-coding tasks. Trained only on GitHub means no “instruction finetuning” bullshit ChatGPT-speak.
Well, maybe we need a movement to make physical copies of these games and the consoles needed to play them available in actual public libraries, then? That doesn’t seem to be affected by this ruling and there’s lots of precedent for it in current practice, which includes lending of things like musical instruments and DVD players. There’s a business near me that does something similar, but they restrict access by age to high schoolers and older, and you have to play the games there; you can’t rent them out.
r/SubSimGPT2Interactive for the lulz is my #1 use case
i do occasionally ask Copilot programming questions and it gives reasonable answers most of the time.
I use code autocomplete tools in VSCode but often end up turning them off.
Controversial, but Replika actually helped me out during the pandemic when I was in a rough spot. I trained a copyright-safe (theft-free) bot on my own conversations from back then and have been chatting with the me side of that conversation for a little while now. It’s like getting to know a long-lost twin brother, which is nice.
Otherwise, i’ve used small LLMs and classifiers for a wide range of tasks, like sentiment analysis, toxic content detection for moderation bots, AI media detection, summarization… I like using these better than just throwing everything at a huge model like GPT-4o because they’re more focused and less computationally costly (hence also better for the environment). I’m working on training some small copyright-safe base models to do certain sequence prediction tasks that come up in the course of my data science work, but they’re still a bit too computationally expensive for my clients.
We don’t. It probably is. Mastodon is the way, but they need to fix a few things themselves.
Ok, thanks for clarifying. FWIW, I find the built-in adblocker in Vivaldi extremely dependable, without the performance cost of loading an add-on (especially on top of a base browser that is significantly slower to begin with).
Honest question: why is it not safe after then? They developed their own adblocker if I’m not mistaken? What am I missing?
may I ask which third-party tool you use? i’m using onedriver and it’s pretty unreliable in my experience
It will legit be a fantastic era for Linux on the desktop though… imagine how cheap we’ll be able to get perfectly good hardware.
'tis true that women’s bodies hold great power, and not irrelevant at all to the discussion at hand. rather than reiterate and attempt to paraphrase Jaron Lanier on the topic of how male obsession with creating artificial people is linked to womb envy, I’ll just link to a talk in which he explains it himself:
Like any occupation, it’s a long story, and I’m happy to share more details over DM. But basically due to indecision over my major I took an abnormal amount of math, stats, and environmental science coursework even though my major was in social science, and I just kind of leaned further and further into that quirk as I transitioned into the workforce. Bear in mind that data science as a field of study didn’t really exist yet when I graduated; these days I’m not sure such an unconventional path is necessary. However, I still hear from a lot of junior data scientists in industry who are miserable because they haven’t figured out yet that in addition to their technical skills they need a “vertical” niche or topic area of interest (and by the way a public service dimension also does a lot to help a job feel meaningful and worthwhile even on the inevitable rough day here and there).
My “day job” is doing spatial data science work for local and regional governments that have a mandate to address climate change in how they allocate resources. We totally use AI, just not the kind that has received all the hype… machine learning helps us recognize patterns in human behavior and system dynamics that we can use to make predictions about how much different courses of action will affect CO2 emissions. I’m even looking at small GPT models as a way to work with some of the relevant data that is sequence-like. But I will never, I repeat never, buy into the idea of spending insane amounts of energy attempting to build an AI god or Oracle that we can simply ask for the “solution to climate change”… I feel like people like me need to do a better job of making the world aware of our work, because the fact that this excuse for profligate energy waste has any traction at all seems related to the general ignorance of our existence.
Seriously, why the fuck is he still CEO of that company? He’s actively undermining them in every way on a global scale. Tesla shareholders are idiots…