Sex offender banned from using AI tools in landmark UK case

girlfreddy@lemmy.ca · 7 months ago

JackGreenEarth@lemm.ee · edit-2 7 months ago

As a UK citizen, I’m ashamed of my government.

I am firmly against child abusers, but AI images don’t harm anyone and are a safe and harmless way for pedophiles to fulfil their urges, which they cannot control.

Tippon@lemmy.dbzer0.com · 7 months ago

Where does the training data come from to create indecent images of children?

Dran@lemmy.world · edit-2 7 months ago

It doesn’t need csam data for training, it just needs to know what a boob looks like, and what a child looks like. I run some sdxl-based models at home and I’ve observed it can be difficult to avoid more often than you’d think. There are keywords in porn that blend the lines across datasets (“teen”, “petite”, “young”, “small” etc). The word “girl” in particular I’ve found that if you add that to basically any porn prompt gives you a small chance of inadvertently creating the undesirable. You have to be really careful and use words like “woman”, “adult”, etc instead to convince your image model not to make things that look like children. If you’ve ever wondered why internet-based porn generators are on super heavy guardrails, this is why.

xmunk@sh.itjust.works · 7 months ago

It is true, a 10 year old naked woman is just a 30 year old naked woman scaled down by 40%. /s

No buddy, there isn’t some vector of “this is the distance between kid and adult” that a model can apply to generate what a hypothetical child looks like. The base model was almost certainly trained on more than just anatomical drawings from Wikipedia - it ate some csam.

If you’ve seen stuff about “Hitler - Germany + Italy = Mussolini” for models where that’s true (which is not universal), it takes an awful lot of training data to establish and strengthen those vectors. Unless the generated images were comically inaccurate, a lot of training went into this too.

rebelsimile@sh.itjust.works · 7 months ago

Right, and the google image ai gobbled up a bunch of images of black george washington, right? They must have been in the data set, there’s no way to blend a vector from one value to another, like you said. That would be madness. Nope, must have been copious amounts of asian nazis in the training set, since the model is incapable of blending concepts.

PotatoKat@lemmy.world · 7 months ago

From a few months ago

https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

xmunk@sh.itjust.works · 7 months ago

You’re incorrect and you should fucking know better.

I have no idea why my comment above was downvoted to hell but AI can’t “dream up” what a naked young person looks like. An AI can figure that adults wear different clothes and put a black woman in a revolutionary war outfit. These are totally different concepts.

You can downvote me if you like but your AI generated csam is based on real csam so fuck off. I’m disappointed there is such a large proportion of people defending csam here especially since lemmy should be technically oriented - I expect to see more input from fellow AI fluent people.

rebelsimile@sh.itjust.works · 7 months ago

You’re spreading misinformation and getting called out for it.

xmunk@sh.itjust.works · 7 months ago

It isn’t misinformation, though; generative AI needs a basis for its generation.

xmunk@sh.itjust.works · 7 months ago

Just a note - csam has been found in model training sets: https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

7 months ago

Bro googled the word vector and was waiting to use it.

Leate_Wonceslace@lemmy.dbzer0.com · edit-2 7 months ago

No, they’re referring to the internal workings of AI models, which are essentially a series of incredibly high-dimension matrices with extra bits around them to make them work. Individual concepts are embedded as vectors in the space that these models work in. That’s why linear algebra is brought up so frequently in discussions of AI.

The_Vampire@lemmy.world · 7 months ago

While it’s true that linear algebra and vectors are used in learning models, they’re not using the term correctly in a way that says they know something about the subject (at least, the modern subject). Concepts aren’t embedded as vectors. In older models (before the craze), concepts were manually embedded as numbers or a collection of numbers, which could be a vector (but could be something else as well), and the machine would learn by modifying weights. However, in current models (and by current, I mean at least more than a couple years), concepts are learnt by the machine (weights are still modified by the machine as well) and the machine makes its own connections between features presented to it.

For example, you give it a dataset of 10x10 pixel images (with text descriptions) and it reads that as 100 pixels split into 3 numbers (RGB) and then looks for connections between those numbers and in which pixels. It’s not identifying what a boob is, but knows that when an image has ‘boob’ in the text description then there’s a very high likelihood that there will be a circular collection of pixels with lots of red somewhere in the image that are also connected to other pixels that are often also lots of red. That’s me breaking down what a human would think given the same task/information, but the reality is the machine will come up with its own connections/concepts which are both often far better than humans (when the model works, at least) and far more ineffable to humans.

PotatoKat@lemmy.world · 7 months ago

From a few months ago

https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

Dran@lemmy.world · 7 months ago

I’m not going to say that csam in training sets isn’t a problem. However, even if you remove it, the model remains largely the same, and its capabilities remain functionally identical.

PotatoKat@lemmy.world · 7 months ago

At that point it’s still using photos of children to generate csam even if you could somehow assure the model is 100% free of csam

Dran@lemmy.world · 7 months ago

That would be true, it’d be pretty difficult to build a model without any pictures of children at all, and then try and describe to the model how to alter an adult to make a child. Is anyone asking for that though? To make it illegal to have regular pictures of children in these datasets?

PotatoKat@lemmy.world · 7 months ago

No but it is a reason why generating csam should be illegal. You’re using data trained on pictures of real kids

Tippon@lemmy.dbzer0.com · 7 months ago

Thanks for the reply, it’s given me a good idea of what’s most likely happening :)

It’s a shame that the rest of the thread went to shit, but unfortunately it’s an emotional topic, and brings out emotional responses

Dran@lemmy.world · 7 months ago

Always happy to try and productively add to someone’s learning.

Daxtron2@startrek.website · 7 months ago

The whole point of diffusion models is that you can generate new concepts using training data. Models trained on any nsfw images can combine those concepts with any of its non-nsfw concepts. Of course, that’s not to say there isn’t CSAM in any training data, because there objectively has been in the past, but there doesn’t need to be any to generate it.

Tippon@lemmy.dbzer0.com · 7 months ago

Thanks for the reply, that makes a lot of sense :)

Daxtron2@startrek.website · 7 months ago

Thanks for not being a dick! I aim to inform

Turun@feddit.de · 7 months ago

AI is able to fill in the last field in a table like “Old / young” vs “Clothed / naked” when given three of the four fields.

PotatoKat@lemmy.world · 7 months ago

Csam is in the training data. From a few months ago

https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

randomaside@lemmy.dbzer0.com · 7 months ago

Please reiterate your statement but instead using the “goose chase meme” format.

PotatoKat@lemmy.world · 7 months ago

From a few months ago

https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

Flying Squid@lemmy.world · 7 months ago

How could they possibly enforce this ban?

KubeRoot@discuss.tchncs.de · 7 months ago

Wasn’t the point that what he was using them for was already illegal? Sounds like he already couldn’t get caught, so doesn’t seem like that’ll do much…

treefrog@lemm.ee · edit-2 7 months ago

What do you mean? Like how would they catch him?

In the States parole/probation means you lose most of your civil liberties. In other words, if this was the U.S. a PO would check his phone and possibly his computer. Possibly even pull ISP records depending on how bad they want to catch you/how full of shit they think you are.

Flying Squid@lemmy.world · 7 months ago

How will they even know he’s doing it? It doesn’t say they’re monitoring his internet connection. And even if they were monitoring his internet connection, he could go to some public wifi hotspot and sit in a car and do it.

treefrog@lemm.ee · 7 months ago

I edited my comment. You’re too quick.

But yeah, he could get around it. But, he’s an addict. He’s going to want that porn other places than his car and make mistakes. If he’s tech savvy, he can probably stay one step ahead of his probation agent (assuming he has one). If he’s not, he’ll slip up because he’s addicted, and that’s how people get caught.

magnusrufus@lemmy.world · 7 months ago

Put monitoring software on his devices.

Flying Squid@lemmy.world · 7 months ago

He could just get a burner phone. Realistically, there is no way to police this.

Scratch@sh.itjust.works · 7 months ago

This is pretty similar to restraining orders, make it more difficult and make the consequences more severe.

bobs_monkey@lemm.ee · 7 months ago

Or a burner laptop/Chromebook/whatever. Couple that with a VPN, using a neighbor’s wifi, public hotspots, etc, I don’t really see how they can realistically enforce this against someone motivated to do what they’re gonna do.

xmunk@sh.itjust.works · 7 months ago

In the modern world when we have cellphones that can do pretty much anything… it’s fucking hard. There will be a parole officer and monitoring software with periodic physical inspections along with watching his purchases. (That’s, at least, the American approach).

Usually the way it works is that when this dude slips up once he goes to prison for violating his court order.

magnusrufus@lemmy.world · 7 months ago

Have part of his probation be having his property searched to check for such devices.

stoly@lemmy.world · 7 months ago

There’s a log for everything. There really is. It’s just hard to piece it all together.

Deceptichum@sh.itjust.works · 7 months ago

Maybe they think he is capable of self enforcing the ruling?

Or that they want the option of gaoling him if they so much as get a hint he’s using one of the services in any capacity.

michaelmrose@lemmy.world · 7 months ago

Why didn’t he get banned from using the internet?

The Snark Urge@lemmy.world · 7 months ago

I definitely can’t let you do that, Hal.

bloodfart@lemmy.ml · 7 months ago

Jesus Christ is that @PotatoKat@lemmy.world ’s music I hear?

PotatoKat@lemmy.world · 7 months ago

Just annoyed to see everyone saying with such definitive wording that there isn’t any csam in training data. I’m a victim of CSA and can’t imagine how I would feel if photos of me were used to help get people off like that.

pory@lemmy.world · 7 months ago

Right? Training data is an absurd blob of everything the algorithm can get its hands on. It’s like trying to assure that there’s no alcohol or coca-cola in a lake.

bloodfart@lemmy.ml · 7 months ago

It’s great to see you wading into this shitshow with the folding chair, ngl.

AutoTL;DR@lemmings.world · 7 months ago

This is the best summary I could come up with:

The Internet Watch Foundation (IWF) said the prosecutions were a “landmark” moment that “should sound the alarm that criminals producing AI-generated child sexual abuse images are like one-man factories, capable of churning out some of the most appalling imagery”.

Susie Hargreaves, the charity’s chief executive, said that while AI-generated sexual abuse imagery currently made up “a relatively low” proportion of reports, they were seeing a “slow but continual increase” in cases, and that some of the material was “highly realistic”.

The Lucy Faithfull Foundation (LFF), which runs the confidential Stop It Now helpline for people worried about their thoughts or behaviour, said it had received multiple calls about AI images and that it was a “concerning trend growing at pace”.

The decision to ban an adult sex offender from using AI generation tools could set a precedent for future monitoring of people convicted of indecent image offences.

Stability AI, the company behind Stable Diffusion, said the concerns about child abuse material related to an earlier version of the software, which was released to the public by one of its partners.

It said that since taking over the exclusive licence in 2022 it had invested in features to prevent misuse including “filters to intercept unsafe prompts and outputs” and that it banned any use of its services for unlawful activity.

The original article contains 974 words, the summary contains 219 words. Saved 78%. I’m a bot and I’m open source!