• awesomesauce309@midwest.social
      1 year ago

      It’s probably Stable Diffusion. I use ComfyUI since you can watch the sausage get made, but there are also other UIs like AUTOMATIC1111. There is a ControlNet, originally made as a QR pattern beautifier, that takes a two-tone black-and-white “guide” image, but you can guide it to follow any image you feed it, such as a meme edited to be black and white, or text like “GAY SEX.”
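
To make the “edited to be black and white” step concrete, here is a minimal sketch of turning a grayscale image into a two-tone guide. It assumes the image is already loaded as rows of 0-255 values; the function name `to_two_tone` and the threshold of 128 are my own illustrative choices, not anything the ControlNet or the UIs require.

```python
def to_two_tone(gray_pixels, threshold=128):
    """Map each 0-255 grayscale value to pure black (0) or pure white (255).

    gray_pixels is a list of rows; threshold is an assumed cutoff, tune to taste.
    """
    return [[255 if value >= threshold else 0 for value in row]
            for row in gray_pixels]


# Tiny hypothetical 2x3 "image" standing in for a real meme.
meme = [
    [12, 200, 130],
    [127, 128, 255],
]
guide = to_two_tone(meme)
```

In practice you would load and save the image with an image library and feed the result in as the guide; the thresholding itself is all that is shown here.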

    • CodeInvasion@sh.itjust.works
      1 year ago

      This is done by combining a diffusion model with a ControlNet interface. As long as you have a decently modern Nvidia GPU and familiarity with Python and PyTorch, it’s relatively simple to create your own model.

      The ControlNet paper is here: https://arxiv.org/pdf/2302.05543.pdf

      I implemented this paper back in March. It’s as simple as it is brilliant. By using methods originally intended to adapt large pre-trained language models to a specific application, the authors created a new model architecture that can better control the output of a diffusion model.
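
For anyone curious what that architecture looks like, here is a toy sketch of the core idea as I understand it from the paper: keep the original block frozen, train a copy of it, and connect the copy through layers initialized to zero so the combined model starts out behaving exactly like the original. Everything here is a simplification for illustration: `Linear` is a hypothetical per-element stand-in for a 1x1 convolution, and `ControlledBlock` is my own name, not the paper's code.

```python
import copy


class Linear:
    """Toy per-element layer y = w * x + b, standing in for a 1x1 convolution."""

    def __init__(self, w, b):
        self.w, self.b = w, b

    def __call__(self, xs):
        return [self.w * x + self.b for x in xs]


class ControlledBlock:
    """Frozen block plus a trainable copy, joined by zero-initialized layers."""

    def __init__(self, frozen_block):
        self.frozen = frozen_block                    # original weights, left untouched
        self.trainable = copy.deepcopy(frozen_block)  # copy that would be trained
        self.zero_in = Linear(0.0, 0.0)               # injects the condition, starts at zero
        self.zero_out = Linear(0.0, 0.0)              # gates the copy's output, starts at zero

    def __call__(self, x, cond):
        # Add the (initially zeroed) condition to the input of the trainable copy.
        injected = [a + b for a, b in zip(x, self.zero_in(cond))]
        # Pass the copy's output through the second zero-initialized layer.
        control = self.zero_out(self.trainable(injected))
        # Add the control signal onto the frozen block's output.
        return [a + b for a, b in zip(self.frozen(x), control)]


block = Linear(2.0, 1.0)
controlled = ControlledBlock(block)
x, cond = [1.0, 2.0], [5.0, -3.0]
before_training = controlled(x, cond)

# Pretend training has moved the zero layers away from zero.
controlled.zero_in.w = 1.0
controlled.zero_out.w = 1.0
after_training = controlled(x, cond)
```

Before any training the zero layers contribute nothing, so `before_training` matches the plain block's output; once their weights move, the condition starts steering the result. A real implementation would do this with PyTorch convolutions across the U-Net's encoder blocks rather than scalar weights.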