A new tool lets artists add invisible changes to the pixels in their art before they upload it online so that if it’s scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways.

The tool, called Nightshade, is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.
[…]
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows.

    • FaceDeer@kbin.social · 1 year ago

      There are trivial workarounds for Glaze, which this is based off of, so I wouldn’t be surprised.

    • hh93@lemm.ee · 1 year ago

      The problem is identifying it. If it’s necessary to preprocess every image used for training instead of just feeding it to a model, that already makes it much more resource-costly.

      • V H@lemmy.stad.social · 1 year ago

        You wouldn’t want to. If you just feed it to the models, then if there are enough of these images to matter, the model will learn to ignore the differences. You very specifically don’t want to prevent the model from learning to overcome these things, exactly because if you do, you’re stuck with workarounds like that forever, but if you don’t, the model will just become more robust to noisy data like this.

    • V H@lemmy.stad.social · (edited) · 1 year ago

      Yes: Train on more images processed by this.

      In other words: If the tool becomes popular it will be self-defeating by producing a large corpus of images teaching future models to ignore the noise it introduces.

      There are likely easier “quick fixes” while waiting for new models, but this is the general fix that will work against almost any adversarial attack like this.

      There might be theoretical attacks that’d be somewhat more difficult to overcome to the extent of requiring tweaks to the models, but given that there demonstrably exists a way of translating text to images that overcomes any such adversarial method that isn’t noticeable to humans, given that humans can, there will inherently always be a way to beat them.