• Luc@lemmy.world
    7 hours ago

    You could always retype it by hand and claim you’d recreated it, but that never flew either. Why would overfitting a machine learning model on the data and then having it predict the next tokens be any different?

    • Limonene@lemmy.world
      6 hours ago

      Because machine learning is already basically mass copyright infringement. The training data contains copyrighted material. The model is clearly a derivative of the training data. The output is clearly a derivative of the model. Yet somehow, it’s legal (probably because they can afford good lawyers).