You could always retype it and say you’ve recreated it, but that never flew either. Why would overfitting a machine learning algorithm on the data and then having it predict next tokens be any different?
Because machine learning is already basically mass copyright infringement. The training data contains copyrighted material. The model is clearly a derivative of the training data. The output is clearly a derivative of the model. Yet somehow, it’s legal (probably because they can afford good lawyers).