• ComradePenguin@lemmy.ml (OP)
    1 day ago

    Updated the question to clarify. I was thinking more of specific repos being laundered, not the entirety.

    • slazer2au@lemmy.world
      1 day ago

      When you use LLMs to regurgitate code, you do not get ownership of the code, as you did not produce it. So using LLMs to “launder” code doesn’t accomplish anything.

      • ComradePenguin@lemmy.ml (OP)
        1 day ago

        From my understanding, it does allow me to use the code for any purpose regardless of the license, does it not? Even if I don’t own the LLM-written code?

        • floquant@lemmy.dbzer0.com
          7 hours ago

          It does not.

          From the GPL terms:

          To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy.

          Other licenses may be more permissive and do allow you to do pretty much whatever you want with it, but I don’t see why feeding some source code into an LLM would exempt you from its license.

          It doesn’t matter whether it’s you reading it or an LLM doing inference on it: you’re still taking the source code as a starting point to create a derived work based on it, and as such you are subject to its license.

        • Ephera@lemmy.ml
          1 day ago

          Yeah, but you also have to be aware that companies rarely care to (fully) comply with licenses to begin with, if their own code isn’t publicly accessible.

          Basically:

          • If they actually open-source their own code, they have to fully comply (though the worst consequence is often just having to open-source their own code, which it already is, so compliance might not always be the highest priority either).
          • If they build a frontend, they generally do want to comply, because someone might be able to decompile the software and prove that licensed code is used inappropriately.
          • If they build a backend or build tooling or the like, GPL and AGPL are often still prohibited due to the high impact, but other than that, complying with licenses is seen as reducing risk for something that’s pretty unlikely to affect them. The chance of them being sued for code that no one sees is practically zero, so it’s usually treated as an acceptable legal risk to not give a fuck.

        • PlzGibHugs@piefed.ca
          1 day ago

          Even before getting into the copyrightability of code, at the very least, any LLM-produced parts are not copyrightable. They are public domain.

          That said, if it’s a mix of LLM code and human code, things get pretty messy. From my understanding, if the human expanded on or modified AI code, it’s public domain. If they wrote a section fully independently, they absolutely own the copyright. If it’s an unclear mix, it would have to be proven on a case-by-case basis, with the onus on the AI user to provide solid evidence that the code copied isn’t AI-generated.