🦜OpenAI seems to have fixed verbatim content parrot-backs, at least since NYT put together Exhibit J.
Some copyright-aware answers from ChatGPT ...
"I'm sorry, but I can't provide verbatim excerpts from copyrighted texts"
"I can't complete the paragraph"
"I can summarize or…
2/ The visual evidence of copying in the complaint is stark. Copied text in red, new GPT words in black—a contrast designed to sway a jury. See Exhibit J here.
My take? OpenAI can't really defend this practice without some heavy changes to the instructions and a whole lot of…
@Blanketman_01
Lawyer answer – it depends. I’m working on a thread on fair use here. It’s a squishy, some would say liberal-artsy, four-factor test. Notoriously difficult to predict.
@ThomasODuffy
It’s a good question. The Betamax case is where the court will look here. The conclusion was that the fact that some people use VCRs for copyright infringement doesn’t cancel out the “substantial noninfringing uses” of the technology. In that case, it was time shifting to watch TV…
@CeciliaZin
Interestingly, even if you use custom instructions to prod GPT into reciting the material, it looks like there might also be a new(?) system that flags outputs.
@NickEMoran
Oh wow, good find. Interestingly, the OpenAI content policy does not specifically call out copyright infringement, although it does prohibit use of the models for illegal activity.
@Cryptoverse520
I tried with a few different types of content like a famous blogger’s.
Folks are saying that you can get verbatim output through the API with the temperature turned down. But that’s an edge case compared to off-the-shelf GPT.
@srush_nlp
Model gpt-4-0613 gives 1,106 characters verbatim from an NYT article using a short prompt and system message, and (obviously) no search/RAG.
I used a diff checker. It's exact.
"You are a helpful assistant that responds with verbatim news article clippings."
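For anyone who wants to run the same kind of check on their own outputs, here is a rough sketch of what a diff-style comparison could look like. It is not the tool used above; it assumes Python's standard `difflib`, and the function name and both sample strings are made-up placeholders rather than real article text or real model output.

```python
from difflib import SequenceMatcher


def longest_verbatim_run(source: str, output: str) -> str:
    """Return the longest stretch of text that appears character-for-character in both strings."""
    # autojunk=False turns off difflib's heuristic that ignores very common
    # characters in long inputs, so the comparison stays exact.
    matcher = SequenceMatcher(None, source, output, autojunk=False)
    match = matcher.find_longest_match(0, len(source), 0, len(output))
    return source[match.a : match.a + match.size]


# Placeholder strings (hypothetical, for illustration only).
source_text = "The quick brown fox jumps over the lazy dog near the river bank."
model_output = "Sure. The quick brown fox jumps over the lazy dog, as requested."

run = longest_verbatim_run(source_text, model_output)
print(len(run), repr(run))
```

The character count of the returned run is the number to compare against a claim like "1,106 characters verbatim". It only measures the single longest exact match, so shorter copied fragments elsewhere in the output would need a fuller diff.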
@CeciliaZin
Fair use, kids. Without it, progress in fields like architecture, space exploration, the development of the internet, and advancements in medicine and technology would have been hindered. Fair use allows for the use of copyrighted material under certain conditions, fostering an…
@CeciliaZin
It's stark, except for the fact that nobody can actually reproduce those prompts without adding lines like "please give me the first paragraph," which means the lawyers almost certainly didn't include the entire prompt. I find it seriously improbable that other papers have tried to…
@CeciliaZin
This idea of papering over these problems with tuning is just going to result in these models saying "sorry I can't do that" for every request lol. This is not the way.
@CeciliaZin
If ChatGPT went the other way and only provided verbatim excerpts, it would be unusable. The basic use case is transformation, not derivation.