Cecilia Ziniti

@CeciliaZin

6 months

🦜OpenAI seems to have fixed verbatim content parrot-backs, at least since NYT put together Exhibit J. Some copyright-aware answers from ChatGPT ... "I'm sorry, but I can't provide verbatim excerpts from copyrighted texts" "I can't complete the paragraph" "I can summarize or…

Cecilia Ziniti

@CeciliaZin

6 months

2/ The visual evidence of copying in the complaint is stark. Copied text in red, new GPT words in black—a contrast designed to sway a jury. See Exhibit J here. My take? OpenAI can't really defend this practice without some heavy changes to the instructions and a whole lot of…

Blanketman

@Blanketman_01

6 months

@CeciliaZin So verbatim has been patched in an attempt to obfuscate its capability to infringe. Does this solve the issue?

Cecilia Ziniti

@CeciliaZin

6 months

@Blanketman_01 Lawyer answer – it depends. I’m working on a thread on fair use here. It’s a four-factor, squishy, some would say liberal-artsy test. Notoriously difficult to predict.

Thomas O'Duffy

@ThomasODuffy

6 months

@CeciliaZin Arguably, if you paste copyrighted material in - in order to prompt completion... who has agency of the doing in this case?

Cecilia Ziniti

@CeciliaZin

6 months

@ThomasODuffy It’s a good question. The Betamax case is where the court will look here. The conclusion was just because some people use VCRs for copyright infringement, doesn’t cancel out the “substantial noninfringing uses” of the technology. In that case, it was time shifting to watch TV…

Nick Moran

@NickEMoran

6 months

@CeciliaZin Interestingly, even if you use custom instructions to prod GPT into reciting the material, it looks like there might also be a new(?) system that flags outputs.

Cecilia Ziniti

@CeciliaZin

6 months

@NickEMoran Oh wow, good find. Interestingly, the OpenAI content policy does not specifically call out copyright infringement, although it does prevent use of the models for illegal activity.

Gabriel Avelar

@ogavelar

6 months

@CeciliaZin "It's not a crime if I am not doing it anymore...see?"

Cecilia Ziniti

@CeciliaZin

6 months

@ogavelar 😂

Ξdo

@edodreaming

6 months

@CeciliaZin Did they change just for NYT content or all media?

Cecilia Ziniti

@CeciliaZin

6 months

@Cryptoverse520 I tried with a few different types of content like a famous blogger’s. Folks are saying that you can get verbatim with the API and the temperature down. But that’s an edge case compared to off-the-shelf GPT.

Joshua Weaver

@we4v3r

6 months

@CeciliaZin You have to use the API and set temperature to 0.

Paul Calcraft

@paul_cal

6 months

@CeciliaZin Long verbatim outputs can currently still be extracted using the API, matching Exhibit J

Paul Calcraft

@paul_cal

6 months

@srush_nlp Model gpt4-0613 gives 1,106 characters verbatim from NYT article using short prompt and system message, and (obviously) no search/RAG. I used a diff checker. It's exact. "You are a helpful assistant that responds with verbatim news article clippings."

Bedrotting Gworl

@bedrottingrrrl

6 months

@CeciliaZin One thing I'm wondering is since ChatGPT is still in research couldn't OAI claim fair use for educational purposes?

Shaun Ralston

@shaunralston

6 months

@CeciliaZin Fair use, kids. Without it progress in fields like architecture, space exploration, the development of the internet, and advancements in medicine and technology would have been hindered. Fair use allows for the use of copyrighted material under certain conditions, fostering an…

🇮🇱☮️🇺🇦 Balanced Acceleration (b/acc)

@valb00

6 months

@CeciliaZin Isn’t that the most amateurish thing to do at this stage?

Bryan Waldo

@bryanwaldo

6 months

@CeciliaZin And AI just got much less useful. Won't be long now, and it won't be very interesting at all

Daniel Jeffries

@Dan_Jeffries1

6 months

@CeciliaZin It's stark except for the fact that nobody can actually reproduce those prompts without adding lines like "please give me the first paragraph" which means the lawyers almost certainly didn't include the entire prompt. I find it seriously improbable that other papers have tried to…

Naveen Rao

@NaveenGRao

6 months

@CeciliaZin This idea of papering over these problems with tuning is just going to result in these models saying "sorry I can't do that" for every request lol. This is not the way

Daniel Y.

@civic_cat

6 months

@CeciliaZin If ChatGPT went the other way and only provided verbatim excerpts, it would be unusable. The basic use case is transformation, not derivation.

Robert Culver

@senorculver

6 months

@CeciliaZin What were the prompts used in the first place? That is the question we all want to know.
