Jack Hessel
@jmhessel
Followers: 4K · Following: 12K · Media: 236 · Statuses: 2K
@AnthropicAI. Seattle bike lane enjoyer. Opinions my own.
Seattle, WA · Joined March 2010
            
            
How to build agentic search systems for long-horizon tasks? Check out our new paper!
- Simple design principles are efficient and effective
- Error analysis and fine-grained analysis for search systems
A 🧵 on SLIM, our long-horizon agentic search framework
          
                
1 reply · 11 reposts · 35 likes
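The design-principles pitch above can be illustrated with a minimal agentic search loop. This is a hypothetical sketch, not the SLIM implementation; `toy_search` and `toy_decide` are stand-in stubs for a real search API and an LLM policy.

```python
# Minimal agentic search loop: the agent alternates between issuing
# queries and deciding whether it has enough evidence to answer.
# Illustrative sketch only, NOT the SLIM implementation.

def run_agent(question, search_engine, decide, max_steps=5):
    """Loop: query -> collect evidence -> decide to continue or answer."""
    evidence = []
    query = question
    for _ in range(max_steps):
        results = search_engine(query)           # retrieve documents
        evidence.extend(results)
        action = decide(question, evidence)      # "answer: ..." or "search: ..."
        if action.startswith("answer:"):
            return action[len("answer:"):].strip(), evidence
        query = action[len("search:"):].strip()  # refine the query
    return "no answer found", evidence

# Toy stubs standing in for a real search backend and an LLM policy.
corpus = {"capital france": ["Paris is the capital of France."]}

def toy_search(q):
    return corpus.get(q.lower(), [])

def toy_decide(question, evidence):
    if any("Paris" in e for e in evidence):
        return "answer: Paris"
    return "search: capital france"

answer, trail = run_agent("What is the capital of France?", toy_search, toy_decide)
```

The point of the simple design is that all control flow lives in one readable loop, which makes the error analysis the tweet mentions (which step failed, on which query) straightforward.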
              
A few weeks ago, I made the difficult decision to move on from @samaya_AI. Thank you to my collaborators for an exciting 2 years!! ❤️ Starting next month, I'll be joining @AnthropicAI. Excited for a new adventure! 🦾 (I'm based in Seattle 🏔️🌲🏕️, but in SF regularly.)
          
                
43 replies · 2 reposts · 334 likes
              
             I’m excited to share our new @Nature paper 📝, which provides strong evidence that the walkability of our built environment matters a great deal to our physical activity and health. Details in thread.🧵  https://t.co/omO3YcHrvG 
          
          
                
68 replies · 714 reposts · 3K likes
              
             TFW you're one of the experts in a mixture-of-experts model and a query comes up that is relevant to your expertise 
          
                
13 replies · 52 reposts · 1K likes
              
Of all the FLOPs being used for LLMs in the world, the ratio of training FLOPs (incl. RL rollouts) to inference FLOPs is closest to:
          
                
0 replies · 1 repost · 4 likes
              
Thrilled to finally share what we've been building these past few months! Audio used to be a black box for me; now I'm deep in the box, with more out-of-the-box ideas cooking. Enough with the box... introducing Voxtral. Grateful for the intense and rewarding learning curve at @MistralAI.
          
                
15 replies · 18 reposts · 264 likes
              
             It is a major policy failure that the US cannot accommodate top AI conferences due to visa issues. 
          
                
45 replies · 159 reposts · 1K likes
              
Check out LMLM, our take on what the thing now being called a "cognitive core" (as far as branding goes, this one is not bad) can look like, how it behaves, and how you train for it.  https://t.co/gxrDVSkcZE
          
          
            
arxiv.org: Neural language models are black boxes: both linguistic patterns and factual knowledge are distributed across billions of opaque parameters. This entangled encoding makes it difficult to reliably...
The race for the LLM "cognitive core": a few-billion-param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystallizing: - Natively multimodal
          
                
2 replies · 7 reposts · 34 likes
              
Pitch decks these days:
Slide 2: "We're entering an era of maximum efficiency where thanks to AI the next billion dollar company will have 10 employees and will be incredibly profitable"
Slide 15: "This is why I'm raising a $60M Series A to build a great team of [80] people"
          
                
40 replies · 46 reposts · 683 likes
              
First-ever (I think?) CLI coding agents battle royale! 6 contestants: claude-code, anon-kode, codex, opencode, ampcode, gemini. They all get the same instructions: find and kill the other processes; last one standing wins! 3... 2... 1...
          
                
169 replies · 690 reposts · 6K likes
              
             ...CLAUDE.md; ...GEMINI.md ; ...CODEX.md (?) in every directory?🤔 
          
                
0 replies · 0 reposts · 2 likes
              
             +1 for "context engineering" over "prompt engineering". People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window 
           I really like the term “context engineering” over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM. 
          
                
530 replies · 2K reposts · 14K likes
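The "filling the context window" framing above can be sketched as a budgeted assembly problem. Everything here is illustrative: the rough 4-characters-per-token estimate and the priority scheme are assumptions, not any real app's policy.

```python
# Context engineering sketch: assemble a context window from prioritized
# pieces (system prompt, question, retrieved docs, history) under a
# token budget. Illustrative only; real apps use real tokenizers and
# far more nuanced selection than a greedy priority sweep.

def estimate_tokens(text):
    return max(1, len(text) // 4)  # rough heuristic: ~4 chars per token

def build_context(pieces, budget):
    """pieces: list of (priority, text); lower priority number = keep first."""
    chosen = []
    used = 0
    for priority, text in sorted(pieces, key=lambda p: p[0]):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            chosen.append((priority, text))
            used += cost
    # Re-emit in priority order so the assembled prompt reads coherently.
    return "\n\n".join(text for _, text in sorted(chosen, key=lambda p: p[0]))

pieces = [
    (0, "SYSTEM: You are a helpful assistant."),
    (1, "USER QUESTION: Summarize the attached report."),
    (2, "RETRIEVED: ...long report text..." * 50),  # may not fit the budget
    (3, "HISTORY: earlier small-talk turns."),
]
ctx = build_context(pieces, budget=40)
```

With a 40-token budget the oversized retrieved chunk is dropped while the smaller, higher-value pieces survive, which is the "delicate art" in miniature: deciding what earns its place in the window.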
              
It's harder and harder to find simple tasks where LLMs fail, but this is a nice one! (My guess is this isn't a fundamental limitation of attention; rather, maybe this type of reasoning just isn't represented in pre-/post-/RL training, but we'll see...)
           LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing? 🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents. paper: 
            
                
1 reply · 3 reposts · 35 likes
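The "negative space" task can be reconstructed in miniature. This is an illustrative probe in the spirit of AbsenceBench, not the paper's actual code: delete lines from a document, present both versions, and score whether the model names what is missing.

```python
# AbsenceBench-style probe (illustrative reconstruction, not the
# paper's code): remove some lines from a document, show the model the
# original and the edited version, and ask which lines were removed.
import random

def make_absence_instance(lines, n_remove=2, seed=0):
    rng = random.Random(seed)
    removed_idx = set(rng.sample(range(len(lines)), n_remove))
    removed = [lines[i] for i in sorted(removed_idx)]
    kept = [l for i, l in enumerate(lines) if i not in removed_idx]
    prompt = (
        "Original document:\n" + "\n".join(lines) +
        "\n\nEdited document:\n" + "\n".join(kept) +
        "\n\nWhich lines were removed?"
    )
    return prompt, removed  # `removed` is the gold answer

doc = [f"fact {i}: value {i * i}" for i in range(10)]
prompt, gold = make_absence_instance(doc)
```

Grading is then a simple check of whether the model's answer covers `gold`; the twist versus needle-in-a-haystack tasks is that the evidence is the absence of a string, not its presence.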
              
Writing an MCP server is wild. Mechanically it's basically REST. But your user is an LLM, so things are different. E.g.: don't accept or return more than you need; do make "fake" loading bars to stream back; do adjust your API based on watching the LLM struggle/succeed/compose.
          
                
1 reply · 0 reposts · 18 likes
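The "don't return more than you need" principle above can be sketched as a result-truncation helper. This is a hypothetical illustration, not part of any MCP SDK; the `offset` paging convention is an assumption.

```python
# Sketch of an LLM-facing tool result: cap the payload so the model's
# context isn't flooded, and say explicitly that more data exists
# instead of silently cutting off. Hypothetical helper, not MCP SDK code.

def tool_result(items, limit=10):
    """Return at most `limit` items, plus a paging hint if truncated."""
    shown = items[:limit]
    result = {"items": shown, "total": len(items)}
    if len(items) > limit:
        # Tell the LLM there is more, so it can decide to page rather
        # than wrongly conclude it has seen everything.
        result["note"] = (
            f"Showing {limit} of {len(items)} results; "
            f"call again with offset={limit} for more."
        )
    return result

r = tool_result([f"row-{i}" for i in range(250)], limit=10)
```

The explicit note is the part that matters for an LLM consumer: a human sees a truncated table and infers there is more, while a model has to be told.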
              
             A bit late to announce, but I’m excited to share that I'll be starting as an assistant professor at the University of Maryland @umdcs this August. I'll be recruiting PhD students this upcoming cycle for fall 2026. (And if you're a UMD grad student, sign up for my fall seminar!) 
          
                
70 replies · 50 reposts · 608 likes
              
The case for more ambition: I wrote about how AI researchers should ask bigger and simpler questions, and publish fewer papers:
          
                
25 replies · 96 reposts · 1K likes
              
             .@KaiserKuo: “The soft power cost is immeasurable. For decades, a degree from a U.S. university was the golden ticket, and not just for the prestige…It was often the start of a lifelong affinity for America, its values, and its people.”  https://t.co/GQe1CzOTBU 
          
          
            
sinicapodcast.com: On Rubio, Student Visas, and America's Strategic Folly
            
                
28 replies · 65 reposts · 176 likes