
CAML Lab
@CAML_Lab
Followers: 100 · Following: 54 · Media: 1 · Statuses: 26
Cambridge Applied Machine Learning Lab @Cambridge_Uni led by PI @SamuelAlbanie
Cambridge
Joined January 2023
RT @JRobertsAI: We need you, eagle-eyed folks of X! Help us red team ZeroBench to find errors. To recognise effort, we will offer co-autho…
🪡📢 New paper from our group! We explore the ability of frontier LLMs to follow threads of information through long context windows. "Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?". Project page:
🎺New paper! "Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?". 🧵(1/5)
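For readers wondering what "following a thread" means here: a minimal sketch of the kind of synthetic task the title suggests, where a chain of linked key-value pairs is scattered through a long haystack and the model must chase the chain from a starting key. The key format, chain construction, and prompt wording are illustrative assumptions, not the paper's exact setup.

```python
import random
import uuid

def build_threaded_haystack(n_pairs=10_000, thread_len=5, seed=0):
    """Build a haystack of random key-value pairs containing one hidden
    'thread': a chain in which each value is the key of the next pair."""
    rng = random.Random(seed)
    keys = [uuid.uuid4().hex for _ in range(n_pairs)]
    values = [uuid.uuid4().hex for _ in range(n_pairs)]
    # Pick thread_len distinct pairs and link them: value of step i -> key of step i+1.
    thread = rng.sample(range(n_pairs), thread_len)
    for a, b in zip(thread, thread[1:]):
        values[a] = keys[b]
    pairs = list(zip(keys, values))
    rng.shuffle(pairs)  # scatter the thread randomly through the context
    haystack = "\n".join(f"{k}: {v}" for k, v in pairs)
    return haystack, keys[thread[0]], values[thread[-1]]

haystack, start_key, gold_answer = build_threaded_haystack()
prompt = (f"{haystack}\n\nStarting from key {start_key}, follow the chain of "
          f"key-value pairs (each value is the next key) and give the final value.")
# Score the model by comparing its reply against gold_answer.
```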
📢📢 Check out this new paper from our group! A Practitioner's Guide to Continual Multimodal Pretraining:
arxiv.org
Multimodal foundation models serve numerous applications at the intersection of vision and language. Still, despite being pretrained on extensive data, they become outdated over time. To keep...
🚀New Paper: "A Practitioner's Guide to Continual Multimodal Pretraining"! 🌐Foundation models like CLIP need constant updates to stay relevant. How to do this in the real world? Answer: continual pretraining! We studied how to do this effectively. 🧵👇
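To make "continual pretraining" concrete: a minimal sketch, assuming a PyTorch-style setup in which new data arrives in chunks and a fraction of old data is replayed to limit forgetting. The model interface, loss, and replay ratio are hypothetical placeholders, not the recipe the paper recommends.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Subset

def continual_pretrain(model, optimizer, data_chunks, replay_buffer, replay_frac=0.25):
    """Sequentially update a pretrained model on incoming data chunks,
    mixing in a fraction of previously seen data (replay)."""
    for chunk in data_chunks:  # e.g. each chunk = newly crawled image-text pairs
        n_replay = min(int(replay_frac * len(chunk)), len(replay_buffer))
        old = Subset(replay_buffer, range(n_replay))
        loader = DataLoader(ConcatDataset([chunk, old]), batch_size=256, shuffle=True)
        for images, texts in loader:  # assumes datasets yield (image, text) pairs
            loss = model.contrastive_loss(images, texts)  # hypothetical CLIP-style loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        replay_buffer = ConcatDataset([replay_buffer, chunk])  # grow the replay pool
    return model
```

A loop like this exposes the kind of trade-off the paper investigates: how much old data to replay, and at what learning rate, to balance adapting to new data against forgetting old capabilities.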
🚨New paper from our group introducing GRAB! GRAB is a challenging GRaph Analysis Benchmark for LMMs. Project page: Paper:
arxiv.org
Large multimodal models (LMMs) have exhibited proficiencies across many visual tasks. Although numerous well-known benchmarks exist to evaluate model performance, they increasingly have...
🎉📢New Paper! Introducing GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models. The highest-performing model scores just 21.7%. A thread 🧵
🚨New work from our group introducing SciFIBench! We evaluate the scientific figure interpretation capabilities of 30 LMM, VLM and human baselines, including GPT-4o! Paper: Data: Repo:
github.com
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation - jonathan-roberts1/SciFIBench
Introducing SciFIBench, a scientific figure interpretation benchmark for LMMs!
- We evaluate 30 LMM, VLM and human baselines.
- GPT-4o is much better than GPT-4V.
- The mean human narrowly outperforms GPT-4o & Gemini-Pro 1.5.
(1/5)
RT @sinha_shiven: Excited to announce our preprint! We develop a symbolic system for IMO Geometry that can rival Silver Medalists. Combine…
🚨🚨🚨Fresh-off-the-press work out of our group! This work questions how meaningful the term "Zero-Shot" really is in the context of multimodal models; it turns out it's really more "exponential-shot"!🤔 Check out the thread below for more details🧵👇
🚀New Preprint Alert! 📊Exploring the notion of "Zero-Shot" Generalization in Foundation Models. Is it all just a myth? Our latest preprint dives deep. Check it out!🔍
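"Exponential-shot" refers to the headline finding (see the retweets below): zero-shot performance on a concept tends to scale roughly log-linearly with how often that concept appeared in pretraining, so each fixed gain in accuracy costs exponentially more data. A toy illustration of fitting that relationship; the numbers are made up, not the paper's results.

```python
import numpy as np

# Hypothetical per-concept data: pretraining frequency vs zero-shot accuracy.
freq = np.array([1e2, 1e3, 1e4, 1e5, 1e6])      # occurrences of a concept in pretraining
acc = np.array([0.22, 0.31, 0.40, 0.52, 0.61])  # made-up zero-shot accuracies

# Fit accuracy ~ slope * log10(freq) + intercept. A log-linear trend means
# each constant accuracy gain requires ~10x more pretraining examples.
slope, intercept = np.polyfit(np.log10(freq), acc, 1)
print(f"accuracy gain per decade of data: {slope:.3f}")
print(f"data multiplier for +0.1 accuracy: ~{10 ** (0.1 / slope):.0f}x")
```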
RT @ChombaBupe: It turns out the data bottleneck problem is more dire than initially thought: AI model performance - which can be largely…
RT @abursuc: Fun paper by teams of @bethgelab & @SamuelAlbanie diving into the so-thought "magic" zero-shot generalization properties of CL…
RT @bethgelab: 🚨Exciting new findings from our lab! This work challenges the concept of "Zero-Shot Generalization" in multimodal models…
RT @arankomatsuzaki: No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance. repo:…
Fresh new work from our group introducing Lifelong Benchmarks! Testing models on static benchmarks leads to severe "benchmark overfitting", inhibiting true generalisation. This paper tackles this problem and proposes a simple method to solve it. Check out more details here👇
🚀New Paper Alert! Ever faced the challenge of ML models overfitting to benchmarks? Struggled with the computational cost of evaluating models on large benchmarks? We introduce Lifelong Benchmarks, a dynamic approach to model evaluation with 1000x efficiency!
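Where might a 1000x efficiency gain come from? One plausible mechanism, sketched below as an assumption rather than the paper's exact algorithm: rank benchmark samples by difficulty using results from previously evaluated models, then binary-search for the new model's ability threshold instead of running it on every sample.

```python
import numpy as np

def estimate_accuracy(past_results, new_model_fn, max_probes=16):
    """past_results: (n_models, n_samples) 0/1 matrix from earlier models.
    new_model_fn(i) -> bool: evaluate the new model on sample i (expensive).
    Returns an accuracy estimate from ~log2(n_samples) evaluations."""
    # Order samples from easiest to hardest by historical accuracy.
    order = np.argsort(-past_results.mean(axis=0))
    lo, hi = 0, len(order)
    # Binary search for the point where the model starts failing, assuming
    # it mostly solves easier samples and mostly fails harder ones.
    while lo < hi and max_probes > 0:
        mid = (lo + hi) // 2
        if new_model_fn(order[mid]):
            lo = mid + 1  # solved: the threshold lies among harder samples
        else:
            hi = mid      # failed: the threshold lies among easier samples
        max_probes -= 1
    return lo / len(order)  # estimated fraction of the benchmark solved
```

For a pool of 100,000 samples this costs on the order of 17 model calls rather than 100,000, which is where multiple-orders-of-magnitude savings can come from.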
RT @bethgelab: @thwiedemer @jackhb98 @kotekjedi_ml @JuhosAttila @wielandbr @MatthiasBethge @prasannamayil @evgenia_rusak
arxiv.org
Recent advances in the development of vision-language models (VLMs) are yielding remarkable success in recognizing visual semantic content, including impressive instances of compositional image...
RT @bethgelab: Exciting news!🥳🎉 4 papers (1 oral, 3 posters) from our lab were accepted to #ICLR24! Brief 🧵 below:
RT @vishaal_urao: Excited to share that our work on Visual Data-Type Identification got accepted to #ICLR24🚀. Vienna incoming 🚆.
RT @vishaal_urao: Way too late on the #NeurIPS tweet train, but here goes. @MatthiasBethge and I will give a talk at the Scaling, Alignme…
sites.google.com
This is the 6th workshop in our Scaling workshop series that started in Oct 2021. The objective of this workshop series, organized by the CERC in Autonomous AI Lab led by Irina Rish at the Univers...