CAML Lab

@CAML_Lab

Followers
100
Following
54
Media
1
Statuses
26

Cambridge Applied Machine Learning Lab @Cambridge_Uni, led by PI @SamuelAlbanie

Cambridge
Joined January 2023
@CAML_Lab
CAML Lab
6 months
RT @JRobertsAI: We need you, eagle-eyed folks of X! Help us red team ZeroBench to find errors. To recognise effort, we will offer co-autho…
0
7
0
@CAML_Lab
CAML Lab
6 months
📣📣 Challenging new visual benchmark from our lab!
@JRobertsAI
Jonathan Roberts
6 months
Is computer vision “solved”? Not yet. Current models score 0% on ZeroBench. 🧵1/6
0
0
1
@CAML_Lab
CAML Lab
10 months
🪡📢 New paper from our group! We explore the ability of frontier LLMs to follow threads of information through long context windows. "Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?" Project page:
@JRobertsAI
Jonathan Roberts
10 months
🎺New paper! "Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?" 🧵(1/5)
0
0
1
@CAML_Lab
CAML Lab
1 year
📢📢 Check out this new paper from our group! A Practitioner's Guide to Continual Multimodal Pretraining:
arxiv.org
Multimodal foundation models serve numerous applications at the intersection of vision and language. Still, despite being pretrained on extensive data, they become outdated over time. To keep...
@vishaal_urao
Vishaal Udandarao
1 year
🚀New Paper: "A Practitioner's Guide to Continual Multimodal Pretraining"! 🌐Foundation models like CLIP need constant updates to stay relevant. How to do this in the real world? Answer: Continual Pretraining!! We studied how to effectively do this.🧵👇
0
0
2
@CAML_Lab
CAML Lab
1 year
🚨New paper from our group introducing GRAB! GRAB is a challenging GRaph Analysis Benchmark for LMMs. Project page: Paper:
arxiv.org
Large multimodal models (LMMs) have exhibited proficiencies across many visual tasks. Although numerous well-known benchmarks exist to evaluate model performance, they increasingly have...
@JRobertsAI
Jonathan Roberts
1 year
🎉📢New Paper! Introducing GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models. The highest-performing model scores just 21.7%. A thread 🧵
0
0
4
@CAML_Lab
CAML Lab
1 year
🚨New work from our group introducing SciFIBench! We evaluate the scientific figure interpretation capabilities of 30 LMM, VLM and human baselines, including GPT-4o! Paper: Data: Repo:
github.com
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation - jonathan-roberts1/SciFIBench
@JRobertsAI
Jonathan Roberts
1 year
Introducing SciFIBench, a scientific figure interpretation benchmark for LMMs!
- We evaluate 30 LMM, VLM and human baselines.
- GPT-4o is much better than GPT-4V.
- The mean human narrowly outperforms GPT-4o & Gemini-Pro 1.5.
(1/5)
0
1
5
@CAML_Lab
CAML Lab
1 year
RT @sinha_shiven: Excited to announce our preprint! We develop a symbolic system for IMO Geometry that can rival Silver Medalists. Combine…
0
68
0
@CAML_Lab
CAML Lab
1 year
🚨🚨🚨Fresh-off-the-press work out of our group! This work questions how meaningful the term "Zero-Shot" really is in the context of multimodal models; turns out it's really more "exponential-shot"!!🤔 Check out the thread below for more details🧵👇
@vishaal_urao
Vishaal Udandarao
1 year
🚀New Preprint Alert! 📊Exploring the notion of "Zero-Shot" Generalization in Foundation Models. Is it all just a myth? Our latest preprint dives deep. Check it out!🔍
0
0
4
@CAML_Lab
CAML Lab
1 year
RT @ChombaBupe: It turns out the data bottleneck problem is more dire than initially thought: AI model performance - which can be largely…
0
283
0
@CAML_Lab
CAML Lab
1 year
RT @abursuc: Fun paper by teams of @bethgelab & @SamuelAlbanie diving into the so-thought "magic" zero-shot generalization properties of CL…
0
6
0
@CAML_Lab
CAML Lab
1 year
RT @bethgelab: 🚨Exciting new findings from our lab! This work challenges the concept of "Zero-Shot Generalization" in multimodal models.…
0
13
0
@CAML_Lab
CAML Lab
1 year
RT @arankomatsuzaki: No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance. repo:…
0
52
0
@CAML_Lab
CAML Lab
1 year
RT @_akhaliq: No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance. Web-crawled p…
0
67
0
@CAML_Lab
CAML Lab
1 year
Fresh new work from our group introducing Lifelong Benchmarks! Testing models on static benchmarks leads to severe "benchmark overfitting", inhibiting true generalisation. This paper tackles this problem and proposes a simple method to solve it. Check out more details here👇
@vishaal_urao
Vishaal Udandarao
1 year
🚀New Paper Alert! Ever faced the challenge of ML models overfitting to benchmarks? Have computational difficulties evaluating models on large benchmarks? We introduce Lifelong Benchmarks, a dynamic approach for model evaluation with 1000x efficiency!
0
1
3
@CAML_Lab
CAML Lab
1 year
RT @AmyPrb: Arxiv papers is a super cool initiative! Has long and short GPT-generated videos for recent, impactful papers (.
0
9
0
@CAML_Lab
CAML Lab
2 years
RT @bethgelab: Exciting news!🥳🎉 4 papers (1 oral, 3 posters) from our lab were accepted to #ICLR24! Brief 🧵 below:
0
14
0
@CAML_Lab
CAML Lab
2 years
RT @vishaal_urao: Excited to share that our work on Visual Data-Type Identification got accepted to #ICLR24🚀 Vienna incoming 🚆
0
5
0
@CAML_Lab
CAML Lab
2 years
Catch the latest research from our lab introducing the problem of "Visual Data-Type Identification" for Vision-Language Models like CLIP & OpenFlamingo! Interesting insights into failure modes and how they emerge from properties of the pre-training distribution. Detailed 🧵 below:
0
0
1