Steven Beeckman @stevenbeeckman X Profile

Steven Beeckman

@stevenbeeckman

Followers

3K

Following

50K

Media

814

Statuses

11K

Joined February 2009

Defensieadjudant - Adjudant de La Défense 🇧🇪

@Def_Adjudant1

15 days

Today, 🇵🇱 celebrates its national holiday. Many 🇧🇪 have forgotten the sacrifice Poland made for our liberation during World War II. Thank you for your sacrifices.

SHAPE - NATO Allied Command Operations

@SHAPE_NATO

15 days

🇵🇱 Today, we celebrate Poland’s National Independence Day and recognise the country’s commitment to freedom and security.

1

7

SkalskiP

@skalskip92

28 days

3 years since I joined roboflow - 68k stars on github - 60 videos and streams on youtube - 2.5M views in total - 40 technical blogposts ↓ coolest stuff I made

45

151

2K

Thenewarea51

@thenewarea51

1 month

The RNLAF (Royal Netherlands Air Force) has been training in Texas since 1996 and has had a long relationship with the US Army hence why they did the flyover today. Flyovers of US military aircraft have been suspended since the US Government shutdown. 🇳🇱🤝🇺🇸 🎥 @TheNolanK

Cococat

@thatcococat

1 month

Was able to see the RNLAF AH-64s and CH-47s that conducted the COTA F1 flyover today up close before and during departure. They kicked up a lot of grass on the way out, certainly a unique experience.

40

185

3K

Hamel Husain

@HamelHusain

3 months

8 yr old video from Andrew Ng about error analysis which applies equally well (if not better) to debugging AI products Ng: "I usually do this in a spreadsheet, but using an ordinary text file would be ok" 70% of evals is looking and counting https://t.co/tS7mYl7Wms

7

13

149

Rohan Paul

@rohanpaul_ai

3 months

The paper shows a small model trained with reinforcement learning can outperform prompt only agents on machine learning engineering. Most agents just prompt large models and search longer, but they do not learn from experience. This work instead trains a 3B Qwen model with

12

127

725

Steven Beeckman

@stevenbeeckman

3 months

Believe the hype (one year later).

jane zhang

@jjanezhang

3 months

It's been about a year since my team fully adopted all the AI coding tools (Cursor, Claude Code). And day to day I am feeling the added cruft in the code base. Unit tests are not catching regressions. Unneeded mocking and comments are left in between. More refactoring is needed

0

Rohan Paul

@rohanpaul_ai

3 months

This Github has a very wide collection of High-quality datasets, tools, and concepts for LLM fine-tuning. All the datasets listed here should be under permissive licensing (Apache 2.0, MIT, cc-by-4.0, etc.). Categorized into segments like Math & Logic, Code, Conversation &

18

201

1K

Shreya Shankar

@sh_reya

3 months

people always ask me, why build custom interfaces for evaluating LLM traces? human evaluation is expensive. custom interfaces make human evaluation 10x-100x cheaper. thanks, Alex, for sharing your example!

Alex Strick van Linschoten

@strickvl

3 months

Built a lightweight trace viewer to speed up LLM evals—heavily inspired by lessons from @sh_reya and @HamelHusain's evals course. Kept it simple: FastAPI + vanilla HTML/JS. Features: failure banner, execution-flow timeline (LLM ↔ tools), keyboard shortcuts, and an annotation

5

14

110

Shreya Shankar

@sh_reya

3 months

One of the most pressing questions in our AI Evals course is: "Why can’t I just have an LLM write my LLM pipeline?" The nuanced answer is that you can use LLMs to assist, but not for the whole pipeline. Knowing where to put the LLM in the loop is the hard part. To unpack this,

8

30

248

Rohan Paul

@rohanpaul_ai

3 months

"The Impact of Artificial Intelligence on Human Thought" A big 132-page report. AI is shifting real thinking work onto external systems, which boosts convenience but can weaken the effort that builds understanding and judgment, a pattern the paper frames through cognitive

15

84

368

Rohan Paul

@rohanpaul_ai

3 months

This is the original MIT report that said 95% of AI pilots fail and which spooked investors across the US stock market. The report says most companies are stuck, because 95% of GenAI pilots produce zero ROI, while a small 5% win by using systems that learn, plug into real

90

534

4K

François Chollet

@fchollet

3 months

Important point from Deep Learning with Python...

13

91

767

François Chollet

@fchollet

3 months

We were able to reproduce the strong findings of the HRM paper on ARC-AGI-1. Further, we ran a series of ablation experiments to get to the bottom of what's behind it. Key findings: 1. The HRM model architecture itself (the centerpiece of the paper) is not an important factor.

46

302

3K

Justine Moore

@venturetwins

4 months

This may be the coolest emergent capability I've seen in a video model. Veo 3 can take a series of text instructions added to an image frame, understand them, and execute in sequence. Prompt was "immediately delete instructions in white on the first frame and execute in order"

112

287

4K

Google Labs

@GoogleLabs

4 months

We just discovered the 🔥 COOLEST 🔥 trick in Flow that we have to share: Instead of wordsmithing the perfect prompt, you can just... draw it. Take the image of your scene, doodle what you'd like on it (through any editing app), and then briefly describe what needs to happen

122

418

3K

Chip Huyen

@chipro

4 months

Very useful tips on tool use and memory from Manus's context engineering blog post. Key takeaways. 1. Reversible compact summary Most models allow 128K context, which can easily fill up after a few turns when working with data like PDFs or web pages. When the context gets

4

52

696

SkalskiP

@skalskip92

4 months

supervision-0.26.0 is out we finally released support for ViTPose and ViTPose++ pose estimation models from @huggingface transformers link: https://t.co/xXMRaS3Guk

24

153

991

Paweł Huryn

@PawelHuryn

4 months

RAG is the most critical part of context management in AI. But doing it right is tough. I created a free, interactive simulator that visualizes different variants: 🧵

11

117

634

Admiral Tanguy Botman

@COM_BelgianNavy

4 months

"We must be strong so that no country can imagine attacking Europe and coming out victorious," says the head of the army - RTBF Actus https://t.co/fv2HWFaetO

rtbf.be

'NATO has woken up on its eastern flank,' says General Vansina. And that has been the case since the invasion of Crimea...

0

4

15

Chip Huyen

@chipro

4 months

I open sourced Sniffly, a tool that analyzes Claude Code logs to help me understand my usage patterns and errors. Key learnings. 1. The biggest type of errors Claude Code made is Content Not Found (20 - 30%). It tries to find files or functions that don't exist. So I

48

129

1K