
Shawn Lewis
@shawnup
Followers: 2K
Following: 1K
Media: 61
Statuses: 494
Founder & CTO @weights_biases. Building tools for AI.
Joined March 2011
@iruletheworldmo You said: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
6
1
113
@iruletheworldmo For readers: just over 2,000 people sat in a Twitter space for 1 hour, with @iruletheworldmo promising to speak, including many well-respected folks in the space. 🍓 did not speak. Conclusion: do not waste your time.
1
1
78
@iruletheworldmo @ChatGPTapp here's your receipt: "attention isn't all you need new architecture announcement august 13th @ 10am pt the singularity begins".
1
0
67
I’m incredibly proud of everything our team at @weights_biases has accomplished, and excited to keep building with the amazing folks from @CoreWeave!
Today we announced that we are being acquired by @CoreWeave, the AI Hyperscaler. 🪄🐝 We could not be prouder or more excited to join forces with this team. Our CEO, @l2k, wrote a blog post with more details:
6
4
71
@iruletheworldmo "attention isn't all you need new architecture announcement august 13th @ 10am pt the singularity begins" Tap this sign.
0
0
52
@iruletheworldmo @elonmusk uh-huh "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
1
0
56
@iruletheworldmo set this straight: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
0
0
36
@zied_houidi Nice to see this systematically confirmed. I wrote about this issue here: Did you try any prompting techniques to fix it?
2
0
22
We're hosting the next version of @borisdayma's DALL-E mini, now called "Craiyon". Come give it a try!
0
7
19
Fixing this is the key to making reasoning agents work. I think better prompting should go a long way.
1/12 We just found something unsettling: Today's most advanced AI models - including the latest powerhouse reasoning models - can't keep track of what actually happened. Even in a simple conversation. Our ICLR'25 paper reveals why this matters 🧵.
1
0
19
We're on a roll adding new Weave auto-logging integrations. DSPy has a really interesting model for programming with LLMs. Tracing it with Weave (just call weave.init()!) will help you build an intuitive sense for how it works.
📣 I am happy to announce that @weights_biases Weave is now integrated with DSPy. 🧶 Weave will automatically capture traces for DSPy. To start tracking, call `weave.init()` and use the library as normal. 👉 Learn more at
0
3
18
This is the improvement Claude Code needed to be great. Just keep going! I don’t care how you do it or what the context looks like. Looking forward to trying it.
Last up: auto-compact for context management. Claude Code now automatically compacts conversation history when you approach context limits, and it does a better job preserving important info while reducing token usage.
1
0
19
@jmdagdelen Great tips! We’ve built tools to help with most of these at Weights & Biases. We’d love your thoughts if you ever take a look.
2
0
19
@iruletheworldmo You also literally said this: "attention isn't all you need new architecture announcement august 13th @ 10am pt the singularity begins".
0
1
16
Weave auto-logging for the amazing @llama_index is live! Just add these two lines to your Python LlamaIndex programs:

```
import weave
weave.init('llamaindex-project')
```

and get instant tracing, debugging, and evaluations.
1
6
17
@sergeykarayev The corpus of written human language is a giant pre-labeled dataset that encompasses all of humanity’s knowledge to date.
0
1
15
@iruletheworldmo "attention isn't all you need new architecture announcement august 13th @ 10am pt the singularity begins" promise does not hold.
0
0
14
The most important principle when building applications with Generative AI: Make sure you log everything to a central system. Weave makes this a no-brainer. It took @vanpelt 20 minutes to integrate Weave into his recently launched OpenUI project:
If you think OpenAI is cool, you’re gonna love my latest side project OpenUI. Tired of writing HTML by hand and remembering tailwind classes? Let OpenUI do it for you:
1
1
13
@OpenAI's new gpt-4-turbo-2024-04-09 is out and initial eval reports look very good! I've been poking it with HumanEval, which is a standard coding benchmark, and our new Weave Evaluation toolkit. Today the new model looks slightly worse on HumanEval than some prior models. Do
1
6
12
@iruletheworldmo in 3 months: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
0
0
12
(sound on) The vibes are good with @AnthropicAI's Claude Code! Simple web app that listens to the microphone to visualize music. Built in like 10 minutes. I may have said things like "yo claude wassup" and "make it doper" in the prompts.
1
0
11
Head-to-head competition on novel problems is the future of LLM evals. A very cool start here!
Introducing Eris v0.1: LLM evaluation framework using debate simulations. Developed with OpenRouter and W&B Weave, Eris assesses models on reasoning, knowledge, and communication through structured debates. See how it performs and future plans. Read more:
0
1
10
@iruletheworldmo and this: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
0
0
8
@iruletheworldmo @flowersslop @tszzl @flyerthenag6 seriously though. tuesday. huge. "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
0
1
9
Thanks for chatting! For the intrepid: buried in here is a description of how the crosscheck algorithm works.
🆕 short pod - how @shawnup got SOTA SWE-Bench Verified with @openai o1 by building his own tools (on @weights_biases Weave) to look at data!
1
1
9
@Justin_Halford_ o1's baseline score is 48.9%, published by OpenAI (it was o1-preview at 41%). I could be wrong, but I don't think sonnet3.5 would beat o1 in my framework. The o1 solution seems more general because it actually does reason through what to do, and relies a lot less on its built-in.
2
0
9
@peterrhague This is trolling and harassment though:
To be clear, this is where I draw the line. This is abhorrent and illegal and no one should ever have to deal with this.
1
0
8
@iruletheworldmo you promise: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
2
0
8
At @weights_biases, we try to build tools that fit seamlessly into your workflow, with minimal abstractions. Weave Tracking follows this principle. Simply decorate Python functions with `@weave.op()` to get automatic code and data versioning, and tracing to a central system.
1
0
8
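To make the decorator idea in the post above concrete, here is a minimal sketch of what a tracing decorator does in general. It is not Weave's implementation: `traced_op`, `TRACE_LOG`, and `add_exclamation` are hypothetical names invented for illustration, and the "central system" is just an in-memory list. With Weave itself, per the posts in this feed, you would call `weave.init()` and decorate with `@weave.op()` instead.

```python
import functools
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a central log; an assumption for this sketch only.
TRACE_LOG: List[Dict[str, Any]] = []


def traced_op(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical tracing decorator: records each call's inputs and output."""

    @functools.wraps(fn)  # keep the wrapped function's name and docstring
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = fn(*args, **kwargs)
        # Append one record per call so every invocation can be inspected later.
        TRACE_LOG.append(
            {"op": fn.__name__, "args": args, "kwargs": kwargs, "output": result}
        )
        return result

    return wrapper


@traced_op
def add_exclamation(text: str) -> str:
    """Toy function standing in for an LLM call or pipeline step."""
    return text + "!"


add_exclamation("hello")
```

The function body stays unchanged and only the decorator line is added, which is the "minimal abstractions" point the post makes.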
@iruletheworldmo troll: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
0
0
8
@iruletheworldmo you're silly: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins".
0
0
7
@mariofilhoml I’ll probably give it a shot when I can. But all models are different. Any new one will require some serious experimentation and tuning.
1
0
7
I use Weave for hours every day. Each time the team adds a feature, I accelerate!
Recent updates to @weights_biases for LLM app development:
- Custom Usage and Cost Tracking, alongside automatically tracking most LLMs
- Chat view
- Evaluation dashboard
- Image support
- Programmatic exports
- Logging performance
This is alongside all of the improvements for
0
0
7