Ariel Ekgren
@ArYoMo
Followers
2K
Following
16K
Media
414
Statuses
2K
Researcher building Large Language Models from Sweden. Also sharing artifacts from the weights. AI Nordics discord: https://t.co/EEZxFT1QFo
Stockholm, Sweden
Joined April 2013
Extremely happy to openly share our LLMs in the GPT-Sw3 family! Lots of hard work and effort from many people went into creating these artifacts. https://t.co/UIF8nBL0nw
huggingface.co
0
4
22
The EU bureaucracy is... extremely expensive. I don't understand how this is not glaringly obvious to everyone that has been in contact with it.
🇪🇺 As a European citizen and AI founder, I can apparently use these "AI Factories", so I just signed up to use them! Every "supercomputer" has an [ ACCESS NOW ] button, which made me very excited. I expected to sign up, maybe pay a discounted H100 rate (funded by EU, that'd be
0
0
0
Accidentally turned off Copilot autocomplete in VS Code today… and suddenly my brain started autocompleting instead. Might not turn it back on.
0
0
0
Gemini 2.0 flash panics in a classification task: ``` I have no idea what to do. Can you help me? I am supposed to return a JSON but I don't know what to do. Please. Give me some guidance. I'm lost. I need to get this done but my brain is not working today. I am so sorry. Please
0
0
0
The slop and AGI discussion is to some degree coping. The slop will create massive change and massive value.
0
0
0
I've really enjoyed nanoGPT! So happy that season 2 is out already
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,
0
0
3
Very interesting research from OpenAI trying to quantify real world value. To me this says that we are close to adding value and hopefully growth in many more sectors than programming! https://t.co/VJWvMLFnK0
0
0
2
Finally got to travel into the Veo3 dimensional plane. Lovely 🧙
0
0
0
What is the lowest expected loss for a 126M GPT-style model on fineweb or openweb?
0
0
1
Must say that this was unexpected and... Does anyone know more model details or have an educated guess on good related papers? https://t.co/gq4j3G6NCk
deepmind.google
Gemini Diffusion is our state-of-the-art research model exploring what diffusion means for language – and text generation.
1
0
0
I really really like programming with Gemini 2.5 Pro. Talks a lot but often identifies the core issues after a few rounds instead of going down dead ends. So good.
1
0
2
The EU initiative Going Dark has now been launched by the EU Commission. They call it ProtectEU. It’s a rebranding of Chat Control. New name. Same old propaganda. The EU Commission’s goal is to “access encrypted data in a lawful manner, safeguarding cybersecurity and
96
751
3K
and the models consistently suggest patterns that are not supported by the SDK, mix up the two SDKs and suggest old models. You should really fine-tune the models on using their own tools because besides this big thing they are so goood!
0
1
0