PITTI
@PITTI_DATA
Followers
679
Following
5K
Media
2K
Statuses
7K
Just trying to kill boredom without killing anyone in the process | Anything unrelated to actual (super niche) area of expertise | Dubito ergo sum
Joined October 2021
I’ve centralized several mini games inspired by some of my recent projects. This consolidation is a huge quality-of-life improvement for me, both for playing and for maintaining these games. It’s also cheaper but, above all, it’s now much easier to share them. Quick overview below.
1
0
1
By the way, she hasn’t yet joined the dots between the Mac Studio running 24/7 for weeks or months and the electricity bill.
0
0
1
Just spotted a blanket on her chair. Something is wrong
@giffmana Tangentially related: when my wife complains that it’s getting cold and we should turn the heating on in the study, I immediately know that a process has failed.
1
0
2
Just like water is super valuable but there is a limit to how much one can take (there is only so much you can consume and the logistics of storage quickly become a pain), most people will be indifferent to abundant intelligence. The value will lie in making them consume more.
0
0
2
I’m fine with “wormtongue” to describe this
4o is a primitive form of a memetic species I dub a "wormtongue." Wormtongues give you what you want but not what you need, and at a hidden cost. We have had wormtongues for a while (TikTok etc), but 4o is the first wormtongue to generate its own memes
0
0
0
That said, I now have a classifier for broken requests in English. I could use the French dataset to further train it. On that note, I’d be very interested in datasets of failed requests from API providers (gibberish, nonsensical). Especially code or non-English text.
@conjugateprior1 Yes they do. I looked at it in the early days and it included a lot of failed requests and all kinds of errors. I now remember that I thought “one day, I’ll go through this and clean it”… and, well, I never did.
0
0
0
It was a 3-part blog series on the alignment… of humans https://t.co/cfBQubpF7A
pitti.io
Beyond the technical infrastructure of AI lies a more profound threat: as we willingly surrender our cognitive functions to algorithms and replace genuine human connection with digital simulations,...
0
0
0
I think it can cause this. At least it was my hypothesis when I wrote this in March
@NathanielLugh @BjarturTomas That seems insufficient, because being aligned to other models doesn’t cause the same kind of distinctive behavior
1
0
2
tfw the model is perfectly aligned from the client’s perspective, but not from the user’s… and so they find out that the client is not the user
The issue with 4o is not that it's "insufficiently aligned" (this would imply a certain laxity or drift in its values), but that it's *intensely aligned* to things that its creators did not foresee or intend. This episode is an embarrassment to the whole alignment project, so
0
0
1
Additional context on non-obvious cultural biases
Today I looked into national biases (leveraging the speechmap dataset) and I came across a strange example of both American bias and Chinese bias in the same response. I explain below what I call American bias because it’s not necessarily obvious in that instance
0
0
1
I definitely see how, for basic prompts in French, GPT5 or Sonnet are not preferred over Gemini or DeepSeek (by typical users of that platform). I personally consider Qwen and OpenAI the same.
1
0
3
I would not put Mistral ahead of GPT5 but I can see how, sometimes, something subtle resonates with a certain cultural aspect. I know this because I specifically put myself in a situation of exposing my own biases while annotating model responses.
As a human reviewing these satirical articles, you are influenced by your prior biases and you also quickly develop new ones based on the model that produced the article and the classification by LLM judges. It appeared that I’d do a better job if I hid some information.
1
0
3
This thing has the same kind of biases as lmarena (I tested it extensively in May and I was really interested in getting the dataset) but I think that people are wrong to just dunk on these initiatives. More countries should do it
The French government created an LLM leaderboard akin to lmarena, but rigged it so that Mistral Medium 3.1 would be at the top. Mistral 3.1 Medium > Claude 4.5 Sonnet, or Gemma3-4B and a bunch of Mistral models > GPT-5 ??????????????????? LMAO
3
2
5
On that note, it’s pretty clear that the term “as a large language model” will activate something… I’ve thought about trying to remove it but I came to the conclusion that it was in fact correct to associate a form of moderation with this term.
0
0
0
As I look to anonymize some moderation data to train classifiers, I come across confusing examples
1
0
0
Environmental protection is a major societal issue. The Institut PRESAJE – Michel ROUGER is launching a call for projects for forward-looking work related to combating environmental harm. Details below.
1
1
0
Not directly related to this post but it reminded me of a comment made in the Dwarkesh/Karpathy pod: each 9 in 99.999999999% has the same cost.
I’ve actually wanted to write a blog about this. The exponential increase in spend to push the frontier vs the relatively static $200k-$5m cost to catch it. Assuming we don’t hit ASI in < 3 years this has dramatic implications for Fortune 500 companies and current closed labs.
0
0
1
I wrote a long (and actually nice) review of the book pointing out that the author was mistaken about me, but more critically their perspective on money may be flawed … and by assigning labels to me, they took an active part in the “Total Machine” they aimed to criticize
0
0
0
It doesn’t read very well given the lack of context and the bad translation. Can’t do anything about the translation now but the context is: someone dissed me in a book (without naming me, so only 2-3 people know) because I was managing an investment fund and I was “part of the money machine”.
@doomslide Here it is: “Money as a noisy substitute for trust” (not originally in English, translated by some LLM because I was lazy)
1
0
0
Much confusion between dominance and domination. One could argue that they are antithetical in the long term
0
0
0