Vlado Boza Profile
Vlado Boza

@bozavlado

Followers
975
Following
190
Media
191
Statuses
3,821

second of his name. Destroyer of ML hype.

Bratislava
Joined February 2012
@bozavlado
Vlado Boza
7 days
Kolmogorov-Arnold Network is just an ordinary MLP. Here is the Colab, which explains: The main point is, that if we consider KAN interaction as a piece-wise linear function, it can be rewritten like this: 1/n
Tweet media one
21
210
1K
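The attached image and the Colab are not reproduced in this feed, so what follows is only a minimal sketch of the general idea, not necessarily the exact formula from the thread: any piecewise-linear function on a grid of knots can be written as a linear term plus a weighted sum of shifted ReLUs. The names `knots`, `values`, `piecewise_linear` and `relu_form`, and the sample numbers, are assumptions made for illustration.

```python
from bisect import bisect_right


def relu(z):
    return z if z > 0.0 else 0.0


def segment_slopes(knots, values):
    """Slope of each linear segment between consecutive knots."""
    return [
        (values[i + 1] - values[i]) / (knots[i + 1] - knots[i])
        for i in range(len(knots) - 1)
    ]


def piecewise_linear(x, knots, values):
    """Reference: interpolate through (knots[i], values[i]); the first and
    last segments are extended linearly outside the grid."""
    i = min(max(bisect_right(knots, x) - 1, 0), len(knots) - 2)
    slope = segment_slopes(knots, values)[i]
    return values[i] + slope * (x - knots[i])


def relu_form(x, knots, values):
    """Same function as a linear term plus shifted ReLUs.
    Each interior knot contributes (change in slope) * relu(x - knot)."""
    slopes = segment_slopes(knots, values)
    out = values[0] + slopes[0] * (x - knots[0])
    for k in range(1, len(slopes)):
        out += (slopes[k] - slopes[k - 1]) * relu(x - knots[k])
    return out


knots = [-1.0, 0.0, 0.5, 2.0]    # hypothetical grid
values = [1.0, -1.0, 0.3, 0.2]   # hypothetical function values at the knots
points = [-2.0, -1.0, -0.3, 0.0, 0.25, 0.5, 1.7, 2.0, 3.0]
max_gap = max(
    abs(piecewise_linear(x, knots, values) - relu_form(x, knots, values))
    for x in points
)
print(max_gap)
```

The two forms agree up to floating-point rounding, which is the sense in which a learned piecewise-linear activation can be traded for ReLU units and a linear map.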
@bozavlado
Vlado Boza
5 days
If you want to compare your great method to a baseline method M, you need to: a) Optimize baseline as hard as you can b) If somebody used M in the exact same setting, use their best setup and compare it to that. Otherwise, you will look like an idiot. MLP can easily fit this
Tweet media one
26
33
452
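Point (a) above can be sketched as a small hyperparameter sweep. This is a toy illustration under loud assumptions: the "baseline" is a one-parameter linear fit trained by plain gradient descent, and `train_baseline`, the data and the learning-rate grid are hypothetical, not taken from any of the experiments discussed here.

```python
def train_baseline(lr, steps=50):
    """Fit y = w * x with plain gradient descent and return the final MSE."""
    xs = [i / 10 for i in range(11)]
    ys = [3.0 * x for x in xs]   # toy data, the true weight is 3
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


learning_rates = [0.001, 0.01, 0.1, 1.0]   # hypothetical search grid
results = {lr: train_baseline(lr) for lr in learning_rates}
best_lr = min(results, key=results.get)

for lr in learning_rates:
    print(f"lr={lr}: final MSE={results[lr]:.6f}")
print("best learning rate:", best_lr)
```

With the same model and the same step budget, the smallest learning rate leaves the baseline far from converged, so reporting only that run would understate what the baseline can do.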
@bozavlado
Vlado Boza
2 years
@TaylorLagace So, you are just lying to them. (You pretend to like their content and that content is all over chat). Keep it up!
0
0
199
@bozavlado
Vlado Boza
8 days
@milos_ai As I thought, your MLP baseline is weak. You did not even read the warning about MLP optimization nonconvergence. If you slightly tune the MLP optimizer, MLP will be better than KAN:
Tweet media one
5
11
130
@bozavlado
Vlado Boza
5 days
@predict_addict Have you ever tried tuning the baseline??? Just increasing learning rate of MLP will get you better results than KAN!
Tweet media one
5
4
120
@bozavlado
Vlado Boza
5 days
@predict_addict The graph just compares KAN to an undertrained and unnecessarily big MLP. If you train a decent MLP properly, the MLP part will look like this:
Tweet media one
2
4
91
@bozavlado
Vlado Boza
1 year
@iROZHLAScz The headline should have been: "Having trouble parking? Serves you right, you bought a needlessly big car."
0
0
78
@bozavlado
Vlado Boza
7 days
If we rearrange steps from multiple layers, we can have Linear+Repeat+Shift+ReLU instead of Repeat+shift+ReLU+Linear, which is basically MLP. KAN is just MLP. End.
4
5
79
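One way to read the regrouping step, as a sketch under assumptions rather than the thread's actual code: "Linear + Repeat + Shift + ReLU" is an ordinary MLP layer whose weight matrix has each row repeated once per grid point and whose bias holds the negated grid values. The helper names and the sample `W`, `grid` and `h` are hypothetical.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def linear_then_repeat_shift_relu(h, W, grid):
    """Linear, then repeat each output once per grid point, shift, ReLU."""
    y = matvec(W, h)
    return [relu(yj - g) for yj in y for g in grid]


def as_mlp_layer(W, grid):
    """Fold repeat + shift into an ordinary weight matrix and bias."""
    W_rep = [list(row) for row in W for _ in grid]   # each row repeated len(grid) times
    bias = [-g for _ in W for g in grid]             # the shift becomes a bias
    return W_rep, bias


def mlp_layer(h, W_rep, bias):
    """Standard MLP layer: ReLU(W_rep @ h + bias)."""
    return [relu(z + b) for z, b in zip(matvec(W_rep, h), bias)]


W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]   # hypothetical 3 -> 2 linear map
grid = [-1.0, 0.0, 1.0]
h = [0.2, -0.4, 0.9]

W_rep, bias = as_mlp_layer(W, grid)
a = linear_then_repeat_shift_relu(h, W, grid)
b = mlp_layer(h, W_rep, bias)
print(max(abs(u - v) for u, v in zip(a, b)))
```

The resulting layer is a constrained MLP layer (tied rows, fixed biases), which is why the regrouped network can be described in MLP terms.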
@bozavlado
Vlado Boza
3 months
@Alzbietka This is all I have left to say to that kind of talk
Tweet media one
3
0
76
@bozavlado
Vlado Boza
2 years
@MadiHilly What about transuranics which are radioactive for 10000s of years?
15
3
58
@bozavlado
Vlado Boza
3 months
Erik Kaliňák reminds me of those little tough-guy teenagers who picked fights with older kids because they knew that if things went badly, their whole big family would come to protect them.
5
0
62
@bozavlado
Vlado Boza
25 days
@MCynik Huliak, Kuffa and Simkovicova make a very good smokescreen and divert attention. That always comes in handy.
3
0
44
@bozavlado
Vlado Boza
1 month
This is a really nice thing: no more LR scheduler (you still need to tune the learning rate). Tested just now on some BERT-like fine-tuning; it matches the final result of cosine decay, but also gives strong results at every point during optimization.
@aaron_defazio
Aaron Defazio
1 month
Schedule-Free Learning We have now open sourced the algorithm behind my series of mysterious plots. Each plot was either Schedule-free SGD or Adam, no other tricks!
Tweet media one
38
215
1K
1
4
45
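For context on the baseline mentioned above, here is a minimal sketch of a cosine decay schedule (the standard formula, not the Schedule-Free algorithm itself). The function name, the peak learning rate and the step counts are assumptions for illustration. Note that the schedule needs `total_steps` up front, which is the dependency a schedule-free method is meant to remove.

```python
import math


def cosine_decay_lr(step, total_steps, peak_lr, min_lr=0.0):
    """Learning rate at `step` under cosine decay from peak_lr to min_lr."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


total = 1000   # hypothetical training length
for step in (0, 250, 500, 750, 1000):
    print(step, cosine_decay_lr(step, total, peak_lr=3e-5))
```

Because the rate only reaches its minimum at the final step, intermediate checkpoints under cosine decay are usually not the best the model could be at that point, which is the contrast drawn in the tweet.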
@bozavlado
Vlado Boza
2 years
@GergelyOrosz This is true for any Google product. The top example is TensorFlow. Half of the features are experimental and the other half are deprecated.
0
0
42
@bozavlado
Vlado Boza
8 months
@daniel_prazak When I sleep in my shoes, I have a headache in the morning.
0
0
39
@bozavlado
Vlado Boza
4 months
@ylecun IgNobel prizes are for funny research outputs, not for stupid research.
1
0
40
@bozavlado
Vlado Boza
7 days
Now, we can do this for the whole layer. An important observation is that if the grid is the same in every activation, then intermediate ReLU output can be shared. 2/n
Tweet media one
2
2
43
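The sharing observation can be sketched as follows. This is an illustrative reconstruction under assumptions, since the original image is not available here: every edge from input i to output j applies its own weighted sum of `relu(x[i] - grid[k])`, and because the grid is the same on all edges, those ReLU values can be computed once and reused. The layout `coefs[j][i][k]` and the sizes are hypothetical.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def layer_per_edge(x, grid, coefs):
    """coefs[j][i][k]: weight of relu(x[i] - grid[k]) on the edge i -> j.
    Every edge recomputes its own ReLUs."""
    return [
        sum(
            sum(c * relu(x[i] - g) for c, g in zip(coefs[j][i], grid))
            for i in range(len(x))
        )
        for j in range(len(coefs))
    ]


def layer_shared(x, grid, coefs):
    """Compute the shifted ReLUs once, then apply a single linear map."""
    hidden = [relu(xi - g) for xi in x for g in grid]              # repeat + shift + ReLU
    weight = [[c for edge in row for c in edge] for row in coefs]  # flatten to a matrix
    return [sum(w * h for w, h in zip(w_row, hidden)) for w_row in weight]


rng = random.Random(0)
n_in, n_out = 3, 2
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
coefs = [[[rng.uniform(-1, 1) for _ in grid] for _ in range(n_in)] for _ in range(n_out)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]

out_edge = layer_per_edge(x, grid, coefs)
out_shared = layer_shared(x, grid, coefs)
print(out_edge)
print(out_shared)
```

The shared version computes `n_in * len(grid)` ReLUs instead of `n_in * n_out * len(grid)`, and what remains is a plain matrix multiplication.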
@bozavlado
Vlado Boza
7 days
@FreeFooooooood There is no interpretability when you have more than 2 hidden layers.
3
1
40
@bozavlado
Vlado Boza
7 days
This translates to the following simple Pytorch: 3/n
Tweet media one
1
4
44
@bozavlado
Vlado Boza
1 year
@MeitPastorek "Environment minister: anyone from PS". Definitely not. We need a sensible person who understands that nuclear plants are a necessity, not an evil.
1
0
35
@bozavlado
Vlado Boza
4 years
Nanopore basecalling progress in the last year (on our test set):
Guppy before 3.4.4: 92.5%
Guppy 3.4.4: 93.4%
Bonito 0.1: 94.5%
Bonito 0.2: 95.0%
Guppy 4: 94.6%
Bonito 0.3: 95.9%
That is almost half the error rate in one year.
1
7
35
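The "almost half" claim follows from the first and last accuracy figures in the tweet: the error rate goes from 7.5% to 4.1%, a relative reduction of about 45%. A short check, with variable names chosen only for illustration:

```python
accuracy_start = 92.5   # Guppy before 3.4.4, from the tweet
accuracy_end = 95.9     # Bonito 0.3, from the tweet

error_start = 100.0 - accuracy_start
error_end = 100.0 - accuracy_end
relative_reduction = 1.0 - error_end / error_start

print(f"error rate: {error_start:.1f}% -> {error_end:.1f}%")
print(f"relative reduction: {relative_reduction:.1%}")
```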
@bozavlado
Vlado Boza
10 months
@CzechBar Bonus: Be glad that the child in question is not mine. (And, of course, carry on with what I am doing.)
1
0
32
@bozavlado
Vlado Boza
2 years
@GothamChess @cryptocom And now, you are getting unfollow. Fuck crypto!
0
0
33
@bozavlado
Vlado Boza
7 days
@productive_mayt I have strong doubts that they trained the MLP as well as they could.
1
0
35
@bozavlado
Vlado Boza
5 months
@EvaMartosova I put on a psychopath's stare and keep walking straight ahead. 90% of people jump out of the way.
2
0
30
@bozavlado
Vlado Boza
3 years
@ConnorHaggarty What is this garbage?
1
0
31
@bozavlado
Vlado Boza
1 year
Tweet media one
1
0
28
@bozavlado
Vlado Boza
2 years
@ylecun Maybe content moderation in English. Look at other minor languages (like Slovak) and you will see, that FB is full of hate and there are almost zero substantial discussions...
2
0
28
@bozavlado
Vlado Boza
7 days
@maxzimmerberlin The question is how straightforward the reduction (reformulation) is. Sometimes you have to do crazy things, but here it is really simple.
3
1
28
@bozavlado
Vlado Boza
2 years
@tunguz Graduate student descent.
0
0
24
@bozavlado
Vlado Boza
8 months
@MarekGalinski Right now it looks like drivers don't know how to zipper-merge, and a bunch of dickheads pile into the intersection when they can't get through it.
1
0
22
@bozavlado
Vlado Boza
8 days
@milos_ai All you need is a lucky seed :). This is the code:
import sklearn.neural_network
regr = sklearn.neural_network.MLPRegressor(random_state=117, max_iter=5000, solver="lbfgs", hidden_layer_sizes=(25,), activation="relu").fit(X_train.reshape(-1, 1), y_train - 2)
mlp_preds = regr.predict(X_test.reshape(-1, 1)) + 2
Tweet media one
2
3
22
@bozavlado
Vlado Boza
1 year
@dvassallo @storcube @VicVijayakumar After you use that rainwater for watering, it goes into streams, lakes, water tables, etc. So there is no logic 🤷
1
1
21
@bozavlado
Vlado Boza
1 year
@kareem_carr This one is fun. It started the right way and then put total gibberish into the last sentence.
Tweet media one
1
0
20
@bozavlado
Vlado Boza
1 year
@adamznasik What kind of nonsense is "communist democratic ideology"? Something is either communist or democratic. The two don't go together.
2
0
20
@bozavlado
Vlado Boza
5 days
xLSTM looks very interesting. Sadly, their evaluation ends at 2.7B parameters. The question is what happens at 7B scale and beyond. Also, not being fully parallel is a bummer. But I still like it :)
@HochreiterSepp
Sepp Hochreiter
5 days
I am so excited that xLSTM is out. LSTM is close to my heart - for more than 30 years now. With xLSTM we close the gap to existing state-of-the-art LLMs. With NXAI we have started to build our own European LLMs. I am very proud of my team.
44
373
2K
1
0
20
@bozavlado
Vlado Boza
4 months
@MeitPastorek I think we have ~60000 primary and secondary school teachers. This could raise their pay by 1100 euros in super-gross terms. A government of pricks.
2
0
20
@bozavlado
Vlado Boza
1 month
@talafatka11 @papisBJ So Pelle is going to sue because someone hacked his website, which slanders Korcok. Do I understand that correctly?
0
0
19
@bozavlado
Vlado Boza
3 years
@smdiehl @ahcastor Via the STATE!
1
0
17
@bozavlado
Vlado Boza
2 years
@GergelyOrosz That color palette is terrible.
1
0
17
@bozavlado
Vlado Boza
8 days
@CLakaCoco @milos_ai If I understand KAN correctly, a fair comparison would be an MLP with 25 hidden nodes (to have the same number of params). And if you actually use tanh activation with 25 units, you will get this:
Tweet media one
3
4
18
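For the parameter-matching argument, a small helper that counts the weights and biases of a fully connected network may be useful. It covers only the MLP side; the KAN parameter count is not stated in the tweet and is not reconstructed here. The function name is an assumption, and the 1-input, 1-output shape is inferred from the `reshape(-1, 1)` calls in the related tweet.

```python
def mlp_param_count(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# 1 input, 25 hidden units, 1 output
print(mlp_param_count([1, 25, 1]))
```

For a 1-25-1 network this gives 25 + 25 + 25 + 1 = 76 parameters.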
@bozavlado
Vlado Boza
5 days
@TroofChatty A simplified one from the original post (the only trick is centering the input)
1
0
18
@bozavlado
Vlado Boza
6 months
@MeitPastorek This honestly deserves a demonstration in front of the school.
2
0
17
@bozavlado
Vlado Boza
2 years
@JFPuget Ctrl+R
1
0
17
@bozavlado
Vlado Boza
5 days
@teortaxesTex My main message is: If you want to prove that B is better than A, then you should use the best version of A possible. Same as claiming that NNs are better than XGBoost for tabular data, which was all the rage 2 years ago
@tunguz
Bojan Tunguz
2 years
The main takeaway: for all three datasets used in the paper, the reported performance of XGBoost was widely inaccurate and the real performance was much better than their best results. 3/6
1
1
40
2
3
27
@bozavlado
Vlado Boza
1 year
@OwainEvans_UK These metrics seriously need a "lower/higher is better" note (or at least the best result in bold). They are not an obvious thing that everybody knows, like AUC or F1.
0
0
16
@bozavlado
Vlado Boza
1 year
@juraj_nevolnik It is allowed, but we are also allowed to gripe about it. It is ugly, it is big, it destroys the planet (CO2) and it also kills small children (since they can't be seen).
@joskokolobezka
joskokolobezka
2 years
@PetaHos @janmolacek And which ones would you most like to encounter on your street?
Tweet media one
1
1
58
0
0
16
@bozavlado
Vlado Boza
2 years
@mhmck Yep secrecy:
@mhmck
Michael MacKay
2 years
Ukrainian defenders liberated the settlements of Mala Rohan and Vilkhivka from the Russian fascist invaders. These are suburban communities east of the city of Kharkiv. Head of the Kharkiv garrison, Serhiy Melnyk, reported enemy spies and the company commander were eliminated.
Tweet media one
8
133
751
0
1
16
@bozavlado
Vlado Boza
2 years
@Varunufi @OpenAI It just generates flat trajectory. Two tries on same day do not mean anything. And you should know better.
0
0
14
@bozavlado
Vlado Boza
5 months
@gl_sk The insatiable political Christian is called Taraba. And much like Tatrofka sings: Taraba is a d***head.
0
0
15
@bozavlado
Vlado Boza
8 days
PSA: Kolmogorov-Arnold Networks are bullshit.
3
3
15
@bozavlado
Vlado Boza
5 months
@JimPethokoukis Ehm. For CapSet they just found a better solution for some problem size (which is great but overhyped success). For BinPacking they just generated some heuristic, which is better than simple baselines and does not beat anything serious
@JFPuget
JFPuget 🇺🇦
5 months
Saying it outperforms established approaches is wrong. What it outperforms is a basic heuristic. This in itself is interesting, no need to make unfounded broader claim.
2
1
39
0
1
13
@bozavlado
Vlado Boza
1 year
@MarekGalinski @veselovskyma @ZuzanaHanzelova And by the way, this is exactly what explains why we have about 20 centre-right parties in Slovakia and about 2 in Czechia.
0
0
15
@bozavlado
Vlado Boza
8 months
@MeitPastorek It is exactly because of dickheads like this that I teach my family to look carefully at whether an approaching car is braking or not.
1
0
15
@bozavlado
Vlado Boza
8 months
@swfong @mateosfo You realize that EV car shares 99.99% of problems of the diesel car?
0
0
14
@bozavlado
Vlado Boza
1 year
@woj_zaremba Sounds like physics is not your strong suit...
0
0
13
@bozavlado
Vlado Boza
4 years
How to do deep learning research:
a) Try to reproduce some result
b) Make a bug in the reproduction
c) Fix the bug
d) Observe that performance with the bug is better and write a paper about it
1
1
14
@bozavlado
Vlado Boza
1 year
@BrianosaurRex Maybe we could ban advertising for betting and casinos (just like advertising for alcohol and cigarettes).
1
0
13
@bozavlado
Vlado Boza
9 months
@MeitPastorek "Even though people complain, I am convinced that we are better off than at any time in history." This is what should be carved in stone instead. For the whining Slovak nation, even ten times in a row.
1
0
14
@bozavlado
Vlado Boza
1 year
@PeterLawrey @tagir_valeev That's some fancy unreadable bullshit, where one needs to spend an unnecessary 5 seconds to understand the intent. (Or look into the tests, if there are any.)
1
0
13
@bozavlado
Vlado Boza
2 years
@GergelyOrosz We had to write a subset of C compiler. We negotiated that function pointers could be omitted. That was a mistake, there are tons of worse traps in C. Best 10 days of continuous swearing ever...
1
0
13
@bozavlado
Vlado Boza
2 years
Crypto people: Day 1: "Fuck state" Day 2: "State help, they are talking about us and we do not like it"
@molly0xFFF
Molly White
2 years
lol
Tweet media one
513
2K
31K
0
0
11
@bozavlado
Vlado Boza
7 months
@MeitPastorek Many people voted for SaS despite Sulik, because of its other high-quality people. I ask how high-quality those people are when they cannot replace one jerk.
3
0
13
@bozavlado
Vlado Boza
1 year
@rasbt There is also a funny interplay between normalization layers and weight decay. Normalization lets the weights grow bigger and bigger, which effectively slows down the optimization. Weight decay fixes that.
2
0
13
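The interplay can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not a real network: it uses a two-dimensional weight and a loss that depends only on the direction `w / |w|`, which mimics a weight followed by a normalization layer. For such a loss the gradient is orthogonal to `w`, so each plain SGD step can only increase `|w|`, while the gradient itself shrinks like `1 / |w|`; the effective step in direction space therefore keeps shrinking unless weight decay pulls the norm back. All names and constants are hypothetical.

```python
import math
import random


def scale_invariant_grad(w, t):
    """Gradient of f(w) = -(w / |w|) . t, a loss that ignores the scale of w."""
    norm = math.hypot(w[0], w[1])
    u = (w[0] / norm, w[1] / norm)
    dot = u[0] * t[0] + u[1] * t[1]
    # remove the component of -t along w, then divide by |w|
    return (-(t[0] - dot * u[0]) / norm, -(t[1] - dot * u[1]) / norm)


def final_norm(weight_decay, lr=0.1, steps=1000, seed=0):
    """Run SGD with optional L2 weight decay and return the final |w|."""
    rng = random.Random(seed)
    w = [1.0, 0.0]
    for _ in range(steps):
        angle = rng.uniform(0.0, 2.0 * math.pi)   # noisy target direction
        t = (math.cos(angle), math.sin(angle))
        g = scale_invariant_grad(w, t)
        w = [wi - lr * (gi + weight_decay * wi) for wi, gi in zip(w, g)]
    return math.hypot(w[0], w[1])


norm_plain = final_norm(weight_decay=0.0)
norm_decay = final_norm(weight_decay=0.1)
print("final |w| without weight decay:", norm_plain)
print("final |w| with weight decay:   ", norm_decay)
```

Without weight decay the norm drifts upward from its initial value of 1; with weight decay it settles at a smaller value, so the direction of `w` keeps moving at a steadier rate.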
@bozavlado
Vlado Boza
8 months
@linylinx @nirsd If somebody expects you to remember the exact order of arguments, then you should just walk out of the interview and not bother.
0
0
12
@bozavlado
Vlado Boza
1 year
@MeitPastorek Whoever can endure working in a corporation can endure sardines too.
2
0
13
@bozavlado
Vlado Boza
1 year
@fchollet Remember that JavaScript, PHP and Java are among the most popular languages. It tells nothing about their quality.
0
0
13
@bozavlado
Vlado Boza
1 year
@Fish_CTO Since regex matching with backreferences is NP-complete, you should solve the Travelling Salesman Problem with it. Let's go, we are waiting...
0
1
12
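As a small illustration of what backreferences add (this is not the NP-completeness reduction itself), the pattern below matches exactly the strings that consist of some non-empty block written twice. That language is not regular, so no backreference-free regular expression can recognize it. The helper name `is_square` is an assumption for this sketch.

```python
import re

# (.+) captures a non-empty block and \1 demands the same text again
SQUARE = re.compile(r"(.+)\1")


def is_square(s):
    """True if s is some non-empty string written twice in a row."""
    return SQUARE.fullmatch(s) is not None


for s in ["abcabc", "abcab", "aa", "a"]:
    print(s, is_square(s))
```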
@bozavlado
Vlado Boza
1 month
@PeterTkacenko I honk "at the pedestrian" when I see in the rear-view mirror that an idiot in a GLE coupe is barreling down the adjacent lane and not braking.
1
0
13
@bozavlado
Vlado Boza
1 year
@onelemonandits1 It is a terrible shame that in Slovakia patriotism has been appropriated by fools and Nazis. Basically, hanging a Slovak flag at home means that a person is a psychopath, while in Switzerland it is a common thing.
2
0
12
@bozavlado
Vlado Boza
1 year
@MarekGalinski After September we may even be emigrating to Ukraine.
0
0
12
@bozavlado
Vlado Boza
3 years
@ylecun @BloombergME I beg to differ. Polarization in Slovakia is at an all-time high and FB is the main platform for spreading hoaxes and misinformation. And I have never seen a correct decision from content moderators.
1
0
12
@bozavlado
Vlado Boza
2 years
I hate mandatory test coverage targets in PRs. Correct solution is to display the coverage (of whole code and the changes) and let reviewer decide.
@GergelyOrosz
Gergely Orosz
2 years
"What is your take on a 100% unit test coverage target - including UI code?" Ask what you want to achieve. Unit tests have tons of benefits that go beyond the test - especially when testing business logic, especially in complex apps. Blindly chasing a target is a bad idea.
18
10
157
1
0
12
@bozavlado
Vlado Boza
2 months
@HanesMiki A debate in which Matovič came across as the sane one. A terrible freak show.
0
0
12
@bozavlado
Vlado Boza
7 days
@plaksiva9tr9pka Try removing layers with Llama-3. Everything before was undertrained.
1
0
14
@bozavlado
Vlado Boza
2 years
@GergelyOrosz As a manager, you need to pretend that you are doing some work. And changing processes is easiest thing for a manager to do. Actually digging in and figuring out what is wrong requires serious work and that's off-putting ;)
0
1
12
@bozavlado
Vlado Boza
8 months
@juraj_nevolnik Am I really the only one who likes going to the office (you can't get much work done at home with small kids), but who also wants the option of working from home at any time without any approvals and nonsense around it?
2
0
11
@bozavlado
Vlado Boza
4 years
New preprint. We fit our custom nanopore basecaller on an Edge TPU device, which you can plug into anything with a USB3 port. It gives real-time performance and has a nice accuracy edge over fast Guppy. 1/n
1
3
11
@bozavlado
Vlado Boza
1 year
@MarekGalinski And why did they drag this mummy into the newspapers?
1
0
10
@bozavlado
Vlado Boza
5 months
@adamznasik2 Maybe it would not be a bad idea to start cutting some of the bureaucracy, so that officials have less work (very often pointless work).
0
0
11
@bozavlado
Vlado Boza
3 months
@jsuchal Really, only now? After all those experiences with state IT and the corporations leeching off it?
1
0
11
@bozavlado
Vlado Boza
3 years
@koush @siska_pe Same goes for Tesla selling cars for BTC. They could sell 5.0 V12 cars and their environmental footprint would be the same.
1
0
10
@bozavlado
Vlado Boza
2 months
@BajzathJakub Of treason?
1
0
11
@bozavlado
Vlado Boza
3 years
@GothamChess Women's chess is usually much more interesting to watch.
0
0
11
@bozavlado
Vlado Boza
8 months
@juraj_nevolnik And how come the Austrians and the Swiss have such a good rail and road network? I don't think private companies built it there...
4
0
11
@bozavlado
Vlado Boza
2 months
@MarekGalinski @kyslakapusta The choice itself is not the problem. The problem is the logic (or rather the absence of it) behind it.
0
0
11
@bozavlado
Vlado Boza
2 years
@rasbt Synthetic data generation?
1
2
11
@bozavlado
Vlado Boza
8 months
@MeitPastorek That is about as much wishful thinking as hoping that Drucker and 3 others will turn up who won't go along with Fifo. But since the KDH structures are itching for posts, it probably doesn't matter.
0
0
10
@bozavlado
Vlado Boza
1 year
@giffmana Looks like they failed to cite Schmidhuber on multiple occasions.
1
0
10
@bozavlado
Vlado Boza
2 years
@Abebab No racism. Just travel to Czechia, Scotland, Spain and listen to accent differences when speaking English. It's same as with countries in Africa, Asia, ... Also bias in Alexa is bad and you should stop pretending to have moral high ground.
1
1
10
@bozavlado
Vlado Boza
1 year
@LastGenCZ @ilblog @faktaoklimatu Except that when we include grid-stability costs in the famous LCOE, solar and wind suddenly look really bad. Not to mention that the cost of nuclear is high partly because of Greenpeace and similar confused activists.
@energybants
Mark Nelson
1 year
These new LCOE figures imply that the cost of the solar and wind farm themselves may, in the future, fall less than the firming costs rise. The firming cost depends on the amount of wind and solar already on the nearby grid, and the marginal cost of, say, natural gas generation.
Tweet media one
13
33
277
0
0
9
@bozavlado
Vlado Boza
9 months
@MarekGalinski FFS. Dealing with important things like the northern tangent road, new tram lines and similarly hard things? Not a chance. But plenty of random cosmetic changes, sure...
1
0
10
@bozavlado
Vlado Boza
1 year
@MattRobare @bwboi1 @holz_bau Try walking with a kid more than 500 m to a nearest park...
0
0
8
@bozavlado
Vlado Boza
3 months
@michal_pavlasek WTF? Reducing the penalties for corruption at least had some arguments, even if bad ones. This is simply breaking through rock bottom.
1
0
10
@bozavlado
Vlado Boza
4 months
Introducing: Fast and Optimal Weight Update for Pruned LLMs. In this paper, we devise a state-of-the-art method (ADMM-Iter in the table) for pruning LLMs (without a huge GPU cluster). Here is how it works 🧵
Tweet media one
1
1
10
@bozavlado
Vlado Boza
4 years
@jsuchal 4) the local deployment for testing is idiotically hard to get running
1
0
10