Szilard Pafka
@SzilardPafka
Followers: 4K
Following: 3K
Media: 720
Statuses: 4K
physics PhD, chief (data/AI) scientist, meetup organizer, (visiting) professor, machine learning benchmarks
The Woodlands, Texas 🇺🇸
Joined February 2014
In the last 5 years I gave about 50 talks at various data science and machine learning conferences and meetups, many of them video recorded. Here is a pointer to the most up-to-date talk in each topic category: https://t.co/ZHHy4gSESs
#datascience #machinelearning #rstats #pydata
6
31
41
After 10+ years, I revisited my (minimal) benchmark of speed of aggregations and joins of tools/libraries/databases used for data science, here are the new results: https://t.co/ZM7WRoJh5M *** What tools are you using?
0
0
2
If you're wondering whether gradient boosting machines are still kicking around—or have been made obsolete by LLMs/ChatGPT—join my talk at the R Consortium's inaugural R+AI Conference (online) next week. Full program and registration here: https://t.co/SEqjCM5lDN
@RConsortium
0
0
1
Reminder: This talk is tomorrow: Szilard Pafka on "Gradient Boosting Machines (GBMs) in the Age of LLMs and ChatGPT" - online talk on Tue, Aug 19 6pm CT | 7pm ET | 4pm PT. RSVP and get the zoom link here: https://t.co/rGPqKEVcNL
0
1
3
Looking forward to this: rebooting the data science meetup with Szilard Pafka on "Gradient Boosting Machines (GBMs) in the Age of LLMs and ChatGPT" - online talk on Tue, Aug 19 6pm CT | 7pm ET | 4pm PT. RSVP and get the zoom link here: https://t.co/ay6UWmxdY7
0
1
4
2024 update: What gradient boosting machine (GBM) library have you been using the most this year?
0
0
0
- I added/I'm adding (WIP) results on newer hardware (EC2 instance types with newer CPUs/GPUs), stay tuned... More details: https://t.co/sDKtiiSbBo
1
1
1
- on CPU, the numbers have changed very little. The top performers are still XGBoost and LightGBM
- on GPU, XGBoost became even faster (2x on larger data and even more than 2x on smaller data) (it already was the best performer, so now even more so)
1
0
0
After quite a while, I updated the performance results of the most popular Gradient Boosting Machine (GBM) libraries (XGBoost, LightGBM, h2o and catboost) in my GBM-perf Github repo. Summary:
1
2
4
P(doom) in the next 50 years (probability that AI will destroy or significantly degrade human civilization) is
1
0
0
GPT-4 with simple engineering can predict the future around as well as crowds: https://t.co/TX1PMlk4o7 On hard questions, it can do better than crowds. If these systems become extremely good at seeing the future, they could serve as an objective, accurate third-party. This would
23
107
626
Why do Random Forests perform so well off-the-shelf & appear essentially immune to overfitting?!? I’ve found the text-book answer “it’s just variance reduction 🤷🏼‍♀️” to be a bit too unspecific, so in our new pre-print https://t.co/UXDO9ULnl6,
@Jeffaresalan & I investigate..🕵🏼‍♀️ 1/n
13
212
1K
@XGBoostProject continue supporting the data science community after so many years. Kudos to @hcho3_ml, who spent countless efforts leading the XGBoost development. Let the forest continue to grow 🌴🌳
0
5
23
Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have? Here are 60 LLMs getting it wrong. https://t.co/SiXJUoKFiY
168
225
1K
Incredibly, the placebo effect is (mostly) not real. It is a result of statistical confusion. Whenever you have a group with extreme values, they tend to exhibit regression to the mean. Eg. on average, sick people tend to become more healthy over time. Thus if you give one
what are your wildest ideas as to why the placebo effect has an effect even when you explicitly tell them it's a placebo
566
4K
22K
Steve Jobs Perfectly explaining AI in 1981 Legendary
54
409
2K
How likely is it that AI will destroy human civilization in the next 100 years?
0
0
0
Thanks for this hot take dude who doesn't know the 60 year history of databases. H/T @prempv Mike and I have a WIP paper that analyzes all the (failed) attempts to replace SQL + relational model. This tweet has motivated me to finish it and submit it.
SQL is going to die at the hands of an AI. I’m serious. @mayowaoshin is already doing this. Takes your company’s data and ingests it into ChatGPT. Then, you can create a chatbot for the data and just ask it questions using natural language. This video demoes the output. 🤯
104
490
3K
Announcing the Week 2 update for the Chatbot Arena leaderboard! We've added some new models that are showcasing strong performance. Currently, @OpenAI's GPT-4 and @AnthropicAI's Claude lead the pack, with open-source models in hot pursuit. More findings: https://t.co/zB5PthkHsh
41
264
1K
The problem with the debate on AI is twofold. First, the defenders of AI all seem to be quite heavily invested in AI. Second, they mostly acknowledge that there is at least some risk in developing AIs with intelligence superior to ours. https://t.co/1feGkiheVU 1/4
bloomberg.com
The Cassandras are out in force claiming artificial intelligence will be the end of mankind. They have a very good point.
20
50
281