Leo Boytsov

@srchvrs

Followers: 9K · Following: 31K · Media: 231 · Statuses: 25K

Machine learning scientist and engineer speaking πtorch & C++. Past @LTIatCMU, @awscloud. Opinions sampled from MY OWN 100T param LM.

Pittsburgh, PA
Joined November 2009
@srchvrs
Leo Boytsov
1 year
🧵📢Attention folks working on LONG-document ranking & retrieval! We found evidence of a PROFOUND issue in existing long-document collections, most importantly MS MARCO Documents. It can potentially affect all papers comparing different architectures for long document ranking.⏩.
4
14
131
@srchvrs
Leo Boytsov
1 day
Pretraining has apparently hit a wall, but OpenAI researchers still believe in RL. This sure makes sense: learning by doing is a much more powerful paradigm.
@amir
Amir Efrati
3 days
GPT-5 is good. But model performance gains are still slower than in past years and this year has been a technically challenging one for OpenAI researchers. The inside story here.
0
0
3
@srchvrs
Leo Boytsov
5 days
RT @lmthang: It’s my great honor to have just received the test-of-time award at #ACL2025 from @aclmeeting for our paper “Effective Approac….
0
21
0
@srchvrs
Leo Boytsov
7 days
Statistically speaking, the chances of a life-changing experience (i.e., you won't need to work for money anymore) are very small (possibly well under 1%), but you won't be able to finish your degree, which may change your life in more profound ways. 🟦
0
0
5
@srchvrs
Leo Boytsov
7 days
🧵This! The same is true for PhD programs. If you lose interest in doing research & publishing papers, or you otherwise feel it's not your path, it's certainly fine to drop out of a program, but I don't think it's worth doing just to join a startup. ↩️
@paulg
Paul Graham
10 days
Don't drop out of college to start or work for a startup. There will be other (and probably better) startup opportunities, but you can't get your college years back.
1
0
10
@srchvrs
Leo Boytsov
8 days
It is a spot-on observation that the nature of search optimization is changing. However, in my view, an LLM is not a new user in the sense of being the ultimate consumer of information. Rather, it is a user in terms of the target optimization metric. Search now needs to support.
@jobergum
Jo Kristian Bergum
8 days
Search has a new user now, and I’m here for it
1
0
7
@srchvrs
Leo Boytsov
10 days
Hey! Exploding offers give an employer the biggest bang for the buck (or so they hope at least).
@_arohan_
rohan anil
10 days
As a field we probably should get rid of exploding offers.
0
0
0
@srchvrs
Leo Boytsov
14 days
RT @thomasahle: What do we know about the new DeepMind and OpenAI methods for RL on hard reasoning?. - Google say they used a new RL method….
0
2
0
@srchvrs
Leo Boytsov
14 days
A very inspiring result. Congratulations to everyone involved!
@YiTayML
Yi Tay
14 days
First official Gold medal at IMO from DeepMind🥇 with Gemini Deep Think. A general-purpose text-in text-out model achieving a gold medal was quite unthinkable just about one year ago, and here we are! The frontier of AI is incredibly exciting! Happy to have co-led /.
0
0
1
@srchvrs
Leo Boytsov
14 days
RE: if your text contains em-dashes (long dashes), it's likely generated or co-generated by ChatGPT. Fun fact: if you ask it to provide suggestions rather than revisions, it will complain about the lack of what it considers to be "typographically correct" symbols 😅: “5-10x” →.
0
0
3
@srchvrs
Leo Boytsov
16 days
News gets old very quickly nowadays. @alexwei_ and the team report that an experimental variant of an OpenAI model is estimated to achieve the level of a gold medalist at IMO using a generalist model. Great achievement: I hope this model will be available to a general.
@srchvrs
Leo Boytsov
18 days
A blog post on evaluating LLMs on the 2025 International Math Olympiad. The best LLM (Gemini) scored 31% and the next best one (o3-high) scored only 16%. Such a huge gap! Great job, Gemini team! It is actually not yet clear if Gemini will be at a bronze level, but it's unlikely.
0
0
2
@srchvrs
Leo Boytsov
17 days
Results? If I read the tables correctly, there is only a very modest boost in both recall & NDCG, within 2%. Given that the procedure requires a second retrieval, it does not seem to be worth the effort. 🟦
dl.acm.org
0
0
3
@srchvrs
Leo Boytsov
17 days
PRF was not forgotten in the neural IR era, but how well does it really perform? Revanth Gangi Reddy & colleagues ran a rather thorough experiment and published it at SIGIR. ↩️
1
0
2
@srchvrs
Leo Boytsov
17 days
It was doc2query before doc2query, and, in fact, it improved performance (by a few percent) of the IBM Watson QA system that beat human champions in Jeopardy! ↩️
research.ibm.com
Statistical Source Expansion for Question Answering, Nico Schlaefer et al., CIKM 2011.
1
0
2
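The supervised document-expansion idea can be sketched as follows: predict queries (or related terms) that a document should match and append them to the document text before indexing. This is a hedged illustration, not the actual Watson or doc2query implementation; `query_generator` stands in for a trained model, and `toy_generator` is purely illustrative:

```python
def expand_document(doc_text, query_generator, n_queries=5):
    # Append up to n_queries predicted queries to the document before
    # indexing; query_generator is assumed to be a trained model that
    # maps a document to queries likely to target it.
    queries = query_generator(doc_text)[:n_queries]
    return doc_text + " " + " ".join(queries)

# Stand-in generator for illustration only: picks longer (rarer-looking)
# words; a real system would use a learned seq2seq generator.
def toy_generator(doc_text):
    return [w for w in doc_text.split() if len(w) > 6]

expanded = expand_document("statistical source expansion improves answering",
                           toy_generator)
```

Because the expansion is attached to documents offline, it adds no query-time retrieval pass, unlike PRF.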
@srchvrs
Leo Boytsov
17 days
I think this is a problem of the completely unsupervised and blind approach to adding terms to the query. If we had some supervision signal to filter out potentially bad terms, this would work out better. In fact, a supervised approach was previously used to add terms to documents!
1
0
3
@srchvrs
Leo Boytsov
17 days
This issue spawned a sub-topic in the IR community devoted to fixing it and to identifying, in advance, cases where performance degrades substantially. Dozens of approaches were proposed, but I do not think the effort was successful. Why⁉️ ↩️
1
0
2
@srchvrs
Leo Boytsov
17 days
PRF tends to improve things on average, but it has a rather nasty property of tanking outcomes for some queries quite dramatically: when things go wrong (i.e., unlucky unrelated terms are added to the query), they can go very wrong. ↩️
1
0
2
@srchvrs
Leo Boytsov
17 days
PRF is an old technique introduced 40 years ago in the SMART system (arguably the first open-source IR system). ↩️.
@srchvrs
Leo Boytsov
2 months
🧵40 years ago the SMART IR system was released. It introduced a few key concepts including vector space interpretation of the retrieval process and the relevance feedback algorithm. I also think it was probably the first open source search engine. ↩️.
1
0
2
@srchvrs
Leo Boytsov
17 days
🧵Pseudo-relevance feedback (PRF), also known as blind feedback, is a technique that first retrieves/re-ranks the top-k documents and adds some of their words to the initial query. A second retrieval/ranking stage then uses the updated query. ↩️
1
0
8
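The PRF loop above can be sketched in a few lines over a toy in-memory corpus. This is a minimal illustration, not a production recipe: TF-IDF scoring stands in for a real first-stage ranker such as BM25, and all document texts and helper names are made up:

```python
from collections import Counter
import math

# Toy corpus: doc id -> tokenized text (illustrative only).
DOCS = {
    "d1": "neural ranking models for document retrieval".split(),
    "d2": "pseudo relevance feedback expands the query with feedback terms".split(),
    "d3": "cooking pasta with tomato sauce".split(),
}

def idf(term):
    # Smoothed inverse document frequency.
    df = sum(term in doc for doc in DOCS.values())
    return math.log((len(DOCS) + 1) / (df + 1))

def score(query, doc):
    # Simple TF-IDF score; a real system would use BM25 or a neural ranker.
    tf = Counter(doc)
    return sum(tf[t] * idf(t) for t in query)

def search(query, k=2):
    # First-pass retrieval: rank all docs, keep the top k.
    ranked = sorted(DOCS, key=lambda d: score(query, DOCS[d]), reverse=True)
    return ranked[:k]

def prf_expand(query, k=2, n_terms=3):
    # 1) Retrieve top-k documents for the original query.
    top = search(query, k)
    # 2) Blindly assume they are relevant; pick frequent high-IDF terms.
    pool = Counter(t for d in top for t in DOCS[d] if t not in query)
    best = sorted(pool, key=lambda t: pool[t] * idf(t), reverse=True)[:n_terms]
    # 3) The second retrieval/ranking stage would use this expanded query.
    return query + best

expanded = prf_expand(["relevance", "feedback"])
```

Note that nothing prevents a generic or off-topic term from the top-k documents from being selected, which is exactly the unsupervised-expansion failure mode discussed in this thread.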
@srchvrs
Leo Boytsov
17 days
For the record, I think models are still prone to shortcuts, but with more data (including pretraining), better architectures, and a much larger number of parameters, the accuracy on many realistic tasks has improved dramatically (and many fewer silly mistakes are made).
@srchvrs
Leo Boytsov
17 days
🧵Just a decade ago, we had models that could not detect cows on beaches because they over-relied on shortcuts like the presence of green grass; now we have capable action models that interact with websites directly from pixels. ↩️
0
0
0