Sumit
@_reachsumit
3K Followers · 2K Following · 2K Media · 9K Statuses
Senior ML Engineer @Meta | prev: @TikTok_us, @Amazon, @Samsung | UChicago Alum https://t.co/hcCJ2n979W 🇮🇳→🇰🇷→🇦🇺→🇨🇦→🇺🇲
Seattle, WA
Joined April 2010
In the final post of the Adaptive RAG series, we explore how to treat selective retrieval as a core, learned skill, moving from passive observation to active, intelligent decision-making. https://t.co/MyjupeCBOS
blog.reachsumit.com
This final post of the Adaptive RAG series explores methods that treat adaptive retrieval as a learned skill and explicitly teach models when to retrieve. We examine three paradigms in increasing...
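A minimal sketch of the when-to-retrieve gate idea (illustrative only; `should_retrieve` and its fixed threshold are assumptions, not the series' actual method, which trains this decision):

```python
import numpy as np

def should_retrieve(token_logprobs: list[float], threshold: float = -1.0) -> bool:
    """Hypothetical gate: retrieve when the draft answer's mean token
    log-probability falls below a threshold. A learned gate would replace
    this fixed rule with a trained classifier over query/model features."""
    return float(np.mean(token_logprobs)) < threshold

print(should_retrieve([-0.2, -0.1, -0.3]))  # False -> answer directly
print(should_retrieve([-2.5, -1.8, -3.0]))  # True  -> retrieve first
```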
Practice on Long Behavior Sequence Modeling in Tencent Advertising
Tencent presents methods for handling unified commercial behavior trajectories across advertising domains. 📝
arxiv.org
Long-sequence modeling has become an indispensable frontier in recommendation systems for capturing users' long-term preferences. However, user behaviors within advertising domains are inherently...
Your Dense Retriever is Secretly an Expeditious Reasoner
Introduces a hybrid query reasoning framework with a Dense Reasoner that performs LLM-style reasoning in embedding space and a Reasoner Router. 📝 https://t.co/r44vAtDv8c 👨🏽💻 https://t.co/Qe9QSUPBgh
github.com/maple826/AdaQR
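A rough sketch of that split, under assumptions: a cheap embedding-space path for easy queries and a full-LLM fallback for hard ones. The hop matrix and fixed cutoff are stand-ins for components the paper learns:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W_hop = rng.normal(size=(d, d)) / np.sqrt(d)  # stand-in for a trained reasoning hop

def dense_reason(q_emb: np.ndarray, hops: int = 2) -> np.ndarray:
    """Hypothetical embedding-space reasoning: each hop transforms the
    query vector directly instead of generating chain-of-thought text."""
    v = q_emb
    for _ in range(hops):
        v = np.tanh(W_hop @ v)
    return v / np.linalg.norm(v)

def route(router_confidence: float, cutoff: float = 0.5) -> str:
    """Toy router: the actual Reasoner Router is learned; this fixed
    cutoff only illustrates the cheap-path/expensive-path split."""
    return "dense_reasoner" if router_confidence >= cutoff else "llm_reasoner"

print(route(0.9), dense_reason(rng.normal(size=d))[:3])
```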
DiffGRM: Diffusion-based Generative Recommendation Model
Kuaishou introduces a diffusion-based generative recommendation framework that replaces autoregressive decoders with masked discrete diffusion models. 📝 https://t.co/tQ05AhA2HW 👨🏽💻
github.com/liuzhao09/DiffGRM
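For intuition, here is a minimal MaskGIT-style unmasking loop, the decoding pattern masked discrete diffusion models use in place of left-to-right generation. The toy denoiser, vocabulary, and sizes are assumptions, not DiffGRM's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LEN, MASK = 16, 4, -1

def toy_denoiser(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the trained denoiser: per-position logits over the
    semantic-ID vocabulary. A real model conditions on user history."""
    return rng.normal(size=(LEN, VOCAB))

def diffusion_decode(steps: int = 4) -> np.ndarray:
    """Start fully masked; each step unmasks the most confident masked
    position, so all positions are predicted in parallel rather than
    autoregressively."""
    tokens = np.full(LEN, MASK)
    for _ in range(steps):
        masked = tokens == MASK
        if not masked.any():
            break
        e = np.exp(toy_denoiser(tokens))
        probs = e / e.sum(-1, keepdims=True)
        conf, pred = probs.max(-1), probs.argmax(-1)
        i = np.flatnonzero(masked)[conf[masked].argmax()]
        tokens[i] = pred[i]
    return tokens

print(diffusion_decode())
```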
Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders
Meta compresses ultra-long user histories into cached embeddings, enabling scalability to lifelong sequences. 📝
arxiv.org
Modern large-scale recommendation systems rely heavily on user interaction history sequences to enhance the model performance. The advent of large language models and sequential modeling...
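A sketch of the cached-embedding idea, assuming mean-pooled chunk embeddings; the paper's compression is learned, so this only illustrates the caching pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
d, chunk = 16, 128
cache: dict[int, np.ndarray] = {}  # chunk_id -> compressed embedding

def embed_chunk(events: np.ndarray) -> np.ndarray:
    """Stand-in encoder: mean-pool event vectors into one cached embedding."""
    return events.mean(axis=0)

def user_state(history: np.ndarray) -> np.ndarray:
    """Compress a lifelong history into cached chunk embeddings, so the
    live model attends over O(history / chunk) vectors; each chunk is
    embedded once and reused on every subsequent request."""
    reps = []
    for cid, start in enumerate(range(0, len(history), chunk)):
        if cid not in cache:
            cache[cid] = embed_chunk(history[start:start + chunk])
        reps.append(cache[cid])
    return np.stack(reps)

print(user_state(rng.normal(size=(1000, d))).shape)  # (8, 16)
```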
Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search
LinkedIn introduces a training-free approach combining model pruning and context compression with serving optimizations to achieve a 10x throughput improvement. 📝
arxiv.org
Large Language Models (LLMs) have demonstrated impressive quality when applied to predictive tasks such as relevance ranking and semantic search. However, deployment of such LLMs remains...
Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy
Achieves 99.87% of multi-vector model accuracy while reducing computation by 99.82%. 📝 https://t.co/pi4EN6M1uX 👨🏽💻
arxiv.org
Retrieval over visually rich documents is essential for tasks such as legal discovery, scientific search, and enterprise knowledge management. Existing approaches fall into two paradigms:...
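A sketch of the general single-vector-then-multi-vector pattern, assuming dot-product candidate pruning followed by ColBERT-style MaxSim rescoring; the paper's exact pipeline may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_docs, n_tok = 8, 100, 5
doc_single = rng.normal(size=(n_docs, d))        # one vector per doc (cheap)
doc_multi = rng.normal(size=(n_docs, n_tok, d))  # per-token vectors (accurate)

def hybrid_search(q_single: np.ndarray, q_multi: np.ndarray, k: int = 10) -> np.ndarray:
    """Stage 1 prunes to k candidates with a single dot product per doc;
    stage 2 runs the expensive multi-vector MaxSim only on those k."""
    cand = np.argsort(doc_single @ q_single)[::-1][:k]
    sim = np.einsum("ktd,qd->kqt", doc_multi[cand], q_multi)
    scores = sim.max(axis=2).sum(axis=1)  # max over doc tokens, sum over query tokens
    return cand[np.argsort(scores)[::-1]]

print(hybrid_search(rng.normal(size=d), rng.normal(size=(3, d))))
```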
Tools Are Under-Documented: Simple Document Expansion Boosts Tool Retrieval
Enriches tool documentation with structured fields using LLM-based expansion, along with Tool-Embed and Tool-Rank models. 📝 https://t.co/6NhqmOPabX 👨🏽💻
github.com/EIT-NLP/Tool-DE
Rule-Based Explanations for Retrieval-Augmented LLM Systems
@JoelExplainsAI et al. propose a rule-based explanation framework for RAG systems that links the presence or absence of retrieved sources to LLM outputs. 📝
arxiv.org
If-then rules are widely used to explain machine learning models; e.g., "if employed = no, then loan application = rejected." We present the first proposal to apply rules to explain the emerging...
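A toy illustration of mining presence/absence rules by ablating retrieved sources; the brute-force loop and stand-in RAG pipeline are assumptions, not the paper's algorithm:

```python
from itertools import combinations

SOURCES = {"s1": "The capital of France is Paris.",
           "s2": "Paris hosted the 2024 Olympics."}

def toy_rag(ctx: set[str]) -> str:
    """Stand-in for the RAG pipeline: answers only if the key source is present."""
    return "Paris" if "s1" in ctx else "unknown"

# Ablate every subset of retrieved sources and keep those that reproduce
# the full-context answer, yielding if-then rules over source presence.
full = toy_rag(set(SOURCES))
for r in range(1, len(SOURCES) + 1):
    for kept in combinations(SOURCES, r):
        if toy_rag(set(kept)) == full:
            print(f"if retrieved = {sorted(kept)}, then answer = {full!r}")
```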
Windsock is Dancing: Adaptive Multimodal Retrieval-Augmented Generation
Introduces a query-dependent module for adaptive multimodal RAG that decides when to retrieve and which modality to use. 📝
arxiv.org
Multimodal Retrieval-Augmented Generation (MRAG) has emerged as a promising method to generate factual and up-to-date responses of Multimodal Large Language Models (MLLMs) by incorporating...
E2Rank: Your Text Embedding Can Also be an Effective and Efficient Listwise Reranker
@qiliu6777 et al. at Alibaba extend text embedding models to perform both retrieval and listwise reranking. 📝 https://t.co/LO4IJ81UM8 👨🏽💻 https://t.co/S18aoBW1ce
arxiv.org
Text embedding models serve as a fundamental component in real-world search applications. By mapping queries and documents into a shared embedding space, they deliver competitive retrieval...
Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models
Investigates temporal biases in LLMs, revealing that both transformer and state-space models exhibit strong primacy and recency effects. 📝
arxiv.org
In-context learning is governed by both temporal and semantic relationships, shaping how Large Language Models (LLMs) retrieve contextual information. Analogous to human episodic memory, where the...
MGFRec: Towards Reinforced Reasoning Recommendation with Multiple Groundings and Feedback
Proposes an RL framework enabling LLMs to perform multiple groundings in the actual item space during reasoning, to better align recommendations with real items. 📝
arxiv.org
The powerful reasoning and generative capabilities of large language models (LLMs) have inspired researchers to apply them to reasoning-based recommendation tasks, which require in-depth reasoning...
Tagging-Augmented Generation: Assisting Language Models in Finding Intricate Knowledge In Long Contexts
Amazon introduces a lightweight data augmentation strategy that boosts LLM performance in long-context scenarios. 📝 https://t.co/6uaAOzZAy1 👨🏽💻
sites.google.com
Abstract Recent investigations into effective context lengths of modern flagship large language models (LLMs) have revealed major limitations in effective question answering (QA) and reasoning over...
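A sketch of the tagging idea, assuming a simple XML-style tag format; Amazon's actual scheme may differ:

```python
def tag_context(chunks: list[tuple[str, str]]) -> str:
    """Wrap each context chunk in a topic tag so the model can locate
    intricate facts inside a long context. The tag format is an
    illustrative assumption."""
    return "\n".join(f"<{topic}>\n{text}\n</{topic}>" for topic, text in chunks)

print(tag_context([("pricing", "Tier A costs $10/mo."),
                   ("limits", "Tier A allows 5 seats.")]))
```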
Think before Recommendation: Autonomous Reasoning-enhanced Recommender
Alibaba introduces an RL-based recommendation paradigm that trains a single LLM to autonomously develop reasoning capabilities for rating prediction. 📝 https://t.co/QGLM4SSJWn 👨🏽💻
github.com/AkaliKong/RecZero
LimRank: Less is More for Reasoning-Intensive Information Reranking
Demonstrates that modern LLMs can be effectively adapted for information reranking using minimal, high-quality supervision. 📝 https://t.co/E1VO1ypLoH 👨🏽💻
github.com
Official repository for EMNLP 2025 Paper "LimRank: Less is More for Reasoning-Intensive Information Reranking" - SighingSnow/limrank
Bi-Level Optimization for Generative Recommendation: Bridging Tokenization and Generation
Introduces a bi-level optimization framework that explicitly models the interdependence between item tokenization and autoregressive generation. 📝
arxiv.org
Generative recommendation is emerging as a transformative paradigm by directly generating recommended items, rather than relying on matching. Building such a system typically involves two key...
Pctx: Tokenizing Personalized Context for Generative Recommendation
Proposes a personalized context-aware tokenizer that incorporates user historical interactions to generate adaptive semantic IDs. 📝 https://t.co/rOIQHeKAF6 👨🏽💻 https://t.co/u6mAVQUKbH
github.com
PyTorch-based open-source code for paper "Pctx: Tokenizing Personalized Context for Generative Recommendation" - YoungZ365/Pctx
CausalRec: A CausalBoost Attention Model for Sequential Recommendation
Alibaba introduces a causal attention framework that learns causal graphs in user behavior sequences to improve sequential recommendations. 📝 https://t.co/qjEIHq7z51 👨🏽💻
arxiv.org
Recent advances in correlation-based sequential recommendation systems have demonstrated substantial success. Specifically, the attention-based model outperforms other RNN-based and Markov...
Redefining Retrieval Evaluation in the Era of LLMs
@GioTrappolini et al. propose a metric designed for RAG systems that accounts for both the positive utility of relevant passages and the negative impact of distracting ones. 📝 https://t.co/X0nfzFQOLz 👨🏽💻 https://t.co/QGZ86bJKQP
github.com/GiovanniTRA/UDCG
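A toy rank-discounted score in that spirit, where distractors contribute negative utility; this is inspired by, not identical to, the proposed UDCG metric:

```python
import math

def udcg_like(utilities: list[float]) -> float:
    """Toy score over a ranked list: each element is a per-passage utility
    in [-1, 1], negative for distracting passages, discounted by rank as
    in DCG. The paper's exact formula may differ."""
    return sum(u / math.log2(rank + 2) for rank, u in enumerate(utilities))

print(udcg_like([1.0, -0.5, 0.3]))  # the rank-2 distractor lowers the score
```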
Doc-Researcher: A Unified System for Multimodal Document Parsing and Deep Research
Huawei presents a system integrating deep multimodal parsing with multi-agent research workflows, enabling iterative evidence gathering across documents. 📝
arxiv.org
Deep Research systems have revolutionized how LLMs solve complex questions through iterative reasoning and evidence gathering. However, current systems remain fundamentally constrained to textual...