Sumit
@_reachsumit
3K Followers · 2K Following · 2K Media · 9K Statuses
Senior ML Engineer @Meta | prev: @TikTok_us, @Amazon, @Samsung | UChicago Alum https://t.co/hcCJ2n979W 🇮🇳→🇰🇷→🇦🇺→🇨🇦→🇺🇲
Seattle, WA
Joined April 2010
In the final post of the Adaptive RAG series, we explore how to treat selective retrieval as a core, learned skill, moving from passive observation to active, intelligent decision-making. https://t.co/MyjupeCBOS
blog.reachsumit.com
This final post of the Adaptive RAG series explores methods that treat adaptive retrieval as a learned skill and explicitly teach models when to retrieve. We examine three paradigms in increasing...
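A minimal sketch of the when-to-retrieve gate idea (illustrative only; `should_retrieve` and its fixed threshold are assumptions, not the series' actual method, which trains this decision):

```python
import numpy as np

def should_retrieve(token_logprobs: list[float], threshold: float = -1.0) -> bool:
    """Hypothetical gate: retrieve when the draft answer's mean token
    log-probability falls below a threshold. A learned gate would replace
    this fixed rule with a trained classifier over query/model features."""
    return float(np.mean(token_logprobs)) < threshold

print(should_retrieve([-0.2, -0.1, -0.3]))  # False -> answer directly
print(should_retrieve([-2.5, -1.8, -3.0]))  # True  -> retrieve first
```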
Practice on Long Behavior Sequence Modeling in Tencent Advertising
Tencent presents methods for handling unified commercial behavior trajectories across advertising domains. 📝
arxiv.org
Long-sequence modeling has become an indispensable frontier in recommendation systems for capturing users' long-term preferences. However, user behaviors within advertising domains are inherently...
Your Dense Retriever is Secretly an Expeditious Reasoner
Introduces a hybrid query reasoning framework with a Dense Reasoner that performs LLM-style reasoning in embedding space and a Reasoner Router. 📝 https://t.co/r44vAtDv8c 👨🏽💻 https://t.co/Qe9QSUPBgh
github.com/maple826/AdaQR
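A rough sketch of that split, under assumptions: a cheap embedding-space path for easy queries and a full-LLM fallback for hard ones. The hop matrix and fixed cutoff are stand-ins for components the paper learns:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W_hop = rng.normal(size=(d, d)) / np.sqrt(d)  # stand-in for a trained reasoning hop

def dense_reason(q_emb: np.ndarray, hops: int = 2) -> np.ndarray:
    """Hypothetical embedding-space reasoning: each hop transforms the
    query vector directly instead of generating chain-of-thought text."""
    v = q_emb
    for _ in range(hops):
        v = np.tanh(W_hop @ v)
    return v / np.linalg.norm(v)

def route(router_confidence: float, cutoff: float = 0.5) -> str:
    """Toy router: the actual Reasoner Router is learned; this fixed
    cutoff only illustrates the cheap-path/expensive-path split."""
    return "dense_reasoner" if router_confidence >= cutoff else "llm_reasoner"

print(route(0.9), dense_reason(rng.normal(size=d))[:3])
```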
DiffGRM: Diffusion-based Generative Recommendation Model
Kuaishou introduces a diffusion-based generative recommendation framework that replaces autoregressive decoders with masked discrete diffusion models. 📝 https://t.co/tQ05AhA2HW 👨🏽💻
github.com/liuzhao09/DiffGRM
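For intuition, here is a minimal MaskGIT-style unmasking loop, the decoding pattern masked discrete diffusion models use in place of left-to-right generation. The toy denoiser, vocabulary, and sizes are assumptions, not DiffGRM's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LEN, MASK = 16, 4, -1

def toy_denoiser(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the trained denoiser: per-position logits over the
    semantic-ID vocabulary. A real model conditions on user history."""
    return rng.normal(size=(LEN, VOCAB))

def diffusion_decode(steps: int = 4) -> np.ndarray:
    """Start fully masked; each step unmasks the most confident masked
    position, so all positions are predicted in parallel rather than
    autoregressively."""
    tokens = np.full(LEN, MASK)
    for _ in range(steps):
        masked = tokens == MASK
        if not masked.any():
            break
        e = np.exp(toy_denoiser(tokens))
        probs = e / e.sum(-1, keepdims=True)
        conf, pred = probs.max(-1), probs.argmax(-1)
        i = np.flatnonzero(masked)[conf[masked].argmax()]
        tokens[i] = pred[i]
    return tokens

print(diffusion_decode())
```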
Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders
Meta compresses ultra-long user histories into cached embeddings, enabling scalability to lifelong sequences. 📝
arxiv.org
Modern large-scale recommendation systems rely heavily on user interaction history sequences to enhance the model performance. The advent of large language models and sequential modeling...
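A sketch of the cached-embedding idea, assuming mean-pooled chunk embeddings; the paper's compression is learned, so this only illustrates the caching pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
d, chunk = 16, 128
cache: dict[int, np.ndarray] = {}  # chunk_id -> compressed embedding

def embed_chunk(events: np.ndarray) -> np.ndarray:
    """Stand-in encoder: mean-pool event vectors into one cached embedding."""
    return events.mean(axis=0)

def user_state(history: np.ndarray) -> np.ndarray:
    """Compress a lifelong history into cached chunk embeddings, so the
    live model attends over O(history / chunk) vectors; each chunk is
    embedded once and reused on every subsequent request."""
    reps = []
    for cid, start in enumerate(range(0, len(history), chunk)):
        if cid not in cache:
            cache[cid] = embed_chunk(history[start:start + chunk])
        reps.append(cache[cid])
    return np.stack(reps)

print(user_state(rng.normal(size=(1000, d))).shape)  # (8, 16)
```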
Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search
LinkedIn introduces a training-free approach combining model pruning and context compression with serving optimizations to achieve a 10x throughput improvement. 📝
arxiv.org
Large Language Models (LLMs) have demonstrated impressive quality when applied to predictive tasks such as relevance ranking and semantic search. However, deployment of such LLMs remains...
Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy
Achieves 99.87% of multi-vector model accuracy while reducing computation by 99.82%. 📝 https://t.co/pi4EN6M1uX 👨🏽💻
arxiv.org
Retrieval over visually rich documents is essential for tasks such as legal discovery, scientific search, and enterprise knowledge management. Existing approaches fall into two paradigms:...
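A sketch of the general single-vector-then-multi-vector pattern, assuming dot-product candidate pruning followed by ColBERT-style MaxSim rescoring; the paper's exact pipeline may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_docs, n_tok = 8, 100, 5
doc_single = rng.normal(size=(n_docs, d))        # one vector per doc (cheap)
doc_multi = rng.normal(size=(n_docs, n_tok, d))  # per-token vectors (accurate)

def hybrid_search(q_single: np.ndarray, q_multi: np.ndarray, k: int = 10) -> np.ndarray:
    """Stage 1 prunes to k candidates with a single dot product per doc;
    stage 2 runs the expensive multi-vector MaxSim only on those k."""
    cand = np.argsort(doc_single @ q_single)[::-1][:k]
    sim = np.einsum("ktd,qd->kqt", doc_multi[cand], q_multi)
    scores = sim.max(axis=2).sum(axis=1)  # max over doc tokens, sum over query tokens
    return cand[np.argsort(scores)[::-1]]

print(hybrid_search(rng.normal(size=d), rng.normal(size=(3, d))))
```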
Tools Are Under-Documented: Simple Document Expansion Boosts Tool Retrieval
Enriches tool documentation with structured fields using LLM-based expansion, along with Tool-Embed and Tool-Rank models. 📝 https://t.co/6NhqmOPabX 👨🏽💻
github.com/EIT-NLP/Tool-DE
Rule-Based Explanations for Retrieval-Augmented LLM Systems
@JoelExplainsAI et al. propose a rule-based explanation framework for RAG systems that links the presence or absence of retrieved sources to LLM outputs. 📝
arxiv.org
If-then rules are widely used to explain machine learning models; e.g., "if employed = no, then loan application = rejected." We present the first proposal to apply rules to explain the emerging...
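A toy illustration of mining presence/absence rules by ablating retrieved sources; the brute-force loop and stand-in RAG pipeline are assumptions, not the paper's algorithm:

```python
from itertools import combinations

SOURCES = {"s1": "The capital of France is Paris.",
           "s2": "Paris hosted the 2024 Olympics."}

def toy_rag(ctx: set[str]) -> str:
    """Stand-in for the RAG pipeline: answers only if the key source is present."""
    return "Paris" if "s1" in ctx else "unknown"

# Ablate every subset of retrieved sources and keep those that reproduce
# the full-context answer, yielding if-then rules over source presence.
full = toy_rag(set(SOURCES))
for r in range(1, len(SOURCES) + 1):
    for kept in combinations(SOURCES, r):
        if toy_rag(set(kept)) == full:
            print(f"if retrieved = {sorted(kept)}, then answer = {full!r}")
```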
Windsock is Dancing: Adaptive Multimodal Retrieval-Augmented Generation
Introduces a query-dependent module for adaptive multimodal RAG that decides when to retrieve and which modality to use. 📝
arxiv.org
Multimodal Retrieval-Augmented Generation (MRAG) has emerged as a promising method to generate factual and up-to-date responses of Multimodal Large Language Models (MLLMs) by incorporating...
E2Rank: Your Text Embedding Can Also be an Effective and Efficient Listwise Reranker
@qiliu6777 et al. at Alibaba extend text embedding models to perform both retrieval and listwise reranking. 📝 https://t.co/LO4IJ81UM8 👨🏽💻 https://t.co/S18aoBW1ce
arxiv.org
Text embedding models serve as a fundamental component in real-world search applications. By mapping queries and documents into a shared embedding space, they deliver competitive retrieval...
Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models
Investigates temporal biases in LLMs, revealing that both transformer and state-space models exhibit strong primacy and recency effects. 📝
arxiv.org
In-context learning is governed by both temporal and semantic relationships, shaping how Large Language Models (LLMs) retrieve contextual information. Analogous to human episodic memory, where the...
MGFRec: Towards Reinforced Reasoning Recommendation with Multiple Groundings and Feedback
Proposes an RL framework enabling LLMs to perform multiple groundings in the actual item space during reasoning, to better align recommendations with real items. 📝
arxiv.org
The powerful reasoning and generative capabilities of large language models (LLMs) have inspired researchers to apply them to reasoning-based recommendation tasks, which require in-depth reasoning...
Tagging-Augmented Generation: Assisting Language Models in Finding Intricate Knowledge In Long Contexts
Amazon introduces a lightweight data augmentation strategy that boosts LLM performance in long-context scenarios. 📝 https://t.co/6uaAOzZAy1 👨🏽💻
sites.google.com
Abstract Recent investigations into effective context lengths of modern flagship large language models (LLMs) have revealed major limitations in effective question answering (QA) and reasoning over...
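A sketch of the tagging idea, assuming a simple XML-style tag format; Amazon's actual scheme may differ:

```python
def tag_context(chunks: list[tuple[str, str]]) -> str:
    """Wrap each context chunk in a topic tag so the model can locate
    intricate facts inside a long context. The tag format is an
    illustrative assumption."""
    return "\n".join(f"<{topic}>\n{text}\n</{topic}>" for topic, text in chunks)

print(tag_context([("pricing", "Tier A costs $10/mo."),
                   ("limits", "Tier A allows 5 seats.")]))
```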
Think before Recommendation: Autonomous Reasoning-enhanced Recommender
Alibaba introduces an RL-based recommendation paradigm that trains a single LLM to autonomously develop reasoning capabilities for rating prediction. 📝 https://t.co/QGLM4SSJWn 👨🏽💻
github.com/AkaliKong/RecZero
LimRank: Less is More for Reasoning-Intensive Information Reranking
Demonstrates that modern LLMs can be effectively adapted for information reranking using minimal, high-quality supervision. 📝 https://t.co/E1VO1ypLoH 👨🏽💻
github.com
Official repository for EMNLP 2025 Paper "LimRank: Less is More for Reasoning-Intensive Information Reranking" - SighingSnow/limrank
Bi-Level Optimization for Generative Recommendation: Bridging Tokenization and Generation
Introduces a bi-level optimization framework that explicitly models the interdependence between item tokenization and autoregressive generation. 📝
arxiv.org
Generative recommendation is emerging as a transformative paradigm by directly generating recommended items, rather than relying on matching. Building such a system typically involves two key...
Pctx: Tokenizing Personalized Context for Generative Recommendation
Proposes a personalized context-aware tokenizer that incorporates user historical interactions to generate adaptive semantic IDs. 📝 https://t.co/rOIQHeKAF6 👨🏽💻 https://t.co/u6mAVQUKbH
github.com
PyTorch-based open-source code for paper "Pctx: Tokenizing Personalized Context for Generative Recommendation" - YoungZ365/Pctx
CausalRec: A CausalBoost Attention Model for Sequential Recommendation
Alibaba introduces a causal attention framework that learns causal graphs in user behavior sequences to improve sequential recommendations. 📝 https://t.co/qjEIHq7z51 👨🏽💻
arxiv.org
Recent advances in correlation-based sequential recommendation systems have demonstrated substantial success. Specifically, the attention-based model outperforms other RNN-based and Markov...
Redefining Retrieval Evaluation in the Era of LLMs
@GioTrappolini et al. propose a metric designed for RAG systems that accounts for both the positive utility of relevant passages and the negative impact of distracting ones. 📝 https://t.co/X0nfzFQOLz 👨🏽💻 https://t.co/QGZ86bJKQP
github.com/GiovanniTRA/UDCG
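A toy rank-discounted score in that spirit, where distractors contribute negative utility; this is inspired by, not identical to, the proposed UDCG metric:

```python
import math

def udcg_like(utilities: list[float]) -> float:
    """Toy score over a ranked list: each element is a per-passage utility
    in [-1, 1], negative for distracting passages, discounted by rank as
    in DCG. The paper's exact formula may differ."""
    return sum(u / math.log2(rank + 2) for rank, u in enumerate(utilities))

print(udcg_like([1.0, -0.5, 0.3]))  # the rank-2 distractor lowers the score
```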
Doc-Researcher: A Unified System for Multimodal Document Parsing and Deep Research
Huawei presents a system integrating deep multimodal parsing with multi-agent research workflows, enabling iterative evidence gathering across documents. 📝
arxiv.org
Deep Research systems have revolutionized how LLMs solve complex questions through iterative reasoning and evidence gathering. However, current systems remain fundamentally constrained to textual...