
Sathwik Tejaswi
@carnaticfiddle
Followers: 57 · Following: 17 · Media: 0 · Statuses: 8
Mid Training and Post Training Lead @servicenow
SF Bay Area
Joined April 2016
RT @f14bertolotti: This is an interesting technical LLM report. This 15B model beats QwQ32B while using quite fewer tokens. Most interestin….
RT @Vikas_NLP_UA: 🎉 Our work “Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs” is accepted at #ACLFinding….
arxiv.org
We present a simple meta quantization approach that quantizes different layers of a large language model (LLM) at different bit levels, and is independent of the underlying quantization technique....
RT @tscholak: 🚨🤯 Today Jensen Huang announced SLAM Lab's newest model on the @HelloKnowledge stage: Apriel‑Nemotron‑15B‑Thinker 🚨. A lean, m….
RT @tscholak: 🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨. Speed ⚡ + Accuracy 📈 + Efficiency 💸. This model punches….
RT @iScienceLuvr: BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks. abs: https:….
RT @PerouzT: 🌟🌟🌟 We just released BigDocs: An Open Multimodal Dataset — our latest work on scaling document understanding across diverse da….
RT @vaibhav_adlakha: We introduce LLM2Vec, a simple approach to transform any decoder-only LLM into a text encoder. We achieve SOTA perform….
RT @Vikas_NLP_UA: 📢📢 Excited to share our new work 🍛CurryDPO. 1/2. 🔴 Systematically curates multiple preference pairs and trains upon them in a….