Sathwik Tejaswi Profile
Sathwik Tejaswi

@SathwikTejaswi

Followers: 66 · Following: 34 · Media: 0 · Statuses: 17

Technical Co-Lead of Apriel Mid-Training and Post-Training https://t.co/7CEEfb5Fib

SF Bay Area
Joined April 2016
@SathwikTejaswi
Sathwik Tejaswi
2 months
Thank you @mervenoyann for the shout-out!
@mervenoyann
merve
2 months
the new Apriel-1.5 reasoning vision language model by @ServiceNowRSRCH is so good! 🔥😮 here's a small vibe test across languages ⤵️
> ask it to identify drug interactions in a French label, in English
> it compares minerals
> finally comes up with a look-up table with the correct list!
Replies: 0 · Reposts: 0 · Likes: 2
@LysandreJik
Lysandre
2 months
ServiceNow-AI/Apriel-1.5-15b-Thinker running on a single GPU using `transformers serve` 🔥 great to have some very nice reasoning models that can run locally! next step, trying it on mps 👀
Replies: 0 · Reposts: 9 · Likes: 55
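For readers who want to try this locally: a minimal sketch (not from the tweet) of querying a `transformers serve` instance through its OpenAI-compatible API. It assumes a recent transformers release and the server's default local address; the base URL, API-key handling, and prompt are assumptions.

```python
# Sketch: query a local `transformers serve` instance with the OpenAI client.
# Assumes the server is running with defaults, e.g. in another terminal:
#   transformers serve
from openai import OpenAI

# The local server does not require a real API key (assumption).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="ServiceNow-AI/Apriel-1.5-15b-Thinker",
    messages=[{"role": "user", "content": "Explain, step by step, why 17 is prime."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```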
@NVIDIAAI
NVIDIA AI
2 months
๐Ÿ‘ Congratulations to @ServiceNowRSRCH on introducing Apriel-1.5-15B-Thinker โ€” a powerful new AI model that delivers frontier-level reasoning with a fraction of the compute. Weโ€™re proud that our Nemotron collection helped power its training .
@NVIDIAAIDev
NVIDIA AI Developer
2 months
🎊 Congratulations to @ServiceNowRSRCH on introducing Apriel-1.5-15B-Thinker – their 15B-parameter model that matches DeepSeek-R1-0528, Mistral-medium-1.2 and Gemini Flash 2.5 on the Artificial Analysis Index (AAI 52) – delivering comparable results at a fraction of the size (at
Replies: 5 · Reposts: 9 · Likes: 78
@turingcom
Turing
2 months
๐๐‘๐„๐€๐Š๐ˆ๐๐†: @ServiceNow released a 15B parameter AI model today. The model is the product of a partnership with Turing, which provided the training data. Breakdown below.
Replies: 5 · Reposts: 20 · Likes: 142
@ArtificialAnlys
Artificial Analysis
2 months
ServiceNow has released Apriel-v1.5-15B-Thinker, a 15B open weights reasoning model that leads our Small Models category (<40B parameters) 💼 Overview: Apriel-v1.5-15B-Thinker is a dense, 15B parameter open weights reasoning model. This is not the first model ServiceNow has
Replies: 19 · Reposts: 61 · Likes: 502
@ServiceNowRSRCH
ServiceNow AI Research
2 months
SLAM Labs presents Apriel-1.5-15B-Thinker 🚀 An open-weights multimodal reasoning model that hits frontier-level performance with just a fraction of the compute.
Replies: 15 · Reposts: 77 · Likes: 337
@f14bertolotti
Francesco Bertolotti
3 months
This is an interesting technical LLM report. This 15B model beats QwQ-32B while using far fewer tokens. Most interestingly, the authors heavily use model merging to combine the strengths of different checkpoints. 🔗 https://t.co/thoIqNEeBd
Replies: 5 · Reposts: 48 · Likes: 346
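As a rough illustration of the checkpoint-merging idea mentioned above, here is a generic weight-averaging ("model soup") sketch. It is not the recipe from the Apriel report; the checkpoint paths, uniform weights, and .pt format are hypothetical.

```python
# Sketch: uniform weight averaging of several checkpoints of the same model.
import torch

def merge_checkpoints(paths, weights=None):
    """Linearly combine the state dicts saved at `paths` (uniform by default)."""
    weights = weights or [1.0 / len(paths)] * len(paths)
    merged = None
    for path, w in zip(paths, weights):
        state = torch.load(path, map_location="cpu")
        if merged is None:
            merged = {k: w * v.float() for k, v in state.items()}
        else:
            for k, v in state.items():
                merged[k] += w * v.float()
    return merged

# Hypothetical usage: average three fine-tuning checkpoints into one model.
merged = merge_checkpoints(["ckpt_sft.pt", "ckpt_math.pt", "ckpt_code.pt"])
torch.save(merged, "merged.pt")
```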
@Vikas_NLP_UA
Vikas Yadav
6 months
🎉 Our work "Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs" is accepted at #ACLFindings2025 📎 https://t.co/7fKAnZQIBr
– Keep key layers high-precision, push others lower → compact LLMs w/ ~no accuracy loss
– Simple LIM & ZD scores rank layers
arxiv.org
We present a simple meta quantization approach that quantizes different layers of a large language model (LLM) at different bit levels, and is independent of the underlying quantization technique....
Replies: 1 · Reposts: 3 · Likes: 6
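A minimal sketch of the variable layerwise idea described above: rank layers by an importance score and keep only the top fraction at higher precision. The scores below are made-up stand-ins, not the paper's LIM or ZD scores, and the bit-widths are illustrative.

```python
# Sketch: assign per-layer bit-widths from importance scores (not the paper's
# exact method; scores and bit levels here are illustrative assumptions).
def assign_bit_widths(layer_scores, high_bits=8, low_bits=4, keep_frac=0.25):
    """Give the highest-scoring fraction of layers more bits."""
    ranked = sorted(layer_scores, key=layer_scores.get, reverse=True)
    n_high = max(1, int(len(ranked) * keep_frac))
    high = set(ranked[:n_high])
    return {name: (high_bits if name in high else low_bits) for name in layer_scores}

# Hypothetical importance scores for an 8-layer model.
scores = {f"layers.{i}": s for i, s in enumerate([0.9, 0.2, 0.8, 0.1, 0.3, 0.7, 0.4, 0.6])}
print(assign_bit_widths(scores))
# -> layers.0 and layers.2 stay at 8 bits; the rest drop to 4 bits.
```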
@tscholak
Torsten Scholak
7 months
🚨🤯 Today Jensen Huang announced SLAM Lab's newest model on the @HelloKnowledge stage: Apriel-Nemotron-15B-Thinker 🚨 A lean, mean reasoning machine punching way above its weight class 👊 Built by SLAM × NVIDIA. Smaller models, bigger impact. 🧵👇
Replies: 2 · Reposts: 22 · Likes: 47
@tscholak
Torsten Scholak
8 months
🚨 SLAM Labs presents Apriel-5B! And it lands right in the green zone 🚨 Speed ⚡ + Accuracy 📈 + Efficiency 💸 This model punches above its weight, beating bigger LLMs while training on a fraction of the compute. Built with Fast-LLM, our in-house training stack. 🧵👇
Replies: 5 · Reposts: 49 · Likes: 134
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
1 year
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks abs: https://t.co/l6wHdrGAt5 project page: https://t.co/55UGlS3FLQ BigDocs-7.5M is a high-quality, open-access dataset comprising 7.5 million multimodal documents across
Replies: 2 · Reposts: 28 · Likes: 142
@PerouzT
Perouz Taslakian
1 year
🌟🌟🌟 We just released BigDocs: An Open Multimodal Dataset – our latest work on scaling document understanding across diverse data types! 📄 👉 Dive into the details: https://t.co/KfOKZKARDS 🧠 or come see us at the #NeurIPS2024 RBFM workshop! #AI @ServiceNowRSRCH #bigdocs
Replies: 0 · Reposts: 15 · Likes: 17
@vaibhav_adlakha
Vaibhav Adlakha
2 years
We introduce LLM2Vec, a simple approach to transform any decoder-only LLM into a text encoder. We achieve SOTA performance on MTEB in the unsupervised and supervised categories (among the models trained only on publicly available data). 🧵 1/N Paper: https://t.co/1ARXK1SWwR
Replies: 13 · Reposts: 165 · Likes: 874
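To make the "decoder-only LLM as text encoder" idea concrete, a minimal pooling sketch follows. It shows only the embedding-extraction step; LLM2Vec's actual recipe also enables bidirectional attention, trains with masked next-token prediction, and adds unsupervised contrastive learning. The model id and mean pooling below are assumptions for illustration.

```python
# Sketch: pool a decoder-only LM's hidden states into sentence embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # gpt2 has no pad token by default
model = AutoModel.from_pretrained("gpt2")

texts = ["model merging combines checkpoints", "curricula order training examples"]
batch = tok(texts, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state       # (batch, seq_len, dim)

mask = batch["attention_mask"].unsqueeze(-1)        # zero out padding positions
emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean pooling over tokens
emb = torch.nn.functional.normalize(emb, dim=-1)
print(emb @ emb.T)                                  # cosine similarity matrix
```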
@Vikas_NLP_UA
Vikas Yadav
2 years
📢📢 Excited to share our new work 🍛 CurryDPO 1/2
🔴 Systematically curates multiple preference pairs and trains on them in a curriculum learning setup with the DPO framework
🔴 Achieves notable performance gains over the vanilla DPO method on MT-Bench, Vicuna, WizardLM, and UltraFeedback
Replies: 1 · Reposts: 12 · Likes: 19
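A minimal sketch of the curriculum idea in the tweet: order preference pairs from easy to hard before DPO training. The difficulty proxy (the gap between chosen and rejected judge scores) and the data fields are assumptions for illustration; the loss shown is the standard per-pair DPO objective, not CurryDPO's exact curation pipeline.

```python
# Sketch: curriculum ordering of preference pairs for DPO, easy -> hard.
import math

# Hypothetical preference pairs with judge scores (fields are illustrative).
pairs = [
    {"prompt": "p1", "chosen": "a", "rejected": "b", "s_chosen": 9.0, "s_rejected": 2.0},
    {"prompt": "p2", "chosen": "c", "rejected": "d", "s_chosen": 7.0, "s_rejected": 6.5},
    {"prompt": "p3", "chosen": "e", "rejected": "f", "s_chosen": 8.0, "s_rejected": 4.0},
]

# A larger score gap means a clearer preference, treated here as "easier";
# the curriculum trains on the clearest pairs first.
curriculum = sorted(pairs, key=lambda p: p["s_chosen"] - p["s_rejected"], reverse=True)

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one pair: -log sigmoid(beta * margin)."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

print([p["prompt"] for p in curriculum])               # ['p1', 'p3', 'p2']
print(round(dpo_loss(-10.0, -14.0, -11.0, -12.0), 3))  # ~0.554
```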