
DeepSky (@DeepSkyAI)
Followers: 5K · Following: 329 · Media: 41 · Statuses: 104
RT @S_Rashi: .@DeepSkyAI voted #1 project on the awesome @Peerlist community!! Thanks @designerdada and @hey_yogini - what you've built is….
RT @1337u53r: Some ideas for Opendoor with help from @DeepSkyAI. cc: @rabois @ericjackson @ihat @shrisha $OPEN @APompliano. 1. The Integr….
RT @Peerlist: 🥁 Shoutout to our top 3 projects of the week (wk 33). 🥇 @DeepSkyAI is a super agent for business professionals, focused on de….
RT @chrisatdeepsky: It’s built specifically for business operators and investors. You also get access to Crunchbase, market data, SEC filin….
RT @chrisatdeepsky: Swapped our browser agent's web browser from @browserbasehq to @onkernel and startup latency went from ~(30 seconds - nev….
RT @Karp_God: Wow— deepsky.ai seems awesome. I'd been using Perplexity for financial research previously, but DeepSky seem….
Gradient's Chief Scientist, @LeoPekelis, recently sat down with @VentureBeat to discuss how we developed an open-source @AIatMeta Llama-3 model with a context length of 1M+ tokens 🙌. As always, a huge thank you to @CrusoeAI for sponsoring the compute!
venturebeat.com
AI startup Gradient and cloud platform Crusoe teamed up to extend the context window of Meta's Llama 3 models to 1 million tokens.
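For the technically curious: context extensions like this are typically done by raising the RoPE base frequency (rope_theta) and then continuing training on long sequences. Here is a minimal sketch using Hugging Face transformers; the theta and window values are illustrative assumptions, not Gradient's published recipe:

```python
# Sketch: long-context extension via RoPE theta scaling.
# Values below are illustrative, not Gradient's exact configuration.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B")
config.rope_theta = 2_000_000.0             # raise the RoPE base frequency
config.max_position_embeddings = 1_048_576  # target a ~1M-token window

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", config=config
)
# The model still needs continued training on long sequences
# before the extended window is actually usable.
```

Raising theta slows the rotary rotation per position, so tokens that are far apart remain distinguishable to attention at much longer ranges.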
Take a look at why we had to implement a new pipeline for synthetic long-context data generation to overcome challenges in training our 1M+ @AIatMeta Llama-3 context models 🧐
🔗
✅ Shifting from Pre- to Post-Training
✅ Why Synthetic Data is
The major challenge in training long-context models isn't just improving the attention implementation; it's also the scarcity of adequate long-context training data. To train our @AIatMeta Llama-3 long-context models, we developed a synthetic data generation pipeline that
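As a rough illustration of what such a pipeline can look like (a hypothetical sketch, not Gradient's actual implementation), one common recipe stitches short documents into a long haystack and plants a fact the model must later answer a question about. All function and field names here are made up for illustration:

```python
# Sketch: synthetic long-context training sample generation.
# Concatenate short documents to a target length, then insert a
# "needle" fact whose retrieval the training objective rewards.
import random

def make_long_context_sample(docs: list[str],
                             needles: list[tuple[str, str]],
                             target_tokens: int = 1_000_000,
                             tokens_per_char: float = 0.25) -> dict:
    """Build one (context, question, answer) triple of roughly target_tokens."""
    target_chars = int(target_tokens / tokens_per_char)
    chunks, length = [], 0
    while length < target_chars:
        chunks.append(random.choice(docs))
        length += len(chunks[-1])
    # Plant the needle fact at a random depth in the haystack.
    question, answer = random.choice(needles)
    chunks.insert(random.randrange(len(chunks)), f"Note: {answer}")
    return {
        "context": "\n\n".join(chunks),
        "question": question,
        "answer": answer,
    }
```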
Our Chief Scientist, @LeoPekelis, sat down with @MatthewBerman for a fireside chat 🔥 on our 1M-context-length @AIatMeta Llama-3 models:
🔗
✅ How We Did It
✅ Long Context Use Cases
✅ Challenges w/ Development
✅ Performance & Evals (s/o @GregKamradt)
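The evals segment nods to @GregKamradt's "Needle in a Haystack" test. A minimal sketch of the idea, assuming a generic generate(prompt) inference callable; everything here is illustrative rather than Gradient's actual eval harness:

```python
# Sketch: Needle-in-a-Haystack-style long-context retrieval check.
NEEDLE = "The best thing to do in San Francisco is eat a sandwich in Dolores Park."
QUESTION = "What is the best thing to do in San Francisco?"

def needle_in_haystack(generate, filler_text: str,
                       depth_pct: float, context_len: int) -> bool:
    """Bury NEEDLE at depth_pct of a context_len-char haystack and
    check whether the model retrieves it."""
    haystack = (filler_text * (context_len // len(filler_text) + 1))[:context_len]
    pos = int(len(haystack) * depth_pct)
    prompt = haystack[:pos] + NEEDLE + haystack[pos:] + "\n\n" + QUESTION
    return "Dolores Park" in generate(prompt)
```

Sweeping depth_pct and context_len and plotting the pass/fail grid is what produces the familiar green-and-red heatmaps from this eval.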
Excited that the RULER benchmark now includes Gradient's 1M-context-length @AIatMeta Llama-3 70B, placing it 4th overall! 🙌 Take a look at some of our deep dives into long context to learn more:
✅ RULER vs. 1M Llama3 70B:
✅ RULER
What an episode! Join our Co-Founder & Chief Architect, @markatgradient, as he breaks down the intricacies of scaling Llama3 beyond a 1M context window on the latest episode of @latentspacepod!
🔗
✅ History of Long Context
✅ Intro to RoPE, ALiBi, and
latent.space
Scaling Llama3 beyond 1M context window with ~perfect utilization, the difference between ALiBi and RoPE, how to use GPT-4 to create synthetic data for your context extension finetunes, and more!
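For background on the RoPE vs. ALiBi comparison discussed in the episode, here is an illustrative sketch of both schemes; the slope formula and theta default are textbook values, not taken from the episode:

```python
# Sketch: the two position-encoding schemes side by side.
import torch

def rope_angles(positions: torch.Tensor, head_dim: int,
                theta: float = 500_000.0) -> torch.Tensor:
    """RoPE: each position maps to rotation angles applied to Q/K pairs.
    Raising theta slows the rotation, which is how windows get extended."""
    inv_freq = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
    return torch.outer(positions.float(), inv_freq)  # (seq_len, head_dim // 2)

def alibi_bias(seq_len: int, num_heads: int) -> torch.Tensor:
    """ALiBi: no rotation; a fixed linear penalty added to attention
    scores that grows with query-key distance, one slope per head."""
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads)
                           for h in range(num_heads)])
    distance = torch.arange(seq_len)[None, :] - torch.arange(seq_len)[:, None]
    return slopes[:, None, None] * distance.clamp(max=0)[None, :, :].float()
```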
We're excited about how our 1M-context-length @AIatMeta Llama-3 70B performed against RULER. Dive into the results or check out how we made this possible (e.g., scaling rotary position embeddings).
✅ RULER Results:
✅ Made Possible By: