Toloka

@TolokaAI

Followers
22,321
Following
56
Media
32
Statuses
411

Your high quality data partner for AI development

Amsterdam, Netherlands
Joined November 2020
@TolokaAI
Toloka
2 years
How do you get AI art generators to produce amazing images that look like real art? Take a text-guided diffusion model and feed it the ideal text prompt with the right keywords 😎 Take a peek at our favorite images, then check out this paper:
95
123
946
@TolokaAI
Toloka
2 years
How do voice assistants get the words they say? What makes AVs truly autonomous? At the end of the day, does AI really need human involvement? To find the answers, tune in to the first episode of The ML Pond podcast:
48
48
431
@TolokaAI
Toloka
2 years
Get your exclusive copy of the just-released ML Value Chain Landscape by @TheSequenceAI, shaped by your ML peers! Follow the link to download it and receive some extra materials from Toloka to help guide you on your ML journey:
28
34
344
@TolokaAI
Toloka
2 years
Toloka is launching the Data Harvest hackathon! We give you $20 and just one task – collect the largest dataset possible for that amount. Join today to have fun and get some cool prizes:
2
15
155
@TolokaAI
Toloka
3 years
Competitor pricing information is a strategic weapon in e-commerce. To sharpen its pricing strategy, @yandexmarket turned to Toloka for help matching items to competitor products. Check out the results:
3
17
105
@TolokaAI
Toloka
1 year
We recently teamed up with @huggingface and @ServiceNowRSRCH to power their @BigCodeProject. Facts: 12K code chunks, 14 categories of data, 1399 Tolokers and 4349 hours of work in 4 days! In this post, we share the what, the why, and how we made it happen:
9
12
103
@TolokaAI
Toloka
3 years
Planning to be at @WSDMsocial next week? Be sure to drop by our talk on Challenges in Data Production for AI with Human-in-the-Loop! More details:
7
10
93
@TolokaAI
Toloka
3 years
5 ways to build your business on data - from finding the right location for your store to content creation. Read here:
2
7
61
@TolokaAI
Toloka
3 years
Meet Ask The Crowd, an exciting new UX tool powered by Toloka:
2
7
54
@TolokaAI
Toloka
3 years
Is there a painless way to digitize handwritten archives in record time and under budget? We've got an answer. Check it out:
2
2
34
@TolokaAI
Toloka
3 years
Get the full guide to learn about the pros and cons of different approaches to data labeling and explore real-life cases.
1
3
33
@TolokaAI
Toloka
3 years
How to architect a human-in-the-loop system and build a better, more ethical stack with @pachyderminc and Toloka. Read here:
3
4
31
@TolokaAI
Toloka
2 years
Calling all ML engineers and data scientists — we need your input! Tell us what you want to hear about, and we’ll share everything we know in one of our next articles. So what's on your radar? Vote now and we promise to deliver!
Face blur on edge devices
52
DL quality metrics
35
Automated quality control
39
Task distribution
30
1
1
30
@TolokaAI
Toloka
11 months
There’s a recent trend our researchers couldn’t resist: playing chess with ChatGPT, just for fun 😊 But because we’re Toloka, we added a twist and let a crowd of enthusiastic Tolokers play collectively for the human side. Who won? Find out here:
5
2
23
@TolokaAI
Toloka
1 year
We're kicking off a series of posts on building an LLM (Large Language Model) from scratch. First question: where do you begin? Check out this article to find out:
1
0
14
@TolokaAI
Toloka
1 year
Calling all LLM Enthusiasts: Let's Talk Hallucinations! 🌌 When it comes to these Language Models, perfection might be a stretch. So what's your magic number for tolerable hallucinations? ✨
15-20% (currently GPT4)
107
Less than 10%
99
I need rate under 5%
101
0, no wild fantasies
177
0
2
11
@TolokaAI
Toloka
2 years
Wow, what a year! Thank you for growing with us as we build better technologies for ML and a stronger community. Some highlights: From all of us at Toloka, we hope your 2022 was as exciting as ours. Happy holidays! We’ll see you next year 🙌
3
3
13
@TolokaAI
Toloka
3 years
We’re excited to announce the launch of our e-commerce initiative. Toloka has always provided robust data labeling solutions for e-commerce, and now we’ve pulled them all together here:
1
3
12
@TolokaAI
Toloka
11 months
You may have heard that LLMs are faster, cheaper, and better than humans at text annotation. Does this mean we no longer need human data labeling? Read this article to find out:
2
0
11
@TolokaAI
Toloka
1 year
Exciting news! As part of @icmlconf 2023, our team will co-host a tutorial on Reinforcement Learning from Human Feedback with @natolambert from @huggingface. We'll share our experience using RLHF for training, optimizing, and evaluating LLMs. Join us:
0
1
10
@TolokaAI
Toloka
2 years
@saiphcita's research, conducted with Toloka, Northeastern University, and UNAM, was selected by UNESCO as one of the most impactful in AI globally! Learn more:
0
34
5
@TolokaAI
Toloka
10 months
Meet Toloka's LLM Evaluation solution: a bridge between your business and technology. We offer the most extensive range of quality metrics and customized evaluation pipelines to get deep insights into model performance in your business context. Learn more:
3
0
9
@TolokaAI
Toloka
2 years
The results are in for our WSDM Cup Challenge with $6000 in prizes! This was an exciting event with 200 teams registered, and the leaders were neck and neck. We'll hear how they did it when they present at @WSDMSocial. Learn more:
1
0
8
@TolokaAI
Toloka
1 year
We’ve built one of the largest and most diverse data labeling crowds on the planet. But who are these people we call Tolokers, the crowd contributors behind the usernames? In this post, we offer a glimpse into what our global community looks like in 2023:
1
2
9
@TolokaAI
Toloka
2 years
We’re excited to announce that Toloka has released the beta version of our new ML Platform, designed to deliver custom ML models in just a few clicks. Learn more and give it a try:
3
3
7
@TolokaAI
Toloka
2 years
📌 Where did you get the image descriptions? We took image descriptions from and Reddit, used a genetic algorithm to pick the next set of keywords, and appended them to descriptions.
1
3
8
@TolokaAI
Toloka
2 years
📌 Which keywords scored highest? The highest scoring keywords were: cinematic, colorful background, concept art, dramatic lighting, high detail, highly detailed, hyper realistic, intricate, intricate sharp details, octane render, smooth, studio lighting, trending on artstation
0
0
8
@TolokaAI
Toloka
2 years
Great read by Philipp Chapkovski on conducting interactive experiments on Toloka:
0
1
8
@TolokaAI
Toloka
3 months
🔍 Looking for ML/GenAI pros! Join our research study if you need data or human signals for model dev or quality control, or manage data annotation teams. 💰 $100 Amazon gift card ⏰ 45-min online interview Interested? Fill in the form:
7
0
6
@TolokaAI
Toloka
2 years
📌 How much data did it take? We collected a total of 597,830 opinions from 12,724 people on Toloka.
1
0
5
@TolokaAI
Toloka
7 months
Check out Evgeniya Sukhodolskaya's latest piece for @TDataScience to dive into the latest advancements of #gpt4v as well as the challenges it still faces when it comes to spatial reasoning. 🤖
0
2
5
@TolokaAI
Toloka
7 months
#AAAI24 attendees 📢 Stop by booth 103 to say hello 👋 and learn more about how Toloka is helping companies build accurate and reliable AI with the highest quality data 🚀
6
0
6
@TolokaAI
Toloka
1 year
In case you missed our talk at @DISummit2030, we've got you covered. Check out this video to learn how data science can be augmented with effective crowdsourcing to make AI solutions more relevant and adaptive:
2
1
4
@TolokaAI
Toloka
1 year
In the #AI world, there's a never-ending season of #LLMs. Have you recently worked on evaluating LLMs? Share your approach!
Human evaluation
121
LLM-based: GPT-4/Claude
174
Automated: ROUGE/BLEU
67
Other
109
1
0
7
@TolokaAI
Toloka
3 years
This Wednesday, February 2. Come join us at the webinar on Data Labeling For Search Relevance Evaluation:
0
1
6
@TolokaAI
Toloka
3 years
We recently partnered with @TheSequenceAI and conducted a web survey that asked ML engineers, data scientists, and AI aficionados alike to answer questions about their data labeling habits and techniques. The results led to surprising conclusions:
0
0
6
@TolokaAI
Toloka
2 years
CALL FOR PAPERS! If you're a researcher who is currently exploring novel approaches for crowd-computer interaction or any other related areas, we would love to have you present your work at our workshop at @WSDMSocial. Submit your paper here:
0
3
6
@TolokaAI
Toloka
2 years
Curious? Here’s a mini FAQ: 📌 How did you evaluate quality? We ran a genetic optimization to find the best keywords for #stablediffusion 1.4. We asked real humans (Tolokers) to evaluate the quality of images generated with different keywords.
0
0
6
@TolokaAI
Toloka
1 year
Toloka Fest was a blast! Last week we gathered our whole team in Belgrade to see each other face to face, brainstorm ideas and have fun! Check out the thread for some fun facts and get a glimpse of the people behind Toloka 🙂
3
0
5
@TolokaAI
Toloka
3 years
Our team was thrilled to participate as part of the Data, AI and Cybersecurity track at #HUBDAY @HUBInstitute last week. If you missed it, no worries — you can still watch the recording of our talk:
0
2
5
@TolokaAI
Toloka
10 months
Bias, hallucinations, privacy issues, data moats, and practical solutions for these challenges that surround LLMs, all in one article written by @MagdalenaKonki1 for @hackernoon. Check it out:
2
2
5
@TolokaAI
Toloka
1 year
Researchers and beyond: if you’re searching for a real-world graph dataset for experimentation, search no more. Check out our new Toloker Graph dataset which is included in PyTorch-Geometric (PyG). Get started now:
0
1
4
@TolokaAI
Toloka
11 months
📊 Here are the results of our recent poll on copyrighted content used in #LLM training! The majority opposes it, while the rest are divided between "yes" and "yes, with source disclosure." Whatever your needs, we handle copyrighted text with utmost care when crafting LLMs 🛡️
@TolokaAI
Toloka
1 year
#LLMs are usually trained on massive amounts of data that are publicly available. However, there are claims that some LLMs have been trained on copyrighted content without authors' permission. What is your stance on this: should copyrighted material be used in LLM training?
6
2
5
2
1
5
@TolokaAI
Toloka
1 year
Facts: - 225 people coming from 41 cities! - 5 days of team building in lovely Belgrade - 2 parties and 103838494 steps on the dance floor 😄 - Tons of great memories and countless ideas for making Toloka even better!
0
0
2
@TolokaAI
Toloka
1 year
#LLMs are usually trained on massive amounts of data that are publicly available. However, there are claims that some LLMs have been trained on copyrighted content without authors' permission. What is your stance on this: should copyrighted material be used in LLM training?
Absolutely no
154
Yes, no problem at all
72
Yes, but disclose sources
79
6
2
5
@TolokaAI
Toloka
2 years
A hungry AI model should eat healthy data! Get top nutrition advice from our expert Phillipa Spassova :) Whether you’re looking for human-powered data, not sure where to get the right data, or simply want to talk all things AI - drop her a line here:
0
2
3
@TolokaAI
Toloka
2 years
Toloka has just passed the ISO 27701 certification! This covers how we collect, handle, store and destroy personally identifiable information. On top of that, we've also renewed ISO 27001 certification. Read more on our blog:
0
0
4
@TolokaAI
Toloka
1 year
Toloka Fest 2023 is over but not forgotten, with tons of memories, ideas to bring to life, and sore feet from all those steps on the dance floor 😁 Watch the fun video recap:
0
0
4
@TolokaAI
Toloka
3 years
How to Improve Model Accuracy with Crowdsourced Data Labeling – Real World Use Cases. @TheSequenceAI, thanks for this piece!
0
0
4
@TolokaAI
Toloka
10 months
Are you at @NeurIPSConf 2023? You can find us at Booth 702! Just as promised, we're available all day every day to discuss our new LLM Evaluation solution and beyond. Swing by to catch up and meet these friendly faces 😊
0
0
4
@TolokaAI
Toloka
3 years
We’re excited to be part of the vibrant Austin startup community and sponsor the AI & Data Science Track at @AtxStartupWeek. Learn more:
0
2
4
@TolokaAI
Toloka
3 years
Toloka's CEO, Olga Megorskaya, chats with The Sequence, a newsletter read by 100k+ ML/AI practitioners, about the balance between fully automated, crowdsourced or hybrid approaches to data labeling. #ai #data #ml
0
0
3
@TolokaAI
Toloka
1 year
In this article for @towards_AI, we compared crowdsourcing with GPT-models using two datasets. Check it out, the results might surprise you:
1
2
3
@TolokaAI
Toloka
2 years
The Toloka Visual Question Answering Challenge is now LIVE! Brought to you by our research team and supported by @WSDMSocial, this 4-month competition has a $6,000 prize pool and an opportunity to present your solution at WSDM 2023 in Singapore. Tune in:
1
2
4
@TolokaAI
Toloka
3 years
In this 40-minute webinar, we take a deep dive into the data processing pipeline required for cars to learn how to behave autonomously on the road. Get the recording here:
2
0
4
@TolokaAI
Toloka
9 months
2023 was a whirlwind year of transformation with GenAI. What's next? Join us in a special 2023 recap and see our predictions for 2024: P.S. Happy holidays from all of us at Toloka, and thank you for being part of our journey! 🥂
2
0
4
@TolokaAI
Toloka
1 year
We’re doing lots of new things with generative AI — like building data annotation pipelines that integrate LLMs and human input. In this post, we dig into the human part of the equation. Just who are the human experts collaborating with LLMs? Read now:
0
0
4
@TolokaAI
Toloka
3 years
Join our Head of Toloka, Olga Megorskaya, at #GTC21 on April 14 in a discussion about the pipeline of speech recognition, including GPU usage and the data annotation process. Register now:
1
4
4
@TolokaAI
Toloka
4 years
We have a full day of presentations at #NeurIPS2020 by leading scientists & researchers from around the globe. Start with @laroyo at 8:15 AM PT. Don't miss a panel discussion on the future and ethics of crowdsourcing beginning at 11 AM PT. Full schedule:
0
2
4
@TolokaAI
Toloka
2 years
@VozdoPovo3 @TheSequenceAI Hello! Please clarify your question so that we can help you.
1
0
3
@TolokaAI
Toloka
10 months
When @eightifyapp wanted to compare GPT-3.5 and GPT-4 performance for their YouTube video summarization app, they turned to Toloka to develop the quality metrics and run in-depth evaluations with expert annotators. Learn about the results in our blog:
0
0
4
@TolokaAI
Toloka
3 years
Mohamed Amgad, a Pre-doctoral Fellow at Northwestern University, will give a talk titled “NuCLS: A Scalable Crowdsourcing Approach.” Using a recent dataset of breast cancer cells annotated by students, Mohamed will offer a crowdsourcing-based solution.
0
0
4
@TolokaAI
Toloka
2 years
📌 How did you score keywords? We asked annotators to compare images generated with different keywords. This gave us a score for each set of images, which we used for the genetic algorithm. Surprisingly, the most popular keywords did not result in the best-looking images.
0
0
4
@TolokaAI
Toloka
2 years
Save your seat!
0
0
3
@TolokaAI
Toloka
2 years
Great news: you can now leave it entirely up to us to generate control tasks for you! Just upload your project data, request control tasks, and wait for them to appear in your pool. Learn more and give it a try:
0
2
3
@TolokaAI
Toloka
3 years
Post your job opportunities with us! We now connect our graduates with employers around the globe. It’s a win-win for both sides. And for us – another way to give back to the global AI community. Learn more here:
1
2
3
@TolokaAI
Toloka
3 years
In this article, Valentin Biryukov, our Head of R&D, explained how to solve one of the most troublesome tasks in NLP — question answering. Check it out:
0
0
3