OpenDataLab

@OpenDataLab_AI

Followers
156
Following
21
Media
24
Statuses
42

Joined February 2023
@OpenDataLab_AI
OpenDataLab
2 months
Based on the classification system, we divided #WanJuanSiLu into 7 major categories covering a wide range of content characteristic of each language's geographic region, such as #history, #politics, #culture, #shopping, and #encyclopedic knowledge.
0
0
0
@OpenDataLab_AI
OpenDataLab
10 days
Beyond basic reasoning, REST specifically evaluates several under-tested capabilities: contextual priority allocation 🗂️, cross-problem interference resistance ⚖️, and dynamic cognitive load management⚙️. Paper link:
arxiv.org
Recent Large Reasoning Models (LRMs) have achieved remarkable progress on task-specific benchmarks, yet their evaluation methods remain constrained by isolated problem-solving paradigms. Existing...
0
0
0
@OpenDataLab_AI
OpenDataLab
10 days
REST (Reasoning Evaluation through Simultaneous Testing) is a stress-testing framework that concurrently exposes #LRMs to multiple problems. #REST transforms existing benchmarks so that multiple questions are evaluated at once, repurposing them into more challenging variants. #AI
1
0
1
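A minimal sketch of the idea in the post above, assuming the simplest possible transformation: single-question benchmark items are grouped so that each prompt carries several questions at once. The function name `bundle_questions`, the parameter `k`, and the prompt wording are hypothetical and for illustration only; this is not the REST authors' code.

```python
from typing import List


def bundle_questions(questions: List[str], k: int) -> List[str]:
    """Group benchmark questions into prompts of up to k questions each.

    Hypothetical helper: the instruction text and numbering scheme are
    assumptions, not taken from the REST paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    prompts = []
    for start in range(0, len(questions), k):
        group = questions[start:start + k]
        # Number the questions inside each prompt so answers can be matched back.
        numbered = "\n".join(f"Q{i}: {q}" for i, q in enumerate(group, start=1))
        prompts.append(
            "Answer every question below, labeling each answer by its number.\n"
            + numbered
        )
    return prompts
```

Raising `k` puts more questions in each prompt, which is one plausible way to raise the stress level on the model.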
@OpenDataLab_AI
OpenDataLab
13 days
#MathFusion is a novel framework that enhances mathematical reasoning through cross-problem instruction synthesis. 🦾Experimental results demonstrate that it achieves substantial improvements in mathematical reasoning while maintaining high data efficiency. #AI #Datasets
0
0
0
@OpenDataLab_AI
OpenDataLab
19 days
Vis3 is a visualization tool for #LLM and machine learning data that supports cloud storage platforms using the S3 protocol and a variety of data formats. It offers interactive visualization through JSON, HTML, Markdown, and image views for efficient #data analysis.
0
0
0
@OpenDataLab_AI
OpenDataLab
1 month
@CherryStudioHQ @cursor_ai This architecture enables any #AI tool that supports the MCP protocol to easily integrate and leverage MinerU's document processing capabilities. 📲
0
0
0
@OpenDataLab_AI
OpenDataLab
1 month
🔍 MinerU MCP Server source code released! It accepts commands from clients that support the #MCP protocol (e.g., @CherryStudioHQ, @cursor_ai), invokes the MinerU API for the actual conversion, and returns the results to the clients. 🚀 Get the MinerU MCP Server source code from:
1
0
0
@OpenDataLab_AI
OpenDataLab
2 months
#MinerU has officially partnered with @CherryStudioHQ. You can call MinerU directly from within Cherry Studio. MinerU provides each Cherry Studio user with a document processing quota of up to 500 pages per day.
1
0
6
@OpenDataLab_AI
OpenDataLab
2 months
#OmniDocBench has been accepted by #CVPR 2025! OmniDocBench is a benchmark for evaluating diverse document parsing in real-world scenarios. 🤓We conducted an evaluation of current mainstream PDF parsing tools using OmniDocBench, and the results are as follows.
Tweet media one
Tweet media two
1
1
0
@OpenDataLab_AI
OpenDataLab
2 months
#LabelLLM is an open-source platform dedicated to optimizing the #data annotation process integral to #LLM development. Here are the key features of LabelLLM. Try it 👉
Tweet media one
0
0
1
@OpenDataLab_AI
OpenDataLab
2 months
#MinerU leverages the sophisticated PDF-Extract-Kit models to extract content from diverse documents effectively and ensure the accuracy of the final results. At its core, MinerU is committed to supporting the parsing of #mathematical and extended formulas.
0
0
0
@OpenDataLab_AI
OpenDataLab
2 months
RT @AndrewYNg: Agentic Document Extraction just got much faster! From previous 135sec median processing time down to 8sec. Extracts not jus….
0
609
0
@OpenDataLab_AI
OpenDataLab
2 months
The open-source dataset WanJuanSiLu is designed to provide high-quality training corpora for low-resource languages, advancing the research and development of multilingual models. WanJuanSiLu mainly consists of eight subsets, including Thai, Russian, Arabic, Korean, and Hungarian.
0
0
0
@OpenDataLab_AI
OpenDataLab
3 months
We are very pleased to learn that one of our users has just launched a website about #MinerU! The website covers deployed open-source solutions for data processing, tutorials, shared usage experiences, and more. Join the community:
0
0
0
@OpenDataLab_AI
OpenDataLab
3 months
Document content analysis has been a crucial research area in computer vision. We present #MinerU, an open-source solution for high-precision document content extraction. Deep dive into MinerU via the technical report:
0
0
1
@OpenDataLab_AI
OpenDataLab
3 months
MinerU - Dify Marketplace.
0
0
0
@OpenDataLab_AI
OpenDataLab
3 months
The MinerU Dify Plugin has launched on the Dify Marketplace. The plugin was jointly developed by MinerU and @dify_ai. You can now use it to set up workflows on Dify that efficiently parse complex document data for any downstream LLM use case.
1
0
1
@OpenDataLab_AI
OpenDataLab
3 months
Are you looking for a tool to help you label #data? Try #LabelU, a flexible labeling tool that supports #CV, voice interaction, and #AI-assisted labeling. 👉
0
0
1
@OpenDataLab_AI
OpenDataLab
3 months
What is your ideal data processing #tool? Get #MinerU as a professional assistant to help you prepare #AI-READY #data. Explore MinerU's core functions and pick the ones you need!
1
1
1