 
            
              Rob Tang
            
            @XiangruTang
Followers
                1K
              Following
                1K
              Media
                110
              Statuses
                867
              Final-year CS PhD student @Yale. ex Research intern @google. This account is for academic purposes.
              
              New York City
            
            
              
              Joined March 2019
            
            
           🧬✨Excited to introduce CellForge: Agentic Design of Virtual Cell Models - the first fully autonomous AI system for single-cell perturbation modeling! 🌟 This is what the future of computational biology looks like - AI scientists designing AI models! 🚀 What makes it special: 
          
                
                2
              
              
                
                9
              
              
                
                229
              
             Posting my talk tomorrow on AI in Biomedicine at @EinsteinMed (hosted by @zhengdy)  https://t.co/phc9Joe0kO  Using AI for brain genomics & thinking about how it can do this research autonomously. Lots of new slides on AI coding & risks from @XiangruTang . 
          
                
                0
              
              
                
                1
              
              
                
                8
              
             This past summer, I had the incredible opportunity to intern at @Google🌟 Grateful to my PhD advisor @MarkGerstein for supporting me, and it was such a joy to host his visit at the NYC office — great chats with @taotu831 and @ank_parikh too! Though we didn’t get a photo, I’m 
          
                
                0
              
              
                
                1
              
              
                
                21
              
             9/9 Key innovations: ✓ Monitor-based RAG (implicit knowledge, no interruption) ✓ HSR (hierarchical refinement, not voting) ✓ QAIR (quality-aware iteration) Open-source  https://t.co/5aLukKnq1Q  w/@Eigen_AI_Labs @9LdROhjZE56jSh9 @suhui03 @hanrui_w @kingsquare2
          
          
                
                0
              
              
                
                0
              
              
                
                0
              
             8/9 We quantify the "tool tax" precisely: Traditional Explicit RAG: Accuracy increases BUT tokens explode (moves right on graph) Eigen-1: Accuracy increases AND tokens drop 53.5% (upper left quadrant!) Not a tradeoff—complete dominance. 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             7/9 Fascinating discovery: IR tasks (339 samples, slope 0.369): Solution diversity helps—different perspectives find complementary information. Reasoning tasks (392 samples, slope 0.851): Consensus matters—when multiple reasoning paths agree, they're likely correct. 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             6/9 Error analysis reveals stunning overlap! 92.78% involve reasoning process errors 88.66% involve knowledge application errors Only 13.40% execution/adherence errors 9.28% comprehension errors Insight: The challenge isn't knowledge OR reasoning—it's seamlessly integrating both 
          
                
                1
              
              
                
                0
              
              
                
                0
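The percentages in 6/9 add up to far more than 100%, which is why the error categories must overlap. A minimal sketch of that arithmetic, assuming every percentage is a share of the same set of analyzed errors (the thread does not state the denominator). The function name `min_overlap` is hypothetical, chosen only for illustration.

```python
def min_overlap(p_a: float, p_b: float) -> float:
    """Lower bound (in percent) on items falling in both categories.

    By inclusion-exclusion, |A and B| >= |A| + |B| - 100 when A and B
    are percentages of the same population.
    """
    return max(0.0, p_a + p_b - 100.0)


reasoning = 92.78   # % of errors involving the reasoning process (from 6/9)
knowledge = 88.66   # % of errors involving knowledge application (from 6/9)

overlap = min_overlap(reasoning, knowledge)
print(f"At least {overlap:.2f}% of errors involve both")  # At least 81.44% of errors involve both
```

So under that assumption, at least roughly four in five analyzed errors combine a reasoning problem with a knowledge problem, which is consistent with the thread's point about integrating both.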
              
             5/9 RAG backends within Monitor-based RAG Tested: Vanilla, Vanna, HippoRAG, LightRAG Winner: HippoRAG achieves ~41% accuracy Why? Its graph-structured indexing captures fine-grained context fragments without overwhelming the reasoning stream. 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             4/9 HSR example: Anchor picks ResNet, but references provide 4 targeted fixes: Numeric: Option A actually faster Method: Classification models can't count multiple objects Logic: Time isn't the only criterion Clarity: Option D is correct No democratic averaging—smart repair! 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             3/9 An example: Monitor detects uncertainty about "maximum recombination-induced change-points" → Querier searches "single crossover meiosis" → RAG retrieves "at most one breakpoint per gamete" → Injector adds facts seamlessly → Model excludes 4-change-point cases → answer 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             2/9 (a) Monitor-based RAG: Monitor scans reasoning stream → Querier generates queries → Injector integrates knowledge (b) Workflow: Proposer generates solutions → Corrector applies fixes → HSR enables cross-solution refinement → QAIR adapts to quality → Ranker selects best 
          
                
                1
              
              
                
                0
              
              
                
                0
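The Monitor → Querier → Injector flow in 2/9 (a) can be pictured with a toy sketch. This is not Eigen-1's implementation: the keyword-based uncertainty check, the dictionary standing in for a RAG backend, and every function name below are assumptions made only to show the control flow, in which facts are appended to a reasoning step without stopping the stream.

```python
from typing import Optional

# Toy knowledge base standing in for a real RAG backend (assumption).
KNOWLEDGE = {
    "single crossover meiosis": "at most one breakpoint per gamete",
}

# Assumed heuristic: phrases that mark a step as uncertain.
UNCERTAINTY_MARKERS = ("not sure", "i think", "maybe")


def monitor(step: str) -> bool:
    """Flag a reasoning step that looks uncertain."""
    lowered = step.lower()
    return any(marker in lowered for marker in UNCERTAINTY_MARKERS)


def querier(step: str) -> Optional[str]:
    """Return the first known query whose words all occur in the step."""
    lowered = step.lower()
    for query in KNOWLEDGE:
        if all(word in lowered for word in query.split()):
            return query
    return None


def injector(step: str, fact: str) -> str:
    """Append the retrieved fact to the step instead of interrupting it."""
    return f"{step} [fact: {fact}]"


def augment(stream: list[str]) -> list[str]:
    """Pass every step through; enrich only the flagged ones."""
    out = []
    for step in stream:
        if monitor(step):
            query = querier(step)
            if query is not None:
                step = injector(step, KNOWLEDGE[query])
        out.append(step)
    return out


result = augment([
    "Not sure how many breakpoints a single crossover in meiosis gives.",
    "So exclude the 4-change-point cases.",
])
for line in result:
    print(line)
```

Only the first step is flagged and enriched; the second passes through unchanged, mirroring the "no interruption" idea at toy scale.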
              
             1/9 Why traditional RAG fails: "tool tax" problem. Left: Model confidently recalls wrong formula. Right: Even correct formula from RAG, reasoning flow breaks after tool interruption. Eigen-1's solution: Monitor-based RAG that augments knowledge WITHOUT pausing the thinking. 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             🚨 Eigen-1 gets 48.3% (Pass@1) & 61.74% (Pass@5) on "Humanity's Last Exam" (HLE) gold subset @FutureHouseSF using DeepSeek V3.1. Prev. Grok4->30.2%, GPT-5->22.8%, Gemini 2.5 Pro->18.8% 📎  https://t.co/4Fhcp8VTBG  The future isn't bigger models, it's smarter agentic design! 🚀 
          
                
                1
              
              
                
                4
              
              
                
                86
              
             CellForge: Agentic Design of Virtual Cell Models 1. CellForge introduces an agentic system that leverages a multi-agent framework to transform biological datasets and research objectives directly into optimized computational models for virtual cells. This innovative approach 
          
                
                1
              
              
                
                16
              
              
                
                56
              
             9/9 🙏 Massive thanks to my incredible collaborators @GersteinLab Zhuoyun @JiapengChen_fs Yan @DanielStupid @Weixu_Ken_Wang @WUFang40615703 @yuchen_zhuang @WenqiShi0106 @ZhiHuangPhD @armancohan @XihongLin @fabian_theis @KrishnaswamyLab @MarkGerstein who made this ambitious 
          
                
                0
              
              
                
                0
              
              
                
                2
              
             8/9 🌍 Cross-Modal Excellence: CellForge's true power shines across the full spectrum of single-cell modalities with remarkable adaptability! From gene expression (scRNA-seq) capturing transcriptional responses, to chromatin accessibility (scATAC-seq) revealing regulatory 
          
                
                1
              
              
                
                0
              
              
                
                2
              
             7/9 🆚 Comprehensive Benchmarking: We tested against leading AI research systems AND conducted crucial ablation studies. Single-prompt Claude 3.7 achieved only 2.27/10 in research plan quality vs CellForge's 7.27/10 - a 220% improvement! More critically, simple approaches 
          
                
                1
              
              
                
                0
              
              
                
                0
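The "220% improvement" in 7/9 matches a relative-gain calculation on the two reported scores. A quick check, assuming the figure is computed as (new - baseline) / baseline; the helper name is hypothetical:

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Research plan quality scores quoted in 7/9 (out of 10).
gain = relative_improvement(2.27, 7.27)
print(f"{gain:.0f}% improvement")  # 220% improvement
```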
              
             6/9 🔬 Example Model Architecture: Here's what CellForge autonomously designed for the Norman et al. gene knockout dataset - a sophisticated hybrid architecture that no human would manually construct! The system intelligently combines: VAE encoder for handling high-dimensional 
          
                
                1
              
              
                
                0
              
              
                
                0
              
             
               
             
            