Unso Jo
@unsojo
2K Followers · 2K Following · 13 Media · 255 Statuses
CEO of Genabase; Teaching History of AI @Cornell
New York, USA
Joined November 2017
On episode 996 of "Is AI coming for your job too?" we present Agent Bain vs. Agent McKinsey, a new text-to-SQL benchmark for the business domain (CORGI): https://t.co/Cr7L40rJRk Joint work among Cornell Bowers CIS, Cornell Johnson School of Management, & Gena AI (@YueeeLi_
If for some Finger Lakey reason you happen to be in Ithaca, come to my talk at the Cornell Johnson School of Management: Sage Hall, noon, Oct 21. https://t.co/8HSSisgP9Q Data Access for Everyone with AI For everyday people, databases are a foreign concept. There is a
business.cornell.edu
Read our paper: https://t.co/lMdUnKHpjF
Contribute to our open-source benchmark and data: https://t.co/bbK7uS7zg1
Evaluate and submit online: https://t.co/Cr7L40rJRk
Send questions to: rt529@cornell.edu or unsojo@cornell.edu
github.com · corgibenchmark/CORGI
A few interesting insights: 1) AI is relatively bad at identifying latent patterns in data, like seasonal sales trends. 2) AI is worse at giving data-informed recommendations (future) than data-informed explanations (past). 3) AI is relatively bad at giving advice on planning. 4)
Our new CORGI benchmark features 10 hand-curated databases representing modern enterprises, spanning retail e-commerce such as Lululemon, DTC enablement like Shopify, and C2C marketplaces like Airbnb and TheRealReal. Our databases are jacked 😳 with more tables and relations than seen in
This is my lecture from 2 months ago at @Cornell “How do I increase my output?” One natural answer is "I will just work a few more hours." Working longer can help, but eventually you hit a physical limit. A better question is, “How do I increase my output without increasing
Our work serves as a case study of how applying domain expertise can reduce LLM cost: using generic LLM methods bluntly incurs unnecessary expense, while leveraging knowledge of SQL and databases can level the playing field for efficient applications.
Preprint is available here: https://t.co/F0kmxLIo4U Open-source package available here: https://t.co/umRgA9CdmX [soon to be integrated into PyPI txt2sql]
github.com · genaasia/N-rep: Implementation of N-rep.
With 14B Qwen + N-rep you can get better performance than o3-mini, going from 46 cents per query to 3.9 cents for comparable results.
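A back-of-envelope check of the quoted numbers. This is just arithmetic on the per-query prices stated in the tweet; the 1M-query volume is a hypothetical for illustration.

```python
# Per-query costs quoted in the tweet (USD)
o3_mini = 0.46      # o3-mini
qwen_nrep = 0.039   # 14B Qwen + N-rep

factor = o3_mini / qwen_nrep
print(f"{factor:.1f}x cheaper")  # ~11.8x

# Hypothetical savings at 1M queries
saved = round((o3_mini - qwen_nrep) * 1_000_000)
print(f"saved per 1M queries: ${saved:,}")
```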
Using multiple schema representations, you can cut the number of calls required for self-consistency, which is sometimes 100+ LLM calls!
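A minimal sketch of the idea, under my own assumptions: instead of sampling one prompt 100+ times and majority-voting, query the model once per schema representation (e.g., DDL, a natural-language summary, a column list) and vote over those few candidates. `generate_sql` is a stand-in for a real LLM call, and its canned outputs are made up for illustration.

```python
from collections import Counter

def generate_sql(question: str, schema_repr: str) -> str:
    # Placeholder for an LLM call. Here two representations happen
    # to agree and one diverges, so the vote picks the majority.
    fake = {
        "ddl": "SELECT month, SUM(sales) FROM orders GROUP BY month",
        "nl_summary": "SELECT month, SUM(sales) FROM orders GROUP BY month",
        "column_list": "SELECT * FROM orders",
    }
    return fake[schema_repr]

def n_rep_vote(question: str, schema_reprs: list[str]) -> str:
    # One call per representation instead of 100+ samples of one prompt
    candidates = [generate_sql(question, r) for r in schema_reprs]
    # Normalize whitespace before voting; a real system might compare
    # execution results rather than query strings.
    normalized = [" ".join(c.split()) for c in candidates]
    winner, _ = Counter(normalized).most_common(1)[0]
    return winner

print(n_rep_vote("total sales by month?", ["ddl", "nl_summary", "column_list"]))
```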
Is your text-to-SQL AI worth the cost? We introduce "N-Rep," a new, faster, and 10x cheaper approach to text-to-SQL without chain-of-thought reasoning or expensive fine-tuning. Joint work with @genadotco @andreawwenyi @dmimno 🧵
Special thanks to @Xianbao_QIAN for helping brainstorm and @tsmullaney for inspirational book!
China is a nation with over a hundred minority languages and many ethnic groups. What does this say about China’s 21st century AI policy?
This suggests a break from China’s past stance of using inclusive language policy as a way to build a multiethnic nation. We see no evidence of socio-political pressure or carrots for Chinese AI groups to dedicate resources for linguistic inclusivity.
In fact, many LLMs from China fail to even recognize some lower resource Chinese languages such as Uyghur.
LLMs from China are highly correlated with Western LLMs in multilingual performance (r = 0.93 - 0.99) on tasks such as reading comprehension.
Do Chinese AI Models Speak Chinese Languages? Not really. Chinese LLMs like DeepSeek are better at French than Cantonese. [https://t.co/22ZGZZI1Us] Joint work with @andreawwenyi @dmimno 🧵
Feels so odd to be teaching a history of AI when it's very actively being contested in 2025, though time will tell what gets written!