George Fraser Profile
George Fraser

@frasergeorgew

Followers
3,023
Following
273
Media
194
Statuses
1,917

I mostly tweet about software, data, and the difficulty of changing our opinions based on what the data tell us.

Oakland, CA
Joined March 2012
@frasergeorgew
George Fraser
2 years
If you don’t laugh you’ll cry: this company decided they need data visualization and BI, but none of that “data warehousing” malarkey.
Tweet media one
85
113
718
@frasergeorgew
George Fraser
2 years
The bad news is we are all going to get covid multiple times over the course of our lives. The good news is it is no longer any more dangerous than colds and flu. If we don’t want to spend the rest of our lives living in a TSA checkpoint, it’s time to start ignoring covid.
4
20
197
@frasergeorgew
George Fraser
7 months
Between now and when @sama announces his new company on Monday, I think that leaves me as the most important gay startup CEO. Savoring the moment 💆‍♂️
5
3
137
@frasergeorgew
George Fraser
3 years
The data show that covid doesn’t spread through schools. California has kept the schools closed. The data show that outdoor transmission is 99% less likely. California has outdoor mask mandates and is closing outdoor dining. Policy is being driven by scientists but not by science.
2
13
128
@frasergeorgew
George Fraser
2 years
Every salesperson when the new comp plan comes out
Tweet media one
9
7
98
@frasergeorgew
George Fraser
4 years
It's going to be a real breakthrough when SF tech companies discover this amazing new technology called "database transactions"
Tweet media one
2
9
92
@frasergeorgew
George Fraser
1 year
How fast is DuckDB compared to the best commercial data warehouses? I decided to benchmark it myself. Short version: very fast! But it's not (yet) great at scaling up to many cores.
Tweet media one
4
17
91
@frasergeorgew
George Fraser
3 years
Kafka is a good message broker, but “turning the database inside out” is a bad idea. I wrote a blog post with @narayanarjun about why.
5
11
87
@frasergeorgew
George Fraser
9 months
Data lake support is one of the most technically challenging things we've ever delivered. Writing updates to S3 requires building a quasi-DWH inside Fivetran. We use @DuckDB to rewrite the parquet files and built a BigQuery-style scale-out service to deal with large tables.
@fivetran
Fivetran
9 months
Breaking news! Amazon S3, fueled by Fivetran with Apache Iceberg, has officially moved to general availability. Swipe to discover a few highlights. ➡️ Check out this blog to discover more:
0
1
15
3
11
85
@frasergeorgew
George Fraser
1 year
I’m a big DuckDB fan but this direction worries me. If the syntax diverges so far from standard SQL it’s going to be really hard to build tooling for DuckDB.
@__AlexMonahan__
Alex Monahan
1 year
Do you wish your DB was more Pythonic? How about more fluent? In @DuckDB , now you can chain functions together! No more reading from the inside out - you can read your code left to right! SELECT ([1,2,3]).filter(x -> x<3).apply(x -> x*5) Modern SQL!
12
17
170
11
3
78
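For readers unfamiliar with the chained syntax in the quoted tweet, a rough Python analogue of `SELECT ([1,2,3]).filter(x -> x<3).apply(x -> x*5)` may help. This is only an illustrative sketch in plain Python, not DuckDB code, and the helper name `chain_filter_apply` is hypothetical.

```python
def chain_filter_apply(values, keep, transform):
    """Filter first, then transform, mirroring the left-to-right reading order."""
    return [transform(x) for x in values if keep(x)]


# Same steps as the quoted expression: keep x < 3, then multiply by 5.
result = chain_filter_apply([1, 2, 3], lambda x: x < 3, lambda x: x * 5)
print(result)  # [5, 10]
```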
@frasergeorgew
George Fraser
23 days
Two years ago, Fivetran took on a unique type of debt from @generalcatalyst . Debt comes with covenants and payback and these can be deadly to a mid stage startup if things go wrong. But GC has created a unique offering that sits between traditional debt and equity. We’ve been
4
12
77
@frasergeorgew
George Fraser
2 months
@devonzuegel But they didn’t “break up the massing.” How can a building be beautiful if it doesn’t break up the massing?
Tweet media one
1
1
72
@frasergeorgew
George Fraser
3 years
The top-secret truth of the whole ETL vs ELT thing is...ELT is really still ETL. We do a *ton* of transformation at @Fivetran , but it’s all automated and generic. The normalized schema you see in your DWH doesn’t just *happen*.
7
5
73
@frasergeorgew
George Fraser
1 year
When Taylor and I started Fivetran, we each wrote a personal check to our new SVB bank account to purchase the initial shares. SVB had no branches so we went to their main office, rang the bell, and waited for someone to come out and take the checks.
2
1
73
@frasergeorgew
George Fraser
3 years
Yikes! One of the best decisions we made at Fivetran was *not* using an ORM. Human-written SQL queries tend to be much simpler and avoid exploring these crazy corner cases of the performance space.
4
7
70
@frasergeorgew
George Fraser
2 years
Unpopular opinion: metric stores are just BI tools, and the best metric stores will evolve into full-fledged BI tools over time.
16
1
67
@frasergeorgew
George Fraser
2 years
@VPrasadMDMPH is laser focused on the most important scientific issue of our time
Tweet media one
1
8
65
@frasergeorgew
George Fraser
2 years
2 years ago Fivetran introduced objective, consumption-based pricing. Today, we’re making some adjustments based on what we’ve learned in the last 2 years. Why are we making these changes and how will they affect our customers? 🧵
1
5
64
@frasergeorgew
George Fraser
4 years
@lpolovets @stripe Of the (many, many) investors who passed in Fivetran’s early days, two gave detailed feedback: you and @tonsing . I appreciated it greatly at the time and all these years later I still haven’t forgotten!
3
3
62
@frasergeorgew
George Fraser
4 months
New blog post by @cfwang1337 at @fivetran. The big 3 data platforms, Snowflake, BigQuery, and Databricks, are converging on the same 3 core capabilities: 1/ Vectorized SQL execution. 2/ Python dataframes. 3/ Lakehouse.
Tweet media one
7
8
58
@frasergeorgew
George Fraser
2 years
Let’s see if we can create a fake data trend. Headless data warehouse? Machine learning mesh? Reverse business intelligence? C’mon people, we can do this.
22
2
60
@frasergeorgew
George Fraser
26 days
Fun fact: Fivetran still uses a single vertically-scaled Postgres database for our production workload. 26k transactions per second. To replicate it to our data warehouse takes about 10m every 15m, using Fivetran (obviously). We replicate off the primary.
Tweet media one
2
4
60
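A back-of-the-envelope reading of the figures in the tweet above, as a sketch only. It assumes the 26k transactions per second is a sustained rate (the tweet does not say whether it is a peak or an average) and that "10m every 15m" means a 10-minute sync in each 15-minute window. A transaction is not necessarily one replicated row change, so the count is a rough scale indicator.

```python
from fractions import Fraction

TRANSACTIONS_PER_SECOND = 26_000  # from the tweet; assumed sustained, not peak
SYNC_INTERVAL_MINUTES = 15        # one replication run starts every 15 minutes
SYNC_DURATION_MINUTES = 10        # each run takes about 10 minutes

# Transactions committed on the primary during one sync interval.
transactions_per_interval = TRANSACTIONS_PER_SECOND * SYNC_INTERVAL_MINUTES * 60

# Fraction of wall-clock time the replication job is running.
duty_cycle = Fraction(SYNC_DURATION_MINUTES, SYNC_INTERVAL_MINUTES)

# Headroom left in each interval before syncs would start to overlap.
idle_minutes = SYNC_INTERVAL_MINUTES - SYNC_DURATION_MINUTES

print(f"{transactions_per_interval:,} transactions per interval")
print(f"replication busy {float(duty_cycle):.0%} of the time, {idle_minutes} min idle")
```

Under these assumptions, roughly 23.4 million transactions land in each window, and the sync is busy about two thirds of the time.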
@frasergeorgew
George Fraser
2 years
It's very peculiar how people will spend $100ks on head count but freak out at spending $10ks on tools. Especially with @Fivetran , where 90% of what you're paying for is various forms of data cleanup and failure-recovery that you would otherwise have to manage yourself. 🤷‍♂️
7
7
57
@frasergeorgew
George Fraser
2 years
3¢ per transaction for Stripe Data Pipeline is kinda nuts. By comparison, for a customer with 100m MAR, Fivetran is 0.01¢ / MAR, and Fivetran is not an especially cheap data pipeline.
5
6
54
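The comparison in the tweet above can be worked through as simple arithmetic. This sketch assumes, purely for illustration, that one Stripe transaction and one Fivetran MAR (monthly active row) are comparable billing units and that both prices are flat per unit. Neither assumption is stated in the tweet.

```python
from fractions import Fraction

STRIPE_CENTS_PER_TRANSACTION = Fraction(3)    # 3 cents, from the tweet
FIVETRAN_CENTS_PER_MAR = Fraction(1, 100)     # 0.01 cents, from the tweet
VOLUME = 100_000_000                          # 100m units, from the tweet

# Per-unit price ratio (exact, thanks to Fraction).
price_ratio = STRIPE_CENTS_PER_TRANSACTION / FIVETRAN_CENTS_PER_MAR

# Total cost in dollars at the same volume, under the flat-price assumption.
stripe_dollars = STRIPE_CENTS_PER_TRANSACTION * VOLUME / 100
fivetran_dollars = FIVETRAN_CENTS_PER_MAR * VOLUME / 100

print(f"per-unit ratio: {price_ratio}x")
print(f"Stripe: ${int(stripe_dollars):,}  Fivetran: ${int(fivetran_dollars):,}")
```

On these numbers the per-unit price differs by a factor of 300: about $3,000,000 versus $10,000 at 100m units.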
@frasergeorgew
George Fraser
2 years
We’re seeing a new type of “hybrid analytical” workload at @fivetran . Meanwhile there’s a new type of analytical database—the in-memory column store—that might be a perfect match for it. @andy_pavlo
3
4
55
@frasergeorgew
George Fraser
3 years
Congrats to @getdbt @jthandy and @drewbanin on the series C! @fivetran is proud to be a member and supporter of the dbt community. dbt is a truly open-source, bottom-up phenomenon that empowers analysts, and that’s a great thing for the world.
1
4
51
@frasergeorgew
George Fraser
3 months
My PhD advisor likes to point out that every new technology is analogized to the brain. Hydraulics was like the brain, radio was like the brain, now computers are like the brain.
@pmarca
Marc Andreessen 🇺🇸
3 months
Step 1: Mankind invents a new technology. Step 2: "The brain works just like this new technology!" "The universe works just like this new technology!" Step 3: Go to step 1.
98
115
1K
3
5
48
@frasergeorgew
George Fraser
3 months
I see the claim that “the log is the database” so often and it’s so, so wrong. It’s the “you only use 10% of your brain” of data infrastructure. Everybody reads “designing data intensive applications” and comes away with this dumb idea.
7
4
50
@frasergeorgew
George Fraser
2 years
What does this even mean? You should never take a valuation above 10x revenue? I mean, I’m sure Bill Gurley would love it if founders followed that advice, but I don’t know why we would.
@bgurley
Bill Gurley
2 years
1) Previous "all-time" highs are completely irrelevant. It's not "cheap" because it is down 70%. Forget those prices happened. 2) Valuation multiples are always a hack proxy. Dangerous to use. If you insist, 10X should be considered AMAZING and an upper limit. Over that silly.
74
351
4K
6
0
48
@frasergeorgew
George Fraser
2 years
A harsh but directionally accurate* list of the things we can still get better at. The good news is we have big progress in the works for long-tail sources, outages, and reconciliation. (*The lost data claim is wrong)
Tweet media one
4
0
47
@frasergeorgew
George Fraser
2 years
@petrillic @martin_casado An m6gd.12xlarge can only support 250 daus?
2
0
47
@frasergeorgew
George Fraser
1 year
My most controversial opinion (and it’s a stiff competition) is that subqueries are great and more people should use them.
11
2
46
@frasergeorgew
George Fraser
1 year
I think this concept will succeed, but all the examples I have seen are extremely simple queries, and I’ve asked ChatGPT for things like net retention and it can’t figure it out. I suspect we need the AI to target a dimensional model rather than SQL.
@jackiexu_
Jacqueline X.
1 year
Ask your data any question using plain English. @getLogicLoop uses AI to generate SQL queries you can run directly on own unique data schema to find what you’re looking for. Manually writing SQL can be tedious. Business users can use AI to help query and analyze data faster and
12
7
97
12
3
46
@frasergeorgew
George Fraser
3 years
@sc13ts I don’t disagree but I’ve concluded that none of the problems with SQL are quite bad enough to motivate the ecosystem to move to something else. When we’re flying starships to Alpha Centauri, there will be a SQL database in there.
1
1
45
@frasergeorgew
George Fraser
2 years
Is anyone working on a “webflow for BI?” No code but deep menus that give you extreme customizability.
18
3
44
@frasergeorgew
George Fraser
2 years
Many ETL tools rely on “visual programming languages” but I had the good fortune to be a scientist before I was a startup founder and I…saw some things…in LabVIEW. And that’s how @Fivetran never had a clicky-flow-chart programming tool.
@compilerqueen
Ashley Sherwood (she/her)
2 years
@peter_rohde Good thing I know LabVIEW, then.
Tweet media one
2
0
24
4
6
44
@frasergeorgew
George Fraser
2 years
Everything moving to the cloud is scary for hardware innovation. How can you bring something new to the market when you have to get adoption by the hyperscalers to reach customers? Thinking of the failure of NVRAM/Optane but presumably applies across many domains.
9
8
44
@frasergeorgew
George Fraser
8 months
Hyper might still be the fastest analytical database, despite having been discontinued as an independent product in 2016. Such a shame that it's sitting on the shelf at Salesforce, an open-source Hyper could really shake up the analytics ecosystem. @muehlbau @mim_djo
Tweet media one
5
2
43
@frasergeorgew
George Fraser
1 year
🎯 @andy_pavlo ! Blockchain: a solution looking for a problem, other than paying ransoms and Ponzi schemes.
Tweet media one
Tweet media two
Tweet media three
3
9
43
@frasergeorgew
George Fraser
1 year
In the future, all humans will spend the first 50 years of their lives in school, the second 50 years working in either schools or hospitals, and the last 50 years in the hospital. All other needs and wants will be provided by robots.
@pmarca
Marc Andreessen 🇺🇸
1 year
Marc Andreessen Substack: Why AI Won't Cause Unemployment
109
160
908
11
4
41
@frasergeorgew
George Fraser
2 years
The CA mask mandate will finally be lifted on Wednesday, even though cases are still higher than the peak of the delta wave. Masks are and always have been about public opinion.
Tweet media one
4
4
41
@frasergeorgew
George Fraser
2 years
I recently noticed a mind-bending (to me) mathematical fact: positive NRR does not change your long-run growth rate 🤯
6
2
40
@frasergeorgew
George Fraser
9 months
YC has been life changing for me. Without a doubt @fivetran would not be around today if not for YC.
@sama
Sam Altman
9 months
YC has always had a lot of people implying that it sucks. notably this almost always comes from other investors (who are not thrilled about founders being more empowered and having such a good option)
80
93
2K
1
4
39
@frasergeorgew
George Fraser
2 years
Why is there an Okta conference? Who goes to a conference about a single sign on tool? People who REALLY hate their families, I guess? 🤔
@okta
Okta
2 years
🔴 Happening now: #Oktane22 is LIVE
34
58
730
2
0
40
@frasergeorgew
George Fraser
3 years
@MaterializeInc is a super interesting company to watch. They've started with the hardest problem in data warehousing, materialized views, which has never really been solved. Materialized views are useful in and of themselves, but look at their road map:
2
5
34
@frasergeorgew
George Fraser
2 months
Haha. The truth hurts @mattlynley
Tweet media one
5
1
34
@frasergeorgew
George Fraser
3 years
I made it to 1,000 @itunpredictable
Tweet media one
@itunpredictable
sisyphus bar and grill
4 years
4/5 people here (all the CEOs) have <5K twitter followers makes you wonder the CEO of Fivetran has 379 followers
6
2
39
3
1
34
@frasergeorgew
George Fraser
3 years
People who say “I believe in Science” and get mad at @NateSilver538 for having opinions about vaccine distribution don’t understand how science works. It’s not a belief system. Anyone can look at the evidence and form their own opinion about where the truth lies.
0
1
32
@frasergeorgew
George Fraser
1 year
We ran Fivetran with that same primary bank account for 10 years. We must have run over a billion dollars through it. SVB is a very special institution and I hope they continue in some form.
1
0
32
@frasergeorgew
George Fraser
3 years
300m rows per second is well within the capability of a single modest-sized columnar data warehouse. Ironically, the companies who spend a ton of engineering resources building custom Kafka/Data Lake/Query Engine infrastructure end up with worse results.
3
2
31
@frasergeorgew
George Fraser
2 years
@pedram_navid We are building this, it is BQ only right now but in time it will support the major DWHs.
Tweet media one
0
0
30
@frasergeorgew
George Fraser
2 years
Change my view: @elonmusk is an alien who’s trying to get back to his home planet. It’s the only explanation that makes sense.
1
0
30
@frasergeorgew
George Fraser
3 years
This is basically a dashboard of the effectiveness of every government. Everyone else on this list should be asking themselves, why is Israel’s government so much better than ours? Who should I vote for/against to make our government more like Israel’s?
3
5
29
@frasergeorgew
George Fraser
2 years
It’s remarkable the war for talent that’s taking place at the intersection of data warehousing and machine learning. According to Econ 101 this will continue until 100% of the economic rent accrues to labor and the median wage for ML software engineers will be $100m / year 💸
0
1
29
@frasergeorgew
George Fraser
2 years
@tayloramurphy So Fivetran does data inactivation then? We take your data and put it down for a nice nap in your data warehouse.
5
1
29
@frasergeorgew
George Fraser
2 years
I am big time, people write news articles (with very misleading titles) about my tweets now 💪
3
0
28
@frasergeorgew
George Fraser
1 year
It needed to be done: a @duckdb database of ducks.
Tweet media one
3
2
27
@frasergeorgew
George Fraser
4 years
@martinkl I apologize for that, I got a little overwrought in the heat wave last night...I have my issues with your work but nobody should ever be compared to Nassim Taleb 😉 You can take satisfaction that I’m getting in big trouble for that comment right now 🙈
6
0
27
@frasergeorgew
George Fraser
1 year
Tweet media one
1
0
25
@frasergeorgew
George Fraser
3 months
I don’t think SELECT machine_learning(*) is going to be the future.
6
0
24
@frasergeorgew
George Fraser
2 months
“You can’t just take your existing product and slap AI on it”
Tweet media one
3
1
24
@frasergeorgew
George Fraser
3 years
Does @fivetran have any haters? Seems like we haven’t really “made it” until we have some haters.
3
0
23
@frasergeorgew
George Fraser
3 years
We're far enough away from 2020 that we have good estimates of total mortality from @HMDatabase . What do the data tell us? First, in the 25 countries for which we have a complete estimate, mortality increased by about 10%, taking us back to the level of 2008.
Tweet media one
2
3
23
@frasergeorgew
George Fraser
3 months
Fivetran has a custom connector SDK in the works and it's pretty, pretty good 👌
3
3
24
@frasergeorgew
George Fraser
1 year
AI may destroy us all, but in the meantime, at least we don’t have to pay attention to crypto anymore 🙏
1
4
23
@frasergeorgew
George Fraser
3 years
I wrote in Forbes Tech Council about why data lakes are dead, and data warehouses/lakehouses are the future: (I’m also a big fan of actual lake houses, but that’s a subject for another day)
0
3
23
@frasergeorgew
George Fraser
7 months
Look folks, even Jesus didn’t rise again until the third day.
2
3
22
@frasergeorgew
George Fraser
1 year
2 years ago I posted about data validation in LocallyOptimistic, @jasonnochlin replied, it led to an acquisition and now we have 500 customers using the novel SELECT-only sync method he invented. Lesson: I need to spend more time hanging around in data slack communities.
Tweet media one
0
2
21
@frasergeorgew
George Fraser
7 months
Big news! Fivetran created an SDK that allows sources and destinations to write their own connectors. With a small amount of code, any application or database provider can enable their customers to centralize data in any destination supported by Fivetran.
@fivetran
Fivetran
7 months
Introducing Fivetran SDKs! 🧰 Our new Software Development Kits allow third-party vendors to build their own connectors and destinations. Join the movement with @Convex_dev , @PlanetScale , and @MotherDuck , who have already started creating new possibilities. Discover more:
0
7
26
1
3
19
@frasergeorgew
George Fraser
4 years
Had a great convo with @mattturck about the evolution of the modern data stack, the frontier of putting data into action, and how to use Google Trends to name your startup:
2
4
21
@frasergeorgew
George Fraser
3 years
@tayloramurphy Funny how many of these replies call out real time data. I’ve had a lot of conversations with Fivetran users who require very low latency pipelines, because someone in the business demands it, but they acknowledge it’s a kind of vanity metric. 🤷‍♂️
2
1
22
@frasergeorgew
George Fraser
1 year
Can I have this please? I will pay 10% of the value of all cancelled meetings, which at Fivetran’s scale should be roughly $10M 😂
@0xgaut
gaut
1 year
Google Calendar, but it shows the cost of the meeting
Tweet media one
969
13K
117K
1
0
22
@frasergeorgew
George Fraser
2 years
Wrong answer, ChatGPT!
Tweet media one
3
0
21
@frasergeorgew
George Fraser
2 years
Regardless of the outcome, it is fantastic when we test government policies using randomized trials. We should do this every chance we get. Who cares if the results favor team red or team blue, the results favor TRUTH.
@hamandcheese
Samuel Hammond 🌐🏛
2 years
Devastating new results on the effects of state-funded pre-K programs. In policy you rarely get stronger study designs than random assignment + multi-year longitudinal follow-up. Yikes.
Tweet media one
93
521
2K
0
2
21
@frasergeorgew
George Fraser
1 year
Close but no cigar, ChatGPT.
Tweet media one
3
1
20
@frasergeorgew
George Fraser
11 months
This is a big deal. This allows non-relational workloads to have a high-bandwidth interface to data stored in Snowflake. It’s an alternative to building a transactional data lake for customers that prefer the simplicity of storing all their data in SF.
0
1
20
@frasergeorgew
George Fraser
3 years
I think this might be my all time favorite Fivetran review. “I don’t really know how it works and I don’t particularly want to.”
@deoates
David Oates
3 years
Fivetran is an “ETL” tool -- extract, transform, and load. I don’t really know what that means but for us, it copies data from disparate sources into BigQuery (our data warehouse)
1
1
5
2
1
20
@frasergeorgew
George Fraser
3 months
@jasonnochlin I wrote a query planner as a hobby project (naturally) and implemented this algorithm in it. @mraasveldt's YouTube talk on the subject is the key resource; it covers a bunch of edge cases not described in the paper.
3
3
19
@frasergeorgew
George Fraser
1 year
We got a lot of great feedback on the benchmark over the last few days, and it’s turned into a bit of a living document. Redshift numbers are still a little weird, I feel like I’m still missing something 🤔.
0
2
20
@frasergeorgew
George Fraser
9 months
It’s not just AI, for example the @SnowflakeDB founders are very immersed in the technical details to this day. Great technical leaders maintain the ability to “zoom in.”
@_jasonwei
Jason Wei
9 months
It seems to be not a coincidence that some of the strongest leaders in AI who manage large teams frequently do very low-level technical work. Jeff Dean doing weekly IC (individual contributor) work while managing 3k+ people at Google Research is the canonical example, but I've
27
150
1K
0
1
20
@frasergeorgew
George Fraser
1 year
My favorite thing about ChatGPT is that it answers the question I asked, instead of giving me a 15-minute description of whatever it’s working on right now.
1
0
19
@frasergeorgew
George Fraser
1 year
In the early years, I would cancel our SVB debit cards every year so all our subscriptions would lapse and I could be sure we were only paying for the things we really needed. When I would break the cards off the page they came on, a little ragged tab would always stick to them.
1
0
19
@frasergeorgew
George Fraser
2 years
Respect for the man in the arena.
@Suhail
Suhail
2 years
1/ A bit of news: last week I decided to stop working on Mighty after 3.5 years 😓. If anyone is interested in buying the IP, please reach out. This week our team will begin work on making new kinds of creative tools using advances in AI. A new kind of Adobe Creative Suite.
390
177
5K
0
0
19
@frasergeorgew
George Fraser
3 years
@loganbartlett Incorrect. In 2035 a16z will just have @martin_casado , on 10,000 boards.
1
1
19
@frasergeorgew
George Fraser
1 year
If you use @Fivetran I guarantee you can make do with only two layers of VPs managing your nonexistent data engineers. Maybe 3 at the most.
Tweet media one
2
0
19
@frasergeorgew
George Fraser
2 years
@jthandy Your data is almost certainly more accurate. Idea! Let's pool our data and publish a combined data warehouse leaderboard. People would love it!
1
1
19
@frasergeorgew
George Fraser
1 year
Interest-bearing bank accounts seem like an anachronism. It would make more sense for banks to hold everything in 100% safe liquid form and charge a fee for this service, and for everyone who wants interest to get it from money market funds.
2
2
18
@frasergeorgew
George Fraser
1 month
Fivetran's Iceberg data lake implementation is basically a headless cloud data warehouse storage engine. We're looking for a principal engineer to lead development of it. Super cool opportunity for someone who loves DBMS! Open in many locations but here is the CA link:
5
3
18
@frasergeorgew
George Fraser
2 years
@mullinsms You can blame me for this. Gotta focus, basic SQL + dbt SQL were two ways to do the same thing.
2
0
18
@frasergeorgew
George Fraser
11 months
It’s so funny to me that people try to dunk on Fukuyama. Dude literally named Donald Trump as an example of a megalothymic individual who would be dissatisfied with being a mere rich developer and become a threat to democracy. In 1993.
Tweet media one
1
0
18
@frasergeorgew
George Fraser
10 months
@0interestrates And surprisingly high performance. We use it to stage data for loading into data warehouses @fivetran , we’ve tested parquet but CSV is faster for real world datasets, or at least it was a couple years ago.
2
0
17
@frasergeorgew
George Fraser
2 years
This is a great talk from the creators of @duckdb , among other things it argues persuasively that you can’t just shoehorn ML in your DBMS, the ecosystem is just too big, you have to figure out a way to interact with what already exists.
0
2
16
@frasergeorgew
George Fraser
1 year
Despite universal mask wearing, Japan has case rates similar to the US at the height of omicron. At this point, masking is a superstition, like astrology or chiropractic.
Tweet media one
4
2
17
@frasergeorgew
George Fraser
9 months
If Microsoft brings Python to Excel and makes it work cross-platform, it will add a couple of basis points to global GDP.
2
0
17
@frasergeorgew
George Fraser
3 years
@vboykis Databricks is doing a lot of great work, but this idea that you have to have a data lake in front of your data warehouse is ridiculous. They’re really straw-manning the “just use a SQL data warehouse” point of view.
4
3
17
@frasergeorgew
George Fraser
1 year
I had a great conversation with Frank at @Snowflake about how our products are complementary, consumption based pricing, and where the partnership is going:
0
3
17