
Apache Parquet
@ApacheParquet
Followers: 9K
Following: 20
Media: 2
Statuses: 361
Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high performance compression
Joined April 2013
RT @andrewlamb1111: Turns out @ApacheParquet Bloom filters are better than I think many people understand. Trevor Hilton found that for a….
0
11
0
RT @andrewlamb1111: To anyone who thinks @ApacheParquet is dead, it is showing renewed signs of life 🌹.
0
1
0
RT @J_: I tweeted this ten years ago today. At the time I didn’t quite realize how much impact this little side project would have. To ten….
0
8
0
RT @J_: It’s happened! The @ApacheParquet Java implementation repo is now called parquet-java. Thank you @andrewlamb1111 for the nudge! This….
0
4
0
RT @rgaiacs: Last speaker on the #europython's scientific room before lunch is Peter Hoffmann talking about #Pandas and #Dask to work with l….
0
16
0
RT @GyulaFora: @GbrHrmnn @bol_com @apachekafka @ApacheParquet @ApacheFlink @bol_com_Techlab Have a look at the @ApacheFlink bucketing sink….
0
2
0
RT @rajatk95: @StackOverflow @ApacheSpark Can someone answer this -> why is @ApacheParquet format faster than other columnar storage like….
0
3
0
RT @frathgeber: 2nd #PyDataLDN #keynote - @holdenkarau & @BooProgrammer walk us through a zoo of #tools for #BigData & #distributed #data i….
0
10
0
RT @ReneeYao1: Join the #GPU accelerated #analytics and #ML revolution. @ApacheArrow @ApacheParquet and @gpuoai #GTC18 .
0
8
0
RT @lulufrego: Great benchmark between @ApacheParquet on #hdfs and @ApacheKudu. In short, Kudu is faster than Parquet….
blog.clairvoyantsoft.com
Apache Kudu is an open-source columnar storage engine. It promises low latency random access and efficient execution of analytical queries…
0
10
0
RT @J_: If you’re a company using open source projects and not sure how to contribute, a release engineer would be a tremendous help. It’s….
0
7
0
RT @mustafaakin: You do not need Spark to create @ApacheParquet files, you can use plain Java and it can even fit in AWS Lambda for a serve….
0
7
0
RT @ylogx: @ApacheParquet @ApacheArrow Also the file size went down from 10Gigs to 3Gigs without any compression.
0
6
0
RT @ylogx: Working with a 10Gig csv data. Pandas read_csv took 16mins to load the csv into memory. Converted to @ApacheParquet with @Apache….
0
150
0
RT @J_: Come hear me talk about @ApacheArrow and @ApacheParquet at #NABDConf in Palo Alto next Tuesday!
0
4
0
RT @inathens: At @ucc_bdcat today in #Austin presenting our work with @pbr_wur on managing #agri #genomic #bigdata with @ApacheSpark and @A….
0
8
0