collapse
@collapse_R
Followers
974
Following
82
Media
30
Statuses
190
A C/C++-based package for advanced data transformation and statistical computing in R. Account managed by the author. #rcollapse
CRAN and Github
Joined June 2020
{collapse} is now also on Bluesky (https://t.co/ODTcOh0GjA) and I am also there (https://t.co/yQnDqHAANa) [and on Mastodon: https://t.co/kKomhUJbOp]. I will repost {collapse} posts but also share posts about research and data. This X account will remain active. #rstats
{collapse} 2.1.0 is out! It introduces a new fslice() function (https://t.co/p25YbZujp5), a new theory-consistent weighted quantile algorithm (https://t.co/NAJD2y7zpk) with excellent properties, and some convenience features such as join requirements. #rstats #DataScience
The {collapse} arXiv paper has just been updated - following extensive revision: https://t.co/Ft1vDdCGfQ. I believe it is a great resource for anyone doing scientific computing with #rstats.
arxiv.org
collapse is a large C/C++-based infrastructure package facilitating complex statistical computing, data transformation, and exploration tasks in R - at outstanding levels of performance and memory...
There is now a #fastverse benchmark wiki (https://t.co/mwx7SR8rvT) where users can freely contribute benchmarks. If you have benchmarks involving {fastverse} packages ({collapse}, {data.table}, etc., including extensions) please contribute them (takes 1 min) #rstats #DataScience
github.com
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R - fastverse/fastverse
I just improved the vignette a bit further, adding some detailed benchmarks and a section on Global Options. I needed to correct myself: it is not true that {collapse} global options should never be invoked in packages - they just need to be reset afterwards, like #rstats global options.
It's nice to see an increasing number of #rstats packages using {collapse}. A developer-focused vignette was long planned and now it is here - with modest advice on writing efficient R package code in general and using {collapse} in particular.
Check out the latest package to be granted the Seal of Approval: {collapse} by Sebastian Krantz! {collapse} is a partner package that implements various data transformation and statistical analysis tasks using ultra-fast C/C++ implementations. https://t.co/fJ0QdElEVX
{collapse} v2.0.15, with fast aggregation pivots, has just reached CRAN. A minor but neat feature worth pointing out in this release is enhanced join verbosity. In addition to the join success rates, the join relationship is now determined and reported - at no extra cost #rstats
New independent benchmark by Adrian Antico: https://t.co/prVkzllrtD
Setup:
- large local Windows machine
- real data
- broad range of tasks
- scripts executed inside RStudio and VS Code
-> shows that {collapse} is an absolute top performer in this setting #rstats #DataScience
github.com
Compare run times for various data frame packages. Contribute to AdrianAntico/Benchmarks development by creating an account on GitHub.
{collapse} v2.0.15, already available via install.packages("collapse", repos = "https://t.co/89o0WvIIbe"), adds wide/recast pivot()'s with aggregation, including some hard-coded internal functions. A game changer for pivot tables in R. More at https://t.co/Jo9ixPFMWs.
#rstats
An article on {collapse} is available on arXiv: https://t.co/Ft1vDdDe5o (submitted to Journal of Statistical Software). It highlights the aims and added value of collapse and its cutting-edge performance for many complex statistical tasks in #rstats. Please consider sharing it.
{collapse} has been benchmarked in the DuckDB benchmark: https://t.co/HAvywhYJUn, and is pretty competitive on 0.5-5 GB (laptop-grade) operations. A surprise is that it seems to be the only framework next to DuckDB that is able to perform large data joins (50 GB) efficiently. #rstats
I’m thrilled to announce the release of {collapse} 2.0, adding blazing fast joins, pivots, a flexible namespace, and many other features. It is a remarkable piece of R software and capable of enhancing the workflow of all R users. Spread the word #rstats
https://t.co/szDDZJ501Q
As I'm slowly moving towards the release of collapse 2.0, you again have the opportunity to explore features in the development version and provide valuable feedback (API, performance, bugs etc.). In particular, join() and pivot() are major innovations and likely of interest.
Released a minor update {collapse} v1.9.6, which, notably, includes a new vignette on how {collapse} handles R objects - a quick view behind the scenes of its class-agnostic R programming framework: https://t.co/djqcb29WHg
#rcollapse #rstats
{collapse} v1.9.5 is released - with a limited set of SIMD instructions. As of last week, collapse has been downloaded 1 million times from CRAN. I thus wrote a post reflecting on the past, present, and future of #rcollapse and #fastverse in #rstats.
I've created a small video tutorial about the new global argument default settings and OpenMP multithreading in {collapse}: https://t.co/VCFWlLSVt7
#rstats #rcollapse #fastverse
Released a C/C++ patch (v1.9.2), which includes a noteworthy addition: the function set_collapse(nthreads = [int], na.rm = [TRUE|FALSE]) can be used to globally set argument defaults. This is worthwhile on larger projects (e.g. M1 Mac + 4 threads = crunching >10 GB of data). #rstats
#rcollapse 1.9 has been released to CRAN, providing greater performance and versatility in almost every domain, alongside new functionality such as grouped & weighted sample quantiles (in C) pushing the frontiers of #rstats. News: https://t.co/J7r5kUg8PJ
#fastverse #DataScience
#fastverse v0.3.0 is out, lighter than before, with the ability to install development versions from r-universe. The README (https://t.co/l3Aoi9T7tu) was updated to reflect the state of high-performance computing in R. Those leaving Twitter can also follow https://t.co/kKomhUIDYR
#RStats