Matt Profile
Matt

@matt_dz

Followers
5K
Following
54K
Media
568
Statuses
7K

C++, Compilers, Computer Architecture, Generic Programming, GPGPU, HPC, Machine Learning, Numerics, Parallel Computing, Quantitative Finance

Joined November 2010
@matt_dz
Matt
9 years
Computer Architecture, C++, and High Performance - Meeting C++ 2016 https://t.co/yDPCRr7YWf https://t.co/QpzBdHzbtI https://t.co/J14zg1QbTj
github.com
A categorized list of C++ resources. Contribute to MattPD/cpplinks development by creating an account on GitHub.
@meetingcpp
Meeting C++
9 years
Computer Architecture, C++, and High Performance - Matt P. Dziubinski - Meeting C++ 2016 https://t.co/X65Sr3IJ9V #cpp #cplusplus
1
24
102
@journal_of_fp
Journal of Functional Programming
15 days
From Daan Leijen and @anton_lorenzen: an extended version of their POPL 2023 paper, "Tail Recursion Modulo Context: An Equational Approach", which calculates a generalisation of the "tail recursion modulo cons" optimisation from its specification. https://t.co/2caVArbQoU
cambridge.org
Tail recursion modulo context: An equational approach (extended version) - Volume 35
0
5
11
@kc_srk
KC Sivaramakrishnan
16 days
Xavier Leroy’s new book “Control structures in programming languages: from goto to algebraic effects” https://t.co/EXcds9inhg
2
77
361
@matt_dz
Matt
19 days
Opportunistically Parallel Lambda Calculus https://t.co/RSkoZKt0V5 https://t.co/De6syqWsuq OOPSLA 2025 Stephen Mell (@stephenlmell), Konstantinos Kallas (@KonsKallas), Steve Zdancewic, Osbert Bastani
0
1
10
@ezyang
Edward Z. Yang
21 days
I'm curious if compiler Twitter has a take on this. Let's imagine you are trying to design an IR for a frontend language that has mutation and aliasing, where you expect advanced *users* to be writing compiler passes. It's probably a bad idea to have your IR have both mutation
9
3
59
@tqchenml
Tianqi Chen
22 days
🧵Reflecting a bit after @PyTorch conference. ML compilers are becoming "toolkits" rather than monolithic pieces. Their targets are also sub-modules that must interoperate with other pieces. This is THE biggest mindset difference from traditional compilers.
3
11
87
@verdagon
Evan Ovadia
23 days
New article! https://t.co/CvpNnFkmWr "The Impossible Optimization, and the Metaprogramming To Achieve It" TL;DR: If you warp your mind a bit, you can apply metaprogramming to speed up your code by about 10x. Enjoy!
4
1
18
@ezyang
Edward Z. Yang
25 days
A small thread about how you should be drawing the contents of higher dimensional tensors
6
25
299
@simonguozirui
Simon Guo
26 days
Wrote a 1-year retrospective with @a1zhang on KernelBench and the journey toward automated GPU/CUDA kernel generations! Since my labmates (@anneouyang, @simran_s_arora, @_williamhu) and I first started working towards this vision around last year’s @GPU_mode hackathon, we have
11
64
290
@CamelCdr
camel-cdr
27 days
"How NOT To Program an Out-of-order Vector Processor" slides are public. https://t.co/0zYHoUP3l5
1
8
67
@ast_eth
AST Lab ETH Zurich
27 days
Congratulations to Vu Le, @Chengnian & @zhendongsu on receiving the Most Influential OOPSLA Paper Award at #SPLASH2025 for their OOPSLA'15 paper "Finding Deep Compiler Bugs via Guided Stochastic Program Mutation"! 📽️ Award presentation: https://t.co/zanFafAe1i @CSatETH @splashcon
0
8
24
@matt_dz
Matt
27 days
Triton Developer Conference 2025 Talks https://t.co/9oXsOyi8RA #PyTorchCon @PyTorch
0
3
13
@JiaZhihao
Zhihao Jia
29 days
Great work! This kind of interoperability will help unlock new cross-compiler optimizations to push kernel performance to the extreme.
@tqchenml
Tianqi Chen
29 days
📢Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interop with each other. Ship one library across pytorch, jax, cupy etc and runnable across python, c++, rust https://t.co/m2gHJRreol
0
8
27
@saambarati
Saam Barati
29 days
I gave a talk a few days ago at REBASE about the work the Verse engineering team is doing to implement a new VM for Verse and a software transactional memory runtime and compiler for C++.
0
2
10
@rhdevelopers
Red Hat Developer
1 month
Read the latest updates on the #Clang bytecode interpreter. Discover how 500+ commits have made the implementation more solid, reduced test failures, and improved performance for compile-time constant evaluation. https://t.co/FCp7KbBkCH
developers.redhat.com
It’s October again, so let me tell you what happened with the clang bytecode interpreter this year. In case this is the first you've encountered this topic: This is a project for a bytecode
0
2
14
@bkaradzic
Бранимир Караџић
30 days
I wrote a blog post about fast call-stack backtracing. Hopefully, someone making an intrusive profiler, memory tracker, or logger will find it useful... https://t.co/Lb6j5WYkjr
4
45
247
@justin_fargnoli
Justin Fargnoli
1 month
"Notes About Nvidia GPU Shared Memory Banks" from @axel_s_feldmann https://t.co/jkrDT5EuBQ
feldmann.nyc
0
2
8
@__protected
Jonathan Brachthäuser
1 month
You can find our paper here: https://t.co/Y1O2P3Jhwh The online demo is here: https://t.co/9TaQrWayko And if you can't attend, you have the chance to watch it live:
1
2
4
@ChapelLanguage
Chapel Language
1 month
Should a good parallel language design be minimal in nature? Read Brad Chamberlain's take on this question in this month's edition of his "10 Myths About Scalable Parallel Programming Languages" blog series. https://t.co/8IqPkvTa2i
0
1
1