Graham Markall

@gmarkall

Followers: 1K · Following: 941 · Media: 188 · Statuses: 4K

Professional interests: Python, CUDA, @numba_jit. Personal interests: RISC-V, PSXDev, OSHW, 日本語. Fun: family, cycling, running. Also @[email protected]

Lincoln, UK
Joined May 2010
@gmarkall
Graham Markall
2 years
Apparently it's #MyTwitterAnniversary, so I'll use it as an opportunity to link to the Mastodon account I now use:
@gmarkall
Graham Markall
3 years
Do you know about automatic differentiation and / or Enzyme? Could you help sketch out how AD support for @numba_jit could be implemented? Issue / thread:
github.com
Feature request It would be great if Numba supported automatic differentiation. Maybe using Enzyme would be the easiest way as it operates directly on the IR of LLVM. Another possible source of ins...
@gmarkall
Graham Markall
3 years
I've been on Mastodon a while but I got out of the habit of using it - this is my account that I'm starting to get active with again: (is this usually written as @gmarkall@mastodon.social?). It has an old profile pic, still need to upload my current one.
mastodon.social
384 Posts, 389 Following, 302 Followers · Professional interests: Python, CUDA, Compilers. Personal interests: RISC-V, PSXDev, OSHW, 日本語. Recreation: family time, cycling, running, cooking.
@gmarkall
Graham Markall
3 years
@anthonypjshaw 6. Finally, maybe this should have been a blog post with some code! Would that be interesting? What else should be answered / elaborated on if I write this up?
@gmarkall
Graham Markall
3 years
5. "CPython Internals" by @anthonypjshaw is a great intro / reference to CPython, saved me a lot of time, and got me up to speed quickly for this endeavour. Highly recommended if you want to poke about with this sort of stuff!
realpython.com
Unlock the inner workings of the Python language, compile the Python interpreter from source code, and participate in the development of CPython. The "CPython Internals" book shows you exactly how.
@gmarkall
Graham Markall
3 years
4. Performance is not good for my naive implementation - literally every Python alloc results in a CUDA driver API call, instead of allocating arenas and doing the other clever things the Python allocators do. That could be solved, though, perhaps by reusing a lot of what CPython already does.
@gmarkall
Graham Markall
3 years
3. Context management is a bit fiddly - e.g. if the allocator is the first use of CUDA it needs to call cuInit and set up a context, but then other libraries like Numba can be unhappy. Other libraries switching the context (such as Numba again) also need to be handled gracefully.
@gmarkall
Graham Markall
3 years
2. Using the CUDA memory management APIs (cuMemAllocManaged etc.), it was straightforward to implement the required methods (malloc, calloc, realloc, free). cuMemGetAddressRange was convenient for realloc. I used the driver API to keep things simple (for me):
@gmarkall
Graham Markall
3 years
1. The PyMem_SetAllocator API perhaps doesn't support as broad a range of use cases as I'd like - it seems intended mainly to support tracing and debugging. I'm not sure I can turn off my allocator once it's on - I see no way to migrate allocations.
docs.python.org
Overview: Memory management in Python involves a private heap containing all Python objects and data structures. The management of this private heap is ensured internally by the Python memory manag...
@gmarkall
Graham Markall
3 years
I've been experimenting with using CUDA Unified Memory for the Python heap, towards a general goal of experimenting with a more unified CPU / GPU execution model within CPython, implemented as PyMemAllocatorEx instances. I have a few thoughts so far.
@gmarkall
Graham Markall
3 years
RT @RAPIDSai: Do more on GPUs with less code - check out @gmarkall's new blog on the @numba_jit high-level API
@gmarkall
Graham Markall
3 years
RT @numba_jit: Public service announcement: Yes, the Numba team is aware of the Python 3.11 release. 💥 Yes, we are working on it. 💪 Please…
@gmarkall
Graham Markall
3 years
Lil identify xei9pa2AeQu6.
@geerlingguy
Jeff Geerling
3 years
your rap name is "lil" + the last message you sent in IRC.
@gmarkall
Graham Markall
3 years
RT @__mharrison__: Do you use the GPU in Python? If so, for what, and what library do you use? 🤔.
@gmarkall
Graham Markall
3 years
"Use a font that is way too small" - every corporate PowerPoint template for technical talks ever.
@gmarkall
Graham Markall
3 years
With @numba_jit 0.56 you can use the high-level API to extend the CUDA target. It's much simpler to use than the low-level API; this notebook shows how to use it for some quick examples: HLA docs: Pic: using it to implement clock64()
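A sketch of what the clock64() example above looks like with the 0.56 high-level API (this is a reconstruction, not the notebook's exact code, and it needs numba >= 0.56 plus a CUDA GPU to run): an @intrinsic registered for the CUDA target that emits the inline PTX to read the %clock64 special register.

```python
from llvmlite import ir

from numba import cuda, types
from numba.core.extending import intrinsic


@intrinsic(target="cuda")
def clock64(typingctx):
    # No arguments; returns the 64-bit cycle counter.
    sig = types.uint64()

    def codegen(context, builder, signature, args):
        # Read the %clock64 special register via inline PTX.
        fnty = ir.FunctionType(ir.IntType(64), [])
        asm = ir.InlineAsm(fnty, "mov.u64 $0, %clock64;", "=l")
        return builder.call(asm, [])

    return sig, codegen


@cuda.jit
def measure(out):
    start = clock64()
    # ... work to be timed would go here ...
    out[0] = clock64() - start
```

Compare this with the low-level approach, which would require writing typing and lowering registrations against the target's internals by hand.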
@gmarkall
Graham Markall
3 years
Money-saving tip: instead of buying an expensive chef's knife, simply take a PCIe slot blanking plate out of a cheap PC case.