V

@vsoch

Followers: 2K
Following: 6K
Media: 2K
Statuses: 5K

I'm the Vanessasaurus! https://t.co/NqYC6PthvP Mastodon: [email protected]

Joined June 2010
@vsoch
V
6 days
"When you're the last one in the data center." Me: I will do important, serious work. But also: 🦩
0
0
1
@vsoch
V
12 days
Of course, the full music video "Flux Time" that was cut short in the live version. Thank you to @HPC_Now and everyone who attended! Please reach out to any of us with questions. 🦩
0
0
0
@vsoch
V
12 days
Our talk "Ensemble Workloads in the Age of Converged Computing" presented the @FluxFramework Operator, deployment of Flux in different cloud environments, and user-space Kubernetes "Usernetes."
1
0
0
@vsoch
V
12 days
If you missed the @FluxFramework workshop and talks at #HPCKP, they are online! 🦩 The workshop includes an intro to Flux, a talk on "Flux Environments," a hands-on tutorial (actually a container adventure), a music video, and Jeopardy! 🌀
1
0
1
@vsoch
V
14 days
Please join us this Tuesday, July 1st, at 9am Pacific to learn about my team's work on "Cloud Usability for #HPC Applications" hosted by the #CASS software stewardship organization. Please message or email me for the calendar invite. Hope to see you there!
0
2
4
@vsoch
V
16 days
For the first time - user-space Kubernetes running under @FluxFramework on a production cluster. This is OSU and LAMMPS. This has been months of work and persistence. We got this working on an old kernel and under a very strict security policy. Experiments + more details coming soon! 🥳
1
5
13
@vsoch
V
17 days
RT @tgamblin: We’ve got a request for information out on where we want to take Livermore Computing and other #HPC centers in the next five…
0
7
0
@vsoch
V
18 days
Our @OCI_ORG post on compatibility in the #Kubernetes blog is hot off the press! 📰 We are working on adding an exporter to #NFD for #HPC use cases and planning experiments. If anyone has ideas, please share in the thread! 👇
0
2
6
@vsoch
V
25 days
For those who missed the #ISC25 Flux Tutorial, we just posted our slides online. Thank you to those who attended, and see you next time! 👋
0
0
5
@vsoch
V
27 days
The biggest lie I tell myself: "Just a little further." 💙💚
0
0
0
@vsoch
V
30 days
What's coming next? Along with continued work on the above, the next item of interest is automated compatibility assessment via descriptive metadata or #OCI artifacts. I hope everyone had a wonderful week, whether you attended a conference or not! 😘
0
0
0
@vsoch
V
30 days
For the last taste of current work, we talk about running user-space Kubernetes alongside Flux, a project we call "The Bare Metal Bros." Although slirp4netns adds network overhead, when we use bypass mechanisms (InfiniBand and EFA) we get close to equivalent performance.
1
0
1
@vsoch
V
30 days
Of course we can't forget traditional #HPC - Flux is deployed as the system level scheduler on 6 of the #Top500 including El Capitan. We have an amazing, talented team of core Flux devs to thank for working hard on this for over a decade. 🙏
1
0
0
@vsoch
V
30 days
And... to infinity and beyond! 🚀 Our cloud environments would not be complete without an ability to deploy Flux to virtual machines. We use Packer and Terraform to do this across clouds, and they come up in minutes. ⌚
1
0
0
@vsoch
V
30 days
And we mentioned dynamism. The State Machine Operator provides queue-, custom- and job-based metrics via streaming #ML models with #RiverML. Here is an example where we told it to "Keep training until you hit 70% accuracy."
1
0
0
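As a rough illustration of a "keep training until you hit 70% accuracy" rule, here is a minimal sketch of an online learner that stops once its running accuracy crosses a target. It is an assumption-laden toy and not the State Machine Operator or the RiverML API: it swaps in a plain-Python perceptron, and `make_stream`, `train_until`, and the synthetic data are all hypothetical names and values invented for this example.

```python
import random


def make_stream(n, seed=0, margin=0.2):
    """Yield n synthetic (features, label) pairs (hypothetical data).

    The label is the sign of x0 + x1. Points too close to the boundary are
    skipped so the two classes are separated by a gap.
    """
    rng = random.Random(seed)
    produced = 0
    while produced < n:
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        s = x[0] + x[1]
        if abs(s) < margin:
            continue
        produced += 1
        yield x, (1 if s > 0 else -1)


def train_until(stream, target=0.70, min_samples=50):
    """Train a perceptron one sample at a time; stop at the target accuracy.

    Accuracy is measured "predict first, then learn": each sample is scored
    before the model is allowed to update on it.
    Returns (samples_seen, running_accuracy).
    """
    w = [0.0, 0.0]
    seen = 0
    correct = 0
    for x, y in stream:
        score = w[0] * x[0] + w[1] * x[1]
        pred = 1 if score > 0 else -1
        seen += 1
        if pred == y:
            correct += 1
        else:
            # Perceptron update, applied only on mistakes.
            w[0] += y * x[0]
            w[1] += y * x[1]
        # Stop condition: enough samples seen and target accuracy reached.
        if seen >= min_samples and correct / seen >= target:
            break
    return seen, (correct / seen if seen else 0.0)


if __name__ == "__main__":
    seen, acc = train_until(make_stream(1000))
    print(f"stopped after {seen} samples at accuracy {acc:.2f}")
```

The `min_samples` guard is a design choice for this sketch: it keeps the loop from stopping on a lucky streak in the first few samples.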
@vsoch
V
30 days
The State Machine Operator isn't just good for workflows. We used it for instance selection, figuring out the "cost per unit of science" for different instance types, on-demand and spot. Shout-out to the #AWS Graviton hpc7g instance for being a complete baddie! 😎
1
0
0
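One way to read "cost per unit of science" is total spend on a run divided by the amount of useful work it produced. The sketch below works through that arithmetic under stated assumptions: the `Run` fields, the formula, and every number are hypothetical and chosen only for illustration. None of them are measurements or definitions from the work described above.

```python
from dataclasses import dataclass


@dataclass
class Run:
    instance_type: str        # hypothetical label, e.g. "type-a (spot)"
    hourly_price_usd: float   # assumed price per node-hour
    nodes: int
    runtime_seconds: float
    units_of_science: float   # assumed work metric, e.g. simulation steps


def cost_per_unit(run):
    """Total cost of the run divided by the work it completed."""
    if run.units_of_science <= 0:
        raise ValueError("units_of_science must be positive")
    total_cost = run.hourly_price_usd * run.nodes * run.runtime_seconds / 3600.0
    return total_cost / run.units_of_science


def rank_by_cost(runs):
    """Sort runs from cheapest to most expensive per unit of science."""
    return sorted(runs, key=cost_per_unit)


if __name__ == "__main__":
    runs = [
        Run("type-a (on-demand)", 2.0, 4, 1800, 100),
        Run("type-b (spot)", 1.0, 4, 3600, 50),
    ]
    for run in rank_by_cost(runs):
        print(f"{run.instance_type}: ${cost_per_unit(run):.4f} per unit")
```

In this made-up case the cheaper hourly price loses, because the slower run costs the same in total while completing half the work.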
@vsoch
V
30 days
We next go back to complex workflows. We have re-imagined MuMMI as an event-driven state machine, running it equivalently backed by either Flux or #Kubernetes. Each state machine (SM) is a sequence of jobs that can act independently. We show lower overhead and cost of orchestration.
1
0
0
@vsoch
V
30 days
Have you heard of #eBPF? It's a powerful space that hasn't been properly tapped by the #HPC community. With this setup, you can deploy #eBPF apps alongside your main app. Our work in progress shows zero to small added overhead, much better than traditional HPC monitoring methods. 🤓
1
0
0
@vsoch
V
30 days
In recent work, we are improving upon our initial performance study. By using #helm charts and the Flux Operator or JobSet, we have packaged 30+ #HPC & ML apps to completely automate an entire experiment run, from custom params to running iterations to saving logs and metrics! ⚙️
1
0
0
@vsoch
V
30 days
Next is #Kubernetes with the Flux Operator, first developed in 2022. The Operator deploys an entire #HPC cluster across nodes in Kubernetes, allowing for elasticity, sidecar services, a tree-based overlay network and bypassing etcd. Run it like a K8s job, or a persistent cluster.
1
0
0