Distributed, Parallel, and Cluster Computing
@DPZ
Followers: 262
Following: 0
Media: 0
Statuses: 16K
New Distributed, Parallel, and Cluster Computing submissions to https://t.co/FMRl4YXmrm (not affiliated with https://t.co/FMRl4YXmrm)
Joined October 2010
On the Operational Resilience of CBDC: Threats and Prospects of Formal Validation for Offline Payments.
arxiv.org
Information and communication technologies are by now employed in most activities, including economics and finance. Despite the extraordinary power of modern computers and the vast amount of...
A Fair and Lightweight Consensus Algorithm for IoT.
arxiv.org
With the rapid growth of hyperconnected devices and decentralized data architectures, safeguarding Internet of Things (IoT) transactions is becoming increasingly challenging. Blockchain presents a...
SageServe: Optimizing LLM Serving on Cloud Data Centers with Forecast Aware Auto-Scaling.
arxiv.org
Global cloud service providers handle inference workloads for Large Language Models (LLMs) that span latency-sensitive (e.g., chatbots) and insensitive (e.g., report writing) tasks, resulting in...
Asynchronous Wait-Free Runtime Verification and Enforcement of Linearizability.
arxiv.org
This paper presents a theoretical study of the problem of verifying linearizability at runtime, where one seeks for a concurrent algorithm for verifying that the current execution of a given...
Unlocking Dynamic Inter-Client Spatial Dependencies: A Federated Spatio-Temporal Graph Learning Method for Traffic Flow Forecasting.
arxiv.org
Spatio-temporal graphs are powerful tools for modeling complex dependencies in traffic time series. However, the distributed nature of real-world traffic data across multiple stakeholders poses...
On The Performance of Prefix-Sum Parallel Kalman Filters and Smoothers on GPUs.
arxiv.org
This paper presents an experimental evaluation of parallel-in-time Kalman filters and smoothers using graphics processing units (GPUs). In particular, the paper evaluates different all-prefix-sum...
SMoFi: Step-wise Momentum Fusion for Split Federated Learning on Heterogeneous Data.
arxiv.org
Split Federated Learning is a system-efficient federated learning paradigm that leverages the rich computing resources at a central server to train model partitions. Data heterogeneity across...
TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training.
arxiv.org
Training large language models (LLMs) is fundamentally constrained by limited device memory and costly inter-device communication. Although pipeline parallelism alleviates memory pressure by...
Scalable Synthesis of distributed LLM workloads through Symbolic Tensor Graphs.
arxiv.org
Optimizing the performance of large language models (LLMs) on large-scale AI training and inference systems requires a scalable and expressive mechanism to model distributed workload execution....
FastGraph: Optimized GPU-Enabled Algorithms for Fast Graph Building and Message Passing.
arxiv.org
We introduce FastGraph, a novel GPU-optimized k-nearest neighbor algorithm specifically designed to accelerate graph construction in low-dimensional spaces (2-10 dimensions), critical for...
Workload Schedulers -- Genesis, Algorithms and Differences.
arxiv.org
This paper presents a novel approach to categorization of modern workload schedulers. We provide descriptions of three classes of schedulers: Operating Systems Process Schedulers, Cluster Systems...
Selection of Supervised Learning-based Sparse Matrix Reordering Algorithms.
arxiv.org
Sparse matrix ordering is a vital optimization technique often employed for solving large-scale sparse matrices. Its goal is to minimize the matrix bandwidth by reorganizing its rows and columns,...
Dynamic Edge Server Selection in Time-Varying Environments: A Reliability-Aware Predictive Approach.
arxiv.org
Latency-sensitive embedded applications increasingly rely on edge computing, yet dynamic network congestion in multi-server architectures challenges proper edge server selection. This paper...
Lit Silicon: A Case Where Thermal Imbalance Couples Concurrent Execution in Multiple GPUs.
arxiv.org
GPU systems are increasingly powering modern datacenters at scale. Despite being highly performant, GPU systems suffer from performance variation at the node and cluster levels. Such performance...
MoFa: A Unified Performance Modeling Framework for LLM Pretraining.
arxiv.org
The exponential growth in LLM scales, with parameters soaring from billions to trillions, has necessitated distributed pretraining across large clusters comprising thousands to tens of thousands...
A Poly-Log Approximation for Transaction Scheduling in Fog-Cloud Computing and Beyond.
arxiv.org
Transaction scheduling is crucial to efficiently allocate shared resources in a conflict-free manner in distributed systems. We investigate the efficient scheduling of transactions in a network of...
Ksurf-Drone: Attention Kalman Filter for Contextual Bandit Optimization in Cloud Resource Allocation.
arxiv.org
Resource orchestration and configuration parameter search are key concerns for container-based infrastructure in cloud data centers. Large configuration search space and cloud uncertainties are...
Foam Segmentation in Wastewater Treatment Plants: A Federated Learning Approach with Segment Anything Model 2.
arxiv.org
Foam formation in Wastewater Treatment Plants (WTPs) is a major challenge that can reduce treatment efficiency and increase costs. The ability to automatically examine changes in real-time with...
FedPM: Federated Learning Using Second-order Optimization with Preconditioned Mixing of Local Parameters.
arxiv.org
We propose Federated Preconditioned Mixing (FedPM), a novel Federated Learning (FL) method that leverages second-order optimization. Prior methods--such as LocalNewton, LTDA, and FedSophia--have...