Euro-Par 2021: Parallel Processing [electronic resource] : 27th International Conference on Parallel and Distributed Computing, Lisbon, Portugal, September 1-3, 2021, Proceedings /
edited by Leonel Sousa, Nuno Roma, Pedro Tomás.
- 1st ed. 2021.
- XXXVIII, 632 p. 251 illus., 183 illus. in color. online resource.
- Theoretical Computer Science and General Issues, 2512-2029 ; 12820.
- Theoretical Computer Science and General Issues, 12820.
Compilers, Tools and Environments -- ALONA: Automatic Loop Nest Approximation with Reconstruction and Space Pruning -- Automatic low-overhead load-imbalance detection in MPI applications -- Performance and Power Modeling, Prediction and Evaluation -- Trace-driven Workload Generation and Execution -- Bilas Update on the Asymptotic Optimality of LPT -- E2EWatch: An End-to-end Anomaly Diagnosis Framework for Production HPC Systems -- Scheduling and Load Balancing -- Collaborative GPU Preemption via Spatial Multitasking for Efficient GPU Sharing -- A Fixed-Parameter Algorithm for Scheduling Unit dependent Tasks with Unit Communication Delays -- Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers -- Taming Tail Latency in Key-Value Stores: a Scheduling Perspective -- A log-linear (2+5/6)-approximation algorithm for parallel machine scheduling with a single orthogonal resource -- An MPI-Parallel Algorithm for Mapping Complex Networks onto Hierarchical Architectures -- Pipelined Model Parallelism: Complexity Results and Memory Considerations -- Data Management, Analytics and Machine Learning -- Efficient and Systematic Partitioning of Large and Deep Neural Networks for Parallelization -- A GPU Architecture Aware Fine-Grain Pruning Technique for Deep Neural Networks -- Towards Flexible and Compiler-Friendly Layer Fusion for CNNs on Multicore CPUs -- Smart Distributed Data Sets for Stream Processing -- Cluster, Cloud and Edge Computing -- Colony: Parallel Functions as a Service on the Cloud-Edge Continuum -- Horizontal Scaling in Cloud using Contextual Bandits -- Geo-Distribute Cloud Application at the Edge -- A Fault Tolerant and Deadline Constrained Sequence Alignment Application on Cloud-based Spot GPU Instances -- Sustaining Performance While Reducing Energy Consumption: A Control Theory Approach -- Theory and Algorithms for Parallel and Distributed Processing -- Algorithm design for Tensor Units -- A Scalable Approximation Algorithm for Weighted Longest Common Subsequence -- TSL Queue: An Efficient Lock-free Design for Priority Queues -- G-Morph: Induced Subgraph Isomorphism Search of Labeled Graphs on a GPU -- Parallel and Distributed Programming, Interfaces, and Languages -- Accelerating Graph Applications Using Phased Transactional Memory -- Efficient GPU Computation using Task Graph Parallelism -- Towards High Performance Resilience using Performance Portable Abstractions -- Enhancing Load-Balancing of MPI Applications with Workshare -- Particle-In-Cell Simulation using Asynchronous Tasking -- Multicore and Manycore Parallelism -- Exploiting co-execution with one API: heterogeneity from a modern perspective -- Parallel Numerical Methods and Applications -- Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems -- Fault-tolerant LU factorization is low cost -- Mixed Precision Incomplete and Factorized Sparse Approximate Inverse Preconditioning on GPUs -- Outsmarting the Atmospheric Turbulence for Ground-Based Telescopes Using the Stochastic Levenberg-Marquardt Method -- GPU Accelerated Mahalanobis-average Hierarchical Clustering Analysis -- High performance architectures and accelerators -- PrioRAT: Criticality-Driven Prioritization Inside the On-Chip Memory Hierarchy -- Optimized Implementation of the HPCG Benchmark on Reconfigurable Hardware.
This book constitutes the proceedings of the 27th International Conference on Parallel and Distributed Computing, Euro-Par 2021, held in Lisbon, Portugal, in August 2021. The conference was held virtually due to the COVID-19 pandemic. The 38 full papers presented in this volume were carefully reviewed and selected from 136 submissions. They deal with parallel and distributed computing in general, focusing on compilers, tools and environments; performance and power modeling, prediction and evaluation; scheduling and load balancing; data management, analytics and machine learning; cluster, cloud and edge computing; theory and algorithms for parallel and distributed processing; parallel and distributed programming, interfaces, and languages; parallel numerical methods and applications; and high performance architecture and accelerators.
9783030856656
10.1007/978-3-030-85665-6 doi
Software engineering.
Computer engineering.
Computer networks.
Compilers (Computer programs).
Computers.
Operating systems (Computers).
Software Engineering.
Computer Engineering and Networks.
Compilers and Interpreters.
Computer Hardware.
Operating Systems.
QA76.758
005.1