Arquitectura y Computación Paralela
Centro Nacional de Supercomputación
Barcelona, Spain
Publications in collaboration with researchers from Centro Nacional de Supercomputación (24)
2022
-
Compiler-Assisted Compaction/Restoration of SIMD Instructions
IEEE Transactions on Parallel and Distributed Systems, Vol. 33, No. 4, pp. 779-791
2020
-
Design space exploration of accelerators and end-to-end DNN evaluation with TFLITE-SOC
Proceedings - Symposium on Computer Architecture and High Performance Computing
-
Efficiency analysis of modern vector architectures: vector ALU sizes, core counts and clock frequencies
Journal of Supercomputing, Vol. 76, No. 3, pp. 1960-1979
-
Improving predication efficiency through compaction/restoration of SIMD instructions
Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
-
Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance
International Journal of High Performance Computing Applications, Vol. 34, No. 2, pp. 199-207
-
Semi-automatic validation of cycle-accurate simulation infrastructures: The case for gem5-x86
Future Generation Computer Systems, Vol. 112, pp. 832-847
-
Using Arm’s scalable vector extension on stencil codes
Journal of Supercomputing, Vol. 76, No. 3, pp. 2039-2062
2019
-
Poster: An optimized predication execution for SIMD extensions
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2018
-
A vectorized k-means algorithm for compressed datasets: design and experimental analysis
Journal of Supercomputing, Vol. 74, No. 6, pp. 2705-2728
-
Performance and energy effects on task-based parallelized applications: User-directed versus manual vectorization
Journal of Supercomputing, Vol. 74, No. 6, pp. 2627-2637
-
Stencil codes on a vector length agnostic architecture
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2017
-
Code modernization strategies to 3-D Stencil-based applications on Intel Xeon Phi: KNC and KNL
Computers and Mathematics with Applications, Vol. 74, No. 10, pp. 2557-2571
-
Energy efficiency effects of vectorization in data reuse transformations for many-core processors—a case study
Journal of Low Power Electronics and Applications, Vol. 7, No. 1
2016
-
Architectural support for efficient message passing on shared memory multi-cores
Journal of Parallel and Distributed Computing, Vol. 95, pp. 92-106
-
Improving I/O performance through an in-kernel disk simulator
Computer Journal, Vol. 59, No. 10, pp. 1433-1452
2015
-
DiMP: Architectural support for direct message passing on shared memory multi-cores
Proceedings of the International Conference on Parallel Processing
-
Enhancing garbage collection synchronization using explicit bit barriers
Proceedings of the International Conference on Parallel Processing
-
ParaDIME: Parallel Distributed Infrastructure for Minimization of Energy for data centers
Microprocessors and Microsystems, Vol. 39, No. 8, pp. 1174-1189
2014
-
A general framework for dynamic and automatic I/O scheduling in hard and solid-state drives
Journal of Parallel and Distributed Computing, Vol. 74, No. 5, pp. 2380-2391
2013
-
Techniques to improve performance in requester-wins Hardware Transactional Memory
Transactions on Architecture and Code Optimization, Vol. 10, No. 4