Publications in collaboration with researchers from Centro Nacional de Supercomputación (24)

2022

  1. Compiler-Assisted Compaction/Restoration of SIMD Instructions

    IEEE Transactions on Parallel and Distributed Systems, Vol. 33, No. 4, pp. 779-791

2020

  1. Design space exploration of accelerators and end-to-end DNN evaluation with TFLITE-SOC

    Proceedings - Symposium on Computer Architecture and High Performance Computing

  2. Efficiency analysis of modern vector architectures: vector ALU sizes, core counts and clock frequencies

    Journal of Supercomputing, Vol. 76, No. 3, pp. 1960-1979

  3. Improving predication efficiency through compaction/restoration of SIMD instructions

    Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020

  4. Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance

    International Journal of High Performance Computing Applications, Vol. 34, No. 2, pp. 199-207

  5. Semi-automatic validation of cycle-accurate simulation infrastructures: The case for gem5-x86

    Future Generation Computer Systems, Vol. 112, pp. 832-847

  6. Using Arm’s scalable vector extension on stencil codes

    Journal of Supercomputing, Vol. 76, No. 3, pp. 2039-2062

2019

  1. Poster: An optimized predication execution for SIMD extensions

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

2018

  1. A vectorized k-means algorithm for compressed datasets: design and experimental analysis

    Journal of Supercomputing, Vol. 74, No. 6, pp. 2705-2728

  2. Performance and energy effects on task-based parallelized applications: User-directed versus manual vectorization

    Journal of Supercomputing, Vol. 74, No. 6, pp. 2627-2637

  3. Stencil codes on a vector length agnostic architecture

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

2017

  1. Code modernization strategies to 3-D Stencil-based applications on Intel Xeon Phi: KNC and KNL

    Computers and Mathematics with Applications, Vol. 74, No. 10, pp. 2557-2571

  2. Energy efficiency effects of vectorization in data reuse transformations for many-core processors—a case study

    Journal of Low Power Electronics and Applications, Vol. 7, No. 1

2016

  1. Architectural support for efficient message passing on shared memory multi-cores

    Journal of Parallel and Distributed Computing, Vol. 95, pp. 92-106

  2. Improving I/O performance through an in-kernel disk simulator

    Computer Journal, Vol. 59, No. 10, pp. 1433-1452

2015

  1. DiMP: Architectural support for direct message passing on shared memory multi-cores

    Proceedings of the International Conference on Parallel Processing

  2. Enhancing garbage collection synchronization using explicit bit barriers

    Proceedings of the International Conference on Parallel Processing

  3. ParaDIME: Parallel Distributed Infrastructure for Minimization of Energy for data centers

    Microprocessors and Microsystems, Vol. 39, No. 8, pp. 1174-1189

2014

  1. A general framework for dynamic and automatic I/O scheduling in hard and solid-state drives

    Journal of Parallel and Distributed Computing, Vol. 74, No. 5, pp. 2380-2391

2013

  1. Techniques to improve performance in requester-wins Hardware Transactional Memory

    Transactions on Architecture and Code Optimization, Vol. 10, No. 4