Publications in collaboration with Juan Carlos Fernandez Fernández (25)

2015

  1. Efficient hardware-supported synchronization mechanisms for manycores

    Handbook on Data Centers (Springer New York), pp. 753-803

2013

  1. Deploying hardware locks to improve performance and energy efficiency of hardware transactional memory

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  2. Design of an efficient communication infrastructure for highly contended locks in many-core CMPs

    Journal of Parallel and Distributed Computing, Vol. 73, No. 7, pp. 972-985

  3. ECONO: Express coherence notifications for efficient cache coherency in many-core CMPs

    Proceedings - 2013 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2013

  4. Efficient Dir0B cache coherency for many-core CMPs

    Procedia Computer Science

  5. On the design of energy-efficient hardware transactional memory systems

    Concurrency and Computation: Practice and Experience

2012

  1. CUDA and OpenCL implementations of 3D Fast Wavelet Transform

    2012 IEEE 3rd Latin American Symposium on Circuits and Systems, LASCAS 2012 - Conference Proceedings

  2. Design of a collective communication infrastructure for barrier synchronization in cluster-based nanoscale MPSoCs

    Proceedings - Design, Automation and Test in Europe, DATE

  3. Dynamic Serialization: Improving energy consumption in eager-eager hardware transactional memory systems

    Proceedings - 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2012

  4. Efficient hardware barrier synchronization in many-core CMPs

    IEEE Transactions on Parallel and Distributed Systems, Vol. 23, No. 8, pp. 1453-1466

  5. Stencil computations on heterogeneous platforms for the Jacobi method: GPUs versus Cell BE

    Journal of Supercomputing, Vol. 62, No. 2, pp. 787-803

  6. The 2D wavelet transform on emerging architectures: GPUs and multicores

    Journal of Real-Time Image Processing, Vol. 7, No. 3, pp. 145-152

2011

  1. GLocks: Efficient support for highly-contended locks in many-core CMPs

    Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011

2010

  1. A G-line-based network for fast and efficient barrier synchronization in many-core CMPs

    Proceedings of the International Conference on Parallel Processing

  2. Characterizing energy consumption in hardware transactional memory systems

    Proceedings - 22nd International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2010

  3. Characterizing the basic synchronization and communication operations in Dual Cell-based Blades through CellStats

    Journal of Supercomputing, Vol. 53, No. 2, pp. 247-268

  4. Efficient and scalable barrier synchronization for many-core CMPs

    CF 2010 - Proceedings of the 2010 Computing Frontiers Conference

  5. Parallel 3D fast wavelet transform on manycore GPUs and multicore CPUs

    Procedia Computer Science

2009

  1. A parallel implementation of the 2D wavelet transform using CUDA

    Proceedings of the 17th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2009