Publications in collaboration with researchers from Northeastern University (20)

2024

  1. AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators

    CGO 2024 - Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization

  2. NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator

    Proceedings - International Symposium on Computer Architecture

  3. Scalability Limitations of Processing-in-Memory using Real System Evaluations

    Performance Evaluation Review, Vol. 52, No. 1, pp. 63-64

  4. Scalability Limitations of Processing-in-Memory using Real System Evaluations

    SIGMETRICS/PERFORMANCE 2024 - Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems

  5. Scalability limitations of processing-in-memory using real system evaluations

    Proceedings of the ACM on Measurement and Analysis of Computing Systems, Vol. 8, No. 1

2023

  1. Accelerating Finite Field Arithmetic for Homomorphic Encryption on GPUs

    IEEE Micro, Vol. 43, No. 5, pp. 55-63

  2. GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption

    Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023

2022

  1. Accelerating Polynomial Multiplication for Homomorphic Encryption on GPUs

    Proceedings - 2022 IEEE International Symposium on Secure and Private Execution Environment Design, SEED 2022

  2. Multi-modality machine learning predicting Parkinson’s disease

    npj Parkinson's Disease, Vol. 8, No. 1

  3. NaviSim: A Highly Accurate GPU Simulator for AMD RDNA GPUs

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

2021

  1. GNNMark: A Benchmark Suite to Characterize Graph Neural Network Training on GPUs

    Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021

  2. Spartan: A Sparsity-Adaptive Framework to Accelerate Deep Neural Network Training on GPUs

    IEEE Transactions on Parallel and Distributed Systems, Vol. 32, No. 10, pp. 2448-2463

2020

  1. Design space exploration of accelerators and end-to-end DNN evaluation with TFLITE-SOC

    Proceedings - Symposium on Computer Architecture and High Performance Computing

  2. Griffin: Hardware-software support for efficient page migration in multi-GPU systems

    Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020

  3. Valkyrie: Leveraging inter-TLB locality to enhance GPU performance

    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

2019

  1. MGPUSim: Enabling multi-GPU performance modeling and optimization

    Proceedings - International Symposium on Computer Architecture

2018

  1. Profiling DNN Workloads on a Volta-based DGX-1 System

    2018 IEEE International Symposium on Workload Characterization, IISWC 2018

2016

  1. UMH: A hardware-based unified memory hierarchy for systems with multiple discrete GPUs

    ACM Transactions on Architecture and Code Optimization, Vol. 13, No. 4

2015

  1. Asymmetric NoC architectures for GPU systems

    Proceedings - 2015 9th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2015

  2. Leveraging silicon-photonic NoC for designing scalable GPUs

    Proceedings of the International Conference on Supercomputing