Architecture and Parallel Computing
Northeastern University
Boston, United States
Publications in collaboration with researchers from Northeastern University (19)
2024
-
AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators
CGO 2024 - Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization
-
NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator
Proceedings - International Symposium on Computer Architecture
-
Scalability Limitations of Processing-in-Memory using Real System Evaluations
Performance Evaluation Review, Vol. 52, No. 1, pp. 63-64
-
Scalability Limitations of Processing-in-Memory using Real System Evaluations
SIGMETRICS/PERFORMANCE 2024 - Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems
-
Scalability limitations of processing-in-memory using real system evaluations
Proceedings of the ACM on Measurement and Analysis of Computing Systems, Vol. 8, No. 1
2023
-
Accelerating Finite Field Arithmetic for Homomorphic Encryption on GPUs
IEEE Micro, Vol. 43, No. 5, pp. 55-63
-
GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
2022
-
Accelerating Polynomial Multiplication for Homomorphic Encryption on GPUs
Proceedings - 2022 IEEE International Symposium on Secure and Private Execution Environment Design, SEED 2022
-
NaviSim: A Highly Accurate GPU Simulator for AMD RDNA GPUs
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2021
-
GNNMark: A Benchmark Suite to Characterize Graph Neural Network Training on GPUs
Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021
-
Spartan: A Sparsity-Adaptive Framework to Accelerate Deep Neural Network Training on GPUs
IEEE Transactions on Parallel and Distributed Systems, Vol. 32, No. 10, pp. 2448-2463
2020
-
Design space exploration of accelerators and end-to-end DNN evaluation with TFLITE-SOC
Proceedings - Symposium on Computer Architecture and High Performance Computing
-
Griffin: Hardware-software support for efficient page migration in multi-GPU systems
Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
-
Valkyrie: Leveraging inter-TLB locality to enhance GPU performance
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2019
-
MGPUSim: Enabling multi-GPU performance modeling and optimization
Proceedings - International Symposium on Computer Architecture
2018
-
Profiling DNN Workloads on a Volta-based DGX-1 System
2018 IEEE International Symposium on Workload Characterization, IISWC 2018
2016
-
UMH: A hardware-based unified memory hierarchy for systems with multiple discrete GPUs
ACM Transactions on Architecture and Code Optimization, Vol. 13, No. 4
2015
-
Asymmetric NoC architectures for GPU systems
Proceedings - 2015 9th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2015
-
Leveraging silicon-photonic NoC for designing scalable GPUs
Proceedings of the International Conference on Supercomputing