Time-power-energy balance of BLAS kernels in modern FPGAs
Abstract:
Numerical Linear Algebra (NLA) is a research field that in recent decades has been characterized by the use of kernel libraries that are de facto standards. One of the most remarkable examples, particularly in the HPC field, is the Basic Linear Algebra Subprograms (BLAS). Most BLAS operations are fundamental to many scientific algorithms because they generally constitute the most computationally expensive stage. For this reason, numerous efforts have been made to optimize these operations on various hardware platforms. There is growing concern in the high-performance computing world about power consumption, which makes energy efficiency an extremely important criterion when evaluating hardware platforms. Owing to their greater energy efficiency, Field-Programmable Gate Arrays (FPGAs) are now an interesting alternative to other hardware platforms for accelerating this type of operation. Our study focuses on evaluating FPGAs for dense NLA operations. Specifically, in this work we explore and evaluate the available options for two of the most representative BLAS kernels, GEMV and GEMM. The experimental evaluation is carried out on an Alveo U50 accelerator card from Xilinx and an Intel Xeon Silver multicore CPU. Our findings show that even for kernels where the CPU achieves better runtimes, the FPGA counterpart is more energy efficient.
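As an illustration of the two kernels named in the abstract and of the time-power-energy relation in the title, the following is a minimal sketch in plain Python. The GEMV and GEMM functions follow the standard BLAS definitions (y ← αAx + βy and C ← αAB + βC); they are unoptimized reference loops, not the implementations evaluated in the paper. The power and runtime figures are hypothetical values chosen only to show how a slower device can still consume less energy; they are not measurements from this work.

```python
def gemv(alpha, A, x, beta, y):
    """Reference GEMV: returns alpha * A @ x + beta * y (A is a list of rows)."""
    return [
        alpha * sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + beta * y_i
        for row, y_i in zip(A, y)
    ]


def gemm(alpha, A, B, beta, C):
    """Reference GEMM: returns alpha * A @ B + beta * C (lists of rows)."""
    cols_B = list(zip(*B))  # columns of B, so each dot product is row-by-column
    return [
        [
            alpha * sum(a * b for a, b in zip(row, col)) + beta * c_ij
            for col, c_ij in zip(cols_B, c_row)
        ]
        for row, c_row in zip(A, C)
    ]


def energy_joules(avg_power_watts, runtime_seconds):
    """Energy consumed at a constant average power: E = P * t."""
    return avg_power_watts * runtime_seconds


# Hypothetical figures (NOT taken from the paper): the CPU finishes sooner,
# but draws more power while running.
cpu_energy = energy_joules(avg_power_watts=85.0, runtime_seconds=1.0)
fpga_energy = energy_joules(avg_power_watts=20.0, runtime_seconds=2.0)

# With these assumed numbers the FPGA takes twice as long yet uses less energy.
fpga_is_more_efficient = fpga_energy < cpu_energy
```

Under these assumed numbers the device with the longer runtime has the lower energy, which is the kind of trade-off the abstract reports; the actual measured values are in the full paper.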
2022 | |
The researchers were supported by the Universidad de la República and PEDECIBA. The authors thank the ANII – MPG Independent Research Groups: “Efficient Heterogeneous Computing” - CSC group |
|
Dense numerical linear algebra Energy-efficiency HPC Matrix-matrix multiplication |
|
English | |
Universidad de la República | |
COLIBRI | |
https://link.springer.com/chapter/10.1007/978-3-031-23821-5_6
https://hdl.handle.net/20.500.12008/35893 |
|
Open access | |
Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0) |
_version_ | 1807522899694714880 |
---|---|
author | Favaro, Federico |
author2 | Dufrechou, Ernesto Oliver, Juan Pablo Ezzatti, Pablo |
author2_role | author author author |
author_facet | Favaro, Federico Dufrechou, Ernesto Oliver, Juan Pablo Ezzatti, Pablo |
author_role | author |
bitstream.checksum.fl_str_mv | 6429389a7df7277b72b7924fdc7d47a9 a006180e3f5b2ad0b88185d14284c0e0 36c32e9c6da50e6d55578c16944ef7f6 1996b8461bc290aef6a27d78c67b6b52 fbe20d980300a15e13713e4cfb1a3c9c |
bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 MD5 |
bitstream.url.fl_str_mv | http://localhost:8080/xmlui/bitstream/20.500.12008/35893/5/license.txt http://localhost:8080/xmlui/bitstream/20.500.12008/35893/2/license_url http://localhost:8080/xmlui/bitstream/20.500.12008/35893/3/license_text http://localhost:8080/xmlui/bitstream/20.500.12008/35893/4/license_rdf http://localhost:8080/xmlui/bitstream/20.500.12008/35893/1/FDOE22.pdf |
collection | COLIBRI |
dc.contributor.filiacion.none.fl_str_mv | Favaro Federico, Universidad de la República (Uruguay). Facultad de Ingeniería. Dufrechou Ernesto, Universidad de la República (Uruguay). Facultad de Ingeniería. Oliver Juan Pablo, Universidad de la República (Uruguay). Facultad de Ingeniería. Ezzatti Pablo, Universidad de la República (Uruguay). Facultad de Ingeniería. |
dc.creator.none.fl_str_mv | Favaro, Federico Dufrechou, Ernesto Oliver, Juan Pablo Ezzatti, Pablo |
dc.date.accessioned.none.fl_str_mv | 2023-02-14T12:21:43Z |
dc.date.available.none.fl_str_mv | 2023-02-14T12:21:43Z |
dc.date.issued.none.fl_str_mv | 2022 |
dc.description.abstract.none.fl_txt_mv | Numerical Linear Algebra (NLA) is a research field that in recent decades has been characterized by the use of kernel libraries that are de facto standards. One of the most remarkable examples, particularly in the HPC field, is the Basic Linear Algebra Subprograms (BLAS). Most BLAS operations are fundamental to many scientific algorithms because they generally constitute the most computationally expensive stage. For this reason, numerous efforts have been made to optimize these operations on various hardware platforms. There is growing concern in the high-performance computing world about power consumption, which makes energy efficiency an extremely important criterion when evaluating hardware platforms. Owing to their greater energy efficiency, Field-Programmable Gate Arrays (FPGAs) are now an interesting alternative to other hardware platforms for accelerating this type of operation. Our study focuses on evaluating FPGAs for dense NLA operations. Specifically, in this work we explore and evaluate the available options for two of the most representative BLAS kernels, GEMV and GEMM. The experimental evaluation is carried out on an Alveo U50 accelerator card from Xilinx and an Intel Xeon Silver multicore CPU. Our findings show that even for kernels where the CPU achieves better runtimes, the FPGA counterpart is more energy efficient. |
dc.description.es.fl_txt_mv | Conference proceedings 2022 High Performance Computing. 9th Latin American Conference, CARLA 2022, Porto Alegre, Brazil, 26-30 sep 2022, Revised Selected Papers. |
dc.description.sponsorship.none.fl_txt_mv | The researchers were supported by the Universidad de la República and PEDECIBA. The authors thank the ANII – MPG Independent Research Groups: “Efficient Heterogeneous Computing” - CSC group |
dc.format.extent.es.fl_str_mv | 12 p. |
dc.format.mimetype.es.fl_str_mv | application/pdf |
dc.identifier.citation.es.fl_str_mv | Favaro, F., Dufrechou, E., Oliver, J. et al. Time-power-energy balance of BLAS kernels in modern FPGAs [online]. In: High Performance Computing, CARLA 2022. Communications in Computer and Information Science, (CCIS, volume 1660), Springer, Cham, 2022, pp. 78-89. DOI: 10.1007/978-3-031-23821-5_6 |
dc.identifier.doi.none.fl_str_mv | 10.1007/978-3-031-23821-5_6 |
dc.identifier.isbn.none.fl_str_mv | 978-3-031-23820-8 |
dc.identifier.uri.none.fl_str_mv | https://link.springer.com/chapter/10.1007/978-3-031-23821-5_6 https://hdl.handle.net/20.500.12008/35893 |
dc.language.iso.none.fl_str_mv | en eng |
dc.publisher.es.fl_str_mv | Springer |
dc.relation.ispartof.es.fl_str_mv | High Performance Computing, CARLA 2022. Communications in Computer and Information Science, (CCIS, volume 1660), Springer, Cham, 2022, pp. 78-89. |
dc.rights.license.none.fl_str_mv | Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0) |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess |
dc.source.none.fl_str_mv | reponame:COLIBRI instname:Universidad de la República instacron:Universidad de la República |
dc.subject.es.fl_str_mv | Dense numerical linear algebra Energy-efficiency HPC Matrix-matrix multiplication |
dc.title.none.fl_str_mv | Time-power-energy balance of BLAS kernels in modern FPGAs |
dc.type.es.fl_str_mv | Book chapter |
dc.type.none.fl_str_mv | info:eu-repo/semantics/bookPart |
dc.type.version.none.fl_str_mv | info:eu-repo/semantics/publishedVersion |
description | Conference proceedings 2022 |
eu_rights_str_mv | openAccess |
format | bookPart |
id | COLIBRI_1a8ca799e498364087a188be67fa9de4 |
identifier_str_mv | Favaro, F., Dufrechou, E., Oliver, J. et al. Time-power-energy balance of BLAS kernels in modern FPGAs [online]. In: High Performance Computing, CARLA 2022. Communications in Computer and Information Science, (CCIS, volume 1660), Springer, Cham, 2022, pp. 78-89. DOI: 10.1007/978-3-031-23821-5_6 978-3-031-23820-8 10.1007/978-3-031-23821-5_6 |
instacron_str | Universidad de la República |
institution | Universidad de la República |
instname_str | Universidad de la República |
language | eng |
language_invalid_str_mv | en |
network_acronym_str | COLIBRI |
network_name_str | COLIBRI |
oai_identifier_str | oai:colibri.udelar.edu.uy:20.500.12008/35893 |
publishDate | 2022 |
reponame_str | COLIBRI |
repository.mail.fl_str_mv | mabel.seroubian@seciu.edu.uy |
repository.name.fl_str_mv | COLIBRI - Universidad de la República |
repository_id_str | 4771 |
rights_invalid_str_mv | Creative Commons Attribution - NonCommercial - NoDerivatives license (CC BY-NC-ND 4.0) |
spellingShingle | Time-power-energy balance of BLAS kernels in modern FPGAs Favaro, Federico Dense numerical linear algebra Energy-efficiency HPC Matrix-matrix multiplication |
status_str | publishedVersion |
title | Time-power-energy balance of BLAS kernels in modern FPGAs |
topic | Dense numerical linear algebra Energy-efficiency HPC Matrix-matrix multiplication |
url | https://link.springer.com/chapter/10.1007/978-3-031-23821-5_6 https://hdl.handle.net/20.500.12008/35893 |