Convolution is one of the most computationally intensive operations in CNNs. The traditional approach to computing convolutions is known as the Im2col + BLAS method. This presentation talks about SConv: a direct-convolution algorithm based on an MLIR/LLVM code-generation toolchain that uses vectorization and matrix-multiplication ISA extensions to improve convolution performance, surpassing Im2col + BLAS on Intel x86 and IBM POWER10. We also describe a vector-based convolution packing routine that reduces total packing time, on full model inference, by 2.0x -- 3.9x on Intel x86 and 3.6x -- 7.2x on IBM POWER10. SConv's convolution speedup over an Im2col + BLAS method based on current BLAS implementations is 12% -- 27% on Intel x86 and 26% -- 46% on IBM POWER10. The final speedup for end-to-end machine-learning model inference ranges from 9% -- 25% on Intel x86 and 10% -- 42% on IBM POWER10. At the end of the talk, we lay out a plan to port SConv to RISC-V architectures.
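For context, the Im2col + BLAS baseline mentioned above lowers convolution to a single matrix multiplication by unfolding ("packing") input patches into the columns of a matrix. The sketch below is a minimal single-channel NumPy illustration (stride 1, no padding); the function names `im2col`, `conv_im2col`, and `conv_direct` are illustrative only and are not SConv's or any BLAS library's actual API.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold each kh x kw patch of a 2D input into one column (stride 1, no padding)."""
    H, W = x.shape
    oh, ow = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[i:i + kh, j:j + kw].ravel()
    return cols

def conv_im2col(x, k):
    """Convolution (as cross-correlation) expressed as a single matrix multiply,
    which a BLAS GEMM routine would execute in a real implementation."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    return (k.ravel() @ im2col(x, kh, kw)).reshape(oh, ow)

def conv_direct(x, k):
    """Naive direct convolution, the style of computation SConv generates code for."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

# Both paths compute the same result; im2col trades extra memory and packing
# time for the ability to call a highly tuned GEMM.
x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((2, 2))
assert np.allclose(conv_im2col(x, k), conv_direct(x, k))
```

The packing step (`im2col`) duplicates overlapping input pixels, which is exactly the overhead the talk's vector-based packing routine and direct-convolution approach aim to reduce.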
Presenter: Guido Araújo, Full Professor of Computer Science and Engineering at the University of Campinas, Brazil.