Linux and AMD

alex

Joined: 5 Jul 23
Posts: 7
Credit: 1,956,042
RAC: 2,689
Message 3519 - Posted: 5 Jul 2023, 21:13:43 UTC
Last modified: 5 Jul 2023, 21:16:13 UTC

Hi all,
I joined NumberFields today with one of my Linux PCs and one of my Win10 PCs. I'm trying to move most of my PCs from Windows to Linux, so I tried it with Debian and Mint.
The problem seems to be the AMD driver.
Pc: https://numberfields.asu.edu/NumberFields/show_host_detail.php?hostid=2862319
Workunits: https://numberfields.asu.edu/NumberFields/result.php?resultid=191399055
<core_client_version>7.22.1</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
GPU Summary String = [CAL|AMDRadeonGraphics(renoir,LLVM15.0.7,DRM3.42,5.15.0-76-generic)|1|3072MB||101].
Loading GPU lookup table from file.
GPU was not found in the lookup table. Using default values:
numBlocks = 1024.
threadsPerBlock = 32.
polyBufferSize = 32768.
Error: clBuildProgram() returned CL_BUILD_PROGRAM_FAILURE
Build Log:

Error: Failed to initialize OpenCL.

</stderr_txt>
]]>
It fails seconds after starting.

I also ran NumberFields on a Windows PC with AMD and Nvidia GPUs. It runs fine, no problems.

Boinc reports:
Mi 05 Jul 2023 23:08:57 | | OpenCL: AMD/ATI GPU 0: AMD Radeon Graphics (renoir, LLVM 15.0.7, DRM 3.42, 5.15.0-76-ge (driver version 23.1.1, device version OpenCL 1.1 Mesa 23.1.1 (git-fa55e3c026), 3072MB, 3072MB available, 1064 GFLOPS peak)
on the Mint PC.
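
The "device version" field in that BOINC line shows which OpenCL implementation the client picked up. Here it reads "OpenCL 1.1 Mesa 23.1.1", i.e. the open-source Mesa implementation rather than AMD's own runtime, which may be relevant to the build failure. A minimal sketch for pulling that field out of a log line, assuming the log format shown above; the function names `opencl_device_version` and `is_mesa_opencl` are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the "device version ..." field from a BOINC
# startup log line of the form shown in this post.
opencl_device_version() {
    # Reads log text on stdin and prints the text between
    # "device version " and the next comma.
    sed -n 's/.*device version \([^,]*\),.*/\1/p'
}

# Hypothetical helper: report whether the implementation string is Mesa's.
is_mesa_opencl() {
    case "$1" in
        *Mesa*) echo "mesa" ;;
        *)      echo "other" ;;
    esac
}
```

Piping the BOINC event log through `opencl_device_version` and passing the result to `is_mesa_opencl` gives a quick before/after check when trying a different driver.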

I started using Linux just a few months ago, so I'm not very experienced in setting it up correctly, but I will try if someone has a solution for me!
Please help me!

Just for info: it's the latest available Mint distro.
Edit: I'm using the Flathub version of BOINC.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1341
Credit: 492,795,751
RAC: 552,159
Message 3520 - Posted: 6 Jul 2023, 0:01:44 UTC - in response to Message 3519.  

Clearly the openCL compiler failed to build the code, and since the compiler is part of the driver that would point to a driver problem. I'm not very familiar with all the different versions of AMD GPUs, but this looks to be an integrated graphics processor. In the past, these GPUs were not as reliable as discrete GPUs (at least on this project), so that could also be part of the problem. Hopefully someone who has experience with AMD integrated graphics can chime in with an idea.

alex
Message 3521 - Posted: 6 Jul 2023, 0:58:10 UTC - in response to Message 3520.  

Thank you for the response.
To be clear, it is a Ryzen 5 5600G, which means integrated graphics. I will boot it into Windows tomorrow and attach it to NumberFields; let's see how well it works with the Windows drivers. Is it hardware or software?

alex
Message 3523 - Posted: 6 Jul 2023, 2:18:21 UTC

No, it's not a hardware problem; it works fine with Windows:
https://numberfields.asu.edu/NumberFields/result.php?resultid=191429822

It's the same machine (dual boot).

Keith Myers
Joined: 14 May 23
Posts: 8
Credit: 161,015,055
RAC: 507,698
Message 3524 - Posted: 6 Jul 2023, 2:37:01 UTC - in response to Message 3523.  
Last modified: 6 Jul 2023, 2:42:45 UTC

Read the posts at Einstein and Milkyway about installing the proper drivers with OpenCL support for the newer AMD cards, which include your iGPU.

You might try this:

amdgpu-install_5.4.50403-1_all.deb

amdgpu-install --opencl=rocr,legacy --vulkan=amdvlk,pro --accept-eula


https://docs.amd.com/en/docs-5.4.3/deploy/linux/index.html
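
Spelled out as a sequence, the suggestion above might look like the sketch below. It assumes a Debian/Ubuntu-based system and that the installer package was downloaded into the current directory. The `usermod` step is not from the post; it is an assumption that the user running BOINC needs to be in the `render` and `video` groups. By default the script only prints the commands, and this is not a guaranteed fix.

```shell
#!/usr/bin/env bash
# Sketch of the suggested install sequence on a Debian/Ubuntu-based system.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

install_amdgpu_opencl() {
    # Path to the installer package from AMD (assumed to be a local file).
    local deb="${1:-./amdgpu-install_5.4.50403-1_all.deb}"
    run sudo apt install "$deb"
    # Flags exactly as given in the post above.
    run sudo amdgpu-install --opencl=rocr,legacy --vulkan=amdvlk,pro --accept-eula
    # Assumption: BOINC runs as the current user, who needs access to the
    # GPU device nodes through the render and video groups.
    run sudo usermod -aG render,video "${USER:-$(id -un)}"
}
```

Calling `install_amdgpu_opencl` with no arguments prints the three commands for review; a reboot or fresh login is usually needed before group changes take effect.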

alex
Message 3525 - Posted: 6 Jul 2023, 12:16:16 UTC - in response to Message 3524.  

I followed the "You might try this" steps and the docs.amd.com instructions, but unfortunately without success.

I will look into the Einstein and Milkyway posts now.

alex
Message 3527 - Posted: 6 Jul 2023, 16:40:02 UTC

I took a look at the Milkyway forum and found some interesting posts. Some of them give an impression of how AMD works with Linux:

Vega only supported by ROCm 4.5
ROCm 4.5 only supported on 20.04 with 5.11 kernel or 18.04 with 5.4 kernel.

the AMD linux driver situation is certainly all over the place. thankfully nvidia's driver install is pretty painless on both windows and Linux. sucks that AMD's linux experience isn't similar.

I have recently played the match the drivers kernel to the Linux Mint version so I could get my `old` amd7970 running
it worked , eventualy ,finding out then remembering that "Legacy" commandline switch had to be used so that a working open cl would be installed
What a total pain in the #&%$" it was
Two installs of mint , nuked the first 20.2
went back to mint 19.0 as it had the 4.15.0 kernel the driver needed
Then its make shure the card don't cook while crunching coz the fans are far too slow
the back of that card was instant burnt fingers hot
Read a heap more on the web , try it , it don't work , BUT find , dirty hack the file `pwm1` with 255 and its full fan speed for cool crunching .
Though I have an almost identical system that runs win7 ultimate and amd7970s all that OS / drivers / fan speed control stuff is so easy ..
And yet another system that has Linux Mint but with a nv 1060 in it , install driver / install "coolbits" = get work done .
AMD shure duz need to get its act together
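
The `pwm1` hack mentioned in the quoted post can be sketched roughly as follows. This is an illustration, not a recommendation: the helper name `set_fan_full` is hypothetical, the hwmon directory has to be supplied by the caller because its location depends on the card and driver, and writing to these files normally requires root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: force full fan speed through the hwmon pwm1 file.
# The first argument is the hwmon directory that contains pwm1.
set_fan_full() {
    local hwmon="$1"
    if [ ! -w "$hwmon/pwm1" ]; then
        echo "pwm1 not writable in $hwmon (need root?)" >&2
        return 1
    fi
    # In the hwmon sysfs interface, 1 selects manual fan control.
    # Only written if the driver exposes the file.
    if [ -w "$hwmon/pwm1_enable" ]; then
        echo 1 > "$hwmon/pwm1_enable"
    fi
    # 255 is the maximum duty cycle, i.e. full speed, as in the quoted post.
    echo 255 > "$hwmon/pwm1"
}
```

On many systems the directory is something like `/sys/class/drm/card0/device/hwmon/hwmon*/`, but that path is an assumption and should be checked on the machine in question.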

But I mean, if a project has Linux AMD GPU workunits, one of the developers should be able to post a procedure for enabling these GPUs.

I took a PC with an NV card and installed Mint --> works without problems.
Could someone please explain why AMD causes so much trouble? And please use English that is easy to understand for people whose first language is not English.

Thank you!

Keith Myers
Message 3528 - Posted: 6 Jul 2023, 16:58:33 UTC

Sorry to hear you had the most common experience of getting AMD GPUs to compute. Nvidia has been doing compute since its beginning. AMD has only started developing for compute in the last couple of years, and only for its Enterprise cards. Sadly, no development has been done for the consumer cards. They just haven't put the resources toward their consumer cards and fall back on what supposedly works for their Enterprise cards, which are not the same. Historically there have been too many flavors, and nothing in the drivers ties them all together. The drivers seem to be a mish-mash worked up separately for each generation family.

Nvidia has 10X the assets and 100X the market share in compute compared to AMD.

I have always used Nvidia since the start of my BOINCing, never an issue with the drivers. Reading the constant flux of posts from AMD gpu owners always leaves me very appreciative of that fact.

alex
Message 3529 - Posted: 6 Jul 2023, 17:19:09 UTC

Thank you Keith for your kind words.

The reason I always preferred AMD over NV on Windows was that the AMD cards were faster and cheaper. Yesterday I tried it on a Windows machine:

06.07.2023 19:07:40 | | OpenCL: NVIDIA GPU 0: NVIDIA GeForce GTX 1660 SUPER (driver version 531.30, device version OpenCL 3.0 CUDA, 6144MB, 6144MB available, 5027 GFLOPS peak)
06.07.2023 19:07:40 | | OpenCL: AMD/ATI GPU 0: AMD Radeon(TM) RX 6500 XT (driver version 3516.0 (PAL,LC), device version OpenCL 2.0 AMD-APP (3516.0), 4080MB, 4080MB available, 5499 GFLOPS peak)

The AMD WUs are 20% faster than the NV ones.

I tried Linux 5 years ago; it was a mess compared to Windows. But now installing both Mint and Debian is much easier and faster than installing Windows, there is much more privacy, and most programs for my daily use are available. And dual-boot systems are easy to install and use.

Let's see what the future brings regarding the drivers. The situation is much better than 5 years ago and hopefully much worse than it will be in one year.

All the best

Alexander

PS: If someone finds a solution, please post it in the project news! Just a headline and a link to the thread.

Keith Myers
Message 3530 - Posted: 6 Jul 2023, 20:31:35 UTC

It is not universally the case now that AMD is faster than Nvidia. Only 1 or 2 projects in the past ever had AMD GPUs as a preferred vendor, and that was primarily at Milkyway because of the older cards' better FP64 performance, which that project required. But now that project does not use GPUs anymore.

And in the latest generations of cards, BOTH AMD and Nvidia have been limiting the FP64 and FP32 performance more and more. Nvidia always limited its consumer card performance on purpose to steer the compute user towards its higher-cost and higher-margin professional cards.

If you take a look at the Top 100 hosts list at most projects using GPUs, you will find that Nvidia is at the top, with AMD cards much lower down the lists. The highest-performing AMD consumer card was the Radeon VII, with its faster and more abundant memory. But now even an RTX 3080 Ti beats that card in performance when using the optimized Nvidia CUDA applications at projects.

The fact that you can develop a purposely constructed CUDA application tailored specifically for Nvidia hardware gives you performance advantages. You don't have to rely on general-purpose OpenCL applications that don't use the hardware efficiently.

Eric Driver
Message 3531 - Posted: 7 Jul 2023, 6:26:34 UTC - in response to Message 3529.  

Hey Alex -

I also have had all kinds of frustration with the AMD drivers, so you are not alone. The trick for me was building the ROCm libraries from scratch, which took the better part of a day. A year later, after a kernel upgrade, it stopped working so I got to do it all again.

Now for the good news. My latest kernel upgrade was several months ago. After the upgrade, I installed the AMD drivers using the ROCm repo, and it actually worked out of the box. So AMD might finally be getting their act together.

With that said, I use Fedora, so I'm not sure how this translates to Mint. In case it helps, I will give the process I used on Fedora. This is based on my command history from several months ago. I executed the following commands in a terminal window (as root):
dnf config-manager --add-repo http://repo.radeon.com/rocm/
dnf list available | grep -i rocm
dnf install rocm-*

alex
Message 3541 - Posted: 8 Jul 2023, 16:20:26 UTC
Last modified: 8 Jul 2023, 17:20:23 UTC

Hey Eric,

I've tried your advice. I had to install dnf first, as it is not part of the distribution.
When trying dnf config-manager I got the error message "Kein solcher Befehl: config-manager. Bitte /usr/bin/dnf --help verwenden." (in English: "No such command: config-manager. Please use /usr/bin/dnf --help.")
The help output does not even list a config command. Do I need an add-on first?

Edit: The package manager of Mint contains a lot of ROCm packages. I tried to install some of them, only to see that there are a lot of conflicts, and the installations never completed. What would be a good starting point here?

Eric Driver
Message 3543 - Posted: 8 Jul 2023, 17:11:22 UTC - in response to Message 3541.  

Now that you have dnf installed, try the following to give you a list of what's available for install:
dnf list available | grep -i dnf

Looking at what I have installed, you might need the following packages:
"dnf-plugins-core.noarch", "dnf-data.noarch", and "libdnf.x86_64"

Remember, dnf comes with the Fedora distribution, so there's no guarantee this will work with the Mint distribution (I am not familiar with Mint). Instead of following my Fedora instructions verbatim, it might be best to try the equivalent procedure on Mint. Basically, the idea is simple: enable the ROCm repo and then install it.
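
On an apt-based distribution like Mint, "enable the ROCm repo and then install it" might look roughly like the sketch below. Everything specific here is an assumption to be checked against AMD's install documentation: the ROCm version (5.4.3, taken from the docs link earlier in the thread), the Ubuntu codename the Mint release is based on (`jammy` is a guess), the `rocm/apt/<version>` repository layout, the keyring path, and the package name `rocm-opencl-runtime`. The helper names are hypothetical, and the script only prints the steps rather than running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the apt source line for AMD's ROCm repository.
# Arguments: ROCm version and the Ubuntu codename the distro is based on.
rocm_apt_line() {
    local version="$1" codename="$2"
    printf 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/%s %s main\n' \
        "$version" "$codename"
}

# Print (do not run) the steps, so they can be reviewed before being
# executed with root privileges.
print_rocm_apt_steps() {
    local version="$1" codename="$2"
    echo "wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null"
    echo "echo '$(rocm_apt_line "$version" "$codename")' | sudo tee /etc/apt/sources.list.d/rocm.list"
    echo "sudo apt update"
    # Assumed package name for the OpenCL runtime only, instead of rocm-*.
    echo "sudo apt install rocm-opencl-runtime"
}
```

For example, `print_rocm_apt_steps 5.4.3 jammy` prints four commands. Installing only the OpenCL runtime rather than every ROCm package may also sidestep some of the package conflicts, though that is untested here.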


Copyright © 2024 Arizona State University