Posts by Eric Driver

41) Message boards : News : Support for Intel GPUs (Message 3557)
Posted 6 Aug 2023 by Eric Driver
Post:
I have seen the "Out of Resources" error when there is not enough RAM. Your card appears to have enough RAM, but is it possible something else could be using up the memory? Internet browsers are notorious for using huge amounts of GPU RAM.

The system is dedicated to running BOINC (various projects) on all four cores of the Pentium J5005 at the same time.
But in general I think the UHD 605 should be capable of running the NumberFields OpenCL app. I am basing that on the following list of hosts that have successfully returned results:
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 600|1|3021MB||300][opencl_gpu|Intel(R) UHD Graphics 600|1|3776MB|102]
[BOINC|7.20.2][INTEL|Intel(R) Iris(R) Xe Graphics|1|6427MB||300][vbox|6.1.34|0|1]
[BOINC|7.20.5][INTEL|Intel(R) Iris(R) Plus Graphics 655 [0x3ea5]|1|25565MB||300]
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 630|1|6488MB||201][vbox|6.1.34|1|1]
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 630|1|6415MB||300][vbox|7.0.10|0|1]
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 620|1|1590MB||201]
[BOINC|7.16.20][INTEL|Intel(R) HD Graphics 530|1|6507MB||300]
[BOINC|7.20.5][INTEL|Intel(R) UHD Graphics [0x4e55]|1|3276MB||300]

I couldn't find your host in the database. Is it Windows or Linux?

Manjaro Linux

If it is truly a resource problem, you could try adding the following line to the gpuLookupTable file in your projects directory:
UHD Graphics 605     |     256     |      8

If it still doesn't work, then the resource problem is probably related to the size of the code that the OpenCL compiler generates, which would point to either another driver problem or the app simply being too complex for your card.
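
For anyone editing the gpuLookupTable file by hand, here is a minimal Python sketch that checks whether a line has the expected shape before you restart the app. It assumes the three pipe-separated columns are the GPU name, numBlocks, and threadsPerBlock; that column meaning is a guess based on the default values the app prints in its log, not something confirmed here. The names LookupEntry and parse_lookup_line are made up for illustration.

```python
from typing import NamedTuple


class LookupEntry(NamedTuple):
    gpu_name: str
    num_blocks: int          # ASSUMPTION: second column is numBlocks
    threads_per_block: int   # ASSUMPTION: third column is threadsPerBlock


def parse_lookup_line(line: str) -> LookupEntry:
    """Split one pipe-separated line into a name and two integers."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 columns, got {len(parts)}: {line!r}")
    name, blocks, threads = parts
    return LookupEntry(name, int(blocks), int(threads))


entry = parse_lookup_line("UHD Graphics 605     |     256     |      8")
print(entry)
# Under the column assumption above, this is the number of work items launched.
print(entry.num_blocks * entry.threads_per_block)
```

If the column assumption holds, the suggested line launches far fewer work items than the defaults (1024 blocks of 32 threads), which is the point of trying it on a small card.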
42) Message boards : News : Support for Intel GPUs (Message 3555)
Posted 5 Aug 2023 by Eric Driver
Post:
Eric

I am experimenting with an Intel GPU on Computer 2807553. It is an INTEL Intel(R) HD Graphics 4600 (1629MB) OpenCL: 1.2

Running on Operating System Microsoft Windows 10 Professional x64 Edition, (10.00.19045.00)
BOINC version 7.22.2

It is running a whole lot slower than a non-GPU task (Get Decic Fields v4.00 (default) windows_x86_64).

Is the GPU application processing a bigger chunk of data, or is the GPU not efficient enough to run the GPU task types?

Thanks
Bill F


The GPU and CPU tasks process the same amount of data, so my guess is the GPU is not efficient enough.
43) Message boards : News : Support for Intel GPUs (Message 3553)
Posted 4 Aug 2023 by Eric Driver
Post:
Updated the driver, error occurs later now
<stderr_txt>
<core_client_version>7.22.1</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
GPU Summary String = [INTEL|Intel(R)UHDGraphics605|1|3276MB||300].
Loading GPU lookup table from file.
GPU was not found in the lookup table. Using default values:
numBlocks = 1024.
threadsPerBlock = 32.
polyBufferSize = 32768.
Successfully Built Program.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantInit.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB8.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantMpInit.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB7DegA9.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB7DegA8.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB6DegA9.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB6DegA8.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB6DegA7.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB5.
Successfully Created Stage 1 Kernel: pdtKernelSubResultantDegB4.

Successfully Created Stage 2 Kernel: pdtKernelDiv2.
Successfully Created Stage 2 Kernel: pdtKernelDiv5.
Successfully Created Stage 2 Kernel: pdtKernelDivP.

Successfully Created Stage 3 Kernel.

Successfully Created Polynomial Memory Buffer.
Successfully Created Output Flag Memory Buffer.
Successfully Created Discriminant Data Buffer.
Successfully Created PolyA Data Buffer.
Successfully Created PolyB Data Buffer.
Successfully Created DegA Data Buffer.
Successfully Created DegB Data Buffer.
Successfully Created G Data Buffer.
Successfully Created H Data Buffer.
Successfully Created mpA Data Buffer.
Successfully Created mpB Data Buffer.

OpenCL initialization was successful.
CHECKPOINT_FILE = wu_sf6_DS-12x9_Grp49971of128000_checkpoint.
Checkpoint Flag = 0.
Reading file ../../projects/numberfields.asu.edu_NumberFields/sf6_DS-12x9_Grp49971of128000.dat
K = x^2 - 10
S = [2, 5]
Disc Bound = 800000000000
Skip = (P^3)*(Q^5)
Num Congruences = 10
SCALE = 1.000000
|dK| = 40
Signature = [2,0]
Opening output file ../../projects/numberfields.asu.edu_NumberFields/wu_sf6_DS-12x9_Grp49971of128000_0_r611689789_0
Now starting the targeted Martinet search:
Num Cvecs = 10.
Doing Cvec 1.
File polDiscTest_gpuOpenCL.cpp, Line 201: Error: Failed to Enqueue Kernel pdtKernelSubResultantMpInit. clEnqueueNDRangeKernel returned CL_OUT_OF_RESOURCES
polDisc Test had an error. Aborting.
</stderr_txt>

Techpowerup has the UHD 605 as
The UHD Graphics 605 Mobile is a mobile integrated graphics solution by Intel, launched on December 11th, 2017. Built on the 14 nm process, and based on the Gemini Lake GT1.5 graphics processor, the device supports DirectX 12. This ensures that all modern games will run on UHD Graphics 605 Mobile. It features 144 shading units, 18 texture mapping units, and 3 ROPs. The GPU is operating at a frequency of 200 MHz, which can be boosted up to 750 MHz.
Its power draw is rated at 5 W maximum.

I have seen the "Out of Resources" error when there is not enough RAM. Your card appears to have enough RAM, but is it possible something else could be using up the memory? Internet browsers are notorious for using huge amounts of GPU RAM.

But in general I think the UHD 605 should be capable of running the NumberFields OpenCL app. I am basing that on the following list of hosts that have successfully returned results:
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 600|1|3021MB||300][opencl_gpu|Intel(R) UHD Graphics 600|1|3776MB|102]
[BOINC|7.20.2][INTEL|Intel(R) Iris(R) Xe Graphics|1|6427MB||300][vbox|6.1.34|0|1]
[BOINC|7.20.5][INTEL|Intel(R) Iris(R) Plus Graphics 655 [0x3ea5]|1|25565MB||300]
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 630|1|6488MB||201][vbox|6.1.34|1|1]
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 630|1|6415MB||300][vbox|7.0.10|0|1]
[BOINC|7.22.2][INTEL|Intel(R) UHD Graphics 620|1|1590MB||201]
[BOINC|7.16.20][INTEL|Intel(R) HD Graphics 530|1|6507MB||300]
[BOINC|7.20.5][INTEL|Intel(R) UHD Graphics [0x4e55]|1|3276MB||300]
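
If you want to pull the card name and memory out of host strings like the ones above, a rough Python sketch follows. The field positions (name in position 1, memory in position 3 of an INTEL group) are an assumption read off the examples, not a documented format, and the function names are hypothetical. Note that some names contain their own brackets, such as [0x3ea5], so the sketch tracks bracket depth instead of splitting on "][".

```python
def split_groups(summary: str) -> list[str]:
    """Return the top-level [...] groups, tolerating nested brackets in names."""
    groups, depth, start = [], 0, 0
    for i, ch in enumerate(summary):
        if ch == "[":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                groups.append(summary[start:i])
    return groups


def intel_gpus(summary: str) -> list[tuple[str, int]]:
    """Extract (device name, memory in MB) for each INTEL group."""
    result = []
    for group in split_groups(summary):
        fields = group.split("|")
        if fields[0] == "INTEL" and len(fields) >= 4:
            # ASSUMPTION: fields[1] is the name, fields[3] is memory like "3021MB".
            result.append((fields[1], int(fields[3].removesuffix("MB"))))
    return result


line = "[BOINC|7.20.5][INTEL|Intel(R) Iris(R) Plus Graphics 655 [0x3ea5]|1|25565MB||300]"
print(intel_gpus(line))
```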

I couldn't find your host in the database. Is it Windows or Linux?
44) Message boards : News : Support for Intel GPUs (Message 3550)
Posted 30 Jul 2023 by Eric Driver
Post:
GPU Summary String = [INTEL|Intel(R)UHDGraphics605[0x3184]|2|3276MB||300].
Loading GPU lookup table from file.
GPU was not found in the lookup table. Using default values:
numBlocks = 1024.
threadsPerBlock = 32.
polyBufferSize = 32768.
Error: clBuildProgram() returned CL_INVALID_BINARY
Build Log:
error : unresolved external symbol __const.pdtKernelStg3.resMod64 at offset 48 in instructions segment #14 (aka kernel : pdtKernelStg3)
error : unresolved external symbol __const.pdtKernelStg3.resMod63 at offset 64 in instructions segment #14 (aka kernel : pdtKernelStg3)
error : unresolved external symbol __const.pdtKernelStg3.resMod65 at offset 80 in instructions segment #14 (aka kernel : pdtKernelStg3)

Error: Failed to initialize OpenCL.

*->*Intel(R)UHDGraphics605=IGP of Intel Pentium J5005*<-*


That is an odd error since there is no binary and it should be building the code from source. My guess is that the OpenCL portion of the driver is buggy.
45) Message boards : News : Future of the Project (Message 3547)
Posted 16 Jul 2023 by Eric Driver
Post:
At Gerasim, SerVal has integrated the apps for Nvidia GPUs (on Windows), AMD GPUs (on Windows), and Linux CPUs. These apps are now processing WUs for subfield 3.

On Numberfields, I am sprinkling in some WUs for sf6, in order to get some updated timing information. This will help in creating the new batches for sf6 to come later in the year.

On Gerasim, is it possible to include a GPU application for Linux? Thank you very much.


There doesn't seem to be much interest in Linux in general, and it's a lot of work to create and utilize the apps at Gerasim, so my first instinct would be to say no. Sorry!
46) Message boards : Number crunching : Linux and AMD (Message 3543)
Posted 8 Jul 2023 by Eric Driver
Post:
Hey Eric,

I've tried your advice. I had to install dnf first, as it is not part of the distribution.
When trying dnf config-manager, I got the error message "No such command: config-manager. Please use /usr/bin/dnf --help."
The help output does not even list a config command. Do I need an add-on first?


Now that you have dnf installed, try the following to give you a list of what's available for install:
dnf list available | grep -i dnf

Looking at what I have installed, you might need the following packages:
"dnf-plugins-core.noarch", "dnf-data.noarch", and "libdnf.x86_64"

Remember, dnf comes with the Fedora distribution, so there's no guarantee this will work with the Mint distribution (I am not familiar with Mint). Instead of following my Fedora instructions verbatim, it might be best to try the equivalent procedure on Mint. Basically, the idea is simple: enable the ROCm repo and then install the ROCm packages from it.
47) Message boards : Number crunching : Linux + Intel ARC = GPU HANG GetDecics_4.02 (Message 3542)
Posted 8 Jul 2023 by Eric Driver
Post:
I did a quick search of the message boards. Found the following which is good to know:
The following cards have returned successful results: HD500, HD515, Gen9, UHD620, UHD630.

The following links are also interesting. The 2nd shows that there have been successful results from the Arc A750 GPUs (I believe on Windows).
https://numberfields.asu.edu/NumberFields/forum_thread.php?id=579
https://numberfields.asu.edu/NumberFields/forum_thread.php?id=501&postid=3421#3421
48) Message boards : Number crunching : Linux + Intel ARC = GPU HANG GetDecics_4.02 (Message 3540)
Posted 8 Jul 2023 by Eric Driver
Post:
What do you make of the fact that PrimeGrid runs fine on the same hardware & drivers? Unrelated?


I think that's because the PrimeGrid code is much simpler than the NumberFields code. The NumberFields algorithm is very complex, including multiple nested loops with early exits when certain conditions are met. The code adheres to the OpenCL spec, but sometimes the vendor's OpenCL implementation is flawed. I saw this a couple of years ago with the AMD OpenCL implementation - every new driver would crash, and always in a different way, sometimes during the build phase and sometimes during execution. AMD eventually got their act together, and their drivers now work most of the time, with only the occasional hiccup. I think this is what's happening with the Intel drivers; however, there were some older Intel cards that worked last year when I first put the app out (if memory serves).

I mentioned trying Windows because the drivers are usually different and the vendors usually put more resources into testing/fixing the Windows drivers. If the app works on Windows then that would be a starting point for the Intel developers to diagnose the problem with the Linux version of the drivers.
49) Message boards : Number crunching : App_Config.xml Request (Message 3538)
Posted 8 Jul 2023 by Eric Driver
Post:
You can turn off CPU usage from the BOINC client. That is much easier than using an app_config file. Unless I misunderstood your question?
50) Message boards : Number crunching : Higher number of CVEC = deeper search into the number field? (Message 3537)
Posted 8 Jul 2023 by Eric Driver
Post:
If you mean the number of Cvecs in a WU, then no. A Cvec is a "congruence vector": the set of congruences that the polynomial coefficients must satisfy for the polynomial to represent a field with the desired ramification properties.

The number of Cvecs in a WU file depends on the total number of Cvecs and the estimated run time for the particular subsearch. For example, on the current search sf6_DS11x11, there were a total of 96,000,000 Cvecs and I estimated the total runtime to be about 1,500,000 hours. Since I aim for 1 hour work units, I created 1,500,000 files, each with 96/1.5 = 64 Cvecs.
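
The arithmetic in that example can be written out as a short Python sketch. The numbers come from the paragraph above; the function name plan_batch and the choice to round up when the division is not exact are assumptions made for illustration only.

```python
import math


def plan_batch(total_cvecs: int, est_total_hours: float,
               target_hours_per_wu: float = 1.0) -> tuple[int, int]:
    """Return (number of WU files, Cvecs per file) for a target WU length."""
    num_wus = math.ceil(est_total_hours / target_hours_per_wu)
    # ASSUMPTION: round up so that every Cvec lands in some file.
    cvecs_per_wu = math.ceil(total_cvecs / num_wus)
    return num_wus, cvecs_per_wu


# sf6_DS11x11: 96,000,000 Cvecs, about 1,500,000 hours, 1-hour work units.
num_wus, cvecs_per_wu = plan_batch(96_000_000, 1_500_000)
print(num_wus, cvecs_per_wu)  # 1500000 64
```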

Hopefully that makes some sense.
51) Message boards : Number crunching : Linux + Intel ARC = GPU HANG GetDecics_4.02 (Message 3534)
Posted 8 Jul 2023 by Eric Driver
Post:
It looks like it passed the OpenCL build phase, but then the GPU hung while processing. In the past, that was usually caused by not enough GPU RAM, an older card (not enough FLOPS), or a problem with the driver for the card. You have plenty of RAM and the A750 should be comparable to the Nvidia 3060, so in theory this card should work great. That would point to a bad driver. So the obvious question is: do you have the latest driver installed? If you dual boot into Windows, you could also try it there.

Does Intel have an equivalent to the nvidia-smi command (or the AMD radeontop command) for viewing utilization on the GPU? If so, use that to make sure it's not running out of RAM, and when it hangs, check if utilization goes to zero.

At one point in the past I had a bad driver for my AMD card and the GPU app would periodically hang. Radeontop showed me that utilization went to 0%, so it was acting like the GPU was turned off. From the client I would click "Suspend GPU" under Activity, and then click "Use GPU always" to turn it back on - this would cause the GPU to wake up and continue processing. It's a long shot, but maybe something similar is happening here.
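
To illustrate the kind of check described above (watch memory use, and see whether utilization sits at zero during a hang), here is a rough Python sketch built around nvidia-smi, since that is the tool named above. It only shows the idea on an Nvidia card; it does not answer what the Intel equivalent is. The helper names and the "three zero samples in a row" threshold are arbitrary assumptions.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU: utilization %, MB used, MB total.
QUERY = ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def read_sample() -> str:
    """Run nvidia-smi once and return the first GPU's CSV line."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return out.splitlines()[0]


def parse_sample(line: str) -> dict:
    """Turn a line like '97, 1510, 12288' into named integer fields."""
    util, used, total = (int(x) for x in line.split(","))
    return {"util_pct": util, "mem_used_mb": used, "mem_total_mb": total}


def looks_hung(samples: list, min_zero_samples: int = 3) -> bool:
    """True if the most recent samples all show 0% utilization."""
    recent = samples[-min_zero_samples:]
    return len(recent) == min_zero_samples and all(s["util_pct"] == 0 for s in recent)


# Canned lines stand in for repeated read_sample() calls taken while a task runs.
canned = ["97, 1510, 12288", "0, 1510, 12288", "0, 1510, 12288", "0, 1510, 12288"]
samples = [parse_sample(line) for line in canned]
print(looks_hung(samples))
```

In real use you would call read_sample() every few seconds while a task is running and also compare mem_used_mb against mem_total_mb to rule out the GPU running out of RAM.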
52) Message boards : Number crunching : Linux and AMD (Message 3531)
Posted 7 Jul 2023 by Eric Driver
Post:
Hey Alex -

I also have had all kinds of frustration with the AMD drivers, so you are not alone. The trick for me was building the ROCm libraries from scratch, which took the better part of a day. A year later, after a kernel upgrade, it stopped working so I got to do it all again.

Now for the good news. My latest kernel upgrade was several months ago. After the upgrade, I installed the AMD drivers using the ROCm repo, and it actually worked out of the box. So AMD might finally be getting their act together.

With that said, I use Fedora, so I'm not sure how this translates to Mint. In case it helps, I will give the process I used on Fedora. This is based on my command history from several months ago. I executed the following commands in a terminal window (as root):
dnf config-manager --add-repo http://repo.radeon.com/rocm/
dnf list available | grep -i rocm
dnf install rocm-*
53) Message boards : News : Subfield 3 Officially Complete (Message 3526)
Posted 6 Jul 2023 by Eric Driver
Post:
I am pleased to report that the search over ℚ(√2), also known affectionately as "subfield 3", is now officially 100% complete.

This search has been in the works, off and on, for several years. It goes without saying that this is a major computational achievement. We couldn't have done it without all the volunteers, so I thank you all!

The batch status tables have been updated accordingly with subfield 3 moved to the archives. The results of the search can be found here.


WOO HOO!!!

Is there a way in the future to add the subfield we are doing to the app name, so you have different apps for the batches of tasks? This would do two things: one, make it easier to see what kind of thing we are working towards, and two, recreate interest in the Gridcoin community by adding additional apps that people can then get their hours in. Currently WUProp badges end at 100k hours, and adding new apps would bring a lot of those people, who are also Gridcoin crunchers, to your project, since they already have their 100k hours in the current app.
Just a thought
mikey


Although possible in principle, I think it would be a pain in the butt and would be impractical. In essence, for each platform I would have to maintain multiple copies of the same executable but with different names. But I will think more about it.
54) Message boards : Number crunching : Linux and AMD (Message 3520)
Posted 6 Jul 2023 by Eric Driver
Post:
Clearly the OpenCL compiler failed to build the code, and since the compiler is part of the driver, that would point to a driver problem. I'm not very familiar with all the different versions of AMD GPUs, but this looks to be an integrated graphics processor. In the past, these GPUs were not as reliable as discrete GPUs (at least on this project), so that could also be part of the problem. Hopefully someone who has experience with AMD integrated graphics can chime in with an idea.
55) Message boards : News : Batch Plan (Message 3517)
Posted 5 Jul 2023 by Eric Driver
Post:
It appears that SF7 is (mostly!) complete, since I've only been getting the 11x10 & 11x11 SF6 workunits these past few days.

I assume once we have crunched 100% of SF6, that we might go back to fully finish off SF7 just for the sake of total certainty?


Yes. Sorry, I should have posted a note. I feel pretty confident that it is complete. While we proceed with subfield 6, Gerasim is hosting subfield 7 tasks.
56) Message boards : News : Subfield 3 Officially Complete (Message 3513)
Posted 27 Jun 2023 by Eric Driver
Post:
This is great news!! I'll be here until your system crashes. Hopefully, you'll be able/allowed to find a new benefactor. If you get a new ASU benefactor will you be allowed to receive donations?


No, the university never wanted to deal with donations. I can't remember why, maybe for legal reasons.
57) Message boards : News : Subfield 3 Officially Complete (Message 3509)
Posted 23 Jun 2023 by Eric Driver
Post:
Will Gerasim@home continue to contribute to the NumberFields search?


Good question. Yes, Gerasim will continue helping with subfield 7.
58) Message boards : News : Subfield 3 Officially Complete (Message 3506)
Posted 23 Jun 2023 by Eric Driver
Post:
I am pleased to report that the search over ℚ(√2), also known affectionately as "subfield 3", is now officially 100% complete.

This search has been in the works, off and on, for several years. It goes without saying that this is a major computational achievement. We couldn't have done it without all the volunteers, so I thank you all!

The batch status tables have been updated accordingly with subfield 3 moved to the archives. The results of the search can be found here.
59) Message boards : News : Batch Plan (Message 3505)
Posted 14 Jun 2023 by Eric Driver
Post:
It is interesting. Do you mean that you know how many fields are hidden in "Search 7 DS12x16"?


Not exactly. But I can measure the rate at which new fields come in, and at a certain point I can say with high probability that the search is complete. If you recall, I did this with subfield 3 - I predicted a while back that it was already complete, and now we are finishing that search, which will validate the prediction and allow us to say it is 100% complete. It's similar to how a newsroom will call the presidential election with only X% of the votes in; we know it's not 100% guaranteed, but we have high confidence in it.
60) Message boards : News : Batch Plan (Message 3503)
Posted 13 Jun 2023 by Eric Driver
Post:
Hello,

What is the sequence of next searches?

Unless something changed, per news it should be finishing off SF 7 and then SF6.


Welcome!

Can you help us describe the subvariants of SF7 that need to be completed before we move on to SF6?


16x12 is the last search for sf7. Because this search is so large, it is further subdivided over the 25 cvecs for prime 5. The WU name contains a substring 16x12-X; the X is the cvec index. The current plan is to do cvec #1 and see how many new fields are found, which will dictate how many more cvecs to do before moving on to sf6. Once I have high confidence that all fields have been found, I'd rather move on to sf6 where many more fields are waiting to be discovered. Hope that explains it...
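
If you want to see which cvec index a task on your machine belongs to, something like the following Python sketch would pull it out of the WU name. Only the "16x12-X" substring is taken from the explanation above; the full WU names in the example are hypothetical (modeled loosely on other task names), and cvec_index is a made-up helper.

```python
import re

# Matches the "16x12-X" substring and captures X, the cvec index.
CVEC_INDEX = re.compile(r"16x12-(\d+)")


def cvec_index(wu_name: str):
    """Return the cvec index embedded in a 16x12 WU name, or None if absent."""
    match = CVEC_INDEX.search(wu_name)
    return int(match.group(1)) if match else None


# HYPOTHETICAL names, for illustration only.
print(cvec_index("wu_sf7_DS-16x12-1_Grp5of1000"))
print(cvec_index("wu_sf6_DS-12x9_Grp49971of128000"))
```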



