Message boards :
News :
Support for Intel GPUs
Send message Joined: 7 Sep 13 Posts: 2 Credit: 18,955,810 RAC: 34,773 |
First wu finished after 3h36m: https://numberfields.asu.edu/NumberFields/result.php?resultid=124655746 |
Send message Joined: 4 Dec 19 Posts: 10 Credit: 14,695,896 RAC: 0 |
Aborted both of mine after 9+ hours with no sign of any work being done or any actual GPU usage |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
If this helps: it ran for 2.5 hours before I aborted all WUs. If I understand this correctly, it hung while trying to compile the OpenCL code, and the debug dump occurred as a result of aborting it. |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> First wu finished after 3h36m

Well, at least it finished. But the run time suggests either a bad build of the OpenCL code, or that the GPU is not powerful enough for this app. This is the same thing we saw with older Nvidia/AMD cards. Contention may also be playing a role - if other things were running on the GPU at the same time, or all the CPU cores were maxed out (recall, the app needs a small portion of a CPU core). |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> Aborted both of mine after 9+ hours with no sign of any work being done or any actual GPU usage

Looking at the stderr.txt, it hung while trying to build the OpenCL code, the same problem others are having. |
Send message Joined: 28 Oct 11 Posts: 180 Credit: 247,148,724 RAC: 150,421 |
> Contention may also be playing a role - if other things were running on the GPU at the same time or all the CPU cores were maxed out (recall, the app needs a small portion of a cpu core).

I've run intel_gpu tasks under BOINC for a number of projects - originally SETI@Home, and currently both Einstein and WCG/Covid. All my iGPU experience is under Windows.

Contention certainly needs to be considered carefully. All GPU applications require some degree of CPU support while running, but OpenCL running under Windows on an iGPU is an outlier: it doesn't need much CPU time, but it wants CPU access fast. That CPU support can be made available in two different ways: either by ensuring that a CPU core is free from a continuously running BOINC-style workload (which is wasteful), or - as first recognised at Einstein - by setting the CPU part of the iGPU application to run at real-time priority. That latter suggestion clearly has to be handled with care by experienced users, but it works for me, and it perhaps gives some insight into what's going on under the hood.

Unfortunately, the iGPU is the ugly sister of the GPGPU programming world, and I don't know of anyone who has really got to grips with programming it properly under BOINC - too little time, too much to do. |
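[Editor's note: the first of the two options above - keeping a CPU core free for the iGPU task - can be approximated per-project with BOINC's standard app_config.xml mechanism. A minimal sketch, assuming the application name is GetDecics (the name visible in the task binary's path; check your client's event log for the exact name). Setting cpu_usage to 1.0 tells the client to budget a full core for each GPU task, so it schedules one fewer CPU task while the iGPU task runs:]

```xml
<!-- app_config.xml, placed in projects/numberfields.asu.edu_NumberFields/ -->
<app_config>
  <app>
    <name>GetDecics</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

[This only changes the client's scheduling budget, not the task's actual priority; the real-time-priority approach described above would need an external tool.]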
Send message Joined: 6 Feb 21 Posts: 4 Credit: 3,337,063 RAC: 0 |
My HD620 finished 3 WUs last night while the tablet was not being used for anything. The CPU cores were also free. The times to finish varied between 2:16 and 3:45 hours. |
Send message Joined: 15 Aug 12 Posts: 12 Credit: 6,341,704 RAC: 12,442 |
> I just added support for Intel GPUs. There are 2 versions - 64 bit linux and 64 bit windows, both require openCL 1.1 or higher.

My WUs error out on the glibc version (2.27) of Linux Mint 19.3 (the OpenCL used is 2.0):

<stderr_txt>
../../projects/numberfields.asu.edu_NumberFields/GetDecics_4.02_x86_64-pc-linux-gnu__opencl_intel: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by ../../projects/numberfields.asu.edu_NumberFields/GetDecics_4.02_x86_64-pc-linux-gnu__opencl_intel)
../../projects/numberfields.asu.edu_NumberFields/GetDecics_4.02_x86_64-pc-linux-gnu__opencl_intel: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by ../../projects/numberfields.asu.edu_NumberFields/GetDecics_4.02_x86_64-pc-linux-gnu__opencl_intel)
../../projects/numberfields.asu.edu_NumberFields/GetDecics_4.02_x86_64-pc-linux-gnu__opencl_intel: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ../../projects/numberfields.asu.edu_NumberFields/GetDecics_4.02_x86_64-pc-linux-gnu__opencl_intel)
</stderr_txt> |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
Update on the glibc problem (only applies to Linux users):

The root cause of the problem is that I upgraded my build system in the last year and now have a newer version of glibc. Anyone using a version of glibc older than 2.33 will have this problem. The problem seems to be with a new symbol "fstat" which was not even in the old binaries (according to objdump).

I think the solution is for me to downgrade my system to an earlier version. This will break other things for me, so I am hesitant to do this in the near term.

Since this app is a beta version, and most runtimes take over 2 hours on the UHD 620/630 cards and hang on lesser cards, it seems like this app will not get much use until the newer Intel cards come out in 2022. Hopefully by then I can improve my build system, so we can stop having these compatibility issues. |
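[Editor's note: Linux users can check whether they are affected before fetching work. A minimal sketch using glibc's gnu_get_libc_version() extension (the same version string that `ldd --version` prints); the 2.33 threshold is the one quoted above, and the helper name is ours, not the project's:]

```c
#include <stdio.h>
#include <gnu/libc-version.h>  /* glibc-specific extension header */

/* Returns 1 if the running glibc is at least major.minor, else 0. */
int glibc_at_least(int major, int minor) {
    int maj = 0, min = 0;
    /* Version string looks like "2.31" (possibly with a suffix). */
    sscanf(gnu_get_libc_version(), "%d.%d", &maj, &min);
    return (maj > major) || (maj == major && min >= minor);
}
```

[If glibc_at_least(2, 33) returns 0, the current Linux binary will fail to load with the `GLIBC_2.33' not found errors shown earlier in the thread.]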
Send message Joined: 21 Mar 17 Posts: 1 Credit: 166,200 RAC: 0 |
All my WUs finished with "Error during calculations". |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> all my wu finished with miscalculations Error during calculations

According to the stderr, it failed to build the OpenCL code. At least it exited gracefully, instead of the OpenCL compiler hanging indefinitely, which sometimes happens for others. |
Send message Joined: 27 Dec 19 Posts: 6 Credit: 623,103 RAC: 0 |
It didn't find my nvidia gpu:

GPU Summary String = [CUDA|NVIDIAGeForceGTX1650|1|4095MB|49676|300].
Loading GPU lookup table from file.
GPU was not found in the lookup table.
Using default values:
numBlocks = 1024.
threadsPerBlock = 32.
polyBufferSize = 32768. |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> It didn't find my nvidia gpu

That's because it's not in the lookup table. But no worries, performance should still be relatively good with the default values. |
Send message Joined: 27 Sep 21 Posts: 11 Credit: 2,318,536 RAC: 2,057 |
> It didn't find my nvidia gpu

The GeForce GTX 1650 is a pretty mid-range GPU that I believe has been around for a while. Are the project developers having to build an entire lookup table, or are there common industry tables available? |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> It didn't find my nvidia gpu

The settings in the lookup table are unique to this project, so yes, it would have to be built up manually. The default settings should be pretty good for most cards. The idea was that, over time, entries could be added to the lookup table to improve performance on specific cards. |
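[Editor's note: the lookup-with-fallback behaviour visible in the stderr above can be sketched as follows. This is not the project's actual code: the table entries are invented for illustration, and only the default values (numBlocks = 1024, threadsPerBlock = 32, polyBufferSize = 32768) come from the task output quoted earlier in the thread:]

```c
#include <string.h>

/* Per-card launch parameters, keyed by the name from the GPU summary string. */
typedef struct {
    const char *gpu_name;
    int num_blocks;
    int threads_per_block;
    int poly_buffer_size;
} GpuParams;

/* Illustrative table entries only - not the project's real data. */
static const GpuParams kTable[] = {
    { "ExampleCardA", 2048, 64, 131072 },
    { "ExampleCardB",  512, 32,  16384 },
};

/* Return the tuned parameters for a card, or the defaults if it is
   not in the table (the "GPU was not found in the lookup table" case). */
GpuParams lookup_gpu_params(const char *name) {
    const GpuParams defaults = { "default", 1024, 32, 32768 };
    for (size_t i = 0; i < sizeof kTable / sizeof kTable[0]; ++i) {
        if (strcmp(kTable[i].gpu_name, name) == 0)
            return kTable[i];
    }
    return defaults;
}
```

[An unknown card such as the GTX 1650 above simply falls through to the defaults, which is why the task still runs with reasonable performance.]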
Send message Joined: 28 Oct 11 Posts: 180 Credit: 247,148,724 RAC: 150,421 |
I'm slightly surprised that you've felt the need to research and supply additional data about GPU compute devices. Shouldn't that be made available to you and all other application developers, either by the BOINC platform, or from the manufacturer's driver tools?

I've looked into what is made available to you by BOINC / OpenCL for my iGPUs:

name                         Intel(R) HD Graphics 4600   Intel(R) HD Graphics 4600
vendor                       Intel(R) Corporation        Intel(R) Corporation
vendor_id                    32902                       32902
available                    1                           1
half_fp_config               0                           0
single_fp_config             158                         158
double_fp_config             0                           0
endian_little                1                           1
execution_capabilities       1                           1
global_mem_size              1360632218                  1360632218
local_mem_size               65536                       65536
max_clock_frequency          400                         400
max_compute_units            20                          20
nv_compute_capability_major  0                           0
nv_compute_capability_minor  0                           0
amd_simd_per_compute_unit    0                           0
amd_simd_width               0                           0
amd_simd_instruction_width   0                           0
opencl_platform_version      OpenCL 1.2                  OpenCL 1.2
opencl_device_version        OpenCL 1.2                  OpenCL 1.2
opencl_driver_version        10.18.14.5162               10.18.14.5162
device_num                   0
peak_flops                   64000000000
opencl_available_ram         1360632218
opencl_device_index          0
warn_bad_cuda                0

That's a pretty eclectic list. The first column is taken from an internal BOINC file called 'coproc_info.xml', and the second from the sched_request_...xml file sent by our clients to your servers. I don't know why the nv and amd fields are present in the iGPU report - presumably it was simpler to use a common data structure across all vendors.

Anyway, numBlocks, threadsPerBlock and polyBufferSize are clearly all missing. How important are they, where can we get them from, and should we ask the BOINC developers to do the heavy lifting? |
Send message Joined: 15 Aug 12 Posts: 12 Credit: 6,341,704 RAC: 12,442 |
> Update on the glibc problem (only applies to linux users)

Forget (most) Linux users, then: the repositories of the most commonly used distros/versions go no further than glibc 2.31. |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> I'm slightly surprised that you've felt the need to research and supply additional data about GPU compute devices. Shouldn't that be made available to you and all other application developers, either by the BOINC platform, or from the manufacturer's driver tools?

I think there is some confusion here on the purpose of the lookup table. Let me attempt to clear this up.

The parameters "numBlocks" and "threadsPerBlock" are CUDA terminology and describe how to break up the data on the GPU. There is different but equivalent terminology for OpenCL (I forget the exact names, but it's not important). The optimal parameters are somewhat dependent on the algorithm itself, so the only way to determine them is to run several WUs at different settings and see which is best. The idea was that willing volunteers could do this analysis on their specific card and then report back the optimal settings to be added to the lookup table, with the hope that anybody with the same card would see the same improvement.

For those interested, the testing process is described here. This can be a tedious process, but several volunteers have helped, and their work has led to several of the entries in the table. Note that the improvement over the default settings is usually less than 10%, so don't expect to see huge performance gains. |
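[Editor's note: for readers unfamiliar with the terminology, a CUDA "block" corresponds to an OpenCL "work-group" and a CUDA "thread" to an OpenCL "work-item", so the launch geometry translates directly into the global/local work sizes passed to clEnqueueNDRangeKernel. A minimal sketch of that mapping, using the parameter names from the stderr above:]

```c
#include <stddef.h>

/* One-dimensional OpenCL launch geometry. */
typedef struct {
    size_t global_work_size;  /* total work-items */
    size_t local_work_size;   /* work-items per work-group */
} NdRange;

/* CUDA: kernel<<<numBlocks, threadsPerBlock>>>(...)
   OpenCL: clEnqueueNDRangeKernel(..., &global_work_size, &local_work_size, ...) */
NdRange cuda_to_opencl(size_t num_blocks, size_t threads_per_block) {
    NdRange r;
    r.local_work_size  = threads_per_block;
    r.global_work_size = num_blocks * threads_per_block;
    return r;
}
```

[With the defaults quoted earlier (1024 blocks of 32 threads) this gives 32768 work-items per launch, which matches the default polyBufferSize in the stderr output.]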
Send message Joined: 28 Oct 11 Posts: 180 Credit: 247,148,724 RAC: 150,421 |
My UHD 620 laptop survived the transition to Windows 11 (and preserved the BOINC v7.16.20 installation intact, somewhat to my surprise) - details at host 1232275. I've put through task 125207453 as a test: it returned a valid result in just under two hours, so there's another data point.

I've commented with concern at other projects that their invocation of the -cl-mad-enable OpenCL compiler flag has led to validation errors, especially on this machine; but I would guess that the fused multiply-add opcode is more likely to cause problems with floating-point arithmetic than here. Still, I mention it just in case...

This machine is an ultraportable, and after an earlier Windows 10 update the fan only operates at zero or full speed: that doesn't sound good. I'll keep it in reserve and available for testing, but I won't be running it full time. |
Send message Joined: 8 Jul 11 Posts: 1342 Credit: 513,010,516 RAC: 581,321 |
> My UHD 620 laptop survived the transition to Windows 11 (and preserved the BOINC v7.16.20 installation intact, somewhat to my surprise) - details at host 1232275.

Yes, we only use integer ops here. The good news is the results are correct. This tells me the build process is working, and the Intel GPUs are capable of compiling the OpenCL code and running it successfully. The only thing I don't like is the run times, but I think this is because the Intel GPUs are not powerful enough for this app. I look forward to seeing how well the new Intel Arc GPUs fare after they debut next year. For now I will leave this as a beta app.

The following cards have returned successful results: HD500, HD515, Gen9, UHD620, UHD630. |