Older GPUs not working

Author	Message
JStateson Joined: 13 Feb 18 Posts: 3 Credit: 6,711,778 RAC: 27,112
Message 3472 - Posted: 1 Apr 2023, 10:36:12 UTC - Last modified: 1 Apr 2023, 10:47:22 UTC

I have seven s9000 and s9050 type GPUs. These are OpenCL 1.2 and run extremely well in Milkyway but fairly slow, though usable, in Einstein. Their GPU chip is the same as the HD-7950 or 7970.

Unaccountably, they consistently fail here in NumberFields. GPU-z shows 0% utilization the entire time the app is running. After about 4 hours they generate the following error:

    "C:\Users\JSTATE~1\AppData\Local\Temp\OCL27052T3.cl", line 1941: warning: shift count is too large
    if( (pow2 & 0xFFFFFFFF)==0 ) { d = 32; pow2 >>= 32; }
    ^

My s9100 and s9150 are OpenCL 2.1 and run fine on this project. Their GPU chip is different.

Any possibility of getting those older boards to work? I enabled the "DEBUG" and "DBG_THREAD" settings in the .h and .cl files but nothing showed up. The code probably did not get far enough to print anything, but I am just guessing.

Eric Driver Project administrator Project developer Project tester Project scientist Joined: 8 Jul 11 Posts: 1388 Credit: 691,490,842 RAC: 827,968
Message 3473 - Posted: 1 Apr 2023, 18:25:48 UTC - in response to Message 3472.

That is very strange. The code that's referenced should only be reached under CUDA (the multi-precision words are 64 bits in CUDA and only 32 bits in OpenCL, hence the error about the shift being too large). The OpenCL and CUDA apps use much of the same code, and the differences are handled via preprocessor directives.

Here is the best theory I have for what is happening: You have a system environment variable named "CUDA", so when your OpenCL compiler reaches a point in the code with the "#ifdef CUDA" directive, it ends up setting the multi-precision word larger than it should, and will also set other parameters incorrectly.

If this is what's happening, I would first check your system variables for "CUDA"; otherwise it could be the OpenCL implementation on your system that is setting the CUDA directive. On Windows, I believe the OpenCL implementation is part of a system dll file, so I am not sure how to handle that. Do you know how to get the exact command line that your OpenCL compiler is using?
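To make the "#ifdef CUDA" mechanism described above concrete, here is a minimal sketch in plain C. The names mp_word, WORD_BITS and selected_word_bits are hypothetical and are not taken from the project's headers; the sketch only illustrates how a single macro can switch the multi-precision word between 32 and 64 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical names: mp_word and WORD_BITS are NOT from the project source. */
#ifdef CUDA
typedef uint64_t mp_word;   /* 64-bit words on the CUDA path */
#define WORD_BITS 64
#else
typedef uint32_t mp_word;   /* 32-bit words on the OpenCL path */
#define WORD_BITS 32
#endif

/* Report the word width the preprocessor actually selected,
   measured from the type rather than trusted from the macro. */
int selected_word_bits(void)
{
    int bits = (int)(sizeof(mp_word) * CHAR_BIT);
    assert(bits == WORD_BITS);
    return bits;
}
```

Built without any extra flags this reports 32; built with -DCUDA it reports 64. For OpenCL kernels, a macro normally reaches the compiler through a #define in the source or a "-D name" entry in the build options string passed to clBuildProgram. Whether an environment variable can end up there on a particular driver is not something this thread establishes.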

JStateson
Message 3474 - Posted: 3 Apr 2023, 1:09:43 UTC - in response to Message 3473. Last modified: 3 Apr 2023, 1:21:02 UTC

I spent some time looking at this, as solving puzzles is one of my bad habits, plus I am retired.

Looking at the .h and .cl files, I see they are all functionally the same, e.g.:

    gpuMultiPrecAMD_v402.h
    pdtKernelAMD_v402.cl
    pdtKernel_v402.cl
    gpuMultiPrec_v402.h

The app copies them down to a boinc slot and drops the "v402". The one named "amd" is no different from the nvidia one except for the name (and the include name).

The following are the cpu, intel, nvidia and amd executables. It seems there is no difference between the nvidia and amd ones, so the OpenCL library must be what differentiates them.

    02/12/2023 09:56 AM 22,536,448 GetDecics_4.00_windows_x86_64
    03/31/2023 02:32 PM 22,719,607 GetDecics_4.02_windows_x86_64__opencl_intel
    03/31/2023 02:32 PM 22,575,516 GetDecics_4.02_windows_x86_64__opencl_nvidia
    02/12/2023 09:55 AM 22,575,516 GetDecics_4.02_windows_x86_64__opencl_amd

I did find (I assume) the command line options in the sched_reply xml, but they were all empty:

    <file_ref>
    <file_name>sf7_DS-16x11_Grp4834097of8000000.dat</file_name>
    <open_name>in</open_name>
    </file_ref>
    <command_line>
    </command_line>
    </workunit>

I tried running in standalone mode by removing the soft links and executing the AMD app in a folder with all the files it needed (that I could guess at):

    04/02/2023 07:08 PM <DIR> .
    04/02/2023 07:08 PM <DIR> ..
    04/02/2023 06:13 PM 349 gpuLookupTable.txt
    04/02/2023 06:13 PM 34,028 gpuMultiPrecAMD.h
    04/02/2023 06:13 PM 1,061 in
    04/02/2023 06:31 PM 8,642 init_data.xml
    04/02/2023 06:13 PM 728 mp_int.h
    04/02/2023 06:36 PM 0 out
    04/02/2023 06:31 PM 124 out.1
    04/02/2023 06:13 PM 76,649 pdtKernelAMD.cl
    02/12/2023 09:55 AM 22,536,448 test.exe
    02/12/2023 09:55 AM 22,575,516 test2.exe
    10 File(s) 45,233,545 bytes
    2 Dir(s) 1,820,312,457,216 bytes free

    C:\Users\jstateson\Downloads\numbers_h110\test>test2

When I ran "test.exe", the cpu app ran and seemed to work, as the iteration started. Unfortunately "test2.exe", the amd app, did nothing whatsoever. GPU-z showed no load on the GPU. The contents of stderr are:

    C:\Users\jstateson\Downloads\numbers_h110\test>type stderr.txt
    19:08:37 (40056): Can't set up shared mem: -1. Will run in standalone mode.
    GPU Summary String = [CAL|AMDFireProS9050|3|12288MB||102].
    Loading GPU lookup table from file.
    GPU was not found in the lookup table. Using default values:
    numBlocks = 1024. threadsPerBlock = 32. polyBufferSize = 32768.

Anyway, this is as far as I got. I never figured out how the xml file got created.

If I run the app without the xml file, I get an error and the app closes:

    C:\Users\jstateson\Downloads\numbers_h110\test>test2
    C:\Users\jstateson\Downloads\numbers_h110\test>type stderr.txt
    20:04:42 (66488): Can't open init data file - running in standalone mode
    GPU Summary String = .
    Loading GPU lookup table from file.
    GPU was not found in the lookup table. Using default values:
    numBlocks = 1024. threadsPerBlock = 32. polyBufferSize = 32768.
    20:04:42 (66488): Can't open init data file - running in standalone mode
    Error: Failed to obtain OpenCL device id.
    Error: Failed to initialize OpenCL.

I was never able to duplicate that 31-bit error. If you have any ideas, let me know.

[edit] The AMD app runs fine on my s9100, s9150, VII and MI25 amd boards but fails on the s9000 and s9050. The s90x0 are OpenCL platform 2.1 and device 1.2, whereas the s91x0 are 2.1 and 2.0. My s9000 and s9050 work fine under OpenCL for Einstein and Milkyway under Windows, and I have used them under Linux before.
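The "GPU Summary String" in the stderr output above can be picked apart mechanically. The sketch below is plain C and is not the project's parser; the helper names are hypothetical. It splits a string such as [CAL|AMDFireProS9050|3|12288MB||102] on the pipe characters while keeping empty fields. It also assumes, without confirmation from the thread, that the last field encodes the OpenCL device version as major*100 + minor, which would make 102 match the "device 1.2" reported for the s90x0 boards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FIELDS 8
#define FIELD_LEN  64

/* Hypothetical helper: split "[A|B||C]" into fields, keeping empty ones.
   Returns the number of fields, or -1 if the brackets are missing
   (as with the empty summary string in the second run). */
int split_summary(const char *s, char fields[MAX_FIELDS][FIELD_LEN])
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len < 2 || s[0] != '[' || s[len - 1] != ']')
        return -1;

    int count = 0;      /* index of the field being filled */
    size_t pos = 0;     /* write position inside that field */
    fields[0][0] = '\0';

    for (size_t i = 1; i + 1 < len; i++) {
        if (s[i] == '|') {
            if (count + 1 >= MAX_FIELDS)   /* no room for another field */
                break;
            count++;
            pos = 0;
            fields[count][0] = '\0';
        } else if (pos + 1 < FIELD_LEN) {  /* silently truncate long fields */
            fields[count][pos++] = s[i];
            fields[count][pos] = '\0';
        }
    }
    return count + 1;
}

/* Assumed encoding: version = major * 100 + minor (so "102" is 1.2).
   Returns 0 on success, -1 if the field is not a plain number. */
int decode_cl_version(const char *field, int *major, int *minor)
{
    assert(field != NULL && major != NULL && minor != NULL);
    char *end = NULL;
    long v = strtol(field, &end, 10);
    if (end == field || *end != '\0' || v < 0)
        return -1;
    *major = (int)(v / 100);
    *minor = (int)(v % 100);
    return 0;
}
```

A manual scan is used instead of strtok because strtok collapses consecutive delimiters and would drop the empty driver field between "12288MB" and "102".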

Eric Driver
Message 3475 - Posted: 3 Apr 2023, 5:10:12 UTC - in response to Message 3474.

I was unfamiliar with the AMD FirePro S9050, so I did a quick search. I think it is powerful enough to run the app, but I could be wrong.

When GPU-z shows no load, that is when OpenCL is compiling the code. We have seen this issue before with some of the AMD drivers, where the OpenCL compiler hangs indefinitely (this was discussed somewhere in previous threads, but I am too lazy to dig through it right now). I recall the app working for some users, and after updating the driver they had the same problem you are currently seeing. I should mention, I also had this problem with the stock OpenCL on Linux - I fixed it by installing the AMD ROCm driver manually.

I imagine AMD is no longer updating the driver for your older card, so you may be stuck. When I get a chance tomorrow, I will look through old results in the database to see if any other users have a working FirePro GPU; then I can ask them what driver version they have.

You said you have used these cards under Linux. Does that include NumberFields? Because if it works there, then we know it is most likely the driver (to be more precise, the OpenCL part of the driver).

JStateson
Message 3476 - Posted: 3 Apr 2023, 14:12:55 UTC - in response to Message 3475.

I was running only SETI, Einstein and Milkyway in Ubuntu 18.04. I never tried Numberfields and no longer use these cards in Ubuntu due to the difficulty of rolling back to the kernel blessed by AMD.

The s9000 and s9050 have the Tahiti chipset, as do the HD-7970, 7950 and R9; all variations of Tahiti just have more memory and use more or fewer cores. The Tahiti problem was discussed before here. Note the only difference is the memory and number of cores:

https://www.techpowerup.com/gpu-specs/radeon-hd-7970.c296
https://www.techpowerup.com/gpu-specs/firepro-s9000.c1879

The s9100, s9150 and Mi25 are recognized as w9100, w9150 and wx9100 respectively in Windows 10 and work fine on all projects. These boards all go for under $100 USD used but need a good DIY fan.

The xml file in the slot assigned to Numberfields only recognizes the first card. That does not seem to be a problem, even though there are 4 really different cards in the system with the 2.0 OpenCL devices. How is that file generated? Is there a debugging feature that can be enabled by supplying an option in the command line?

The following code should never have executed, since DIGIT_BIT was defined to be 31 in an earlier include statement:

    if (DIGIT_BIT>32) {
        if( (pow2 & 0xFFFFFFFF)==0 ) { d = 32; pow2 >>= 32; }
    }
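One possible reading of the snippet above: a runtime "if (DIGIT_BIT>32)" keeps the branch from executing, but the compiler still has to compile the body, so it can warn about a 32-bit shift on a 32-bit value even when that branch is dead. The message quoted in the first post was a warning, not an error. The sketch below shows a preprocessor guard that hides the shift from a 32-bit build entirely. The surrounding function is not shown in the thread, so the trailing-zero count here, along with the names mp_digit and trailing_zero_bits, is only an assumed context for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DIGIT_BIT 31        /* value reported in the thread for OpenCL */
typedef uint32_t mp_digit;  /* hypothetical name for the 32-bit word */

/* Count trailing zero bits of a nonzero value by binary search.
   Assumed context only; the project's real function is not shown. */
int trailing_zero_bits(mp_digit pow2)
{
    assert(pow2 != 0);
    int d = 0;
#if DIGIT_BIT > 32
    /* Compiled only when digits are wider than 32 bits, so a 32-bit
       build never sees the 32-bit shift and cannot warn about it. */
    if ((pow2 & 0xFFFFFFFF) == 0) { d = 32; pow2 >>= 32; }
#endif
    if ((pow2 & 0xFFFF) == 0) { d += 16; pow2 >>= 16; }
    if ((pow2 & 0xFF) == 0)   { d += 8;  pow2 >>= 8; }
    if ((pow2 & 0xF) == 0)    { d += 4;  pow2 >>= 4; }
    if ((pow2 & 0x3) == 0)    { d += 2;  pow2 >>= 2; }
    if ((pow2 & 0x1) == 0)    { d += 1; }
    return d;
}
```

If this reading is right, the warning by itself would not prove that the 64-bit path was taken at run time; it would only show that the compiler looked at that line.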

Eric Driver
Message 3477 - Posted: 3 Apr 2023, 18:25:24 UTC - in response to Message 3476.

I queried the database for anyone with a gpu string containing "fire". There were only 99 such hosts. Of those, only 2 had recent successful results. The first was your s9150 host and the 2nd had the following info string:

    [BOINC|7.16.11][CAL|AMD FirePro W5100|1|4096MB||200][vbox|6.0.14|1|1]

I'm not sure how that host compares to yours from a performance perspective. But either way, there are no other users with a successful s9000 or s9050.

I'm not sure exactly who generates that xml file. I would guess the client generates it from data it receives from the project.

You mentioned other cards in the system. Are these other cards Nvidia by any chance? That could explain how the OpenCL compiler is getting confused - maybe the presence of the Nvidia card has resulted in the creation of the "CUDA" environment variable. That is the only way the 64-bit code can be reached (in hindsight, I should have used a less common name).