Posts by Richard Haselgrove

1) Message boards : News : Support for Intel GPUs (Message 3777) Posted 17 Jan 2025 by Richard Haselgrove Post: Here we go: https://numberfields.asu.edu/NumberFields/workunit.php?wuid=230312537 https://numberfields.asu.edu/NumberFields/result.php?resultid=246629207 Thanks for sharing! The results themselves look good. My concern is the run times are much higher than expected. The Arc B580 should be similar to a 4060. My 3070 Ti averages about 10 minutes per WU, which should be comparable. Your run times are about 4 times larger than that. Any idea what could be causing this? Are you running multiple GPU threads simultaneously? Is the card at 100% utilization? It's possible Intel's openCL driver is not efficient, but I doubt that can account for a factor of 4 slow down. Two memories from the early days of Intel on-die integrated GPUs (testing done on host 17234 and host 33342). 1) Accuracy: floating-point precision was reduced if the Intel OpenCL compiler was allowed to optimise the code with the "Fused multiply+add" opcode. This effect became more pronounced with the later and more powerful GPU models - the HD 530 showed it much more than the HD 4600. 2) Speed: the runtime support for the Intel OpenCL compiled code requires very little CPU - but boy, does it want it FAST! By default, BOINC will schedule one CPU task per CPU core, plus GPU tasks requiring less than 100% CPU utilisation. This over-commitment of the CPU causes a slowdown of up to 7-fold. There are three ways of mitigating this: a) reduce the number of CPU tasks running alongside the Intel GPU app. b) declare the Intel GPU app to require 100% CPU usage. c) Dangerous - use with care. Set the Intel GPU to run at REAL TIME priority, via a utility like Process Lasso. I experienced nothing worse than a momentary screen stutter once per task, but YMMV. References: 1) Private testing with a volunteer SETI@Home developer - he built them, I broke them! 2) First reported on the Einstein@home message boards.
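Mitigation b) above is normally done with an app_config.xml file in the project directory. The following is only a sketch of how such a file could be generated: the element names (app_version, app_name, plan_class, avg_ncpus, ngpus) follow the app_config.xml section of the BOINC User Manual as I understand it, while the app name "GetDecics" and plan class "opencl_intel" are placeholders guessed from an executable name and may not match the application actually in use. The helper name build_app_config is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, avg_ncpus=1.0, ngpus=1.0):
    """Return app_config.xml text reserving avg_ncpus CPU cores per GPU task."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    # avg_ncpus = 1.0 tells the client to budget a full core for the GPU app,
    # so one fewer CPU task is scheduled alongside it.
    ET.SubElement(version, "avg_ncpus").text = str(avg_ncpus)
    ET.SubElement(version, "ngpus").text = str(ngpus)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder names: check the real app name and plan class before use.
    print(build_app_config("GetDecics", "opencl_intel"))
```

The output would be saved as app_config.xml in the project's folder and picked up when the client re-reads its config files.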
2) Message boards : Number crunching : Expired certificate (Message 3739) Posted 17 Aug 2024 by Richard Haselgrove Post: I emailed the IT department last night. They are still working the problem. It was only a 3-month certificate, so I assume it was expected to auto-renew? Shouldn't be too hard, provided the supplier works weekends.
3) Message boards : News : Support for Intel GPUs (Message 3566) Posted 11 Aug 2023 by Richard Haselgrove Post: Have either of you considered using the configuration option <dont_check_file_sizes>0|1</dont_check_file_sizes> Normally, the size of application and input files are compared with the project-supplied values after the files are downloaded, and just before starting an application. If this flag is set, this check is skipped. Use it if you need to modify files locally for some reason. (details in the User manual)
4) Message boards : Number crunching : Credits aren't updated on BOINCstats, Free-DC (Message 3446) Posted 29 Jan 2023 by Richard Haselgrove Post: Check your project preferences page https://numberfields.asu.edu/NumberFields/prefs.php?subset=project, and ensure that this line (in the first group) is checked: Do you consent to exporting your data to BOINC statistics aggregation Web sites?
5) Message boards : Number crunching : More than one task per GPU? (Message 3383) Posted 19 Oct 2022 by Richard Haselgrove Post: Best to refer people directly to the User Manual In this case, there's a whole section on app_config.xml files: client configuration - project-level_configuration
6) Message boards : News : Future of the Project (Message 3305) Posted 10 Aug 2022 by Richard Haselgrove Post: Sorry to hear that. But it confirms a decision of my own. All of us will have been incurring elevated electricity prices during the current emergencies, and here in the UK, we face another significant price jump in less than two months - at the beginning of October. In recent years, I've been concentrating on ever-more-powerful GPU crunching, but I think with this price rise the time has come to cut back on that indulgence. So I'll be taking a dozen - reasonably modern and powerful - GPUs out of service, at least temporarily. In recent years, GPU projects have been increasingly encroaching on CPU computing, with applications utilising wasteful spin-wait synchronisation loops. So mothballing the GPUs will release a similar number of CPU cores, and many of them will find their way here. We'll see whose hardware fails first, but I hope to give you a decent shove towards your objectives before either of us succumbs.
7) Message boards : Number crunching : My user listing has disappeared from top participants stats pages (Message 3185) Posted 10 Dec 2021 by Richard Haselgrove Post: You're currently listed at number 20, so you can relax. The pages are cached, and they're not all updated at the same instant. It can sometimes happen that you're in the process of crossing a page boundary when the snapshots are taken - you might fall into the gap, or you might even appear on both! It should sort itself out at the next refresh.
8) Message boards : Number crunching : HTTPS for Master URL please (Message 3180) Posted 28 Nov 2021 by Richard Haselgrove Post: Project admins should pester David Anderson to ask why there is still no sign of a v7.18 release for all platforms.
9) Message boards : News : Support for Intel GPUs (Message 3177) Posted 26 Nov 2021 by Richard Haselgrove Post: My UHD 620 laptop survived the transition to Windows 11 (and preserved the BOINC v7.16.20 installation intact, somewhat to my surprise) - details at host 1232275. I've put through task 125207453 as a test: it returned a valid result in just under two hours, so there's another data point. I've commented with concern at other projects that their invocation of the -cl-mad-enable OpenCL compiler flag has led to validation errors, especially on this machine: but I would guess that the fused multiply + add opcode is more likely to cause problems with floating point arithmetic, rather than here. Still, I mention it just in case... This machine is an ultraportable, and after an earlier Windows 10 update, the fan only operates at zero or full speed: that doesn't sound good. I'll keep it in reserve and available for testing, but I won't be running it full time.
10) Message boards : News : Support for Intel GPUs (Message 3174) Posted 22 Nov 2021 by Richard Haselgrove Post: I'm slightly surprised that you've felt the need to research and supply additional data about GPU compute devices. Shouldn't that be made available to you and all other application developers, either by the BOINC platform, or from the manufacturer's driver tools? I've looked into what is made available to you by BOINC / OpenCL for my iGPUs.

field                          coproc_info.xml              sched_request
name                           Intel(R) HD Graphics 4600    Intel(R) HD Graphics 4600
vendor                         Intel(R) Corporation         Intel(R) Corporation
vendor_id                      32902                        32902
available                      1                            1
half_fp_config                 0                            0
single_fp_config               158                          158
double_fp_config               0                            0
endian_little                  1                            1
execution_capabilities         1                            1
global_mem_size                1360632218                   1360632218
local_mem_size                 65536                        65536
max_clock_frequency            400                          400
max_compute_units              20                           20
nv_compute_capability_major    0                            0
nv_compute_capability_minor    0                            0
amd_simd_per_compute_unit      0                            0
amd_simd_width                 0                            0
amd_simd_instruction_width     0                            0
opencl_platform_version        OpenCL 1.2                   OpenCL 1.2
opencl_device_version          OpenCL 1.2                   OpenCL 1.2
opencl_driver_version          10.18.14.5162                10.18.14.5162

Listed with a single value:
device_num                     0
peak_flops                     64000000000
opencl_available_ram           1360632218
opencl_device_index            0
warn_bad_cuda                  0

That's a pretty eclectic list. The first column is taken from an internal BOINC file called 'coproc_info.xml', and the second from the sched_request_...xml file sent by our clients to your servers. I don't know why the nv and amd fields are present in the iGPU report - presumably it was simpler to use a common data structure across all vendors. Anyway, numBlocks, threadsPerBlock and polyBufferSize are clearly all missing. How important are they, where can we get them from, and should we ask the BOINC developers to do the heavy lifting?
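As a side note on the device report above, the peak_flops value appears to be derived from two of the other fields. The sketch below checks that relationship; the factor of 8 floating-point operations per compute unit per clock cycle is an assumption inferred from the reported numbers, not something taken from BOINC documentation, and the function name is made up for illustration.

```python
# Values copied from the HD Graphics 4600 device report.
MAX_COMPUTE_UNITS = 20
MAX_CLOCK_MHZ = 400
REPORTED_PEAK_FLOPS = 64000000000

# Assumption: 8 floating-point operations per compute unit per clock cycle.
FLOPS_PER_UNIT_PER_CYCLE = 8


def estimated_peak_flops(compute_units, clock_mhz,
                         flops_per_cycle=FLOPS_PER_UNIT_PER_CYCLE):
    """Estimate peak FLOPS as units x clock (Hz) x operations per cycle."""
    return compute_units * clock_mhz * 1_000_000 * flops_per_cycle


if __name__ == "__main__":
    estimate = estimated_peak_flops(MAX_COMPUTE_UNITS, MAX_CLOCK_MHZ)
    print(estimate, estimate == REPORTED_PEAK_FLOPS)
```

If the assumption holds, the reported figure is a nominal ceiling computed from clock and unit count rather than a measured speed, which is worth bearing in mind when runtime estimates are built on it.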
11) Message boards : News : Support for Intel GPUs (Message 3164) Posted 19 Nov 2021 by Richard Haselgrove Post: Contention may also be playing a role - if other things were running on the GPU at the same time or all the CPU cores were maxed out (recall, the app needs a small portion of a cpu core). I've run intel_gpu tasks under BOINC for a number of projects - originally SETI@Home, and currently both Einstein and WCG/Covid. All my iGPU experience is under Windows. Contention certainly needs to be considered carefully. All GPU applications require some degree of CPU support while running, but the OpenCL language running under Windows on an iGPU is an outlier. It doesn't need much CPU time, but it wants CPU access FAST. That CPU support can be made available in two different ways: either by ensuring that a CPU core is free from a continuous running BOINC-style workload (which is wasteful), or - as first recognised at Einstein - by setting the CPU part of the iGPU application to run at real-time priority. !!! That latter suggestion clearly has to be handled with care by experienced users, but it works for me. It perhaps gives some insight into what's going on under the hood. Unfortunately, the iGPU is the ugly sister of the GPGPU programming world, and I don't know of anyone who has really got to grips with programming it properly under BOINC - too little time, too much to do.
12) Message boards : News : Support for Intel GPUs (Message 3154) Posted 18 Nov 2021 by Richard Haselgrove Post: Probably best to just abort them when you see this behavior. Yes, I'll probably do that before I go to bed this evening, but we may as well get as much information from them as possible. I have one machine with 02-Nov-2021 12:05:09 [---] OpenCL: Intel GPU 0: Intel(R) HD Graphics 530 (driver version 21.20.16.5103, device version OpenCL 2.0, 1298MB, 1298MB available, 202 GFLOPS peak) (host 33342) That has reached Now starting the targeted Martinet search: Num Cvecs = 50. Doing Cvec 1. Doing Cvec 2. Doing Cvec 3. but it's taken two and a half hours - barely better than an abacus! - but that too has no task_state.xml file. It is, however, showing a steady and appropriate progress value of 5.003%, so maybe that was a false alarm. I also have a UHD 620 machine, but it's not crunching at the moment. Maybe tomorrow.
13) Message boards : News : Support for Intel GPUs (Message 3150) Posted 18 Nov 2021 by Richard Haselgrove Post: Still running, after an hour and three quarters. I'm even more convinced it's stalled, so I'll have a deeper dig later. Workunit has settings <rsc_fpops_est>30000000000000.000000</rsc_fpops_est> <rsc_fpops_bound>700000000000000000000.000000</rsc_fpops_bound> That means that BOINC will allow it to run for over four centuries before it intervenes to abort it. I don't think I've quite got that much patience...
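The "four centuries" figure can be checked with a little arithmetic. This is a rough sketch, assuming the client derives both its runtime estimate and its abort limit from the same device speed, so that the limit is simply the estimate scaled by rsc_fpops_bound / rsc_fpops_est. The 10 min 42 sec estimate is the initial figure reported for this task elsewhere in the thread; the function name is made up for illustration.

```python
# Workunit settings quoted in the post.
RSC_FPOPS_EST = 30000000000000.0
RSC_FPOPS_BOUND = 700000000000000000000.0

# Initial runtime estimate reported for the task: 10 min 42 sec.
ESTIMATE_SECONDS = 10 * 60 + 42

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def abort_limit_years(estimate_seconds, fpops_est, fpops_bound):
    """Scale the runtime estimate by bound/est and convert to years.

    Assumption: estimate and limit are computed from the same speed figure.
    """
    limit_seconds = estimate_seconds * (fpops_bound / fpops_est)
    return limit_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = abort_limit_years(ESTIMATE_SECONDS, RSC_FPOPS_EST, RSC_FPOPS_BOUND)
    print(f"abort limit is roughly {years:.0f} years")
```

The bound is about 23 million times the estimate, so a ten-minute estimate stretches to something between four and five hundred years, consistent with the figure in the post.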
14) Message boards : News : Support for Intel GPUs (Message 3149) Posted 18 Nov 2021 by Richard Haselgrove Post: First task has reached an indicated 99% progress after 50 minutes, but I strongly suspect this is BOINC's imaginary 'pseudo progress' from a stalled app. I'll keep an eye on it.
15) Message boards : News : Support for Intel GPUs (Message 3147) Posted 18 Nov 2021 by Richard Haselgrove Post: And this doesn't look right, either.

D:\BOINCdata\slots\5>dir
 Volume in drive D has no label.
 Volume Serial Number is E0ED-E51A

 Directory of D:\BOINCdata\slots\5

18/11/2021  16:06    <DIR>          .
18/11/2021  16:06    <DIR>          ..
18/11/2021  16:06                 0 boinc_lockfile
18/11/2021  16:06               117 GetDecics_4.02_windows_x86_64__opencl_intel
30/01/2021  10:51               349 gpuLookupTable.txt
30/01/2021  10:51            34,028 gpuMultiPrec.h
18/11/2021  16:06               113 in
18/11/2021  16:06            10,923 init_data.xml
30/01/2021  10:51               728 mp_int.h
18/11/2021  16:06               128 out
30/01/2021  10:51            76,646 pdtKernel.cl
18/11/2021  16:06               281 stderr.txt
              10 File(s)        123,313 bytes

I don't see any sign of the app recording progress. I'd expect to see a 'boinc_task_state.xml' file by now. Initial runtime estimate was 10 min 42 sec (BOINC estimates are notoriously unreliable for a new app): it's been running for 18:30, but I'm not sure it's done anything.
16) Message boards : News : Support for Intel GPUs (Message 3146) Posted 18 Nov 2021 by Richard Haselgrove Post: This doesn't look right. From stderr, in running: GPU Summary String = [CUDA|NVIDIAGeForceGTX1660SUPER|2|4095MB|47212|300][INTEL|Intel(R)HDGraphics4600|1|1297MB||102]. Loading GPU lookup table from file. GPU found in lookup table: GPU Name = GTX1660. numBlocks = 8192. threadsPerBlock = 32. polyBufferSize = 262144. The "Intel HD Graphics 4600" is right (present on the machine, OpenCL driver installed, known to BOINC, ready for use), but the GPU Name and (probably) metrics are from the wrong manufacturer.
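To illustrate the mismatch, here is a hypothetical sketch of selecting the lookup key by vendor rather than taking the first GPU in the summary string. It is not the project's code: the parsing rules are inferred from the one stderr line quoted above, only the first two fields of each entry (vendor tag and device name) are interpreted, and both function names are made up for illustration.

```python
import re

# Summary string as printed in stderr on a host with both an NVIDIA card
# and an Intel iGPU.
SUMMARY = ("[CUDA|NVIDIAGeForceGTX1660SUPER|2|4095MB|47212|300]"
           "[INTEL|Intel(R)HDGraphics4600|1|1297MB||102]")


def parse_gpu_summary(summary):
    """Return one list of '|'-separated fields per bracketed GPU entry."""
    return [entry.split("|") for entry in re.findall(r"\[([^\]]*)\]", summary)]


def entry_for_vendor(summary, vendor):
    """Return the fields of the first entry whose vendor tag matches, or None."""
    for fields in parse_gpu_summary(summary):
        if fields[0] == vendor:
            return fields
    return None


if __name__ == "__main__":
    intel = entry_for_vendor(SUMMARY, "INTEL")
    # An Intel app version would look up this name, not the GTX 1660 entry.
    print(intel[1] if intel else "no Intel GPU listed")
```

On a mixed host like this one, keying the lookup on the vendor that matches the running app version would avoid picking up the NVIDIA card's numBlocks and threadsPerBlock for the Intel device.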
17) Message boards : News : Support for Intel GPUs (Message 3145) Posted 18 Nov 2021 by Richard Haselgrove Post: Just had this happen:
18/11/2021 15:36:28 | NumberFields@home | Scheduler request completed: got 42 new tasks
18/11/2021 15:36:28 | NumberFields@home | [sched_op] estimated total Intel GPU task duration: 26357 seconds
18/11/2021 15:36:30 | NumberFields@home | Started download of GetDecics_4.02_windows_x86_64__opencl_intel
The download of the .exe file is going very slowly (21 MB at a speed of 16 KBps), but it's getting there. Should be finished in 15 mins or so. I'll be watching host 1291 - any particular feedback you'd like? Edit:
18/11/2021 16:05:58 | NumberFields@home | Finished download of GetDecics_4.02_windows_x86_64__opencl_intel
18/11/2021 16:06:04 | NumberFields@home | Starting task wu_sf3_DS-16x271-15_Grp669648of1000000_0
Nearly half an hour to download! A caching server (e.g. Cloudflare) can help with that - otherwise, just have patience. It'll be better when the initial rush has died down.
18) Message boards : Number crunching : HTTPS for Master URL please (Message 3128) Posted 3 Oct 2021 by Richard Haselgrove Post: Another reason for not acting too precipitately: the SSL expiry problem only affected clients running on the Windows platform. They use a static ca-bundle.crt file, which can't be automatically updated. Other platforms use the operating system's security bundle, so they get updates with system updates. The emergency release we are expecting 'real soon now' will be for Windows only. The http->https patch will only be fully effective when a planned release is made across all platforms. That's been overdue for a while, but there's no sign of a plan for when it might take place.
19) Message boards : Number crunching : HTTPS for Master URL please (Message 3124) Posted 2 Oct 2021 by Richard Haselgrove Post: Please keep an eye on future developments - specifically, https://github.com/BOINC/boinc/pull/4539, "all master URL update to https". David Anderson wrote that yesterday, when he should have been fixing a problem caused by the expiry of an SSL certificate the day before. Judging by his comment, "Note: code you wrote a long time ago sometimes doesn't seem to make any sense at all", I don't expect this feature to be fully tested in time for the emergency release we expect in the next few days, but there's hope that the problems you encountered will be reduced when the next recommended release reaches widespread coverage.
20) Message boards : Number crunching : Team credit lost ! (Message 3035) Posted 30 Jan 2021 by Richard Haselgrove Post: ... your tasks still show "in progress".... If you click through to the workunit (as distinct from the task), they show as 'WU cancelled'. I've also just got a batch (across several machines, including this one) of 'download failed' - scheduler issued the work, but the data file couldn't be found. I think that's all part of the process of the database healing itself. I'm not worried about any of these - though it might be worth giving what remains of the database a good spring-clean at the end of this run. You might want to do that anyway to check that you've got all the results you were expecting. But if you've got a load of unmatched upload files, and a load of incomplete database records, a query might be able to flip the database state to 'needs validation', and recover them.
