Control both CPU 4.00 and GPU 4.02 simultaneously
| Author | Message |
|---|---|
|
Joined: 1 Oct 17 Posts: 2 Credit: 6,718,648 RAC: 402,355 |
Since both WUs have the same name, GetDecics, I cannot find a way to control them both. The problem is that this project sends far too many CPU WUs, which triggers tagging them as High Priority. When this happens, the GPU WU often stops running to allow the many HP CPU WUs to take over. The problem gets worse when a mix of deadlines appears: the closer ones run as HP, and the ones that have already started switch to Waiting To Run. I've tried everything I can think of in my app_config, but nothing seems to work. The easiest fix would be to do what every other BOINC project does and give them different names, e.g. GetDecicsCPU and GetDecicsGPU.
|
Eric Driver Joined: 8 Jul 11 Posts: 1472 Credit: 1,284,399,689 RAC: 1,174,842 |
"Since both WUs have the same name GetDecics I cannot find a way to control them both." Have you tried adjusting parameters in the manager? For example, only queuing 1 day of work should keep it from downloading too many WUs. The WUs have a 6 day deadline, so that would give you plenty of time to complete them before they are flagged as High Priority. Also, if your CPU has hyper threading, I find it best to set CPU usage to between 60% and 75% of the CPUs, otherwise you get too much resource contention between threads and the WUs take much longer to run. |
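The two settings suggested above can also be set locally in a `global_prefs_override.xml` file in the BOINC data directory. A minimal sketch, assuming the 1-day queue from the post and the upper end of its 60% to 75% range; the exact values are illustrative, not a recommendation from the project:

```xml
<!-- global_prefs_override.xml: local override of computing preferences -->
<global_preferences>
    <!-- keep roughly 1 day of work queued (value taken from the post above) -->
    <work_buf_min_days>1.0</work_buf_min_days>
    <!-- use at most 75% of logical CPUs (assumed; the post suggests 60% to 75%) -->
    <max_ncpus_pct>75</max_ncpus_pct>
</global_preferences>
```

The client should only pick this file up once it is told to re-read local preferences, or after a restart.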
Eric Driver Joined: 8 Jul 11 Posts: 1472 Credit: 1,284,399,689 RAC: 1,174,842 |
Also, here is a thread which talks about app_config.xml settings, which may or may not be helpful: https://numberfields.asu.edu/NumberFields/forum_thread.php?id=558#3378 This app_config works great for me when CPU usage is set to 50% (which means 100% of physical cores). |
SerVal Joined: 1 Jan 20 Posts: 69 Credit: 60,050,709 RAC: 298 |
-- use Gerasim
-- go
select [name] from dbo.Apps
/*
1. Get Decic Fields
2. GetDecicFields
3. GetDecicFields(Linux)
4. GetDecics_4.02_windows_x86_64__opencl_nvidia
*/
-- content of app_config.xml to make 1 GPU act as 2:
/************
<app_config>
<app>
<name>GetDecics_4.02_windows_x86_64__opencl_nvidia</name>
<max_concurrent>8</max_concurrent>
<gpu_versions>
<gpu_usage>0.45</gpu_usage>
<cpu_usage>0.85</cpu_usage>
</gpu_versions>
</app>
</app_config>
************/
-- a restart of BOINC Manager is required |
|
Joined: 4 Jan 25 Posts: 59 Credit: 345,095,899 RAC: 602,756 |
Your cache is too big. Given the number of projects you are running, and if you feel for some odd reason that you need to have a cache, make sure you set your cache to 0.2 days and 0.01 additional days. Otherwise, 0.97 days and 0.01 additional days would be plenty. Then leave things alone for a few days to let them settle down. Grant Darwin NT, Australia. |
|
Joined: 1 Oct 17 Posts: 2 Credit: 6,718,648 RAC: 402,355 |
I'm on Linux, so no Gerasim. The key word is simultaneously. Please read what I actually wrote. Here's one version I've tried:
<!-- i7-5960X 8c16t 4x16=64 GB L3 Cache 20 MB -->
<app_config>
<app>
<name>GetDecics</name>
<!-- NumberFields -->
<version_num>402</version_num>
<plan_class>cuda30</plan_class>
<gpu_versions>
<cpu_usage>1</cpu_usage>
<gpu_usage>1</gpu_usage>
</gpu_versions>
<max_concurrent>1</max_concurrent>
<fraction_done_exact/>
</app>
<app>
<name>GetDecics</name>
<!-- NumberFields -->
<version_num>400</version_num>
<plan_class>default</plan_class>
<max_concurrent>8</max_concurrent>
<fraction_done_exact/>
</app>
<report_results_immediately/>
</app_config>
|
Eric Driver Joined: 8 Jul 11 Posts: 1472 Credit: 1,284,399,689 RAC: 1,174,842 |
According to this link: https://github.com/BOINC/boinc/wiki/Client-configuration#Project-level-configuration you need an <app_version> block for each version you want to control. I have never tried this personally, so I don't know if it will actually work. |
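A minimal sketch of what the linked wiki page describes, assuming the app name is `GetDecics` and that the GPU version's plan class is `opencl_nvidia`. That plan class is a guess based on the Windows version string posted earlier in this thread; the real value should be listed for each app version in the client's `client_state.xml`. An empty `<plan_class>` is assumed here to select the plain CPU version. As the post above says, this is untested:

```xml
<app_config>
    <!-- GPU version: the plan class is an assumption, check client_state.xml -->
    <app_version>
        <app_name>GetDecics</app_name>
        <plan_class>opencl_nvidia</plan_class>
        <avg_ncpus>1</avg_ncpus>
        <ngpus>1</ngpus>
    </app_version>
    <!-- CPU version: empty plan class assumed to match the non-GPU build -->
    <app_version>
        <app_name>GetDecics</app_name>
        <plan_class></plan_class>
        <avg_ncpus>1</avg_ncpus>
    </app_version>
</app_config>
```

As far as I can tell, `<version_num>` is not among the elements the wiki lists for `<app_version>`, and `<max_concurrent>` belongs to `<app>`, so it would apply to the app as a whole rather than to one version.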
|
Joined: 4 Jan 25 Posts: 59 Credit: 345,095,899 RAC: 602,756 |
KISS always has and always will apply. Running with a reasonably sized cache would stop the issue from occurring, negating the need for convoluted app_config based fixes, which will most likely result in other issues that will then need to be addressed. *shrug* Grant Darwin NT, Australia. |
|
Joined: 8 Jul 17 Posts: 13 Credit: 72,795,837 RAC: 124,107 |
"Also, if your CPU has hyper threading, I find it best to set CPU usage to between 60% and 75% of the CPUs, otherwise you get too much resource contention between threads and the WUs take much longer to run." Interesting and very useful information. Does this mean that a typical NumberFields work unit has about a 60% - 75% FP32 instruction workload over the entire computation time, since a theoretical 100% FP32 workload will pretty much saturate a full physical core (i.e. 50% with HT will work well), whereas for integer operations running at 100% with HT on seems to be more beneficial? Will this change across different work unit batches? |
Eric Driver Joined: 8 Jul 11 Posts: 1472 Credit: 1,284,399,689 RAC: 1,174,842 |
"Also, if your CPU has hyper threading, I find it best to set CPU usage to between 60% and 75% of the CPUs, otherwise you get too much resource contention between threads and the WUs take much longer to run." Just so we are clear, the percentages I gave were for maximizing overall throughput (tasks completed per day). The best value varies with WU batch and CPU, which is why I left it as a broad range. This is not very scientific and is based only on my experience with several AMD machines. Anyway, I'm not sure exactly why it happens. Maybe at high CPU usage there's more contention for the shared L3 cache, or maybe the processor heats up and the clock speed gets throttled. Hopefully someone else can chime in with a better explanation. But to answer your main question, NumberFields is mostly integer arithmetic, not floating point. |