GPU app - beta version for linux nvidia

Message boards : News : GPU app - beta version for linux nvidia

Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2236 - Posted: 24 Mar 2019, 20:16:22 UTC - in response to Message 2234.  

Would it be possible for the Server Status page to show the CPU and GPU WU counts separately, so we can tell whether GPU WUs are available?


My version of the server status page is probably out of date: it only breaks the counts down by app, not by app version. But I'm sure I can modify it to do that. I will add it to my list.

In the meantime, I have been watching it to make sure it doesn't run dry, so no need to worry about that.
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2237 - Posted: 24 Mar 2019, 20:22:14 UTC - in response to Message 2233.  

^I have seen the same behaviour on a 2-GPU system. It shows one task for each device, but it's actually crunching both on a single card while the other one is left idle.

EDIT: For now I have simply added an exclusion for the second GPU (GTX 980) and assigned it to another project. The first (GTX 1660 Ti) is now crunching two tasks at the same time with a noticeable boost in GPU utilization (and throughput? projected credit/day is 4+ million). Too bad each seems to require a full CPU thread.


I wonder if I should be calling a function to set the device. I vaguely remember seeing something about that, but I completely forgot to follow up on it. Having only a single GPU, I never noticed this "bug".
davidBAM

Joined: 25 Oct 18
Posts: 15
Credit: 112,744,248
RAC: 485
Message 2239 - Posted: 24 Mar 2019, 20:35:59 UTC - in response to Message 2231.  

@lakewik which command did you use to see processes running on each GPU please?
Stiwi

Joined: 13 Mar 19
Posts: 10
Credit: 35,453,876
RAC: 19,711
Message 2242 - Posted: 24 Mar 2019, 22:43:30 UTC - in response to Message 2239.  

@lakewik which command did you use to see processes running on each GPU please?

nvidia-smi

If you want realtime data:
watch -n 1 nvidia-smi
davidBAM

Joined: 25 Oct 18
Posts: 15
Credit: 112,744,248
RAC: 485
Message 2243 - Posted: 24 Mar 2019, 22:54:26 UTC - in response to Message 2242.  

Thank you

Yes, that confirms multiple WUs were running on a single GPU rather than one per GPU.
Richard Haselgrove

Joined: 28 Oct 11
Posts: 180
Credit: 252,529,025
RAC: 182,120
Message 2244 - Posted: 24 Mar 2019, 23:03:08 UTC - in response to Message 2237.  

I wonder if I should be calling a function to set the device. I vaguely remember seeing something about that, but I completely forgot to follow up on it. Having only a single GPU, I never noticed this "bug".
Yes, you should.

https://boinc.berkeley.edu/trac/wiki/AppCoprocessor#Deviceselection

Concentrate on boinc_get_init_data() - the older command line --device N is so old it can be relegated to an afterthought.
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2246 - Posted: 25 Mar 2019, 1:40:51 UTC - in response to Message 2244.  

I just put a new app version out there that should take care of the device selection bug. It is version 3.01.

The only problem is there are 15k WUs already out there associated with version 3.00. I created another 10k with no version association. Still deciding what to do with the other 15k...
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2247 - Posted: 25 Mar 2019, 1:45:22 UTC - in response to Message 2244.  

I wonder if I should be calling a function to set the device. I vaguely remember seeing something about that, but I completely forgot to follow up on it. Having only a single GPU, I never noticed this "bug".
Yes, you should.

https://boinc.berkeley.edu/trac/wiki/AppCoprocessor#Deviceselection

Concentrate on boinc_get_init_data() - the older command line --device N is so old it can be relegated to an afterthought.


It turns out the --device N command line option was helpful for debugging. When running in stand-alone mode there is no init data, so that helped me to get a device number into the code for testing.
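
The fallback order described in this exchange can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the device number from boinc_get_init_data() (the gpu_device_num field described on the wiki page linked above) is negative when no init data is available, as in stand-alone mode. The names choose_device, device_from_args and kNoDevice are hypothetical.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>

// Assumed sentinel: a negative device number means "not provided"
// (stand-alone run with no init data, or a very old client).
constexpr int kNoDevice = -1;

// Scan argv for "--device N" and return N, or kNoDevice if the option
// is absent or N is not a non-negative integer.
int device_from_args(int argc, const char* const* argv) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--device") == 0) {
            char* end = nullptr;
            long n = std::strtol(argv[i + 1], &end, 10);
            if (end != argv[i + 1] && *end == '\0' && n >= 0 && n <= INT_MAX) {
                return static_cast<int>(n);
            }
            return kNoDevice;
        }
    }
    return kNoDevice;
}

// Prefer the device number from init data, then the command line,
// then device 0 so a single-GPU stand-alone run still works.
int choose_device(int init_data_device, int argc, const char* const* argv) {
    if (init_data_device >= 0) {
        return init_data_device;
    }
    int from_args = device_from_args(argc, argv);
    return from_args >= 0 ? from_args : 0;
}
```

In a CUDA app the returned index would presumably be passed to cudaSetDevice() before any other CUDA call. Without such a call every task ends up on the default device, which matches the behaviour reported earlier in the thread.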
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2248 - Posted: 25 Mar 2019, 5:30:12 UTC - in response to Message 2246.  

I deprecated the 3.00 app version so that my client would pick up the newer one. Maybe someone knows another way but this seemed to do the trick.

Please test if it works as expected with multiple GPUs. Assuming it works, those who have multiple GPUs may want to abort the old WUs.
davidBAM

Joined: 25 Oct 18
Posts: 15
Credit: 112,744,248
RAC: 485
Message 2249 - Posted: 25 Mar 2019, 5:33:00 UTC - in response to Message 2248.  

Thanks Eric - I have quite a few of the original WUs to work my way through, but I can do this now that I know what is happening. If it looks like I'll miss the deadline, I'll abort some of them.

Thanks for fixing this so quickly
mmonnin

Joined: 1 Feb 17
Posts: 23
Credit: 62,005,857
RAC: 2,459
Message 2251 - Posted: 25 Mar 2019, 10:35:07 UTC

Yesterday was triple the output of the Formula BOINC Sprint's best day.
Azmodes

Joined: 20 May 18
Posts: 6
Credit: 165,471,630
RAC: 0
Message 2255 - Posted: 25 Mar 2019, 15:27:36 UTC - in response to Message 2246.  

I just put a new app version out there that should take care of the device selection bug. It is version 3.01.

Works, both GPUs are now actually being used.
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2259 - Posted: 25 Mar 2019, 16:39:45 UTC - in response to Message 2255.  

I just put a new app version out there that should take care of the device selection bug. It is version 3.01.

Works, both GPUs are now actually being used.


Good to hear!
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2261 - Posted: 25 Mar 2019, 16:45:26 UTC - in response to Message 2251.  

Yesterday was triple the output of the Formula BOINC Sprint's best day.


Unfortunately, I don't think those numbers are accurate, just based on the numbers of results coming back. It's correlated to the obscenely high GPU credits being awarded. Maybe it's time I change to "Credit New", from the current "Credit based on runtime"?
mmonnin

Joined: 1 Feb 17
Posts: 23
Credit: 62,005,857
RAC: 2,459
Message 2265 - Posted: 25 Mar 2019, 17:47:50 UTC - in response to Message 2261.  

Yesterday was triple the output of the Formula BOINC Sprint's best day.


Unfortunately, I don't think those numbers are accurate, just based on the numbers of results coming back. It's correlated to the obscenely high GPU credits being awarded. Maybe it's time I change to "Credit New", from the current "Credit based on runtime"?


I can only see credit, not # of results, so that's what I mean by output.

CreditNew provides lower credit for CPUs than the current system here so those CPU credits will be lowered too. But something will have to be done.
davidBAM

Joined: 25 Oct 18
Posts: 15
Credit: 112,744,248
RAC: 485
Message 2266 - Posted: 25 Mar 2019, 17:58:08 UTC - in response to Message 2265.  

Why? (asked in all seriousness)

Presumably the GPU app advances the project much quicker, so it has to pay the going rate to attract GPU owners to crunch them. By my rough reckoning, the new WUs are paying approx double what GPUGrid is paying but less than half of what Collatz pays.
n3Ro

Joined: 7 Jan 16
Posts: 1
Credit: 42,301,682
RAC: 3,274
Message 2267 - Posted: 25 Mar 2019, 18:44:14 UTC - in response to Message 2248.  

I deprecated the 3.00 app version so that my client would pick up the newer one. Maybe someone knows another way but this seemed to do the trick.

Please test if it works as expected with multiple GPUs. Assuming it works, those who have multiple GPUs may want to abort the old WUs.


Works fine for me now. WUs are now distributed over multiple GPUs, in contrast to yesterday's behaviour. Thank you!
tito

Joined: 18 Aug 13
Posts: 2
Credit: 1,003,792
RAC: 0
Message 2268 - Posted: 25 Mar 2019, 18:58:05 UTC
Last modified: 25 Mar 2019, 18:59:44 UTC

Only errors here.
GTX645; newest drivers 418
Can anyone help?

PG PPS sieve works fine.
glennpat

Joined: 23 Aug 11
Posts: 5
Credit: 84,950,248
RAC: 0
Message 2269 - Posted: 25 Mar 2019, 19:05:58 UTC - in response to Message 2261.  

Yesterday was triple the output of the Formula BOINC Sprint's best day.


Unfortunately, I don't think those numbers are accurate, just based on the numbers of results coming back. It's correlated to the obscenely high GPU credits being awarded. Maybe it's time I change to "Credit New", from the current "Credit based on runtime"?


I do believe you need to change the credit system. On my GPUs it takes running about 3 WUs to fully load the GPU. If I run 6 WUs per GPU I get twice the credit, but the amount of work done is probably about the same. I only ran 6 WUs per GPU on 1 GPU for a couple of hours. I started to run other ones for a few minutes, but then backed off to 3 since it doesn't really seem fair.

Thanks for the great job. Things do run really well.
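
The effect described in this post can be illustrated with a toy model, under loudly stated assumptions: the GPU is already saturated so its throughput in WUs per hour stays constant, and runtime-based credit is granted in proportion to each task's elapsed time. The function name, struct and all numbers are illustrative only and are not taken from the project's validator.

```cpp
// Toy model of runtime-based credit on a saturated GPU.
struct HourlyTotals {
    double tasks_per_hour;   // actual work completed per hour
    double credit_per_hour;  // credit granted per hour under runtime-based credit
};

// gpu_tasks_per_hour: throughput once the GPU is fully loaded (assumed constant).
// concurrent: number of tasks sharing the GPU at the same time.
// credit_per_task_hour: credit granted per hour of a single task's elapsed time.
HourlyTotals runtime_credit(double gpu_tasks_per_hour, int concurrent,
                            double credit_per_task_hour) {
    // With fixed throughput, each task's elapsed time grows in proportion
    // to the number of tasks sharing the card.
    double hours_per_task = concurrent / gpu_tasks_per_hour;
    double credit_per_task = credit_per_task_hour * hours_per_task;
    return {gpu_tasks_per_hour, gpu_tasks_per_hour * credit_per_task};
}
```

Under these assumptions, going from 3 to 6 concurrent tasks leaves tasks_per_hour unchanged while credit_per_hour doubles, which is the mismatch between credit and work that the post points out.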
Eric Driver
Project administrator
Project developer
Project tester
Project scientist

Joined: 8 Jul 11
Posts: 1344
Credit: 530,216,564
RAC: 559,997
Message 2270 - Posted: 25 Mar 2019, 20:12:54 UTC - in response to Message 2268.  

Only errors here.
GTX645; newest drivers 418
Can anyone help?

PG PPS sieve works fine.


What version of CUDA do you have?
If it's pre-10.1, that could be the problem. I might have to update the plan_class to reflect the CUDA version.


Copyright © 2024 Arizona State University