Long running wu_Qsqrt421_DS1x5 units - how long to let them run?
Author | Message |
---|---|
Joined: 17 Dec 15 Posts: 6 Credit: 9,414,792 RAC: 0 |
I am using application 2.08 (Windows x86_64) with two WUs named Qsqrt421_D53x8_CV1_S815_N2 followed by (a) 194161_N1_805973_k2_-2_1 and (b) 194163_N1_805977_k2_-1_1. Both show a remaining time of 00:00:00 and 100% complete. (a) shows CPU time at last checkpoint 3d 17:28:51, CPU time 4d 07:39:02, elapsed time 5d 18:34:51. (b) shows CPU time at last checkpoint 3d 01:46:36, CPU time 3d 14:23:49, elapsed time 4d 20:56:25. Are they finished but continuing to run? Should I abort them? I have a third related WU running on the same machine, for which the CPU time at last checkpoint and the total CPU time are close, with 3+ days elapsed, 2+ days remaining, and 57+ percent complete. |
Joined: 25 Feb 13 Posts: 216 Credit: 9,899,302 RAC: 0 |
I have seen the progress meter go to 100.000% and the WU still continues processing for another few hours. I believe what is happening is that the progress is really 99.9995% and the client is rounding it up to 100. No need to worry that it's stuck; the WU will eventually finish. Just wait and drink tea. :) |
Joined: 8 Jul 11 Posts: 1344 Credit: 543,091,180 RAC: 610,505 |
A bug was discovered in the code that reports progress to the client (it always reports 0), so BOINC is making its own estimate, and that estimate is wrong. Other than that, there is nothing wrong with the WU and it will eventually finish, but it could take another several days (so I hear from others; I personally have only waited a few more hours). Version 2.10 of the app fixes the progress meter problem. |
Joined: 17 Dec 15 Posts: 6 Credit: 9,414,792 RAC: 0 |
Still going - drinking tea, etc. - I'll decide tomorrow . . . |
Joined: 17 Dec 15 Posts: 6 Credit: 9,414,792 RAC: 0 |
All 3 WUs are still going. The first 2 still show a remaining time of 00:00:00 and 100% complete. (a) shows CPU time at last checkpoint 4d 14:28 (up from 3d 17:28:51), CPU time 5d 01:21 (up from 4d 07:39:02), elapsed time 6d 16:58 (up from 5d 18:34:51). (b) shows CPU time at last checkpoint 4d 05:21 (up from 3d 01:46:36), CPU time 4d 08:12 (up from 3d 14:23:49), elapsed time 5d 19:19 (up from 4d 20:56:25). I assume this indicates progress, or should I abort them? The third related WU running on the same machine has made demonstrable progress: the CPU time at last checkpoint and the total CPU time remain close, now at 3d 08:48 and 3d 10:20, with 4+ days elapsed, 1+ days remaining, and 71.5% complete (previously 57+ percent). |
Joined: 8 Jul 11 Posts: 1344 Credit: 543,091,180 RAC: 610,505 |
Since they are making progress I would let them continue. I promise you they will finish. |
Joined: 23 Feb 13 Posts: 29 Credit: 21,480,710 RAC: 0 |
Besides the PrimeGrid race, I have two of these WUs left. One of them is "stuck" at 100% and the other has been holding at 57% for several days. I will continue both of them; the program runs quite stably (excluding the PARI crashes) and I've done many of these WUs in the past. All of them have been granted credit. |
Joined: 17 Dec 15 Posts: 6 Credit: 9,414,792 RAC: 0 |
The first 2 WUs are still going. Both still show a remaining time of 00:00:00 and 100% complete. (a) shows CPU time at last checkpoint 5d 11:30 (up from 4d 14:28 and 3d 17:28:51), CPU time 5d 23:06 (up from 5d 01:21 and 4d 07:39:02), elapsed time 7d 19:46 (up from 6d 16:58 and 5d 18:34:51). (b) shows CPU time at last checkpoint 4d 18:56 (up from 4d 05:21 and 3d 01:46:36), CPU time 5d 05:50 (up from 4d 08:12 and 3d 14:23:49), elapsed time 6d 22:06 (up from 5d 19:19 and 4d 20:56:25). I assume this indicates enough progress to continue. The third related WU running on the same machine is gone (I assume completed). |
Joined: 8 Jul 11 Posts: 1344 Credit: 543,091,180 RAC: 610,505 |
It looks like you completed a long one yesterday. I wonder if that's it. http://numberfields.asu.edu/NumberFields/results.php?hostid=26919 |
Joined: 17 Dec 15 Posts: 6 Credit: 9,414,792 RAC: 0 |
WU 15560839 is still running; it is the one I listed as (a) in previous posts. CPU time at last checkpoint 6d 08:51, up from 5d 11:30 (before that 4d 14:28 and 3d 17:28:51). CPU time 6d 19:41, up from 5d 23:06 (before that 5d 01:21 and 4d 07:39:02). Elapsed time 8d 20:00, up from 7d 19:46 (before that 6d 16:58 and 5d 18:34:51). This is truly long . . . but still running. 15565332 and 15161473 both finished on 22 April and appear to be credited appropriately, thanks. How do I upgrade the app to 2.10: just wait until this WU finishes and then reset the project, or remove and then add the project? |
Joined: 8 Jul 11 Posts: 1344 Credit: 543,091,180 RAC: 610,505 |
The client picks it up automatically. Looking at your task list, it looks like your more recent jobs have been using it, so you should be good. |
Joined: 17 Dec 15 Posts: 6 Credit: 9,414,792 RAC: 0 |
At last . . . It finished with a CPU time of 624,877.40 sec and a total time of 805,908.80 sec, or 173.58 hr and 223.86 hr, or 7.23 d and 9.33 d. I noticed that on most work units the total time is closer to the CPU time, and credit seems to be based on the CPU time. But I am glad that it did go to completion! |
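
For anyone who wants to double-check the seconds-to-hours and seconds-to-days conversion in the last post, here is a minimal Python sketch. The two input figures are the ones reported in that post; the function names (`to_hours`, `to_days`, `cpu_efficiency`) are made up for illustration and are not part of BOINC or the project app. The CPU-to-elapsed ratio is an extra, offered only as one rough way to express how far apart the two times are.

```python
# Figures copied from the final post in this thread; all names are illustrative.
CPU_SECONDS = 624_877.40
ELAPSED_SECONDS = 805_908.80

SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 24


def to_hours(seconds: float) -> float:
    """Convert a duration in seconds to hours."""
    return seconds / SECONDS_PER_HOUR


def to_days(seconds: float) -> float:
    """Convert a duration in seconds to days."""
    return to_hours(seconds) / HOURS_PER_DAY


def cpu_efficiency(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Fraction of wall-clock time during which the task was using the CPU."""
    return cpu_seconds / elapsed_seconds


if __name__ == "__main__":
    for label, secs in (("CPU", CPU_SECONDS), ("Elapsed", ELAPSED_SECONDS)):
        print(f"{label}: {to_hours(secs):.2f} hr, {to_days(secs):.2f} d")
    ratio = cpu_efficiency(CPU_SECONDS, ELAPSED_SECONDS)
    print(f"CPU/elapsed ratio: {ratio:.1%}")
```

A ratio well below 100% would be consistent with the observation that the total time on this unit was further from the CPU time than usual, though the sketch says nothing about why (other tasks, suspensions, and so on).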