Double credits for problematic work units.
Author | Message |
---|---|
Joined: 8 Jul 11 Posts: 1341 Credit: 492,828,056 RAC: 552,132 |
As has been discussed in the message boards, there is a relatively small set of really long-running WUs. Some of these may take up to 8 days or even longer. As an incentive for users not to abort these WUs, double credits will be granted for them. The bad work units have names of the form Qsqrt421_DS1x5_CV2_S815_*to* OR Qsqrt421_DS1x8_CV1_S815_*to*. There are about 500 such WUs, of which ~150 are of the really slow variety. |
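For anyone who wants to check their task list against the two name patterns above, here is a minimal sketch using shell-style wildcards. The patterns are copied from the announcement; the sample task names are invented for illustration and are not real work units.

```python
from fnmatch import fnmatchcase

# Patterns copied from the announcement above.
DOUBLE_CREDIT_PATTERNS = (
    "Qsqrt421_DS1x5_CV2_S815_*to*",
    "Qsqrt421_DS1x8_CV1_S815_*to*",
)


def is_double_credit_wu(name: str) -> bool:
    """Return True if a work unit name matches either announced pattern."""
    return any(fnmatchcase(name, pattern) for pattern in DOUBLE_CREDIT_PATTERNS)


# Hypothetical task names, made up for this example only.
sample_names = [
    "Qsqrt421_DS1x5_CV2_S815_100to200",
    "Qsqrt421_DS1x8_CV1_S815_300to400_0",
    "Qsqrt421_DS1x5_CV1_S815_100to200",
]

for name in sample_names:
    print(name, is_double_credit_wu(name))
```

The match is case-sensitive and only tells you that a task belongs to the batch of about 500; it cannot tell you whether it is one of the ~150 really slow ones.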
Joined: 29 Dec 15 Posts: 5 Credit: 127,662 RAC: 0 |
Any idea why these, in particular, are taking so long? I think I have one of them running now. It's been going for about 2 days and 13hrs so far and it's only at about 13.5% done. The estimated remaining time is currently reading about 16 days and 13hrs, but the estimate is still increasing! So I guess it will run for 14 more days at least (probably more, based on the increasing time). I'll miss the deadline (14 Jan) at this rate though. Will it still count even if it comes in late (by a week or more)? |
Joined: 8 Jul 11 Posts: 1341 Credit: 492,828,056 RAC: 552,132 |
"Any idea why these, in particular, are taking so long? I think I have one of them running now. It's been going for about 2 days and 13hrs so far and it's only at about 13.5% done. The estimated remaining time is currently reading about 16 days and 13hrs, but the estimate is still increasing! So I guess it will run for 14 more days at least (probably more, based on the increasing time). I'll miss the deadline (14 Jan) at this rate though. Will it still count even if it comes in late (by a week or more)?" Yes, it will still count. The only problem is if it gets reissued to someone else who then finishes it first. Just let me know if this happens and I will get you the credit. As for why this is happening, it was a bad formula for predicting the location of fast/slow regions. The formula worked well for the bounded app for a while, but it turned out the formula had a small dependence on the discriminant, which caused the predictions to slowly drift as time went on. This same problematic formula was used for the Qsqrt421 cases, where the problem is even more pronounced. The good news is I now have a more precise formula which appears to be working very well (in offline testing). So the next batches of WUs should not suffer from this problem. |
Joined: 29 Dec 15 Posts: 5 Credit: 127,662 RAC: 0 |
"Only problem is if it gets reissued to someone else who then finishes it first. Just let me know if this happens and I will get you the credit." I suspect this may happen because my progress is only up to 15% after 3 days and ~11hrs, and the estimated remaining time is still climbing. Currently at 19 days and ~13hrs... I may just cut this one loose and move on! Thanks for the explanation! |
Joined: 8 Jul 11 Posts: 1341 Credit: 492,828,056 RAC: 552,132 |
"Only problem is if it gets reissued to someone else who then finishes it first. Just let me know if this happens and I will get you the credit." I tend to ignore the estimated time: BOINC assumes the time is uniformly distributed, which it is not for these WUs, so it can lead to erroneous estimates. With that said, 15% after 3 days does not seem encouraging. You are probably looking at 2 weeks to complete. I wouldn't blame you if you decided to abort it. |
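To illustrate why the estimate keeps climbing, here is a minimal sketch of a uniform-progress extrapolation. It is one simple reading of the assumption described above, not BOINC's actual estimator code, and the function name is made up for this example. The inputs are the figures reported earlier in the thread (15% done after 3 days and about 11 hours).

```python
def linear_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Remaining time if progress were uniform: elapsed * (1 - f) / f.

    Illustrative only; this is an assumed simplification, not BOINC source.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1.0 - fraction_done) / fraction_done


# Figures reported earlier in the thread: 15% after 3 days and ~11 hours.
elapsed = 3 * 24 + 11
remaining = linear_remaining_hours(elapsed, 0.15)
days, hours = divmod(remaining, 24)
print(f"about {int(days)} days {hours:.0f} hours remaining")
```

With those inputs the extrapolation lands at roughly 19.6 days, close to the "19 days and ~13hrs" the client was showing. If the percentage advances more slowly than the clock in the early part of a WU, this number keeps rising even though the real finish time has not moved.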
Joined: 5 Jun 12 Posts: 3 Credit: 2,655,424 RAC: 0 |
Didn't realize that I had been running one of these behemoth tasks for the past six days! Only 50% done so far. So, should I keep it running? http://numberfields.asu.edu/NumberFields/result.php?resultid=13643497 |
Joined: 8 Jul 11 Posts: 1341 Credit: 492,828,056 RAC: 552,132 |
"Didn't realize that I had been running one of these behemoth tasks for the past six days! Only 50% done so far. So, should I keep it running?" Since it's past halfway, I would say let it keep going. |
Joined: 18 Jan 13 Posts: 2 Credit: 6,731,219 RAC: 0 |
I've got two of those: One thinks it is halfway (done 10d, only 9d to go), but it is past its due date of 2 January. The other one has run for 2d and takes another estimated 567d. So that one won't make the date either. Is it worthwhile to let them run? I don't have any problem with it. The machine is running 24/7 and runs 4 NumberFields tasks on an i5 processor. So other work is still done. |
Joined: 8 Jul 11 Posts: 1341 Credit: 492,828,056 RAC: 552,132 |
"I've got two of those:" Well, I wouldn't worry about it taking 567 days. That estimate is way too high. Since the other is halfway, I'd let it continue. |
Joined: 5 Jun 12 Posts: 3 Credit: 2,655,424 RAC: 0 |
"Didn't realize that I had been running one of these behemoth tasks for the past six days! Only 50% done so far. So, should I keep it running?" OOPS! Did I say 50%! Not even close! After running 188 hrs, it just hit the 50% mark! :D |
Joined: 12 Oct 13 Posts: 17 Credit: 39,647,678 RAC: 4,181 |
This WU took 17 days to process on my AMD A8-3850 :) http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12351564 |
Joined: 5 Jun 12 Posts: 3 Credit: 2,655,424 RAC: 0 |
"This WU took 17 days to process on my AMD A8-3850 :)" At least the granted credit eases the pain a bit! ;) |
Joined: 15 Mar 15 Posts: 11 Credit: 113,280,935 RAC: 0 |
I have 5 of these long-running q-squirts running on one of my machines with about 21 days of combined accumulated crunch time. My account page says two of them have already deadlined and been redistributed, but they're still running on this end. When they finally get done (in 844 days, according to BOINC), how will I know how long they took? Or will they be uploaded as normal completions? |
Joined: 15 Mar 15 Posts: 11 Credit: 113,280,935 RAC: 0 |
Just checked my remote machines...found one WU has an estimated remaining time of 5.665 years. Running XP will probably be illegal by then... |
Joined: 8 Jul 11 Posts: 1341 Credit: 492,828,056 RAC: 552,132 |
I apologize for not responding sooner. I am in France attending a conference and I haven't had much time to check the message boards. I don't believe any of these cases should take more than a few weeks. That is based on the timing of earlier cases. Of course it depends on the speed of the host, how often it is interrupted, the ratio of threads running to the number of physical cores, etc. Even when the deadline is missed, it will be treated normally. The exception is when someone else returns it first, but credit can be awarded manually in that case (you just need to let me know). |
Joined: 29 Dec 15 Posts: 5 Credit: 127,662 RAC: 0 |
Well, it's been 10 days and about 14hrs. 25.5% complete. Estimated remaining is 29 days and about 7hrs, still rising... Even so, I've decided to see this one through to the end. I'm a bit curious to see just how long it goes. Hopefully it's doing something useful! I guess you can expect to see a PM from me in about a month, apparently. Thanks for being so attentive to the message boards, hope your conference is going well! - Leo |
Joined: 12 Jul 12 Posts: 9 Credit: 10,000,929 RAC: 0 |
Leo, the only work unit I see for you that has timed out is http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12296215 which has already been completed by someone else. It took him 22 days on a much faster processor than yours. |
Joined: 29 Dec 15 Posts: 5 Credit: 127,662 RAC: 0 |
Frac, I'm not sure what happened, I don't think there was anyone else on this workunit before, but now I see that other completed task there, so who knows. Maybe I'm not remembering it correctickeley... But, I've come so far already, too late to turn back now, never give up, even if the odds are against you and everyone is saying NO! I'm going for it! Let my other projects suffer the loss of a core! NumberFields, This Core's for you! - Leo |
Joined: 18 Jan 13 Posts: 2 Credit: 6,731,219 RAC: 0 |
The WUs I mentioned seem to run faster at the end, as you said. One is gone now (and my average credit spiked) and the other one is at 99% after 9 days. So all is well in the end. |
Joined: 29 Dec 15 Posts: 5 Credit: 127,662 RAC: 0 |
Yay! My workunit finally finished today! Final elapsed time: 29 days, 23 hours, 34 minutes, and 59 seconds! |