Double credits for problematic work units.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 685
Credit: 46,255,262
RAC: 39,614
Message 1432 - Posted: 1 Jan 2016, 22:11:39 UTC

As has been discussed on the message boards, there is a relatively small set of really long-running WUs. Some of these may take 8 days or even longer.

As an incentive for users not to abort these WUs, double credits will be granted for them.

The bad work units have names of the form Qsqrt421_DS1x5_CV2_S815_*to* OR Qsqrt421_DS1x8_CV1_S815_*to*.

There are about 500 such WUs, of which ~150 are of the really slow variety.
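
For anyone who wants to check a task list against the two name patterns above, here is a minimal Python sketch. It assumes the `*` in the patterns is a shell-style wildcard. The function name `is_double_credit_wu` and the sample names are made up for illustration and are not real work units.

```python
from fnmatch import fnmatchcase

# The two name patterns from the announcement; "*" is assumed to be a wildcard.
DOUBLE_CREDIT_PATTERNS = (
    "Qsqrt421_DS1x5_CV2_S815_*to*",
    "Qsqrt421_DS1x8_CV1_S815_*to*",
)


def is_double_credit_wu(wu_name: str) -> bool:
    """Return True if the WU name matches either pattern (case-sensitive)."""
    return any(fnmatchcase(wu_name, pattern) for pattern in DOUBLE_CREDIT_PATTERNS)


if __name__ == "__main__":
    # Hypothetical names, made up for this example.
    for name in (
        "Qsqrt421_DS1x5_CV2_S815_100to199",
        "Qsqrt421_DS1x8_CV2_S815_100to199",
    ):
        print(name, is_double_credit_wu(name))
```

The match is case-sensitive on purpose, so a name is only flagged when it follows the announced form exactly.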

Leopold
Joined: 29 Dec 15
Posts: 5
Credit: 127,662
RAC: 0
Message 1456 - Posted: 7 Jan 2016, 2:47:08 UTC - in response to Message 1432.  

Any idea why these, in particular, are taking so long? I think I have one of them running now. It's been going for about 2 days and 13hrs so far and it's only at about 13.5% done. The estimated remaining time is currently reading about 16 days and 13hrs, but the estimate is still increasing! So I guess it will run for 14 more days at least (probably more, based on the increasing time). I'll miss the deadline (14 Jan) at this rate though. Will it still count even if it comes in late (by a week or more)?

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 685
Credit: 46,255,262
RAC: 39,614
Message 1457 - Posted: 7 Jan 2016, 6:42:29 UTC - in response to Message 1456.  

Any idea why these, in particular, are taking so long? I think I have one of them running now. It's been going for about 2 days and 13hrs so far and it's only at about 13.5% done. The estimated remaining time is currently reading about 16 days and 13hrs, but the estimate is still increasing! So I guess it will run for 14 more days at least (probably more, based on the increasing time). I'll miss the deadline (14 Jan) at this rate though. Will it still count even if it comes in late (by a week or more)?


Yes, it will still count. Only problem is if it gets reissued to someone else who then finishes it first. Just let me know if this happens and I will get you the credit.

As for why this is happening, it was a bad formula for predicting the location of fast/slow regions. The formula worked well for the bounded app for a while, but it turned out to have a small dependence on the discriminant, which caused the predictions to slowly drift as time went on. This same problematic formula was used for the Qsqrt421 cases, where the problem is even more pronounced. The good news is I now have a more precise formula which appears to be working very well (in offline testing). So the next batches of WUs should not suffer from this problem.

Leopold
Joined: 29 Dec 15
Posts: 5
Credit: 127,662
RAC: 0
Message 1462 - Posted: 7 Jan 2016, 23:44:13 UTC - in response to Message 1457.  

Only problem is if it gets reissued to someone else who then finishes it first. Just let me know if this happens and I will get you the credit.


I suspect this may happen because my progress is only up to 15% after 3 days and ~11hrs, and the estimated remaining time is still climbing. Currently at 19 days and ~13hrs... I may just cut this one loose and move on!

Thanks for the explanation!

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 685
Credit: 46,255,262
RAC: 39,614
Message 1463 - Posted: 8 Jan 2016, 0:59:51 UTC - in response to Message 1462.  

Only problem is if it gets reissued to someone else who then finishes it first. Just let me know if this happens and I will get you the credit.


I suspect this may happen because my progress is only up to 15% after 3 days and ~11hrs, and the estimated remaining time is still climbing. Currently at 19 days and ~13hrs... I may just cut this one loose and move on!

Thanks for the explanation!


I tend to ignore the estimated time. BOINC assumes the time is uniformly distributed, which it is not for these WUs, so this can lead to erroneous estimates.

With that said, 15% after 3 days does not seem encouraging. You are probably looking at 2 weeks to complete. I wouldn't blame you if you decided to abort it.
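
To illustrate what a uniform-progress assumption implies, here is a rough Python sketch of a plain linear extrapolation from elapsed time and fraction done. It is an assumed simplification for illustration, not the BOINC client's actual estimator, and the function name is made up. The sample numbers are the ones reported above (15% after 3 days and about 11 hrs).

```python
def linear_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate remaining time, assuming progress is uniform in time."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1.0 - fraction_done) / fraction_done


if __name__ == "__main__":
    elapsed = 3 * 24 + 11  # 3 days and ~11 hrs, as reported above
    remaining = linear_remaining_hours(elapsed, 0.15)
    print(f"{remaining / 24:.1f} days remaining under the uniform assumption")
```

With these numbers the extrapolation comes out a little under 20 days, in the same range as the reported estimate. Whenever the progress rate changes during a WU, an extrapolation like this will drift along with it.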

Sajjad Imam*
Joined: 5 Jun 12
Posts: 3
Credit: 1,001,207
RAC: 637
Message 1477 - Posted: 11 Jan 2016, 10:20:58 UTC

Didn't realize that I had been running one of these behemoth tasks for the past six days! Only 50% done so far. So, should I keep it running?
http://numberfields.asu.edu/NumberFields/result.php?resultid=13643497

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 685
Credit: 46,255,262
RAC: 39,614
Message 1478 - Posted: 11 Jan 2016, 20:44:39 UTC - in response to Message 1477.  

Didn't realize that I had been running one of these behemoth tasks for the past six days! Only 50% done so far. So, should I keep it running?
http://numberfields.asu.edu/NumberFields/result.php?resultid=13643497


Since it's past halfway, I would say let it keep going.

Stappie
Joined: 18 Jan 13
Posts: 2
Credit: 6,731,219
RAC: 0
Message 1479 - Posted: 11 Jan 2016, 23:49:14 UTC

I've got two of those:
One thinks it is halfway (done 10d, only 9d to go), but it is past its due date of 2 January.
The other one has run for 2d and is estimated to take another 567d. So that one won't make the date either.

Is it worthwhile to let them run? I don't have any problem with it.
The machine is running 24/7 and runs 4 NumberFields tasks on an i5 processor. So other work is still done.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 685
Credit: 46,255,262
RAC: 39,614
Message 1480 - Posted: 12 Jan 2016, 16:48:14 UTC - in response to Message 1479.  

I've got two of those:
One thinks it is halfway (done 10d, only 9d to go), but it is past its due date of 2 January.
The other one has run for 2d and is estimated to take another 567d. So that one won't make the date either.

Is it worthwhile to let them run? I don't have any problem with it.
The machine is running 24/7 and runs 4 NumberFields tasks on an i5 processor. So other work is still done.


Well, I wouldn't worry about it taking 567 days. That estimate is way too high.

Since the other is halfway, I'd let it continue.

Sajjad Imam*
Joined: 5 Jun 12
Posts: 3
Credit: 1,001,207
RAC: 637
Message 1481 - Posted: 13 Jan 2016, 7:17:14 UTC - in response to Message 1477.  

Didn't realize that I had been running one of these behemoth tasks for the past six days! Only 50% done so far. So, should I keep it running?
http://numberfields.asu.edu/NumberFields/result.php?resultid=13643497

OOPS! Did I say 50%! Not even close!
After running 188 hrs, it just hit the 50% mark! :D

Matt Kowal
Joined: 12 Oct 13
Posts: 9
Credit: 13,647,302
RAC: 41,542
Message 1482 - Posted: 13 Jan 2016, 8:55:42 UTC
Last modified: 13 Jan 2016, 8:56:14 UTC

This WU took 17 days to process on my AMD A8-3850 :)

http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12351564

Sajjad Imam*
Joined: 5 Jun 12
Posts: 3
Credit: 1,001,207
RAC: 637
Message 1483 - Posted: 13 Jan 2016, 10:12:38 UTC - in response to Message 1482.  

This WU took 17 days to process on my AMD A8-3850 :)

http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12351564

At least, the granted credit eases the pain a bit! ;)

Capital Avionics
Joined: 15 Mar 15
Posts: 11
Credit: 57,397,886
RAC: 37,447
Message 1484 - Posted: 13 Jan 2016, 23:41:11 UTC

I have 5 of these long-running q-squirts running on one of my machines with about 21 days of combined accumulated crunch time. My account page says two of them have already deadlined and been redistributed, but they're still running on this end. When they finally get done (in 844 days, according to BOINC), how will I know how long they took? Or will they be uploaded as normal completions?

Capital Avionics
Joined: 15 Mar 15
Posts: 11
Credit: 57,397,886
RAC: 37,447
Message 1485 - Posted: 14 Jan 2016, 3:41:28 UTC

Just checked my remote machines...found one WU has an estimated remaining time of 5.665 years. Running XP will probably be illegal by then...

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 685
Credit: 46,255,262
RAC: 39,614
Message 1486 - Posted: 14 Jan 2016, 13:02:32 UTC - in response to Message 1485.  

I apologize for not responding sooner. I am in France attending a conference and haven't had much time to check the message boards.

I don't believe any of these cases should take more than a few weeks. That is based on timing of earlier cases. Of course it depends on the speed of the host, how often it is interrupted, ratio of threads running to number of physical cores, etc.

Even when the deadline is missed, the result will be treated normally. The exception is when someone else returns it first, but credit can be awarded manually in that case (you just need to let me know).

Leopold
Joined: 29 Dec 15
Posts: 5
Credit: 127,662
RAC: 0
Message 1489 - Posted: 15 Jan 2016, 2:49:45 UTC - in response to Message 1486.  

Well, it's been 10 days and about 14hrs. 25.5% complete. Estimated remaining is 29 days and about 7hrs, still rising... Even so, I've decided to see this one through to the end. I'm a bit curious to see just how long it goes. Hopefully it's doing something useful! I guess you can expect to see a PM from me in about a month, apparently. Thanks for being so attentive to the message boards, hope your conference is going well!

- Leo

fractal
Joined: 12 Jul 12
Posts: 9
Credit: 10,000,929
RAC: 0
Message 1490 - Posted: 16 Jan 2016, 20:00:51 UTC - in response to Message 1489.  

Leo

The only work unit I see for you that has timed out is http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12296215 which has already been completed by someone else. It took him 22 days on a much faster processor than yours.

Leopold
Joined: 29 Dec 15
Posts: 5
Credit: 127,662
RAC: 0
Message 1491 - Posted: 16 Jan 2016, 21:57:08 UTC - in response to Message 1490.  

Frac,
I'm not sure what happened, I don't think there was anyone else on this workunit before, but now I see that other completed task there, so who knows. Maybe I'm not remembering it correctickeley... But, I've come so far already, too late to turn back now, never give up, even if the odds are against you and everyone is saying NO! I'm going for it! Let my other projects suffer the loss of a core! NumberFields, This Core's for you!

- Leo

Stappie
Joined: 18 Jan 13
Posts: 2
Credit: 6,731,219
RAC: 0
Message 1496 - Posted: 19 Jan 2016, 7:23:38 UTC

The WUs I mentioned seem to run faster at the end, as you said. One is gone now (and my average credit spiked) and the other one is at 99% after 9 days. So all is well in the end.

Leopold
Joined: 29 Dec 15
Posts: 5
Credit: 127,662
RAC: 0
Message 1532 - Posted: 4 Feb 2016, 0:33:42 UTC

Yay! My workunit finally finished today! Final elapsed time: 29 days, 23 hours, 34 minutes, and 59 seconds!


Copyright © 2017 Arizona State University