Long running wu_Qsqrt421_DS1x5 units - how long to let them run?

fractal

Joined: 12 Jul 12
Posts: 9
Credit: 10,000,929
RAC: 0
Message 1498 - Posted: 19 Jan 2016, 22:52:27 UTC - in response to Message 1495.  
Last modified: 19 Jan 2016, 22:53:28 UTC

Found one task on my server, http://numberfields.asu.edu/NumberFields/result.php?resultid=13633161

28k points. (WTF)

6 days and 6 hours runtime. :O

I didn't guess that I had gotten some decic tasks.

It could be worse. zombie finished it the day after you did. He got it before you and finished it after you. His machine spent 1.5 million seconds working on it.

My two active ones are still running. Both are in the mid-50s percent complete after 25 days of runtime. Both have two wingmen. One of my wingmen aborted his after 1.5 million seconds.
ID: 1498
Aurel
Joined: 25 Feb 13
Posts: 211
Credit: 8,816,142
RAC: 5
Message 1499 - Posted: 20 Jan 2016, 8:44:17 UTC - in response to Message 1498.  
Last modified: 20 Jan 2016, 9:04:48 UTC

You mean http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12351121 and http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12350747.

Over 25 days? I wish you luck completing them. Bounded tasks are not available, so I'll switch to decic tasks right now.

EDIT:
I have one task running on my server: 369 hours of runtime, http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12310752
ID: 1499
Richard Haselgrove
Joined: 28 Oct 11
Posts: 134
Credit: 110,346,034
RAC: 29,251
Message 1500 - Posted: 20 Jan 2016, 20:50:53 UTC - in response to Message 1499.  

He's lucky. I'm celebrating, because my Christmas Eve task (WU 12347065) - wu_Qsqrt421_DS1x5_CV2_S815_N2_-55_N1_-518to462 - has just reached positive territory.

          N1 = 5.

26 days, 53.346% progress - and counting.

(pssst - I think it might be starting to speed up)
ID: 1500
Richard Haselgrove
Joined: 28 Oct 11
Posts: 134
Credit: 110,346,034
RAC: 29,251
Message 1502 - Posted: 22 Jan 2016, 13:18:42 UTC - in response to Message 1500.  

WU 12347065 is back - 2,410,576.86 seconds!
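
For scale, the run time above can be converted from seconds into days and hours. A minimal Python sketch; the helper name `split_seconds` is made up here for illustration and is not part of any BOINC tool:

```python
def split_seconds(total_seconds):
    """Split a run time in seconds into (days, hours, minutes, seconds)."""
    days, rem = divmod(total_seconds, 86_400)   # 86,400 seconds per day
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return int(days), int(hours), int(minutes), seconds


# Run time reported for WU 12347065
d, h, m, s = split_seconds(2_410_576.86)
print(f"{d} d {h} h {m} min {s:.0f} s")
```

That works out to just under 28 days of run time.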
ID: 1502
fractal
Joined: 12 Jul 12
Posts: 9
Credit: 10,000,929
RAC: 0
Message 1505 - Posted: 23 Jan 2016, 2:15:59 UTC

I just had a long one finish, so here is my current status on i7-2700k/stock:

wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551 53% after 28d03h. Two Wingmen running.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-63_N1_-645to581 finished after 27 days 22 hours 2 min 16 sec for 114,280.69 credits. 1 wingman running. 1 wingman aborted after a million seconds.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-72_N1_-775to702 finished after 15 days 20 hours 16 min 47 sec for 64,859.97 credits. Wingman finished after 11 days 7 hours 32 min 15 sec for no credit.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-72_N1_-8018to-3291_0 finished after 7 days 18 hours 51 min 2 sec for 31,743.02 credits. No wingman.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-73_N1_-8020to-3290_0 finished after 8 days 7 hours 11 min 27 sec. No longer in system. No wingman.

I am currently set at "no new tasks" until I can look up the config setting to limit my machines to 2 work units. I can absorb a couple with incorrect estimates, but having too many would confuse BOINC too much.
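
One setting that may fit here is `project_max_concurrent` in an `app_config.xml` file, which the BOINC client documentation describes as a limit on the number of running jobs for a project. The sketch below only builds the file text; the function name `build_app_config` is hypothetical, and it assumes a client recent enough to support this option. Note that it caps tasks running at once, which is not necessarily the same as capping how many tasks are downloaded.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_running):
    """Return app_config.xml text that caps running tasks for one project."""
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(max_running)
    return ET.tostring(root, encoding="unicode")


# Limit the project to 2 tasks running at the same time
xml_text = build_app_config(2)
print(xml_text)
```

The resulting text would be saved as `app_config.xml` in the project's folder inside the BOINC data directory, after which the client needs to re-read its config files or be restarted.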
ID: 1505
fractal
Joined: 12 Jul 12
Posts: 9
Credit: 10,000,929
RAC: 0
Message 1513 - Posted: 26 Jan 2016, 17:56:43 UTC

All done!

Current status on i7-2700k/stock

wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551 finished after 31 days 12 hours 4 min 37 sec for 128,955.42 credits. Two Wingmen running.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-63_N1_-645to581 finished after 27 days 22 hours 2 min 16 sec for 114,280.69 credits. 1 wingman running. 1 wingman aborted after a million seconds.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-72_N1_-775to702 finished after 15 days 20 hours 16 min 47 sec for 64,859.97 credits. Wingman finished after 11 days 7 hours 32 min 15 sec for no credit.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-72_N1_-8018to-3291_0 finished after 7 days 18 hours 51 min 2 sec for 31,743.02 credits. No longer in system.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-73_N1_-8020to-3290_0 finished after 8 days 7 hours 11 min 27 sec. No longer in system.
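
The run times and credits in the list above can be turned into a credit rate. This is a rough sketch using only the four results that report a credit figure; the dictionary keys are shortened labels chosen for illustration.

```python
# (days, hours, minutes, seconds, credits) copied from the status list above
finished = {
    "DS1x5 N2_-61": (31, 12, 4, 37, 128_955.42),
    "DS1x5 N2_-63": (27, 22, 2, 16, 114_280.69),
    "DS1x5 N2_-72": (15, 20, 16, 47, 64_859.97),
    "DS1x8 N2_-72": (7, 18, 51, 2, 31_743.02),
}


def credits_per_day(days, hours, minutes, seconds, credits):
    """Credits granted per 24 hours of run time."""
    run_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return credits / run_seconds * 86_400


rates = {name: credits_per_day(*row) for name, row in finished.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} credits/day")
```

The three DS1x5 results come out at about 4,093 credits per day and the DS1x8 result slightly lower, at about 4,077, which suggests (but does not prove) that credit for these work units was granted roughly in proportion to run time.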

How do the "DS1x5" work units map onto http://numberfields.asu.edu/NumberFields/batch_status.html?
ID: 1513
Richard Haselgrove
Joined: 28 Oct 11
Posts: 134
Credit: 110,346,034
RAC: 29,251
Message 1514 - Posted: 26 Jan 2016, 19:23:56 UTC

ID: 1514
Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 945
Credit: 104,217,744
RAC: 68,254
Message 1516 - Posted: 27 Jan 2016, 19:05:34 UTC - in response to Message 1513.  

Thanks for the update!

I'm still getting caught up after my 2 week hiatus, so haven't had a chance yet to reassess the status of this special search. Right now, there is no mapping to the batch status. Originally it was hoped this search would be relatively quick and would not need a status table, but I guess that is not the case. Hopefully I will have some time this weekend to add such a table.
ID: 1516
jondi_hanluc
Joined: 10 Dec 12
Posts: 5
Credit: 22,083,545
RAC: 0
Message 1519 - Posted: 28 Jan 2016, 10:49:31 UTC

The pattern I've noticed on these WUs is that they speed up at around 62-63%. Once they've hit that percentage, they usually complete within 24 hours.
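
This kind of late speed-up means a straight-line projection from the progress percentage can badly overestimate the remaining time. As a hedged illustration, the sketch below uses the figures fractal reported earlier in this thread for the N2_-61 work unit (53% after 28d03h, finished after 31 days 12 hours 4 min 37 sec); the function name is invented for the example.

```python
def linear_total_days(elapsed_days, fraction_done):
    """Projected total run time if progress were linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_days / fraction_done


elapsed = 28 + 3 / 24                              # 28d03h at 53% progress
projected = linear_total_days(elapsed, 0.53)       # naive straight-line estimate
actual = 31 + 12 / 24 + 4 / 1_440 + 37 / 86_400    # 31d 12h 4m 37s at completion

print(f"linear projection: {projected:.1f} days, actual: {actual:.1f} days")
```

For that work unit the naive projection is roughly 53 days, while the reported finish was about 31.5 days.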
ID: 1519
farnost litomysl
Joined: 5 Nov 11
Posts: 1
Credit: 5,827,360
RAC: 0
Message 1527 - Posted: 2 Feb 2016, 6:43:59 UTC
Last modified: 2 Feb 2016, 6:46:02 UTC

Hi. Wu:
http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12731827
is up and running.
It is now at 400 hours and is about to finish.
ID: 1527
jondi_hanluc
Joined: 10 Dec 12
Posts: 5
Credit: 22,083,545
RAC: 0
Message 1536 - Posted: 6 Feb 2016, 2:27:38 UTC
Last modified: 6 Feb 2016, 3:18:50 UTC

http://numberfields.asu.edu/NumberFields//workunit.php?wuid=12351452

I have just completed this Work Unit, after 2,366,625 secs (27.39 days) I get: "Completed, too late to validate" and zero credit :(
It ran 24/7 since the day I got it.

Edit:
Is it because the one above had already been completed that I got no credit? I've now noticed that so has this one, http://numberfields.asu.edu/NumberFields//workunit.php?wuid=12348078, which I'm currently 27 days into crunching. Shall I just abort it, as it's already been completed? Shame about my 27 days wasted on it, however :(
ID: 1536
Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 945
Credit: 104,217,744
RAC: 68,254
Message 1537 - Posted: 6 Feb 2016, 5:22:36 UTC - in response to Message 1536.  

No worries. I went ahead and granted you the canonical credit on both WUs. You can abort the one still running; it sounds like it should be close to finishing but no reason to waste the cpu cycles.

To answer your question... yes, someone returned the WU before you. It looks like it had originally timed out for them, the WU was reissued to you, and then they returned it after you had started it. When that happens, you won't get credit unless you return it within the grace period (which is hard to do with these really long WUs).
ID: 1537
jondi_hanluc
Joined: 10 Dec 12
Posts: 5
Credit: 22,083,545
RAC: 0
Message 1538 - Posted: 6 Feb 2016, 9:35:39 UTC - in response to Message 1537.  

Thank you very much.

I have found another one, http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12404933, which someone else has already completed. I'm 2 weeks into this, but if the grace period is 10 days after the expiry I should be OK.

I still have another 3 long ones: 2 at around 30 days and one at 40 days. So far no one else has completed them, but I'll give you a shout if they do, if that's OK.
ID: 1538
Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 945
Credit: 104,217,744
RAC: 68,254
Message 1539 - Posted: 6 Feb 2016, 16:32:00 UTC - in response to Message 1538.  

Sounds good to me. Thanks!
ID: 1539
Beyond
Joined: 12 Aug 12
Posts: 7
Credit: 10,300,855
RAC: 816
Message 1544 - Posted: 17 Feb 2016, 21:23:55 UTC - in response to Message 1425.  

It has recently come to my attention that the Qsqrt421 cases suffer from the same problem that the Bounded app did a couple weeks ago. I am currently looking into a similar fix for these WUs.

The stderr for this WU looks particularly bad. I suspect it could take at least another 6 days to finish. I won't feel bad if you decide to kill it. Either way I will get you manual credit for the lost CPU cycles.

If anyone else has one of these bad WUs please report, either by message board or private message, so I can try and remove them from the system.

Sorry for the inconvenience!

I have 2 of these WUs running at the moment, one at 271 hours and the other at 385 hours. Should I let them run?

http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12420089

http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12404866

Thanks/Ed
ID: 1544
Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 945
Credit: 104,217,744
RAC: 68,254
Message 1545 - Posted: 18 Feb 2016, 5:20:02 UTC - in response to Message 1544.  

No one else has returned them yet, so I would say let them continue, especially given the amount of time you have already spent on them.

Thanks!
ID: 1545
Jesse Viviano
Joined: 20 Dec 14
Posts: 17
Credit: 3,635,951
RAC: 63
Message 1547 - Posted: 20 Feb 2016, 20:07:48 UTC - in response to Message 1544.  

Someone has returned a result that was validated on the first of the two work units, so you might just as well abort it and crunch some work unit that has not yet been solved. The other one has not been solved yet, so you can continue on that one.
ID: 1547
Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 945
Credit: 104,217,744
RAC: 68,254
Message 1548 - Posted: 21 Feb 2016, 5:15:08 UTC - in response to Message 1547.  

Thanks for catching that Jesse.

Ed - you can go ahead and abort that WU. I gave you the canonical credit for it. Thanks!
ID: 1548
Beyond
Joined: 12 Aug 12
Posts: 7
Credit: 10,300,855
RAC: 816
Message 1550 - Posted: 23 Feb 2016, 1:18:30 UTC - in response to Message 1548.  

Thanks! The second one continues to run, now at 509 hours.
ID: 1550
Roadranner
Joined: 3 Sep 12
Posts: 2
Credit: 15,062,217
RAC: 0
Message 1551 - Posted: 23 Feb 2016, 6:04:13 UTC - in response to Message 1424.  

After 44 days, my last long-running WU (task #13680649) ended with a computation error. :-(
The day before, I saw the WU stop a few times for no apparent reason.
ID: 1551

Copyright © 2019 Arizona State University