Posts by fractal

1) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1513)
Posted 26 Jan 2016 by fractal
Post:
All done!

Current status on i7-2700k/stock

wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551 finished after 31 days 12 hours 4 min 37 sec for 128,955.42 credits. Two Wingmen running.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-63_N1_-645to581 finished after 27 days 22 hours 2 min 16 sec for 114,280.69 credits. 1 wingman running. 1 wingman aborted after a million seconds.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-72_N1_-775to702 finished after 15 days 20 hours 16 min 47 sec for 64,859.97 credits. Wingman finished after 11 days 7 hours 32 min 15 sec for no credit.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-72_N1_-8018to-3291_0 finished after 7 days 18 hours 51 min 2 sec for 31,743.02 credits. No longer in system.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-73_N1_-8020to-3290_0 finished after 8 days 7 hours 11 min 27 sec. No longer in system.
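
As a rough cross-check of the figures above, the credit granted per hour of runtime can be worked out for each unit. This is only a sketch in Python: the helper names are made up for illustration, and it covers just the four units that list both a runtime and a credit total.

```python
def to_seconds(days, hours, minutes, seconds):
    """Convert a 'D days H hours M min S sec' runtime to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (label, runtime as (d, h, m, s), credits) copied from the list above.
FINISHED = [
    ("DS1x5 N2=-61", (31, 12, 4, 37), 128955.42),
    ("DS1x5 N2=-63", (27, 22, 2, 16), 114280.69),
    ("DS1x5 N2=-72", (15, 20, 16, 47), 64859.97),
    ("DS1x8 N2=-72", (7, 18, 51, 2), 31743.02),
]


def credits_per_hour(runtime, credits):
    """Credits granted per hour of reported runtime."""
    return credits / to_seconds(*runtime) * 3600


if __name__ == "__main__":
    for label, runtime, credits in FINISHED:
        print(f"{label}: {credits_per_hour(runtime, credits):.1f} credits/hour")
```

If the rates come out close to each other, that would hint that credit on these units tracks runtime, but that is a guess from four data points and not something the project has stated in this thread.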

How do the "DS1x5" work units map onto http://numberfields.asu.edu/NumberFields/batch_status.html?
2) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1505)
Posted 23 Jan 2016 by fractal
Post:
I just had a long one finish, so here is my current status on i7-2700k/stock:

wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551 53% after 28d03h. Two Wingmen running.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-63_N1_-645to581 finished after 27 days 22 hours 2 min 16 sec for 114,280.69 credits. 1 wingman running. 1 wingman aborted after a million seconds.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-72_N1_-775to702 finished after 15 days 20 hours 16 min 47 sec for 64,859.97 credits. Wingman finished after 11 days 7 hours 32 min 15 sec for no credit.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-72_N1_-8018to-3291_0 finished after 7 days 18 hours 51 min 2 sec for 31,743.02 credits. No wingman.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-73_N1_-8020to-3290_0 finished after 8 days 7 hours 11 min 27 sec. No longer in system. No wingman.

I am currently set at "no new tasks" until I can look up the config setting to limit my machines to 2 work units. I can absorb a couple with incorrect estimates, but having too many would confuse BOINC too much.
3) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1498)
Posted 19 Jan 2016 by fractal
Post:
Found one task on my server, http://numberfields.asu.edu/NumberFields/result.php?resultid=13633161

28k points. (WTF)

6 days and 6 hours runtime. :O

Didn't guessed that I got some decic tasks.

It could be worse. zombie finished it the day after you did. He got it before you and finished it after. His machine spent 1.5 million seconds working on it.

My two active ones are still running. Both are in the mid-50s percent complete with 25 days of runtime. Both have two wingmen. One of my wingmen aborted his after 1.5 million seconds.
4) Message boards : News : Double credits for problematic work units. (Message 1490)
Posted 16 Jan 2016 by fractal
Post:
Leo

The only work unit I see for you that has timed out is http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12296215 which has already been completed by someone else. It took him 22 days on a much faster processor than yours.
5) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1474)
Posted 10 Jan 2016 by fractal
Post:
Current status on i7-2700k/stock

wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551 44% after 16d01h. Wingman running.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-63_N1_-645to581 47% after 16d01h. Wingman running.
wu_Qsqrt421_DS1x5_CV2_S815_N2_-72_N1_-775to702 finished today after 15 days 20 hours 16 min 47 sec. Was at 60% yesterday. Wingman still running.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-72_N1_-8018to-3291_0 finished after 7 days 18 hours 51 min 2 sec. No wingman.
wu_Qsqrt421_DS1x8_CV1_S815_N2_-73_N1_-8020to-3290_0 finished after 8 days 7 hours 11 min 27 sec. No wingman.
6) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1445)
Posted 5 Jan 2016 by fractal
Post:
I wouldn't count on 8 days. The spreadsheet doesn't paste very well, but it shows the % complete for the past 4 days on three DS1x5 units.

DS1x5_CV2_S815   2-Jan   3-Jan   4-Jan   5-Jan
-61              36.4    36.8    37.6    38.2
-63              36.2    36.8    38.3    39.2
-73              46.1    50.8    54.7    56.2

So, the wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551_0 has been running 10d,23:41:37 and is 38.255% complete. It has completed 2% in 3 days. The -73 is doing better. This is on an i7-2700k/stock if that matters.
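
To put a number on "I wouldn't count on 8 days", here is a naive linear extrapolation from the readings in the table above, sketched in Python. The function names are made up for illustration, and it assumes the BOINC percent-complete figure advances at a steady rate, which these units may well not do, so treat the result as a ballpark at best.

```python
def percent_per_day(samples):
    """Average daily progress over a list of daily percent-complete readings."""
    if len(samples) < 2:
        raise ValueError("need at least two readings")
    return (samples[-1] - samples[0]) / (len(samples) - 1)


def days_remaining(samples):
    """Naive linear estimate of days left, based on the recent rate only."""
    rate = percent_per_day(samples)
    if rate <= 0:
        return float("inf")  # no measurable progress in the window
    return (100.0 - samples[-1]) / rate


# Readings for 2-5 Jan, copied from the table above.
READINGS = {
    "-61": [36.4, 36.8, 37.6, 38.2],
    "-63": [36.2, 36.8, 38.3, 39.2],
    "-73": [46.1, 50.8, 54.7, 56.2],
}

if __name__ == "__main__":
    for label, samples in READINGS.items():
        print(f"{label}: {percent_per_day(samples):.2f} %/day, "
              f"about {days_remaining(samples):.0f} more days at this rate")
```

The estimate only looks at the first and last reading in the window, so a unit that speeds up later will finish well ahead of what this predicts.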

http://numberfields.asu.edu/NumberFields/workunit.php?wuid=12350747 timed out for me and has been given to someone else along with the others. I wonder if his opteron will catch up with the 10 day head start my i7 has ;)
7) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1428)
Posted 1 Jan 2016 by fractal
Post:
They aren't hurting anything so I'll let them run.

It looks like 3 more points have been written to stderr since last night, when I hit the return key 4 times in the "tail -f stderr.txt" window.

          N1 = -197.
          N1 = -196.




          N1 = -195.
          N1 = -194.
          N1 = -193.


Am I guessing correctly that it will continue to run until N1 counts up from -226 to +551?
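
If that guess is right, the position in the N1 range can be read straight out of stderr.txt. The Python sketch below pulls the latest "N1 = ..." line and reports how far through N1_L..N1_U it is by count. It assumes the search walks N1 upward one value at a time from N1_L to N1_U, which is exactly the unconfirmed guess in the question, and the function names are made up for illustration.

```python
import re

# Matches progress lines such as "          N1 = -193." but not "N1_L = -226."
N1_LINE = re.compile(r"^\s*N1\s*=\s*(-?\d+)\.\s*$")


def latest_n1(stderr_text):
    """Return the last 'N1 = <value>.' reported in the stderr text, or None."""
    latest = None
    for line in stderr_text.splitlines():
        match = N1_LINE.match(line)
        if match:
            latest = int(match.group(1))
    return latest


def fraction_of_range(n1, n1_lower, n1_upper):
    """Fraction of the N1_L..N1_U range already passed, by count of N1 values."""
    return (n1 - n1_lower) / (n1_upper - n1_lower)


if __name__ == "__main__":
    sample = """
        N1_L = -226.
        N1_U = 551.
          N1 = -195.
          N1 = -194.
          N1 = -193.
    """
    n1 = latest_n1(sample)
    print(f"latest N1 = {n1}, "
          f"{fraction_of_range(n1, -226, 551):.1%} of the N1 range by count")
```

A count of N1 values is not the same thing as the percent complete BOINC shows, and individual N1 values may take very different amounts of time, so the two numbers need not agree.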
8) Message boards : Number crunching : Long running wu_Qsqrt421_DS1x5 units - how long to let them run? (Message 1424)
Posted 1 Jan 2016 by fractal
Post:
I have three work units on one of my machines that have been running for over six days now. They are all currently 17 hours past deadline. They all have been stuck at around 35% complete for the past three days. All three are wu_Qsqrt421_DS1x5_CV2_S815_N2_-<this part varies>

stderr contains

Opening output file ../../projects/numberfields.asu.edu_NumberFields/wu_Qsqrt421_DS1x5_CV2_S815_N2_-61_N1_-613to551_0_0
Now starting the targeted Martinet search:
    N2_L = -61.
    N2_U = -61.
      N2 = -61.
        N1_L = -226.
        N1_U = 551.
          N1 = -226.
          N1 = -225.
          N1 = -224.
          N1 = -223.
          N1 = -222.
          N1 = -221.
          N1 = -220.
          N1 = -219.
          N1 = -218.
          N1 = -217.
          N1 = -216.
          N1 = -215.
          N1 = -214.
          N1 = -213.
          N1 = -212.
          N1 = -211.
          N1 = -210.
          N1 = -209.
          N1 = -208.
          N1 = -207.
          N1 = -206.
          N1 = -205.
          N1 = -204.
          N1 = -203.
          N1 = -202.
          N1 = -201.
          N1 = -200.
          N1 = -199.
          N1 = -198.
          N1 = -197.
          N1 = -196.

and occasionally has new numbers added to it.

The computer in question has been on the project for over a year now and has had no errors other than the work units I aborted the other day before they even started.

I don't mind letting them run if they might complete but figured I would ask if they are just wasting time.
9) Message boards : Number crunching : Process got signal 11 (Message 687)
Posted 13 Jul 2012 by fractal
Post:
I am having similar problems on my RHEL5 x64 hosts,
...
% ./GetBoundedDecics_1.07_x86_64-pc-linux-gnu
FATAL: kernel too old
Segmentation fault (core dumped)


We fixed the issue that was causing some Suse distros to fail. However, it looks like we are using a syscall that was added since kernel 2.6.18. I'm not sure which one or if we can avoid it. Looking through the kernel git log for the syscall table I think the last one was added in 2008. Perhaps you could run an strace on your RHEL5 system to see which syscall fails and I can look up when it was added.


boinc@plum:~/BOINC/projects/numberfields.asu.edu_NumberFields$ ./GetBoundedDecics_2.03_x86_64-pc-linux-gnu
FATAL: kernel too old
Segmentation fault (core dumped)
boinc@plum:~/BOINC/projects/numberfields.asu.edu_NumberFields$ strace ./GetBoundedDecics_2.03_x86_64-pc-linux-gnu
execve("./GetBoundedDecics_2.03_x86_64-pc-linux-gnu", ["./GetBoundedDecics_2.03_x86_64-p"...], [/* 19 vars */]) = 0
uname({sys="Linux", node="plum", ...}) = 0
open("/dev/tty", O_RDWR|O_NONBLOCK|O_NOCTTY) = 3
writev(3, [{"FATAL: kernel too old\n", 22}], 1FATAL: kernel too old
) = 22
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aece8177000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV (core dumped) +++
Process 12651 detached
boinc@plum:~/BOINC/projects/numberfields.asu.edu_NumberFields$ uname -a
Linux plum 2.6.20-15-generic #2 SMP Sun Apr 15 06:17:24 UTC 2007 x86_64 GNU/Linux

-----

I will upgrade that machine to a modern kernel now that I found this thread.
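
In the strace output above the program calls uname and then prints the message straight away, which looks more like a version comparison at startup than a syscall failing later on. For checking a batch of hosts before they pick up work, a comparison along these lines could be used. This is a Python sketch with made-up function names, and the minimum version in it is a hypothetical placeholder: the real minimum for a given binary is not stated anywhere in this thread.

```python
import platform
import re


def parse_kernel_release(release):
    """Turn a release string such as '2.6.20-15-generic' into (2, 6, 20)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # Missing minor/patch components are treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def kernel_at_least(release, minimum):
    """True if the release is at least the given (major, minor, patch)."""
    return parse_kernel_release(release) >= tuple(minimum)


# HYPOTHETICAL minimum, for illustration only. The value a real binary
# needs is not given in this thread and has to be found out separately.
ASSUMED_MINIMUM = (2, 6, 24)

if __name__ == "__main__":
    running = platform.release()
    ok = kernel_at_least(running, ASSUMED_MINIMUM)
    print(f"{running}: {'new enough' if ok else 'too old'} for assumed minimum "
          f"{'.'.join(map(str, ASSUMED_MINIMUM))}")
```

Run against the host above, parse_kernel_release would read "2.6.20-15-generic" as (2, 6, 20) and compare that tuple against whatever minimum is filled in.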







Copyright © 2024 Arizona State University