Modification to bounded app

Message boards : News : Modification to bounded app

Aurel
Joined: 25 Feb 13
Posts: 216
Credit: 9,899,302
RAC: 0
Message 1200 - Posted: 19 Dec 2014, 16:53:37 UTC - in response to Message 1197.  

ID: 1200 · Rating: 0

Dennis-TW
Joined: 2 Dec 14
Posts: 3
Credit: 388,326
RAC: 0
Message 1201 - Posted: 20 Dec 2014, 0:09:46 UTC

Ding!

We have a winner here with almost 103 hours of CPU time.

http://numberfields.asu.edu/NumberFields/result.php?resultid=9788343

The second one took "only" 60 hours.

By the way, I don't know if this problem is related to the underutilized-CPU problem you referred to. As far as I could see, this unit used one full core the whole time, and all my other cores were busy too.
ID: 1201 · Rating: 0

Aurel
Joined: 25 Feb 13
Posts: 216
Credit: 9,899,302
RAC: 0
Message 1202 - Posted: 20 Dec 2014, 14:33:23 UTC - in response to Message 1201.  

Nice. Eric, can you tell me what information is in the .out file?

Seems like the app restarted the unit several times:

Signature = [2,0]
a11 = 1
a12 = 2
sig1a1 = -8.774964387392122060406388307
sig2a1 = 8.774964387392122060406388307
Ca1_pre = 30.800000
Opening output file ../../projects/numberfields.asu.edu_NumberFields/wu_12E10_SF77-0_Idx6_Grp44261of111118_0_0
Now starting the Martinet search:

Doing case a5 = 11 + -3w...
  2nd part of Martinet bound = 18.852711.
  Martinet bound = 49.652711.
  a22_L = 2.
  a22_U = 2.
  a22 = 2.
    a21_L = 31.
    a21_U = 31.
    a21 = 31.
      a32_L = -6.
      a32_U = 24.
Reading checkpoint file.
Checkpoint Flag = 1.
ID: 1202 · Rating: 0

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1321
Credit: 409,155,058
RAC: 243,749
Message 1203 - Posted: 20 Dec 2014, 16:43:06 UTC - in response to Message 1201.  

Ok. I manually granted you credit for the WU that went over the 100 hour cap.
ID: 1203 · Rating: 0

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1321
Credit: 409,155,058
RAC: 243,749
Message 1204 - Posted: 20 Dec 2014, 16:49:59 UTC - in response to Message 1202.  

Just to clarify... the app doesn't restart the WU. It may have been restarted for a number of reasons, including the user restarting their computer. What's important is the checkpoint mechanism is functioning as expected.
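
For readers unfamiliar with checkpointing, the idea can be sketched as follows. This is a minimal Python illustration, not the project's code: the JSON file format, the `next_case` field, and the function names `save_checkpoint`, `load_checkpoint`, and `run_search` are all assumptions made up for the example. The point is only that the task saves its position periodically, so a restart (for whatever reason) resumes from the last saved position instead of from the start.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic when both paths are on the same filesystem,
    # so a reader never sees a half-written checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, ValueError):
        return None


def run_search(path, total_cases, stop_after=None):
    """Loop over hypothetical cases, resuming from the checkpoint if present."""
    state = load_checkpoint(path) or {"next_case": 0}
    done_this_run = 0
    while state["next_case"] < total_cases:
        if stop_after is not None and done_this_run >= stop_after:
            return state  # simulates the task being stopped from outside
        # ... the real work for case state["next_case"] would go here ...
        state["next_case"] += 1
        done_this_run += 1
        save_checkpoint(path, state)
    return state
```

Calling `run_search(path, 10, stop_after=4)` and then `run_search(path, 10)` mimics an interrupted task: the second call picks up at case 4 rather than case 0.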
ID: 1204 · Rating: 0

Dennis-TW
Joined: 2 Dec 14
Posts: 3
Credit: 388,326
RAC: 0
Message 1205 - Posted: 21 Dec 2014, 1:04:05 UTC

Thank you Eric!

What I also saw with this WU was that it once sat at 53% for a long time (40 hours or so), then jumped back to 28% and from there crawled to the finish line. I don't know if this helps to find the reason for these long-runners.
ID: 1205 · Rating: 0

Jesse Viviano
Joined: 20 Dec 14
Posts: 17
Credit: 12,153,123
RAC: 0
Message 1206 - Posted: 21 Dec 2014, 3:30:09 UTC

All of my bounded discriminant work units report memory leaks.
ID: 1206 · Rating: 0

Jesse Viviano
Joined: 20 Dec 14
Posts: 17
Credit: 12,153,123
RAC: 0
Message 1207 - Posted: 21 Dec 2014, 5:47:00 UTC - in response to Message 1146.  

A memory leak is a bug where a program loses track of memory that it has obtained and therefore cannot release it. This wastes the leaked memory while the program runs. When the leaky program closes, a modern operating system that employs protected memory will free the leaked memory. See https://en.wikipedia.org/wiki/Memory_leak for more information on memory leaks.
ID: 1207 · Rating: 0

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1321
Credit: 409,155,058
RAC: 243,749
Message 1208 - Posted: 21 Dec 2014, 6:59:13 UTC - in response to Message 1205.  

That's very strange. The progress meter should never go backward. Looking at the stderr, it appears that it tried to read the checkpoint file at one point but failed for some reason, and started the search from the beginning; maybe this explains the progress going backwards.
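
A toy model of that failure mode, in Python, might look like the following. It is only a sketch under stated assumptions: the JSON checkpoint text, the `next_case` field, and the functions `read_checkpoint` and `resume_fraction` are invented for illustration and do not describe the app's actual checkpoint format or progress calculation.

```python
import json


def read_checkpoint(text):
    """Parse checkpoint text; return None when it is missing or malformed."""
    try:
        return json.loads(text)
    except (TypeError, ValueError):
        return None


def resume_fraction(checkpoint_text, total_cases):
    """Fraction done right after a restart, based on the checkpoint contents."""
    state = read_checkpoint(checkpoint_text)
    if state is None:
        return 0.0  # unreadable checkpoint: the search starts over
    return state["next_case"] / total_cases
```

With a readable checkpoint the task resumes where it stopped; with a truncated or missing one, `resume_fraction` falls back to the start, and the progress shown to the user drops accordingly.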
ID: 1208 · Rating: 0

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1321
Credit: 409,155,058
RAC: 243,749
Message 1209 - Posted: 21 Dec 2014, 7:19:02 UTC - in response to Message 1207.  

I really doubt there is a true memory leak. I only allocate a small number of variables, and these are all freed before the program ends. However, it is possible that one of the libraries I am using has a memory leak. But as you say, that memory should be freed when the program ends.

Also note that these memory leak messages only happen with the Windows app.
ID: 1209 · Rating: 0

Copyright © 2024 Arizona State University