Finally recovering from hard drive crash.
Author | Message |
---|---|
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
The crash was so bad that the database could not be repaired. The first non-corrupted backup was 2 days old. Restoring from it seems to have worked, but there are probably issues, since the database will be out of sync with results that get returned. I pushed the deadline back another 4 days for any outstanding WUs to give people time to return them, but anything issued in the 2 days prior to the crash will not be in the database, so I am not sure how that will play out. Sorry for the inconvenience! |
Send message Joined: 13 Apr 19 Posts: 26 Credit: 11,558,380 RAC: 8,252 |
I am pleased you got the system going again. You may or may not be aware of this. When I tried to get tasks I got the following information in my log: 27/01/2021 12:29:55 PM | NumberFields@home | Master file download succeeded 27/01/2021 12:30:00 PM | NumberFields@home | Sending scheduler request: To fetch work. 27/01/2021 12:30:00 PM | NumberFields@home | Requesting new tasks for NVIDIA GPU 27/01/2021 12:30:02 PM | NumberFields@home | Scheduler request completed: got 50 new tasks 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367043of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367117of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367121of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367123of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367757of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368172of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368234of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368300of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368305of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368418of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368423of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368424of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368426of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368428of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task 
wu_sf3_DS-16x271-1_Grp368429of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368430of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368431of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368433of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368435of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368437of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368438of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368439of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368442of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368443of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368446of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368447of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368452of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368453of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368456of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368457of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368458of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368460of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368461of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368462of2000000_0 27/01/2021 12:30:02 
PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368470of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368473of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368476of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368477of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368479of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368480of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368482of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368484of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368485of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368487of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368489of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368492of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368493of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368495of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368496of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368498of2000000_0 27/01/2021 12:30:02 PM | NumberFields@home | Project requested delay of 31 seconds 27/01/2021 12:30:03 PM | NumberFields@home | work fetch suspended by user 27/01/2021 12:30:04 PM | NumberFields@home | Started download of GetDecics_3.05_windows_x86_64__opencl_nvidia 27/01/2021 12:30:04 PM | NumberFields@home | Started download of pdtKernel_v305.cl 27/01/2021 12:30:04 PM | 
NumberFields@home | Started download of gpuMultiPrec_v305.h 27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367043of2000000.dat 27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367117of2000000.dat 27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367121of2000000.dat 27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367123of2000000.dat 27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367757of2000000.dat 27/01/2021 12:30:06 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367123of2000000.dat: permanent HTTP error 27/01/2021 12:30:06 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368172of2000000.dat 27/01/2021 12:30:07 PM | NumberFields@home | Finished download of pdtKernel_v305.cl 27/01/2021 12:30:07 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368172of2000000.dat: permanent HTTP error 27/01/2021 12:30:07 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368234of2000000.dat 27/01/2021 12:30:07 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368300of2000000.dat 27/01/2021 12:30:08 PM | NumberFields@home | Finished download of gpuMultiPrec_v305.h 27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367043of2000000.dat: permanent HTTP error 27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367117of2000000.dat: permanent HTTP error 27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368234of2000000.dat: permanent HTTP error 27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368300of2000000.dat: permanent HTTP error 27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368305of2000000.dat 27/01/2021 12:30:08 PM | NumberFields@home | 
Started download of sf3_DS-16x271-1_Grp368418of2000000.dat 27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368423of2000000.dat 27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368424of2000000.dat 27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368426of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367121of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367757of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368305of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368418of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368423of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368424of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368426of2000000.dat: permanent HTTP error 27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368428of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368429of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368430of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368431of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368433of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368435of2000000.dat 27/01/2021 12:30:09 PM | NumberFields@home | Started download of 
sf3_DS-16x271-1_Grp368437of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368428of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368429of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368430of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368431of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368433of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368435of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368437of2000000.dat: permanent HTTP error 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368438of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368439of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368442of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368443of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368446of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368447of2000000.dat 27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368452of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368438of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368439of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Giving 
up on download of sf3_DS-16x271-1_Grp368442of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368443of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368446of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368447of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368452of2000000.dat: permanent HTTP error 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368453of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368456of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368457of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368458of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368460of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368461of2000000.dat 27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368462of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368453of2000000.dat: permanent HTTP error 27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368456of2000000.dat: permanent HTTP error 27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368457of2000000.dat: permanent HTTP error 27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368458of2000000.dat: permanent HTTP error 27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368460of2000000.dat: permanent HTTP error 27/01/2021 
12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368461of2000000.dat: permanent HTTP error 27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368462of2000000.dat: permanent HTTP error 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368470of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368473of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368476of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368477of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368479of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368480of2000000.dat 27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368482of2000000.dat 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368470of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368473of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368476of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368477of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368479of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368480of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368482of2000000.dat: permanent HTTP error 27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368484of2000000.dat 
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368485of2000000.dat 27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368487of2000000.dat 27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368489of2000000.dat 27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368492of2000000.dat 27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368493of2000000.dat 27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368495of2000000.dat 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368484of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368485of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368487of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368489of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368492of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368493of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368495of2000000.dat: permanent HTTP error 27/01/2021 12:30:14 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368496of2000000.dat 27/01/2021 12:30:14 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368498of2000000.dat 27/01/2021 12:30:15 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368496of2000000.dat: permanent HTTP error 27/01/2021 12:30:15 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368498of2000000.dat: permanent 
HTTP error 27/01/2021 12:31:09 PM | NumberFields@home | Finished download of GetDecics_3.05_windows_x86_64__opencl_nvidia 27/01/2021 12:32:15 PM | NumberFields@home | update requested by user 27/01/2021 12:32:18 PM | NumberFields@home | Fetching scheduler list 27/01/2021 12:32:20 PM | NumberFields@home | Master file download succeeded 27/01/2021 12:32:25 PM | NumberFields@home | Sending scheduler request: Requested by user. 27/01/2021 12:32:25 PM | NumberFields@home | Reporting 50 completed tasks 27/01/2021 12:32:25 PM | NumberFields@home | Not requesting tasks: "no new tasks" requested via Manager 27/01/2021 12:32:27 PM | NumberFields@home | Scheduler request completed 27/01/2021 12:32:27 PM | NumberFields@home | Project requested delay of 31 seconds I will wait for things to come fully back online before trying again. It looks like I had 21 tasks in progress at the time you took the database backup. I had returned these tasks before the crash. If the server is able to resend these tasks to me before the deadline (30 Jan 2021, 19:41:28 UTC), I will be more than happy to process them. |
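For anyone who wants to tally a pasted event log like the one above, here is a minimal sketch. It assumes the `DD/MM/YYYY HH:MM:SS AM/PM | project | message` layout seen in this post (other BOINC clients or locales may format timestamps differently), and the function names `split_entries` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Timestamp layout as it appears in the pasted log (assumption: same locale).
TIMESTAMP = r"\d{2}/\d{2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M"


def split_entries(text):
    """Split a flattened log paste back into (timestamp, project, message) tuples."""
    # Split just before every "timestamp | " so line wrapping does not matter.
    chunks = re.split(rf"(?={TIMESTAMP} \| )", text)
    entries = []
    for chunk in chunks:
        chunk = " ".join(chunk.split())  # collapse line breaks and extra spaces
        if not re.match(TIMESTAMP, chunk):
            continue  # leading text that is not a log entry
        parts = chunk.split(" | ", 2)
        if len(parts) == 3:
            entries.append(tuple(parts))
    return entries


def summarize(text):
    """Count resent tasks and failed downloads, and list the failed file names."""
    counts = Counter()
    failed_files = []
    prefix = "Giving up on download of "
    for _stamp, _project, message in split_entries(text):
        if message.startswith("Resent lost task "):
            counts["resent"] += 1
        elif message.startswith(prefix):
            counts["download_failed"] += 1
            failed_files.append(message[len(prefix):].split(":", 1)[0])
    return counts, failed_files
```

Calling `summarize(pasted_text)` returns the two counts plus the list of `.dat` files that failed. Any prose that follows the last log entry ends up attached to that entry's message, which does not affect the counts.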
Send message Joined: 8 Oct 20 Posts: 1 Credit: 3,192,473 RAC: 4,381 |
I'm sure you are aware of this, but downloads of new work are failing instantly with a "Permanent HTTP error" message. According to server status, the upload/download servers should be up. Hope you get the site back to a fully operational state soon :) Edit: Oh I see Speedy51 posted while I was writing ... |
Send message Joined: 2 Sep 11 Posts: 13 Credit: 8,767,817 RAC: 6,365 |
|
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
So the "resent lost tasks" are those that you completed in that 2 day period just before the hard drive crash. Since the database is now 2 days behind, it is unaware that you already returned them, so it is sending them to you again. The good news is those result files were already assimilated and were recovered off the old drive, so the computation is not lost. The bad news is the credits are gone. The other bad news is the project thinks it still needs the data, so it will try to send it back out to someone else. I think my best course of action will be to just delete all current WUs in the database and then regenerate those that are still needed. But I'm still thinking about it and hoping there is a better option... |
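The mismatch described above can be pictured as a set difference. This is only a toy model, not the BOINC scheduler's code; the task names and the function `tasks_to_resend` are hypothetical.

```python
def tasks_to_resend(server_in_progress, client_reported):
    """Tasks the database lists as in progress on a host but the client no longer holds."""
    return sorted(set(server_in_progress) - set(client_reported))


# Hypothetical task names, for illustration only.
restored_db = {"task_A", "task_B", "task_C"}  # state taken from the 2-day-old backup
client_has = {"task_C"}                       # A and B were returned before the crash

lost = tasks_to_resend(restored_db, client_has)  # the server would treat these as "lost"
```

In this model the server has no record that `task_A` and `task_B` were returned, so it concludes the client lost them and sends them again.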
Send message Joined: 13 Apr 19 Posts: 26 Credit: 11,558,380 RAC: 8,252 |
Thanks for the feedback Eric. Happy decision-making :) |
Send message Joined: 27 Feb 16 Posts: 11 Credit: 14,722,790 RAC: 0 |
I got 40 valid WUs out of the 250 completed WUs I sent back :D :D :D (i'm still amazed that in this day and age, there was no RAID setup on this/any server ) Thank you for the recovery though, looking forward to crunching more WUs |
Send message Joined: 3 May 18 Posts: 18 Credit: 45,233,128 RAC: 0 |
Thank you for the update Eric. I was wondering what had happened. |
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
You'll notice a bunch of download errors. This is due to the WU not being present in the download directory because it was completed and deleted just before the crash; and the database is out of sync with this. I have set the "max_error_results" to 1 so the WU will be deleted after 1 download error, which should help push these out. |
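As a rough illustration of the effect of that setting, here is a toy model of an error limit on a work unit. It assumes the WU is retired as soon as the error count reaches the limit, as the post describes; the exact comparison BOINC uses internally may differ, and the class and field layout here are hypothetical apart from the name `max_error_results`.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    name: str
    max_error_results: int      # error limit, named after the setting in the post
    error_results: int = 0      # errors seen so far
    errored_out: bool = False   # True once the WU is retired

    def record_download_error(self):
        """Register one failed download and retire the WU if the limit is reached."""
        self.error_results += 1
        # Assumption: retire when the count reaches the limit ("after 1 download error").
        if self.error_results >= self.max_error_results:
            self.errored_out = True


stale = WorkUnit("hypothetical_wu", max_error_results=1)
stale.record_download_error()  # a single failed download retires the WU
```

With a larger limit, the same WU would keep being sent out to other hosts and fail on each of them before it was finally dropped.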
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
(i'm still amazed that in this day and age, there was no RAID setup on this/any server ) That's a good point. I'll have to look into it, but I know the math department's budget is limited. |
Send message Joined: 22 Jun 20 Posts: 1 Credit: 2,159,687 RAC: 0 |
Glad you are back but my sympathies to you over the nightmare you must be facing in picking up the pieces. Ian (NapierNimbus) |
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
Another annoying thing to look out for... I noticed one of my hosts started using the 32bit version of the cpu app. I'm not sure why it's doing this, as the server has the "prefer_primary_platform" config option set. This is especially annoying since the 32bit version runs twice as slow. Not sure yet how to get around this and resetting the project didn't help. |
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
Another annoying thing to look out for... I noticed one of my hosts started using the 32bit version of the cpu app. I'm not sure why it's doing this, as the server has the "prefer_primary_platform" config option set. This is especially annoying since the 32bit version runs twice as slow. I figured this out. It was basically caused by too many download errors in a 24 hour period on that particular host. Most of the download problems should now be flushed from the system, so this should automatically correct within a day. If you want it corrected sooner, let me know and I can fix it for your host. You know you have the problem if your 64bit cpu is running a version less than 4.00 (this only applies to cpu apps). |
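The check described above can be written down in a few lines. This is a sketch only: the function names and the `"cpu"`/`"gpu"` labels are made up for illustration, and the version string is whatever your client shows for the running app.

```python
def parse_version(text):
    """Turn a version string such as "3.05" into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def looks_affected(is_64bit_host, app_kind, version):
    """True if a 64-bit host is running a CPU app version below 4.00."""
    return is_64bit_host and app_kind == "cpu" and parse_version(version) < (4, 0)
```

For example, `looks_affected(True, "gpu", "3.05")` is false, because the rule only applies to cpu apps; the GPU app in the log earlier in this thread is legitimately at 3.05.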
Send message Joined: 27 Feb 16 Posts: 11 Credit: 14,722,790 RAC: 0 |
(i'm still amazed that in this day and age, there was no RAID setup on this/any server ) I understand. A lot of servers have hardware RAID built into the drive controller; failing that, the server OS should be able to do software RAID (via a volume manager or something else). There are also plenty of cheap second-hand RAID adapters on the market, but IT policies may prevent using non-approved hardware :D |
Send message Joined: 25 Feb 13 Posts: 216 Credit: 9,899,302 RAC: 0 |
Nice, it seems to work here but I got WAY too many new tasks. I got my new CPU and mainboard yesterday and sadly have to use a normal CPU cooler instead of water, so I can't run at full power 24/7 until I get a new kit (they forgot to add the screws; they weren't in the package). It also didn't create a new ID for this PC somehow? |
Send message Joined: 25 Feb 13 Posts: 216 Credit: 9,899,302 RAC: 0 |
Nice, it seems to work here but I got WAY too many new tasks. I got my new CPU and mainboard yesterday and sadly have to use a normal CPU cooler instead of water, so I can't run at full power 24/7 until I get a new kit (they forgot to add the screws; they weren't in the package). Update about this: after 10 minutes of work, several cores reached temperatures of 100°C, which forced me to stop running BOINC. I was only using 25% of the cores. I should get the kit in a week, so I won't be able to finish within the deadline. |
Send message Joined: 2 Sep 11 Posts: 13 Credit: 8,767,817 RAC: 6,365 |
I wrote: ... There's still lots to fix up here ... Not any more. I no longer see any database or download errors. To my mind, this project has fully recovered from sudden death. Incredible! Edit: Forgot to mention that email notifications aren't working. |
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
I understand. I used software RAIDs for many years on my own computers and finally stopped because it seemed like a waste of money buying 2 drives when I never had a problem. But it probably makes more sense with a server doing tons of file I/O. |
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
Update about this: after 10 minutes of work, several cores reached temperatures of 100°C, which forced me to stop running BOINC. I was only using 25% of the cores. I should get the kit in a week, so I won't be able to finish within the deadline. Yikes! I start to worry when mine goes above 80C. Doesn't the cpu automatically shut down or at least throttle itself when it gets that hot? |
Send message Joined: 8 Jul 11 Posts: 1344 Credit: 527,059,656 RAC: 572,342 |
Forgot to mention that email notifications aren't working. Come to think of it, you're right. I used to get email notifications on subscribed threads. I will have to look into this. They probably forgot to configure sendmail when they reinstalled the OS. |