Finally recovering from hard drive crash.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 2994 - Posted: 26 Jan 2021, 22:45:59 UTC

The crash was so bad that the database could not be repaired. The first non-corrupted backup was 2 days old. Restoring from it seems to have worked, but there are probably issues, since the database will be out of sync with the results that get returned.

I pushed the deadline back another 4 days for any outstanding WUs to give people time to return them, but anything issued in the 2 days prior to the crash will not be in the database, so I am not sure how that will play out.

Sorry for the inconvenience!

Speedy51
Joined: 13 Apr 19
Posts: 22
Credit: 4,295,044
RAC: 6,872
Message 2995 - Posted: 26 Jan 2021, 23:47:21 UTC

I am pleased you got the system going again. You may or may not be aware of this. When I tried to get tasks I got the following information in my log:
27/01/2021 12:29:55 PM | NumberFields@home | Master file download succeeded
27/01/2021 12:30:00 PM | NumberFields@home | Sending scheduler request: To fetch work.
27/01/2021 12:30:00 PM | NumberFields@home | Requesting new tasks for NVIDIA GPU
27/01/2021 12:30:02 PM | NumberFields@home | Scheduler request completed: got 50 new tasks
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367043of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367117of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367121of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367123of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367757of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368172of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368234of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368300of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368305of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368418of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368423of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368424of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368426of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368428of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368429of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368430of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368431of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368433of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368435of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368437of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368438of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368439of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368442of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368443of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368446of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368447of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368452of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368453of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368456of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368457of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368458of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368460of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368461of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368462of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368470of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368473of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368476of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368477of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368479of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368480of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368482of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368484of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368485of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368487of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368489of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368492of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368493of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368495of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368496of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp368498of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Project requested delay of 31 seconds
27/01/2021 12:30:03 PM | NumberFields@home | work fetch suspended by user
27/01/2021 12:30:04 PM | NumberFields@home | Started download of GetDecics_3.05_windows_x86_64__opencl_nvidia
27/01/2021 12:30:04 PM | NumberFields@home | Started download of pdtKernel_v305.cl
27/01/2021 12:30:04 PM | NumberFields@home | Started download of gpuMultiPrec_v305.h
27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367043of2000000.dat
27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367117of2000000.dat
27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367121of2000000.dat
27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367123of2000000.dat
27/01/2021 12:30:04 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp367757of2000000.dat
27/01/2021 12:30:06 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367123of2000000.dat: permanent HTTP error
27/01/2021 12:30:06 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368172of2000000.dat
27/01/2021 12:30:07 PM | NumberFields@home | Finished download of pdtKernel_v305.cl
27/01/2021 12:30:07 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368172of2000000.dat: permanent HTTP error
27/01/2021 12:30:07 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368234of2000000.dat
27/01/2021 12:30:07 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368300of2000000.dat
27/01/2021 12:30:08 PM | NumberFields@home | Finished download of gpuMultiPrec_v305.h
27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367043of2000000.dat: permanent HTTP error
27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367117of2000000.dat: permanent HTTP error
27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368234of2000000.dat: permanent HTTP error
27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368300of2000000.dat: permanent HTTP error
27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368305of2000000.dat
27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368418of2000000.dat
27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368423of2000000.dat
27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368424of2000000.dat
27/01/2021 12:30:08 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368426of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367121of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367757of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368305of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368418of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368423of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368424of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368426of2000000.dat: permanent HTTP error
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368428of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368429of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368430of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368431of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368433of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368435of2000000.dat
27/01/2021 12:30:09 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368437of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368428of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368429of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368430of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368431of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368433of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368435of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368437of2000000.dat: permanent HTTP error
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368438of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368439of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368442of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368443of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368446of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368447of2000000.dat
27/01/2021 12:30:10 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368452of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368438of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368439of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368442of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368443of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368446of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368447of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368452of2000000.dat: permanent HTTP error
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368453of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368456of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368457of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368458of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368460of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368461of2000000.dat
27/01/2021 12:30:11 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368462of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368453of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368456of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368457of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368458of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368460of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368461of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368462of2000000.dat: permanent HTTP error
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368470of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368473of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368476of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368477of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368479of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368480of2000000.dat
27/01/2021 12:30:12 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368482of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368470of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368473of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368476of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368477of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368479of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368480of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368482of2000000.dat: permanent HTTP error
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368484of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368485of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368487of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368489of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368492of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368493of2000000.dat
27/01/2021 12:30:13 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368495of2000000.dat
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368484of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368485of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368487of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368489of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368492of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368493of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368495of2000000.dat: permanent HTTP error
27/01/2021 12:30:14 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368496of2000000.dat
27/01/2021 12:30:14 PM | NumberFields@home | Started download of sf3_DS-16x271-1_Grp368498of2000000.dat
27/01/2021 12:30:15 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368496of2000000.dat: permanent HTTP error
27/01/2021 12:30:15 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp368498of2000000.dat: permanent HTTP error
27/01/2021 12:31:09 PM | NumberFields@home | Finished download of GetDecics_3.05_windows_x86_64__opencl_nvidia
27/01/2021 12:32:15 PM | NumberFields@home | update requested by user
27/01/2021 12:32:18 PM | NumberFields@home | Fetching scheduler list
27/01/2021 12:32:20 PM | NumberFields@home | Master file download succeeded
27/01/2021 12:32:25 PM | NumberFields@home | Sending scheduler request: Requested by user.
27/01/2021 12:32:25 PM | NumberFields@home | Reporting 50 completed tasks
27/01/2021 12:32:25 PM | NumberFields@home | Not requesting tasks: "no new tasks" requested via Manager
27/01/2021 12:32:27 PM | NumberFields@home | Scheduler request completed
27/01/2021 12:32:27 PM | NumberFields@home | Project requested delay of 31 seconds


I will wait for things to come fully back online before trying again. It looks like I had 21 tasks in progress at the time you took the database backup. I had returned these tasks before the crash. If the server is able to resend them to me before the deadline of 30 Jan 2021, 19:41:28 UTC, I will be more than happy to process them.
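
For anyone who wants to tally a log like the one above rather than count lines by eye, here is a minimal Python sketch. It assumes the event-log layout shown in this post (timestamp, project, and message separated by " | "); the function name `summarize` and the three sample lines are only for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# One event-log line as pasted above:
# "<date> <time> <AM|PM> | <project> | <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\S+ \S+ [AP]M) \| (?P<project>[^|]+?) \| (?P<msg>.*)$"
)

RESENT_PREFIX = "Resent lost task "
GIVE_UP_PREFIX = "Giving up on download of "


def summarize(log_text):
    """Count resent tasks and abandoned downloads in a pasted event log."""
    counts = Counter()
    failed_files = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        msg = match.group("msg")
        if msg.startswith(RESENT_PREFIX):
            counts["resent"] += 1
        elif msg.startswith(GIVE_UP_PREFIX):
            counts["download_failed"] += 1
            # The file name sits between the prefix and the ": <reason>" suffix.
            failed_files.append(msg[len(GIVE_UP_PREFIX):].split(":", 1)[0])
    return counts, failed_files


# Three lines copied from the log above, used as a small sample.
sample = """\
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367043of2000000_0
27/01/2021 12:30:02 PM | NumberFields@home | Resent lost task wu_sf3_DS-16x271-1_Grp367117of2000000_0
27/01/2021 12:30:08 PM | NumberFields@home | Giving up on download of sf3_DS-16x271-1_Grp367043of2000000.dat: permanent HTTP error
"""

counts, failed_files = summarize(sample)
print(counts["resent"], counts["download_failed"], failed_files)
```

Pasting the whole log into `sample` instead of the three lines should give a resent count that matches the "got 50 new tasks" line.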

Karsten Vinding
Joined: 8 Oct 20
Posts: 1
Credit: 650,482
RAC: 1,028
Message 2996 - Posted: 26 Jan 2021, 23:47:51 UTC - in response to Message 2994.  
Last modified: 26 Jan 2021, 23:48:59 UTC

I'm sure you are aware of this, but download of new work is failing instantly with a "Permanent HTTP error" message.

According to the server status page, the upload/download servers should be up.

Hope you get the site back to a fully operational state soon :)

Edit: Oh I see Speedy51 posted while I was writing ...

Contact
Joined: 2 Sep 11
Posts: 9
Credit: 399,089
RAC: 3
Message 2997 - Posted: 27 Jan 2021, 0:22:37 UTC

And now old pre-crash WUs (9 Jan 2021, 7:34:35 UTC) have uploaded and validated. After several failed downloads, I'm downloading new work and have attached a new host. There's still lots to fix up here, but the progress is swift so far. Well done!

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 2998 - Posted: 27 Jan 2021, 2:55:02 UTC - in response to Message 2995.  

So the "resent lost tasks" are those that you completed in the 2-day period just before the hard drive crash. Since the database is now 2 days behind, it is unaware that you already returned them, so it's asking you to resend them. The good news is that those result files were already assimilated and were recovered off the old drive, so the computation is not lost. The bad news is that the credits are gone. The other bad news is that the project thinks it still needs the data, so it will try to send the work back out to someone else. I think my best course of action will be to delete all current WUs in the database and then regenerate those that are still needed. But I'm still thinking about it and hoping there is a better option...
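
To make the out-of-sync cases concrete, here is a rough Python sketch of how a database restored from an older backup would see a single task, depending on when it was sent and returned. The function `classify`, its labels, and the crash timestamp are made up for illustration; this is a simplified picture of the situation described above, not how the BOINC server is implemented.

```python
from datetime import datetime, timedelta


def classify(sent, returned, backup_time, crash_time):
    """Describe how a database restored from backup_time sees one task.

    sent and returned are datetimes; returned is None if the task
    was never returned.
    """
    if sent >= backup_time:
        # Issued after the backup was taken: the restored database has no record.
        return "unknown to database"
    if returned is not None and returned < backup_time:
        # Both events happened before the backup, so nothing was lost.
        return "recorded as returned"
    if returned is not None and returned <= crash_time:
        # Returned in the gap between backup and crash: the database
        # still lists the task as outstanding on the host.
        return "looks unreturned"
    # Not returned before the crash.
    return "still in progress"


# Hypothetical timestamps; the thread only says the backup was 2 days old.
crash = datetime(2021, 1, 24, 12, 0)
backup = crash - timedelta(days=2)

print(classify(crash - timedelta(days=3), crash - timedelta(days=1), backup, crash))
print(classify(crash - timedelta(days=1), None, backup, crash))
```

The first call corresponds to the "resent lost tasks" case, and the second to work issued during the 2 days that the restored database knows nothing about.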

Speedy51
Joined: 13 Apr 19
Posts: 22
Credit: 4,295,044
RAC: 6,872
Message 2999 - Posted: 27 Jan 2021, 3:06:25 UTC - in response to Message 2998.  

Thanks for the feedback Eric. Happy decision-making :)

[AF>EDLS]zOU
Joined: 27 Feb 16
Posts: 4
Credit: 5,335,499
RAC: 0
Message 3000 - Posted: 27 Jan 2021, 7:39:05 UTC

I got 40 valid WUs out of the 250 completed WUs I sent back :D :D :D

(I'm still amazed that in this day and age, there was no RAID setup on this/any server)

Thank you for the recovery though, looking forward to crunching more WUs

Chooka
Joined: 3 May 18
Posts: 15
Credit: 17,304,326
RAC: 1,676
Message 3001 - Posted: 27 Jan 2021, 7:46:57 UTC

Thank you for the update Eric.
I was wondering what had happened.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3002 - Posted: 27 Jan 2021, 8:10:19 UTC - in response to Message 2999.  

You'll notice a bunch of download errors. These happen because the WU's input file is no longer present in the download directory: the WU was completed and its files deleted just before the crash, and the restored database is out of sync with that. I have set "max_error_results" to 1 so the WU will be deleted after 1 download error, which should help push these out.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3003 - Posted: 27 Jan 2021, 8:14:28 UTC - in response to Message 3000.  

(I'm still amazed that in this day and age, there was no RAID setup on this/any server)


That's a good point. I'll have to look into it, but I know the math department's budget is limited.

Ian (NapierNimbus)
Joined: 22 Jun 20
Posts: 1
Credit: 1,522,293
RAC: 971
Message 3006 - Posted: 27 Jan 2021, 20:24:23 UTC - in response to Message 2994.  

Glad you are back but my sympathies to you over the nightmare you must be facing in picking up the pieces. Ian (NapierNimbus)

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3007 - Posted: 27 Jan 2021, 22:46:03 UTC - in response to Message 3006.  

Another annoying thing to look out for... I noticed one of my hosts started using the 32bit version of the cpu app. I'm not sure why it's doing this, as the server has the "prefer_primary_platform" config option set. This is especially annoying since the 32bit version runs twice as slow.

Not sure yet how to get around this and resetting the project didn't help.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3009 - Posted: 28 Jan 2021, 0:18:45 UTC - in response to Message 3007.  

Another annoying thing to look out for... I noticed one of my hosts started using the 32bit version of the cpu app. I'm not sure why it's doing this, as the server has the "prefer_primary_platform" config option set. This is especially annoying since the 32bit version runs twice as slow.

Not sure yet how to get around this and resetting the project didn't help.


I figured this out. It was basically caused by too many download errors in a 24-hour period on that particular host. Most of the download problems should now be flushed from the system, so this should automatically correct itself within a day. If you want it corrected sooner, let me know and I can fix it for your host. You know you have the problem if your 64-bit CPU is running a version less than 4.00 (this only applies to CPU apps).

[AF>EDLS]zOU
Joined: 27 Feb 16
Posts: 4
Credit: 5,335,499
RAC: 0
Message 3011 - Posted: 28 Jan 2021, 7:18:48 UTC - in response to Message 3003.  

(I'm still amazed that in this day and age, there was no RAID setup on this/any server)


That's a good point. I'll have to look into it, but I know the math department's budget is limited.


I understand.
A lot of servers have HW RAID built into the drive controller; failing that, the server OS should be able to do software RAID (via a volume manager or something else).
Or there are plenty of cheap second-hand RAID adapters on the market, but IT policies may prevent using non-approved HW :D

Aurel
Joined: 25 Feb 13
Posts: 216
Credit: 9,768,546
RAC: 3
Message 3013 - Posted: 28 Jan 2021, 14:22:22 UTC

Nice, it seems to work here, but I got WAY too many new tasks. I got my new CPU and mainboard yesterday and sadly have to use a normal CPU cooler instead of water cooling, so I can't run at full power 24/7 until I get a new kit (they forgot to add the screws; they weren't in the package).

It also didn't create a new ID for this PC somehow?

Aurel
Joined: 25 Feb 13
Posts: 216
Credit: 9,768,546
RAC: 3
Message 3014 - Posted: 28 Jan 2021, 14:47:23 UTC - in response to Message 3013.  

Nice, it seems to work here, but I got WAY too many new tasks. I got my new CPU and mainboard yesterday and sadly have to use a normal CPU cooler instead of water cooling, so I can't run at full power 24/7 until I get a new kit (they forgot to add the screws; they weren't in the package).

It also didn't create a new ID for this PC somehow?


An update on this: after 10 minutes of work, several cores reached temperatures of 100°C, which forced me to stop running BOINC. I was only using 25% of the cores. I should get the kit in a week, so I won't be able to finish within the deadline.

Contact
Joined: 2 Sep 11
Posts: 9
Credit: 399,089
RAC: 3
Message 3015 - Posted: 28 Jan 2021, 15:23:03 UTC - in response to Message 2997.  
Last modified: 28 Jan 2021, 15:28:37 UTC

I wrote:
... There's still lots to fix up here ...
Not any more. I no longer see any database or download errors. To my mind, this project has fully recovered from sudden death. Incredible!

Edit: Forgot to mention that email notifications aren't working.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3017 - Posted: 28 Jan 2021, 17:50:54 UTC - in response to Message 3011.  

I understand.
A lot of servers have HW RAID built into the drive controller; failing that, the server OS should be able to do software RAID (via a volume manager or something else).
Or there are plenty of cheap second-hand RAID adapters on the market, but IT policies may prevent using non-approved HW :D


I used software RAIDs for many years on my own computers and finally stopped because it seemed like a waste of money buying 2 drives when I never had a problem. But it probably makes more sense with a server doing tons of file I/O.

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3018 - Posted: 28 Jan 2021, 17:53:33 UTC - in response to Message 3014.  

An update on this: after 10 minutes of work, several cores reached temperatures of 100°C, which forced me to stop running BOINC. I was only using 25% of the cores. I should get the kit in a week, so I won't be able to finish within the deadline.


Yikes! I start to worry when mine goes above 80C. Doesn't the cpu automatically shut down or at least throttle itself when it gets that hot?

Eric Driver
Project administrator
Project developer
Project tester
Project scientist
Joined: 8 Jul 11
Posts: 1144
Credit: 183,491,079
RAC: 163,214
Message 3019 - Posted: 28 Jan 2021, 17:56:32 UTC - in response to Message 3015.  

Forgot to mention that email notifications aren't working.


Come to think of it, you're right. I used to get email notifications on subscribed threads. I will have to look into this. They probably forgot to configure sendmail when they reinstalled the OS.
Copyright © 2021 Arizona State University