Message boards : Number crunching : I cant send finished wus
Author | Message |
---|---|
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
Are you mixing up your host ids? I show nothing uploaded by #95069, but #95038 returning a successful Septic result. |
Send message Joined: 2 Apr 18 Posts: 10 Credit: 561,802 RAC: 0 |
Hi Erich, it looks different to me: "Office" #95038 is still trying to upload the same septic WUs in BOINC. There is one decic WU on this host that is ready to report, but I will have a closer look. |
Send message Joined: 28 Oct 11 Posts: 179 Credit: 220,503,822 RAC: 128,960 |
Here is an absolutely off-the-wall observation, which may be completely unrelated - but it has some similarities with this situation. I present it as food for thought, nothing more. Like HerrJeh, I have multiple machines - all are at my home, and share an internet connection. They have the same public IP addresses, but different private IP addresses behind the router. I'm connected to a different BOINC project, which like this one runs every BOINC function from a single server. That other project distributes long-running (up to 18 hour) workunits, but regards them as time-critical - it likes to have them returned within 24 hours. So I don't want to get a cache of tasks in advance, and in fact the project often has no tasks immediately available on demand. All of which is a long way of explaining why I find myself repeatedly clicking the 'Update' button in an attempt to get new work before the current task completes. What I am observing is that I can issue repeated requests every 30 seconds from one machine, and connect to the server every time. But if I try to connect from a different machine (same LAN, same IP) in between, the second machine can't connect. If I stop updating the first machine for a minute or two, the second machine can connect, and goes on connecting for as long as is needed and allowed by their 30-second backoff interval. The nature of the server contact required doesn't affect the connection failures: I've just had a machine which couldn't connect to upload results, while I was requesting new work on a different machine. It isn't simply congestion at the server port: that would be more random. This observation is strictly about multiple connection attempts, closely spaced in time, from different computers sharing the same public IP address. It feels more like a server OS-level problem than a BOINC problem, and it's been consistent for weeks, if not months. 
Their server is running: 11/04/2018 11:03:50 | | [http] [ID#1] Received header from server: Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips mod_auth_gssapi/1.3.1 mod_auth_kerb/5.4 mod_fcgid/2.3.9 PHP/5.4.16 mod_wsgi/3.4 Python/2.7.5. My machines are all Windows 7/64 and run recent versions of BOINC (mostly v7.9 test builds). I haven't yet explored the http logs for further clues, but I'll give it a try when I have time. |
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
(Quoting: "Hi Erich looks different to me - 'Office' #95038 is still trying to upload the same septic-wus in Boinc. There is one decic wu on this host, which is ready to report. But i will have a closer look.") That is odd. I see something different. I'm doing database queries through the web admin interface since your computers are hidden from the normal view. I show you uploaded a septic result on April 10th for WU: septics_Bnd200E6_Grp153126of3001592. I also verified the uploaded file resides in the final assimilated directory, so all looks well on my end. Is that the same WU your client is still trying to upload? |
Send message Joined: 2 Apr 18 Posts: 10 Credit: 561,802 RAC: 0 |
Hi, I opened up my hosts for you and changed the WUs to decic only. @Richard: As you said, there is a local network running, with up to 7 hosts working on BOINC. Your idea seems plausible to me; I will give it a try and shut down individual managers for a while to see if the behaviour changes. Thank you for your support! |
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
@HerrJeh: It looks like all your hosts with the exception of #95062 have returned at least one septic result. Let me know if any of the WUs showing "in progress" are still unable to upload. I can run them offline to see if there is anything unusual about them (like file size). @UrsD: Sorry to have neglected you. Are you still having upload problems? Several tasks show "timed out"; I imagine some of those are the ones with upload problems? |
Send message Joined: 2 Apr 18 Posts: 10 Credit: 561,802 RAC: 0 |
@UrsD: Sorry, I didn't want to hijack your thread. @Eric: Thank you. |
Send message Joined: 25 Feb 18 Posts: 5 Credit: 5,590,094 RAC: 0 |
I just run Get Decic Fields on my Win10 PC. No problem. My good old i7 965 runs septics under Linux. No problem either! If you can't handle the problem, avoid it! :-) ;-) |
Send message Joined: 25 Feb 18 Posts: 5 Credit: 5,590,094 RAC: 0 |
@HerrJeh: No problem ;) |
Send message Joined: 10 Oct 15 Posts: 5 Credit: 38,148,839 RAC: 274 |
(Quoting: "Kaspersky18@work - could this be a problem?") Looks like it is. Several users complained about upload problems in the last few days, and they all used Kaspersky. Apparently the upload itself works fine (which is why everything looks fine on the server side), but the server's confirmation that the file was received is blocked by Kaspersky, so the client thinks the upload went wrong and tries again. This does not only affect NumberFields@home, but at least one other project (and probably all projects) that also use HTTPS for file uploads. HTTP appears to be fine, which is why the problem doesn't occur with Get Decic Fields tasks. Workarounds: (1) disable the network traffic scan of the BOINC client in Kaspersky; (2) set <http_1_0>1</http_1_0> in cc_config.xml to force BOINC to use HTTP 1.0 for file transfers. |
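For anyone who would rather script the second workaround than edit the file by hand, here is a minimal sketch. It assumes the usual cc_config.xml layout, where `<http_1_0>` sits inside an `<options>` element under the `<cc_config>` root; the function name `force_http_1_0` is made up for this example and is not part of BOINC.

```python
import xml.etree.ElementTree as ET


def force_http_1_0(config_xml: str) -> str:
    """Return cc_config.xml text with <http_1_0>1</http_1_0> set under <options>.

    Hypothetical helper for illustration only. Existing settings are kept;
    an empty input produces a minimal new config.
    """
    if config_xml.strip():
        root = ET.fromstring(config_xml)
    else:
        root = ET.Element("cc_config")
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")

    # Create <options> if the file does not have one yet.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    # Create or overwrite the <http_1_0> flag.
    flag = options.find("http_1_0")
    if flag is None:
        flag = ET.SubElement(options, "http_1_0")
    flag.text = "1"

    return ET.tostring(root, encoding="unicode")
```

Write the returned text back to cc_config.xml in the BOINC data directory, then restart the client or have it re-read its config files so the change takes effect.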
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
Thanks for the update! It looks like the mystery is finally solved... |
Send message Joined: 6 May 18 Posts: 4 Credit: 288,028 RAC: 0 |
Hi, I have read the thread, but I also can't upload some WUs. All decic WUs upload without problem, but most septic WUs fail with: 12/05/2018 8:51:24 AM | NumberFields@home | Temporarily failed upload of wu_septics_Bnd200E6_Grp1093864of3001592_0_r1114902147_0: transient HTTP error. However, some septic units do upload. I run Windows 7 Pro x64 and BOINC 7.8.3, but I don't run Kaspersky. |
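To see which result files keep failing, and how often, one option is to scan a saved copy of the event log for messages like the one quoted above. This is only a sketch: it assumes the log was saved as plain text with one message per line in the same format as that line, and the function name `failed_uploads` is made up for this example.

```python
import re
from collections import Counter

# Matches the message text seen in the post above, e.g.
# "Temporarily failed upload of <file name>: transient HTTP error"
FAILED_UPLOAD = re.compile(r"Temporarily failed upload of (\S+): (.+)$")


def failed_uploads(log_lines):
    """Count failed upload attempts per result file name."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_UPLOAD.search(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `failed_uploads(open("eventlog.txt"))` (the file name is a placeholder) returns a `Counter` keyed by result file name, which makes it easy to tell a handful of stuck files from a general upload failure.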
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
Hi, I'm wondering if the file size might be playing a part. The number of fields in the septic tasks that you successfully uploaded was close to average or less. If it's not too much to ask, could you look at some of the stuck tasks in the slot directories to see how many fields are in each file (the number of lines is sufficient)? I am also running the one you mentioned above to see if I notice anything out of the ordinary. |
Send message Joined: 6 May 18 Posts: 4 Credit: 288,028 RAC: 0 |
Hi, the output files vary in size between 585 B and 1.18 kB and are between 13 and 26 lines long, with the last 5 lines of the form: # The search is complete. Stats: # Inspected 11965108 polynomials. # Num Polys post discriminant = 11965005. # Num Polys passing field disc test = 18. # Elapsed Time = 48121 (sec) There are six (6) completed WUs in total that don't upload. The completed output files are all in E:\BOINC Data\projects\numberfields.asu.edu_NumberFields; they are not stuck in any slot directories as incomplete. |
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
Hi AussieGeoff, So it doesn't look like a file size problem. 13 lines (8 fields) is below average and you've returned other results with more fields. Also, I ran Grp1093864, which was the one you mentioned above that couldn't upload, and this one only had 9 fields. I will think about this problem some more. |
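The line-to-field arithmetic above (13 lines, 5 of them trailing "#" stats lines, so 8 fields) can be checked per file with a few lines of Python. This is a rough sketch that assumes every non-empty line not starting with "#" is one field, which matches the numbers in this thread but is not confirmed as the official file format; the function name is made up for this example.

```python
def count_lines_and_fields(result_text: str) -> tuple:
    """Return (total lines, field lines) for the text of one result file.

    Assumption: lines starting with '#' are the stats trailer and every
    other non-empty line holds exactly one field.
    """
    lines = result_text.splitlines()
    fields = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]
    return len(lines), len(fields)
```

For example, `count_lines_and_fields(open(path).read())` on each file in the project directory gives the two numbers to compare against results that uploaded without trouble (`path` is a placeholder).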
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
So that everyone is in the loop, Geoff emailed me one of the non-uploadable result files and I determined there is nothing wrong with it. Therefore, I believe this problem is client or host related. Geoff: Do you have a firewall and/or virus scan software that could be temporarily disabled before trying to upload? Or based on pschoefer's post above could you try this: set |
Send message Joined: 6 May 18 Posts: 4 Credit: 288,028 RAC: 0 |
(Quoting: "So that everyone is in the loop, Geoff emailed me one of the non-uploadable result files and I determined there is nothing wrong with it. Therefore, I believe this problem is client or host related.") Hi, I disabled all my malware protection and it seemed to work: all 6 files uploaded after 2-3 attempts each. |
Send message Joined: 8 Jul 11 Posts: 1318 Credit: 404,000,018 RAC: 289,984 |
That's great news, Geoff! So Kaspersky is not the only malware provider that conflicts with BOINC. I imagine there is a way to configure your malware protection so that it ignores the BOINC client. If not, you could try the HTTP config workaround mentioned above. |
Send message Joined: 6 May 18 Posts: 4 Credit: 288,028 RAC: 0 |
(Quoting: "That's great news Geoff! So Kaspersky is not the only malware provider that conflicts with BOINC.") I have been checking, and the problem seems to be WinPatrol Firewall and/or WinPatrol WAR (anti-malware). They run a shared service that continues to run after you shut them both down. That service runs another service that seems to be the problem: it is a pain to stop, and as soon as you do, BOINC issues a message that your login is invalid and you have to start BOINC again. The original WinPatrol is fine. What really annoyed me is that there is no indication of any action (blocking) in the logs of either of the two programs. |