Posts by Eric Driver

21) Message boards : News : server maintenance (Message 3635)
Posted 21 Jan 2024 by Eric Driver
Post:
It looks like the validator also went crazy at some point: everyone who was reporting tasks during that time was awarded several times their usual daily throughput between 19 Jan 14:00 UTC and 20 Jan 05:00 UTC. I cannot find any tasks with suspiciously high credit (or extremely short run times), so most likely the credit for each task was awarded several times. It has been back to normal since.

The only persistent "problems" I can see right now are PHP warnings on the login page:
Deprecated: urldecode(): Passing null to parameter #1 ($string) of type string is deprecated in /home/boincadm/projects/NumberFields/html/user/login_form.php on line 26
and on the result detail pages (e.g., https://numberfields.asu.edu/NumberFields/result.php?resultid=212105635):
Deprecated: Implicit conversion from float-string "7664.013119" to int loses precision in /home/boincadm/projects/NumberFields/html/inc/util.inc on line 413
Deprecated: Implicit conversion from float-string "7574.109" to int loses precision in /home/boincadm/projects/NumberFields/html/inc/util.inc on line 413


Thanks for reporting those PHP warnings.

With regard to the multiple validations, that actually makes sense. The root cause of all the problems was an upgrade to Python 3. For some reason this caused the startup script to stop creating the lock files for the daemons, and as a result it would spawn new copies of the daemons every 5 minutes. These daemons use database connections, so eventually we would exceed the maximum number of connections. In addition, the duplicate daemons were stepping all over each other and causing all sorts of errors in the logs. Apparently the multiple validators each awarded credit for the same task.
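
As an illustration of why missing lock files lead to duplicate daemons, here is a minimal sketch of a lock-file guard. It is not the project's actual startup script: the function names, the lock directory, and the use of `fcntl.flock` (Unix only) are assumptions made for the example.

```python
import fcntl
import os
import tempfile


def try_acquire_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    lock_file = open(lock_path, "a+")
    try:
        # Non-blocking exclusive lock: fails at once if another holder exists.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        lock_file.close()
        return None
    # Record the owner's PID for debugging; the lock itself is what matters.
    lock_file.seek(0)
    lock_file.truncate()
    lock_file.write(str(os.getpid()))
    lock_file.flush()
    return lock_file


def start_daemon_if_not_running(name, lock_dir):
    """Hypothetical periodic check: start `name` only if no copy holds its lock."""
    lock = try_acquire_lock(os.path.join(lock_dir, name + ".lock"))
    if lock is None:
        return None  # a copy is already running, so do not spawn another
    # A real script would launch the daemon here and keep `lock` open.
    return lock


if __name__ == "__main__":
    lock_dir = tempfile.mkdtemp()
    first = start_daemon_if_not_running("validator", lock_dir)
    second = start_daemon_if_not_running("validator", lock_dir)
    print("first started:", first is not None)
    print("second started:", second is not None)
```

If the lock step is skipped, every periodic check believes no daemon is running and starts another copy, each holding its own database connection.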
22) Message boards : News : server maintenance (Message 3633)
Posted 20 Jan 2024 by Eric Driver
Post:
I believe all the kinks have been worked out now. But please let me know if you notice anything suspicious.
23) Message boards : News : server maintenance (Message 3632)
Posted 19 Jan 2024 by Eric Driver
Post:
Just a heads up, the server was upgraded to the latest version of Ubuntu this morning.

The database seems to be crashing every so often with an error about exceeding max connections. Bear with me as we look into this.
24) Message boards : Number crunching : How can i earn septic count and ℚ(√-10) count? (Message 3631)
Posted 8 Jan 2024 by Eric Driver
Post:
ok. please can you post here when the time comes? thanks

sure thing.
25) Message boards : News : 2023 Year End Summary (Message 3630)
Posted 8 Jan 2024 by Eric Driver
Post:
How long will the calculations take in general?


Probably several years before subfield 6 is "essentially complete". A few more years beyond that to guarantee completeness.
26) Message boards : Number crunching : How can i earn septic count and ℚ(√-10) count? (Message 3627)
Posted 7 Jan 2024 by Eric Driver
Post:
The ℚ(√-10) search is also known as sf7. This one is mostly being computed at Gerasim (no badges). Periodically, some cases are run here giving you the chance to earn some counts.

Any chance of having a few more of these please?

A small chance, but probably not for at least several months.
27) Message boards : News : 2023 Year End Summary (Message 3624)
Posted 1 Jan 2024 by Eric Driver
Post:
2023 was basically a good year. The main achievements were:
1. Completed 100% of subfield 3.
2. Got subfield 7 to the point where it is essentially complete.
3. Completed several rows of the subfield 6 table.

Going forward we will continue chipping away at subfield 6. Meanwhile, the last remnants of subfield 7 have been farmed out to Gerasim.

Thanks everyone for your contributions and have a wonderful New Year!
28) Message boards : Number crunching : Work done not showing up at BOINCStats (Message 3621)
Posted 2 Dec 2023 by Eric Driver
Post:
See this thread:
https://numberfields.asu.edu/NumberFields/forum_thread.php?id=582#3446
29) Message boards : Number crunching : Consisting Crashing on GPU (Message 3618)
Posted 17 Nov 2023 by Eric Driver
Post:
I suppose that NF is FP64-intense which is not friendly to NVIDIA cards. Am I right?


Actually, it's integer intensive.
30) Message boards : Number crunching : Consisting Crashing on GPU (Message 3612)
Posted 12 Nov 2023 by Eric Driver
Post:
Yes. I see about an 8x speedup on a 4060 Laptop GPU compared with the CPU when running 3 tasks in parallel, but with 60W power consumption. On the CPU, it only consumes 25W with 16 tasks in parallel. That's awkward because heterogeneous computing normally increases power efficiency by nearly an order of magnitude.


Where is this power measurement coming from? Is it the GPU only or the whole system?

Another thing to keep in mind is that the GPU app also uses a portion of a CPU core, probably somewhere between 20% and 50% depending on the speed of the GPU. The CPU generates the list of polynomials to test and the GPU does the actual testing; when the GPU is really fast, the CPU has to work harder to keep feeding it, hence the CPU usage goes up.
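
The CPU/GPU split described above is a producer/consumer pattern. The sketch below is only an illustration of that pattern, not the app's real code: a thread stands in for the GPU, and `generate_polynomials`, `run_pipeline`, and the coefficient tuples are hypothetical.

```python
import queue
import threading


def generate_polynomials(count):
    """Producer side (the CPU's job): yield made-up coefficient tuples."""
    for i in range(count):
        yield (1, i, -i)


def run_pipeline(count, buffer_size, test_polynomial):
    """Feed polynomials through a bounded buffer to a consumer thread."""
    work = queue.Queue(maxsize=buffer_size)
    results = []

    def consumer():
        # Stand-in for the GPU: pull work until the sentinel arrives.
        while True:
            poly = work.get()
            if poly is None:
                break
            results.append(test_polynomial(poly))

    worker = threading.Thread(target=consumer)
    worker.start()
    for poly in generate_polynomials(count):
        work.put(poly)  # blocks while the buffer is full
    work.put(None)  # sentinel: no more work
    worker.join()
    return results


if __name__ == "__main__":
    print(len(run_pipeline(10, 4, sum)))
```

The faster the consumer drains the buffer, the less time the producer spends blocked in `put`, which matches the observation that CPU usage rises with a faster GPU.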
31) Message boards : Number crunching : Consisting Crashing on GPU (Message 3610)
Posted 11 Nov 2023 by Eric Driver
Post:
Besides, I don't see any energy efficiency increase when switching to GPU as expected. Is that normal?


I'm not sure exactly what you mean. I see about a 25x speedup on my 3070 Ti compared to a single CPU core, but it also uses a lot more power, so I'm not sure if it's any more energy efficient.
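
Whether a speedup translates into better energy efficiency depends on energy per task (power times run time). A rough sketch follows; only the 25x speedup comes from this thread, and the wattages and run times are made-up placeholders, not measurements.

```python
def energy_per_task_joules(power_watts, seconds_per_task, tasks_in_parallel=1):
    """Energy attributed to one task: power * time, shared across parallel tasks."""
    return power_watts * seconds_per_task / tasks_in_parallel


def gpu_is_more_efficient(speedup, cpu_watts, gpu_watts):
    """The GPU wins on energy only if its power ratio is below its speedup."""
    return gpu_watts / cpu_watts < speedup


if __name__ == "__main__":
    # Hypothetical numbers: one CPU core at 10 W taking 2500 s per task,
    # and a GPU at 200 W that is 25x faster (100 s per task).
    print(energy_per_task_joules(10, 2500))   # CPU core
    print(energy_per_task_joules(200, 100))   # GPU
    print(gpu_is_more_efficient(25, 10, 200))
```

With these placeholder values the GPU comes out only slightly ahead, which is why a whole-system power measurement matters before drawing conclusions.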
32) Message boards : Number crunching : Consisting Crashing on GPU (Message 3606)
Posted 10 Nov 2023 by Eric Driver
Post:
Sorry for the frustration. I'm not sure what the problem is. I saw similar behavior years ago when overclocking the CPU: the system would overheat and then shut itself down. Maybe something similar is happening with the GPU?
33) Message boards : Number crunching : How can i earn septic count and ℚ(√-10) count? (Message 3602)
Posted 31 Oct 2023 by Eric Driver
Post:
Also, I did a little Paint.net work using layers to make the Galois Field badges much easier to tell apart, even at as little as 10% of the original size.

If you wish to use these, and I hope you do - feel free!

It should be relatively easy to swap out the images. But for some reason they are not loading for me. Not sure if it's my browser or the site. I will look into it later when I have a free moment.
34) Message boards : Number crunching : How can i earn septic count and ℚ(√-10) count? (Message 3598)
Posted 27 Oct 2023 by Eric Driver
Post:
As title. How can i earn septic count and ℚ(√-10) count?


The septic search ended a long time ago, so no more for that one.

The ℚ(√-10) search is also known as sf7. This one is mostly being computed at Gerasim (no badges). Periodically, some cases are run here giving you the chance to earn some counts. Ironically, yesterday I dropped about 35k WUs for sf7, so that would have given you a chance.
35) Message boards : News : Support for Intel GPUs (Message 3595)
Posted 24 Oct 2023 by Eric Driver
Post:
I have had 3 tasks fail after between 12 and 14 hours with the following error:
Exit status 198 (0x000000C6) EXIT_MEM_LIMIT_EXCEEDED

The tasks are https://numberfields.asu.edu/NumberFields/result.php?resultid=204294692, https://numberfields.asu.edu/NumberFields/result.php?resultid=204294798 and https://numberfields.asu.edu/NumberFields/result.php?resultid=204294071

Part of Stderr Output gives:

<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
working set size > client RAM limit: 16385.39MB > 16309.76MB</message>
<stderr_txt>
GPU Summary String = [INTEL|Intel(R)HDGraphics4600|1|1629MB||102].
Loading GPU lookup table from file.
GPU was not found in the lookup table. Using default values:
numBlocks = 1024.
threadsPerBlock = 32.
polyBufferSize = 32768.


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x00007ffe3287b892

Engaging BOINC Windows Runtime Debugger...

From these messages I understand that the tasks are running out of GPU memory.
Is there any way to reduce the amount of memory being used?

Thanks, Ruud


Sorry for your troubles. The application doesn't use nearly that much memory, and I think that error code is referring to CPU memory, not GPU (but I could be wrong). Also, the memory is allocated up front, so it shouldn't take 14 hours before it errors out.

Looking at the stderr, it doesn't even get to the OpenCL messages. My best guess for what is happening: the OpenCL driver is hanging during the build phase. More specifically, the OpenCL compiler gets stuck and slowly chews up system memory until it runs out (do you have 16GB of system memory?). Since the OpenCL compiler is part of the graphics driver, the only solution I see is to upgrade the driver, if that's even possible.
36) Message boards : News : Support for Intel GPUs (Message 3591)
Posted 7 Oct 2023 by Eric Driver
Post:
Pardon me for the impertinence, but I may have a better solution for the GPU problems that this project is hampered by:

*Expand the number of GPUs in the file your project supplies! (gpuLookupTable_v402.txt)*

      GPU Name      |   numBlocks   |  threadsPerBlock
==========================================================
     GTX 1050       |      9600     |       32
     GTX 1050 Ti    |      9600     |       32
     GTX 1660       |      8192     |       32
      RX 570        |      2048     |       64
     GTX 1070       |      9600     |       32

is a bit meagre, don't you think?
There are whole generations of GPUs (and IGPs) that do not get mentioned, and the default values don't work for all of those. Editing the file does not work, even when you have a line in the cc_config.xml preventing the check for changed files, at least not in my BOINC client version. It looks like the file gets overwritten at times too, because I have made the change several times, to no avail.


To remind you, the default values should work well for most cards (some older cards will have problems). The lookup table is for those who want to tweak the settings to eke out a little more performance for their specific card. I can't add new entries if I don't have access to other cards. The original hope was that some users would send me their optimal settings so I could add them to the official lookup table.
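
For readers curious how a table in this format could be consulted, here is a rough sketch. It is not the app's real parser: the matching rule (longest table name contained in the reported GPU name) and the function names are assumptions. The default values are the ones printed in the stderr output quoted elsewhere on this page.

```python
# Defaults as reported in stderr when a GPU is not found in the table.
DEFAULTS = {"numBlocks": 1024, "threadsPerBlock": 32}


def parse_lookup_table(text):
    """Parse 'GPU Name | numBlocks | threadsPerBlock' rows into a dict."""
    table = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # separator or blank line
        name, blocks, threads = parts
        if not (blocks.isdigit() and threads.isdigit()):
            continue  # header row
        table[name] = (int(blocks), int(threads))
    return table


def settings_for(gpu_name, table):
    """Assumed rule: the longest table name contained in gpu_name wins."""
    for name in sorted(table, key=len, reverse=True):
        if name in gpu_name:
            blocks, threads = table[name]
            return {"numBlocks": blocks, "threadsPerBlock": threads}
    return dict(DEFAULTS)


SAMPLE = """
      GPU Name      |   numBlocks   |  threadsPerBlock
==========================================================
     GTX 1050       |      9600     |       32
     GTX 1050 Ti    |      9600     |       32
     GTX 1660       |      8192     |       32
      RX 570        |      2048     |       64
     GTX 1070       |      9600     |       32
"""

if __name__ == "__main__":
    table = parse_lookup_table(SAMPLE)
    print(settings_for("NVIDIA GeForce GTX 1660", table))
    print(settings_for("Intel(R) HD Graphics 4600", table))
```

A card that is not listed simply falls back to the defaults, which is the behavior the "GPU was not found in the lookup table" message reports.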
37) Message boards : News : Batch Plan (Message 3585)
Posted 26 Sep 2023 by Eric Driver
Post:
The plan is to go up to 13x11 and then move on to row 14. Rows 14 and 15 are comparable to row 13, so we should be able to get up to row 16 (the final row) with only data sets 13x12, 14x12, and 15x12 unfinished.
38) Message boards : News : Support for Intel GPUs (Message 3581)
Posted 14 Sep 2023 by Eric Driver
Post:
I tried the solution
If it is truly a resource problem, you could try adding the following line to the gpuLookupTable file in your projects directory:
UHD Graphics 605     |     256     |      8

but it had no effect whatsoever: it still can't find the GPU in a certain table, apparently not the one in my directory. I have made my hosts visible now, so you can see more for yourself.


It's been a while since I tried this, so I thought it might be wise to try it again. It still works for me.

For me, the file to change was: [BOINC_root]/projects/numberfields.asu.edu_NumberFields/gpuLookupTable_v402.txt

If that's the file you changed, then maybe the problem is the client version as mentioned earlier in this thread. My client/manager version was 7.20.2
39) Message boards : News : Support for Intel GPUs (Message 3576)
Posted 15 Aug 2023 by Eric Driver
Post:
DKlimax - Thanks for looking into that.

So it looks like we have a working solution for anyone who wants to modify the GPU lookup table. Again, it's not necessary, but allows users the option to tweak parameters for their specific card (no guarantee, but it might improve performance by 5 to 10%).
40) Message boards : Number crunching : Older Batches (Message 3574)
Posted 13 Aug 2023 by Eric Driver
Post:
I noticed on the batch status page the sf6-DS-12x9 has been marked completed. However, I am still getting a few of these tasks from time to time. Do you still need these tasks? If not, I would rather abort them and work on tasks that are needed.


When it gets very close to being done, I turn it off on the batch status page to reduce database queries. Tasks that are not needed anymore will be aborted, so no need to worry about it yourself.




Copyright © 2024 Arizona State University