Posts by Eric Driver

41) Message boards : News : Support for Intel GPUs (Message 3779)
Posted 17 Jan 2025 by Profile Eric Driver
Post:
Although Thunderbolt 3 is much slower than PCI Express, I find it hard to believe it could be the problem. The total amount of data transferred by the NumberFields app is relatively small, on the order of gigabytes, and Thunderbolt 3 is rated at tens of gigabits per second (a few GB/s). So if the Thunderbolt connection is working as advertised, it should add only a few seconds to the total run time.
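
To put rough numbers on that estimate, here is a minimal back-of-the-envelope sketch. The 2 GB data volume and the 50% efficiency figure are assumptions chosen for illustration, not measurements from the app; 40 Gb/s is the nominal Thunderbolt 3 link rate, and `transfer_seconds` is a hypothetical helper name.

```python
def transfer_seconds(data_gb: float, link_gbit_per_s: float, efficiency: float = 1.0) -> float:
    """Seconds needed to move data_gb gigabytes over a link rated in gigabits per second."""
    if link_gbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link rate must be positive and efficiency in (0, 1]")
    # 1 gigabyte = 8 gigabits; efficiency scales the usable share of the link rate.
    return data_gb * 8 / (link_gbit_per_s * efficiency)


# Assumed figures, for illustration only.
DATA_GB = 2.0           # hypothetical total traffic for one task
TB3_GBIT_PER_S = 40.0   # nominal Thunderbolt 3 link rate

print(f"ideal link:  {transfer_seconds(DATA_GB, TB3_GBIT_PER_S):.2f} s")
print(f"half usable: {transfer_seconds(DATA_GB, TB3_GBIT_PER_S, 0.5):.2f} s")
```

Even with only half of the nominal rate usable, a few gigabytes cost on the order of a second, which is small next to a run time measured in minutes.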

As Richard alluded to, problems like this are usually due to an overcommitted CPU. Recall, the NumberFields app uses a CPU core to buffer polynomials to be tested while the GPU works on a previous buffer of data. When the GPU finishes the current buffer, it hands the results back to the CPU and gets the next buffer - if the CPU doesn't have the next buffer ready then the GPU idles as it waits for it. To check this, I usually monitor the CPU during execution of a task to see how much it is being used. As an example, I just checked my 4070Ti and it is using about 50% of a CPU core - this is good, because it means the CPU is waiting half the time on the GPU. You want the CPU to wait on the GPU, not the other way around.

You mentioned you have almost full GPU utilization, which usually means the GPU is not idle. However, I wonder if that is being reported correctly. Sometimes the GPU fan or temperature readings give a better indication of whether the card is actually being fully utilized. I find it odd that you are at full utilization with a single thread. On my 4070Ti I need to run 2 threads simultaneously, or else my utilization is only 50% (that's because the CPU can't feed the GPU fast enough, so it idles half the time).
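
The relationship between CPU usage and GPU utilization described in this post can be approximated with a very simple steady-state model. This is a sketch under stated assumptions, not how the app measures anything: `steady_state` is a hypothetical helper, each instance is assumed to have its own CPU core, the GPU is assumed to be shared evenly, and transfer overhead is ignored.

```python
def steady_state(cpu_s: float, gpu_s: float, instances: int = 1) -> tuple[float, float]:
    """Return (cpu_core_usage, gpu_utilization) in steady state.

    cpu_s: CPU time to prepare one buffer (per instance).
    gpu_s: GPU time to process one buffer.
    Assumes one CPU core per instance and a single shared GPU.
    """
    # One cycle lasts as long as the slower side: the CPU preparing a buffer,
    # or the GPU working through one buffer from every instance.
    cycle = max(cpu_s, gpu_s * instances)
    cpu_usage = cpu_s / cycle
    gpu_util = gpu_s * instances / cycle
    return cpu_usage, gpu_util


# CPU twice as fast as the GPU: the CPU waits half the time, the GPU stays busy.
print(steady_state(cpu_s=1.0, gpu_s=2.0))
# CPU twice as slow as the GPU: one instance leaves the GPU idle half the time.
print(steady_state(cpu_s=2.0, gpu_s=1.0))
# Same hardware with two instances: the GPU is kept busy.
print(steady_state(cpu_s=2.0, gpu_s=1.0, instances=2))
```

In this model a CPU core pinned at 100% is the sign that the GPU is the one waiting, which is why running a second instance helps in that case.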
42) Message boards : News : Support for Intel GPUs (Message 3775)
Posted 17 Jan 2025 by Profile Eric Driver
Post:
Here we go:
https://numberfields.asu.edu/NumberFields/workunit.php?wuid=230312537
https://numberfields.asu.edu/NumberFields/result.php?resultid=246629207


Thanks for sharing!

The results themselves look good. My concern is the run times are much higher than expected. The Arc B580 should be similar to a 4060. My 3070 Ti averages about 10 minutes per WU, which should be comparable. Your run times are about 4 times larger than that.

Any idea what could be causing this? Are you running multiple GPU threads simultaneously? Is the card at 100% utilization? It's possible Intel's OpenCL driver is not efficient, but I doubt that can account for a factor of 4 slowdown.
43) Message boards : News : Support for Intel GPUs (Message 3770)
Posted 2 Jan 2025 by Profile Eric Driver
Post:
I am getting an Intel Arc B580 connected via a Thunderbolt eGPU enclosure this month, will report any result here when I receive the GPU and finish some WUs.


Sounds great! I look forward to your report.
44) Message boards : News : 2024 Year End Summary (Message 3768)
Posted 1 Jan 2025 by Profile Eric Driver
Post:
The main highlights this year were:
1. We spent the majority of the year on row 14 for the subfield 6 searches. Row 14 is now in the "essentially complete" phase, with only 14x12 left. Only 2 more rows to go!
2. Subfield 7, which is essentially complete, continues to be chipped away by Gerasim. No new fields were found, but with 7 of the 25 congruences completed, the evidence that we are indeed complete continues to mount.

Looking ahead to 2025, the main goal will be to complete row 15 (except last column) for subfield 6. This will get row 15 to the "essentially complete" phase.

Thanks everyone for your contributions and have a wonderful New Year!
45) Message boards : Number crunching : Production Moving Forward? (Message 3765)
Posted 23 Nov 2024 by Profile Eric Driver
Post:
If sf6 is all that remains, what is the estimated remaining project life? thanks


I would say we have at least 2 years left. We should be essentially complete by then but will need a couple more years to guarantee 100% completion.
46) Message boards : News : Future of the Project (Message 3763)
Posted 12 Nov 2024 by Profile Eric Driver
Post:
Is there any updates on this thread?


Much progress has been made - sf3 is officially complete, sf7 is essentially complete, and we are making good progress on sf6. Also, the server hardware seems to be doing just fine.
47) Message boards : Number crunching : MacOS Apple Silicon ARM support? (Message 3760)
Posted 7 Nov 2024 by Profile Eric Driver
Post:
When will this project get Apple Silicon ARM support?

Currently, it only supports MacOS Intel devices.


There was not much demand for the old Mac app, so I am hesitant to put the work into porting it.

But anybody who's looking for something to do is welcome to try porting it. Just let me know and I can point you to the GitHub page.
48) Message boards : Number crunching : Indefinite recycling of a Gerasim workunit. (Message 3757)
Posted 14 Sep 2024 by Profile Eric Driver
Post:
@SerVal
1. Ok, I will not add new tasks for now. But I don't understand the problem. Is that an internal Gerasim problem?
2. I don't use the WU batch field. It is always set to "0".
49) Message boards : Number crunching : Indefinite recycling of a Gerasim workunit. (Message 3753)
Posted 4 Sep 2024 by Profile Eric Driver
Post:
It looks like the WU input file got corrupted somehow. I will attempt to delete that WU from the Gerasim system.
50) Message boards : Number crunching : Something wrong with statsitics (Message 3749)
Posted 24 Aug 2024 by Profile Eric Driver
Post:
Hello.

There's something odd going on there. First: According to my account info my RAC is 94,247.19 and i am supposed to be in 5% top participants, but if I look at Statistics->Top participants I am not there between voicesurfr and ManuelJC. (If I sort table by Total credit and go to offset 220 I can find myself at position 232 with correct RAC and totals.)

But that's not all. If I have table sorted by RAC and go bit deeper, there are at offset 60 two users with badge "top 25%" while all others are with badge "top 5%".

So my question is, what's wrong (if anything), and is it possible to fix it?

Many thanks.


Yes, I have seen this before. It is related to caching of the web pages to reduce server load. Caching was set to several hours so I reduced it to several minutes. That appears to have fixed the first problem.

The second problem is because badges are only assigned once per day but credits are continuously being updated. So those two out-of-place entries of "top 25%" are from yesterday's badges, and they will get updated to "top 5%" later today when rankings are recomputed.
51) Message boards : Number crunching : Expired certificate (Message 3746)
Posted 19 Aug 2024 by Profile Eric Driver
Post:
The new certificate has been installed


Yes, uploads should work now. Let me know if you are still having problems.
52) Message boards : Number crunching : Expired certificate (Message 3743)
Posted 19 Aug 2024 by Profile Eric Driver
Post:
Still waiting to hear back on the ETA.

Is anyone aware of a client workaround to ignore expired certificates, in the same way the browser can ignore them? I know libcurl has the ability to do this, so I imagine you would need a special build of the client.
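
For reference, this is what "ignoring the certificate" looks like in general terms. In libcurl the relevant option is `CURLOPT_SSL_VERIFYPEER`; whether the BOINC client exposes any way to set it is not something this thread confirms. The sketch below is only a Python analogue using the standard `ssl` module, with `insecure_context` as a hypothetical helper name. It turns off all certificate checking, so it removes the protection TLS normally provides and is only reasonable as a temporary measure against a server you already trust.

```python
import ssl


def insecure_context() -> ssl.SSLContext:
    """Build a TLS context that accepts any certificate, including expired ones."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled first; otherwise setting CERT_NONE raises ValueError.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


ctx = insecure_context()
print(ctx.verify_mode)

# Possible usage (not executed here):
#   import urllib.request
#   urllib.request.urlopen("https://example.org/", context=ctx)
```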
53) Message boards : Number crunching : Expired certificate (Message 3740)
Posted 17 Aug 2024 by Profile Eric Driver
Post:
I emailed the IT department last night. They are still working the problem.
It was only a 3-month certificate, so I assume it was expected to auto-renew? Shouldn't be too hard, provided the supplier works weekends.


I heard back from IT. They are just waiting for the certificate to be issued. I'm not sure how the process works, but it sounds like we are waiting on a 3rd party supplier. They said it could take until Monday. Sorry for the delay...
54) Message boards : Number crunching : Expired certificate (Message 3738)
Posted 17 Aug 2024 by Profile Eric Driver
Post:
I emailed the IT department last night. They are still working the problem.
55) Message boards : News : GPU app - beta version for linux nvidia (Message 3731)
Posted 16 Aug 2024 by Profile Eric Driver
Post:
After all these years, we finally have our first GPU app. It's only a beta version for 64bit linux with Nvidia GPUs. Support for other platforms and GPUs will be coming soon.

If you'd like to help test this app, you will need to check the "run test applications" box on the project preferences page. I generated a special batch of work for this app from some older WUs that I have truth for. This will help to find any potential bugs that are still there.

A few potential issues:
1. This was built with the CUDA SDK version 10.1, so it uses a relatively new Nvidia driver version and only supports compute capability 3.0 and up. If this affects too many users out there, I will need to rebuild with an older SDK.
2. I was not able to build a fully static executable, but I did statically link the ones most likely to be a problem (i.e. pari, gmp, std c++)

Please report any problems. I am still relatively new to the whole GPU app process, so I am sure there will be issues of some kind.

Also, feel free to leave comments regarding what platform, GPU, etc I should concentrate on next. I was thinking I would attack linux OpenCL (i.e. ATI/AMD) next as that should be a quick port of what I did with Nvidia. I think the windows port will take much longer, since I normally use mingw to cross-compile but I don't think that's compatible with the nvidia compiler.


Eric, can you explain the part about only building part of the app as statically compiled. How much of application execution depends on the cpu? I am trying to understand why execution is so much slower on my Epyc hosts with RTX 3080 cards versus my Ryzen Zen 4 hosts with the same RTX 3080 cards as the Epyc hosts. All hosts run the same Ubuntu 24.04 distro with the same exact kernels. The Zen 4 hosts all have 32GB of memory and the Epyc hosts all have 128GB of memory.

The only difference in environments between these two cohorts of hosts is the 2X greater CPU clock speeds of the Ryzen 7950X hosts, which are running the CPU cores at ~5.0 GHz, versus the Epyc hosts, which are only running ~2.1-2.4 GHz clock speeds on their cores.
So my thinking is the execution of each CUDA30 task must be going out to the cpu for more than the initial data retrieval of the input file. The storage systems of all hosts are using identical M.2 Gen 4 SSD's and the storage speeds are pretty much identical so I don't think that has a major influence in task speed.

Can you comment on how much the type and speed of the host cpu has on the speed of execution of the CUDA30 app? Thanks.


You quoted the original post which is over 5 years old, and much has changed since then. But to answer your question...

The CPU works in parallel with the GPU. There are 2 buffers of polynomials for testing - as the GPU processes the one buffer, the CPU does backend processing on the other buffer and then queues it for the next pass. I have seen my CPUs use between 20% and 50% of a core, depending on the relative speeds of the CPU and GPU. You want your core usage to be below 100% otherwise that means the GPU is waiting on the CPU. One way to get around this problem is to run multiple instances on the GPU by configuring your app_config file. Also be careful with hyper-threading since that can use up all your available cores. On Linux just do a "top" command and see what the CPU usage is while it's running and modify your BOINC manager accordingly.
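
As an illustration of the app_config approach mentioned above, here is a small sketch that generates an `app_config.xml` asking BOINC to run two tasks per GPU (`gpu_usage` of 0.5) with one CPU core reserved for each. The element names follow the BOINC client configuration format; the app name is a placeholder that must be replaced with the project's real application name, and `build_app_config` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Return app_config.xml text for one app with the given GPU/CPU shares per task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage 0.5 means each task claims half a GPU, so two tasks run at once.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# "APP_NAME_PLACEHOLDER" is not a real app name; substitute the project's own.
print(build_app_config("APP_NAME_PLACEHOLDER", gpu_usage=0.5, cpu_usage=1.0))
```

The resulting file would go in the project's folder inside the BOINC data directory, after which the client needs to re-read its config files or be restarted.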

Another thing to keep in mind when comparing two systems is the speed of the RAM and the motherboard (i.e. address bus). Remember there is a lot of data being transferred back and forth between the CPU/GPU and RAM.
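
The CPU/GPU hand-off described in this post can be sketched with two threads and a bounded queue. This is only an illustration of the general pattern, not the NumberFields code: `run_pipeline`, `prepare`, and `process` are hypothetical names, and a plain Python function stands in for the GPU kernel.

```python
import queue
import threading


def run_pipeline(num_buffers, prepare, process):
    """Overlap buffer preparation (CPU side) with buffer processing (GPU side)."""
    # A bounded queue keeps the producer from running far ahead of the consumer.
    ready = queue.Queue(maxsize=1)
    results = []

    def cpu_producer():
        for i in range(num_buffers):
            ready.put(prepare(i))   # blocks while the consumer is still behind
        ready.put(None)             # sentinel: no more work

    def gpu_consumer():
        while True:
            buf = ready.get()       # blocks (the "GPU" idles) if the CPU is behind
            if buf is None:
                break
            results.append(process(buf))

    threads = [threading.Thread(target=cpu_producer),
               threading.Thread(target=gpu_consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Toy stand-ins: each "buffer" is four integers, and "processing" sums them.
out = run_pipeline(3, prepare=lambda i: list(range(i * 4, (i + 1) * 4)), process=sum)
print(out)  # [6, 22, 38]
```

Whichever side is slower sets the pace: if `prepare` takes longer than `process`, the consumer spends its time blocked in `get()`, which corresponds to the GPU waiting on the CPU.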
56) Message boards : Number crunching : member stat xml ... (Message 3728)
Posted 5 Jul 2024 by Profile Eric Driver
Post:
I think I fixed it. Can you try again?
57) Message boards : Number crunching : Production Moving Forward? (Message 3725)
Posted 1 Jul 2024 by Profile Eric Driver
Post:
I believe this was discussed previously in a sense but I thought I would bring up my two cents on how to approach the remaining tasks.

Judging by the amount of time it takes to complete a nx11 and nx12 set and basing on the premise that this project has a limited number of days (not in actuality but could happen if a major hardware failure takes place), would it not be prudent to run the nx7, nx8, nx9, and nx10 tasks first and then come back and do the nx11 and finally nx12? I feel as though it would ensure that the largest number of sets are complete should the unfortunate day come where the project goes offline for uncontrollable reasons as the admin warned us that the project is no longer supported by the university.

Again, I'm not telling anyone how to do their job, I'm just throwing my opinion out there and hoping to contribute as much as possible to mathematics through this project in the (potentially) limited time we have.

Keep on crunching!


Just to clarify, the project is supported by the university, but no new hardware will be approved. The SSD is relatively new, so we should be good for several more years, as the SSD is usually the first thing to go in a BOINC server.

The reasoning behind the ordering of the searches is driven by aspects of the algorithm. Congruences in the lower tiers are skipped if they are duplicated in the higher tier searches. For this reason, it is important to do the higher tier searches to guarantee everything has been covered for the entire row. In other words, it is more important to complete all columns in a row before moving on to the next row. I hope that helps explain the reasoning.


2 hopefully short questions then:
1) If the BOINC community paid for a hardware upgrade, would that be acceptable?

2) Can you please explain a little more basically to this non-math guy what you mean by "In other words, it is more important to complete all columns in a row before moving on to the next row"? I.e., VERY basically: if you have holes in the nx7 column, why does it make sense to finish the nx11 column first and then go back to the nx7 column? Are you saying we are closer to finishing nx11, and since nx7 has more holes, it's for later on? Is it too hard to do both at once, or is that beyond the capabilities due to the rules from ASU?


1.) Last I asked, the answer to that was no. I believe it was some kind of university policy.

2.) Rows and columns refer to the 2D batch status. The MxN search is basically the same as the Mx(N+1) search, but the Mx(N+1) has a larger discriminant bound so it finds more but also takes longer. Any shared congruences are redundant, so we remove those from the smaller discriminant case. So completing Mx(N+1) guarantees we have also completed MxN. The same principle applies to the other dimension of the matrix, so completing (M+1)xN is important in order to guarantee MxN is complete.
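
The redundancy argument can be illustrated with plain sets. This is a toy sketch, not the project's search code: the congruence labels are made up, `remaining_lower_tier` and `lower_tier_covered` are hypothetical helpers, and the discriminant bounds are not modeled at all.

```python
def remaining_lower_tier(lower: set, higher: set) -> set:
    """Congruences the smaller search still has to run once shared ones are removed."""
    return lower - higher


def lower_tier_covered(lower: set, done_lower: set, done_higher: set) -> bool:
    """True if every congruence of the smaller search was handled in one tier or the other."""
    return lower <= (done_lower | done_higher)


# Made-up congruence labels for an MxN search and its Mx(N+1) neighbour.
m_by_n = {"c1", "c2", "c3"}
m_by_n_plus_1 = {"c2", "c3", "c4", "c5"}

todo = remaining_lower_tier(m_by_n, m_by_n_plus_1)
print(todo)  # {'c1'}: c2 and c3 are left to the larger search

# With the larger search unfinished, MxN cannot be declared complete.
print(lower_tier_covered(m_by_n, done_lower=todo, done_higher=set()))
# Once the larger search is done, MxN is covered as well.
print(lower_tier_covered(m_by_n, done_lower=todo, done_higher=m_by_n_plus_1))
```

In this picture the smaller search on its own never touches the shared congruences, so its completeness depends on the larger search being finished.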

I hope that answers the question and is not too long winded.
58) Message boards : Number crunching : Production Moving Forward? (Message 3722)
Posted 28 Jun 2024 by Profile Eric Driver
Post:
I believe this was discussed previously in a sense but I thought I would bring up my two cents on how to approach the remaining tasks.

Judging by the amount of time it takes to complete a nx11 and nx12 set and basing on the premise that this project has a limited number of days (not in actuality but could happen if a major hardware failure takes place), would it not be prudent to run the nx7, nx8, nx9, and nx10 tasks first and then come back and do the nx11 and finally nx12? I feel as though it would ensure that the largest number of sets are complete should the unfortunate day come where the project goes offline for uncontrollable reasons as the admin warned us that the project is no longer supported by the university.

Again, I'm not telling anyone how to do their job, I'm just throwing my opinion out there and hoping to contribute as much as possible to mathematics through this project in the (potentially) limited time we have.

Keep on crunching!


Just to clarify, the project is supported by the university, but no new hardware will be approved. The SSD is relatively new, so we should be good for several more years, as the SSD is usually the first thing to go in a BOINC server.

The reasoning behind the ordering of the searches is driven by aspects of the algorithm. Congruences in the lower tiers are skipped if they are duplicated in the higher tier searches. For this reason, it is important to do the higher tier searches to guarantee everything has been covered for the entire row. In other words, it is more important to complete all columns in a row before moving on to the next row. I hope that helps explain the reasoning.
59) Message boards : Number crunching : How can i earn septic count and ℚ(√-10) count? (Message 3718)
Posted 31 May 2024 by Profile Eric Driver
Post:
I'll be dropping another 20k work units within the next hour or two.
60) Message boards : Number crunching : How can i earn septic count and ℚ(√-10) count? (Message 3715)
Posted 22 May 2024 by Profile Eric Driver
Post:
ok. please can you post here when the time comes? thanks


Just a heads up, I will be dropping about 30k WUs for sf7 within the next day.


They just dropped. Should be seeing them shortly.




Copyright © 2025 Arizona State University