GPU status update
It's been over a month since our last update, but I now have some good news. I have made some improvements to the GPU code and am ready to start deploying the new GPU apps.
I will start with the AMD OpenCL version for Linux. This will be a beta version. I have had a hell of a time with the AMD implementation of OpenCL, and this app still doesn't work on my Fedora system; I strongly believe it's due to the graphics driver. But I have had the help of a volunteer named Wiktor, and it runs fine for him (I believe he runs Ubuntu). Please keep in mind that AMD officially supports only RHEL and Ubuntu, so I will be interested to hear whether this app works for anyone with an "unsupported" Linux distro like myself.
I also have OpenCL Windows apps that were cross-compiled using MinGW. I have no means of testing these, so I am not ready to deploy them just yet. But if anyone would like to take them for a spin offline, please let me know and I can send them to you.
15 May 2019, 22:05:56 UTC
Subfield 5 entering final phase
Subfield 5 is now on its final batch (16x12).
This is the first batch of WUs generated since the app was optimized. As a result, run times will be going up. These WUs average 2-hour run times on my AMD Ryzen 2990WX. Note there are 1.6 million of them; with the unoptimized app it would have required 16 million, so this is a huge improvement.
I will be mixing in some subfield 6 WUs with this batch. These were generated a while ago for the unoptimized app, so they will run very quickly by comparison. If I were to run them as a separate batch, it might put too much strain on the server, so I figured this is a good way to get them done (without having to re-generate them, which can be tedious).
4 May 2019, 18:25:13 UTC
GPU app status update
So there have been some new developments over the last week. It's both good and bad.
First of all, some history. The reason I waited so long to develop a GPU app is that the calculation was heavily dependent on multi-precision libraries (GMP) and number-theoretic libraries (PARI/GP). Both of these use dynamically allocated memory, which is a big no-no on GPUs. I found a multi-precision library online that I could use by hard-coding the precision to the maximum required (about 750 bits), thereby removing the dependence on memory allocations. The next piece of the puzzle was to code up a polynomial discriminant function. After doing this, I could finally compile a kernel for the GPU. That is the history of the current GPU app. It is about 20 to 30 times faster than the current CPU version (depending on the WU and CPU/GPU speeds).
But then I got to thinking... my GPU polynomial discriminant algorithm is different from the one in the PARI library (theirs works for any degree, while mine is specialized to degree 10). So to do a true apples-to-apples comparison, I replaced the PARI algorithm with mine in the CPU version of the code. I was shocked by what I found... the CPU version was now about 10x faster than it used to be. I never thought I was capable of writing an algorithm that would be 10x faster than a well-established library function. WTF? Now I'm kicking myself for not having done this sooner!
This brings mixed emotions. On one hand, it is great that I now have a CPU version that is 10x faster. On the other hand, it means that my GPU code is total crap. With all the horsepower in a present-day GPU, I would expect it to be at least 10x faster than the equivalent CPU version. Compared with the new CPU version, the GPU is only 2 to 3 times faster. That is unacceptable.
So the new plan is as follows:
1. Deploy new CPU executables. Since the app is 10x faster, I will need to drop the credit per WU by a factor of 10. (Credits/hour will remain the same for the CPU but will obviously drop for the GPU.)
2. Develop new and improved GPU kernels.
I don't blame the GPU users for jumping ship at this point. Frankly, the inefficiency of the current GPU app just makes it not worth it (for them or the project).
For what it's worth, I did have OpenCL versions built. The Nvidia version works perfectly. The AMD version is buggy for some reason, as is the Windows version. Since I will be changing the kernels anyway, there is no point in debugging them yet.
5 Apr 2019, 23:40:01 UTC
GPU app - beta version for linux nvidia
After all these years, we finally have our first GPU app. It's only a beta version for 64-bit Linux with Nvidia GPUs. Support for other platforms and GPUs will be coming soon.
If you'd like to help test this app, you will need to check the "run test applications" box on the project preferences page. I generated a special batch of work for this app from some older WUs for which I already have known results. This will help flush out any potential bugs that are still there.
A few potential issues:
1. This was built with CUDA SDK version 10.1, so it requires a relatively new Nvidia driver and only supports compute capability 3.0 and up. If this affects too many users out there, I will need to rebuild with an older SDK.
2. I was not able to build a fully static executable, but I did statically link the libraries most likely to be a problem (i.e. PARI, GMP, and the C++ standard library).
Please report any problems. I am still relatively new to the whole GPU app process, so I am sure there will be issues of some kind.
Also, feel free to leave comments regarding which platform, GPU, etc. I should concentrate on next. I was thinking I would attack Linux OpenCL (i.e. ATI/AMD) next, as that should be a quick port of what I did with Nvidia. I think the Windows port will take much longer, since I normally use MinGW to cross-compile, but I don't think that's compatible with the Nvidia compiler.
22 Mar 2019, 18:53:12 UTC
Septic Results Paper Accepted for Publication
I mentioned a while back that we submitted the results of the septic search for publication. I am now happy to report that the paper has been accepted for publication in the Journal of Number Theory.
I will continue to post additional details as they become available.
Thank you to all our volunteers for making this possible!
12 Mar 2019, 16:23:33 UTC