Posts by Stephen Uitti

1) Message boards : News : New subfield search (Message 2062)
Posted 25 May 2018 by Profile Stephen Uitti
Post:
The RakeSearch project (running since the fall of 2017, and officially announced in December 2017) has optimized apps that improve speed by 8x to 10x over the previous versions. The new apps use SSE2 and (where available) AVX instructions. There are also new versions that work on Arm computers - they get roughly the performance of the prior x86 implementations on, for example, Raspberry Pi 2 and 3 boards, hardware that is easily 8x slower than a reference x86 machine. This isn't the same as going to a GPU (except, arguably, in the case of the Arm), but it is a step in that direction.

SSE2 is the x86 SIMD (Single Instruction Multiple Data) vector instruction set that added double precision support; AVX widens it. It's limited, but SIMD is essentially how modern GPUs work. GPUs are expected to become MIMD (Multiple Instruction Multiple Data - much like the independent cores of a multicore x86) in the next decade or less. That should make it easier to toss an app at your video card. I say easier, because you'll still have bandwidth issues at multiple levels to deal with to get good performance. Perhaps a good optimizing compiler could do much of that work for you.

The RakeSearch Arm versions are very impressive to me, since you mostly have to deal with these bandwidth issues manually on that platform, and word has it, it's a nightmare. Many would call it "impossible", which for others simply means it's a bunch of hard work.
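
To illustrate what SIMD buys you, here's a minimal sketch in Julia (the language I talk about in the post below) - a toy example of mine, not RakeSearch code. The @simd macro lets the compiler pack several array elements into one SSE2/AVX instruction:

    # Sum a vector; the compiler may use packed (vector) adds for the loop.
    function vsum(x::Vector{Float64})
        s = 0.0
        @inbounds @simd for i in 1:length(x)
            s += x[i]
        end
        return s
    end

On an x86 machine, @code_native vsum(rand(1000)) shows whether vector instructions were actually emitted.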
2) Message boards : News : New subfield search (Message 1822)
Posted 5 Mar 2017 by Profile Stephen Uitti
Post:
I've recently been looking at a new-ish computer language: Julia. It has a pretty easy mathematical syntax (3 * a in Fortran can be written 3a in Julia). It also has syntax for parallel processing and for vector processing. Parallel can mean co-routines on a single core (helping I/O and CPU overlap), multi-core shared memory, or multiple systems on a network. Vector processing can use the vector instructions the processor (especially x86) has, a video card (Nvidia or Radeon), or on-chip video (like AMD's APU, an Arm's DSP, or whatever Intel calls their on-chip video). The parallel syntax is fairly non-invasive, and mainly gives the language hints that you're not going to go wild with pointer references, etc.

By new, I mean that the current release is 0.5. While the language has garbage collection and Just-In-Time compiling (via LLVM), it can also compile down to the bare metal, and has, as a goal, near-C performance. The C language has been the gold standard for performance since the 80s. Julia can also call other languages like C, or be called by them, so perhaps there's no need to recode any BOINC interfaces. Anyway, I'm looking at it specifically because it's a higher level language, and it looks like it can produce video card code without having to learn or code in video card instructions. Perhaps a good fit for NumberFields.
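
For a flavor of the syntax, here's a minimal sketch (my own illustration, nothing to do with the actual NumberFields code). The 3a shorthand is implicit multiplication, and Threads.@threads spreads a loop across CPU cores:

    using Base.Threads

    a = 2.5
    y = 3a                        # same as 3 * a

    # Multiply every element by 3, one chunk of the loop per thread.
    function scale!(out::Vector{Float64}, x::Vector{Float64})
        @threads for i in 1:length(x)
            @inbounds out[i] = 3 * x[i]
        end
        return out
    end

    x = rand(100_000)
    scale!(similar(x), x)

Start Julia with multiple threads (for example, julia -t 4 in current releases) to see the speedup; the code itself is unchanged either way.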

So far, I've read the 645 page manual (available online or as a PDF) and have written a couple of tens-of-lines programs and gotten correct results from them. One of these is the Collatz core (which is really simple). Julia has easy arbitrary precision arithmetic support (BigInt is just another integer type), which helps here.
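
The core is along these lines - a minimal sketch, not my exact program:

    # Count Collatz steps; big(n) switches to BigInt so 3n + 1 never overflows.
    function collatz_steps(n0::Integer)
        n = big(n0)
        steps = 0
        while n != 1
            n = iseven(n) ? div(n, 2) : 3n + 1
            steps += 1
        end
        return steps
    end

    collatz_steps(27)    # => 111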

http://julialang.org/

I personally have _no idea_ what math NumberFields uses. I'm not totally ignorant about discrete math, but I mostly studied continuous Calculus stuff. Perhaps there are users who would be interested in helping out. That could speed things up.

Stephen.




