Hi
Once the jobs had been loaded and a couple were aborted, I disabled requesting new jobs, changed and waited for the remaining jobs to complete.
It looks like this is no longer required, because the jobs I got today are processing OK, so it seems to be fixed.
Christophe
WUID 47277, run time: 29,286.40 seconds.
WUID 46805, run time: 39,079.57 seconds.
WUID 46559, run time: 4,538.00 seconds.
47277 has this:
46805 has this:
And from there on in, they slow down. 46559 ran from start to finish without exception handling (aka a break), and as such it ran in 'normal' time.
Now, the troubling thing is that it doesn't do this with all tasks. WUID 47791 has a run time of 6,306.80 seconds, yet it also has this:
That was a BOINC exit & restart. The other two were stops of the task itself while BOINC continued running.
I have an invalid.
http://albertathome.org/task/140891
Oh, and the long runtimes which Ageless has are normal to me. I will set NNT to other projects to see if the times go down when my tasks run in one go.
Christoph
RE: All 1.22 wu's are
Looks like the client was at fault. I upgraded to 7.0.23 and now have a WU in progress.
I have been away for a bit due to my motherboard dying, and once I was back up and running with the rebuild I waited for 7.0.25 to go live so I could run Milkyway with Albert without using a beta version for a live project.
So, poking through my WU times, I am hovering at ~5,900 GPU seconds and ~3,200 CPU seconds per WU.
AMD Phenom II x4 975 (couldn't find an 1100T for non-ripoff prices and am waiting for Piledriver [not happy with Bulldozer])
AMD HD 6950
8G DDR3 1600
Win 7 x64
Boinc 7.0.25
This WU vs. an i7 2600k Sandy Bridge/550Ti shows the 2600k coming in at 1/3 the time of my CPU. However, Anandtech Bench does not show the 2600k as 66% faster. Also, Wikipedia shows the AMD HD 6950 SP GFLOPS at 2253 and the NVIDIA GTX 550Ti SP GFLOPS at 691.2, but the 550Ti time is 2/3 of mine.
So, my question is, what gives? Is the OpenCL app that unoptimized compared to the CUDA app?
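For anyone who wants to redo the arithmetic, here is a rough sketch in Python using only the figures quoted in this post. It assumes, purely for illustration, that run time would scale inversely with peak single-precision GFLOPS, which real applications rarely achieve; the function and variable names are made up for this example.

```python
def time_ratio_to_percent_faster(time_ratio):
    """Convert 'B needs time_ratio of A's time' into 'B is X% faster than A'."""
    return (1.0 / time_ratio - 1.0) * 100.0

# Peak single-precision figures as quoted from Wikipedia in the post.
HD6950_GFLOPS = 2253.0
GTX550TI_GFLOPS = 691.2

# Observed: the 550Ti finishes in about 2/3 of the HD 6950's GPU time.
observed_550ti_over_6950 = 2.0 / 3.0

# Naive expectation (assumption: time is inversely proportional to peak GFLOPS):
# the 550Ti should need this many times the HD 6950's run time.
expected_550ti_over_6950 = HD6950_GFLOPS / GTX550TI_GFLOPS

# Factor by which the HD 6950 falls short of that naive expectation.
shortfall = expected_550ti_over_6950 / observed_550ti_over_6950

print(f"CPU at 1/3 the time = {time_ratio_to_percent_faster(1.0 / 3.0):.0f}% faster")
print(f"Expected 550Ti/6950 time ratio: {expected_550ti_over_6950:.2f}")
print(f"Observed 550Ti/6950 time ratio: {observed_550ti_over_6950:.2f}")
print(f"Shortfall vs. peak-GFLOPS scaling: {shortfall:.1f}x")
```

Note that finishing in 1/3 of the time is not the same thing as being 66% faster; it means roughly three times the throughput, so it is worth being careful which of the two is compared against benchmark charts.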
I've seen a message elsewhere saying that OpenCL workunits tend to need much more CPU use than running similar workunits using CUDA. This implies that slow CPUs will slow down OpenCL workunits much more than they slow down CUDA workunits.
Quote:
This implies that slow CPUs will slow down OpenCL workunits much more than they slow down CUDA workunits.
Understood. However, that's why I checked Anandtech's benchmarks to see just how much faster the 2600k was than my cpu. The benchmarks do not reflect a 66% performance difference so there is something else going on.
Also, unless I read the charts wrong, comparing the GFLOPS between the two video cards, theoretically the 6950 should smoke the 550Ti in SP output (2253 vs. 691.2).
So, back to my original question, is the OpenCL app that unoptimized compared to the CUDA app?
Here is a WU from Seti@Home Beta's OpenCL application:
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=3973426
I am 58327 and someone with a GTX 590 GPU, Intel 2600k CPU, Cuda OpenCL client is 56759.
My CPU seconds are 1463 and theirs are 2061.
My GPU seconds are 3244 and theirs are 2198.
My CPU time is actually lower (about 71% of the 2600k's), but my GPU time is ~150% of the GTX 590's (which, again, is curious given the GFLOP numbers).
My conclusion from all this, then, is that the Albert AMD OpenCL application isn't quite as optimized as the Albert CUDA application. Can anyone confirm/deny?
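As a quick check of those percentages, here is a small Python sketch that recomputes the ratios from the CPU and GPU seconds quoted above. The dictionary keys and the helper name are just illustrative, and the two IDs are copied as given in the post.

```python
# Times in seconds, as quoted in the post for the same Seti@Home Beta WU.
mine = {"cpu_s": 1463, "gpu_s": 3244}     # result listed as 58327 (HD 6950)
theirs = {"cpu_s": 2061, "gpu_s": 2198}   # result listed as 56759 (GTX 590)

def percent_of(a, b):
    """Return a expressed as a percentage of b."""
    return 100.0 * a / b

cpu_pct = percent_of(mine["cpu_s"], theirs["cpu_s"])
gpu_pct = percent_of(mine["gpu_s"], theirs["gpu_s"])

print(f"My CPU time is {cpu_pct:.0f}% of theirs")
print(f"My GPU time is {gpu_pct:.0f}% of theirs")
```

This only compares one workunit on each side, so it is a sanity check on the arithmetic rather than a benchmark.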
Hmm, the OpenCL app uses a full CPU core to work.
Is there any way to lower that usage?
Quote:
Hmm, the OpenCL app uses a full CPU core to work.
Is there any way to lower that usage?
Hi!
In terms of CPU usage, the OpenCL app should in theory be comparable to the NVIDIA/CUDA app, but we have seen huge differences in CPU usage with different driver versions from ATI. So the only advice I can give now is to try different drivers, sorry. Please let us know any results for your card (e.g. which driver worked better wrt CPU usage).
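From the previous message:
Quote:
My conclusion from all this, then, is that the Albert AMD OpenCL application isn't quite as optimized as the Albert CUDA application. Can anyone confirm/deny?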
It's fair to say that the CUDA app is more optimized for NVIDIA cards than the OpenCL app is for ATI cards, yes. There are several reasons for this:
* OpenCL is a multi-vendor platform while CUDA is NVIDIA only. If you write OpenCL code you want to keep the vendor-independence. It would be great if we could have just one code base; it remains to be seen whether this will be realistic without too much impact on performance on either platform.
* The OpenCL app for the pulsar search is a port of the CUDA app, which came out first of course, so it's not specifically tuned to the strengths of ATI cards... yet.
* The first priority is, needless to say, to get the app to a point where it runs on all our target platforms (OSX, Linux, Windows) and produces scientifically sound results that cross-validate with the CUDA and CPU apps. As has been mentioned elsewhere, the level of support (tools, libraries, bugfixing, drivers...) is certainly more mature for CUDA/NVIDIA than for OpenCL/ATI, so almost all our efforts currently have to be directed into "making it work at all" and less can be spent on "optimizing".
On the other hand, the ATI cards are, without question, fine pieces of hardware! So I'm quite optimistic that even the first OpenCL app to go into production on E@H will have a decent performance/Watt ratio.
Stay tuned and thanks for helping us test the thing here on Albert@Home!
HBE