[New release] BRP app v1.23/1.24 (OpenCL) feedback thread

Bikeman (Heinz-Bernd Eggenstein)
Joined: 28 Aug 06
Posts: 164
Credit: 1864017
RAC: 0

RE: For me the new app

Message 79208 in response to message 79206

Quote:
For me the new app takes a full CPU core when it is running. Is that by intention?

Hi!

This depends on the driver version you are using; e.g., you can see from the screenshots posted here that this is not generally the case for this app.

Cheers
HB

Christoph
Joined: 25 Aug 05
Posts: 30
Credit: 208211
RAC: 0

RE: RE: For me the new

Message 79209 in response to message 79208

Quote:
Quote:
For me the new app takes a full CPU core when it is running. Is that by intention?

Hi!

This depends on the driver version you are using; e.g., you can see from the screenshots posted here that this is not generally the case for this app.

Cheers
HB


I was not exact enough in my statement. BOINC reserves a full core ('1 CPUs + 1 ATI GPU'), and as I understand it that should not be the case.

Here is one log snippet: 03.05.2012 22:25:33 | Albert@Home | [rr_sim_detail] 339385.57: starting p2030.20110421.G41.06+00.53.N.b6s0g0.00000_1448_2 (1.00 CPU + 1.00 ATI)

Oh, and the driver version is 12.3; as far as I have read, the high CPU usage is solved there.

Christoph

Bikeman (Heinz-Bernd Eggenstein)
Joined: 28 Aug 06
Posts: 164
Credit: 1864017
RAC: 0

RE: So, my questions

Message 79210 in response to message 79203

Quote:

So, my questions are:
* What is checkpointing? An intermediate state (variables) that is saved so that you don't have to start over if the calculations get interrupted?


Exactly

Quote:

* Is the atiOpenCl app checkpointing more? Or is it that the two apps are doing the same amount of work (calcs), and it's just that the CUDA app/GTX 560 is doing more work per unit time and therefore only needs to checkpoint 5 times vs. my 20?


By default, all apps checkpoint every 60 seconds. All workunits do the same amount of work, whether they get picked up by a CPU, an NVIDIA GPU or an ATI GPU, but the faster the processing is, the fewer checkpoints happen during the execution time.
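
To put numbers on this, here is a minimal sketch in Python. It only assumes the 60-second default interval mentioned above; the two run times (300 s and 1200 s) are made-up values chosen to reproduce the 5 vs. 20 checkpoints from the question, and `expected_checkpoints` is a hypothetical helper, not part of the app.

```python
CHECKPOINT_INTERVAL_S = 60  # default interval stated above


def expected_checkpoints(runtime_s: float,
                         interval_s: float = CHECKPOINT_INTERVAL_S) -> int:
    """Number of checkpoints written during a run of the given length."""
    # One checkpoint per full interval that elapses during the run.
    return int(runtime_s // interval_s)


# Hypothetical run times for the same workunit on a faster and a slower card.
fast_run_s = 300
slow_run_s = 1200

print(expected_checkpoints(fast_run_s))
print(expected_checkpoints(slow_run_s))
```

Under these assumptions, a 4x difference in checkpoint count simply mirrors a roughly 4x difference in run time; neither app does more checkpointing per unit of time.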

Quote:

* Is the GTX 560/CUDA app really 4x (20/5 = 4) faster than the HD6950/AtiOpenCl? The 6950 shows 2253 SP GFLOPS vs. the GTX 560's 1088.6 SP GFLOPS.
http://en.wikipedia.org/wiki/Comparison_of_AMD_graphics_processing_units
http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units

Well, the FLOPS numbers are just theoretical peak performance and not too meaningful. But anyway, it's fair to say that our CUDA app and the libraries we used with it are more mature and better optimized than our OpenCL app. We have some ideas in the pipeline for how to further improve the OpenCL app, and hopefully we can implement them in a timeframe of weeks rather than months. Stay tuned.

Quote:


To semi-answer that, GPU Time indicates a 2.503x increase for the GTX560/CUDA vs. the AtiOpenCl/HD6950. The CPU time for the CUDA app is, however, 4.24x less than that of the OpenCl app. Anandtech Bench shows the 2500k to be slightly better than my AMD 975BE in single-threaded, multi-threaded, and total MIPS (7-Zip test), but nothing earth-shattering.
http://www.anandtech.com/bench/Product/288?vs=435

I know you said before that the OpenCl app uses way more CPU than the CUDA app. Perhaps the OpenCl standard is still immature, AMD has crappy drivers, or a mix of both? Regardless, I really commend everyone's efforts. Having done a fair bit of coding myself, I know what a pain this can all be.

Indeed :-). I don't think one should blame the OpenCL standard; it's not different from CUDA anyway. It's the implementation of the standard and the drivers that are causing some trouble. Neither AMD nor NVIDIA seems too enthusiastic about OpenCL anymore, I'm afraid.

CU
HB

Bikeman (Heinz-Bernd Eggenstein)
Joined: 28 Aug 06
Posts: 164
Credit: 1864017
RAC: 0

RE: RE: RE: For me the

Message 79211 in response to message 79209

Quote:
Quote:
Quote:
For me the new app takes a full CPU core when it is running. Is that by intention?

Hi!

This depends on the driver version you are using; e.g., you can see from the screenshots posted here that this is not generally the case for this app.

Cheers
HB


I was not exact enough in my statement. BOINC reserves a full core ('1 CPUs + 1 ATI GPU'), and as I understand it that should not be the case.

Here is one log snippet: 03.05.2012 22:25:33 | Albert@Home | [rr_sim_detail] 339385.57: starting p2030.20110421.G41.06+00.53.N.b6s0g0.00000_1448_2 (1.00 CPU + 1.00 ATI)

Oh, and the driver version is 12.3; as far as I have read, the high CPU usage is solved there.

Yup, the allocation of a full core by BOINC is configured on the server side. This is to prevent CPUs from getting overcommitted for users with older drivers, where the driver does indeed take a full CPU. It's a conservative choice. We will look into how to handle this when we go live with the app; e.g., we could make the CPU allocation dependent on the driver version, as we once did for NVIDIA, where a similar driver problem existed under Linux, iirc.
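
A driver-dependent allocation could look roughly like the following Python sketch. Everything in it is an assumption for illustration: the 12.3 threshold is taken from Christoph's remark that the high CPU usage is reportedly solved in that driver version, the 0.2 fraction is just a placeholder, and `cpu_reservation` is a hypothetical helper, not actual BOINC server code.

```python
from typing import Tuple

# Assumption: first driver version in which the high CPU usage is gone.
FIXED_DRIVER: Tuple[int, ...] = (12, 3)
FULL_CORE = 1.0         # conservative reservation for older drivers
REDUCED_FRACTION = 0.2  # placeholder value, not a project setting


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '12.3' into (12, 3)."""
    return tuple(int(part) for part in text.split("."))


def cpu_reservation(driver_version: str) -> float:
    """CPU fraction to reserve for one GPU task, given the driver version."""
    # Compare numerically, so that '12.10' counts as newer than '12.3'.
    if parse_version(driver_version) >= FIXED_DRIVER:
        return REDUCED_FRACTION
    return FULL_CORE


print(cpu_reservation("12.1"))
print(cpu_reservation("12.3"))
```

Comparing the version as a tuple of integers rather than as a string avoids treating "12.10" as older than "12.3".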

Cheers
HB

Christoph
Joined: 25 Aug 05
Posts: 30
Credit: 208211
RAC: 0

Ah, ok. Looks like I missed

Message 79212 in response to message 79211

Ah, ok. Looks like I missed that detail somehow. So I can stop scratching my head.

Christoph

Christoph
Joined: 25 Aug 05
Posts: 30
Credit: 208211
RAC: 0

I got one computation error.

I got one computation error. It probably has something to do with my reboot yesterday.
I also had one with WCG, which was the reason for the reboot.

http://albertathome.org/task/201372

Christoph

Bikeman (Heinz-Bernd Eggenstein)
Joined: 28 Aug 06
Posts: 164
Credit: 1864017
RAC: 0

RE: I got one computation

Message 79214 in response to message 79213

Quote:

I got one computation error. It probably has something to do with my reboot yesterday.
I also had one with WCG, which was the reason for the reboot.

http://albertathome.org/task/201372

Hi!

Thanks very much for reporting this. It might actually point to a real problem in the code that affects only some cards, namely those with certain restrictions that the app has to take into account when generating work for the GPU.

Stay tuned, I hope I can install a fix tomorrow.

HB

Infusioned
Joined: 11 Feb 05
Posts: 38
Credit: 149000
RAC: 0

Two more errors

Message 79215 in response to message 79214

Two more errors with:

Incorrect function. (0x1) - exit code 1 (0x1)

http://albertathome.org/task/199760
http://albertathome.org/task/199762

I just read through the 7.0.27 change log, and there is some stuff about trying to address this error. I installed 7.0.27; I'll see if this helps.

astro-marwil
Joined: 28 May 05
Posts: 4
Credit: 1633
RAC: 0

I'm new here since yesterday

I'm new here as of yesterday afternoon.
The change from BOINC 7.0.25 to .26 was straightforward, except that it took astonishingly long (some minutes?) until my existing tasks were running again.
Setting up AaH in my BOINC was more complicated, as AaH is not included in the list of projects in BOINC Manager/Tools/Add a project or project manager. I helped myself by clicking on EaH and replacing "Einstein" with "Albert" in the URL. It took a while to find this way. Why isn't AaH included in the list of projects? In the AaH preferences I set the GPU utilization factor to 0.5, with BRP4 checked and S6LV1 unchecked.
I was somewhat astonished to find in the task log AaH running with (1 CPU + 0.5 NVIDIA GPU) and 1 CPU waiting to run S6LV1, whereas BRP4 tasks from EaH run with (0.2 CPU + 0.5 NVIDIA GPU) and all 4 CPUs crunching S6LV1 tasks. The CPU load was reduced to about 90%, whereas before I always had 100% load. While AaH was running, the desktop was very sluggish; most of the time I had to wait some seconds before any action could be performed. This was also the case during the phases when the AaH task was waiting. The desktop was no longer sluggish once the AaH project was suspended. This is a very uncomfortable way of operating.
So it ran for quite a while, but about 15 minutes before the AaH task finished, I found that within the last interval of almost exactly 20 minutes, 3 of the running BRP4 tasks from EaH had been marked as "Error while computing", all with exit code 1002. The AaH task itself ended fine. This morning it was validated by an ATI card running under Linux on an Intel CPU. The running time is about a factor of 3 longer, whereas the CPU time is comparable (AaH vs. EaH).

Because of the various reported malfunctions I suspended AaH. It's nice that the task was validated, especially as the counterpart was of a very different type. It shows you are on a very good path, and when the next version is available, I will try again.

Kind regards
Martin

tullio
Joined: 22 Jan 05
Posts: 53
Credit: 137342
RAC: 0

Albert@home runs well on my

Albert@home runs well on my Linux box; all results are validated. I have no GPU. I got a validation error on Einstein@home, on a Gamma-ray pulsar search unit.
Tullio
