[New release] BRP app v1.23/1.24 (OpenCL) feedback thread

Bikeman (Heinz-...

Joined: 28 Aug 06

Posts: 164

Credit: 1864017

RAC: 0

RE: For me the new app

3 May 2012 16:24:11 UTC

Message 79208 in response to message 79206


Quote:

For me the new app takes a full CPU core when it is running. Is that by intention?

Hi!

This will depend on the driver version you are using; e.g., you can see from the screenshots posted here that this is not in general the case for this app.

Cheers
HB

Christoph

Joined: 25 Aug 05

Posts: 30

Credit: 208211

RAC: 0

RE: RE: For me the new

3 May 2012 20:25:57 UTC

Message 79209 in response to message 79208


Quote:

Quote:
For me the new app takes a full CPU core when it is running. Is that by intention?

Hi!

This will depend on the driver version you are using; e.g., you can see from the screenshots posted here that this is not in general the case for this app.

Cheers
HB

I was not exact enough in my statement. BOINC reserves a full core ('1 CPUs + 1 ATI GPU'), and as I understand it that should not be the case.

Here is one log snippet: 03.05.2012 22:25:33 | Albert@Home | [rr_sim_detail] 339385.57: starting p2030.20110421.G41.06+00.53.N.b6s0g0.00000_1448_2 (1.00 CPU + 1.00 ATI)
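For anyone who wants to check the reserved resources across many log lines, the following is a minimal sketch that pulls the CPU and GPU shares out of a line like the one above. The regular expression is an assumption based only on this single line and on the "(1 CPU + 0.5 NVIDIA GPU)" form mentioned elsewhere in the thread; `parse_resources` is a hypothetical helper, not part of BOINC.

```python
import re

# Log line copied from the post above.
LINE = (
    "03.05.2012 22:25:33 | Albert@Home | [rr_sim_detail] 339385.57: starting "
    "p2030.20110421.G41.06+00.53.N.b6s0g0.00000_1448_2 (1.00 CPU + 1.00 ATI)"
)

# Assumed shape of the trailing resource note: "(<cpus> CPU + <gpus> <gpu type>)".
RESOURCES = re.compile(
    r"\((?P<cpus>\d+(?:\.\d+)?) CPUs? \+ "
    r"(?P<gpus>\d+(?:\.\d+)?) (?P<gpu_type>[\w ]+?)\)\s*$"
)


def parse_resources(line):
    """Return (cpus, gpus, gpu_type) from a log line, or None if absent."""
    match = RESOURCES.search(line)
    if match is None:
        return None
    return (
        float(match.group("cpus")),
        float(match.group("gpus")),
        match.group("gpu_type"),
    )


if __name__ == "__main__":
    print(parse_resources(LINE))
```

A result with a CPU share of 1.0 means the client has set aside a whole core for the task, which is separate from how much CPU time the app actually consumes.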

Oh, and the driver version is 12.3; as far as I have read, the high CPU usage is solved in that version.

Christoph

Bikeman (Heinz-...

Joined: 28 Aug 06

Posts: 164

Credit: 1864017

RAC: 0

RE: So, my questions

3 May 2012 20:27:55 UTC

Message 79210 in response to message 79203


Quote:

So, my questions are:
* What is checkpointing? An intermediate state (variables) saved in case calculations get interrupted, so that you don't have to start over?

Exactly

Quote:

* Is the AtiOpenCl app checkpointing more? Or is it that the two apps are doing the same amount of work (calcs), and it's just that the CUDA app/GTX 560 is doing more work per unit time and therefore only needs to checkpoint 5 vs. my 20 times?

By default, all apps are checkpointing every 60 seconds. All workunits, whether they will get picked up by a CPU, NVIDIA GPU or ATI GPU will do the same amount of work, but the faster the processing is, the fewer checkpoints will happen during the execution time.
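The relationship can be sketched with a little arithmetic. This is only an illustration: the 60-second interval is the default stated above, while the two run times are hypothetical values chosen to reproduce the 5 vs. 20 checkpoints from the question.

```python
CHECKPOINT_INTERVAL_S = 60  # default interval stated in the post


def expected_checkpoints(run_time_s, interval_s=CHECKPOINT_INTERVAL_S):
    """Number of full checkpoint intervals that fit into a task's run time."""
    return int(run_time_s // interval_s)


if __name__ == "__main__":
    # Hypothetical run times for the same workunit on two different GPUs.
    fast_gpu_run_time_s = 300
    slow_gpu_run_time_s = 1200
    print(expected_checkpoints(fast_gpu_run_time_s))
    print(expected_checkpoints(slow_gpu_run_time_s))
```

Under these assumptions the checkpoint count scales with elapsed time only, so a ratio of checkpoint counts is roughly a ratio of run times, not of work done.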

Quote:

* Is the GTX 560/CUDA app really 4x (20/5=4) faster than the HD6950/AtiOpenCl? The 6950 shows 2253 SP GFLOPS vs. the GTX 560 SP GFLOPS of 1088.6.
http://en.wikipedia.org/wiki/Comparison_of_AMD_graphics_processing_units
http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units

Well, the Flops numbers are just theoretical peak performance and not too meaningful. But anyway, it's fair to say that our CUDA app and the libraries we used with it are more mature and optimized than our OpenCL app. We have some ideas in the pipeline how to further improve the OpenCL app and hopefully we can implement them in a timeframe of weeks rather than months, stay tuned.
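To put numbers on how far the peak figures are from what was observed, here is a small worked calculation. It uses only the values quoted in the question (2253 and 1088.6 SP GFLOPS, 20 vs. 5 checkpoints) and treats the checkpoint ratio as a rough stand-in for the run-time ratio, which is an assumption.

```python
# Theoretical single-precision peak values quoted in the question (GFLOPS).
HD6950_PEAK_GFLOPS = 2253.0
GTX560_PEAK_GFLOPS = 1088.6

# Checkpoint counts quoted in the question; assumed proportional to run time.
HD6950_CHECKPOINTS = 20
GTX560_CHECKPOINTS = 5

# On paper, how much faster the HD 6950 should be.
peak_ratio = HD6950_PEAK_GFLOPS / GTX560_PEAK_GFLOPS

# In practice, how much longer the HD 6950 task appeared to run.
observed_slowdown = HD6950_CHECKPOINTS / GTX560_CHECKPOINTS

# Combined gap between the paper expectation and the observation.
gap = peak_ratio * observed_slowdown

if __name__ == "__main__":
    print(f"peak ratio (6950 / 560): {peak_ratio:.2f}")
    print(f"observed slowdown (6950 vs. 560): {observed_slowdown:.1f}")
    print(f"expectation vs. observation gap: {gap:.2f}")
```

The paper advantage of roughly 2x for the HD 6950 turns into a slowdown in practice, which is why peak FLOPS alone say little about how a particular app performs.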

Quote:

To semi-answer that, GPU Time indicates a 2.503x increase for the GTX560/CUDA vs. the AtiOpenCl/HD6950. The CPU time for the CUDA app is, however, 4.24x less than that of the OpenCl app. Anandtech Bench shows the 2500k vs. my AMD 975BE to be slightly better in single-threaded, multi-threaded, and total MIPS (7-Zip test), but nothing earth shattering.
http://www.anandtech.com/bench/Product/288?vs=435

I know you said before that the OpenCl app uses way more CPU than the CUDA app. Perhaps the OpenCl standard is still immature, AMD has crappy drivers, or a mix of both? Regardless, I really commend everyone's efforts. Having done a fair bit of coding myself, I know what a pain this can all be.

Indeed :-). I don't think one should blame the OpenCL standard, it's not different from CUDA anyway. It's the implementation of the standard and the drivers that are causing a few troubles. Neither AMD nor NVIDIA seem too enthusiastic about OpenCL anymore, I'm afraid.

CU
HB

Bikeman (Heinz-...

Joined: 28 Aug 06

Posts: 164

Credit: 1864017

RAC: 0

RE: RE: RE: For me the

3 May 2012 20:34:00 UTC

Message 79211 in response to message 79209


Quote:

Quote:
Quote:
For me the new app takes a full CPU core when it is running. Is that by intention?

Hi!

This will depend on the driver version you are using; e.g., you can see from the screenshots posted here that this is not in general the case for this app.

Cheers
HB

I was not exact enough in my statement. BOINC reserves a full core ('1 CPUs + 1 ATI GPU'), and as I understand it that should not be the case.

Here is one log snippet: 03.05.2012 22:25:33 | Albert@Home | [rr_sim_detail] 339385.57: starting p2030.20110421.G41.06+00.53.N.b6s0g0.00000_1448_2 (1.00 CPU + 1.00 ATI)

Oh, and the driver version is 12.3; as far as I have read, the high CPU usage is solved in that version.

Yup, the allocation of a full core by BOINC is something that is configured at the server side. This was to prevent the situation where CPUs will get overcommitted for those users with older drivers where indeed a full CPU is taken by the driver...it's a conservative choice. We will look into the question how to handle this when we go live with the app, e.g. we could make the CPU allocation dependent on the driver version as we once did for NVIDIA where a similar driver problem existed under Linux, iirc.
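The idea of making the CPU allocation depend on the driver version could look roughly like the sketch below. Everything in it is hypothetical: the function names are invented, the cutoff of 12.3 is taken from the report earlier in this thread rather than from any confirmed fix, and the reduced share of 0.2 is a placeholder. It is not the actual BOINC server configuration.

```python
def parse_version(text):
    """Turn a dotted version string such as '12.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Assumptions for illustration only (see the note above).
FIXED_IN = parse_version("12.3")  # first driver assumed free of the busy-wait
FULL_CORE = 1.0                   # conservative share for older drivers
REDUCED_SHARE = 0.2               # placeholder share for newer drivers


def cpu_share_for_driver(driver_version):
    """Pick the CPU share to reserve for a GPU task from the driver version."""
    if parse_version(driver_version) < FIXED_IN:
        return FULL_CORE
    return REDUCED_SHARE


if __name__ == "__main__":
    for version in ("11.12", "12.3", "12.10"):
        print(version, cpu_share_for_driver(version))
```

Comparing versions as integer tuples rather than as strings or floats matters here, because "12.10" has to rank above "12.3".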

Cheers
HB

Christoph

Joined: 25 Aug 05

Posts: 30

Credit: 208211

RAC: 0

Ah, ok. Looks like I missed

3 May 2012 20:44:22 UTC

Message 79212 in response to message 79211


Ah, ok. Looks like I missed that detail somehow. So I can stop scratching my head.

Christoph

Joined: 25 Aug 05

Posts: 30

Credit: 208211

RAC: 0

I got one computation error.

3 May 2012 21:28:50 UTC

Message 79213


I got one computation error. Probably it has something to do with my reboot yesterday.
I had one also with WCG, that was the reason for the reboot.

http://albertathome.org/task/201372

Christoph

Bikeman (Heinz-...

Joined: 28 Aug 06

Posts: 164

Credit: 1864017

RAC: 0

RE: I got one computation

3 May 2012 21:38:40 UTC

Message 79214 in response to message 79213


Quote:

I got one computation error. Probably it has something to do with my reboot yesterday.
I had one also with WCG, that was the reason for the reboot.

http://albertathome.org/task/201372

Hi!

Thanks very much for reporting this, this might actually point to a real problem in the code which affects only some cards that have certain restrictions that the app has to take into account when generating work for the GPU.

Stay tuned, I hope I can install a fix tomorrow.

Infusioned

Joined: 11 Feb 05

Posts: 38

Credit: 149000

RAC: 0

Two more errors

4 May 2012 1:29:39 UTC

Message 79215 in response to message 79214


Two more errors with:

Incorrect function. (0x1) - exit code 1 (0x1)

http://albertathome.org/task/199760
http://albertathome.org/task/199762

I just read through the 7.0.27 change log and there is some stuff about trying to address this error. I installed 7.0.27, I'll see if this helps.

astro-marwil

Joined: 28 May 05

Posts: 4

Credit: 1633

RAC: 0

I'm new here since yesterday

4 May 2012 7:15:55 UTC

Message 79216


I'm new here as of yesterday afternoon.
The change from BOINC 7.0.25 to .26 was straightforward, except that it took astonishingly long (some minutes?) until my existing tasks started running again.
Adding AaH to my BOINC was more complicated, as AaH is not included in the list of projects in BOINC Manager/Tools/Add a project or project manager. I helped myself by clicking on EaH and replacing Einstein with Albert in the URL. It took a while to find this way. Why isn't AaH included in the list of projects? In the AaH preferences I set the GPU utilization factor to 0.5, with BRP4 checked and S6LV1 unchecked.
I was somewhat astonished to find in the task log AaH running (1 CPU + 0.5 NVIDIA GPU) and 1 CPU waiting to run S6LV1, whereas BRP4 tasks from EaH run with (0.2 CPU + 0.5 NVIDIA GPU) and all 4 CPUs crunching S6LV1 tasks. The CPU load was reduced to about 90%, whereas before I always had 100% load. While AaH was running the desktop was very sluggish; most of the time I had to wait some seconds before any action could be performed. This was also the case while the AaH task was waiting. The desktop was no longer sluggish when the AaH project was suspended. This is a very uncomfortable way of operating.
So it ran for quite a while, but about 15 minutes before the AaH task finished, I found that within the last, almost exactly 20-minute interval, 3 of the running BRP4 tasks from EaH had been marked as "Error while computing", all with exit code 1002. The AaH task itself ended fine. This morning it was validated against an ATI card running under Linux on an Intel CPU. The running time is about a factor of 3 longer, whereas the CPU time is comparable (AaH vs. EaH).

Because of the various reported malfunctions I suspended AaH. It's nice that the task was validated, especially as the counterpart was of a very different type. It shows you are on a very good path, and when the next version is available, I will try again.

Kind regards
Martin

tullio

Joined: 22 Jan 05

Posts: 53

Credit: 137342

RAC: 0

Albert@home runs well on my

4 May 2012 12:20:30 UTC

Message 79217


Albert@home runs well on my Linux box, all results are validated. I have no GPU. I got some validation errors on Einstein@home, on a Gamma-ray pulsar search unit.
Tullio
