Sending work

Albert@home is still an unofficial, non-public test project. Don't expect anything to work here. The only type of work Albert@home is currently sending out is for a highly experimental BRP4 OpenCL application.

Comments

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78479 in response to message 78478

It's not that bad after all. However, while our current OpenCL app does indeed run on NVIDIA GPUs, it doesn't yet produce valid results on all their models, most likely due to some subtle differences (optimizations) in NVIDIA's runtime OpenCL compiler... We're investigating, but we'll focus on OpenCL@AMD first since we've got a CUDA app anyway...

Oliver

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Looks like the OpenCL app for ATI can't validate against the CPU app: http://albertathome.org/workunit/11889

Oh and thanks for making the initial replication 3, but leaving the third one unsent. ;-)

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Separate post for this, can you stop sending ATI OpenCL work to non-6.13 clients?
I see one of my other tasks is waiting for a wingman, because two people got "Failed to get OpenCL platform/device info from BOINC (error: -161)!" (6.12.34 and 6.10.60), the third had it running on the failed CUDA app.

Since you're checking for the information from the client, you may just as well make the client mandatory.

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78482 in response to message 78481

Quote:

Separate post for this, can you stop sending ATI OpenCL work to non-6.13 clients?
I see one of my other tasks is waiting for a wingman, because two people got "Failed to get OpenCL platform/device info from BOINC (error: -161)!" (6.12.34 and 6.10.60), the third had it running on the failed CUDA app.

Since you're checking for the information from the client, you may just as well make the client mandatory.

Guess what we do already? :-)
We're aware of that problem but there's no fix right now... Still investigating...

Oliver

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78483 in response to message 78480

Quote:
Looks like the OpenCL app for ATI can't validate against the CPU app: http://albertathome.org/workunit/11889


We're still tuning the validator. That's part of this test, we need to sample the numerical stability/inaccuracies across various platforms and devices...

Quote:

Oh and thanks for making the initial replication 3, but leaving the third one unsent. ;-)


Hm?

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78484 in response to message 78483

Quote:
Quote:

Oh and thanks for making the initial replication 3, but leaving the third one unsent. ;-)

Hm?


Sneaky... now it's sent to a third party.
It wasn't for several hours last night, before I made the comment... I just needed more patience then. ;-)

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78485 in response to message 78482

Quote:
Guess what we do already? :-)
We're aware of that problem but there's no fix right now... Still investigating...


One further request then, decrease the deadline here? I see wingmen who have seemingly abandoned the cause, so now I'll have to wait 14 days before anything is resent. That's too much for a test project. Just make the deadline 3 to 5 days, that's time enough to crunch the work, even on a multi-project system, and send it back.

You want the results back fast, don't you? :-)

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78486 in response to message 78482

Quote:

We're aware of that problem but there's no fix right now... Still investigating...

Should work now...

robertmiles
Joined: 16 Nov 11
Posts: 31
Credit: 4468368
RAC: 0

Message 78487 in response to message 78473

Quote:

Quote:
I've seen rumors that the 7.0 version of BOINC will be available soon, with at least some support for OpenCL GPU workunits, but almost nothing more on just how much OpenCL GPU support.

BOINC 7 (being tested as BOINC 6.13) will support both ATI and nVidia GPUs, with a possibility for Intel GPUs to follow once Intel gets an API out. But really, BOINC doesn't need to support OpenCL, as it doesn't do any of the work. The science application will need to do OpenCL.

All that BOINC will do is detect if your GPU is OpenCL capable and if so, which version it is compliant to. Just as it already did detect if the nVidia GPU is CUDA capable and the ATI GPU CAL/Brook+ capable.

Looks adequate for computers with only one GPU.

On other BOINC projects, I've seen one thing mentioned that BOINC really needs to do if you want to be able to run both an OpenCL GPU workunit and a non-OpenCL GPU workunit at the same time on computers with more than one GPU - provide a standard way of mapping the way BOINC identifies which GPU it has assigned to a workunit to the way OpenCL identifies those GPUs. Otherwise, it's likely that those two workunits will often attempt to use the same GPU at the same time, with both failing.

Except for this, I've seen nothing on features for OpenCL that need to be in BOINC instead of the application. Adding more features would be helpful, but not required, for BOINC projects wanting OpenCL GPU workunits.

I currently don't have any AMD/ATI GPUs, so I won't be able to do much for Albert@Home soon, at least until I find time to measure the size requirements for putting such a graphics board into my newest computer.
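
To illustrate the mapping problem described above, here is a minimal Python sketch. It assumes every GPU can be identified by some key that both BOINC and the OpenCL runtime can see; the `GpuInfo` class, its `pci_bus_id` field and the `map_boinc_to_opencl` function are hypothetical names made up for this example and are not part of BOINC or OpenCL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuInfo:
    name: str
    pci_bus_id: str  # hypothetical unique key visible to both enumerations


def map_boinc_to_opencl(boinc_gpus, opencl_gpus):
    """Return {boinc_index: opencl_index}, matching devices by PCI bus id.

    GPUs that the OpenCL list does not contain are left out of the result.
    """
    opencl_index_by_bus = {
        gpu.pci_bus_id: index for index, gpu in enumerate(opencl_gpus)
    }
    mapping = {}
    for boinc_index, gpu in enumerate(boinc_gpus):
        if gpu.pci_bus_id in opencl_index_by_bus:
            mapping[boinc_index] = opencl_index_by_bus[gpu.pci_bus_id]
    return mapping


if __name__ == "__main__":
    # The two enumerations list the same cards in a different order.
    boinc = [GpuInfo("NVIDIA card", "01:00.0"), GpuInfo("ATI card", "02:00.0")]
    opencl = [GpuInfo("ATI card", "02:00.0"), GpuInfo("NVIDIA card", "01:00.0")]
    print(map_boinc_to_opencl(boinc, opencl))
```

With a table like this, an application told "use BOINC device 1" could look up the matching OpenCL device index instead of guessing, which is the point of the request above.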

robertmiles
Joined: 16 Nov 11
Posts: 31
Credit: 4468368
RAC: 0

Quote:
Albert@home is still an unofficial, non-public test project. Don't expect anything to work here. The only type of work Albert@home is currently sending out is for a highly experimental BRP4 OpenCL application.

I've recently received some SSE and CUDA workunits. Should I consider them outdated workunits left over from Einstein@Home, or should I consider them Nvidia and CPU versions of the OpenCL workunits?

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78489 in response to message 78488

Quote:
I've recently received some SSE and CUDA workunits. Should I consider them outdated workunits left over from Einstein@Home, or should I consider them Nvidia and CPU versions of the OpenCL workunits?

The latter. All BRP4 work units on albert can be considered equal and it doesn't matter whether they're run on a CPU, a CUDA device or via OpenCL. This allows us to check cross-platform/device validation and get a feeling for their relative performances. It also allows us to test BOINC's behavior in mixed-GPU (NVIDIA and AMD GPUs in one box) setups - so far the results look good.

Best,
Oliver

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

ATI OpenCL and CUDA32 don't validate together either.
http://albertathome.org/workunit/12465

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78491 in response to message 78490

Quote:
ATI OpenCL and CUDA32 don't validate together either.
http://albertathome.org/workunit/12465

Not universally true, but dependent on the NVIDIA GPU. My observation: a GTX 285 (Tesla) does, a GTX 580 (Fermi) doesn't. We'll investigate and will most likely tune the validator. It just takes some time since we're running low on manpower right now and the download server issue is our first priority.

Stay tuned,
Oliver

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78492 in response to message 78491

Another one then, http://albertathome.org/workunit/11889.
That's one OpenCL vs two CPU, guess whose work was deemed invalid? ;-)

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Also looks like you need more stringent GPU capability detection.
This host has an OpenCL-capable GPU, but for some reason it only has 384MB of memory. Thus all its tasks error out.

As far as I know you can set up the scheduler to check for OpenCL capability, memory on the GPU, BOINC client used, etc. GPUs like this need to be locked out before they're sent work; otherwise it wastes bandwidth and confuses the user to no end.

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78494 in response to message 78493

Quote:
you can set up the scheduler to check for OpenCL capability, memory on the GPU, BOINC client used, etc. GPUs like this need to be locked out before they're sent work; otherwise it wastes bandwidth and confuses the user to no end.

Yep, already on our TODO list (mostly done, memory requirements pending)...

Oliver
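
For readers following along, the kind of pre-send check discussed here (OpenCL capability, BOINC client version, GPU memory) could be sketched roughly like this in Python. This is an illustration only and not the project's scheduler code: the host dictionary layout, the function names and the thresholds in the example (client 6.13.0, 512 MB) are assumptions made for the sketch.

```python
def parse_version(text):
    """Turn a dotted version such as '6.13.12' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def can_send_opencl_work(host, min_client, min_gpu_mem_mb):
    """Return (ok, reason) for a host record given as a plain dict.

    Expected keys (hypothetical layout): 'opencl_capable' (bool),
    'client_version' (str) and 'gpu_mem_mb' (int).
    """
    if not host["opencl_capable"]:
        return False, "GPU is not OpenCL capable"
    if parse_version(host["client_version"]) < parse_version(min_client):
        return False, "BOINC client too old"
    if host["gpu_mem_mb"] < min_gpu_mem_mb:
        return False, "not enough GPU memory"
    return True, "ok"


if __name__ == "__main__":
    # Assumed thresholds; the real memory requirement is not stated in the thread.
    hosts = [
        {"opencl_capable": True, "client_version": "6.12.34", "gpu_mem_mb": 1024},
        {"opencl_capable": True, "client_version": "6.13.12", "gpu_mem_mb": 384},
        {"opencl_capable": True, "client_version": "6.13.12", "gpu_mem_mb": 1024},
    ]
    for host in hosts:
        print(host["client_version"], host["gpu_mem_mb"],
              can_send_opencl_work(host, "6.13.0", 512))
```

Returning a reason along with the verdict makes it possible to tell the user why no work was sent, rather than letting tasks download and fail.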

x3mEn
Joined: 21 Jun 11
Posts: 9
Credit: 10000
RAC: 0

HD4890: 1 WU = 62,806.60 sec = about 17.4 hours and... only 500.00 credits, I guess? )
I understand that A@H is a test project, but in my opinion it's a waste of time for an HD4890.
Don't tell me that the HD4890 is a low-end GPU )
The HD4890 brought me 1Ms in Collatz, MilkyWay, Moo! Wrapper.
In these projects the HD4890 is more effective than a GTX460, which usually spends only 1 hour on the same WU at Einstein@Home.

[SETI.USA]Tank_Master
Joined: 22 Jan 05
Posts: 13
Credit: 301403
RAC: 0

Credits for this alpha project are 500 per WU that validates, regardless of how long it takes to run. Also, last I read, this project is soon going to stop exporting stats, if they haven't turned that off already.
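
To put the fixed award in perspective, here is a small Python sketch of the arithmetic. It uses the 500 credits per validated WU and the 62,806.60 second runtime quoted in this thread; the one-hour runtime is only a comparison value taken from the GTX460 remark above, and the function name is made up for the example.

```python
def credits_per_hour(credits, runtime_seconds):
    """Credits earned per hour of run time for a single validated task."""
    return credits / (runtime_seconds / 3600.0)


if __name__ == "__main__":
    # Runtime reported for the HD4890 in this thread.
    print(round(credits_per_hour(500, 62806.60), 2))
    # The same fixed award for a task that finishes in one hour.
    print(round(credits_per_hour(500, 3600), 2))
```

Because the award is flat, the effective rate depends only on run time, so a slow device earns far less per hour for the same work.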

Infusioned
Joined: 11 Feb 05
Posts: 38
Credit: 149000
RAC: 0

CAL ATI Radeon HD 4700/4800 (RV740/RV770) (1024MB) driver: 0.1

It seems Ageless and I have the same video card. Thus far, I haven't been able to validate a single WU:

http://albertathome.org/workunit/12534
http://albertathome.org/workunit/12549
http://albertathome.org/workunit/12550
http://albertathome.org/workunit/12551
http://albertathome.org/workunit/12294

Also (as mentioned above) it looks like there are problems with the nVidia cards validating with themselves (wu 12294):

The GTX 260 validates with the GTX 590, but not with the GTX 580.

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

ATIOpenCL v1.19 doesn't validate with another ATIOpenCL v1.19 either:
TankMaster (HD 69xx) versus me (HD 4850) == http://albertathome.org/workunit/12850 (Completed, validation inconclusive) with a third task going out.

The next two tasks I have will be against other HD 48xx Radeons. I aborted all tasks that had CUDA or CPU as wingmen. They don't validate anyway so it's kinda useless to spend electrons on that at this time. Perhaps later again when there's been time to fine tune the validator even further.

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78499 in response to message 78498

Quote:
ATIOpenCL v1.19 doesn't validate with another ATIOpenCL v1.19 either:
TankMaster (HD 69xx) versus me (HD 4850) == http://albertathome.org/workunit/12850 (Completed, validation inconclusive) with a third task going out.


After 5+ hours the third task finally went out, now to a CUDA GPU. I can tell you up front that this isn't going to validate.

Is HR a thought?

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Hmmm... Forget HR. Forget using any HD4850 it seems, or any OpenCL 1.0 only GPUs? I noticed that Infusioned now does have credit, but he did so with a CPU task, not his HD4850.

http://albertathome.org/workunit/12832 is my HD4850 versus Alexone's (presumably) HD4850 or 4870. Validation is inconclusive.
The difference here being my Intel i3 versus his AMD XII perhaps? Or my 8GB of RAM versus his 4GB?

A third task went out to a CUDA, we all know how that's going to end. I'm ending my extra electron burn here, I aborted the other similar task, since it would put my GPU against Alexone's again anyway. Either my GPU is broken, or I use a driver version your app doesn't like (Catalysts 11.6), or it's the direction of the ley lines crossing under my computer room that break things. I stop testing for now and turn my attention fully to Skyrim. ;-)

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78501 in response to message 78494

Quote:
Quote:
you can set up the scheduler to check for OpenCL capability, memory on the GPU, BOINC client used, etc. GPUs like this need to be locked out before they're sent work; otherwise it wastes bandwidth and confuses the user to no end.

Yep, already on our TODO list (mostly done, memory requirements pending)...

Due to a bug in the BOINC client this has to wait until at least 6.13.13 (should be released soon).

Stay tuned,
Oliver

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78502 in response to message 78500

Quote:
Either my GPU is broken, or I use a driver version your app doesn't like (Catalysts 11.6)

You need at least Catalyst 11.7 (8.872) since we use APP SDK 2.5 until further notice.

Oliver

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78503 in response to message 78501

Quote:
Due to a bug in the BOINC client this has to wait until at least 6.13.13 (should be released soon).


Looks like there ain't gonna be a 6.13.13, but that instead it's going to be BOINC 7.0.x :-)

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78504 in response to message 78502

Quote:
Quote:
Either my GPU is broken, or I use a driver version your app doesn't like (Catalysts 11.6)

You need at least Catalyst 11.7 (8.872) since we use APP SDK 2.5 until further notice.

Darn, we just updated the validator (using less strict tolerances). We won't know whether the 11.6 you used before updating to 11.7 played a role in your tasks being considered invalid :-/

Oliver
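
As an illustration of what "less strict tolerances" means for a validator, here is a minimal Python sketch that compares two result vectors value by value with a relative tolerance. It is not the project's validator: the function name, the sample numbers and the tolerance values are all made up for the example.

```python
import math


def results_match(result_a, result_b, rel_tol):
    """True if both results have the same length and every pair of values
    agrees within the given relative tolerance."""
    if len(result_a) != len(result_b):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(result_a, result_b)
    )


if __name__ == "__main__":
    cpu_result = [1.000000, 2.500000]   # made-up values
    gpu_result = [1.000004, 2.500010]   # same values with small numerical drift
    print(results_match(cpu_result, gpu_result, rel_tol=1e-6))  # strict setting
    print(results_match(cpu_result, gpu_result, rel_tol=1e-5))  # relaxed setting
```

The same pair of results can fail under a strict tolerance and pass under a relaxed one, which is why the tolerance has to be sampled across platforms and devices before it is fixed.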

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78505 in response to message 78504

What's it you're saying, Oliver? You want me to return to 11.6?
That's no problem. I'm here to test work for you, remember? Totally not in it for the credits. :-)

I'll run the one ATIOpenCL v1.19 I have started already to completion on 11.7, then switch back to 11.6

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78506 in response to message 78505

Quote:
What's it you're saying, Oliver? You want me to return to 11.6?

No, you don't have to. You may of course do so as it wouldn't require a full core anymore. Maybe the relaxed validator settings let your tasks through now (we had to tune it anyway)...

Thanks for supporting this effort!

Oliver

steffen_moeller
Joined: 9 Feb 05
Posts: 6
Credit: 397892
RAC: 0

Hello,

HD5770 fails after ~30 seconds
http://albertathome.org/task/51833

[19:27:27][4969][ERROR] Error during OpenCL FFT setup (error: -5)
[19:27:27][4969][ERROR] Demodulation failed (error: 2021)!

It is Debian unstable, fglrx 11.11, amd-app 2.5, BOINC 6.13.12. The installation of amd-app is essential. Without it, neither primegrid nor albert@H binaries can be executed.

Cheers,
Steffen

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78508 in response to message 78507

Quote:
[19:27:27][4969][ERROR] Error during OpenCL FFT setup (error: -5)
[19:27:27][4969][ERROR] Demodulation failed (error: 2021)!

Sorry, not enough GPU memory.

Quote:

It is Debian unstable, fglrx 11.11, amd-app 2.5, BOINC 6.13.12. The installation of amd-app is essential. Without it, neither primegrid nor albert@H binaries can be executed.

Why? What happens if you don't install it? The runtime libOpenCL.so should already be installed with the driver (as of 11.9 IIRC). Hm, maybe you still need to register the ICD...

Oliver
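
For reference, if the "-5" in the log above is a raw OpenCL status code (an assumption; the application may use its own numbering), it corresponds to CL_OUT_OF_RESOURCES in the standard OpenCL header, which fits the "not enough GPU memory" diagnosis. A small Python sketch for decoding the first few standard codes, with a made-up helper name:

```python
# First few status codes as defined in the standard OpenCL header (cl.h).
OPENCL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_opencl_status(code):
    """Return the symbolic name for a status code, or a generic note
    for codes this short table does not cover."""
    return OPENCL_STATUS_NAMES.get(code, "unknown status ({})".format(code))


if __name__ == "__main__":
    print(describe_opencl_status(-5))
```

The table is deliberately incomplete; codes outside it (such as the -161 seen earlier in the thread, which is reported by the application rather than taken from this list) fall through to the generic note.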

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78509 in response to message 78506

Quote:
Quote:
What's it you're saying, Oliver? You want me to return to 11.6?

No, you don't have to. You may of course do so as it wouldn't require a full core anymore.


I did return to 11.6, but left SDK 2.5 on my system.
Immediately the use of the one core went back to 2-10%, instead of the full 25% it was using before that on 11.7.

Skyrim is now also back to being a bit more stable. I had many more CTDs (crash to desktop) with 11.7 than I have had with 11.6; where with 11.6 it would be perhaps once a day, with 11.7 it was 7 times yesterday alone. So after the last CTD I reverted back to 11.6 ;-)

So... will need an eye on what http://albertathome.org/workunit/13760 will go do. It's me versus two CUDA that can't decide between themselves who's right. I doubt I'll be the clincher for them. ;-)

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78510 in response to message 78509

Quote:
So... will need an eye on what http://albertathome.org/workunit/13760 will go do. It's me versus two CUDA that can't decide between themselves who's right. I doubt I'll be the clincher for them. ;-)


As I thought, I wasn't the clincher. That task was crunched with catalysts 11.6

Looks like these don't validate against ATIOpenCL yet. Further fine tuning of the validator? I have suspended the last 3 tasks I have until I hear more. Although, I could of course abort them and see how much work the newer scheduler in 7.0.2 thinks I should get. With a REC of 11000, too much anyway. ;-)

Remember, work that got credit is bad work. Work that didn't validate is good work. It tells the developers here their validator isn't ready yet to work in the outside angry world. :)

(and really developers, how many set it and forget it users do you have on here? ;))

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78511 in response to message 78509

Quote:
So... will need an eye on what http://albertathome.org/workunit/13760 will go do.


I see it was another dud. It took another CUDA32 to get validated, me and another CUDA32 finishing outside the points.

I have 4 new tasks.
12537 is paired against a BRP3cuda32. I think I won't even try.
15980 is two ATIOpenCL. I wish driver detection here was working so I could make a reasonable guess as to what driver the other guy is using. Driver: 0.1 is useless.
15971 also has me paired against a BRP3cuda32.
15943 also has me paired against a BRP3cuda32.

15980 it is then.
Can the developers in the mean time fix the driver detection?

x3mEn
Joined: 21 Jun 11
Posts: 9
Credit: 10000
RAC: 0

Completed, marked as

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78513 in response to message 78512

Easily explained: http://albertathome.org/workunit/12082
You're ATIOpenCL, you were paired against two CUDA which walked away with the credits. The validator isn't tuned enough yet to see that these results may well have been the same.

Infusioned
Joined: 11 Feb 05
Posts: 38
Credit: 149000
RAC: 0

Message 78514 in response to message 78513

This is from the Seti@Home Beta message boards (developing an ATi OpenCL App):

http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1867

Quote:

Please update your hosts with these builds. ATi and NV builds got a speed increase, HD5-version added, NV version added.

CPU builds got some updates too.

Known issues:
1) As with OpenCL AstroPulse, the latest driver versions from both vendors show increased CPU usage. AMD already acknowledged this issue and promised to fix it in new Catalyst releases; NV is still keeping silent about it.

2) The OpenCL NV app can silently (i.e., w/o errors in stderr) produce incorrect results (overflows). Again, the situation resembles the NV AstroPulse rev521 case and usually means a too-long kernel call. Why the NV OpenCL runtime doesn't report an error code for the kernel enqueue runtime call - no idea. But low-end NV GPUs may not be capable of running this app. This testing should determine the GPU requirements for the NV app too.

Quote:

Thanks, I downloaded the WU and confirm the difference.

At least part of the problem is a long-standing issue which is seen in SETI@home Enhanced between the stock CPU and stock CUDA applications too. The most efficient order to do the various searches is different for CPU and GPU, so for this kind of task with a lot of potential signals the CPU finds a different subset than the GPU does. Eric Korpela is aware of the issue and maybe if he or Jeff Cobb get a chance the Validator code will be revised to judge quick overflow results differently.

The way that's pertinent to this case is the MB7 r365 sources are primarily targeting openCL builds so the Autocorr and Spike searches are done in a different order than stock 6.97.

At first glance, that doesn't seem to explain all the differences between the MB7 CPU result and 6.97 result. I need to analyze more to really be sure what the data indicates.

FWIW, those results will be considered "weakly similar" and both get credit when the third result is returned.

Maybe this is part of the issue regarding validation?

TRuEQ & TuVaLu
Joined: 11 Sep 06
Posts: 75
Credit: 615315
RAC: 0

Message 78515 in response to message 78514

Quote:

This is from the Seti@Home Beta message boards (developing an ATi OpenCL App):

http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1867

Quote:

Please update your hosts with these builds. ATi and NV builds got a speed increase, HD5-version added, NV version added.

CPU builds got some updates too.

Known issues:
1) As with OpenCL AstroPulse, the latest driver versions from both vendors show increased CPU usage. AMD already acknowledged this issue and promised to fix it in new Catalyst releases; NV is still keeping silent about it.

2) The OpenCL NV app can silently (i.e., w/o errors in stderr) produce incorrect results (overflows). Again, the situation resembles the NV AstroPulse rev521 case and usually means a too-long kernel call. Why the NV OpenCL runtime doesn't report an error code for the kernel enqueue runtime call - no idea. But low-end NV GPUs may not be capable of running this app. This testing should determine the GPU requirements for the NV app too.

Quote:

Thanks, I downloaded the WU and confirm the difference.

At least part of the problem is a long-standing issue which is seen in SETI@home Enhanced between the stock CPU and stock CUDA applications too. The most efficient order to do the various searches is different for CPU and GPU, so for this kind of task with a lot of potential signals the CPU finds a different subset than the GPU does. Eric Korpela is aware of the issue and maybe if he or Jeff Cobb get a chance the Validator code will be revised to judge quick overflow results differently.

The way that's pertinent to this case is the MB7 r365 sources are primarily targeting openCL builds so the Autocorr and Spike searches are done in a different order than stock 6.97.

At first glance, that doesn't seem to explain all the differences between the MB7 CPU result and 6.97 result. I need to analyze more to really be sure what the data indicates.

FWIW, those results will be considered "weakly similar" and both get credit when the third result is returned.

Maybe this is part of the issue regarding validation?

I think the problem might be in the FFT and CuFFT variations....
But I am not sure. I saw something about such a discussion in another thread...

Ver Greeneyes
Joined: 18 Nov 11
Posts: 6
Credit: 861017
RAC: 0

Message 78516 in response to message 78514

Quote:
Maybe this is part of the issue regarding validation?


Nvidia cards are still using the CUDA app though, not OpenCL.

Infusioned
Joined: 11 Feb 05
Posts: 38
Credit: 149000
RAC: 0

Message 78517 in response to message 78516

It doesn't mean that NVidia cards don't silently generate overflows in general.

Also, the second post I quoted details how even the CUDA app was not validating against a CPU due to the order of calculations and the validator needed tweaking to regard them as weakly similar.

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78518 in response to message 78511

Quote:

Can the developers in the mean time fix the driver detection?

Sorry, not up to us. I'm not sure whether the BOINC devs can do anything about it since this might even be an AMD driver issue.

Oliver

TRuEQ & TuVaLu
Joined: 11 Sep 06
Posts: 75
Credit: 615315
RAC: 0

Message 78519 in response to message 78518

Quote:
Quote:

Can the developers in the mean time fix the driver detection?

Sorry, not up to us. I'm not sure whether the BOINC devs can do anything about it since this might even be an AMD driver issue.

Oliver

It seems to work in previous boinc manager versions. Maybe not in 6.13xx yet though....

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78520 in response to message 78518

Quote:

Sorry, not up to us. I'm not sure whether the BOINC devs can do anything about it since this might even be an AMD driver issue.

Oliver


Of course it's up to you. Apparently the server back-end version that you use doesn't store the ATI/CAL driver version, but it is sent to you.

From my 7.0.2 sched_request_albert.phys.uwm.edu.xml file:

Quote:


1
ATI Radeon HD 4700/4800 (RV740/RV770)
1040187392.000000
1
1
0.000000
0.000000
0.000000
2000000000000.000000
1.4.1417
5
1024
2047
2047
625
950
64
10
1
256
4096
8192
8192
8192


ATI RV770
Advanced Micro Devices, Inc.
4098
1
0
62
63
1
1
cl_amd_fp64 cl_khr_gl_sharing cl_amd_device_attribute_query cl_khr_d3d10_sharing
1073741824
16384
625
10
OpenCL 1.1 AMD-APP-SDK-v2.5 (775.2)
OpenCL 1.0 AMD-APP-SDK-v2.5 (775.2)
CAL 1.4.1417


You can even use the OpenCL information.
Then with the CAL version we can figure out which Catalysts they are. E.g. CAL 1.4.1417 is Catalysts 11.6
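
The lookup suggested here could be sketched like this in Python. The table holds only the single pairing stated in this thread (CAL 1.4.1417 is Catalyst 11.6); any further entries would have to be filled in from known driver releases. The names `KNOWN_CAL_TO_CATALYST` and `catalyst_for_cal` are made up for the example.

```python
# Only the pairing mentioned in this thread; extend with verified entries.
KNOWN_CAL_TO_CATALYST = {
    "1.4.1417": "11.6",
}


def catalyst_for_cal(cal_string):
    """Map a CAL version (with or without a leading 'CAL ') to a Catalyst
    release, or return None when the version is not in the table."""
    version = cal_string.strip()
    if version.upper().startswith("CAL "):
        version = version[4:].strip()
    return KNOWN_CAL_TO_CATALYST.get(version)


if __name__ == "__main__":
    print(catalyst_for_cal("CAL 1.4.1417"))
    print(catalyst_for_cal("1.4.9999"))  # not in the table
```

Returning None for unknown versions keeps the lookup honest: a missing entry shows up as "unknown" instead of being mapped to a guessed release.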

TRuEQ & TuVaLu
Joined: 11 Sep 06
Posts: 75
Credit: 615315
RAC: 0

Message 78521 in response to message 78520

Quote:
Quote:

Sorry, not up to us. I'm not sure whether the BOINC devs can do anything about it since this might even be an AMD driver issue.

Oliver


Of course it's up to you. Apparently the server back-end version that you use doesn't store the ATI/CAL driver version, but it is sent to you.

From my 7.0.2 sched_request_albert.phys.uwm.edu.xml file:

Quote:


1
ATI Radeon HD 4700/4800 (RV740/RV770)
1040187392.000000
1
1
0.000000
0.000000
0.000000
2000000000000.000000
1.4.1417
5
1024
2047
2047
625
950
64
10
1
256
4096
8192
8192
8192


ATI RV770
Advanced Micro Devices, Inc.
4098
1
0
62
63
1
1
cl_amd_fp64 cl_khr_gl_sharing cl_amd_device_attribute_query cl_khr_d3d10_sharing
1073741824
16384
625
10
OpenCL 1.1 AMD-APP-SDK-v2.5 (775.2)
OpenCL 1.0 AMD-APP-SDK-v2.5 (775.2)
CAL 1.4.1417


You can even use the OpenCL information.
Then with the CAL version we can figure out which Catalysts they are. E.g. CAL 1.4.1417 is Catalysts 11.6

Jord has a point here.

And it can help users detect possibly wrong drivers when they compare with other users, and answer the question, "which driver is the best driver for my ATI card?"

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78522 in response to message 78520

If you want to be totally confused, it does work on Einstein. See my account there.

You can't blame it on the client version either, it's merely all ATI that is affected. Examples: this host uses 6.10.58 and shows driver 0.1; this host uses 6.12.41 and shows as driver version 0.1

Bernd Machenschalk
Administrator
Joined: 15 Oct 04
Posts: 155
Credit: 6218130
RAC: 0

Message 78523 in response to message 78518

Quote:
Quote:

Can the developers in the mean time fix the driver detection?

Sorry, not up to us. I'm not sure whether the BOINC devs can do anything about it since this might even be an AMD driver issue.

You may talk about two different things here.

Jord, what exactly do you think should be fixed?

I do see that displaying the ATI CAL/driver version on the host web pages appears broken (on Albert), and possibly the string in the DB is, too.

In the scheduler the ATI "driver" version is stored as "char version[50]" and "int version_num" in coproc_ati, and in "char opencl_driver_version[32]" in opencl_device_prop. These could in principle be used in app_plan(), though we don't check this yet.

BM

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78524 in response to message 78523

Quote:
Jord, what exactly do you think should be fixed?


Showing of the CAL driver version on ATI cards on the account pages here.

Yes, sorry, I said it wrong. I asked for a fix for the driver detection. I know you don't do that, that that's up to the client. I meant that the driver versions shown for Nvidia GPUs are all correct, while for all ATI GPUs it's always 0.1, which isn't correct.

Bernd Machenschalk
Administrator
Joined: 15 Oct 04
Posts: 155
Credit: 6218130
RAC: 0

Message 78525 in response to message 78524

Hm. On your host page I currently read:

AMD ATI Radeon HD 4700/4800 (RV740/RV770) (1024MB) driver: 1.4.1417

I don't see anything wrong with that. Maybe the previous entry was from an old Client version?

The only thing I changed this morning was (parts of) the web page code, but nothing related to the pages involved here.

BM

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 78526 in response to message 78525

Well, whatever you did fixed that bug. It now shows on all hosts I checked which CAL driver version these people use. It may have been in there all this time, just not showing as such. So thanks. :)

Btw Oliver, BOINC 7.0.1 (the minimum requirement now) was never compiled and stored anywhere. The minimum minimum anyone could download was 7.0.2; only people who got the source code of branch_7.0 and compiled that on the 30th of November will have 7.0.1, all else will have 6.13.12 or 7.0.2. ;-)

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Message 78527 in response to message 78526

Quote:

Btw Oliver, BOINC 7.0.1 (the minimum requirement now) was never compiled and stored anywhere. The minimum minimum anyone could download was 7.0.2; only people who got the source code of branch_7.0 and compiled that on the 30th of November will have 7.0.1, all else will have 6.13.12 or 7.0.2. ;-)

I know, I'm just using the exact tag/version that contains the required bug fix :-)

Oliver

robertmiles
Joined: 16 Nov 11
Posts: 31
Credit: 4468368
RAC: 0

Message 78528 in response to message 78523

Quote:
Quote:
Quote:

Can the developers in the mean time fix the driver detection?

Sorry, not up to us. I'm not sure whether the BOINC devs can do anything about it since this might even be an AMD driver issue.

You may talk about two different things here.

Jord, what exactly do you think should be fixed?

I do see that displaying the ATI CAL/driver version on the host web pages appears broken (on Albert), and possibly the string in the DB is, too.

In the scheduler the ATI "driver" version is stored as "char version[50]" and "int version_num" in coproc_ati, and in "char opencl_driver_version[32]" in opencl_device_prop. These could in principle be used in app_plan(), though we don't check this yet.

BM

I just read something related on the boinc_dev mailing list. It seems that BOINC 7.0.1 and 7.0.2 don't allocate enough digits in one of the places they store ATI version numbers, and are therefore likely to get at least some of the version numbers wrong.
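
To show how too little space can garble a version number, here is a small Python sketch that mimics copying a version string into a fixed-size C character buffer. The exact mechanism in BOINC 7.0.1/7.0.2 is not described in this thread, so this is only an assumed illustration of the general effect; the function name and the buffer sizes are made up.

```python
def copy_into_buffer(text, buffer_size):
    """Mimic a bounded copy into char buf[buffer_size]: at most
    buffer_size - 1 characters fit, the last byte holds the terminator."""
    return text[: max(buffer_size - 1, 0)]


if __name__ == "__main__":
    version = "1.4.1417"
    print(copy_into_buffer(version, 50))  # large enough, stored intact
    print(copy_into_buffer(version, 6))   # too small, trailing digits are lost
```

A truncated version such as this would still look like a plausible version string, which is what makes this kind of bug easy to miss on a host page.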