On my AMD HD 6850 I'm running 2 BOINC projects: Albert@home and POEM@home (3 GPU WUs on 1 CPU). When I download an Albert@home GPU WU, the POEM WUs enter a "suspended" state and the Albert@home WU doesn't start, i.e. no work runs on the GPU. If I restart the BOINC client, the POEM WUs remain suspended, but the Albert WU starts and runs OK...
Is this normal??
p2030.20110421.G41.29-00.40.S
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_1728_0
For some reason this WU is showing 0% GPU load and 25% CPU load. My initial reaction was that this must be an error; however, you can see the GPU clock was down to 725 MHz from 840 MHz.
http://img140.imageshack.us/img140/883/b0s0g00000017280.jpg
p2030.20110421.G41.29-00.40.S
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_1504_1 using einsteinbinary_BRP4 version 123 (atiOpenCL)
http://img15.imageshack.us/img15/3065/b0s0g00000015041.jpg
RE: In my AMD HD 6850 i'm
hmmm....theoretically it is possible that the Albert task *thought* it didn't have enough memory and waited for some to become available, which happened after the reboot...still, this looks suspicious. Thanks for reporting.
One question tho: is this reproducible, e.g. after each new WU download from Albert?
Cheers
HBE
RE: p2030.20110421.G41.29-00.40.S
Strange...this is the one, I guess:
http://albertathome.org/task/197941 which finished in about the same time as other tasks. Let's see if it validates.
But I would expect a lower GPU temperature if the load had really been 0% for a longer time, so actually I suspect that the readout is wrong. The app does have phases (at the beginning of each of the 8 subtasks) when there is exclusively CPU load, but this will last only a couple of seconds, not minutes.
THX
HBE
p2030.20110421.G41.29-00.40.S
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_1920_1 using einsteinbinary_BRP4 version 123 (atiOpenCL)
http://img96.imageshack.us/img96/6813/b0s0g00000019201.jpg
Digging through some of the stderr outputs
Digging through some of the stderr outputs, I notice the atiOpenCL app is doing an awful lot of checkpointing. Curious to see if the CUDA app was the same, I looked into one of my WUs:
http://albertathome.org/workunit/68681
My (atiOpenCL) output (abbreviated):
And then repeats the process for:
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.50
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.60
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.70
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.80
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.90
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM127.00
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM127.10
Checkpointing each WAPP file once per minute, 20 times.
Comparing to the BRP3cuda32 app (abbreviated):
[12:27:01][5004][INFO ] Starting data processing...
[12:27:01][5004][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 218 MB (807 MB free / 1025 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[12:27:01][5004][INFO ] Using CUDA device #0 "GeForce GTX 560" (336 CUDA cores / 1105.44 GFLOPS)
[12:27:01][5004][INFO ] Version of installed CUDA driver: 4020
[12:27:01][5004][INFO ] Version of CUDA driver API used: 3020
[12:27:01][5004][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[12:27:01][5004][INFO ] Header contents:
------> Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.40
...
[12:27:31][5004][INFO ] Checkpoint committed!
[12:28:01][5004][INFO ] Checkpoint committed!
[12:28:31][5004][INFO ] Checkpoint committed!
[12:29:01][5004][INFO ] Checkpoint committed!
[12:29:31][5004][INFO ] Checkpoint committed!
[12:30:02][5004][INFO ] Checkpoint committed!
[12:30:32][5004][INFO ] Checkpoint committed!
[12:31:01][5004][INFO ] Data processing finished successfully!
...
which then also repeats for:
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.50
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.60
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.70
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.80
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM126.90
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM127.00
Original WAPP file: ./p2030.20110421.G41.29-00.40.S.b0s0g0.00000_DM127.10
Checkpointing each WAPP file once per minute, 5 times.
So, my questions are:
* What is checkpointing? A save of intermediate state (variables) in case the calculation gets interrupted, so you don't have to start over?
* Is the atiOpenCL app checkpointing more? Or are the two apps doing the same amount of work (calcs), and it's just that the CUDA app/GTX 560 does more work per unit time and therefore only needs to checkpoint 5 times vs. my 20?
* Is the GTX 560/CUDA app really 4x faster (20/5 = 4) than the HD6950/atiOpenCL one? The 6950 is rated at 2253 SP GFLOPS vs. 1088.6 for the GTX 560.
http://en.wikipedia.org/wiki/Comparison_of_AMD_graphics_processing_units
http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units
To semi-answer that, GPU time indicates a 2.503x increase for the GTX560/CUDA vs. the atiOpenCL/HD6950. The CPU time for the CUDA app is, however, 4.24x less than that of the OpenCL app. AnandTech Bench shows the 2500K to be slightly better than my AMD 975BE in single-threaded, multi-threaded, and total MIPS (7-Zip test), but nothing earth-shattering.
http://www.anandtech.com/bench/Product/288?vs=435
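The reasoning behind question 2 above can be put in a few lines of code. This is just my own back-of-the-envelope sketch, not anything from the apps themselves: if both apps checkpoint on the same once-per-minute timer and do the same work per WAPP file, the checkpoint counts are simply proportional to the per-file runtimes.

```python
# Per-file checkpoint counts taken from the stderr outputs above.
# Assumption (mine, not confirmed): both apps checkpoint once per minute.
ati_checkpoints_per_file = 20   # atiOpenCL app on my HD6950
cuda_checkpoints_per_file = 5   # BRP3cuda32 app on the GTX 560

# Checkpoints ~ minutes of runtime per file, so the ratio estimates speed.
speed_ratio = ati_checkpoints_per_file / cuda_checkpoints_per_file
print(speed_ratio)  # -> 4.0, i.e. ~4x less wall time per file on the CUDA host
```

Of course this only holds if the checkpoint interval really is identical in both apps, which is exactly what I'm asking.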
I know you said before that the OpenCL app uses way more CPU than the CUDA app. Perhaps the OpenCL standard is still immature, AMD has crappy drivers, or a mix of both? Regardless, I really commend everyone's efforts. Having done a fair bit of coding myself, I know what a pain this can all be.
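Speaking of coding: if my understanding of checkpointing in question 1 is right, the general pattern would look roughly like this. A generic Python sketch of my own, not the actual BRP code; only the `status.cpt` filename is borrowed from the log above, everything else is made up for illustration.

```python
import json
import os
import time

def process_with_checkpoints(n_steps, cpt_file="status.cpt", interval_s=60.0):
    """Toy checkpointing loop: periodically save enough intermediate
    state to resume after an interruption instead of starting over."""
    # Resume from an existing checkpoint, or start from scratch
    # (compare the app's "Checkpoint file unavailable: status.cpt" message).
    if os.path.exists(cpt_file):
        with open(cpt_file) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "partial_result": 0.0}

    last_cpt = time.monotonic()
    for step in range(state["step"], n_steps):
        state["partial_result"] += step * 0.5   # stand-in for real work
        state["step"] = step + 1
        # Commit a checkpoint at most once per interval, like the
        # "Checkpoint committed!" lines in the stderr output above.
        if time.monotonic() - last_cpt >= interval_s:
            tmp = cpt_file + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, cpt_file)  # atomic swap so a crash can't corrupt it
            last_cpt = time.monotonic()
    return state["partial_result"]
```

The write-to-temp-file-then-rename step matters: if the task is killed mid-checkpoint, the previous checkpoint stays intact.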
p2030.20110421.G41.29-00.40.S
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_1928_0 using einsteinbinary_BRP4 version 123 (atiOpenCL)
This one seems to have some weird GPU load spottiness at around the 20% completion mark, but it seems to have steadied out at 23% load.
http://img210.imageshack.us/img210/4024/b0s0g00000019280.jpg
Edit:
I take that back: I noticed spottiness again, so I ran the latest 3 versions of GPU-Z side by side just to see if there was a bug in one of the versions. There doesn't appear to be, as they all report the same load %.
http://img196.imageshack.us/img196/7073/gpuzcomparison.jpg
p2030.20110421.G41.29-00.40.S
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_2504_0 using einsteinbinary_BRP4 version 123 (atiOpenCL)
http://img140.imageshack.us/img140/4502/b0s0g00000025040.jpg
For me the new app takes a full CPU core
For me the new app takes a full CPU core when it is running. Is that intentional?
Christoph
RE: One question tho: is
My PC has 8 GB DDR3 on Win7 64-bit; is that enough?
If I continue to download and run A@H WUs, the WUs take precedence over POEM.
After the last A@H WU, POEM restarts correctly, and if I download another A@H WU the situation occurs again... :-(
I forgot to mention: during the no-GPU-use state, 1 CPU core is in use (as if A@H were running).