Project server code update

The project will be taken down in about an hour to perform an update of the BOINC server code. Ideally you shouldn't notice anything, but usually the world isn't ideal. See you again on the other side.

Comments

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: RH - Please let me know

Message 80148 in response to message 80147

Quote:
RH - Please let me know if it would be more helpful to simply switch my 7950 from BRP5 to BRP4 or to "remove project" / "add project" (presumably that would create a new host and therefore start credit calcs fresh). Also, is it easier for you if I only run 1 WU at a time?


Remove project / add project doesn't normally change the HostID - BOINC is designed to recycle the numbers, if for example it recognises the IP address and hardware configuration.

Doesn't matter if it's one at a time or multiples at a time, but it's probably best if you don't mix task types (whether from this project or across projects). If I do start monitoring your host - thanks for the offer - it would help the other observers if you could tell us a bit about any configuration details which can't be observed from the outside - and GPU utilisation factor is one of those.

Don't bust a gut changing things over. I need a bit of a breather, and to set up and get used to a replacement monitor: and Bernd needs to test some more new server code fixes next week, which will give us a new set of apps (designated as 'beta', but in reality the same as the existing ones) with blank application_details records to have a go at.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

With new hosts and a new

With new hosts and a new monitor, let's see how that looks.

I've knocked out the old data (and with it, the extreme data points) - but even so, Juan's new machines show very wide scatter.

Here's that in figures:

[pre]
            Jason    Holmis   Claggy   Juan     Juan     Juan     RH       RH
Host:       11363    2267     9008     10352    10512    10351    5367     5367
GPU:        GTX 780  GTX 660  GT 650M  GTX 690  GTX 690  GTX 780  GTX 670  GTX 670

Credit for BRP4G, GPU
Maximum     2708.58  2197.18  10952.0  7209.47  6889.8   6652.9   4137.85
Minimum     115.82   88.84    153.90   1667.23  1244.41  1546.02  1355.49
Average     1326.79  1277.87  3631.58  2728.70  2198.10  2463.06  2007.02
Median      1541.35  1411.09  2426.03  2135.67  1948.04  2091.49  1910.19
Std Dev     628.07   690.05   2712.34  1403.91  942.62   969.59   305.80

nSamples    76       102      71       52       43       44       459

Runtime (seconds)                                                 (before) (after)
Maximum     5027.36  5088.99  11295.0  5605.83  8922.7   3182.0   4191.43  5099.40
Minimum     3239.20  3294.83  8122.09  3081.97  3854.24  1852.2   4061.45  4284.52
Average     3645.57  4549.28  8902.94  4411.88  6305.41  2342.3   4128.08  4686.13
Median      3535.46  4769.05  8847.82  3673.33  5127.40  1864.0   4127.35  4672.83
Std Dev     344.17   456.55   508.22   998.49   1932.50  615.41   20.40    204.66
nSamples                                                          365      94

Turnround (days)
Maximum     6.09     3.91     2.75     0.08     0.45     0.22     0.91
Minimum     0.13     0.07     0.13     0.04     0.05     0.02     0.15
Average     1.94     1.46     0.90     0.05     0.09     0.03     0.67
Median      1.46     1.54     0.79     0.04     0.06     0.03     0.69
Std Dev     1.78     1.00     0.65     0.01     0.06     0.03     0.12
[/pre]
All three of Juan's machines are showing a very wide variation in runtime - he'll have to explain that by local observation, as I can't pick it up from the website.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

treblehit
treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35119
RAC: 0

RE: What is most helpful

Message 80150 in response to message 80145

Quote:

What is most helpful is finding hosts with a nice, steady, continuous flow of data, and as little variation as possible in the running conditions (so that any noise in the credit granted can be attributed to external causes).

The sheer number of tasks pushed through isn't particularly important, but the consistency is.

It's quite time-consuming to switch things over, so bear with me - for the time being at least, old results aren't being deleted here, so there's no rush.

On the basis of that guidance I am going to provide multiple weak systems that will run only Albert and will remain untouched after initial setup. Also, I'll go "natural", without running multiple work units or doing anything with the clocks.

These will be new hosts (really low-powered hosts), so they won't carry any prior statistics or other baggage with them.

I'll get on it, shortly.

If you need something different, I think Juan and I are both ready to make any sacrifice of "credits" if we are being helpful.

treblehit
treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35119
RAC: 0

Computer 11519 Pretending to

Computer 11519

Pretending to be a new user. New install of GPU, new install of drivers, new install of BOINC.

First work fetch of BRP4G-opencl-ati has an estimated runtime of 10 seconds.

Obviously, they are erroring-out.

Run time 3 min 40 sec

Exit status 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED

I know what the fix is, but I'm not concerned with fixing it. I'm concerned with helping you fix it.

What do you want me to do?

7.2.42

Maximum elapsed time exceeded

Activated exception handling...
[22:05:40][3552][INFO ] Starting data processing...
[22:05:41][3552][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[22:05:41][3552][INFO ] Using OpenCL device "Juniper" by: Advanced Micro Devices, Inc.
[22:05:41][3552][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[22:05:41][3552][INFO ] Header contents:
------> Original WAPP file: ./p2030.20130202.G202.32-01.96.N.b0s0g0.00000_DM209.60
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 56326.065838408722
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 62454.7106018
------> DEC (J2000): 83413.5978003
------> Galactic l: 0
------> Galactic b: 0
------> Name: G202.32-01.96.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 209.6 cm^-3 pc
------> Scale factor: 0.00111372
[22:05:46][3552][INFO ] Seed for random number generator is 1168661235.
[22:05:56][3552][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[22:06:42][3552][INFO ] Checkpoint committed!
[22:07:44][3552][INFO ] Checkpoint committed!
[22:08:46][3552][INFO ] Checkpoint committed!
[22:09:20][3552][INFO ] OpenCL shutdown complete!
[22:09:20][3552][WARN ] BOINC wants us to quit prematurely or we lost contact! Exiting...

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

Thanks, I had hoped

Thanks,
I had hoped the new-host+app onramp for GPUs would improve, but I see that it hasn't. I'm not surprised, given we know the two precise mechanisms there: the default GPU efficiency pinned at 10% (0.1), and improperly applied normalisation (you can't normalise time estimates without a functional host_scale, which is disabled for the onramp period).

The new user, host &/or application case is central to this effort, so thanks again for the information. At this point you could either choose to jigger the bounds of tasks (allowing it to reach where host_scale kicks in) or alternatively let it go on erroring & see what happens (I imagine it'd just keep erroring & reduce quota to 1/day).

Both options have merit so it's your choice, though I think the jiggering option has been pretty thoroughly used, and the second one is more likely in common usage cases. Up to you.

Jason

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

treblehit
treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35119
RAC: 0

RE: At this point you

Message 80153 in response to message 80152

Quote:
At this point you could either choose to jigger the bounds of tasks (allowing it to reach where host_scale kicks in) or alternatively let it go on erroring & see what happens (I imagine it'd just keep erroring & reduce quota to 1/day).

That's what happened. Down to 1 wu/day and I'm done for the day.

Man, am I ever glad I drove that one hour round trip in a 15mpg vehicle to try to get a steady stream of work headed Albert's direction.

There's always the 1 wu I'll get tomorrow.

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

lol, yeah, all in a good

Message 80154 in response to message 80153

lol, yeah, all in a good cause though :) Obvious breakage like that makes the case, put forward in some quarters, that it's working fine look a tad ridiculous. The more 'normal' situations like that that simply don't work, the better we understand, and the harder we can push to get it fixed once and for all.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

From treblehit's server log

From treblehit's server log https://albert.phys.uwm.edu/host_sched_logs/11/11519

Quote:
2014-06-29 09:21:30.4581 [PID=3880 ] [version] [AV#738] (BRP5-opencl-ati) adjusting projected flops based on PFC avg: 16250.85G
2014-06-29 09:21:30.4581 [PID=3880 ] [version] Best app version is now AV738 (0.89 GFLOP)
2014-06-29 09:21:30.4581 [PID=3880 ] [version] [AV#738] (BRP5-opencl-ati) adjusting projected flops based on PFC avg: 16250.85G
2014-06-29 09:21:30.4581 [PID=3880 ] [version] Best version of app einsteinbinary_BRP5 is [AV#738] (16250.85 GFLOPS)


I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.
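To illustrate, a cap of that kind might look like the following (a minimal sketch, not BOINC code; the 0.2 ceiling is an assumed rule-of-thumb fraction of peak):

[pre]
// Hypothetical sanity cap on projected flops - a sketch, not BOINC source.
// peak_flops is the client-calculated 'GFLOPS peak' figure for the device.
double sane_projected_flops(double projected_flops, double peak_flops) {
    const double MAX_REAL_FRACTION = 0.2;  // assumed: no app exceeds 20% of peak
    double cap = MAX_REAL_FRACTION * peak_flops;
    return (projected_flops > cap) ? cap : projected_flops;
}
[/pre]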

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: I do think we ought to

Message 80156 in response to message 80155

Quote:
I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.


Doesn't the Main project have this adjustment because they have a single DCF there, but we don't use DCF here, so this adjustment shouldn't be used?

Claggy

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: From treblehit's server

Message 80157 in response to message 80155

Quote:

From treblehit's server log https://albert.phys.uwm.edu/host_sched_logs/11/11519

Quote:
2014-06-29 09:21:30.4581 [PID=3880 ] [version] [AV#738] (BRP5-opencl-ati) adjusting projected flops based on PFC avg: 16250.85G
2014-06-29 09:21:30.4581 [PID=3880 ] [version] Best app version is now AV738 (0.89 GFLOP)
2014-06-29 09:21:30.4581 [PID=3880 ] [version] [AV#738] (BRP5-opencl-ati) adjusting projected flops based on PFC avg: 16250.85G
2014-06-29 09:21:30.4581 [PID=3880 ] [version] Best version of app einsteinbinary_BRP5 is [AV#738] (16250.85 GFLOPS)

I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.

Sure, first from the client perspective:
referring to the dodgy diagram, factoring in the bad onramp-period default pfc_scale of 0.1 for GPUs, and the inactive host_scale (x1), results in:

wu pfc ('peak flop claim') est = 0.1*1*wu_est (10% of minimum possible)
device peak_flops on a standard GPU is likely ~20x the actual rate (app, card & system dependent)
--> est about 1/200th of required elapsed --> bound exceeded
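To make the arithmetic concrete, here it is as a sketch (illustrative round numbers only; the 5% real efficiency and the workunit size are assumptions, not measurements):

[pre]
#include <cstdio>

// Sketch of the onramp arithmetic above - illustrative numbers, not BOINC source.
int main() {
    double peak_flops      = 3584e9;   // client-reported 'marketing' peak
    double real_efficiency = 0.05;     // assumed real GPU app throughput, ~5% of peak
    double wu_fpops_est    = 1750e12;  // hypothetical workunit size in FLOPs

    // onramp defaults: pfc_scale pinned at 0.1, host_scale inactive (x1)
    double projected_flops = peak_flops / 0.1;               // 10x the marketing peak
    double est_elapsed     = wu_fpops_est / projected_flops; // what the estimate says
    double real_elapsed    = wu_fpops_est / (real_efficiency * peak_flops);

    printf("estimate %.0fs vs actual %.0fs (ratio 1:%.0f)\n",
           est_elapsed, real_elapsed, real_elapsed / est_elapsed);
    // prints a 1:200 ratio - even a generous rsc_fpops_bound gets blown through
    return 0;
}
[/pre]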

Now digging through server end...

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: I do think we

Message 80158 in response to message 80156

Quote:
Quote:
I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.

Doesn't the Main project have this adjustment because they have a single DCF there, but we don't use DCF here, so this adjustment shouldn't be used?

Claggy

There is no adjustment - the adjustment is a lie. It is hard-wired active for all clients >= 7.0.28.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: RE: RE: I do think

Message 80159 in response to message 80158

Quote:
Quote:
Quote:
I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.

Doesn't the Main project have this adjustment because they have a single DCF there, but we don't use DCF here, so this adjustment shouldn't be used?

Claggy

There is no adjustment - the adjustment is a lie. It is hard-wired active for all clients >= 7.0.28.


But only on projects that don't use dcf. Einstein on my i7-2600K/HD7770 has a dcf of:

1.267963

Albert of course has:

Claggy

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: RE: RE: I do

Message 80160 in response to message 80159

Quote:
Quote:
Quote:
Quote:
I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.

Doesn't the Main project have this adjustment because they have a single DCF there, but we don't use DCF here, so this adjustment shouldn't be used?

Claggy

There is no adjustment - the adjustment is a lie. It is hard-wired active for all clients >= 7.0.28.


But only on projects that don't use dcf. Einstein on my i7-2600K/HD7770 has a dcf of:

1.267963

Claggy

Well you've lost me there, because every scheduler reply to a >= 7.0.28 client, according to the scheduler code, pushes it [and there is no configuration switch for it].

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: RE: RE: RE: Quote

Message 80161 in response to message 80160

Quote:
Quote:
Quote:
Quote:
Quote:
I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.

Doesn't the Main project have this adjustment because they have a single DCF there, but we don't use DCF here, so this adjustment shouldn't be used?

Claggy

There is no adjustment - the adjustment is a lie. It is hard-wired active for all clients >= 7.0.28.


But only on projects that don't use dcf. Einstein on my i7-2600K/HD7770 has a dcf of:

1.267963

Claggy

Well you've lost me there, because every scheduler reply to a >= 7.0.28 client, according to the scheduler code, pushes it [and there is no configuration switch for it].


Einstein has an older scheduler than Albert (or at least an older server version):

29/06/2014 11:45:58 | Einstein@Home | sched RPC pending: Requested by user
29/06/2014 11:45:58 | Einstein@Home | [sched_op] Starting scheduler request
29/06/2014 11:45:58 | Einstein@Home | Sending scheduler request: Requested by user.
29/06/2014 11:45:58 | Einstein@Home | Not requesting tasks: "no new tasks" requested via Manager
29/06/2014 11:45:58 | Einstein@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
29/06/2014 11:45:58 | Einstein@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
29/06/2014 11:46:00 | Einstein@Home | Scheduler request completed
29/06/2014 11:46:00 | Einstein@Home | [sched_op] Server version 611
29/06/2014 11:46:00 | Einstein@Home | Project requested delay of 60 seconds
29/06/2014 11:46:00 | Einstein@Home | [sched_op] Deferring communication for 00:01:00
29/06/2014 11:46:00 | Einstein@Home | [sched_op] Reason: requested by project
29/06/2014 11:46:05 | Albert@Home | sched RPC pending: Requested by user
29/06/2014 11:46:05 | Albert@Home | [sched_op] Starting scheduler request
29/06/2014 11:46:05 | Albert@Home | Sending scheduler request: Requested by user.
29/06/2014 11:46:05 | Albert@Home | Reporting 2 completed tasks
29/06/2014 11:46:05 | Albert@Home | Not requesting tasks: don't need
29/06/2014 11:46:05 | Albert@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
29/06/2014 11:46:05 | Albert@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
29/06/2014 11:46:08 | Albert@Home | Scheduler request completed
29/06/2014 11:46:08 | Albert@Home | [sched_op] Server version 703
29/06/2014 11:46:08 | Albert@Home | Project requested delay of 60 seconds
29/06/2014 11:46:08 | Albert@Home | [sched_op] handle_scheduler_reply(): got ack for task h1_0997.10_S6Direct__S6CasAf40_997.55Hz_1017_1
29/06/2014 11:46:08 | Albert@Home | [sched_op] handle_scheduler_reply(): got ack for task p2030.20130202.G202.32-01.96.N.b2s0g0.00000_2384_5
29/06/2014 11:46:08 | Albert@Home | [sched_op] Deferring communication for 00:01:00
29/06/2014 11:46:08 | Albert@Home | [sched_op] Reason: requested by project

Claggy

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

Ah alright, Yeah only

Message 80162 in response to message 80161

Ah alright,
Yeah, only interested in fixing the current code, rather than diagnosing/patching old versions :)

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: Ah alright, Yeah

Message 80163 in response to message 80162

Quote:
Ah alright,
Yeah, only interested in fixing the current code, rather than diagnosing/patching old versions :)


I was thinking that they were using Einstein customisations here that might not be needed; looking at robl's Einstein log shows it's the durations that get scaled there:

http://einstein.phys.uwm.edu/hosts_user.php?userid=613597

2014-06-29 09:28:50.6296 [PID=17986] [send] [HOST#7536795] Sending app_version 483 einsteinbinary_BRP5 7 139 BRP5-cuda32-nv270; 49.97 GFLOPS
2014-06-29 09:28:50.6312 [PID=17986] [send] est. duration for WU 193304662: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6312 [PID=17986] [HOST#7536795] Sending [RESULT#443159459 PB0024_00191_182_0] (est. dur. 18527.18 seconds)
2014-06-29 09:28:50.6324 [PID=17986] [send] est. duration for WU 193307638: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6324 [PID=17986] [send] [WU#193307638] meets deadline: 18527.18 + 18527.18 < 1209600
2014-06-29 09:28:50.6332 [PID=17986] [send] [HOST#7536795] Sending app_version 483 einsteinbinary_BRP5 7 139 BRP5-cuda32-nv270; 49.97 GFLOPS
2014-06-29 09:28:50.6347 [PID=17986] [send] est. duration for WU 193307638: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6347 [PID=17986] [HOST#7536795] Sending [RESULT#443165551 PB0024_00141_24_0] (est. dur. 18527.18 seconds)
2014-06-29 09:28:50.6356 [PID=17986] [send] est. duration for WU 193249827: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6356 [PID=17986] [send] [WU#193249827] meets deadline: 37054.37 + 18527.18 < 1209600
2014-06-29 09:28:50.6364 [PID=17986] [send] [HOST#7536795] Sending app_version 483 einsteinbinary_BRP5 7 139 BRP5-cuda32-nv270; 49.97 GFLOPS
2014-06-29 09:28:50.6380 [PID=17986] [send] est. duration for WU 193249827: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6381 [PID=17986] [HOST#7536795] Sending [RESULT#443038987 PB0023_01561_144_0] (est. dur. 18527.18 seconds)

Claggy

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

Now the server side, that

Now the server side: that 'Best version of app' string comes from sched_version.cpp (scheduler inbuilt functions) and uses the following resources:
app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9

That projected_flops is set during app version selection; as the number of host samples will be < 10, flops will be adjusted based on the pfc samples average for the app version (there will be 100+ of those from other users).

Since that's normalised elsewhere (see red ellipse on dodgy diagram), the net effect translates the pfc of 0.1 used for the original estimate to 1, so peak_flops is multiplied x10-20.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: I was thinking that

Message 80165 in response to message 80163

Quote:
I was thinking that they were using Einstein customisations here that might not be needed; looking at robl's Einstein log shows it's the durations that get scaled there:

Yeah, they were before. Quite a lot of work Bernd had to do to get here onto stock updated server code. Now (here) it should be pretty close or identical (for our purposes) to current Boinc master, IIRC.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Now the server side,

Message 80166 in response to message 80164

Quote:

Now the server side: that 'Best version of app' string comes from sched_version.cpp (scheduler inbuilt functions) and uses the following resources:
app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9

That projected_flops is set during app version selection; as the number of host samples will be < 10, flops will be adjusted based on the pfc samples average for the app version (there will be 100+ of those from other users).

Since that's normalised elsewhere (see red ellipse on dodgy diagram), the net effect translates the pfc of 0.1 used for the original estimate to 1, so peak_flops is multiplied x10-20.

Richard do you want code line numbers for that ?

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: Ah alright, Yeah

Message 80167 in response to message 80162

Quote:
Ah alright,
Yeah, only interested in fixing the current code, rather than diagnosing/patching old versions :)


Yes, concentrating on the current code and moving it forward is certainly the right approach - but it's probably worth just being aware of the steps we moved through to reach this point, because it can influence compatibility problems that could arise in the future.

As we've discussed, DCF was deprecated from client v7.0.28, and in the server code from a little earlier. But not everything in the BOINC world moves in lockstep, so we have older and newer servers in use, and we also have older and newer clients in use.

Older servers take account of client DCF when scaling runtime estimates prior to allocating work:
[send] active_frac 0.999987 on_frac 0.999802 DCF 0.776980
Newer servers don't:
[send] on_frac 0.999802 active_frac 0.999987 gpu_active_frac 0.999978
Those are both the same machine (the one I've been graphing here), which explains why on_frac and active_frac are identical. But the first line comes from the Einstein server log, and the second line from the Albert server log.

So, even my late-alpha version of BOINC (v7.3.19) is maintaining, using and reporting DCF against an 'old server' project which needs it. Good compatibility choice.

But the reverse case is not so happy. An older client (I'm talking standard stock clients here, not Jason's specially-tweaked client) will go on using and reporting DCF as before, because it doesn't parse the tag. But the newer server code has discarded DCF completely, and doesn't scale its internal runtime estimates when presented with a work request from a client which is still using it.

This can - and does - result in servers allocating vastly different volumes of work from what the client expects, because the estimation process doesn't have all the same inputs.

Say, for the sake of argument, that an 'old' (pre-v7.0.28) client has got itself into a state with DCF=100, and asks for 1 day of work. For the BRP4G tasks we're studying here, we'd all expect the server to allocate maybe 20 tasks, and the client to agree with the server's calculation of estimated runtime, slightly over 1 day. But if the client is using DCF, and the server isn't, that can appear as a 100-day work cache when the client does the local calculation. That's a case where server-client compatibility breaks down, and breaks down badly.
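In code form, that breakdown looks roughly like this (the task count and per-task estimate are the illustrative figures from the paragraph above):

[pre]
#include <cstdio>

// Sketch of the client/server divergence when only the client applies DCF.
int main() {
    double server_est_per_task = 4500.0;  // assumed per-task estimate, seconds
    double client_dcf          = 100.0;   // runaway duration correction factor
    int    tasks_sent          = 20;      // server fills a '1 day' work request

    double server_view = tasks_sent * server_est_per_task;               // ~1 day
    double client_view = tasks_sent * server_est_per_task * client_dcf;  // ~100 days

    printf("server: %.1f days of work; client: %.1f days\n",
           server_view / 86400.0, client_view / 86400.0);
    return 0;
}
[/pre]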

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

It's a bit of a stretch to

Message 80168 in response to message 80167

It's a bit of a stretch to examine border cases when the standard setup doesn't even work right. IMO let's start at the common case & work outward, because I guarantee if the numbers come up flaky there, then they aren't going to be magically better with incompatible servers and clients.

For the present question, the old Project DCF specifically isn't involved in treblehit's example, on Albert, in any way (even though it's maintained by the client). It's the improper normalisation with inactive host_scale appearing in another form.

... however...

since both host_scale and pfc_scale are somewhat noisy and unstable 'per-app DCFs' in disguise, and improperly normalised, it amounts to familiar sets of wacky number symptoms. If you keep looking for those you will find them everywhere, because the entire system is dependent on these, and you'd just end up swearing Project DCF is active server side - which, in a sense, through a lot of spaghetti, it is, though it isn't called that, and it's per app version and per host app version instead.

i.e. forget Project DCF (for now), use pfc_scale & host_scale.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: RE: Now the server

Message 80169 in response to message 80166

Quote:
Quote:

Now the server side: that 'Best version of app' string comes from sched_version.cpp (scheduler inbuilt functions) and uses the following resources:
app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9

That projected_flops is set during app version selection; as the number of host samples will be < 10, flops will be adjusted based on the pfc samples average for the app version (there will be 100+ of those from other users).

Since that's normalised elsewhere (see red ellipse on dodgy diagram), the net effect translates the pfc of 0.1 used for the original estimate to 1, so peak_flops is multiplied x10-20.

Richard do you want code line numbers for that ?


I can't quickly find the client GFLOPS peak number for Claggy's ATI 'Capeverde' with "based on PFC avg: 34968.78G". I'd like to look for the variable (presumably a struct member) where we might expect GFLOPS peak to be stored, and see what it's multiplied by in those initial stages before 11 completions establish an APR. We might expect 0.1 from the words, but we seem to be using >10 by the numbers.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

right, that's what I meant by

Message 80170 in response to message 80169

right, that's what I meant by line numbers (with brief description)

Claggy's case:

Quote:
if (av.pfc.n > MIN_VERSION_SAMPLES) {
    hu.projected_flops = hu.peak_flops/av.pfc.get_avg();
    if (config.debug_version_select) {
        log_messages.printf(MSG_NORMAL,
            "[version] [AV#%d] (%s) adjusting projected flops based on PFC avg: %.2fG\n",
            av.id, av.plan_class, hu.projected_flops/1e9
        );
    }
}

That is his marketing flops estimate: peak_flops / the app version's pfc average.

The app version pfc is normalised to 0.1 (design flaw), and any real samples would have driven it toward 0.05 or lower. So that figure comes out at 10-20x+ marketing flops, which is NOT the intent, nor remotely correct design. It's gibberish.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: I can't quickly find

Message 80171 in response to message 80169

Quote:
I can't quickly find the client GFLOPS peak number for Claggy's ATI 'Capeverde' with "based on PFC avg: 34968.78G". I'd like to look for the variable (presumably a struct member) where we might expect GFLOPS peak to be stored, and see what it's multiplied by in those initial stages before 11 completions establish an APR. We might expect 0.1 from the words, but we seem to be using >10 by the numbers.


17/06/2014 18:17:17 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.5 (VM), device version OpenCL 1.2 AMD-APP (1348.5), 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.5 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.5))

Claggy

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

there you go. app version pfc

Message 80172 in response to message 80171

there you go. app version pfc average (!) is 3584GFLOPS/34968.78 ~= 0.102**

[Edit:]
** unfortunately, that's improperly normalised, so meaningless without the normalisation reference app version figure, as per the red ellipse on the diagram... so the true figure will likely be around 0.02 or so, but it's anybody's guess without knowing which app version sits at 0.1

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: app version pfc is

Message 80173 in response to message 80170

Quote:
The app version pfc is normalised to 0.1 (design flaw), and any real samples would have driven it toward 0.05 or lower. So that figure comes out at 10-20x+ marketing flops, which is NOT the intent, nor remotely correct design. It's gibberish.


The advice given to project administrators in http://boinc.berkeley.edu/trac/wiki/AppPlanSpec is:

Quote:
x
scale GPU peak speed by this (default 1).


I'm wondering whether they put in 0.1, expecting this to be a multiplier (real flops are lower than peak flops), but end up dividing by 0.1 instead? And from what you say, 'default 1' doesn't match the code either?

Edit: the alternative C++ documentation for plan_classes is in http://boinc.berkeley.edu/trac/wiki/PlanClassFunc. There, the example is

.21 // estimated GPU efficiency (actual/peak FLOPS)

At least one of those must be upside down.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: app version pfc

Message 80174 in response to message 80173

Quote:
Quote:
The app version pfc is normalised to 0.1 (design flaw), and any real samples would have driven it toward 0.05 or lower. So that figure comes out at 10-20x+ marketing flops, which is NOT the intent, nor remotely correct design. It's gibberish.

The advice given to project administrators in http://boinc.berkeley.edu/trac/wiki/AppPlanSpec is:

Quote:
x
scale GPU peak speed by this (default 1).

I'm wondering whether they put in 0.1, expecting this to be a multiplier (real flops are lower than peak flops), but end up dividing by 0.1 instead? And from what you say, 'default 1' doesn't match the code either?

nope [0.1 is hardwired via 'magic number'], and 1 wouldn't be right for a GPU anyway. Correct would be ~0.05: don't normalise (except for credit), and enable+set a default host_scale of 1 from the start... which would yield a projected flops (before convergence) of 0.05 x 1 x peak_flops ... basically one 20th of the marketing flops... then [let it] scale itself.
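As a sketch of that proposal against the current behaviour (descriptive names only, not BOINC identifiers):

[pre]
// Current onramp behaviour vs the proposal above - a descriptive sketch.
double current_projected_flops(double peak_flops) {
    const double pfc_scale = 0.1;       // hard-wired 'magic number'
    return peak_flops / pfc_scale;      // divides: 10x peak, estimates collapse
}

double proposed_projected_flops(double peak_flops) {
    const double gpu_efficiency = 0.05; // ~1/20th of marketing flops
    const double host_scale     = 1.0;  // enabled and defaulted from the start
    return gpu_efficiency * host_scale * peak_flops;  // multiplies: conservative
    // ...then let host_scale converge on the host's measured performance
}
[/pre]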

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

See edit to my last. In my

Message 80175 in response to message 80174

See edit to my last. In my view, if the relevant numbers are all <<1, we should be multiplying by them, not dividing by them.

Out of coffee error - going shopping. Back soon.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: See edit to my last. In

Message 80176 in response to message 80175

Quote:

See edit to my last. In my view, if the relevant numbers are all <<1, we should be multiplying by them, not dividing by them.

Out of coffee error - going shopping. Back soon.

The main issue is really that he starts with real marketing flops (more or less usable), works out an average efficiency there (yuck, but still OK-ish), but then he normalises to some other app version... IOW multiplies by some arbitrary large number (or divides by some fraction if you prefer) with no connection to real throughputs or efficiencies on this device+app.

That's OK for a relative number for credit (debatable)... but totally useless for time and throughput estimates (which are absolute estimates). Improper normalisation shrank your time estimate by inflating projected_flops to 10x+ the already-bloated marketing flops.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: At least one of those

Message 80177 in response to message 80173

Quote:
At least one of those must be upside down.

In a sense yes. GPU app+device+conditions efficiency would be actual/peak, and must be less than 1 (and it is, e.g. it should be around 0.05 for a single-task Cuda GPU). Normalisation could be viewed as turning it upside down. It'll raise the GFlops & shrink the time estimate artificially --> the exact opposite of the kind of behaviour we want for new hosts/apps.

A bit will become clearer when I have the next dodgy diagram ready. Getting bogged down in broken code is a bit of a red herring at the moment, as there are design-level issues to tackle first.

In particular, debugging the normalisation, including the absurd GFlops numbers it produces, is pointless in the context of estimates. That's because neither the time nor the GFlops should be normalised [AT ALL], so it all gets disabled in estimates, and restricted to credit-related uses, where it's applicable for getting the same credit claims from different apps.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: RE: At least one of

Message 80178 in response to message 80177

Quote:
Quote:
At least one of those must be upside down.

In a sense yes. GPU app+device+conditions efficiency would be actual/peak, and must be less than 1 (and it is, e.g. it should be around 0.05 for a single-task Cuda GPU). Normalisation could be viewed as turning it upside down. It'll raise the GFlops & shrink the time estimate artificially --> the exact opposite of the kind of behaviour we want for new hosts/apps.

A bit will become clearer when I have the next dodgy diagram ready. Getting bogged down in broken code is a bit of a red herring at the moment, as there are design-level issues to tackle first.

In particular, debugging the normalisation, including the absurd GFlops numbers it produces, is pointless in the context of estimates. That's because neither the time nor the GFlops should be normalised [AT ALL], so it all gets disabled in estimates, and restricted to credit-related uses, where it's applicable for getting the same credit claims from different apps.


Well, we do (crudely) have two separate cases to deal with.

1) initial attach. We have to get rid of that divide-by-almost-zero, or hosts can't run. They get the absurdly low runtime estimate/bound and error when they exceed it.

2) steady state. In my (political) opinion, trying to bring back client-side DCF will be flogging one dead horse too many. We need some sort of server-side control of runtime estimates, so that client scheduling works and user expectations are met. I'm happy to accept that the new version will be different to the one we have now, and look forward to seeing it.

OK, I'll get out of your hair, and take my coffee downstairs to grab some more stats.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: RE: At least

Message 80179 in response to message 80178

Quote:
Quote:
Quote:
At least one of those must be upside down.

In a sense yes. GPU app+device+conditions efficiency would be actual/peak, and must be less than 1 (and it is, e.g. it should be around 0.05 for a single-task Cuda GPU). Normalisation could be viewed as turning it upside down. It'll raise the GFlops & shrink the time estimate artificially --> the exact opposite of the kind of behaviour we want for new hosts/apps.

A bit will become clearer when I have the next dodgy diagram ready. Getting bogged down in broken code is a bit of a red herring at the moment, as there are design-level issues to tackle first.

In particular, debugging the normalisation, including the absurd GFlops numbers it produces, is pointless in the context of estimates. That's because neither the time nor the GFlops should be normalised [AT ALL], so it all gets disabled in estimates, and restricted to credit-related uses, where it's applicable for getting the same credit claims from different apps.


Well, we do (crudely) have two separate cases to deal with.

1) initial attach. We have to get rid of that divide-by-almost-zero, or hosts can't run. They get the absurdly low runtime estimate/bound and error when they exceed it.

2) steady state. In my (political) opinion, trying to bring back client-side DCF will be flogging one dead horse too many. We need some sort of server-side control of runtime estimates, so that client scheduling works and user expectations are met. I'm happy to accept that the new version will be different to the one we have now, and look forward to seeing it.

OK, I'll get out of your hair, and take my coffee downstairs to grab some more stats.

LoL, always appreciate bouncing it around, thanks. At the moment it's a bit like pointing to a bucket of kittens and saying 'that's not the flower-pot I ordered!'. Yeah it's possible to debate over the intent versus function more, but when push comes to shove it's just wrong & gives wacky numbers. Not really any more complicated than that in some sense ;)

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Snow Crash
Snow Crash
Joined: 11 Aug 13
Posts: 10
Credit: 5011603
RAC: 0

[pre]June 29, 2014 18:00

[pre]June 29, 2014 18:00 UTC
https://albertathome.org/host/9649
BRP4G 2x using 1 cpu thread each (app_config), GPU utilization = 92%
running an additional 4x Skynet POGs cpu WUs
GPU 7950 mem=1325, gpu=1150, pcie v2 x16
OS Win7 x64 Home Premium
CPU 980X running at 3.41 GHz with HT off
MEM Triple channel 1600 (7.7.7.20.2)[/pre]

treblehit
treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35119
RAC: 0

RE: 1) initial attach. We

Message 80181 in response to message 80178

Quote:

1) initial attach. We have to get rid of that divide-by-almost-zero, or hosts can't run. They get the absurdly low runtime estimate/bound and error when they exceed it.

I'll be bringing more machines online today in a desperate attempt to provide steady, un-fiddled-with, untweaked, vanilla BRP4G work for you.

I just need instruction: A) let them fail so you can see that, or B) somehow prevent them from failing so that you have the reliable work-flow.

Instructions, please.

Bret

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Um, if you don't mind, I

Message 80182 in response to message 80181

Um, if you don't mind, I think it might be best to wait a little time. The administrators on this project are based in Europe, and as you know Jason is ahead of our time-zone, in Australia. I think it might be better to wait 12 hours or so, until we have a chance to compare notes by email when the lab opens in the morning.

After all, we don't want to use up our entire supply of unattached new hosts in one hit, or else we won't have anything left to test Jason's patches with....

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

treblehit
treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35119
RAC: 0

RE: Um, if you don't mind,

Message 80183 in response to message 80182

Quote:


Um, if you don't mind, I think it might be best to wait a little time.

I completely understand, Richard. I was reluctant to bring it up in the first place.

Unfortunately for me I have to deal with the hardware side of it when I can, so I'm going to cope with that today. I'll get it ready to connect remotely when you guys are ready for it.

Let me know. You both know how to find me when and if you want me.

In the meantime, I'm going to detach this host and go away to stop being a distraction.

I only started this because "She Who Must Be Obeyed" had indicated you guys needed a reliable and unchanging stream of BRP4G tasks over on the GPU User's Group team message board.

Bret

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Um, if you don't mind,

Message 80184 in response to message 80182

Quote:

Um, if you don't mind, I think it might be best to wait a little time. The administrators on this project are based in Europe, and as you know Jason is ahead of our time-zone, in Australia. I think it might be better to wait 12 hours or so, until we have a chance to compare notes by email when the lab opens in the morning.

After all, we don't want to use up our entire supply of unattached new hosts in one hit, or else we won't have anything left to test Jason's patches with....

Yes, unhooking that normalisation (which divides by ~0.1, multiplies the GPU GFlops ~10x into absurd levels, and shrinks time estimates) is going to take quite some preparation to do *safely*. That same mechanism is hooked into credit (where it does make sense), so quite a lot of backwards & forwards for clarification, discussion and debate will be needed to get it 'right', and part of that's going to be me communicating effectively (which isn't always easy :)).

The other aspect is that some bandaids will be painful to rip off, and still other odd artefacts might be hiding inside... and the only way to tell for sure is to open it up.

The next few days will tell if we're all on the same page (but looking from different angles is fine). To me though, we are well through the tricky bits of understanding the current system enough to say it needs to be a lot better.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Latest scattergram. I've

Latest scattergram.

I've reverted my 5367 to normal running (early afternoon yesterday), so my timings *should* be lower and steadier - doesn't really seem to show in credit yet. I wonder why Claggy's laptop gets such variable credit?

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: I wonder why Claggy's

Message 80186 in response to message 80185

Quote:
I wonder why Claggy's laptop gets such variable credit?

Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's), and then that's averaged with the wingman's (Yellow triangle on dodgy diagram). So the result can be anywhere from normal range to jackpot, as we previously assessed, depending on the wingman's claim. Though the prevalence of the jackpot conditions is less obvious, the noise in the system is still there.
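As a toy illustration of that averaging (deliberately simplified; the real CreditNew pipeline applies more scaling than a plain mean):

[pre]
#include <cstdio>

// Toy model: granted credit as the mean of two wingman claims.
double granted(double claim_a, double claim_b) { return (claim_a + claim_b) / 2.0; }

int main() {
    double normal_claim  = 2000.0;   // hypothetical typical claim
    double jackpot_claim = 10000.0;  // inflated pfc claim from a long-running small GPU

    printf("two normal wingmen:  %.0f\n", granted(normal_claim, normal_claim));   // 2000
    printf("one jackpot wingman: %.0f\n", granted(normal_claim, jackpot_claim));  // 6000
    // the pairing, not the host itself, decides where in the range a result lands
    return 0;
}
[/pre]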

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: RE: I wonder why

Message 80187 in response to message 80186

Quote:
Quote:
I wonder why Claggy's laptop gets such variable credit?

Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's), and then that's averaged with the wingman's (Yellow triangle on dodgy diagram). So the result can be anywhere from normal range to jackpot, as we previously assessed, depending on the wingman's claim. Though the prevalence of the jackpot conditions is less obvious, the noise in the system is still there.


I'm just running a single GPU task on both my GPU hosts, (the T8100's 128Mb 8400M GS doesn't count).

Claggy

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: RE: I wonder

Message 80188 in response to message 80187

Quote:
Quote:
Quote:
I wonder why Claggy's laptop gets such variable credit?

Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's), and then that's averaged with the wingman's (Yellow triangle on dodgy diagram). So the result can be anywhere from normal range to jackpot, as we previously assessed, depending on the wingman's claim. Though the prevalence of the jackpot conditions is less obvious, the noise in the system is still there.


I'm just running a single GPU task on both my GPU hosts, (the T8100's 128Mb 8400M GS doesn't count).

Claggy

Could be the wingmen. (There's a number of combinations of wingmen types that'll give random results between two regions. Two similar wingmen tend to cancel with averaging and become 'normal')

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: RE: RE: RE: I

Message 80189 in response to message 80188

Quote:
Quote:
Quote:
Quote:
I wonder why Claggy's laptop gets such variable credit?

Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's), and then that's averaged with the wingman's (Yellow triangle on dodgy diagram). So the result can be anywhere from normal range to jackpot, as we previously assessed, depending on the wingman's claim. Though the prevalence of the jackpot conditions is less obvious, the noise in the system is still there.


I'm just running a single GPU task on both my GPU hosts, (the T8100's 128Mb 8400M GS doesn't count).

Claggy


Could be the wingmen. (There's a number of combinations of wingmen types that'll give random results between two regions. Two similar wingmen tend to cancel with averaging and become 'normal')


Conversely, when he's paired with me - now back to lower, stable, runtimes - no jackpot, no bonus. Sorry 'bout that.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: RE: RE: Quote

Message 80190 in response to message 80189

Quote:
Quote:
Quote:
Quote:
Quote:
I wonder why Claggy's laptop gets such variable credit?

Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's), and then that's averaged with the wingman's (Yellow triangle on dodgy diagram). So the result can be anywhere from normal range to jackpot, as we previously assessed, depending on the wingman's claim. Though the prevalence of the jackpot conditions is less obvious, the noise in the system is still there.


I'm just running a single GPU task on both my GPU hosts, (the T8100's 128Mb 8400M GS doesn't count).

Claggy


Could be the wingmen. (There's a number of combinations of wingmen types that'll give random results between two regions. Two similar wingmen tend to cancel with averaging and become 'normal')

Conversely, when he's paired with me - now back to lower, stable, runtimes - no jackpot, no bonus. Sorry 'bout that.

LoL, yep, throwing the dice to get an answer is as good as any ;)

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

juan BFB
juan BFB
Joined: 10 Dec 12
Posts: 8
Credit: 1674320
RAC: 0

@Richard/Claggy Should i

@Richard/Claggy

Should I continue to crunch BRP4G only, or do you suggest crunching another type of WU too? (I could do GPU work only here.)

BTW I slowed down my crunchers here, since I don't believe quantity is what you're looking for, and now they will produce a stable number of daily WUs.

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: BTW I slow down my

Message 80192 in response to message 80191

Quote:
BTW I slowed down my crunchers here, since I don't believe quantity is what you're looking for, and now they will produce a stable number of daily WUs.


I think that's probably a good idea. We're already at the stage where my last 12 consecutive validations have been against one or other of your hosts (5 different machines, I think). And the machines are all pretty similar, to each other and to mine: GTX 670/690/780, running Win7/64 or (in one case) Server 2008.

In order to see (now) and test (later) BOINC's behaviour in the real world, we probably need a reasonable variation in hosts to give us realistic variation in the times and credits.

Bernd has launched a new 'BRP5' (Perseus Arm Survey) v1.40, with a Beta app tag on it, to test that new feature in the BOINC scheduler. I'm in the process of switching my machine over to run that instead: some company would be nice, but be warned: we're half expecting to fall over the 'EXIT_TIME_LIMIT_EXCEEDED' problem at some stage with BRP5 Beta, so hosts running it probably need to be watched quite closely for strange estimated runtimes, and you need to be ready to take action to correct it.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Holmis
Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

RE: ... some company would

Message 80193 in response to message 80192

Quote:
... some company would be nice, but be warned: we're half expecting to fall over the 'EXIT_TIME_LIMIT_EXCEEDED' problem at some stage with BRP5 Beta...


I just downloaded my first v1.40 BRP5 and I'd say it's looking pretty good so far! The estimated completion time shown in Boinc is 5h03m08s.
These are the relevant lines from the scheduler log:

Quote:
2014-07-02 19:35:03.2067 [PID=25783] [version] Best version of app einsteinbinary_BRP5 is [AV#934] (24.74 GFLOPS)
2014-07-02 19:35:03.2067 [PID=25783] [send] est delay 0, skipping deadline check
2014-07-02 19:35:03.2067 [PID=25783] [version] get_app_version(): getting app version for WU#625766 (PB0020_006A1_164) appid:27
2014-07-02 19:35:03.2067 [PID=25783] [version] returning cached version: [AV#934]
2014-07-02 19:35:03.2067 [PID=25783] [send] est delay 0, skipping deadline check
2014-07-02 19:35:03.3000 [PID=25783] [send] Sending app_version einsteinbinary_BRP5 2 140 BRP5-cuda32-nv301; projected 24.74 GFLOPS
2014-07-02 19:35:03.3001 [PID=25783] [send] est. duration for WU 625766: unscaled 18188.26 scaled 18306.56
2014-07-02 19:35:03.3001 [PID=25783] [send] [HOST#2267] sending [RESULT#1514790 PB0020_006A1_164_4] (est. dur. 18306.56s (5h05m06s55)) (max time 363765.12s (101h02m45s11))


And I've got this in the application details:

Quote:
[pre]Binary Radio Pulsar Search (Perseus Arm Survey) 1.40 windows_intelx86 (BRP5-cuda32-nv301)
Number of tasks completed 0
Max tasks per day 0
Number of tasks today 1
Consecutive valid tasks 0
Average turnaround time 0.00 days[/pre]


For v1.39 the tasks took less than 5 hours and the APR was 21.91 GFlops.
Whatever was changed seems to be working with regard to the initial estimates, assuming that the app and workload are more or less the same. Keep up the good work!

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Nothing's been changed

Message 80194 in response to message 80193

Nothing's been changed yet...

I got something similar - 25.25Gflops and 4h57m02s24

Quote:
2014-07-02 17:43:24.7141 [PID=19995] [version] [AV#934] (BRP5-cuda32-nv301) using conservative projected flops: 25.25G
2014-07-02 17:43:24.7141 [PID=19995] [version] Best app version is now AV934 (102.01 GFLOP)
2014-07-02 17:43:24.7142 [PID=19995] [version] Checking plan class 'BRP5-opencl-ati'
2014-07-02 17:43:24.7142 [PID=19995] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 0.480000
2014-07-02 17:43:24.7142 [PID=19995] [version] plan_class_spec: No AMD GPUs found
2014-07-02 17:43:24.7142 [PID=19995] [version] [AV#937] app_plan() returned false
2014-07-02 17:43:24.7142 [PID=19995] [version] Checking plan class 'BRP5-opencl-intel_gpu'
2014-07-02 17:43:24.7142 [PID=19995] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 0.480000
2014-07-02 17:43:24.7142 [PID=19995] [version] [AV#935] Skipping Intel GPU version - user prefs say no Intel GPU
2014-07-02 17:43:24.7142 [PID=19995] [version] [AV#934] (BRP5-cuda32-nv301) using conservative projected flops: 25.25G
2014-07-02 17:43:24.7142 [PID=19995] [version] Best version of app einsteinbinary_BRP5 is [AV#934] (25.25 GFLOPS)
2014-07-02 17:43:24.7142 [PID=19995] [send] est delay 0, skipping deadline check
2014-07-02 17:43:24.7142 [PID=19995] [version] get_app_version(): getting app version for WU#625736 (PB0020_006A1_104) appid:27
2014-07-02 17:43:24.7143 [PID=19995] [version] returning cached version: [AV#934]
2014-07-02 17:43:24.7143 [PID=19995] [send] est delay 0, skipping deadline check
2014-07-02 17:43:24.7197 [PID=19995] [send] Sending app_version einsteinbinary_BRP5 2 140 BRP5-cuda32-nv301; projected 25.25 GFLOPS
2014-07-02 17:43:24.7198 [PID=19995] [send] est. duration for WU 625736: unscaled 17819.43 scaled 17822.25
2014-07-02 17:43:24.7198 [PID=19995] [send] [HOST#5367] sending [RESULT#1523511 PB0020_006A1_104_6] (est. dur. 17822.25s (4h57m02s24)) (max time 356388.68s (98h59m48s67))


But note the line I've picked out ('using conservative projected flops'): that means there are fewer than 100 completed tasks for this app_version yet, across the project as a whole.

The worry is that when 100 tasks have been completed, but before you have completed 11 tasks on your host (to use APR), you'll see 'adjusting projected flops based on PFC avg' and some absurdly large number. That'll be when the errors (if any) start.
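The three estimate regimes being described, as a sketch (the thresholds - roughly 100 version samples and 11 host completions - are the ones quoted in this thread, and conservative_flops stands in for however the scheduler derives its initial figure, 25.25G in the log above):

[pre]
// Sketch of the three projected-flops regimes visible in these scheduler logs.
double projected_flops(int version_samples, int host_completions,
                       double peak_flops, double pfc_avg,
                       double host_apr, double conservative_flops) {
    if (host_completions >= 11) {
        return host_apr;              // steady state: host's own measured APR
    }
    if (version_samples >= 100) {
        return peak_flops / pfc_avg;  // 'adjusting projected flops based on PFC avg'
    }                                 // -> absurdly large while pfc_avg sits near 0.1
    return conservative_flops;        // 'using conservative projected flops'
}
[/pre]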

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Holmis
Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

Roger that, will keep a close

Message 80195 in response to message 80194

Roger that, will keep a close watch on things until I've completed my first 11 tasks then.

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Well, here's the first

Well, here's the first conundrum:

All Binary Radio Pulsar Search (Perseus Arm Survey) tasks for computer 5367

After 200 minutes of solid GTX 670 work on Perseus, I earn the princely sum of ... 15 credits!

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

Yea, I've got something

Message 80197 in response to message 80196

Yea, I've got something similar, 13.01 cr for 150 minutes of HD7770 work:

https://albertathome.org/workunit/619367

Claggy