Project server code update

The project will be taken down in about an hour to perform an update of the BOINC server code. Ideally you shouldn't notice anything, but usually the world isn't ideal. See you again on the other side.

Comments

Holmis
Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

Noticed that I've been

Noticed that I've been assigned tasks from 2 "new" applications: BRP5 tasks for both the Intel and the Nvidia GPU. Neither of those has an established APR, so I got another shot at the initial estimates.

Here's some numbers from my client_state.xml:

Intel GPU:

Quote:
<flops>581007031069.074340</flops>
<plan_class>BRP5-opencl-intel_gpu</plan_class>


That's 581 GFlops! Boinc reports it @ 147 GFlops peak in the startup messages.

Nvidia GPU:

Quote:
<flops>12454544406626.100000</flops>
<plan_class>BRP5-cuda32-nv301</plan_class>


That's 12454 GFlops, or 12.45 TeraFlops! Boinc reports it @ 2985 GFlops peak in the startup messages. And the APR for the BRP4G Nvidia tasks is 58.1 GFlops when running 2 at a time.
If the BRP5 app gets the same APR, then the initial speed estimate says the card is 214 times as fast as it actually is!!!

Question:
How come the system estimates both resources to be much faster than what Boinc reports as their peak speed? Where's the logic in that?

All downloaded BRP5 tasks come with <rsc_fpops_est>450000000000000.000000</rsc_fpops_est>, or 450,000 Gfpops of estimated work.

Crunching the numbers gives a time estimate for the Intel GPU app @ 774.5 seconds, or 12m54s. The 1st task has been running for 12m55s and reached 1.8% done...
For the tasks assigned to the Nvidia card the estimate is 36 seconds. The first 2 tasks have been running for 1h8m and reached about 30% done...

I've resorted to adding a few zeros to the <rsc_fpops_bound> to prevent Boinc from aborting the tasks with "maximum time limit exceeded".
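For anyone wanting to try the same workaround, this is roughly the shape of the relevant entry in client_state.xml (a sketch only - the task and app names are made up, and the file should only be edited while BOINC is shut down). The runtime estimate is just <rsc_fpops_est> divided by the assumed device speed (450e12 / 581e9 ≈ 774.5s above), and a task is aborted once its elapsed time implies more operations than <rsc_fpops_bound>:

[pre]
<workunit>
    <name>PB0021_00321_42</name>  <!-- illustrative name only -->
    <app_name>einsteinbinary_BRP5</app_name>
    <rsc_fpops_est>450000000000000.000000</rsc_fpops_est>
    <!-- append zeros here so the task can't hit "maximum time limit exceeded" -->
    <rsc_fpops_bound>45000000000000000.000000</rsc_fpops_bound>
</workunit>
[/pre]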

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: ...Question: How come

Message 79948 in response to message 79947

Quote:
...Question:
How come the system estimates both resources to be much faster than what Boinc reports as their peak speed? Where's the logic in that?
...

The initial GPU guesses seem to rely on marketing flops figures with some sort of scaling. There is a coarse error there, because achieving anywhere near rated peak GFlops on a GPU is extremely challenging... i.e. it's a guess, and not a very good one.

Then, basically after the first 11 completed tasks, is where you enter control-systems theory and 'averaging' (using the term 'averaging' loosely, because that part's quite incorrectly implemented).

Unfortunately, in the current mechanism there are a number of identified instabilities. As soon as you introduce instabilities into a feedback control system, you basically either push it completely off the rails, oscillate (either in simple patterns or chaotic ones), or fail to converge at all & just get garbage.

Something like this video of the famous Tacoma Narrows Bridge collapse. You could ask similar questions, like "Where is the logic in building a bridge like that?", and "That bridge looked pretty stable yesterday, why'd it do that?":

http://www.youtube.com/watch?v=j-zczJXSxnw

The answers probably lie in the designers not completely understanding the nature of the engineering problems at the time (quite understandable), and being caught quite off guard ... after all we've been building bridges for a very long time...

[Edit:] LoL, "Gallopin' Gertie" ... that's gonna stick :P

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: All downloaded BRP5

Message 79949 in response to message 79947

Quote:

All downloaded BRP5 tasks come with <rsc_fpops_est>450000000000000.000000</rsc_fpops_est>, or 450,000 Gfpops of estimated work.

Crunching the numbers gives a time estimate for the Intel GPU app @ 774.5 seconds, or 12m54s. The 1st task has been running for 12m55s and reached 1.8% done...
For the tasks assigned to the Nvidia card the estimate is 36 seconds. The first 2 tasks have been running for 1h8m and reached about 30% done...

I've resorted to adding a few zeros to the <rsc_fpops_bound> to prevent Boinc from aborting the tasks with "maximum time limit exceeded".


What Jason said.

To which I'd add: once you have 11 tasks validated, at the next work fetch all the runtime estimates, including for work already downloaded and cached, will jump up to something approaching reality. At this point, you may find that you have too much work to complete within deadline. I'd strongly advise you to manage work fetch carefully to start with, using either NNT or deselecting the new applications in preferences, so you don't overshoot the mark.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Holmis
Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

RE: The initial GPU guesses

Message 79950 in response to message 79948

Quote:
The initial GPU guesses seem to rely on marketing flops figures with some sort of scaling. There is a coarse error there, because achieving anywhere near rated peak GFlops on a GPU is extremely challenging... i.e. it's a guess, and not a very good one.


Might there be a typo in there that multiplies rather than divides, or just a missed sign? It feels like the scaling goes in the opposite direction of what it's supposed to.

I've been following some of the discussion on the Seti boards about this credit system and the code walk, and look forward to the testing of a hopefully more stable and functional system here.
I understand that it's difficult to say, but is there a timetable for when we start testing what will hopefully be improvements to the system?

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: The initial GPU

Message 79951 in response to message 79950

Quote:
Quote:
The initial GPU guesses seem to rely on marketing flops figures with some sort of scaling. There is a coarse error there, because achieving anywhere near rated peak GFlops on a GPU is extremely challenging... i.e. it's a guess, and not a very good one.

Might there be a typo in there that multiplies rather than divides, or just a missed sign? It feels like the scaling goes in the opposite direction of what it's supposed to.

I've been following some of the discussion on the Seti boards about this credit system and the code walk, and look forward to the testing of a hopefully more stable and functional system here.
I understand that it's difficult to say, but is there a timetable for when we start testing what will hopefully be improvements to the system?

Basically, if you overestimate the speed of the device, then you underestimate the time it'll take... so, for example, I started with 3-second estimates for hour-plus GPU tasks. The combined effects of any (even small) slop in the base estimates, room for optimisation in the application(s), plus the way the self-scaling is set up, will tend to introduce those coarse errors first, then the erratic noise after, as things try in vain to self-correct.

Simple effects can combine to look pretty complex :)

Thanks for helping out :) Yeah it's actually been a fascinating study so far, and no doubts many challenges to come.

[Edit:] the timetable for the first patch attempt is sometime after Bernd's back from a break. No expectation that this first pass will solve every issue, but it certainly should indicate quickly whether we are poking at the right set of suspects.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

zombie67 [MM]
zombie67 [MM]
Joined: 10 Oct 06
Posts: 73
Credit: 30924459
RAC: 0

RE: Oh, you're going to

Message 79952 in response to message 79946

Quote:

Oh, you're going to love this one

[pre] Jason Holmis Claggy Zombie Zombie (Mac)
Host: 11363 2267 9008 6490 6109
GTX 780 GTX 660 GT 650M TITAN GTX 680MX

Credit for BRP4G (GPU)

Maximum 1170.48 1036.86 10239.0 1654.85 11847.50
Minimum 115.82 88.84 153.90 25.79 94.88
Average 548.33 463.98 3875.88 874.96 2256.70
Median 468.80 390.21 2977.38 865.33 1591.80
Std Dev 431.90 268.52 2873.26 362.30 2395.61[/pre]
I'll upload a graph after lunch, when my monitor has cooled down and I've stopped laughing.

Just FYI, for the machine with the TITAN, I am running 3 GPU tasks and reserving two full CPU threads. For the GTX 680MX, I am running 2 GPU tasks and reserving two full CPU threads. Not sure that makes any difference for the purpose of this discussion. Just explaining why the run times are longer than they should be.

Dublin, California
Team: SETI.USA

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Just FYI, for the

Message 79953 in response to message 79952

Quote:
Just FYI, for the machine with the TITAN, I am running 3 GPU tasks and reserving two full CPU threads. For the GTX 680MX, I am running 2 GPU tasks and reserving two full CPU threads. Not sure that makes any difference for the purpose of this discussion. Just explaining why the run times are longer than they should be.

It will make an impact with respect to those initial estimates, and possibly take a bit longer to semi-stabilise through the initial period. Thanks for the info. It's an important case for us to consider, as running multiple tasks on even modest GPUs is becoming more common now that it's feasible with app_config and stock applications.

I'm nearly ready to set some ambitious targets/goals for the patches, pending a look at Richard's funny graph(s). In the big picture I'm expecting that a well-damped response in the system fixes should keep tasks within estimated runtime bounds, even with more tasks than that per GPU, though I'll probably reserve final judgement on whether the server really needs to know more detail until after the first pass.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

This one's really rather

This one's really rather pretty, I think.

Note logarithmic credit scale :P

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Note logarithmic credit

Message 79955 in response to message 79954

Quote:
Note logarithmic credit scale :P

LoL. That's the same cheating as Eric did with the seti pfc_scales (as if I wouldn't notice :P)

The cool thing about logarithmic scales is how they can make naturally varying things look smooth & stable.... Doesn't seem to have worked here completely though ;)

[Edit:] Going to stew on that, then set some targets for the first pass, and longer-term goals. Yep, everything seems to be matching what the engineering and the code say...

[Edit2:] Just a funny observation, looking at that and then this data again:

Quote:

[pre]Credit for BRP4G (GPU)

Maximum 1170.48 1036.86 10239.0 1654.85 11847.50
Minimum 115.82 88.84 153.90 25.79 94.88
Average 548.33 463.98 3875.88 874.96 2256.70
Median 468.80 390.21 2977.38 865.33 1591.80
Std Dev 431.90 268.52 2873.26 362.30 2395.61[/pre]

A six-sigma spread (6 × the standard deviation) near or greater than the maximum is a worrying amount of entropy...

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: Oh, you're going to

Message 79956 in response to message 79946

Quote:

Oh, you're going to love this one

[pre] Jason Holmis Claggy Zombie Zombie (Mac)
Host: 11363 2267 9008 6490 6109
GTX 780 GTX 660 GT 650M TITAN GTX 680MX

Credit for BRP4G (GPU)

Maximum 1170.48 1036.86 10239.0 1654.85 11847.50
Minimum 115.82 88.84 153.90 25.79 94.88
Average 548.33 463.98 3875.88 874.96 2256.70
Median 468.80 390.21 2977.38 865.33 1591.80
Std Dev 431.90 268.52 2873.26 362.30 2395.61[/pre]
I'll upload a graph after lunch, when my monitor has cooled down and I've stopped laughing.


For your info, my GT650M is running one task at a time, and I'm only running two CPU tasks at a time too.
(It runs very hot: the 2.5GHz i5-3210M is a dual core with hyper-threading, and with it running at its turbo mode of 2.89GHz the CPU cores sit at 99°C;
add another core crunching, or the Intel GPU crunching, and it starts downclocking, both CPU and Nvidia GPU.)

Since I've now got Intel GPU tasks, the CPU is fluctuating between 1.90GHz and 2.89GHz in 0.1GHz steps, i.e. 2.89, 2.79, 2.69, 2.59, 2.50, 2.40, 2.20, 2.10, etc.,
and the GT650M is switching between 950MHz and 118MHz, while the HD Graphics 4000 is switching between 950MHz, 1.0GHz, 1.05GHz and 1.10GHz,
so expect all task durations to fluctuate. ;-)

Claggy

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Interesting. I was just

Message 79957 in response to message 79956

Interesting. I was just noticing on the task list that you had two tasks reported at 15:49 and 15:51 today (not the top two, #4 and #5 currently) with over 10K credit each. Ruined my nice trumpet-shaped graph! ;)

Same wingmate! I think 10320 Jacob Klein might go on my spare page.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

So here's an enlarged

So here's an enlarged view.

I wonder why two laptops validating each other should do that?

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

I tried to resend those BRP

I tried to resend those BRP (Arecibo, GPU) tasks, but got them expired instead (I had 'use ATI GPU' set to No), so I managed to get fresh GPU tasks, a mixture of BRP (Arecibo, GPU) and BRP (Perseus Arm Survey).
The (Arecibo, GPU) tasks now have estimates of 13 minutes, while they take an hour, so they are now completable; the (Perseus Arm Survey) tasks have estimates of 16 seconds, so they aren't. I'll let the ones I have run and error:

All tasks for computer 8143

Application details for host 8143

Claggy

nenym
nenym
Joined: 13 Jun 11
Posts: 15
Credit: 10001988
RAC: 0

I have fixed fpops intel_GPU

I have fixed the fpops intel_GPU issue using an app_info.xml containing the tag
<flops>14479075542.794144</flops> for the BRP4, BRP4G and BRP5 intel_GPU applications. It seems to work on both HD4000 and HD4600. Is that the correct way?
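Presumably that means something like the sketch below: an anonymous-platform app_info.xml where the <flops> element pins the client's speed estimate, so rsc_fpops_est / flops yields a sane duration from the very first task. The file name, version number and plan class here are illustrative guesses, not copied from a working setup:

[pre]
<app_info>
    <app>
        <name>einsteinbinary_BRP4</name>
    </app>
    <file_info>
        <name>einsteinbinary_BRP4_1.33_opencl-intel_gpu.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>einsteinbinary_BRP4</app_name>
        <version_num>133</version_num>
        <plan_class>BRP4-opencl-intel_gpu</plan_class>
        <avg_ncpus>0.5</avg_ncpus>
        <!-- fixed speed estimate: ~14.5 GFLOPS, the value from the post above -->
        <flops>14479075542.794144</flops>
        <coproc>
            <type>intel_gpu</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>einsteinbinary_BRP4_1.33_opencl-intel_gpu.exe</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>
[/pre]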

Bernd Machenschalk
Bernd Machenschalk
Administrator
Joined: 15 Oct 04
Posts: 155
Credit: 6218130
RAC: 0

RE: Intel GPUs are now

Message 79961 in response to message 79900

Quote:

Intel GPUs are now being shown by the project in the computer details pages:

Computer 9008

But 'Use Intel GPU' isn't being shown on the Albert project preferences page in spite of there being intel GPU apps available, perhaps those apps need their settings adjusted?

Claggy

Yes, the (old) web code hasn't been updated and probably won't be at all, due to the envisioned (but again postponed) migration to Drupal (see other news).

BM

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: So here's an enlarged

Message 79962 in response to message 79958

Quote:
So here's an enlarged view.
...
I wonder why two laptops validating each other should do that?

With Claggy's clocking up & down all over the shop (and probably Jacob's too), you have an exaggerated form of what happens anyway: noisy elapsed times that are unsmoothed, and that account for all sorts of klingons that had nothing to do with processing. Sampled averages don't cut it for this process.

Think of the moving sampled average as a conveyor belt, perhaps. One small sample drops off to make room for a new, much larger one, then a large one drops off making room for the next, and so on. Nothing smooth about it; it will lurch around like some sort of crazed Frankenstein's monster on a rampage. When two meet they could cancel or add.
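To make the conveyor-belt picture concrete, here's a toy sketch (not BOINC code - the window size, damping constant and runtimes are arbitrary) comparing a small sampled-window average against a damped exponential average when one wild elapsed time goes through. The windowed mean jerks once when the outlier climbs onto the belt and again when it drops off; the damped average just leans a little and recovers:

[pre]
#include <cstdio>
#include <deque>
#include <numeric>

// Mean over the last n samples: the 'conveyor belt'.
double window_mean(std::deque<double>& w, double sample, size_t n = 10) {
    w.push_back(sample);
    if (w.size() > n) w.pop_front();   // oldest sample drops off the belt
    return std::accumulate(w.begin(), w.end(), 0.0) / w.size();
}

// First-order damped (exponential) average.
double damped_mean(double prev, double sample, double alpha = 0.1) {
    return prev + alpha * (sample - prev);
}

int main() {
    std::deque<double> w;
    double ema = 3600.0;                              // start at a typical runtime
    for (int i = 0; i < 25; ++i) {
        double sample = (i == 5) ? 20000.0 : 3600.0;  // one wild elapsed time
        ema = damped_mean(ema, sample);
        printf("%2d window=%8.1f damped=%8.1f\n", i, window_mean(w, sample), ema);
    }
    return 0;
}
[/pre]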

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: RE: So here's an

Message 79963 in response to message 79962

Quote:
Quote:
So here's an enlarged view.
...
I wonder why two laptops validating each other should do that?

With Claggy's clocking up & down all over the shop (and probably Jacob's too), you have an exaggerated form of what happens anyway: noisy elapsed times that are unsmoothed, and that account for all sorts of klingons that had nothing to do with processing. Sampled averages don't cut it for this process.

Think of the moving sampled average as a conveyor belt, perhaps. One small sample drops off to make room for a new, much larger one, then a large one drops off making room for the next, and so on. Nothing smooth about it; it will lurch around like some sort of crazed Frankenstein's monster on a rampage. When two meet they could cancel or add.


Just provided they don't multiply...

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: RE: So here's

Message 79964 in response to message 79963

Quote:
Quote:
Quote:
So here's an enlarged view.
...
I wonder why two laptops validating each other should do that?

With Claggy's clocking up & down all over the shop (and probably Jacob's too), you have an exaggerated form of what happens anyway: noisy elapsed times that are unsmoothed, and that account for all sorts of klingons that had nothing to do with processing. Sampled averages don't cut it for this process.

Think of the moving sampled average as a conveyor belt, perhaps. One small sample drops off to make room for a new, much larger one, then a large one drops off making room for the next, and so on. Nothing smooth about it; it will lurch around like some sort of crazed Frankenstein's monster on a rampage. When two meet they could cancel or add.


Just provided they don't multiply...

Well sadly, rampaging monsters everywhere :-O, because those averages feed scales (which multiply), two cascaded stages providing 'gain' to the noise, like an audio amplifier.

One totally impractical theoretical solution would be just to take many more samples (than 11...). It could work if you didn't mind waiting a few days/weeks/months for estimates to settle in... but it's worse for the onramp period, which is a lot of the problem.

Better (proven) approaches will be tried... nearly ready to set those short & medium term goals.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

just remember that it's nice

Just remember that it's nice to quickly converge from a bad initial estimate, but if a task ends up 'Time limit exceeded' it doesn't give any clue as to the eventual runtime. And we don't want to wind out the rsc_fpops_bound too much either, because hung apps occasionally happen. So we really need to do something about the initial runtime estimates for GPUs too.

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: just remember that it's

Message 79966 in response to message 79965

Quote:
Just remember that it's nice to quickly converge from a bad initial estimate, but if a task ends up 'Time limit exceeded' it doesn't give any clue as to the eventual runtime. And we don't want to wind out the rsc_fpops_bound too much either, because hung apps occasionally happen. So we really need to do something about the initial runtime estimates for GPUs too.


Which seems to be specifically an initial FLOPs estimate problem. David initially dismissed it as a local Einstein phenomenon, so we ought to check that possibility and find out where the enormous values reported in this thread really came from.

And then we need a way to assign a realistic starting point for GPU speeds, across all projects, extensible to new cards as they are released without waiting for BOINC client updates. We have the problem that the NV API won't report shaders per SM, so estimates for new cards are always faulty until the hard-wired multiplier is updated; and I believe detection of new ATI cards like the R9 290 is even more flawed.
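For the curious, the shape of that problem looks something like this sketch (along the lines of what a client has to do, not the actual BOINC detection code): the CUDA runtime reports the SM count and clock but not the cores per SM, so a hard-wired table has to be maintained by hand, and any card newer than the table gets a wrong peak. The table entries below cover only the architectures named; the fallback is a pure guess:

[pre]
// Build with nvcc; uses only the CUDA runtime API.
#include <cstdio>
#include <cuda_runtime.h>

// Hard-wired cores-per-SM by compute capability -- exactly the kind of
// table that goes stale when a new architecture ships.
static int cores_per_sm(int major, int minor) {
    if (major == 1) return 8;                       // Tesla
    if (major == 2) return (minor == 0) ? 32 : 48;  // Fermi
    if (major == 3) return 192;                     // Kepler
    return 128;                                     // guess for unknown cards
}

int main() {
    cudaDeviceProp p;
    if (cudaGetDeviceProperties(&p, 0) != cudaSuccess) return 1;
    // peak SP GFLOPS = SMs * cores/SM * 2 ops (FMA) * clock; clockRate is in kHz
    double gflops = p.multiProcessorCount * cores_per_sm(p.major, p.minor)
                  * 2.0 * p.clockRate * 1e-6;
    printf("%s: estimated peak %.0f GFLOPS\n", p.name, gflops);
    return 0;
}
[/pre]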

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

I'm still monitoring credits,

I'm still monitoring credits, though the flow of validations seems to be slowing.

CPU tasks seem to be continuing as before, so let's concentrate on GPU for a while.

[pre] Jason Holmis Claggy Zombie ZombieM Jacob
Host: 11363 2267 9008 6490 6109 10320
GTX 780 GTX 660 GT 650M TITAN 680MX FX 3800M

Credit for BRP4G (GPU)

Maximum 1503.78 1357.46 10951.9 1933.46 11847.5 10951.96
Minimum 115.82 88.84 153.90 91.50 94.88 508.73
Average 733.95 592.45 4200.42 994.85 2092.67 3640.96
Median 945.84 492.55 3037.38 1054.49 1642.95 1710.08
Std Dev 499.17 373.86 3179.62 374.42 2019.66 3970.94[/pre]

Is it my imagination, or is that trend continuing upwards? I think it was Eric that warned us that 'pure GPU' apps (without a CPU app_version to keep them grounded) tended to explode in the end.

Edit - yes:

Quote:
It's best to always have a max granted credit in your assimilator, unless the value really is indeterminate. BOINC's estimates for GPU FLOPS are often 20x what is actually achieved. Credit grants tend to float through the roof under credit_new without a CPU version of the app to pull them back to Earth.
6/6/2014
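A cap of the kind Eric describes amounts to a one-line clamp wherever credit is granted (a sketch only - the ceiling value and where it lives are per-project choices, not fixed BOINC behaviour):

[pre]
// Clamp granted credit so a runaway speed estimate can't push grants
// through the roof; the ceiling is a project-chosen constant.
double cap_credit(double granted, double max_granted = 1500.0) {
    return granted < max_granted ? granted : max_granted;
}
[/pre]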

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: ... Is it my

Message 79968 in response to message 79967

Quote:

...
Is it my imagination, or is that trend continuing upwards? I think it was Eric that warned us that 'pure GPU' apps (without a CPU app_version to keep them grounded) tended to explode in the end.

Edit - yes:

Quote:
It's best to always have a max granted credit in your assimilator, unless the value really is indeterminate. BOINC's estimates for GPU FLOPS are often 20x what is actually achieved. Credit grants tend to float through the roof under credit_new without a CPU version of the app to pull them back to Earth.
6/6/2014

Basically, mentally subtracting the noise is fine; then you see the shape of the convergence. Initial estimates appear to be about a third of what they should be (in credit terms, not time), which you could regard as 'fine' if convergence were more prompt... though obviously better would be a nice goal. Since my 780 was running at around 10% of expected credits at the beginning, and Zombie's TITAN at about a third, you can either accept that Zombie's multiple tasks per GPU from the beginning is a special situation, or not.

If you accept that Zombie's TITAN config is a special case, then you need to consider mine as 'normal', in which case the expected credits start at about one tenth of what they should be. IOW the earlier observation from Holmis (?) seems correct, along with my observation of 3-second estimates, in that the estimates that are supposed to be conservative with respect to the time bound, and generous with credit, are in fact the opposite... There is some math upside down.

Trends-wise, I'd say for certain that what you're seeing is there, since averages are in play. What's wrong with the shape is that the convergence (from wrong numbers toward the ballpark) is way too slow (too many tasks), and as much as it looks like it's trending upward now, the downward compensation afterwards will be just as slow and obvious. It's [the rising edge] part of a long-term oscillation. You could just as easily say that some of those machines are trending down.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Let's wait and see. I've seen

Message 79969 in response to message 79968

Let's wait and see. I've seen some projects go exponential (literally - into the millions of credits) when they start like this.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Let's wait and see.

Message 79970 in response to message 79969

Quote:
Let's wait and see. I've seen some projects go exponential (literally - into the millions of credits) when they start like this.

That unstable? Interesting... Yeah, with instabilities it can ring or run completely off the rails... A bit hard for me to predict which :P (both conditions are unstable)

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: RE: Let's wait and

Message 79971 in response to message 79970

Quote:
Quote:
Let's wait and see. I've seen some projects go exponential (literally - into the millions of credits) when they start like this.

That unstable? Interesting... Yeah, with instabilities it can ring or run completely off the rails... A bit hard for me to predict which :P (both conditions are unstable)


One of my hosts is still showing that I reached a user RAC of 99,952,529.17 at AQUA - they were multithreaded CPU apps, rather than GPU, but they still blew up.

Edit:

    
<daily_statistics>
    <day>1309910400.000000</day>
    <user_total_credit>16831271.163956</user_total_credit>
    <user_expavg_credit>37773.342884</user_expavg_credit>
    <host_total_credit>4602266.488011</host_total_credit>
    <host_expavg_credit>8549.212942</host_expavg_credit>
</daily_statistics>
<daily_statistics>
    <day>1309996800.000000</day>
    <user_total_credit>743356375.874693</user_total_credit>
    <user_expavg_credit>127074513.727733</user_expavg_credit>
    <host_total_credit>4602266.488011</host_total_credit>
    <host_expavg_credit>6283.249755</host_expavg_credit>
</daily_statistics>
<daily_statistics>
    <day>1310256000.000000</day>
    <user_total_credit>743402121.576823</user_total_credit>
    <user_expavg_credit>99952529.170651</user_expavg_credit>
    <host_total_credit>4602266.488011</host_total_credit>
    <host_expavg_credit>5661.725458</host_expavg_credit>
</daily_statistics>
    

July 2011

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

RE: RE: RE: Let's wait

Message 79972 in response to message 79971

Quote:
Quote:
Quote:
Let's wait and see. I've seen some projects go exponential (literally - into the millions of credits) when they start like this.

That unstable? Interesting... Yeah, with instabilities it can ring or run completely off the rails... A bit hard for me to predict which :P (both conditions are unstable)

One of my hosts is still showing that I reached a user RAC of 99,952,529.17 at AQUA - they were multithreaded CPU apps, rather than GPU, but they still blew up.


You must admit AQUA was a special case, and they went rather suddenly offline before we had the slightest chance to investigate.
2011? Feels like yesterday...

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

tullio
tullio
Joined: 22 Jan 05
Posts: 53
Credit: 137342
RAC: 0

AQUA was made by D-Wave,

AQUA was made by D-Wave, which says it has built quantum computers and sold them for about a hundred million dollars each.
Tullio

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: AQUA was made by

Message 79974 in response to message 79973

Quote:
AQUA was made by D-Wave, which says it has built quantum computers and sold them for about a hundred million dollars each.
Tullio


And still did, up to last year at least.

Google and NASA team up to use quantum computer

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Snow Crash
Snow Crash
Joined: 11 Aug 13
Posts: 10
Credit: 5011603
RAC: 0

I really appreciate what you

I really appreciate what you guys are doing for ALL of us crunching grunts.

I have some hardware I can bring on board but I don't have the time to provide the ins and outs of log files etc. Is there a particular app mix that would be most helpful?
[pre]
OS BOINC CPU GPU
Win7 7.2.42 980x 670 + 660Ti
Win7 7.2.42 920 7950
Win7 7.2.42 4670k Intel
[/pre]

I also have some other GPUs I could mix and match if you're looking for some diversity ... 7850, 7770, GTX480, GTX295.
For GPU, would it be better if I ran 1 task per card, or is it enough for your modeling if I just let you know what multiplier I am using?
Any value in providing overclock settings?

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

Attached a new host to

Attached a new host to Albert; looking through the logs I keep getting the following download error:

14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_05.png
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_07.png
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_08.png
14-Jun-2014 06:06:33 [Albert@Home] Finished download of eah_slide_07.png
14-Jun-2014 06:06:33 [Albert@Home] Started download of EatH_mastercat_1344952579.txt
14-Jun-2014 06:06:34 [Albert@Home] Finished download of eah_slide_05.png
14-Jun-2014 06:06:34 [Albert@Home] Finished download of eah_slide_08.png
14-Jun-2014 06:06:34 [Albert@Home] Giving up on download of EatH_mastercat_1344952579.txt: permanent HTTP error

On this new host (as well as on my HD7770) I'm still getting the very short estimates for Perseus Arm Survey GPU tasks, so I've added two zeros to the rsc_fpops values so they'll complete.

Computer 11441

Claggy

Snow Crash
Snow Crash
Joined: 11 Aug 13
Posts: 10
Credit: 5011603
RAC: 0

updated details [pre] OS

Message 79977 in response to message 79975

updated details
[pre]
OS BOINC CPU GPU Utilization Factor
Win7 7.2.42 980x 670 0.50
Win7 7.2.42 920 7950 0.33
Win7 7.2.42 4670k 7850 0.50
[/pre]

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

The GPU plot continues to

The GPU plot continues to thicken.

Some of the machines I picked to monitor seem to have dropped out of the running, so I added one of my own yesterday.

It seems to have started in line with current trends, but like the others, there seem to be distinct 'upper' and 'lower' credit groupings. Why would that be?

I've also noticed something that doesn't show with this style of plotting against reporting time: when I've been infilling late validations, the credit awards - in line with the trendlines - have been much higher than their contemporaries got, and the double population is visible there too. For example, Claggy's WU 606387 from 5 June was awarded over 7K overnight.

That should show up in the changing averages:

[pre] Jason Holmis Claggy Zombie ZombieM Jacob RH
Host: 11363 2267 9008 6490 6109 10320 5367
GTX 780 GTX 660 GT 650M TITAN 680MX FX3800M GTX 670

Credit for BRP4G, GPU

Maximum 1584.75 1495.38 10951.9 2031.40 11847.5 10951.9 4137.85
Minimum 115.82 88.84 153.90 91.50 94.88 508.73 1355.49
Average 813.32 688.57 4120.83 1074.52 2010.65 2743.09 1917.71
Median 973.50 539.19 3037.38 1122.70 1591.80 1456.42 1641.43
Std Dev 523.51 428.48 3074.79 393.82 1823.08 3177.59 703.78

nSamples 36 52 48 387 189 17 21[/pre]
(Why does the new web code double-space [ pre ]?)

Edit - corrected copy/paste error on application type.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

Interesting. Of course credit

Interesting.
Of course credit is at time of validation, not reporting, against a moving scale, yadda yadda.
So late validations earn higher. The wingmate is then either a slow host or has a larger cache. The first is in line with expectations IIRC; the second would be puzzling.
Then again, if you are observing a chaotic system, it's hard to know what may or may not happen anyway...

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Interesting. Of course

Message 79980 in response to message 79979

Quote:
Interesting.
Of course credit is at time of validation, not reporting, against a moving scale, yadda yadda.
So late validations earn higher. The wingmate is then either a slow host or has a larger cache. The first is in line with expectations IIRC; the second would be puzzling.
Then again, if you are observing a chaotic system, it's hard to know what may or may not happen anyway...

Yes, the two mixed and partially overlapping time domains for the credit mechanism (issue time and validation time) both oscillate & interact. Those two combined are enough to set up a resonance by themselves (additive & subtractive), but then with the natural variation in the real-world processing thrown in, you introduce one or more further 'bodies', making the system an unpredictable n-body problem.

For our purposes the dominant overlap is in those time domains, so sufficient damping to each control point to place them in separate time domains should be enough to break the chaotic behaviour.

e.g.

global pfc_scale -> minimise the coarse scaling error and vary smoothly over some 100s to 1000s of task validations

host (app version) scale -> vary smoothly over some 10s of validations, small enough to respond to a hardware or app change in a 'reasonable' time (number of tasks)

client side (currently either disabled, or broken by project DCF not being per-application) -> vary smoothly over a few to 10 tasks

Once separated in time, this coarse->fine->finer tuning is difficult to destabilise, even with fairly sudden, aggressive change at any point.
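As a sketch of that separation (toy constants, not a proposed patch), here are three cascaded first-order filters with well-spread time constants. Each finer stage only corrects the residual the coarser stages haven't yet absorbed, and because they move on different timescales they can't resonate with one another:

[pre]
#include <cstdio>

// One control point: a first-order damped average with time constant tau,
// measured in validated tasks.
struct Damper {
    double value;
    double tau;
    double update(double sample) {
        value += (sample - value) / tau;
        return value;
    }
};

int main() {
    Damper global {1.0, 500.0};  // coarse: varies over 100s-1000s of validations
    Damper host   {1.0, 30.0};   // finer: varies over 10s of validations
    Damper client {1.0, 5.0};    // finest: varies over a few tasks
    for (int task = 1; task <= 100; ++task) {
        double observed = (task <= 50) ? 0.5 : 2.0;   // sudden app/host change
        double g = global.update(observed);
        double h = host.update(observed / g);         // residual after global
        double c = client.update(observed / (g * h)); // residual after both
        if (task % 20 == 0)
            printf("task %3d: global=%.3f host=%.3f client=%.3f combined=%.3f\n",
                   task, g, h, c, g * h * c);
    }
    return 0;
}
[/pre]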

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

The few wingmates I've

Message 79981 in response to message 79980

The few wingmates I've spot-checked against high credit awards (my own and others) don't seem to be outside the normal population. They too show bi-modal distributions, and high credits don't seem to be correlated with varying runtimes. I'm trying to run the GTX 670 as stable as possible, and runtimes are very steady.

Which leaves the moving average theory. Two more tasks have both reported and validated since I posted the graph: they both got over 2,000 credits, which suggests the average is still moving upwards.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

zombie67 [MM]
zombie67 [MM]
Joined: 10 Oct 06
Posts: 73
Credit: 30924459
RAC: 0

Just FYI, I had to take my

Just FYI, I had to take my two CUDA machines off Albert for a while. I need to help a team mate at another project. I will be back.

Dublin, California
Team: SETI.USA

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: The few wingmates I've

Message 79983 in response to message 79981

Quote:

The few wingmates I've spot-checked against high credit awards (my own and others) don't seem to be outside the normal population. They too show bi-modal distributions, and high credits don't seem to be correlated with varying runtimes. I'm trying to run the GTX 670 as stable as possible, and runtimes are very steady.

Which leaves the moving average theory. Two more tasks have both reported and validated since I posted the graph: they both got over 2,000 credits, which suggests the average is still moving upwards.

Yes, the beauty of testing here has been the *relatively* consistent runtimes on a given host/GPU, which tends to demonstrate that much of the instability is artificially induced (as opposed to being a direct function of the noisy elapsed times).

The upward trend would appear to be a drift instability, and bets are off as to whether it continues indefinitely, levels off, or (my guess) starts a downward cycle after some peak.

The bi-modal characteristic appears to be an oscillation between saturation and cutoff, most likely turn-around time related. That'd be predominantly proportional error ('overshoot'). [Post-scaling by the average of the validated claims isn't going to help that, though it was a valiant attempt compared to the original choice of taking the minimum claim.]

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

Heisenberg? My Maths teacher?

Heisenberg? My Maths teacher? Chaos theory? Zen? Superstitious pigeons?

In a chaotic system, a trend observed may be genuine or coincidental.

And in any case, we shouldn't be putting too much effort into observing the scintillations of a soap bubble we intend to burst. Of course the colour pattern is fascinating indeed...

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Heisenberg? My Maths

Message 79985 in response to message 79984

Quote:

Heisenberg? My Maths teacher? Chaos theory? Zen? Superstitious pigeons?

In a chaotic system, a trend observed may be genuine or coincidental.

And in any case, we shouldn't be putting too much effort into observing the scintillations of a soap bubble we intend to burst. Of course the colour pattern is fascinating indeed...

Good. What I'm doing at the moment is drafting goals for the first patch. A lot of that will be involved with confirming/rejecting the particular suspects for the purposes of isolation. I do think the observations, or 'getting a feel' for the character of what it's doing, are important at the moment... and the patterns are pretty to look at... but you're right, I'd like to have the full first-pass patch ready for trials by the time Bernd's back on duty...

Outline for pass 1:
Objectives and scope
Resources
Procedure
Results
Discussion of Results
Conclusions and further work (plan for pass 2)

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

I've been asked to pull out

I've been asked to pull out stats on runtime and turnround, in case they explain anything - I don't think they do.

[pre] Jason Holmis Claggy Zombie ZombieM Jacob RH
Host: 11363 2267 9008 6490 6109 10320 5367
GTX 780 GTX 660 GT 650M TITAN 680MX FX3800M GTX 670

Credit for BRP4G, GPU
Maximum 1584.75 1495.38 10952.0 2031.40 11847.5 10952.0 4137.85
Minimum 115.82 88.84 153.90 91.50 94.88 508.73 1355.49
Average 813.32 688.57 4120.83 1074.52 2010.65 2743.09 1890.82
Median 973.50 539.19 3037.38 1122.70 1591.80 1456.42 1644.33
Std Dev 523.51 428.48 3074.79 393.82 1823.08 3177.59 614.08

nSamples 36 52 48 387 189 17 28

Runtime (seconds)
Maximum 4401.98 5088.99 11295.0 5383.71 23977.4 11774.3 4138.67
Minimum 3259.10 3294.83 8136.47 1908.26 1512.16 11515.5 4061.45
Average 3668.57 4483.11 8910.76 4194.81 4216.60 11623.3 4102.53
Median 3603.24 4608.05 8837.27 4228.18 4183.74 11603.6 4109.83
Std Dev 314.13 506.27 554.34 555.88 1981.31 70.73 25.21

Turnround (days)
Maximum 2.11 3.91 1.70 3.44 2.94 5.57 0.72
Minimum 0.15 0.07 0.14 0.24 1.52 0.18 0.15
Average 1.14 1.97 0.69 2.10 2.32 1.64 0.44
Median 0.73 1.90 0.74 2.00 2.42 0.88 0.48
Std Dev 0.79 0.98 0.35 0.57 0.32 1.68 0.17[/pre]
Zombie's Mac had just two of those extra-long runtimes:
WU 612084, 1,328.76 cr
WU 612045, 6,419.08 cr

Edit - I updated my own column since this morning, the others are unchanged.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

hmm. maybe. would need to

Hmm, maybe. I'd need to eyeball the code again to get more certainty.
The sample size is a bit small too - maybe in a few days.

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

Holmis
Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

I'll add that my GTX660Ti is

I'll add that my GTX660Ti is running 2 tasks at a time, mixing BRP4G and BRP5 from Albert and BRP5 from Einstein. The Intel HD4000 is running single tasks.

Here's an updated Excel file with data and plots from host 2267 and the following searches: BRP4X64, BRP4G, S6CasA, BRP5 (iGPU) and BRP5 (Nvidia GPU).

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: ... in case they

Message 79989 in response to message 79986

Quote:
... in case they explain anything - I don't think they do.
...

We'll explore any possibility for sure. From an engineering perspective, understanding the spoon doing the stirring is probably a good idea [though asking it what it's doing might not reveal much :) ]. Probably we won't try getting rid of the spoon, but instead change what it's mixing... fine granulated sugar and some honey, instead of a solid hunk of extra-thick molasses.

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Holmis
Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

RE: Nvidia GPU: RE:

Message 79990 in response to message 79947

Quote:
Nvidia GPU:
Quote:
<flops>12454544406626.100000</flops>
<plan_class>BRP5-cuda32-nv301</plan_class>

That's 12454 GFlops, or 12.45 TeraFlops! Boinc reports it @ 2985 GFlops peak in the startup messages. And the APR for the BRP4G Nvidia tasks is 58.1 GFlops when running 2 at a time.
If the BRP5 app gets the same APR, then the initial speed estimate says the card is 214 times as fast as it actually is!!!


A follow-up on my post about initial estimates: the 11th BRP5 task has now been validated, and the APR has been calculated at 30.16 GFlops when running 2 tasks at a time.
So the initial estimate was that the card was a whopping 412.9 times faster than actual! =O

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: RE: Nvidia

Message 79991 in response to message 79990

Quote:
Quote:
Nvidia GPU:
Quote:
<flops>12454544406626.100000</flops>
<plan_class>BRP5-cuda32-nv301</plan_class>

That's 12454 GFlops, or 12.45 TeraFlops! Boinc reports it @ 2985 GFlops peak in the startup messages. And the APR for the BRP4G Nvidia tasks is 58.1 GFlops when running 2 at a time.
If the BRP5 app gets the same APR, then the initial speed estimate says the card is 214 times as fast as it actually is!!!

A follow-up on my post about initial estimates: the 11th BRP5 task has now been validated, and the APR has been calculated at 30.16 GFlops when running 2 tasks at a time.
So the initial estimate was that the card was a whopping 412.9 times faster than actual! =O

Yeah, for my 780 the initial guesstimates were 3 seconds, and the tasks came in at about an hour, so about a thousand-fold discrepancy. Room for optimisation in the application doesn't account for the whole discrepancy (of course :) ).

Looks like there'll be some digging along these lines to do. Some factors to look at with new GPU apps & hosts might include:
- CPU app underclaim becomes the global scaling reference; we know these leave out SIMD (SSE->AVX) on raw claims, so they look more efficient than they are, because certain numbers come out 'impossible' if the logic were right (e.g. pfc_scale < 1) [perhaps up to a 10x discrepancy]
- GPU overestimate of performance (attempting to be generous) [could be in the ballpark of another 10x discrepancy]
- Actual utilisation variation (e.g. from people using their machines, multiple tasks per GPU... more...) [combined, perhaps up to another 10x]
- Room for optimisation of the application(s), or inherent serial limitations [maybe from 1x to a very big number; let's go with another 10x]

So all told, with guesstimates, a 1000x discrepancy is easily plausible just among the known factors, and there may be more... That's completely ignoring the possibility of hard-logic flaws, which are entirely possible, even likely.

That will all probably depend on some of the customisations and special situations here at Albert, still to be examined in context. These include whether & how some apps are 'pure GPU' or otherwise packages of multiple CPU-like tasks, and how they are wired in and scaled.

On the surface it looks like there may well be multiple of these things in play.

These all seem connected, so I suspect we'll need to make patch one multi-step, so we can switch in one small change at a time and watch for weird interactions.

For example: fix the CPU-side coarse scale error, then watch the GPU estimates &/or credit blow out in response... LoL

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

RE: RE: ... in case they

Message 79993 in response to message 79989

Quote:
Quote:
... in case they explain anything - I don't think they do.
...

We'll explore any possibility for sure. From an engineering perspective, understanding the spoon doing the stirring is probably a good idea [though asking it what it's doing might not reveal much :) ]. Probably we won't try getting rid of the spoon, but instead change what it's mixing... fine granulated sugar and some honey, instead of a solid hunk of extra-thick molasses.


'There is no spoon' ;)

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

RE: RE: RE: Nvidia

Message 79994 in response to message 79991

Quote:
Quote:
Quote:
Nvidia GPU:
Quote:
<flops>12454544406626.100000</flops>
<plan_class>BRP5-cuda32-nv301</plan_class>

That's 12454 GFlops, or 12.45 TeraFlops! Boinc reports it @ 2985 GFlops peak in the startup messages. And the APR for the BRP4G Nvidia tasks is 58.1 GFlops when running 2 at a time.
If the BRP5 app gets the same APR, then the initial speed estimate says the card is 214 times as fast as it actually is!!!

A follow-up on my post about initial estimates: the 11th BRP5 task has now been validated, and the APR has been calculated at 30.16 GFlops when running 2 tasks at a time.
So the initial estimate was that the card was a whopping 412.9 times faster than actual! =O

Yeah, for my 780 the initial guesstimates were 3 seconds, and the tasks came in at about an hour, so about a thousand-fold discrepancy. Room for optimisation in the application doesn't account for the whole discrepancy (of course :) ).

Looks like there'll be some digging along these lines to do. Some factors to look at with new GPU apps & hosts might include:
- CPU app underclaim becomes the global scaling reference; we know these leave out SIMD (SSE->AVX) on raw claims, so they look more efficient than they are, because certain numbers come out 'impossible' if the logic were right (e.g. pfc_scale < 1) [perhaps up to a 10x discrepancy]
- GPU overestimate of performance (attempting to be generous) [could be in the ballpark of another 10x discrepancy]
- Actual utilisation variation (e.g. from people using their machines, multiple tasks per GPU... more...) [combined, perhaps up to another 10x]
- Room for optimisation of the application(s), or inherent serial limitations [maybe from 1x to a very big number; let's go with another 10x]

So all told, with guesstimates, a 1000x discrepancy is easily plausible just among the known factors, and there may be more... That's completely ignoring the possibility of hard-logic flaws, which are entirely possible, even likely.

That will all probably depend on some of the customisations and special situations here at Albert, still to be examined in context. These include whether & how some apps are 'pure GPU' or otherwise packages of multiple CPU-like tasks, and how they are wired in and scaled.

On the surface it looks like there may well be multiple of these things in play.

These all seem connected, so I suspect we'll need to make patch one multi-step, so we can switch in one small change at a time and watch for weird interactions.

For example: fix the CPU-side coarse scale error, then watch the GPU estimates &/or credit blow out in response... LoL


Must be picking up extra factors from somewhere - might be due to the fact that (as Eric stated) there is no damping of GPU by CPU figures. But there are one or more massive scaling errors lurking around.
Dang. That code is a nightmare to walk.

It might not all be flops scaling error, of course; need to cast an eye over the rsc_fpops_est calculation as well...

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: might not all be flops

Message 79995 in response to message 79994

Quote:
It might not all be flops scaling error, of course; need to cast an eye over the rsc_fpops_est calculation as well...


Since I'm not running anonymous platform, rsc_fpops_est isn't being scaled.

I'm seeing

fpops_est 280,000,000,000,000
FLOPs 69,293,632,242
APR 69.29 GFLOPS

Order-of-magnitude sanity check:

280e12 fpops ÷ 70e9 flops/sec → 4e3 seconds of estimated runtime. Correct on both server and client, and sane.

Edit: an identical card in the same host has an APR of 156.36 at GPUGrid. I'm running one task at a time there, and two at a time here, so I'd say the speed estimates agree. IOW, runtime estimates are OK once the initial gross mis-estimation has been overcome: this bit of the quest is for credit purposes only.

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

Ta. Looks like fpops estimate

Message 79996 in response to message 79995

Ta. Looks like the fpops estimate isn't too shabby then. But you need to exclude that contribution.

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

jason_gee
jason_gee
Joined: 4 Jun 14
Posts: 111
Credit: 1043639
RAC: 0

RE: Ta. Looks like fpops

Message 79997 in response to message 79996

Quote:
Ta. Looks like fpops estimate isn't too shabby then. But you need to exclude that contribution.

In both the Seti cases, and in the CPU app case here that I've only looked at cursorily, the fpops_est values have been what I would call 'reasonable'. I don't think it would be reasonable to ask project developers to estimate more precisely than what theory tells them (though it's certainly open to debate).

Overnight I've been musing on how to stage/phase/break up the tests. Since there seems to be evidence of interaction between CPU & GPU scaling, I'd like to start with the coarse CPU scaling first, and prescribe watching all applications for effects when it's engaged.

@Richard, yeah, the pfc_scale should be compensating handily for any initial SIMD-related disagreement there (it's had enough time), but since the scaling swing is in the opposite direction to GPU, and likely below 1 as at Seti (which implies magical CPU fairies), I believe the coarse scaling correction there should be the first step in isolation. Supporting evidence includes the SIMAP non-SIMD app on the SIMD-aware Android client whetstone, as well as Seti's uniformly below-1 pfc_scales despite quite tight theoretically based estimates.

The damping component I'll shift to pass 2. It's important, and has proven treatments available, but I feel that analysis of the impact of the unmitigated [CPU app] coarse scaling error is more important at this stage.

[Edit:] i.e. in the examples given here, median credit awarded should be a bit over the prior fixed credit of 1000, rather than 500-600, so the SIMD scaling correction of 1.5^2 = 2.25x seems appropriate given the numbers.

For dominant single precision, that'd be 1.5x for each step in log2 of the vector length: 1.5x for MMX, 2.25x for SSE+, and 3.375x for AVX-256 (where the app and host support it). Coarse correction will be just fine IMO, as damping will be added in pass 2.

The maximum supported vector length of the app can be regarded as known, and client scheduler requests contain the CPU features and OS, so a min() can be taken between the two.
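As a sanity check on those numbers, the correction reduces to a tiny function (a sketch of the arithmetic only; the lane counts are the usual single-precision vector widths, and treating MMX as one doubling follows the post rather than the hardware):

[pre]
#include <algorithm>
#include <cmath>
#include <cstdio>

// 1.5x per doubling of usable vector width, limited by whichever of the
// app and the host supports less.
double simd_scale(int app_vec_len, int host_vec_len) {
    int len = std::min(app_vec_len, host_vec_len);
    return std::pow(1.5, std::log2((double)len));
}

int main() {
    printf("MMX  (2 lanes): %.3fx\n", simd_scale(2, 2));      // 1.500x
    printf("SSE  (4 lanes): %.3fx\n", simd_scale(4, 4));      // 2.250x
    printf("AVX  (8 lanes): %.3fx\n", simd_scale(8, 8));      // 3.375x
    printf("AVX app, SSE host: %.3fx\n", simd_scale(8, 4));   // 2.250x
    return 0;
}
[/pre]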

On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage