The project will be taken down in about an hour to perform an update of the BOINC server code. Ideally you shouldn't notice anything, but usually the world isn't ideal. See you again on the other side.
Comments
Noticed that I've been
Noticed that I've been assigned tasks from 2 "new" applications, BRP5 tasks for both Intel and Nvidia GPUs. Neither of those has an established APR, so I got another shot at the initial estimates.
Here's some numbers from my client_state.xml:
Intel GPU:
That's 581 GFlops! Boinc reports it @ 147 GFlops peak in the startup messages.
Nvidia GPU:
That's 12454 GFlops or 12.45 TeraFlops! Boinc reports it @ 2985 GFlops peak in the startup messages. And the APR for the BRP4G Nvidia tasks is 58.1 GFlops when running 2 at a time.
If the BRP5 app gets the same APR then the initial speed estimate is that the card is 214 times as fast as it actually is!!!
Question:
How come the system estimates both resources to be much faster than what Boinc reports as their peak speed? Where's the logic in that?
Every downloaded BRP5 task comes with 450000000000000.000000, i.e. 450000 GFlops.
Crunching the numbers gives a time estimate for the Intel GPU app of 774.5 seconds, or 12m54s. The 1st task has been running for 12m55s and has reached 1.8% done...
For the tasks assigned to the Nvidia card the estimate is 36 seconds. The first 2 tasks have been running for 1h8m and have reached about 30% done...
I've resorted to adding a few zeros to the rsc_fpops_bound to prevent Boinc from aborting the tasks with "maximum time limit exceeded".
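The arithmetic behind those estimates can be sketched as follows — a small illustration assuming (as the figures above suggest) that the runtime estimate is simply rsc_fpops_est divided by the device speed the server believes in. The GFLOPS figures are the ones quoted in this post:

```python
# Sketch: runtime estimate = rsc_fpops_est / assumed device speed.
# All GFLOPS figures are taken from the post above.

RSC_FPOPS_EST = 450_000_000_000_000      # 450,000 GFlops per BRP5 task

devices = {
    "Intel GPU (server guess)":  581e9,    # 581 GFlops assumed by the server
    "Intel GPU (BOINC peak)":    147e9,    # what BOINC reports at startup
    "Nvidia GPU (server guess)": 12454e9,  # 12.45 TFlops assumed
    "Nvidia GPU (BRP4G APR)":    58.1e9,   # measured average, 2 tasks at once
}

for name, flops in devices.items():
    est = RSC_FPOPS_EST / flops
    print(f"{name:26s} -> {est:8.1f} s ({est / 60:.1f} min)")
```

The 774.5 s and 36 s figures above fall straight out of the first and third lines; dividing by the measured APR instead gives the hour-plus runtimes actually observed.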
RE: ...Question: How come
The initial GPU guesses seem to rely on marketing FLOPS figures with some sort of scaling. There's a coarse error there, because achieving anywhere near rated peak GFlops on a GPU is extremely challenging... i.e. it's a guess, and not a very good one.
Then, basically after the first 11 completed, is where you enter control systems theory and 'averaging' (using the term 'averaging' loosely, because that part's quite incorrectly implemented).
Unfortunately the current mechanism has a number of identified instabilities. As soon as you introduce instabilities into a feedback control system, you either push it completely off the rails, oscillate (in simple patterns or chaotic ones), or fail to converge at all & just get garbage.
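Those three failure modes can be shown with a toy feedback loop (not the actual BOINC code — just a sketch). Each round the speed estimate is corrected by `gain` times the observed error against a hypothetical true speed:

```python
# Toy feedback loop: estimate corrected each round by gain * error.
# Illustrative only; gains, speeds and round counts are invented.

def run(gain, true_speed=1.0, estimate=100.0, rounds=12):
    history = []
    for _ in range(rounds):
        error = estimate - true_speed
        estimate -= gain * error          # feedback correction
        history.append(estimate)
    return history

for g in (0.5, 1.0, 1.9, 2.1):
    tail = [round(x, 2) for x in run(g)[-3:]]
    print(f"gain={g}: last estimates {tail}")
# gain < 1 converges, gain near 2 oscillates, gain > 2 runs off the rails.
```

The same correction rule, with only the gain changed, produces smooth convergence, ringing, or divergence — which is why small implementation errors in the scaling can flip the whole system's character.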
Something like this video of the famous Tacoma Narrows Bridge collapse. You could ask similar questions, like "Where is the logic in building a bridge like that?", and "That bridge looked pretty stable yesterday, why'd it do that?":
http://www.youtube.com/watch?v=j-zczJXSxnw
The answers probably lie in the designers not completely understanding the nature of the engineering problems at the time (quite understandable), and being caught quite off guard ... after all we've been building bridges for a very long time...
[Edit:] LoL, "Gallopin' Gertie" ... that's gonna stick :P
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage
RE: All downloaded BRP5
What Jason said.
To which I'd add: once you have 11 tasks validated, at the next work fetch all the runtime estimates, including for work already downloaded and cached, will jump up to something approaching reality. At this point, you may find that you have too much work to complete within deadline. I'd strongly advise you to manage work fetch carefully to start with, using either NNT or by deselecting the new applications in preferences, so you don't overshoot the mark.
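The scale of the overshoot risk can be illustrated with invented but plausible numbers — a hypothetical 2-day cache filled against the 36-second pre-APR estimates reported earlier in the thread:

```python
# Hypothetical illustration of cache overshoot (all numbers assumed):
# work fetched against a tiny pre-APR estimate becomes months of real
# crunching once estimates jump to reality.

cache_days = 2.0             # client cache setting, assumed
initial_estimate_s = 36      # per-task estimate before APR, as reported above
real_runtime_s = 3600        # actual per-task runtime, ballpark
deadline_days = 14           # typical deadline, assumed

tasks_fetched = cache_days * 86400 / initial_estimate_s
days_to_crunch = tasks_fetched * real_runtime_s / 86400
print(f"tasks fetched: {tasks_fetched:.0f}")
print(f"real work in cache: {days_to_crunch:.0f} days vs {deadline_days}-day deadline")
```

Hence the advice: with NNT or deselected applications, you never fetch the 4800 tasks in the first place.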
I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.
http://www.boincsynergy.com/images/stats/comb-3475.jpg
RE: The initial GPU guesses
Might there be a typo in there that multiplies rather than divides, or just a missed sign? It feels like the scaling goes in the opposite direction of what it's supposed to.
I've been following some of the discussion on the Seti boards about this credit system and the code walk, and look forward to the testing of a hopefully more stable and functional system here.
I understand that it's difficult to say, but is there a timetable for when we start testing what will hopefully be improvements to the system?
RE: RE: The initial GPU
Basically, if you overestimate the speed of the device, then you underestimate the time it'll take... so for example I started with 3-second estimates for hour+ GPU tasks. The combined effects of any (even small) slop in the base estimates, room for optimisation in the application(s), plus the way the self-scaling is set up, will tend to introduce those coarse errors first, then the erratic noise after, as things try in vain to self-correct.
Simple effects can combine to look pretty complex :)
Thanks for helping out :) Yeah it's actually been a fascinating study so far, and no doubts many challenges to come.
[Edit:] timetable for first patch attempt is for sometime after Bernd's back from a break. No expectation that this first pass would solve every issue, but certainly should indicate quickly if we are poking at the right set of suspects.
RE: Oh, you're going to
Just FYI, for the machine with the TITAN, I am running 3 GPU tasks and reserving two full CPU threads. For the GTX 680MX, I am running 2 GPU tasks and reserving two full CPU threads. Not sure that makes any difference for the purpose of this discussion. Just explaining why the run times are longer than they should be.
Dublin, California
Team: SETI.USA
RE: Just FYI, for the
It will make an impact with respect to those initial estimates, and it may take a bit longer to semi-stabilise through the initial period afterwards. Thanks for the info. It's an important case for us to consider, as running multiple tasks on even modest GPUs is becoming more common now that it's feasible with app_config and stock applications.
I'm nearly ready to set some ambitious targets/goals for the patches, pending a look at Richard's funny graph(s). In the big picture I'm expecting that a well-damped response in the system fixes should keep tasks within estimated runtime bounds, even with more tasks than that per GPU, though I'll probably reserve final judgement on whether the server really needs to know more detail until after the first pass.
This one's really rather
This one's really rather pretty, I think.
Note logarithmic credit scale :P
RE: Note logarithmic credit
LoL. That's the same cheating as Eric did with the seti pfc_scales (as if I wouldn't notice :P)
The cool thing about logarithmic scales is how they can make naturally varying things look smooth & stable.... Doesn't seem to have worked here completely though ;)
[Edit:] going to stew on that, then set some targets for first pass, and longer term goals. Yep, everything seems to be matching what the Engineering and the code says...
[Edit2:] Just a funny observation looking at that and then this data again:
Six sigma (6 × std deviations) near or greater than the maximum is a worrying amount of entropy...
RE: Oh, you're going to
For your info, my GT650M is running one task at a time, and I'm only running two CPU tasks at a time too.
(It runs very hot; the 2.5GHz i5-3210M is a dual core with hyper-threading, and with it running at its turbo mode of 2.89GHz the CPU cores sit at 99°C.
Add another core crunching, or the Intel GPU crunching, and it starts downclocking, both CPU and Nvidia GPU.)
Since I've now got Intel GPU tasks, the CPU is fluctuating between 1.90GHz and 2.89GHz in 0.1GHz steps, i.e. 2.89, 2.79, 2.69, 2.59, 2.50, 2.40, 2.20, 2.10, etc,
and the GT650M is switching between 950MHz and 118MHz, while the HD Graphics 4000 is switching between 950MHz, 1.0GHz, 1.05GHz and 1.10GHz,
so expect all task durations to fluctuate. ;-)
Claggy
Interesting. I was just
Interesting. I was just noticing on the task list that you had two tasks reported at 15:49 and 15:51 today (not the top two, #4 and #5 currently) with over 10K credit each. Ruined my nice trumpet-shaped graph! ;)
Same wingmate! I think 10320 Jacob Klein might go on my spare page.
So here's an enlarged
So here's an enlarged view.
I wonder why two laptops validating each other should do that?
I tried to resend those BRP
I tried to resend those BRP (Arecibo, GPU) tasks, but got them expired instead (I had 'use ATI GPU' set to No), so I managed to get fresh GPU tasks, a mixture of BRP (Arecibo, GPU) and BRP (Perseus Arm Survey).
The (Arecibo, GPU) tasks now have estimates of 13 minutes, while they take an hour, so they are now completable; the (Perseus Arm Survey) tasks have estimates of 16 seconds, so aren't. I'll let the ones I have run and error:
All tasks for computer 8143
Application details for host 8143
Claggy
I have fixed fpops intel_GPU
I have fixed the fpops Intel GPU issue using an app_info.xml containing the tag
<flops>14479075542.794144</flops>
for the BRP4, BRP4G and BRP5 Intel GPU applications. Seems to work on both HD4000 and HD4600. Is that the correct way?
RE: Intel GPUs are now
Yes, the (old) web code hasn't been updated and probably won't be at all, due to the envisioned (but again postponed) migration to Drupal (see other news).
BM
RE: So here's an enlarged
With Claggy's clocking up & down all over the shop (and probably Jacob's too), you have an exaggerated form of what happens anyway: noisy elapsed times that are unsmoothed, and account for all sorts of klingons that had nothing to do with processing. Sampled averages don't cut it for this process.
Think of the moving sampled average as a conveyor belt, perhaps. One small sample drops off to make room for a new, much larger one, then a large one drops off making room for the next, and so on. Nothing smooth about it; it will lurch around like some sort of crazed Frankenstein's monster on a rampage. When two meet they could cancel or add.
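The conveyor-belt lurch can be shown with synthetic runtimes: an 11-sample boxcar average jumps again long after a single outlier, at the moment that sample drops off the belt, while an exponential moving average only ever decays smoothly. (The runtimes, window size and smoothing factor below are illustrative assumptions, not BOINC's actual parameters.)

```python
# Illustrative sketch (synthetic data) of the 'conveyor belt' effect.
from collections import deque

samples = [3600] * 5 + [20000] + [3600] * 20   # one slow outlier, then steady

window = deque(maxlen=11)                      # ~11-sample boxcar average
ema, alpha = 3600.0, 0.15                      # a smoother alternative
box_steps, ema_steps = [], []
prev_box = prev_ema = 3600.0
for s in samples:
    window.append(s)
    box = sum(window) / len(window)
    ema += alpha * (s - ema)
    box_steps.append(box - prev_box)
    ema_steps.append(ema - prev_ema)
    prev_box, prev_ema = box, ema

# Step 16 is 11 samples after the outlier: nothing changed in the input,
# yet the boxcar average suddenly drops by ~1500 s.
print(f"boxcar step when the outlier leaves: {box_steps[16]:.0f} s")
print(f"EMA step at the same point:          {ema_steps[16]:.0f} s")
```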
RE: RE: So here's an
Just provided they don't multiply...
RE: RE: RE: So here's
Well sadly, rampaging monsters everywhere :-O , because those averages feed scales (which multiply), 2 cascaded providing 'gain' to the noise, like an audio amplifier.
One totally impractical theoretical solution would be just to take many more samples (than 11...). It could work if you didn't mind waiting a few days/weeks/months for estimates to settle in... but that's worse for the onramp period, which is a lot of the problem.
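The trade-off is the usual one for a plain mean: its standard error falls only as 1/sqrt(n), so a 10x cleaner estimate costs 100x the samples. (The 50% runtime-noise figure below is an illustrative assumption.)

```python
# Why 'just average more samples' is impractical: std error ~ noise / sqrt(n).
import math

noise = 0.5                  # assumed relative run-time noise
for n in (11, 100, 1100):
    stderr = noise / math.sqrt(n)
    print(f"n = {n:4d}: relative error of the average ~ {stderr:.1%}")
```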
Better (proven) approaches will be tried... nearly ready to set those short & medium term goals.
just remember that it's nice
Just remember that it's nice to quickly converge from a bad initial estimate, but if the task ends up "Time limit exceeded" it doesn't give any clue as to the eventual runtime. And we don't want to wind out the rsc_fpops_bound too much either, because hung apps occasionally happen. So we really need to do something about the initial runtime estimate for GPUs too.
Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.
RE: just remember that it's
Which seems to be specifically an initial FLOPs estimate problem. David initially dismissed it as a local Einstein phenomenon, so we ought to check that possibility and find out where the enormous values reported in this thread really came from.
And then we need a way to assign a realistic starting point for GPU speeds, across all projects and extensible to new cards as they are released without waiting for BOINC client updates. We have the problem that the NV API won't report shaders per SM, so estimates for new cards are always faulty until the hard-wired multiplier is updated: and I believe detection of new ATI cards like the R 290 is even more flawed.
I'm still monitoring credits,
I'm still monitoring credits, though the flow of validations seems to be slowing.
CPU tasks seem to be continuing as before, so let's concentrate on GPU for a while.
[pre] Jason Holmis Claggy Zombie ZombieM Jacob
Host: 11363 2267 9008 6490 6109 10320
GTX 780 GTX 660 GT 650M TITAN 680MX FX 3800M
Credit for Gw-CasA
Maximum 1503.78 1357.46 10951.9 1933.46 11847.5 10951.96
Minimum 115.82 88.84 153.90 91.50 94.88 508.73
Average 733.95 592.45 4200.42 994.85 2092.67 3640.96
Median 945.84 492.55 3037.38 1054.49 1642.95 1710.08
Std Dev 499.17 373.86 3179.62 374.42 2019.66 3970.94[/pre]
Is it my imagination, or is that trend continuing upwards? I think it was Eric that warned us that 'pure GPU' apps (without a CPU app_version to keep them grounded) tended to explode in the end.
Edit - yes:
RE: ... Is it my
Basically, mentally subtracting the noise is fine; then you see the shape of the convergence. Initial estimates appear to be about 1/3rd of what they should be (in credit terms, not time), which you could regard as 'fine' if convergence was more prompt... though obviously better would be a nice goal. Since my 780 was running at around 10% of expected credits at the beginning, and Zombie's TITAN at about 1/3rd, you can either accept that Zombie's multiple tasks per GPU from the beginning is a special situation, or not.
If you accept that Zombie's TITAN config is a special case, then you need to consider mine as 'normal', in which case the expected credits start about one tenth of what they should be. IOW the earlier observation from Holmis (?) seems correct, along with my observation of 3 second estimates, in that the estimates that are supposed to be conservative with respect to time bound, and generous with credit, are in fact the opposite... There is some math upside down.
Trends-wise, I'd say for certain that what you're seeing is there, since averages are in play. What's wrong with the shape is that the convergence (from wrong numbers toward the ballpark) is way too slow (too many tasks), and as much as it looks like it's trending upward now, the downward compensation afterwards will be just as slow and obvious. It's the rising edge of a long-term oscillation. You could just as easily say that some of those machines are trending down.
Let's wait and see. I've seen
Let's wait and see. I've seen some projects go exponential (literally - into the millions of credits) when they start like this.
RE: Let's wait and see.
That unstable? Interesting... Yeah, with instabilities it can ring or run completely off the rails... A bit hard for me to predict which :P (both conditions are unstable)
RE: RE: Let's wait and
One of my hosts is still showing that I reached a user RAC of 99,952,529.17 at AQUA - they were multithreaded CPU apps, rather than GPU, but they still blew up.
Edit:
July 2011
RE: RE: RE: Let's wait
You must admit AQUA was a special case, and they went rather suddenly offline before we had the slightest chance to investigate.
2011? feels like yesterday...
AQUA was made by D-Wave,
AQUA was made by D-Wave, which says it has built quantum computers and sold them for about a hundred million dollars each.
Tullio
RE: AQUA was made by
And still did, up to last year at least.
Google and NASA team up to use quantum computer
I really appreciate what you
I really appreciate what you guys are doing for ALL of us crunching grunts
I have some hardware I can bring on board but I don't have the time to provide the ins and outs of log files etc. Is there a particular app mix that would be most helpful?
[pre]
OS BOINC CPU GPU
Win7 7.2.42 980x 670 + 660Ti
Win7 7.2.42 920 7950
Win7 7.2.42 4670k Intel
[/pre]
I also have some other GPUs I could mix/match if you're looking for some diversity... 7850, 7770, GTX480, GTX295.
For GPU would it be better if I ran 1 per card or is it enough for your modeling if I just let you know what multiplier I am using?
Any value in providing overclock settings?
Attached a new host to
Attached a new host to Albert; looking through the logs I keep getting the following download error:
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_05.png
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_07.png
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_08.png
14-Jun-2014 06:06:33 [Albert@Home] Finished download of eah_slide_07.png
14-Jun-2014 06:06:33 [Albert@Home] Started download of EatH_mastercat_1344952579.txt
14-Jun-2014 06:06:34 [Albert@Home] Finished download of eah_slide_05.png
14-Jun-2014 06:06:34 [Albert@Home] Finished download of eah_slide_08.png
14-Jun-2014 06:06:34 [Albert@Home] Giving up on download of EatH_mastercat_1344952579.txt: permanent HTTP error
On this new host (as well as on my HD7770) I'm still getting the very short estimates for Perseus Arm Survey GPU tasks, so I've added two zeros to the rsc_fpops values so they'll complete.
Computer 11441
Claggy
updated details [pre] OS
updated details
[pre]
OS BOINC CPU GPU Utilization Factor
Win7 7.2.42 980x 670 0.50
Win7 7.2.42 920 7950 0.33
Win7 7.2.42 4670k 7850 0.50
[/pre]
The GPU plot continues to
The GPU plot continues to thicken.
Some of the machines I picked to monitor seem to have dropped out of the running, so I added one of my own yesterday.
It seems to have started in line with current trends, but like the others, there seem to be distinct 'upper' and 'lower' credit groupings. Why would that be?
I've also noticed something that doesn't show with this style of plotting against reporting time: when I've been infilling late validations, the credit awards - in line with the trendlines - have been much higher than their contemporaries got, and the double population is visible there too. For example, Claggy's WU 606387 from 5 June was awarded over 7K overnight.
That should show up in the changing averages:
[pre] Jason Holmis Claggy Zombie ZombieM Jacob RH
Host: 11363 2267 9008 6490 6109 10320 5367
GTX 780 GTX 660 GT 650M TITAN 680MX FX3800M GTX 670
Credit for BRP4G, GPU
Maximum 1584.75 1495.38 10951.9 2031.40 11847.5 10951.9 4137.85
Minimum 115.82 88.84 153.90 91.50 94.88 508.73 1355.49
Average 813.32 688.57 4120.83 1074.52 2010.65 2743.09 1917.71
Median 973.50 539.19 3037.38 1122.70 1591.80 1456.42 1641.43
Std Dev 523.51 428.48 3074.79 393.82 1823.08 3177.59 703.78
nSamples 36 52 48 387 189 17 21[/pre]
(Why does the new web code double-space [ pre ]?)
Edit - corrected copy/paste error on application type.
Interesting. Of course credit
Interesting.
Of course credit is awarded at time of validation, not reporting, against a moving scale, yadda yadda.
So late validations earn higher. The wingmate is then either a slow host or has a larger cache. The first is in line with expectations, IIRC; the second would be puzzling.
Then again, if you are observing a chaotic system, it's hard to know what may or may not happen anyway...
RE: Interesting. Of course
Yes, the two mixed and partially overlapping time domains for the credit mechanism (issue time and validation time) both oscillate & interact. Those two combined are enough to set up a resonance by themselves (additive & subtractive), but then with the natural variation in the real-world processing thrown in, you introduce another one or more 'bodies', making the system an unpredictable n-body problem.
For our purposes the dominant overlap is in those time domains, so sufficient damping to each control point to place them in separate time domains should be enough to break the chaotic behaviour.
e.g.
global pfc_scale -> minimise the coarse scaling error and vary smoothly over some 100's to 1000's of task validations
host (app version) scale -> vary smoothly over some 10's of validations, small enough to respond to hardware or app change in 'reasonable' time (number of tasks)
client side (currently either disabled, or broken by project DCF not being per-application) -> vary smoothly over a few to 10 tasks
Once separated in time, then this coarse->fine->finer tuning is difficult to destabilise, even with fairly sudden aggressive change at any point.
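The coarse-to-fine cascade can be sketched as three first-order smoothers with well-separated time constants. The alphas, the 2x step change, and the smoother structure below are illustrative assumptions, not values from any actual patch:

```python
# Sketch of time-scale separation: three cascaded first-order smoothers,
# each stage roughly an order of magnitude faster than the one above it.

def smoother(alpha, start=1.0):
    state = [start]
    def step(x):
        state[0] += alpha * (x - state[0])   # first-order (EMA) update
        return state[0]
    return step

project_scale = smoother(alpha=1/1000)   # ~100s-1000s of validations
host_scale    = smoother(alpha=1/30)     # ~10s of validations
client_dcf    = smoother(alpha=1/5)      # a few tasks

raw = 2.0                                # a sudden 2x change in behaviour
for _ in range(50):
    p = project_scale(raw)               # coarse: barely moves in 50 tasks
    h = host_scale(raw / p)              # fine: absorbs most of the residual
    c = client_dcf(raw / (p * h))        # finer: tunes what little is left
print(f"after 50 tasks: combined correction = {p * h * c:.2f} (target 2.00)")
```

Because each stage only ever sees the slow residual of the stage above it, an aggressive change lands mostly in the fastest loop and cannot excite the slower ones.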
The few wingmates I've
The few wingmates I've spot-checked against high credit awards (my own and others) don't seem to be outside the normal population. They too show bi-modal distributions, and high credits don't seem to be correlated with varying runtimes. I'm trying to run the GTX 670 as stable as possible, and runtimes are very steady.
Which leaves the moving average theory. Two more tasks have both reported and validated since I posted the graph: they both got over 2,000 credits, which suggests the average is still moving upwards.
Just FYI, I had to take my
Just FYI, I had to take my two CUDA machines off albert for a while. I need to help a team mate at another project. I will be back.
RE: The few wingmates I've
Yes the beauty of testing here has been the *relatively* consistent runtimes on a given host/gpu, which tends to demonstrate that much of the instability is artificially induced (as opposed to a direct function of the noisy elapsed times).
The upward trend would appear to be a drift instability, and bets are off as to whether it continues indefinitely, levels off, or (my guess) starts a downward cycle after some peak.
The bi-modal characteristic appears to be that oscillation between saturation and cutoff, most likely turn-around-time related. That'd be predominantly proportional error ('overshoot'). [Post-scaling by the average of the validated claims isn't going to help that, though it was a valiant attempt over the original choice of taking the minimum claim.]
Heisenberg? My Maths teacher?
Heisenberg? My Maths teacher? Chaos theory? Zen? Superstitious pigeons?
In a chaotic system, a trend observed may be genuine or coincidental.
And in any case, we shouldn't be putting too much effort into observing the scintillations of a soap bubble we intend to burst. Of course the colour pattern is fascinating indeed...
RE: Heisenberg? My Maths
Good. What I'm doing at the moment is drafting goals for the first patch. A lot of that will be involved with confirming/rejecting the particular suspects for the purposes of isolation. I do think the observations, or 'getting a feel' for the character of what it's doing, are important at the moment... and the patterns are pretty to look at... but you're right, I'd like to have the full first-pass patch ready for trials by the time Bernd's back on duty...
Outline for pass 1:
Objectives and scope
Resources
Procedure
Results
Discussion of Results
Conclusions and further work (plan for pass 2)
I've been asked to pull out
I've been asked to pull out stats on runtime and turnround, in case they explain anything - I don't think they do.
[pre] Jason Holmis Claggy Zombie ZombieM Jacob RH
Host: 11363 2267 9008 6490 6109 10320 5367
GTX 780 GTX 660 GT 650M TITAN 680MX FX3800M GTX 670
Credit for BRP4G, GPU
Maximum 1584.75 1495.38 10952.0 2031.40 11847.5 10952.0 4137.85
Minimum 115.82 88.84 153.90 91.50 94.88 508.73 1355.49
Average 813.32 688.57 4120.83 1074.52 2010.65 2743.09 1890.82
Median 973.50 539.19 3037.38 1122.70 1591.80 1456.42 1644.33
Std Dev 523.51 428.48 3074.79 393.82 1823.08 3177.59 614.08
nSamples 36 52 48 387 189 17 28
Runtime (seconds)
Maximum 4401.98 5088.99 11295.0 5383.71 23977.4 11774.3 4138.67
Minimum 3259.10 3294.83 8136.47 1908.26 1512.16 11515.5 4061.45
Average 3668.57 4483.11 8910.76 4194.81 4216.60 11623.3 4102.53
Median 3603.24 4608.05 8837.27 4228.18 4183.74 11603.6 4109.83
Std Dev 314.13 506.27 554.34 555.88 1981.31 70.73 25.21
Turnround (days)
Maximum 2.11 3.91 1.70 3.44 2.94 5.57 0.72
Minimum 0.15 0.07 0.14 0.24 1.52 0.18 0.15
Average 1.14 1.97 0.69 2.10 2.32 1.64 0.44
Median 0.73 1.90 0.74 2.00 2.42 0.88 0.48
Std Dev 0.79 0.98 0.35 0.57 0.32 1.68 0.17[/pre]
Zombie's Mac had just two of those extra-long runtimes:
WU 612084, 1,328.76 cr
WU 612045, 6,419.08 cr
Edit - I updated my own column since this morning, the others are unchanged.
hmm. maybe. would need to
Hmm, maybe. Would need to eyeball the code again to get more certainty.
Sample size is a bit small too - maybe in a few days.
I'll add that my GTX660Ti is
I'll add that my GTX660Ti is running 2 tasks at a time, mixing BRP4G and BRP5 from Albert and BRP5 from Einstein. The Intel HD4000 is running single tasks.
Here's an updated Excel file with data and plots from host 2267 and the following searches: BRP4X64, BRP4G, S6CasA, BRP5 (iGPU) and BRP5 (Nvidia GPU).
RE: ... in case they
We'll explore any possibility for sure. From an engineering perspective, understanding the spoon doing the stirring is probably a good idea [though asking it what it's doing might not reveal much :) ]. Probably we won't try getting rid of the spoon, but instead change what it's mixing... fine granulated sugar and some honey, instead of a solid hunk of extra-thick molasses.
RE: Nvidia GPU: RE:
A follow-up on my post about initial estimates: the 11th BRP5 task has now been validated and the APR has been calculated at 30.16 GFlops when running 2 tasks at a time.
So the initial estimate was that the card was a whopping 412.9 times faster than actual! =O
RE: RE: Nvidia
Yeah, for my 780 the initial guesstimates were 3 seconds, and tasks came in at about an hour, so about a thousand-fold discrepancy. Room for optimisation in the application doesn't account for the whole discrepancy (of course :) ).
Looks like there'll be some digging along these lines to do. Some factors to look at with new GPU apps & hosts might include:
- CPU app underclaim becomes the global scaling reference; we know these leave out SIMD (SSE->AVX) on raw claims, so they look more efficient than they are, because certain numbers come out 'impossible' if the logic were right (e.g. pfc_scale < 1) [perhaps up to 10x discrepancy]
- GPU overestimate of performance (attempting to be generous) [could be in the ballpark of another 10x discrepancy]
- Actual utilisation variation ( e.g. from people using their machines, multiple tasks per GPU...more...) [ combined perhaps up to another 10x]
- room for optimisation of the application(s), or inherent serial limitations [maybe from 1x to very big number discrepancy, let's go with another 10x]
So all told, with guesstimates, a 1000x discrepancy is easily plausible just among known factors, and there may be more... That's completely ignoring the possibility of hard-logic flaws, which are entirely possible, even likely.
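The compounding above in one place — several independent multiplicative error factors, each individually plausible at roughly 10x, multiplying into the observed thousand-fold discrepancy. The factor values are the rough guesses from the list above, not measurements:

```python
# Rough guesstimate factors from the list above; each one multiplies.
factors = {
    "CPU underclaim as global scaling reference": 10,
    "GPU peak-performance optimism":              10,
    "utilisation / multi-task variation":         10,
}
total = 1
for name, f in factors.items():
    total *= f
    print(f"x{f:3d}  {name}  (running total: x{total})")
print(f"combined: x{total}, before app-optimisation headroom or logic flaws")
```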
That will probably all be dependent on some of the customisations and special situations here at Albert, still to be examined in context. These include if & how some apps are 'pure GPU' or otherwise packages of multiple CPU-like tasks, and how they are wired in and scaled.
On the surface it looks like there may well be multiple of these things in play.
These all seem connected, so I suspect we'll need to make the patch multi-step, so we can switch in one small change at a time and watch for weird interactions.
For example: fix the CPU-side coarse scale error, then watch the GPU estimate &/or credit blow out in response... LoL
RE: RE: ... in case they
)
'There is no spoon' ;)
Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.
RE: RE: RE: Nvidia
)
Must be picking up extra factors from somewhere - might be due to the fact that (as Eric stated) there is no damping of GPU by CPU figures. But there are one or more massive scaling errors lurking around.
Dang. That code is a nightmare to walk.
Might not all be flops scaling error of course; need to cast an eye over the rsc_fpops_est calculation as well...
RE: might not all be flops
)
Since I'm not running anonymous platform, rsc_fpops_est isn't being scaled.
I'm seeing
fpops_est 280,000,000,000,000
FLOPs 69,293,632,242
APR 69.29 GFLOPS
Order-of-magnitude sanity check:
280e12 fpops / 70e9 flops/sec --> ~4e3 s runtime estimate. Correct on both server and client, and sane.
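The same check with the exact numbers above (illustrative only; this is the standard estimate-equals-work-over-speed arithmetic, not project code):

```python
# Runtime estimate = estimated work / estimated host speed.
rsc_fpops_est = 280e12        # fpops_est from the workunit
flops = 69_293_632_242       # host speed estimate (APR of 69.29 GFLOPS)
runtime_est_s = rsc_fpops_est / flops
print(f"{runtime_est_s:.0f} s")  # ~4041 s, i.e. the 4e3 quoted above
```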
Edit: identical card in same host has APR of 156.36 at GPUGrid. I'm running one task at a time there, and two at a time here, so I'd say that the speed estimates agree. IOW, runtime estimates are OK once the initial gross mis-estimation has been overcome: this bit of the quest is for credit purposes only.
I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.
http://www.boincsynergy.com/images/stats/comb-3475.jpg
Ta. Looks like fpops estimate
)
Ta. Looks like fpops estimate isn't too shabby then. But you need to exclude that contribution.
RE: Ta. Looks like fpops
)
In both the Seti cases, and in the CPU app case here that I've only looked at cursorily, the fpops_est values have been what I would call 'reasonable'. I don't think it would be reasonable to ask project developers to estimate more precisely than what theory tells them (though it's certainly open to debate).
Overnight I've been musing on how to stage/phase/break-up the tests. Since there seems to be evidence of interaction between CPU & GPU scaling, I'd like to start with the coarse CPU scaling first, and prescribe watching all applications for effects when it's engaged.
@Richard, yeah, the pfc_scale should be compensating handily for any initial SIMD-related disagreement there (it's had enough time), but since the scaling swing is in the opposite direction to the GPU, and likely below 1 as at Seti (which implies magical CPU fairies), I believe the coarse scaling correction there should be the first step, in isolation. Supporting evidence includes the SIMAP non-SIMD app on SIMD-aware Android client whetstone, as well as Seti's uniformly below-1 pfc_scales despite quite tight theoretically based estimates.
The damping component I'll shift to pass 2. It's important, and has proven treatments available, but I feel that analysis of the impact of the unmitigated [CPU app] coarse scaling error is more important at this stage.
[Edit:] i.e. in the examples given here, median credit awarded should be a bit over the prior fixed credit of 1000, rather than 500-600, so the SIMD scaling correction of 1.5^2 = 2.25x seems appropriate given the numbers.
For dominant single precision, that'd be 1.5x for each step in log-base-2 of the vector length: 1.5x for mmx, 2.25x for SSE+, 3.375x for avx256 (where the app and host support it). Coarse correction will be just fine IMO, as damping will be added in pass 2.
Maximum supported vector length in the app can be regarded as known, and client scheduler requests contain the CPU features and OS, so take a min() of the two.
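A minimal sketch of that min() logic and the 1.5x-per-log2-step rule described above (the function name and width table are illustrative, not the BOINC server API):

```python
import math

# 1.5x per doubling of single-precision SIMD vector width,
# capped by whichever of the app and the host supports less.
VECTOR_WIDTH = {"scalar": 1, "mmx": 2, "sse": 4, "avx256": 8}  # floats/op

def coarse_simd_scale(app_max: str, host_max: str) -> float:
    width = min(VECTOR_WIDTH[app_max], VECTOR_WIDTH[host_max])
    return 1.5 ** math.log2(width)

print(coarse_simd_scale("sse", "avx256"))     # 2.25 (app limited to SSE)
print(coarse_simd_scale("avx256", "avx256"))  # 3.375
```

Using min() keeps an AVX-capable host from being scaled as AVX when the app itself was only built for SSE, and vice versa.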