The project will be taken down in about an hour to perform an update of the BOINC server code. Ideally you shouldn't notice anything, but usually the world isn't ideal. See you again on the other side.
Comments
RE: @Richard, yeah the pfc
Are you sure that the SETI rsc_fpops_est are 'tight'?
I remember that when we were helping Josh Von Korff choose initial runtime estimates for Astropulse, we had a rule of thumb that the *stock* MB CPU app reached a DCF - the only available scaling factor in those days - of ~0.2 on the then cutting-edge Intel Core2 range (Q6600 and similar). The stock SETI app has had internal despatch of at least some SIMD pathways for a long time, and more have been added over the years.
Knowing Eric's approach to these matters - he's never wanted to exclude anyone from the search for ET, no matter how primitive their hardware - I suspect rsc_fpops_est may have been 'tight' for the mythical cobblestone reference machine (1 GHz FPU only), but never since.
Certainly the same GPU I'm plotting here has a SETI APR of 180, again running two tasks at once. So, even allowing for the inefficient GPU processing of autocorrelations (which I don't think has been reversed into the AR-fpops curve), SETI thinks this card is twice as fast as the stock BOINC code here and at GPUGrid does. At GPUGrid, it runs pretty tight cuda60 code, and since the whole project has revolved around NV and cuda since they dropped the PS3 dead-end, I reckon they know their stuff. They over-pay credit, but that's a manual decision, not BOINC's fault.
I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.
http://www.boincsynergy.com/images/stats/comb-3475.jpg
RE: Are you sure that the
Yes, to +/- 10% of actual operations w/o overhead (which is not 'paid' at this point). That's still quite a spread in user and machine utilisation terms, which is the entropy that damped responses should be absorbing (as opposed to estimates). That's still approximately 3.3-30x 'tighter' than the coarse scaling error induced by unaccounted-for AVX multiplied by machine utilisation.
It's relative. [You use the theoretically best serial algorithm for the estimate, as opposed to reflecting back the implementation, inefficient or otherwise]... On SETI the GPU autocorrelation uses my own 4NFFT method, so in [uncounted] computation it's 4x... Since that's drowned by latencies to the tune of 60% on high-end cards, you roughly double the claim for that portion. Other areas (variable) are more efficient. Then divide the overall claim by about 3.3 (because of the AVX global pfc). The net result is 'shorties' that should be getting ~100 credits getting ~40-60.
A similar effect is happening here, with tasks that should be ~1000, seeing a median of ~500. It's not the project supplied estimates that are out, but the induced scaling error.
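The mechanism can be sketched numerically. This is a toy model - the 3.3x AVX factor is the one quoted above, everything else is an illustrative made-up number, not project data:

```python
# Toy model of the SIMD/Whetstone mis-scaling described above.
# Numbers are illustrative assumptions, not measured project data.

true_ops = 1.0e13      # operations a task actually takes (honest count)
whetstone = 3.0e9      # client's non-SIMD Whetstone benchmark, flops
simd_factor = 3.3      # real AVX throughput relative to Whetstone

# The AVX app finishes in less wall time than Whetstone predicts...
runtime = true_ops / (whetstone * simd_factor)

# ...so its claimed operation count (runtime x benchmark) is ~3.3x low.
claimed_ops = runtime * whetstone
pfc_scale = claimed_ops / true_ops   # ends up ~1/3.3, i.e. ~0.30

# A task 'worth' ~100 credits gets divided by the same factor.
print(round(pfc_scale, 3), round(100 * pfc_scale, 1))  # 0.303 30.3
```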
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage
RE: RE: Are you sure that
I have to say I'm ... surprised.
For a recently completed task, I can see
14979862651149.264000
111064200000000.000000
Flopcounter: 38995606754768.391000
Because I'm running Anon. Plat., the first is scaled to allow the client to show a decent runtime estimate using its internal reference speed. It seems to be using 9235454731.586426, although the APR on that machine is 122.47 GFLOPS.
The second and third differ by Eric's infamous 'credit multiplier' of x2.85, of course.
I'll have to load up a machine with a stock app sometime and try that one again.
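For what it's worth, those three figures are mutually consistent; a quick arithmetic check (numbers copied from the post above):

```python
# Cross-check of the three numbers quoted above.
scaled_est  = 14979862651149.264   # first figure (client-side, scaled)
rsc_fpops   = 111064200000000.0    # second figure (rsc_fpops_est)
flopcounter = 38995606754768.391   # third figure (app's Flopcounter)
ref_speed   = 9235454731.586426    # apparent internal reference speed, flops

# Second and third differ by the ~2.85 'credit multiplier':
print(round(rsc_fpops / flopcounter, 2))   # 2.85

# First divided by the reference speed gives the runtime estimate:
print(round(scaled_est / ref_speed))       # 1622 seconds
```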
RE: RE: RE: Are you
'That' has already been downscaled. 2.85x would indeed be a great compromise if, on average, most of the hosts were SSE+-equipped (~2.25x), with about a third taking up AVX (3.375x).
RE: Attached a new host to
Was related to the web code update. Shouldn't have affected crunching. Should be fixed now.
BM
RE: 'That' has already
Yes, I said I was running anon. plat. I'll have to get a raw one on a stock machine.
?? My own machine - attached after the bulk of the averaging has been done by the population at large - is showing a median of 1716.32 currently. That's climbed from 1644.33 at last night's show.
I didn't want to spam the boards with my stats - just milestone theads - but apparently signatures are no longer optional. Follow the link if you're interested.
http://www.boincsynergy.com/images/stats/comb-3475.jpg
RE: RE: 'That' has
i.e. median on *my* GPU here early in the piece, which has been running one task at a time, from the figures you posted earlier at http://albertathome.org/node/84961&postid=112911
We know that after that it's been going up. It's not clear yet, IMO, whether that's a controlled rise/correction or another instability.
[There are two scales fighting: the host app version driving it upward, and the global PFC scale selected from the underclaiming CPU app driving it down. Which one wins is a matter of numbers]
RE: i.e. median on *my* GPU
Still rising inexorably here - this is roughly the last three days (right margin is midnight UTC tonight). Horizontal lines are 1K and 10K, still logarithmic. My minimum (of 60) is 1355
Edit - to your edit: there's no CPU app here - that was Eric's point.
'Binary Radio Pulsar Search' and 'Binary Radio Pulsar Search (Arecibo, GPU)' are deployed as different Applications on this project, not just different app_versions.
RE: RE: RE: RE: Are
Flops under anon are [unless supplied in app_info.xml] raw Whetstone for the CPU and some mystical figure for the GPU. At some point it was just 10x CPU, and I am seeing roughly 10x for Eve.
The runtime estimate for anon is a combination of APR and rsc_fpops_est, iirc.
Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.
RE: Still rising inexorably
No certainty in how that will react to correcting the CPU app scale as patch one. In an ideal world it wouldn't react at all (though I firmly believe it will react visibly, I'm open to surprises). That's what I want to watch, because I suspect it should bump up a bit further, then level, drop and oscillate (for subsequent smoothing in patch two). Anyway, with such steady work it should have stabilised by now and hasn't, so I'm rolling up my sleeves for the first pass (CPU coarse scale correction).
RE: [There's two scales
There is no underclaiming CPU app!
edit: what Richard said... global pfc are for GPU apps only - but there are several of those according to applications
RE: CPU flops under anon
Yep, that's pretty much the way I remember it (spaghetti, including assorted mysticism and voodoo :P). Once we get the CPU mis-scaling out of the picture, pass 2 might either be focussed on smoothing/damping, GPU scale correction, or both. In any of those cases, I'll let what happens with the CPU guide me.
RE: ... global pfc are for
Whatever the web pages call them (probably not the same thing), in code the global pfc_scale for a suite of applications comes from the lowest-claiming application for given tasks (by unstable averages). That's then used to downscale the estimate of every other application wholesale, and is how the underclaiming SSE-AVX apps are dividing the credit. They are claiming fewer operations than it actually takes to do a task, evidenced by pfc_scales being below 1 (magical fairy CPU applications doing work for free).
I'm telling you it takes no fewer than nlogn operations to do an FFT ...
- AVX CPU + App tells me it did it in nlogn/3.3 . 'I'm Magic'
- Server stupidly sets pfc_scale to 1/3.3 'This magic app wins'
- Jason says 'hang on a minute, Boinc client uses broken inapplicable non-SIMD whetstone to calculate that... Ya canna defy the laws of physics, there are no CPU fairies or magic... It's using SIMD and doing up to 8 operations per cycle for an average of ~3.3x throughput' (scaling is about 50%)
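The selection logic described above can be paraphrased in a few lines. This is a sketch of that description with hypothetical per-version claim ratios, not the actual server code:

```python
# Sketch: the lowest-claiming app version sets the scale for everyone.
# Claim ratios are hypothetical; 1.0 would be an honest operation count.
mean_claim_ratio = {
    'cpu_sse': 1 / 2.25,   # SSE app under-claims ~2.25x
    'cpu_avx': 1 / 3.3,    # AVX app under-claims ~3.3x -> the minimum
    'gpu_cuda': 1.0,       # honest, for the sake of the example
}

# The 'magic' lowest claimer wins and becomes the global scale...
global_scale = min(mean_claim_ratio.values())   # ~0.303

# ...so a task worth ~100 credits is granted ~30, whatever version ran it.
print(round(100 * global_scale, 1))   # 30.3
```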
RE: RE: ... global pfc
that would be whatever _GPU_ application claims lowest!
AVX doesn't come into play there. Don't ask me what happens when there is no CPU version to take the lead. Well, we see what happens: we end up pretty low (compared to the dip CPU apps take, as per my 'day zero' analysis. Where did I put that? wiki?)
Take your pick of opencl_ati for Win, Mac, Linux and cuda versions for Win, Mac and Linux.
RE: RE: RE: ... global
I'm pretty sure the CPU apps will be claiming far fewer operations, so they'd be chosen as the scale. This project is a bit special because of those aggregate GPU tasks, so how that's weaved in is another question (i.e. treated as GPU only? or 16x a CPU task?).
With the current logic, pure GPU-only projects would likely grant from 3.3x-100x the credit, and initial estimates would not be all that bad.
RE: RE: ... global pfc
Well, you're the one who's been walking the code, but let's standardise on one set of terminology.
Using *BOINC* terminology, the hierarchy is
[pre]Project
|__> Application
|__> App_version (each separate binary executable)[/pre]
The 'underclaiming SSE-AVX' *executables* are a suite within the BRP4 (CPU only) Application. That is different from, and will have (if I understand you correctly) a *different* pfc_scale from, the suite of BRP4G (GPU only) executables.
According to the CreditNew whitepaper, normalisation is applied across versions of each application, not across the project as a whole. So, in the specific case at point (BRP4 and BRP4G), the GPU app will not be scaled by CPU concerns, because there isn't a CPU executable within the GPU suite.
RE: RE: RE: ... global
That's where it depends on how they hooked in those *4G and *5G aggregates of 16 tasks. If the estimate is standalone, then there is no visible reason it should give us 3-second estimates for hour-long tasks. If it is hooked in via a multiple of the CPU app's pfc_scale and/or the CPU app estimate, then that would explain it.
RE: RE: RE: RE: ...
JASON, CPU and GPU have different APPS here!
there is a BRP app and a BRPG app. That's like MB and AP on SETI...
or are you saying pfc for MB and AP are the same? :P
RE: JASON, CPU and GPU have
No, and don't yell at me please ;)
I am saying that the BRP4G and 5G estimates come from CPU app estimates multiplied by 16 ---> different app & hardware, but estimates are linked.
RE: That's where it depends
I think that one is for Bernd and/or Oliver to answer. Or find the code for it...
RE: RE: That's where it
That's what I asked for the customisations for. In either case, It's still broke ;)
RE: RE: JASON, CPU and
I'm female. If you don't listen, I yell. At which point the voice becomes so high-pitched, males go into automatic 'isn't she cute when she's upset' ignore mode :P Aren't stereotypes great?
that might be an explanation for that unreasonable extra scaling we are seeing, indeed.
RE: In either case, It's
If It is broke, It needs to either work, win the lottery or find somebody that will keep It [I shudder to think what services It might supply in return].
RE: That's where it depends
The runtime estimate we see displayed in a BOINC client doesn't come - as an estimate - from the server. Instead, we get two different numbers from the server - job size and host speed - and the client does the math.
size --> rsc_fpops_est
speed -->
I have no doubt that Bernd, Oliver et al will have set the rsc_fpops_est for the GPU app at 16x the est for the CPU app. We can check that, because rsc_fpops_est is set by hand in the workunit template, and transmitted *unchanged* through the workunit generator (analog of SETI's 'splitter') and on to our computers. At this project - thankfully - all workunits of a given type will have the same rsc_fpops_est.
You saw 3-second estimates because the *speed* term in the formula was wrong. We need to track that down, but it's nothing to do with a task size estimate. And it corrected itself once there was a usable APR for the host, to substitute for the faulty initial estimate.
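Schematically, the client-side arithmetic is a single division. A sketch using the BRP4G figures quoted elsewhere in this thread (280e12 rsc_fpops_est, ~69.12 GFLOPS from APR):

```python
# Client runtime estimate = job size (from server) / host speed (from server).
rsc_fpops_est   = 280e12      # BRP4G job size, same for every workunit
projected_flops = 69.12e9     # host speed derived from APR

print(round(rsc_fpops_est / projected_flops))   # 4051 seconds
```

A wrong speed term, not a wrong size term, is what turns an hour-long task into a 3-second estimate.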
RE: RE: That's where it
Actually, at least for anon platform, the code says est and bound are being scaled by *something*... checking what, and where non-anon-platform sets those, is requiring some backtracking.
RE: RE: RE: That's
Completely agree for anon_plat. But we're going down the other fork here.
...
...
RE: RE: Actually at least
Yep, we need both... looking for where the scheduler calls add_result_to_reply(), which is located in sched_send.cpp. If the preceding raw estimates come straight from the WU generator (aka splitter, as you say), and never get a scale, then there are further mysteries to track down further into the GPU portion of the expedition.
For your info, my
For your info, my i7-2600K/HD7770 is now picking up Gamma-ray pulsar search #3 tasks. The initial CPU estimates look OK at 4hrs 55mins; the ATI estimates are at 5 seconds.
(This application type has CPU, Nvidia, ATI and Intel apps across Windows, Mac and Linux (But no Intel app on Linux))
All Gamma-ray pulsar search #3 tasks for computer 8143
Claggy
RE: Actually at least for
One of those somethings is APR, since APR can't be sent as-is under anon.
RE: RE: RE: Actually at
Let's focus on the NON-anon_plat case for now. That's the one in widest use across the BOINC community.
We can see the process in action in the server logs in this project. Taking my latest for reference:
In my case, the main estimation variable is
setting projected flops based on host elapsed time avg: 69.17G
I see something new in there, too:
comparison pfc: 95.26G et: 69.17G
Later on, the server does its own version of the runtime estimate: that's purely to check that it's not sending more work than the host requested. There's a little more scaling at that stage, to account for hosts which don't run 24/7. My correction is tiny, but Eyrie would see a big rescale for restricted BOINC availability.
RE: For your info, my
Whetstone, flops and rsc_fpops_est for GPU and CPU?
edit: 'please' - sorry ::)
RE: In my case, the main
I've seen that annotation before, somewhere.
rr_sim I think - can you look at a sample please, to check local boinc log against server values?
RE: RE: In my case, the
17/06/2014 18:08:02 | | [rr_sim] start: work_buf min 25920 additional 3456 total 29376 on_frac 0.999 active_frac 1.000
17/06/2014 18:08:02 | Albert@Home | [rr_sim] 0.00: p2030.20130202.G202.32-01.96.N.b1s0g0.00000_3552_3 finishes (17271.46G/69.12G)
17/06/2014 18:08:02 | Albert@Home | [rr_sim] 2808.68: p2030.20131124.G176.58-00.38.S.b5s0g0.00000_3024_2 finishes (240866.14G/69.12G)
17/06/2014 18:08:02 | Albert@Home | [rr_sim] 3551.29: p2030.20131124.G176.58-00.38.S.b5s0g0.00000_2912_2 finishes (280000.00G/69.12G)
17/06/2014 18:08:02 | Albert@Home | [rr_sim] 4293.89: p2030.20131124.G176.30-00.82.S.b4s0g0.00000_3040_2 finishes (280000.00G/69.12G)
The first two must be running tasks, partially completed. But the next two show the same 280e12 I posted earlier as rsc_fpops_est here, divided by the familiar speed from APR.
RE: RE: For your info, my
CPU p_fpops is 4514900817.923695
HD7770 peak_flops is 3584000000000.000000
flops for the CPU app_version of hsgamma_FGRP3 is 845960315.482654
flops for the ATI GPU app_version of hsgamma_FGRP3 is 2950327174499.708000
rsc_fpops_est is 15000000000000.000000, with rsc_fpops_bound at 300000000000000.000000
With a Gamma-ray pulsar search #3-only request I got:
https://albert.phys.uwm.edu/host_sched_logs/8/8143
2014-06-17 17:18:23.1994 [PID=2155 ] [send] CPU: req 8330.13 sec, 0.00 instances; est delay 0.00
2014-06-17 17:18:23.1995 [PID=2155 ] [send] AMD/ATI GPU: req 8692.21 sec, 0.00 instances; est delay 0.00
2014-06-17 17:18:23.1995 [PID=2155 ] [send] work_req_seconds: 8330.13 secs
2014-06-17 17:18:23.1995 [PID=2155 ] [send] available disk 95.78 GB, work_buf_min 95040
2014-06-17 17:18:23.1995 [PID=2155 ] [send] on_frac 0.923624 active_frac 0.985800 gpu_active_frac 0.984082
2014-06-17 17:18:23.1995 [PID=2155 ] [send] CPU features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 pbe
2014-06-17 17:18:23.3103 [PID=2155 ] [mixed] sending locality work first
2014-06-17 17:18:23.3223 [PID=2155 ] [version] get_app_version(): getting app version for WU#604131 (LATeah0109C_32.0_0_-1.48e-10) appid:30
2014-06-17 17:18:23.3223 [PID=2155 ] [version] looking for version of hsgamma_FGRP3
2014-06-17 17:18:23.3224 [PID=2155 ] [version] Checking plan class 'FGRPopencl-ati'
2014-06-17 17:18:23.3234 [PID=2155 ] [version] reading plan classes from file '/BOINC/projects/AlbertAtHome/plan_class_spec.xml'
2014-06-17 17:18:23.3234 [PID=2155 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_fgrp' : true : 1.000000
2014-06-17 17:18:23.3234 [PID=2155 ] [version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G
2014-06-17 17:18:23.3234 [PID=2155 ] [version] Best app version is now AV911 (85.84 GFLOP)
2014-06-17 17:18:23.3235 [PID=2155 ] [version] Checking plan class 'FGRPopencl-intel_gpu'
2014-06-17 17:18:23.3235 [PID=2155 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_fgrp' : true : 1.000000
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [version] No Intel GPUs found
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [AV#912] app_plan() returned false
2014-06-17 17:18:23.3235 [PID=2155 ] [version] Checking plan class 'FGRPopencl-nvidia'
2014-06-17 17:18:23.3235 [PID=2155 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_fgrp' : true : 1.000000
2014-06-17 17:18:23.3235 [PID=2155 ] [version] plan_class_spec: No NVIDIA GPUs found
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [AV#925] app_plan() returned false
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G
2014-06-17 17:18:23.3235 [PID=2155 ] [version] Best version of app hsgamma_FGRP3 is [AV#911] (2950.33 GFLOPS)
2014-06-17 17:18:23.3236 [PID=2155 ] [send] est delay 0, skipping deadline check
2014-06-17 17:18:23.3264 [PID=2155 ] [send] Sending app_version hsgamma_FGRP3 7 111 FGRPopencl-ati; projected 2950.33 GFLOPS
2014-06-17 17:18:23.3265 [PID=2155 ] [CRITICAL] No filename found in [WU#604131 LATeah0109C_32.0_0_-1.48e-10]
2014-06-17 17:18:23.3265 [PID=2155 ] [send] est. duration for WU 604131: unscaled 5.08 scaled 5.59
2014-06-17 17:18:23.3265 [PID=2155 ] [send] [HOST#8143] sending [RESULT#1450173 LATeah0109C_32.0_0_-1.48e-10_1] (est. dur. 5.59s (0h00m05s59)) (max time 101.68s (0h01m41s68))
2014-06-17 17:18:23.3291 [PID=2155 ] [locality] send_old_work(LATeah0109C_32.0_0_-1.48e-10_1) sent result created 344.0 hours ago [RESULT#1450173]
2014-06-17 17:18:23.3291 [PID=2155 ] [locality] Note: sent NON-LOCALITY result LATeah0109C_32.0_0_-1.48e-10_1
2014-06-17 17:18:23.3292 [PID=2155 ] [locality] send_results_for_file(h1_0997.00_S6Direct)
2014-06-17 17:18:23.3365 [PID=2155 ] [locality] in_send_results_for_file(h1_0997.00_S6Direct, 0) prev_result.id=1488887
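As a cross-check, the three durations in that log hang together arithmetically. The availability-fraction scaling in the middle step is my inference from the numbers, not something confirmed in code:

```python
# Reconstructing the durations in the server log above.
rsc_fpops_est   = 15e12        # from the app details posted above
rsc_fpops_bound = 300e12
projected_flops = 2950.33e9    # the [AV#911] PFC avg, used as a speed
on_frac         = 0.923624     # host availability fractions from the log
gpu_active_frac = 0.984082

unscaled = rsc_fpops_est / projected_flops           # est. duration, unscaled
scaled   = unscaled / (on_frac * gpu_active_frac)    # est. duration, scaled
max_time = rsc_fpops_bound / projected_flops         # 'max time' limit

print(round(unscaled, 2), round(scaled, 2), round(max_time, 2))
# 5.08 5.59 101.68 -- matching the log lines above
```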
Claggy
RE: RE: RE: For your
Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?
RE: I've seen that
Yes, we were there the other day digging out where Whetstone was hiding: sched_version.cpp, the estimate_flops() functions. One for non-anon, and another slightly different one for anon. For non-anon, before statistics are gathered it's BOINC Whetstone for the CPU (incidentally SIMD-aware on Android but not x86), and some mystery guesstimate for GPUs.
RE: RE: I've seen that
Those mystery guesstimates for GPUs are one of the major quarries for our quest.
Claggy's ATI is running at 2.95 Teraflops, to put it in simpler numbers.
RE: Unfortunately I missed
I did another request, and suspended network:
https://albert.phys.uwm.edu/host_sched_logs/8/8143
Claggy
RE: RE: RE: I've seen
Yep. Also be aware in that area, just to complicate matters, that there is a scheduler config option David's thrown in, enabling a random multiplier across the project_flops for each app_version, so that app versions get juggled at least before stats are gathered.
I'm getting the distinct impression he's 'lost' the old 0.1 GPU flops scaling there (haven't come across it yet anyway, still looking), meaning it'll probably be using the raw client-supplied marketing flops value, possibly multiplied by some random number...
RE: RE: Unfortunately I
[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G
RE: RE: RE: Unfortunate
That's not TeraFLOPS (speed); that's peak flop count, as in # of operations. (verifying in code now)
*scratch that* looks broken, walking the lot with beer
RE: RE: RE: RE: Unfor
The server is using it as a speed for estimation purposes. Maybe that's our problem.
RE: *scratch that* looks
peanut gallery: that's like saying that water is wet after falling in and getting soaked...
Enjoy the beer. Valium might be the better choice.
RE: RE: RE: RE: Unfor
Boinc startup says:
17/06/2014 18:17:17 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.5 (VM), device version OpenCL 1.2 AMD-APP (1348.5), 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.5 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.5))
The GTX460 always had a lot lower GFLOPS peak value, but was a lot more effective at Seti v6, v7 and AP v6, the exception being here, and the OpenCL Gamma-ray pulsar search #3 1.07 app, where the HD7770 was a little faster:
https://albert.phys.uwm.edu/host_app_versions.php?hostid=8143
http://boinc.berkeley.edu/dev/forum_thread.php?id=8767&postid=51659
04/12/2013 21:25:07 | | CUDA: NVIDIA GPU 0: GeForce GTX 460 (driver version 331.58, CUDA version 6.0, compute capability 2.1, 1024MB, 854MB available, 1075 GFLOPS peak)
04/12/2013 21:25:07 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 331.58, device version OpenCL 1.1 CUDA, 1024MB, 854MB available, 1075 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.4 (VM), device version OpenCL 1.2 AMD-APP (1348.4), 1024MB, 984MB available, 3584 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.4 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.4))
Claggy
RE: RE: RE: RE: Quote
Of course it's speed; it's APR later - 'based on' is our problem - something is being factored in incorrectly. AFAIK on SETI there's no such gross overestimation of GPU speed.
@ Claggy what is the peak flop count for that card? (sorry if you posted that already)
edit: ta.
peak flops x pfc_ave ? the latter being <1 ?
yes, this is bizarre: once
yes, this is bizarre:
once stats are gathered:
Dodgy average aside (and we know all about the problems of sampled averages there, particularly with very few samples), it looks like the ratio of the marketing flops estimate (from the client) to operations (effectively claimed).
Going to check if he's tweaked the definition of pfc here, because a flops rate over average operations would give average time in seconds to me... checking that pfc with that beer...
[Edit:] no sign of our 0.1x scaling for GPU either, at least in albert code.
Jason, with the high-scoring
Jason, with the high-scoring late validations, your average is now above par, at 1003.97
And your median is higher still, at 1168.97
Ok, so it is effectively
Ok, so it is effectively using a scaled (marketing) peak flops value - in other words a totally unrealistic estimate.
We do need something as a starting point though. Those peak flops are as inadequate as using 10X CPU speed was.
Eve comes in at 91e9 peak flops. From SETI (too small to run here) her GPU is slightly faster than her CPU. The CPU needs ~2h for BRP, so roughly the GPU tasks would take 32 hours. That makes her about 32x slower than a 780 - that's the span we are dealing with, and it will only grow larger as GPUs get ever faster.
91*32 = 2912 - which is about the figure we saw earlier for fast GPUs - so the slope of the peak flops is not too bad, but the offset is. With an APR of 33 for the 780 and about 1 for Eve we are looking at a ~90x overestimate. For BRP at least.
That scaling value that is being applied must bring the estimates into the correct magnitude over on SETI...
any chance to get that number from Eric?
I don't know. If you underestimate the speed, you cache too few tasks - more frequent top up - only a problem if you really can't connect for longer periods of time as you'd run dry (not really a problem either ;) ).
It's the overestimation that runs afoul of the built-in safety-checks.
So how about using 1/100 of peak flops as a GPU starting point? I mean you have to start _somewhere_ ...
Any problems with underestimating I've failed to consider?
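Rough numbers on that span, using only figures already quoted in this thread (order-of-magnitude only; the ~2950 GFLOPS is the 'PFC avg' speed seen in the server log earlier):

```python
# How far the marketing-peak 'speed' sits from measured APR, per the thread.
peak_seen = 2950.33   # GFLOPS, the PFC-avg figure the server used as a speed
apr_fast  = 33.0      # GFLOPS, APR quoted above for a fast card (780)
eve_peak  = 91.0      # GFLOPS, Eve's GPU peak
slowdown  = 32        # Eve vs a 780, as estimated above

print(round(peak_seen / apr_fast))   # ~89x overestimate on the fast card
print(eve_peak * slowdown)           # 2912.0, the figure quoted above
```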
RE: Jason, with the
good. better late than never :D
Yes, we'll definitely need to stabilise CPU here first. GPU is going to take a bit more digging yet, and whether or not there is any connection at estimate, scheduler or validation will be determined before that one's tackled in detail.
There are definitely those dicey averages in play (everywhere) to start with, and then I'm also surprised to be finding reliance on those (nearly useless) GPU marketing flops figures embedded even after stats are gathered. Until the primary CPU scales are fixed, and averages of all kinds are replaced with damped values, any particular odd logic choice in there is likely to be obliterated in the noise anyway. (Paraphrasing the comments about chaos burying the noise, lol)