Wrong estimates of "Remaining" time

Nikolay
Joined: 13 Jan 12
Posts: 4
Credit: 6500
RAC: 0
Topic 84817

The Binary Radio Pulsar Search 1.19 (atiOpenCLLion) application constantly gives wrong estimates of the remaining time.
For example, right now I have a WU that is about 10% complete with about 1 hour elapsed,
but the "remaining" time is near 83 hours, while it should be somewhere near 9 hours.

----------------------------------------------------------------------------------
Mac OS X Lion 10.7.2, CPU Intel i7-2720QM, GPU AMD Radeon HD 6750M

Gary Roberts
Joined: 9 Feb 05
Posts: 17
Credit: 85000
RAC: 0

Wrong estimates of "Remaining" time

Quote:
... "remaining" time is near 83 hours - while it should be equal to some value near 9 hours.


Sure, but I think you'll find that's just the way BOINC works when the original estimate is significantly wrong. The remaining time should be decreasing relatively quickly, e.g. by the time you get to 20% completed (~1 extra hour), maybe the value will be 60 or 50 or 40 hours, but it certainly won't be 8 hours. It should progressively get closer to reality, but it won't become very close until much nearer to the finish.
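
The behaviour described above can be pictured with a small sketch. It assumes the displayed value is a blend of two guesses, weighted by the fraction done: one taken from the original estimate and one extrapolated from progress so far. The blend rule, the function name and the 100-hour initial estimate are all assumptions made for illustration, not something confirmed in this thread.

```python
def blended_remaining(elapsed_h, fraction_done, initial_estimate_h):
    """Illustrative only: blend a static estimate with a progress-based one."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    # Progress-based guess: assume the rest runs at the rate seen so far.
    dynamic = elapsed_h * (1.0 - fraction_done) / fraction_done
    # Static guess: what is left of the original (possibly wrong) estimate.
    static = initial_estimate_h * (1.0 - fraction_done)
    # Trust the progress-based guess more as the task nears completion.
    return fraction_done * dynamic + (1.0 - fraction_done) * static


if __name__ == "__main__":
    # A task that really takes 10 hours but was estimated at 100 hours (assumed).
    for pct in (10, 20, 50, 90):
        frac = pct / 100
        elapsed = 10.0 * frac
        print(pct, round(blended_remaining(elapsed, frac, 100.0), 1))
```

Under these assumptions the value starts in the 80s at 10%, drops into the 60s at 20%, and only gets close to the truth near the end, which is the pattern described in the post.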

If the remaining time is continuing to stay near 80 hours then maybe you really do have something to worry about :-).

Cheers, Gary.

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

The problem you run into is one that's always been with BOINC. Unless a project stands up that only allows one type of computer to attach, with exactly the same hardware as the one they made their tasks on, this cannot be solved that easily.

Until that time, with a lot of different sorts of hardware out there and the project really only putting out one kind of task --which is the same for CPU and GPU-- it is impossible for them to definitely state how long that task is going to take.

On one CPU it may take 24 hours, on another 14 hours, on your GPU perhaps 10 hours, on another GPU slightly over an hour. So what value of estimated time should they give such work then? Impossible to know up front.

So what a project can do is run some of this work on a variety of its own computers, and from the run times derive a general average that can be converted into a flops estimate. Tasks of this type will then get that estimate.

BOINC on the other hand will learn from running work. It --at least up till the 6.12s-- sports what's called the TDCF, or Task Duration Correction Factor. This value will go up or down, whenever work is finished. Slower running work will increase the value, faster work will decrease it. Using this TDCF value BOINC will eventually, after many a task, show you a more correct number for the task's run time estimate.
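
A rough sketch of how such a correction factor could work, for illustration only. The smoothing rule, the weight of 0.1 and the function names are hypothetical and are not taken from the BOINC source; the only idea borrowed from the post is that the factor moves up after slow tasks and down after fast ones, and scales the project's estimate.

```python
def raw_estimate_s(fpops_est, device_flops):
    """Uncorrected runtime guess: estimated operations / device speed."""
    return fpops_est / device_flops


def update_dcf(dcf, raw_s, actual_s, weight=0.1):
    """Hypothetical update: nudge the factor toward actual/raw after each task."""
    target = actual_s / raw_s
    return dcf + weight * (target - dcf)


if __name__ == "__main__":
    dcf = 1.0
    raw = raw_estimate_s(3.6e15, 1e10)  # 360000 s, i.e. a 100 h raw estimate
    for task in range(1, 31):
        # Every task really finishes in 10 h (36000 s) on this host.
        dcf = update_dcf(dcf, raw, 36000.0)
        if task % 10 == 0:
            print(task, round(raw * dcf / 3600.0, 1), "h shown")
```

With these made-up numbers the shown estimate drifts from 100 hours toward 10 hours over many tasks, which matches the "eventually, after many a task" remark above.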

Which is all fine, until you change which tasks you run, as the TDCF is only project wide, not per application. But that's something different completely. ;-)

Nikolay
Joined: 13 Jan 12
Posts: 4
Credit: 6500
RAC: 0

Thank you for the information.

But why can't BOINC use the following simple algorithm to calculate the remaining time:
1) Start the task and run it until progress reaches 0% + 0.1% = 0.1%
2) Multiply the elapsed time by 1/0.001 - 1 = 1000 - 1 = 999 to get the remaining time
3) Run until 0.1% + 0.1% = 0.2%
4) Multiply the elapsed time by 1/0.002 - 1 = 500 - 1 = 499 to get the remaining time
5) Run until 0.2% + 0.1% = 0.3%
6) etc. until 100%

The step is equal to 0.1%.

Of course, this algorithm could use any step: 0.05%, or 0.025%, or 0.0125%, or anything else.

That would solve the issue.
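
The steps above boil down to one formula: remaining = elapsed * (1/fraction_done - 1), with the fraction written as a number between 0 and 1 (so 0.1% is 0.001). A minimal sketch, where the function name is only a placeholder for illustration:

```python
def remaining_by_extrapolation(elapsed, fraction_done):
    """Assume the rest of the task runs at the average rate seen so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed * (1.0 / fraction_done - 1.0)


if __name__ == "__main__":
    # The case from the first post: 1 hour elapsed at 10% done.
    print(remaining_by_extrapolation(1.0, 0.10))
    # The multipliers from steps 2 and 4: 0.1% and 0.2% done.
    print(remaining_by_extrapolation(1.0, 0.001))
    print(remaining_by_extrapolation(1.0, 0.002))
```

This only gives a sensible answer if progress is reported at a constant rate for the whole task.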

Gary Roberts
Joined: 9 Feb 05
Posts: 17
Credit: 85000
RAC: 0

Message 79000 in response to message 78999

And what would you do if parts of the calculation go slowly and other parts speed right up? You can't just assume that there's always a constant rate of progress.

Also what do you do if the normal use of the machine is quite variable? The science app gets out of the way when other heavy CPU jobs are running. A short test measurement of performance could be very heavily skewed - one way or the other. BOINC makes continuous adjustments over the duration of the task which is probably the safest way to do things.

If you are certain there is a better way to handle this, you should talk to the BOINC Devs. It's beyond the scope of what individual projects want to deal with.

Cheers, Gary.

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 79001 in response to message 78999

Addendum to what Gary said, it's even more difficult to calculate the progress bar.

- Some projects run work in 15% segments, quickly going to 90% before sitting there literally for hours seemingly doing nothing. (most projects using Autodock have this problem)
- Some projects run in 2% increments. The Gamma Ray application at Einstein will do this.
- Some projects will run to 100% and over it for several minutes. Einstein's Gravitational Wave S6 app will do this continuously.
- And then there's at least one project that runs to 100%, resets to zero and starts again. Enigma.
- Not to mention the various wrapper apps out there with their own weird behaviour.
- All of that is topped off with variable run times within a single application. Seti has high angle, low angle and normal angle ranges, and all that work runs for varying lengths of time. The devs there can't see what is what when they split the work, so it's astronomically impossible to give them correct flops numbers.

As you can see, if it were so simple as you state, someone would've used it already. :-)
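
To put a number on the first case in the list, here is a small sketch with a made-up progress profile: the app reports 90% after the first hour and then takes 9 more hours for the last 10%. The profile and the function names are invented for illustration and do not describe any real application.

```python
def reported_fraction(t_h):
    """Hypothetical app: 90% reported after 1 h, then a 9 h crawl to 100%."""
    if t_h <= 1.0:
        return 0.9 * t_h
    return min(1.0, 0.9 + 0.1 * (t_h - 1.0) / 9.0)


def naive_remaining(elapsed_h, fraction_done):
    """Plain extrapolation that assumes a constant rate of progress."""
    return elapsed_h * (1.0 / fraction_done - 1.0)


if __name__ == "__main__":
    TOTAL_H = 10.0  # true total runtime of the made-up task
    for t in (0.5, 1.0, 5.0, 9.0):
        guess = naive_remaining(t, reported_fraction(t))
        print(f"t={t:>4} h  naive={guess:6.2f} h  true={TOTAL_H - t:5.2f} h")
```

After one hour the plain extrapolation says only a few minutes are left, while 9 hours of work actually remain.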

TRuEQ & TuVaLu
Joined: 11 Sep 06
Posts: 75
Credit: 615315
RAC: 0

Message 79002 in response to message 79001

There is always the option of using app_info.xml to adjust the flops.

Put 209876543210 into the file and then adjust it if it turns out to be very wrong.

For example, if a Seti AP task shows "remaining 24h" when the actual task takes 2h to complete, multiply the flops by 10 and you will get closer to the real value.

It is still an approximation, but when the difference is very big it can be of help.
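
The adjustment can be worked out rather than guessed, assuming the shown estimate is inversely proportional to the flops value and that nothing else (such as the correction factor) changes in between. That proportionality is an assumption, and the function name is just a placeholder:

```python
def corrected_flops(current_flops, shown_estimate_h, actual_runtime_h):
    """Scale flops so the estimate would have matched the observed runtime."""
    if actual_runtime_h <= 0:
        raise ValueError("actual_runtime_h must be positive")
    return current_flops * shown_estimate_h / actual_runtime_h


if __name__ == "__main__":
    # Starting value from the post; the task showed 24 h but took 2 h.
    new_value = corrected_flops(209876543210, 24.0, 2.0)
    print(f"{new_value:.0f}")
```

For the 24h versus 2h case the exact factor is 12, so the "*10" above is a reasonable round number.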

TRuEQ & TuVaLu
Joined: 11 Sep 06
Posts: 75
Credit: 615315
RAC: 0

I forgot....

There isn't any app_info.xml for Albert yet... or is there?

pragmatic prancing periodic problem child, left
Joined: 26 Jan 05
Posts: 153
Credit: 70000
RAC: 0

Message 79004 in response to message 79002

Using the anonymous platform is a bit too steep a thing to expect from a new user. It's a steep thing to expect even from a normal user. And even advanced users don't get everything right the first time around.

Quote:
Put 209876543210 into the file and then adjust it if it turns out to be very wrong.


Right, so you just add an imaginary number and go adjust that in case it doesn't work. Where's the science in that? It also makes all of BOINC a hands-on experience, with you having to exit BOINC, tinker with some files and restart BOINC for every task, or else the TDCF will be out of whack again. Which it will be anyway when a new application is released and the resource flops estimate is not completely correct.

But all that aside, it would be nice if there was an easy way to calculate the flops value of every piece of hardware you have in your computer. Because it's not just the flops value for the GPU that you need here, but also that for the CPU.

There are 11 different GPUs used on this project alone, which means 11 different flops values for those GPUs alone. Added to that, there are a lot more different CPUs on this project. Are you starting to see the problem?

And that's not counting the people who have e.g. 8 different GPUs in their system plus a 12-core CPU with HT. I'd love to see you help them all to the correct values... forget sleep, forget food, forget family, forget friends, forget TV, you're needed here, 24/7/365.

TRuEQ & TuVaLu
Joined: 11 Sep 06
Posts: 75
Credit: 615315
RAC: 0

Message 79005 in response to message 79004

Well, different app_info.xml files already exist out there, and the people who have them can share them and their experience with new people.
I think there will be a lot of people who can help each other with all the different configurations that are needed.

And I'll say it again about the "209876543210" estimate.
I use that when a task is downloaded and its estimate differs dramatically from the real time to completion, to get it approximately right. It will not be correct, but you can come very close. That works fine for me.

And when I started to use computers in 1983 I had no one to show me how they worked.
I got photocopied printouts of the code to type in so I could run some simple games.
After typing in several hundred pages I started to "manipulate" the original programs by changing variables to see how they worked.

I would say cc_config.xml and app_info.xml are good ways to learn how programs work and can be altered.

Using a computer shouldn't be just pushing a button and clicking a mouse to start a browser, go shopping and send simple messages to people.

A computer is a lot more than that.
