Errors - 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

Got my first invalid, where

Message 80276 in response to message 80275

Got my first invalid, where my task was matched against two Intel GPUs running OpenCL 1.1:

Workunit 603716

Claggy

Jacob Klein
Jacob Klein
Joined: 6 Nov 11
Posts: 16
Credit: 2938967
RAC: 0

Let's keep this thread on the

Let's keep this thread on the topic of its subject, please.

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: Let's keep this thread

Message 80278 in response to message 80277

Quote:
Let's keep this thread on the topic of its subject, please.


Your last contribution to the subject, five days ago, was:

Quote:
Just wanted to post to mention that I will likely no-longer be testing these problematic work units.


Have you come back to testing, and if so, what have you discovered in the meantime about the cause of the problems?

I didn't want to spam the boards with my stats - just milestone threads - but apparently signatures are no longer optional. Follow the link if you're interested.

http://www.boincsynergy.com/images/stats/comb-3475.jpg

Jacob Klein
Jacob Klein
Joined: 6 Nov 11
Posts: 16
Credit: 2938967
RAC: 0

I have not resumed testing on

I have not resumed testing on this issue, and do not anticipate doing so. I replaced my GTS 240 with a second GTX 660 Ti, and am focusing on GPUGrid and Poem.

Regarding the issue, it seemed to be bad estimations (based on the existing GTX 660 Ti, which had a local exclude_gpu option set on this project) for tasks that ran on the uber weak GTS 240. As I said, rsc_fpops_bound was busted, and the problem was server-side. I don't think it's fixed yet, though I don't know for sure.
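To put rough numbers on the mechanism as I understand it (everything below is an illustrative assumption, not a value from my actual workunits), the client effectively derives an elapsed-time limit from rsc_fpops_bound divided by the flops estimate attached to the task, so an estimate inherited from a fast card makes a slow card overrun the limit:

# Minimal sketch of the EXIT_TIME_LIMIT_EXCEEDED (-197) mechanism.
# All numbers are illustrative assumptions, not values from these workunits.

def elapsed_time_limit(rsc_fpops_bound, est_flops):
    # Rough rule: the client aborts a task once its elapsed time passes
    # rsc_fpops_bound divided by the flops estimate attached to the task.
    return rsc_fpops_bound / est_flops

rsc_fpops_bound = 1.0e16   # hypothetical bound on floating-point operations
est_flops_fast = 100.0e9   # estimate skewed by the fast GTX 660 Ti (assumed)
est_flops_slow = 5.0e9     # what the weak GTS 240 actually delivers (assumed)

limit_hours = elapsed_time_limit(rsc_fpops_bound, est_flops_fast) / 3600
needed_hours = (rsc_fpops_bound / est_flops_slow) / 3600
print(f"abort limit ~{limit_hours:.0f} h, real need ~{needed_hours:.0f} h")
# With these assumptions the task is killed after ~28 h even though the
# slow GPU would need ~556 h, which surfaces as error 197.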

Regards,
Jacob

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

RE: I have not resumed

Message 80280 in response to message 80279

Quote:

I have not resumed testing on this issue, and do not anticipate doing so. I replaced my GTS 240 with a second GTX 660 Ti, and am focusing on GPUGrid and Poem.

Regarding the issue, it seemed to be bad estimations (based on the existing GTX 660 Ti, which had a local exclude_gpu option set on this project) for tasks that ran on the uber weak GTS 240. As I said, rsc_fpops_bound was busted, and the problem was server-side.

Regards,
Jacob


Exactly. Specifically, during what we are calling "stage 2 of the onramp": after 100 global validations for the app_version across the project as a whole, but before 11 local validations for the individual host - the phase during which flops determined by "PFC avg" can be seen in the server logs.
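If it helps to see the stages side by side, here is a toy sketch of how I read the onramp; the 100 and 11 thresholds are the ones discussed above, but the function and field names are entirely made up:

# Toy sketch of the estimate "onramp" as I read it. Only the 100 / 11
# validation thresholds come from the discussion above; the function and
# variable names are invented for illustration.

def pick_flops_estimate(global_validations, host_validations,
                        conservative, pfc_avg_flops, host_avg_flops):
    if global_validations < 100:
        # Stage 1: conservative default until the app_version has 100
        # validated results across the whole project.
        return conservative
    if host_validations < 11:
        # Stage 2: project-wide "PFC avg" figure until this host has 11
        # validated results of its own - the phase where a weak GPU can
        # inherit an estimate dominated by much faster hosts.
        return pfc_avg_flops
    # Stage 3: the host's own average takes over.
    return host_avg_flops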

If you're not testing any more, and we understand that, why do you wish to prevent us from discussing other matters of mutual interest in this thread?


Jacob Klein
Jacob Klein
Joined: 6 Nov 11
Posts: 16
Credit: 2938967
RAC: 0

This thread is for the

Message 80281 in response to message 80280

This thread is for the "EXIT_TIME_LIMIT_EXCEEDED" error a user might get running these newer apps. A forum search on that error will find this thread. I am actually still monitoring this thread for an answer.

Any other problem, such as bad OpenCL versions generating bad results and bad validations, deserves its own thread.

Thanks.

Claggy
Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

RE: Exactly. Specifically

Message 80282 in response to message 80280

Quote:
Exactly. Specifically, during what we are calling "stage 2 of the onramp": after 100 global validations for the app_version across the project as a whole, but before 11 local validations for the individual host - the phase during which flops determined by "PFC avg" can be seen in the server logs.


And to get to those 100 global validations and 11 local validations, tasks need to validate; having masses of hosts throwing inconclusives into the works slows down the process of recovering from the -197 errors, at least for the Gamma-ray pulsar search #3.

Claggy

Jacob Klein
Jacob Klein
Joined: 6 Nov 11
Posts: 16
Credit: 2938967
RAC: 0

I hardly know anything

I hardly know anything about how the server does its calculations. I had a problem, I reported it, and at some point I was hoping to receive an answer.

In the meantime, I was expecting the thread to stay on topic, to make the answer easier to find in the future. Maybe I'm old-fashioned.

If you (the Albert team) need me to do additional testing on the "197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED" problem my computer was hitting, I'd have to swap hardware to do it, but I could. Let me know if you'd like to request that.

Thanks,
Jacob

Eyrie
Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

I think I found the problem.

I think I found the problem. FGRP has been updated to version 1.12 as a result - the apps are identical.

If anybody runs into further -197 time limit exceeded errors, please report them here ASAP. Please always include your host ID - we can glean most variables from database dumps now, but if you can also state your peak_flops (from the BOINC startup messages), that would be very helpful.

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 143
Credit: 5409572
RAC: 0

Unfortunately, I think Eyrie

Unfortunately, I think Eyrie has jumped the gun on this one.

Speaking specifically about FGRP (Gamma-ray pulsar search #3) only:

My NVidia 420M laptop (host 11359) has just been allocated new work from the v1.12 run. It was sent out with the 'conservative' (first stage onramp) speed estimate of 27.76 GFlops: that's very close to the 23.59 GFlops the same host achieves on BRP4G-cuda32-nv301.

BUT: FGRP is a beta app, which makes very little use of the GPU as yet. It runs much, much slower than BRP4G-cuda32-nv301 on my hardware. The tasks would have got error 197 if I hadn't taken precautions. I can't say whether the problem is Einstein's programming, or NVidia's OpenCL implementation, but at this initial stage for the new app_version, we can't blame BOINC.

But we're back to square one with the validation count. Could testers please run more of these tasks (with an edited <rsc_fpops_bound>, so they can complete)? We still need to test how BOINC handles the transitions at 100 validations for the app_version across the project as a whole, and 11 validations for each individual host.
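If anyone wants a starting point for that edit, something along these lines should do it; stop the BOINC client first, back up client_state.xml, and treat the path, the task-name filter and the multiplier as assumptions to adapt to your own setup:

# Hypothetical helper: raise <rsc_fpops_bound> on queued FGRP tasks in
# client_state.xml so they are not aborted with error 197. Stop the BOINC
# client and back up the file before running this; the path, the "FGRP"
# name filter and the factor are assumptions, not project-supplied values.
import xml.etree.ElementTree as ET

CLIENT_STATE = "/var/lib/boinc-client/client_state.xml"  # adjust for your install
FACTOR = 100.0                                            # generous safety margin

tree = ET.parse(CLIENT_STATE)
for wu in tree.getroot().iter("workunit"):
    name = wu.findtext("name", default="")
    bound = wu.find("rsc_fpops_bound")
    if bound is not None and "FGRP" in name:
        bound.text = "%e" % (float(bound.text) * FACTOR)
        print("raised bound for", name)
tree.write(CLIENT_STATE)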

