BRP application v 1.33 feedback thread

archae86
Joined: 6 Dec 05
Posts: 10
Credit: 67924
RAC: 0
Topic 84909

I assume 1.33 is the version that employs file compression to greatly reduce download traffic. I applaud the attempt to obtain this improvement. While Comcast has stopped posting my bandwidth consumption where I can see it, when I could last look, just two GTX460 hosts running BRP were using about half my allowed monthly traffic.

So I'm happy to report that both of my CUDA Windows 7 hosts have returned a stock of v1.33 work. Already 4307 has 2/5 validated, and the other, 4306, has 5/13 validated. Execution timings look in line with recent Einstein 1.32 work on the same hosts.

This report is neither a problem nor a bug report, but this board seemed most nearly suitable.

Jeroen
Joined: 25 Nov 05
Posts: 12
Credit: 638256
RAC: 0

I have 1.33 running on one host so far. So far 9 tasks have completed and 3 tasks have validated. The other 6 are pending validation.

The file size reduction is very significant: from 2MB to 475K per file. Thank you.
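
As a rough check of those figures, here is a minimal sketch of the arithmetic. It assumes "2MB" means 2048 KB and "475K" means 475 KB; the post only gives rounded sizes, so the result is approximate.

```python
# Rough per-file saving from the sizes quoted in the post.
# Assumption: "2MB" = 2048 KB and "475K" = 475 KB (the post gives rounded figures).
uncompressed_kb = 2048
compressed_kb = 475

ratio = compressed_kb / uncompressed_kb   # fraction of the original size still downloaded
reduction = 1 - ratio                     # fraction of download traffic saved per file

print(f"compressed file is {ratio:.0%} of the original, saving about {reduction:.0%}")
```

Under those assumptions each input file shrinks to a bit under a quarter of its former size.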

Bikeman (Heinz-Bernd Eggenstein)
Joined: 28 Aug 06
Posts: 164
Credit: 1864017
RAC: 0

Hi!

Yup, 1.33 is a new version which is testing compression of the input files, plus it uses a newer version of the BOINC API code, which is recommended for the next generation of BOINC clients.

Unfortunately, this new BOINC API version introduced a bug that broke all of the OpenCL BRP app versions except the OSX ones :-(.

There is also a problem with the Linux 32 bit CPU app version (doesn't link zlib statically).

We plan to publish a new, corrected suite of BRP4 apps on Albert for testing next week.

Cheers
HB

tullio
Joined: 22 Jan 05
Posts: 53
Credit: 137342
RAC: 0

Message 79510 in response to message 79509

But it runs OK on my SuSE Linux 12.1 32-bit.
Tullio

Alex
Joined: 1 Mar 05
Posts: 88
Credit: 398734
RAC: 0

27 validated, 3 pending (win cuda wu's).
Looks good so far, returning to Einstein.

skgiven
Joined: 14 Oct 12
Posts: 9
Credit: 4734887
RAC: 0

Message 79512 in response to message 79511

I'm getting 7.0.44

upload failure:
p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_0
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_1
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_2
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_3
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_4
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_5
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_6
-161

p2030.20120218.G178.84-02.08.C.b1s0g0.00000_2952_3_7
-161

3 Invalid tasks from the 13th and 14th (same host):

http://albertathome.org/task/470974 (standard error shown below)
http://albertathome.org/task/470633
http://albertathome.org/task/468200

Name p2030.20120219.G177.98-03.39.S.b0s0g0.00000_48_1
Workunit 184410
Created 13 Jan 2013 | 9:55:17 UTC
Sent 14 Jan 2013 | 2:05:30 UTC
Received 14 Jan 2013 | 21:17:43 UTC
Server state Over
Outcome Validate error (58:00111010)
Client state Done
Exit status 0 (0x0)
Computer ID 5305
Report deadline 28 Jan 2013 | 2:05:30 UTC
Run time 1,112.41
CPU time 197.95
Validate state Invalid
Credit 0.00
Application version Binary Radio Pulsar Search v1.33 (BRP4cuda32nv301)
Stderr output

7.0.42

Activated exception handling...
[20:58:27][308800][INFO ] Starting data processing...
[20:58:28][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[20:58:28][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[20:58:28][308800][INFO ] Version of installed CUDA driver: 5000
[20:58:28][308800][INFO ] Version of CUDA driver API used: 3020
[20:58:28][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[20:58:28][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM4.80
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.964674255258
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 4.8 cm^-3 pc
------> Scale factor: 0.00102345
[20:58:29][308800][INFO ] Seed for random number generator is 1173636489.
[20:58:29][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[20:58:29][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:00:48][308800][INFO ] Statistics: count dirty SumSpec pages 12644 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:00:48][308800][INFO ] Data processing finished successfully!
[21:00:48][308800][INFO ] Starting data processing...
[21:00:48][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:00:48][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:00:48][308800][INFO ] Version of installed CUDA driver: 5000
[21:00:48][308800][INFO ] Version of CUDA driver API used: 3020
[21:00:48][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:00:48][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM4.90
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.96467425322
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 4.9 cm^-3 pc
------> Scale factor: 0.00102345
[21:00:49][308800][INFO ] Seed for random number generator is 1171635415.
[21:00:50][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:00:50][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:03:07][308800][INFO ] Statistics: count dirty SumSpec pages 14138 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:03:07][308800][INFO ] Data processing finished successfully!
[21:03:07][308800][INFO ] Starting data processing...
[21:03:07][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:03:07][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:03:07][308800][INFO ] Version of installed CUDA driver: 5000
[21:03:07][308800][INFO ] Version of CUDA driver API used: 3020
[21:03:07][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:03:07][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM5.00
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.96467425119
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 5 cm^-3 pc
------> Scale factor: 0.00102345
[21:03:09][308800][INFO ] Seed for random number generator is 1171635415.
[21:03:09][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:03:09][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:05:26][308800][INFO ] Statistics: count dirty SumSpec pages 13713 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:05:26][308800][INFO ] Data processing finished successfully!
[21:05:26][308800][INFO ] Starting data processing...
[21:05:26][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:05:26][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:05:26][308800][INFO ] Version of installed CUDA driver: 5000
[21:05:26][308800][INFO ] Version of CUDA driver API used: 3020
[21:05:26][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:05:26][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM5.10
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.964674249153
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 5.1 cm^-3 pc
------> Scale factor: 0.00102345
[21:05:28][308800][INFO ] Seed for random number generator is 1173636489.
[21:05:28][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:05:28][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:07:45][308800][INFO ] Statistics: count dirty SumSpec pages 13239 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:07:45][308800][INFO ] Data processing finished successfully!
[21:07:45][308800][INFO ] Starting data processing...
[21:07:45][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:07:45][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:07:45][308800][INFO ] Version of installed CUDA driver: 5000
[21:07:45][308800][INFO ] Version of CUDA driver API used: 3020
[21:07:45][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:07:45][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM5.20
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.964674247123
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 5.2 cm^-3 pc
------> Scale factor: 0.00102345
[21:07:46][308800][INFO ] Seed for random number generator is 1173636489.
[21:07:47][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:07:47][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:08:27][308800][INFO ] Checkpoint committed!
[21:10:03][308800][INFO ] Statistics: count dirty SumSpec pages 6747 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:10:03][308800][INFO ] Data processing finished successfully!
[21:10:03][308800][INFO ] Starting data processing...
[21:10:03][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:10:03][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:10:03][308800][INFO ] Version of installed CUDA driver: 5000
[21:10:03][308800][INFO ] Version of CUDA driver API used: 3020
[21:10:03][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:10:03][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM5.30
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.964674245086
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 5.3 cm^-3 pc
------> Scale factor: 0.00102345
[21:10:04][308800][INFO ] Seed for random number generator is 1173636489.
[21:10:05][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:10:05][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:12:21][308800][INFO ] Statistics: count dirty SumSpec pages 11398 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:12:21][308800][INFO ] Data processing finished successfully!
[21:12:21][308800][INFO ] Starting data processing...
[21:12:22][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:12:22][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:12:22][308800][INFO ] Version of installed CUDA driver: 5000
[21:12:22][308800][INFO ] Version of CUDA driver API used: 3020
[21:12:22][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:12:22][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM5.40
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.964674243056
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 5.4 cm^-3 pc
------> Scale factor: 0.00102345
[21:12:23][308800][INFO ] Seed for random number generator is 1173636489.
[21:12:23][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:12:23][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:14:39][308800][INFO ] Statistics: count dirty SumSpec pages 11084 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:14:39][308800][INFO ] Data processing finished successfully!
[21:14:39][308800][INFO ] Starting data processing...
[21:14:39][308800][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 90 MB (1959 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:14:39][308800][INFO ] Using CUDA device #1 "GeForce GTX 660 Ti" (0 CUDA cores / 0.00 GFLOPS)
[21:14:39][308800][INFO ] Version of installed CUDA driver: 5000
[21:14:39][308800][INFO ] Version of CUDA driver API used: 3020
[21:14:39][308800][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:14:39][308800][INFO ] Header contents:
------> Original WAPP file: ./p2030.20120219.G177.98-03.39.S.b0s0g0.00000_DM5.50
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 55976.964674241019
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 52736.3790016
------> DEC (J2000): 284603.9856
------> Galactic l: 0
------> Galactic b: 0
------> Name: G177.98-03.39.S
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 5.5 cm^-3 pc
------> Scale factor: 0.00102345
[21:14:41][308800][INFO ] Seed for random number generator is 1173636489.
[21:14:41][308800][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[21:14:41][308800][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 294 MB (1755 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 204 MB
[21:16:58][308800][INFO ] Statistics: count dirty SumSpec pages 10223 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[21:16:58][308800][INFO ] Data processing finished successfully!
21:16:58 (308800): called boinc_finish
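
For anyone comparing runs, the timestamps in a stderr listing like this one can be turned into per-pass timings. The following is only a sketch: the function name `pass_durations` and the `sample` string are illustrative and not part of the BRP app or of BOINC, and the line format is assumed to match the log shown in this post.

```python
import re
from datetime import datetime, timedelta

# Matches timestamped INFO lines such as
# "[20:58:27][308800][INFO ] Starting data processing..."
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[\d+\]\[INFO \] (.*)$")

def pass_durations(stderr_text):
    """Return the seconds between each 'Starting' line and the next 'finished' line."""
    durations = []
    start = None
    for line in stderr_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # continuation lines ("------> ...") carry no timestamp
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(2)
        if message.startswith("Starting data processing"):
            start = stamp
        elif message.startswith("Data processing finished") and start is not None:
            elapsed = stamp - start
            if elapsed < timedelta(0):
                elapsed += timedelta(days=1)  # the pass crossed midnight
            durations.append(int(elapsed.total_seconds()))
            start = None
    return durations

# Two passes copied from the log in this post.
sample = """\
[20:58:27][308800][INFO ] Starting data processing...
[21:00:48][308800][INFO ] Data processing finished successfully!
[21:00:48][308800][INFO ] Starting data processing...
[21:03:07][308800][INFO ] Data processing finished successfully!
"""
print(pass_durations(sample))  # [141, 139]
```

The log carries no dates, so the sketch assumes no single pass lasts longer than a day.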

Eyrie
Joined: 20 Feb 14
Posts: 48
Credit: 2410
RAC: 0

I realise this is rather late feedback, but I've only just (re-)attached.

I got a bunch of errors - CPU app.

The error is simple - out of mem.
The host only has 3 GB, and it was rather strained trying to run BOINC with Einstein/Albert and memory-heavy Rosetta tasks AND a memory-heavy game. With LAIM (leave applications in memory) in effect, suspending BOINC would free the CPUs but not the memory.

So, when BRP tried to start up there wasn't enough memory to be had [no idea if making my pagefile larger would help] and a whole bunch of tasks bit the bullet.

The error made it into stderr, so the app did notice that there was not enough memory. Since that very often is a transient condition, resulting from the user doing something memory heavy, it would be nice if the app could invoke 'temporary exit' instead of hard exits. That way boinc will try to start the task again at a later time, hopefully with more free mem, and the task will be able to run, instead of producing a cacheful of errors.
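
The suggestion can be sketched as a small decision rule. This is an illustration in Python, not the BRP app's code: the real app is native code, where the BOINC API call `boinc_temporary_exit()` would be the likely counterpart. Every name and number below (`ExitDecision`, `decide_exit`, the 600-second delay, the retry cap) is a hypothetical choice made for the example.

```python
from dataclasses import dataclass

RETRY_DELAY_SECONDS = 600   # hypothetical: ask the client to retry in 10 minutes
MAX_TEMPORARY_EXITS = 10    # hypothetical cap so a task cannot retry forever

@dataclass
class ExitDecision:
    temporary: bool      # True: ask to be restarted later; False: report an error
    delay_seconds: int   # only meaningful for temporary exits
    reason: str

def decide_exit(out_of_memory: bool, temporary_exits_so_far: int) -> ExitDecision:
    """Choose between a temporary exit and a hard error after a failed start-up."""
    if out_of_memory and temporary_exits_so_far < MAX_TEMPORARY_EXITS:
        # Memory pressure is often transient, so back off and let the task rerun.
        return ExitDecision(True, RETRY_DELAY_SECONDS, "not enough free memory")
    if out_of_memory:
        return ExitDecision(False, 0, "out of memory after repeated retries")
    return ExitDecision(False, 0, "unrecoverable error")

print(decide_exit(out_of_memory=True, temporary_exits_so_far=0))
```

The retry cap is a design choice, not something requested in the post: it keeps a host that is permanently short of memory from cycling the same task indefinitely.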

Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.

Dr Who Fan
Joined: 3 May 14
Posts: 13
Credit: 191726
RAC: 0

Not sure if this is a SCHEDULER problem or a lack of available wingmen for this type of work:

Workunit# 594225
on 20 May 2014 | 15:06:40 UTC my PC returned the completed task;
on 23 May 2014 | 9:16:23 UTC my wingman aborted their task;
on 23 May 2014 | 9:16:28 UTC a 3rd task was generated, but 2 days, 5.75 hours later it has yet to be sent out to another PC for computation.

Same thing has occurred with different date/times for Workunits 594230 and 594236.

Dr Who Fan
Joined: 3 May 14
Posts: 13
Credit: 191726
RAC: 0

Message 79515 in response to message 79514

Can the project administrators/scientists please look into this problem?
Over 24 hours have passed since I originally posted, and the 3 tasks remain unsent to a 3rd wingman for validation.

Quote:

Not sure if this is a SCHEDULER problem or a lack of available wingmen for this type of work:

Workunit# 594225
on 20 May 2014 | 15:06:40 UTC my PC returned the completed task;
on 23 May 2014 | 9:16:23 UTC my wingman aborted their task;
on 23 May 2014 | 9:16:28 UTC a 3rd task was generated, but 2 days, 5.75 hours later it has yet to be sent out to another PC for computation.

Same thing has occurred with different date/times for Workunits 594230 and 594236.

Claggy
Joined: 29 Dec 06
Posts: 122
Credit: 4040969
RAC: 0

Message 79516 in response to message 79515

Quote:
Can the project administrators/scientists please look into this problem?
Over 24 hours have passed since I originally posted, and the 3 tasks remain unsent to a 3rd wingman for validation.
Quote:

Not sure if this is a SCHEDULER problem or a lack of available wingmen for this type of work:

Workunit# 594225
on 20 May 2014 | 15:06:40 UTC my PC returned the completed task;
on 23 May 2014 | 9:16:23 UTC my wingman aborted their task;
on 23 May 2014 | 9:16:28 UTC a 3rd task was generated, but 2 days, 5.75 hours later it has yet to be sent out to another PC for computation.

Same thing has occurred with different date/times for Workunits 594230 and 594236.


It's not a problem. Einstein/Albert employs a scheduler that sends tasks to computers that already have the right data files. Why increase bandwidth utilisation for server and client when it can just wait for the right client to come along and save on that download? It just may have to wait days or weeks for that client.

Claggy

Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

Message 79517 in response to message 79516

Quote:
Quote:
Can the project administrators/scientists please look into this problem?
Over 24 hours have passed since I originally posted, and the 3 tasks remain unsent to a 3rd wingman for validation.
Quote:

Not sure if this is a SCHEDULER problem or a lack of available wingmen for this type of work:

Workunit# 594225
on 20 May 2014 | 15:06:40 UTC my PC returned the completed task;
on 23 May 2014 | 9:16:23 UTC my wingman aborted their task;
on 23 May 2014 | 9:16:28 UTC a 3rd task was generated, but 2 days, 5.75 hours later it has yet to be sent out to another PC for computation.

Same thing has occurred with different date/times for Workunits 594230 and 594236.


It's not a problem. Einstein/Albert employs a scheduler that sends tasks to computers that already have the right data files. Why increase bandwidth utilisation for server and client when it can just wait for the right client to come along and save on that download? It just may have to wait days or weeks for that client.

Claggy


To add to that explanation: the scheduler will not wait forever. There is a maximum time before the tasks get sent to the next host asking for that type of work. I don't know what that time is set to here and now, but over on Einstein it used to be 7 days. That might have changed since I picked up that info; it's been several years...
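
The behaviour described in the last two posts can be sketched as a simple rule: prefer hosts that already hold the data files, and fall back to any host once a task has waited too long. This is only an illustration, not the project's scheduler code. The function `should_send`, the file name and the 7-day limit are assumptions, the last taken from the figure recalled above for Einstein.

```python
from datetime import datetime, timedelta

MAX_WAIT = timedelta(days=7)  # assumption: the 7-day figure recalled in the post

def should_send(task_files, host_files, created, now, max_wait=MAX_WAIT):
    """Decide whether an unsent task should go to the host asking for work."""
    if set(task_files) <= set(host_files):
        return True                     # host already holds the data: no extra download
    return now - created >= max_wait    # waited long enough: send it anyway

created = datetime(2014, 5, 23, 9, 16, 28)  # when the 3rd task was generated
needed = {"data_file_a"}                    # hypothetical file name

# A host that already has the file gets the task straight away.
print(should_send(needed, {"data_file_a"}, created, created + timedelta(hours=1)))
# A host without the file is skipped after 2 days, 5.75 hours...
print(should_send(needed, set(), created, created + timedelta(days=2, hours=5, minutes=45)))
# ...but would receive the task once the maximum wait has passed.
print(should_send(needed, set(), created, created + timedelta(days=8)))
```

Under this rule, a task that has been unsent for a little over two days is still inside its waiting window, which matches what was observed for Workunit 594225.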
