BOINC wants us to quit prematurely or we lost contact! Exiting...

Trotador
Joined: 15 May 13
Posts: 7
Credit: 26130548
RAC: 0
Topic 85082

OpenCL ATI (BRP4G) tasks are failing after exactly 4 minutes and 51 seconds.

The last two lines of stderr contain the error message.

FGRP and Einstein OpenCL tasks process OK on the same GPU.

 

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
../../projects/albert.phys.uwm.edu/einsteinbinary_BRP4G_1.34_x86_64-pc-linux-gnu__BRP4G-opencl-ati: /usr/lib/x86_64-linux-gnu/libOpenCL.so.1: no version information available (required by ../../projects/albert.phys.uwm.edu/einsteinbinary_BRP4G_1.34_x86_64-pc-linux-gnu__BRP4G-opencl-ati)
[20:28:04][2419][INFO ] Application startup - thank you for supporting Einstein@Home!
[20:28:04][2419][INFO ] Starting data processing...
[20:28:04][2419][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[20:28:04][2419][INFO ] Using OpenCL device "Tahiti" by: Advanced Micro Devices, Inc.
[20:28:04][2419][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[20:28:04][2419][INFO ] Header contents:
------> Original WAPP file: ./p2030.20131125.G176.86-00.79.N.b3s0g0.00000_DM209.60
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 56621.233735180904
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.336182022
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 53518.7119999
------> DEC (J2000): 311836.789799
------> Galactic l: 0
------> Galactic b: 0
------> Name: G176.86-00.79.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 209.6 cm^-3 pc
------> Scale factor: 9.08446e-05
[20:28:06][2419][INFO ] Seed for random number generator is 1201045329.
[20:28:12][2419][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-08
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[20:30:18][2419][INFO ] OpenCL shutdown complete!
[20:30:18][2419][INFO ] Statistics: count dirty SumSpec pages 7243 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[20:30:18][2419][INFO ] Data processing finished successfully!
[20:30:18][2419][INFO ] Starting data processing...
[20:30:18][2419][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[20:30:18][2419][INFO ] Using OpenCL device "Tahiti" by: Advanced Micro Devices, Inc.
[20:30:19][2419][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[20:30:19][2419][INFO ] Header contents:
------> Original WAPP file: ./p2030.20131125.G176.86-00.79.N.b3s0g0.00000_DM209.70
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 56621.233735178866
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.336182022
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 53518.7119999
------> DEC (J2000): 311836.789799
------> Galactic l: 0
------> Galactic b: 0
------> Name: G176.86-00.79.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 209.7 cm^-3 pc
------> Scale factor: 9.08429e-05
[20:30:20][2419][INFO ] Seed for random number generator is 1201045523.
[20:30:25][2419][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-08
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[20:32:36][2419][INFO ] OpenCL shutdown complete!
[20:32:36][2419][INFO ] Statistics: count dirty SumSpec pages 7112 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 329052
[20:32:36][2419][INFO ] Data processing finished successfully!
[20:32:36][2419][INFO ] Starting data processing...
[20:32:36][2419][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc.
[20:32:36][2419][INFO ] Using OpenCL device "Tahiti" by: Advanced Micro Devices, Inc.
[20:32:37][2419][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[20:32:37][2419][INFO ] Header contents:
------> Original WAPP file: ./p2030.20131125.G176.86-00.79.N.b3s0g0.00000_DM209.80
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 56621.233735176836
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.336182022
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 53518.7119999
------> DEC (J2000): 311836.789799
------> Galactic l: 0
------> Galactic b: 0
------> Name: G176.86-00.79.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 209.8 cm^-3 pc
------> Scale factor: 9.08402e-05
[20:32:38][2419][INFO ] Seed for random number generator is 1201045810.
[20:32:43][2419][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-08
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
[20:32:55][2419][INFO ] OpenCL shutdown complete!
[20:32:55][2419][WARN ] BOINC wants us to quit prematurely or we lost contact! Exiting...

</stderr_txt>
]]>
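
For reference, the 4 minutes 51 seconds can be checked against the stderr timestamps themselves (first line at 20:28:04, last at 20:32:55). Below is a minimal Python sketch, assuming the `[HH:MM:SS][pid][LEVEL]` prefix format seen in the log above and a run that does not cross midnight. The names `STAMP` and `elapsed` are only illustrative; they are not part of BOINC or the Einstein@Home application.

```python
import re
from datetime import datetime, timedelta

# Matches prefixes such as "[20:28:04][2419][INFO ]" (format assumed from the log above).
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[\d+\]\[(\w+)\s*\]")

def elapsed(log_text: str) -> timedelta:
    """Return the time between the first and last timestamped lines."""
    times = []
    for line in log_text.splitlines():
        m = STAMP.match(line)
        if m:
            times.append(datetime.strptime(m.group(1), "%H:%M:%S"))
    if not times:
        raise ValueError("no timestamped lines found")
    # Assumes the log stays within one day; no midnight handling.
    return times[-1] - times[0]

# Three lines copied from the stderr output in this post.
sample = """\
[20:28:04][2419][INFO ] Application startup - thank you for supporting Einstein@Home!
[20:30:18][2419][INFO ] Data processing finished successfully!
[20:32:55][2419][WARN ] BOINC wants us to quit prematurely or we lost contact! Exiting...
"""

print(elapsed(sample))  # 0:04:51
```

Running it over the full stderr text gives the same figure, since only the first and last timestamps matter.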

Trotador