All Intel 4600 GPU Tasks Failing

RFGuy_KCCO
Joined: 9 Oct 14
Posts: 2
Credit: 3580679
RAC: 0
Topic 86285

It appears that since I upgraded my Intel 4600 driver to the most recent version (15.36.7.64.3960), almost all of my Albert (and Einstein, for that matter) tasks are failing. Further, the only tasks that do pass validation seem to be those that were also processed on another Intel 4600 GPU (not conclusive). This is happening on all 5 of my machines with these GPUs. With a driver that is two versions old, the tasks seem to pass validation. Has this been reported or observed by anyone else yet?

Here is the last part of a recent WU that failed on one of my machines. Let me know what other info I can provide to help troubleshoot. This is from my computer 12068:

[22:34:23][956][INFO ] Checkpoint committed!
[22:39:51][956][INFO ] Checkpoint committed!
[22:45:19][956][INFO ] Checkpoint committed!
[22:50:48][956][INFO ] Checkpoint committed!
[22:51:17][956][INFO ] OpenCL shutdown complete!
[22:51:17][956][INFO ] Statistics: count dirty SumSpec pages 444 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1100505
[22:51:17][956][INFO ] Data processing finished successfully!
[22:51:17][956][INFO ] Starting data processing...
[22:51:17][956][INFO ] Using OpenCL platform provided by: Intel(R) Corporation
[22:51:17][956][INFO ] Using OpenCL device "Intel(R) HD Graphics 4600" by: Intel(R) Corporation
[22:51:18][956][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[22:51:18][956][INFO ] Header contents:
------> Original WAPP file: ./PB0058_006B1_DM374.00
------> Sample time in microseconds: 1000
------> Observation time in seconds: 2097.152
------> Time stamp (MJD): 53884.408052324958
------> Number of samples/record: 0
------> Center freq in MHz: 1231.5
------> Channel band in MHz: 3
------> Number of channels/record: 96
------> Nifs: 1
------> RA (J2000): 83009.6196003
------> DEC (J2000): -330115.611
------> Galactic l: 0
------> Galactic b: 0
------> Name: G4542518
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 2097152
------> Trial dispersion measure: 374 cm^-3 pc
------> Scale factor: 1.83048
[22:51:18][956][INFO ] Seed for random number generator is 1082907307.
[22:51:19][956][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-008
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[22:56:16][956][INFO ] Checkpoint committed!
[23:01:44][956][INFO ] Checkpoint committed!
[23:07:12][956][INFO ] Checkpoint committed!
[23:12:40][956][INFO ] Checkpoint committed!
[23:18:08][956][INFO ] Checkpoint committed!
[23:23:37][956][INFO ] Checkpoint committed!
[23:29:05][956][INFO ] Checkpoint committed!
[23:34:33][956][INFO ] Checkpoint committed!
[23:40:01][956][INFO ] Checkpoint committed!
[23:45:29][956][INFO ] Checkpoint committed!
[23:50:57][956][INFO ] Checkpoint committed!
[23:56:25][956][INFO ] Checkpoint committed!
[00:01:54][956][INFO ] Checkpoint committed!
[00:07:22][956][INFO ] Checkpoint committed!
[00:12:50][956][INFO ] Checkpoint committed!
[00:18:18][956][INFO ] Checkpoint committed!
[00:23:46][956][INFO ] Checkpoint committed!
[00:29:14][956][INFO ] Checkpoint committed!
[00:34:42][956][INFO ] Checkpoint committed!
[00:40:11][956][INFO ] Checkpoint committed!
[00:45:39][956][INFO ] Checkpoint committed!
[00:51:07][956][INFO ] Checkpoint committed!
[00:56:35][956][INFO ] Checkpoint committed!
[01:02:03][956][INFO ] Checkpoint committed!
[01:07:31][956][INFO ] Checkpoint committed!
[01:13:00][956][INFO ] Checkpoint committed!
[01:18:28][956][INFO ] Checkpoint committed!
[01:23:56][956][INFO ] Checkpoint committed!
[01:29:24][956][INFO ] Checkpoint committed!
[01:34:52][956][INFO ] Checkpoint committed!
[01:40:21][956][INFO ] Checkpoint committed!
[01:45:48][956][INFO ] Checkpoint committed!
[01:51:17][956][INFO ] Checkpoint committed!
[01:56:45][956][INFO ] Checkpoint committed!
[02:02:13][956][INFO ] Checkpoint committed!
[02:07:22][956][INFO ] Checkpoint committed!
[02:10:35][956][INFO ] OpenCL shutdown complete!
[02:10:35][956][INFO ] Statistics: count dirty SumSpec pages 445 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1100505
[02:10:35][956][INFO ] Data processing finished successfully!
02:10:35 (956): called boinc_finish

</stderr_txt>
]]>
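
For anyone comparing stderr output across several machines, here is a minimal Python sketch that summarizes a log in the format shown above. It counts the "Checkpoint committed!" lines, counts the "Data processing finished successfully!" lines, and reports the longest gap between checkpoints. The function names `parse_log` and `summarize` are made up for illustration and are not part of BOINC or the Einstein@Home application; the line pattern is an assumption based only on the excerpt in this post. A log that finishes cleanly, like this one, suggests the task is failing at validation rather than crashing during the run.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout, taken from the excerpt above:
# [HH:MM:SS][pid][LEVEL ] message
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\] (.*)$")


def parse_log(text):
    """Return (timestamp, level, message) tuples for bracketed log lines.

    The log carries only a time of day, so a timestamp that goes
    backwards is treated as a rollover past midnight.
    """
    entries = []
    day_offset = timedelta(0)
    prev = None
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # header fields ("------> ...") and other lines
        t = datetime.strptime(m.group(1), "%H:%M:%S")
        ts = t + day_offset
        if prev is not None and ts < prev:
            day_offset += timedelta(days=1)
            ts = t + day_offset
        prev = ts
        entries.append((ts, m.group(3), m.group(4)))
    return entries


def summarize(entries):
    """Count checkpoints and clean finishes; report the longest checkpoint gap."""
    checkpoints = [ts for ts, _, msg in entries if msg == "Checkpoint committed!"]
    gaps = [(b - a).total_seconds() for a, b in zip(checkpoints, checkpoints[1:])]
    finished = sum(
        1 for _, _, msg in entries if msg.startswith("Data processing finished")
    )
    return {
        "checkpoints": len(checkpoints),
        "finished_runs": finished,
        "max_gap_s": max(gaps, default=0.0),
    }


if __name__ == "__main__":
    # A few lines copied from the log above, spanning midnight.
    sample = (
        "[23:56:25][956][INFO ] Checkpoint committed!\n"
        "[00:01:54][956][INFO ] Checkpoint committed!\n"
        "[02:10:35][956][INFO ] Data processing finished successfully!\n"
    )
    print(summarize(parse_log(sample)))
```

Running the same summary on logs from a machine with the older driver would show whether the two drivers differ in anything visible at run time, or only in the results that get validated.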

 

 

Holmis
Joined: 4 Jan 05
Posts: 89
Credit: 2104736
RAC: 0

You could check out this rather long thread over at the main Einstein@home project for a discussion about Intel GPUs and recent drivers. Although the discussion there pertains to another application, the problem with invalid results is the same. Start with, say, the 10-15 most recent posts.

The general consensus in that thread is to use an older version of the driver as the apps here don't seem to like newer versions.

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

Thanks for the report. We'll look into this...

Oliver
