it's time to ask for help

Stephan Goll
Joined: 13 Dec 05
Posts: 20
Credit: 1874367
RAC: 0
Topic 84833

The system is 64-bit Debian, a mix of stable and testing. The X server is running but not used, because the machine runs headless.
PrimeGrid is running well, so CAL/OpenCL is working. The card is an AMD Radeon HD 5570; lspci reports:
04:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Redwood PRO [Radeon HD 5500 Series]
The host is this one: http://albertathome.org/host/2756/tasks

boinc@celeron:~$ ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.22_i686-pc-linux-gnu__atiOpenCL
linux-gate.so.1 => (0xf773a000)
libOpenCL.so.1 => /usr/lib32/libOpenCL.so.1 (0xf772e000)
libpthread.so.0 => /lib32/libpthread.so.0 (0xf7715000)
libm.so.6 => /lib32/libm.so.6 (0xf76ef000)
libstdc++.so.6 => /usr/lib32/libstdc++.so.6 (0xf75fa000)
libc.so.6 => /lib32/libc.so.6 (0xf749e000)
/lib/ld-linux.so.2 (0xf773b000)
libdl.so.2 => /lib32/libdl.so.2 (0xf749a000)
libgcc_s.so.1 => /usr/lib32/libgcc_s.so.1 (0xf747c000)

boinc@celeron:~$ ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu
linux-gate.so.1 => (0xf7701000)
libpthread.so.0 => /lib32/libpthread.so.0 (0xf76e2000)
libm.so.6 => /lib32/libm.so.6 (0xf76bc000)
libdl.so.2 => /lib32/libdl.so.2 (0xf76b8000)
libX11.so.6 => not found
libXext.so.6 => not found
libGL.so.1 => /usr/lib32/libGL.so.1 (0xf75ca000)
libGLU.so.1 => not found
libc.so.6 => /lib32/libc.so.6 (0xf746e000)
/lib/ld-linux.so.2 (0xf7702000)
libXext.so.6 => not found
libgcc_s.so.1 => /usr/lib32/libgcc_s.so.1 (0xf7451000)

Okay, some libs are missing for the graphics application, but the OpenCL app seems to be happy. The missing libs should not matter, because I do not use X11 on this machine.
I installed only the driver (amd-driver-installer-12-3-x86.x86_64.run), not the SDK.
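
For anyone repeating this check on several binaries, here is a minimal Python sketch that pulls the unresolved libraries out of ldd output. It assumes the "name => not found" line format shown above; the helper names `missing_libs` and `run_ldd` are made up for illustration.

```python
import subprocess


def missing_libs(ldd_output):
    """Return the library names that ldd reports as 'not found', without duplicates."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found" and name not in missing:
            missing.append(name)
    return missing


def run_ldd(path):
    """Run ldd on a binary and return its raw output (needs ldd on the PATH)."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return result.stdout


# Lines copied from the ldd listing of the graphics application above.
sample = """\
libdl.so.2 => /lib32/libdl.so.2 (0xf76b8000)
libX11.so.6 => not found
libXext.so.6 => not found
libGL.so.1 => /usr/lib32/libGL.so.1 (0xf75ca000)
libGLU.so.1 => not found
/lib/ld-linux.so.2 (0xf7702000)
libXext.so.6 => not found
"""

print(missing_libs(sample))
```

On a live system, `missing_libs(run_ldd(path_to_binary))` would do the same for a real file.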

23-Apr-2012 16:28:32 [---] Starting BOINC client version 7.0.26 for x86_64-pc-linux-gnu
23-Apr-2012 16:28:32 [---] log flags: file_xfer, sched_ops, task
23-Apr-2012 16:28:32 [---] Libraries: libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6
23-Apr-2012 16:28:32 [---] Running as a daemon
23-Apr-2012 16:28:32 [---] Data directory: /home/boinc
23-Apr-2012 16:28:32 [---] Processor: 2 GenuineIntel Intel(R) Celeron(R) CPU E3400 @ 2.60GHz [Family 6 Model 23 Stepping 10]
23-Apr-2012 16:28:32 [---] Processor: 1.00 MB cache
23-Apr-2012 16:28:32 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm tpr_shadow vnmi flexpriority
23-Apr-2012 16:28:32 [---] OS: Linux: 2.6.32-5-amd64
23-Apr-2012 16:28:32 [---] Memory: 7.79 GB physical, 512.36 MB virtual
23-Apr-2012 16:28:32 [---] Disk: 19.22 GB total, 17.70 GB free
23-Apr-2012 16:28:32 [---] Local time is UTC +2 hours
23-Apr-2012 16:28:32 [---] ATI GPU 0: Redwood (CAL version 1.4.1703, 1024MB, 1000MB available, 50 GFLOPS peak)
23-Apr-2012 16:28:32 [---] OpenCL: ATI GPU 0: Redwood (driver version CAL 1.4.1703, device version OpenCL 1.1 AMD-APP (898.1), 1024MB, 1000MB available)
...

BOINC seems to be happy. But for the Albert tasks I'm getting:

7.0.26

process exited with code 255 (0xff, -1)

[17:37:18][1626][INFO ] Application startup - thank you for supporting Einstein@Home!
[17:37:18][1626][INFO ] Starting data processing...
[17:37:18][1626][ERROR] Failed to get OpenCL platform/device info from BOINC (error: -1)!
[17:37:18][1626][ERROR] Demodulation failed (error: -1)!
17:37:18 (1626): called boinc_finish
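
A side note on "code 255 (0xff, -1)": these are three spellings of one value. On Linux only the low 8 bits of an exit status reach the parent process, so a status of -1 (which would match the "error: -1" lines, though that link is an assumption) is reported as 255. A small Python sketch of the conversion, with a made-up helper name:

```python
def as_signed_byte(status):
    """Interpret the low 8 bits of an exit status as a signed value."""
    low = status & 0xFF
    return low - 256 if low >= 128 else low


print(as_signed_byte(255), hex(-1 & 0xFF))
```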

I'm out of ideas and will give up for today. Any help is welcome.
Stephan

Bikeman (Heinz-Bernd Eggenstein)
Joined: 28 Aug 06
Posts: 164
Credit: 1864017
RAC: 0

Hmm... this kind of error happens when the BOINC code that is built into the app cannot "see" the GPU that the BOINC core client assigned to the app. This is a rather tricky part of the BOINC code, and I would not be surprised if some problems remain.

Could you please do the following in a console on that machine:

[pre]
export DISPLAY=":0.0"

clinfo
[/pre]

Is the GPU listed in the output? Is there just one GPU installed or several, and if several, which one(s) are detected?

Thanks
HBE

Stephan Goll

Message 79163 in response to message 79162

Hi Bikeman,
sure I can. Btw. clinfo runs on the console, no need to export the DISPLAY variable. Here comes the complete output (snipped some CPU related parts):

boinc@celeron:~$ clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 AMD-APP (898.1)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Board name: ATI Radeon HD 5570
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 5
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 4
Native vector width double: 0
Max clock frequency: 0Mhz
Address bits: 32
Max memory allocation: 134217728
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 0x7fe58a9c2480
Name: Redwood
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.1
Driver version: CAL 1.4.1703
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (898.1)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_meminfo

Device Type: CL_DEVICE_TYPE_CPU
Device ID: 4098
Board name:
Max compute units: 2
........... .......
Platform ID: 0x7fe58a9c2480
Name: Intel(R) Celeron(R) CPU E3400 @ 2.60GHz
Vendor: GenuineIntel
Device OpenCL C version: OpenCL C 1.1
Driver version: 2.0
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (898.1)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt
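
To answer the "is the GPU listed?" question without reading the whole dump, a minimal Python sketch like the following could be used. It assumes the "Device Type:" and "Board name:" line format of the clinfo output above (other clinfo builds may print different labels); the function name `gpu_boards` is made up for illustration.

```python
def gpu_boards(clinfo_output):
    """Return the 'Board name' of every device whose type is CL_DEVICE_TYPE_GPU."""
    boards = []
    in_gpu = False
    for line in clinfo_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Device Type":
            # A new device section starts; remember whether it is a GPU.
            in_gpu = value == "CL_DEVICE_TYPE_GPU"
        elif key == "Board name" and in_gpu:
            boards.append(value)
    return boards


# A few lines copied from the output above.
sample = """\
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Board name: ATI Radeon HD 5570
Max compute units: 5
Device Type: CL_DEVICE_TYPE_CPU
Device ID: 4098
Board name:
Max compute units: 2
"""

print(gpu_boards(sample))
```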

Regards,
Stephan

Bikeman (Heinz-Bernd Eggenstein)

Message 79164 in response to message 79163

Hi!

Thanks for the feedback. Looks normal...hmm.....

Quote:

Btw. clinfo runs on the console, no need to export the DISPLAY variable.

Yup, that was just to ensure the display is not forwarded, because then clinfo won't show the GPU. As you wrote, the X server needs to be running even for headless GPU operation.

Cheers
HBE

steffen_moeller
Joined: 9 Feb 05
Posts: 6
Credit: 397892
RAC: 0

Message 79165 in response to message 79164

This is a bit weird, indeed. If BOINC is happy, the app should be too. What version of Debian are you using? Have you tried an OpenCL app other than Albert?

Just to rule a few things out:
* as the user running the X session, try "xhost +"
* add the boinc user to the video group
* restart the boinc client after X was started
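
The group and permission checks can be scripted. This is a minimal Python sketch, assuming a Linux system, an account named "boinc" and a group named "video"; the function names are made up, and the device node to inspect depends on the driver, so it is left as a parameter.

```python
import grp
import os
import pwd
import stat


def user_in_group(user, group):
    """True if `user` is a member of `group`; False if either does not exist."""
    try:
        group_entry = grp.getgrnam(group)
        primary_gid = pwd.getpwnam(user).pw_gid
    except KeyError:
        return False
    return user in group_entry.gr_mem or primary_gid == group_entry.gr_gid


def is_world_rw(mode):
    """True if a file mode grants read and write access to everyone (e.g. 666)."""
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IWOTH)


def device_is_world_rw(path):
    """Check the permission bits of a device node on disk."""
    return is_world_rw(os.stat(path).st_mode)


print(user_in_group("boinc", "video"))
```

If the group check prints False, either the group is missing or the account is not a member; `device_is_world_rw` then shows whether the device node is open to every user anyway.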

Good luck

Steffen

Stephan Goll

Message 79166 in response to message 79165

It is Debian, mostly from the stable release. Some parts, like libc, libssl and some of the 32-bit i686 compatibility libs, are from the testing release. Not all libs are available in a 32-bit version, and that's why some of the libs are missing.
About the xhost thing: I can try it, but honestly, xhost only allows BOINC to draw on the screen, and I do not think it will help because this box runs headless. Adding the boinc account to the video group: there is no group with that name. But the device file (/dev/ati/card0) is set to mode 666, and PrimeGrid does not have any problems (see http://www.primegrid.com/results.php?hostid=252937).
But there is a hint in your answer: I will check whether I can start the X server as the boinc user. Maybe this helps; I will report back later.
Thanks,
Stephan
PS: Okay, I modified the /etc/X11/Xwrapper.config file and was able to start the X server from the boinc account. Let's wait ...

Stephan Goll

Message 79167 in response to message 79166

Nothing changed ... still the same error. I'm clueless. Are there any devs here who can explain the error message? It seems to come from the application, not from BOINC itself. Anyway, I will continue to fetch some work every week to see if something changes.
Stephan

Stephan Goll

Message 79168 in response to message 79167

I just found that POEM does have an OpenCL application. I signed up at POEM and got my first WU. I will report when it has finished ...
Stephan
PS: http://boinc.fzk.de/poem/results.php?userid=46736

Trog Dog
Joined: 25 Nov 05
Posts: 13
Credit: 64008
RAC: 0

Message 79169 in response to message 79168

Quote:
I just found that POEM does have an OpenCL application.

As do Milkyway and Collatz

Stephan Goll

Message 79170 in response to message 79169

Quote:
Quote:
I just found that POEM does have an OpenCL application.

As do Milkyway and Collatz

Milkyway requires double precision, which my 5570 does not support. Collatz has an ATI app only for Windows, and I run only Linux.
But the good thing is: the POEM app generates valid results, just like PrimeGrid does. So I'm sure my card is fine and the box is fine, which means there must be something wrong with the Albert application.

I rolled libc6, libc6-i386 and lib32gcc1 back from testing to the stable release and restarted BOINC. I will give Albert another try. But after that ... I don't know.
Stephan

Trog Dog

Message 79171 in response to message 79170

Quote:

Milkyway requires double precision, which my 5570 does not support. Collatz has an ATI app only for Windows, and I run only Linux.
But the good thing is: the POEM app generates valid results, just like PrimeGrid does. So I'm sure my card is fine and the box is fine, which means there must be something wrong with the Albert application.

I rolled libc6, libc6-i386 and lib32gcc1 back from testing to the stable release and restarted BOINC. I will give Albert another try. But after that ... I don't know.
Stephan

From memory, the Collatz Linux app is available under the optimised apps link on their front page.
