Gravitational Wave S6 Follow-up #1 v1.05 (SSE2) Errors

Dr Who Fan
Joined: 3 May 14
Posts: 13
Credit: 191726
RAC: 0
Topic 85703

Another one that keeps RESTARTING FROM START.

I aborted this one because I noticed it had restarted TWICE from the beginning (0%) after attempting to checkpoint.
I looked in the slots folder BEFORE ABORTING and THERE WAS a checkpoint file.

TASK# 1711521
https://albertathome.org/task/1711521
Name: h1_0053.70_S6GC1__S6BucketFU1UB_00003992_1

---
IMPORTANT Parts of Stderr output:

969.......
2014-11-19 11:06:56.7343 (2516) [normal]: Finished main analysis.
2014-11-19 11:06:56.7343 (2516) [normal]: Recalculating statistics for the final toplist...
2014-11-19 11:12:45.5468 (2516) [normal]: Finished recalculating toplist statistics.
2014-11-19 11:12:45.5625 (2516) [debug]: Writing output ... done.
command line: projects/albert.phys.uwm.edu/einstein_S6BucketFU1UB_1.05_windows_intelx86__SSE2.exe @../../projects/albert.phys.uwm.edu/S6BucketFU1UB_00003993.conf.gz --DataFiles1=..\..\projects\albert.phys.uwm.edu\h1_0053.70_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.70_S6GC1;..\..\projects\albert.phys.uwm.edu\h1_0053.75_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.75_S6GC1;..\..\projects\albert.phys.uwm.edu\h1_0053.80_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.80_S6GC1;..\..\projects\albert.phys.uwm.edu\h1_0053.85_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.85_S6GC1 --ephemE=../../projects/albert.phys.uwm.edu/earth_09_11 --ephemS=../../projects/albert.phys.uwm.edu/sun_09_11 --segmentList=../../projects/albert.phys.uwm.edu/seglist-S6BucketFU1UB.dat -o ../../projects/albert.phys.uwm.edu/h1_0053.70_S6GC1__S6BucketFU1UB_00003992_1_1
Code-version: %% LAL: 6.12.0.1 (CLEAN c509104fe62b61ba73cd5ce58a7a39cc90f0531c)
%% LALPulsar: 1.9.0.1 (CLEAN c509104fe62b61ba73cd5ce58a7a39cc90f0531c)
%% LALApps: 6.14.0.1 (CLEAN c509104fe62b61ba73cd5ce58a7a39cc90f0531c)

2014-11-19 11:12:46.8593 (2516) [normal]: FstatMethod used: 'DemodSSE'
2014-11-19 11:12:46.8593 (2516) [normal]: Reading input data ... 2014-11-19 11:14:08.6875 (2516) [normal]: Number of segments: 90, total number of SFTs in segments: 12080
done.
% --- GPS reference time = 960499913.5000 , GPS data mid time = 960499913.5000
2014-11-19 11:14:08.8125 (2516) [normal]: dFreqStack = 3.609707e-006, df1dot = 1.002697e-010, df2dot = 0.000000e+000, df3dot = 0.000000e+000
% --- Setup, N = 90, T = 215977 s, Tobs = 22059873 s, gammaRefine = 230, gamma2Refine = 35741, gamma3Refine = 1
2014-11-19 11:14:08.8437 (2516) [normal]: INFO: No checkpoint checkpoint.cpt found - starting from scratch
% --- Cpt:0, total:6356, sky:1/908, f1dot:1/7

0.% --- CG:88110 FG:671 f1dotmin_fg:-5.653552502045e-010 df1dot_fg:4.340677489177e-013 f2dotmin_fg:0 df2dot_fg:0 f3dotmin_fg:0 df3dot_fg:1
c
......
1.......
2....c
...
3.......
4.......c

=== SNIP ===

902.......
903......c
.
904.......
905.......
906..c
.....
907.......
2014-11-19 18:27:51.9843 (2516) [normal]: Finished main analysis.
2014-11-19 18:27:51.9843 (2516) [normal]: Recalculating statistics for the final toplist...
2014-11-19 18:33:42.8593 (2516) [normal]: Finished recalculating toplist statistics.
2014-11-19 18:33:42.8593 (2516) [debug]: Writing output ... done.
command line: projects/albert.phys.uwm.edu/einstein_S6BucketFU1UB_1.05_windows_intelx86__SSE2.exe @../../projects/albert.phys.uwm.edu/S6BucketFU1UB_00003994.conf.gz --DataFiles1=..\..\projects\albert.phys.uwm.edu\h1_0053.70_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.70_S6GC1;..\..\projects\albert.phys.uwm.edu\h1_0053.75_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.75_S6GC1;..\..\projects\albert.phys.uwm.edu\h1_0053.80_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.80_S6GC1;..\..\projects\albert.phys.uwm.edu\h1_0053.85_S6GC1;..\..\projects\albert.phys.uwm.edu\l1_0053.85_S6GC1 --ephemE=../../projects/albert.phys.uwm.edu/earth_09_11 --ephemS=../../projects/albert.phys.uwm.edu/sun_09_11 --segmentList=../../projects/albert.phys.uwm.edu/seglist-S6BucketFU1UB.dat -o ../../projects/albert.phys.uwm.edu/h1_0053.70_S6GC1__S6BucketFU1UB_00003992_1_2
Code-version: %% LAL: 6.12.0.1 (CLEAN c509104fe62b61ba73cd5ce58a7a39cc90f0531c)
%% LALPulsar: 1.9.0.1 (CLEAN c509104fe62b61ba73cd5ce58a7a39cc90f0531c)
%% LALApps: 6.14.0.1 (CLEAN c509104fe62b61ba73cd5ce58a7a39cc90f0531c)

2014-11-19 18:33:44.1562 (2516) [normal]: FstatMethod used: 'DemodSSE'
2014-11-19 18:33:44.1718 (2516) [normal]: Reading input data ... 2014-11-19 18:35:07.0937 (2516) [normal]: Number of segments: 90, total number of SFTs in segments: 12080
done.
% --- GPS reference time = 960499913.5000 , GPS data mid time = 960499913.5000
2014-11-19 18:35:07.2187 (2516) [normal]: dFreqStack = 3.609707e-006, df1dot = 1.002697e-010, df2dot = 0.000000e+000, df3dot = 0.000000e+000
% --- Setup, N = 90, T = 215977 s, Tobs = 22059873 s, gammaRefine = 230, gamma2Refine = 35741, gamma3Refine = 1
2014-11-19 18:35:07.2500 (2516) [normal]: INFO: No checkpoint checkpoint.cpt found - starting from scratch
% --- Cpt:0, total:5922, sky:1/846, f1dot:1/7

0.% --- CG:88110 FG:671 f1dotmin_fg:-3.676246445875e-010 df1dot_fg:4.340677489177e-013 f2dotmin_fg:0 df2dot_fg:0 f3dotmin_fg:0 df3dot_fg:1
c
......
1.......
2....c
...

Bernd Machenschalk
Administrator
Joined: 15 Oct 04
Posts: 155
Credit: 6218130
RAC: 0

Thanks for the report.

Note that the app effectively restarts itself with a slightly different command line (S6BucketFU1UB_00003993.conf.gz and h1_0053.70_S6GC1__S6BucketFU1UB_00003992_1_1 vs. S6BucketFU1UB_00003994.conf.gz and h1_0053.70_S6GC1__S6BucketFU1UB_00003992_1_2). A checkpoint is kept separately for each such analysis call, and at the beginning of each call there indeed isn't a checkpoint yet.

So judging from the stderr output, the behavior is pretty normal. There might be something wrong with the progress counting, though, but that wouldn't require aborting a task.

BM
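
For anyone who wants to check this in their own stderr output, here is a rough sketch (not part of the project's tooling) that lists every analysis call found in a pasted log. It only assumes what is visible in the log above: each call starts with a line beginning "command line:", the config file is passed as "@...conf.gz", and the output file follows "-o". The function name split_subtask_runs is made up for this example.

```python
import re

def split_subtask_runs(stderr_text):
    """Return one dict per 'command line:' entry found in the stderr text.

    Assumption: each analysis call logs a single 'command line:' line
    containing '@<path>.conf.gz' and '-o <output path>', as in the log
    quoted in this thread.
    """
    runs = []
    for line in stderr_text.splitlines():
        if not line.startswith("command line:"):
            continue
        conf = re.search(r"@(\S+\.conf\.gz)", line)
        out = re.search(r"\s-o\s+(\S+)", line)
        runs.append({
            # Keep only the file name, not the directory part.
            "conf": conf.group(1).rsplit("/", 1)[-1] if conf else None,
            "output": out.group(1).rsplit("/", 1)[-1] if out else None,
        })
    return runs
```

If the conf file name and the trailing number of the output name change from one entry to the next, the task has moved on to a new analysis call, so a "No checkpoint checkpoint.cpt found" message right after it is expected.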

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 320
Credit: 8545955
RAC: 0

To expand on Bernd's comment, just in case you don't know: the workunit contains a bundle of eight sub-tasks that get run one after the other, each "starting from scratch". Each of those sub-tasks does its own checkpointing...

HTH,
Oliver
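
As an illustration of why the per-sub-task progress drops back to 0% while the task as a whole is still advancing, here is a minimal sketch of how an overall figure could be derived from a bundle of eight sub-tasks. This is not how the app or the BOINC client actually computes progress; it assumes every sub-task counts equally, which is only approximate (the two runs in the log above cover 908 and 846 sky points). The name overall_progress is hypothetical.

```python
SUBTASKS_PER_WORKUNIT = 8  # from this thread: a bundle of eight sub-tasks

def overall_progress(finished_subtasks, current_fraction,
                     total=SUBTASKS_PER_WORKUNIT):
    """Estimate whole-task progress in the range 0.0 to 1.0.

    finished_subtasks: number of sub-tasks already completed
    current_fraction:  progress of the running sub-task (0.0 to 1.0)
    Assumption: all sub-tasks take equally long.
    """
    if not 0 <= finished_subtasks <= total:
        raise ValueError("finished_subtasks out of range")
    if not 0.0 <= current_fraction <= 1.0:
        raise ValueError("current_fraction out of range")
    if finished_subtasks == total:
        return 1.0
    return (finished_subtasks + current_fraction) / total
```

With two sub-tasks finished and the third one just starting "from scratch", overall_progress(2, 0.0) gives 0.25, so a third of the way through the log the task would be about a quarter done under this equal-weight assumption, not back at zero.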