Bug #40

gcl-create calls abort: there should be a version of _gdp_gcl_decref that does not invoke GDP_ASSERT_GOOD_GCL

Added by Anonymous almost 7 years ago. Updated about 6 years ago.

Status: Closed
Priority: Low
Assignee:
Category: libgdp
Start date: 08/14/2016
Due date:
% Done: 0%


Description

I suspect that gdp-03.eecs.berkeley.edu might be down.

The more immediate symptom is that gcl-create is failing with an abort.

bash-3.2$ /Users/cxh/ptII/vendors/gdp/gdp/apps/gcl-create -k none -D '*=70' -s edu.berkeley.eecs.gdp-01.gdplogd cxh.test11 >& ~/Downloads/gcl-create.log.txt
Abort trap: 6
bash-3.2$

I added some fprintfs to the end of _gdp_gcl_create():

fail0:
    fprintf(stderr, "%s:%d:_gdp_gcl_create: fail0!\n", __FILE__, __LINE__);
    if (gcl != NULL) {
        fprintf(stderr, "%s:%d:_gdp_gcl_create: fail0, calling _gdp_gcl_decref\n", __FILE__, __LINE__);
        _gdp_gcl_decref(&gcl);
    }
    if (req != NULL) {
        fprintf(stderr, "%s:%d:_gdp_gcl_create: fail0, calling _gdp_recfref\n", __FILE__, __LINE__);
        _gdp_req_free(&req);
    }

    {
        char ebuf[100];

        ep_dbg_cprintf(Dbg, 8, "Could not create GCL: %s\n",
                ep_stat_tostr(estat, ebuf, sizeof ebuf));
    }
    fprintf(stderr, "%s:%d:_gdp_gcl_create: fail0, returning estat\n", __FILE__, __LINE__);
    return estat;
}

During the run, it looks like the gcl is not null, but it is not valid?

gdp_invoke(CMD_CREATE) <<< ERROR: Operation timed out [EPLIB:errno:60]
  req@0x7ffe895007b0:
    nextrec=0, numrecs=0, chan=0x7ffe89604f10
    postproc=0x0, sub_cb=0x0, udata=0x0
    state=ACTIVE, stat=OK
    act_ts=1970-01-01T00:00:00.000000000Z
    flags=0x140<ALLOC_RID,ON_CHAN_LIST>
    GCL@0x0: NULL
    PDU@0x7ffe89403530:
    v=3, ttl=15, rsvd1=0, cmd=66=CMD_CREATE
        dst=m5FiAV65ufV8Oe3SinfrOZPcbzXtlNOG-lmD-KSsE1U
    src=fb7vsMiCnmvxIssRD0H5RXoYa9QZeAvl9b92lkyJt5g
        rid=2, olen=0, chan=0x0, seqno=0
        flags=0
        datum=0x7ffe894039c0, recno=(none), dbuf=0x7ffe89403a40, dlen=117
        ts=(none)
    sigmdalg=0x0, siglen=0, sig=0x0
    total header=80

gdp_gcl_ops.c:283:_gdp_gcl_create: fail0!
gdp_gcl_ops.c:285:_gdp_gcl_create: fail0, calling _gdp_gcl_decref
_gdp_gcl_decref(0x7ffe89403f50)...
Assertion failed at gdp_gcl_cache.c:466: require:
        (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)

There should be a version of _gdp_gcl_decref that does not invoke GDP_ASSERT_GOOD_GCL(gcl);
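One possible shape for such a variant, as a minimal sketch only. The struct, the flag value, and the name gcl_decref_checked are hypothetical stand-ins, not libgdp's actual definitions. It tests the same condition as the failed assertion (non-NULL and marked in use) but reports the result to the caller instead of aborting:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for libgdp internals; the real definitions differ. */
#define GCLF_INUSE_SKETCH 0x0004u

typedef struct {
    uint32_t flags;   /* state bits, including the "in use" bit */
    int refcnt;       /* reference count */
} gcl_sketch_t;

/*
 * Drop one reference without asserting.
 * Returns true if a reference was dropped, false if the handle was
 * NULL or not marked in use. The caller's pointer is cleared either way.
 */
static bool gcl_decref_checked(gcl_sketch_t **gclp)
{
    if (gclp == NULL)
        return false;

    gcl_sketch_t *gcl = *gclp;
    *gclp = NULL;

    if (gcl == NULL || (gcl->flags & GCLF_INUSE_SKETCH) == 0) {
        fprintf(stderr, "gcl_decref_checked: skipping invalid GCL %p\n",
                (void *)gcl);
        return false;
    }
    if (gcl->refcnt > 0)
        gcl->refcnt--;
    return true;
}
```

A cleanup path such as fail0 could call a function like this and carry on to free the request and return estat even when the handle turns out to be stale.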

Attached is the complete log of the command.

In general, having assertions that call abort makes it very difficult to have robust IoT code.
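To illustrate the alternative, here is a rough sketch of a precondition check that logs and returns an error to the caller rather than aborting the process. The names SKETCH_REQUIRE, sketch_stat_t, and sketch_close are made up for this example and are not part of libgdp or libep:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch; these are not EP_STAT values. */
typedef enum { SKETCH_OK = 0, SKETCH_ASSERT_FAILED = 1 } sketch_stat_t;

/*
 * Log a failed precondition and return an error from the enclosing
 * function instead of calling abort().
 */
#define SKETCH_REQUIRE(cond)                                         \
    do {                                                             \
        if (!(cond)) {                                               \
            fprintf(stderr, "Requirement failed at %s:%d: %s\n",     \
                    __FILE__, __LINE__, #cond);                      \
            return SKETCH_ASSERT_FAILED;                             \
        }                                                            \
    } while (0)

/* Example: a library entry point that rejects a NULL handle. */
static sketch_stat_t sketch_close(const int *handle)
{
    SKETCH_REQUIRE(handle != NULL);
    return SKETCH_OK;
}
```

With this pattern the application decides how to react to a bad handle, which matters for long-running device code that cannot simply exit.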

gcl-create.log.txt (22 KB) Anonymous, 08/14/2016 09:16 AM


Related issues

Related to GDP - Bug #83: Assertions should not crash the calling process: assertion in gdp_gcl_close() causes the application to exit Closed 10/23/2016

History

#1 Updated by Nitesh Mor almost 7 years ago

  • Assignee set to Eric Allman

Christopher Brooks wrote:

I suspect that gdp-03.eecs.berkeley.edu might be down.

gdp-03 is working fine (just checked). The problem is with the GDP router. gcl-create works okay if you connect directly to the router on gdp-03. The following works fine:

$ ./gcl-create -D*=10 -k none -G gdp-03.eecs.berkeley.edu -s edu.berkeley.eecs.gdp-03.gdplogd edu.berkeley.eecs.mor.aug14.test1

But going to gdp-03 via gdp-01 results in the assertion error:

$ ./gcl-create -D*=10 -k none -s edu.berkeley.eecs.gdp-03.gdplogd edu.berkeley.eecs.mor.aug14.test2
_gdp_chan_open(gdp-01.eecs.berkeley.edu; gdp-02.eecs.berkeley.edu; gdp-03.eecs.berkeley.edu; gdp-04.eecs.berkeley.edu)
Trying gdp-01.eecs.berkeley.edu
Talking to router at gdp-01.eecs.berkeley.edu:8007
_gdp_advertise => OK
gdp_init: OK
Couldn't open GCL 0aZ_mffIWtEZQ6CLTpEsA2oQ9a0b54aMgKYxI0ZUePc:
    ERROR: 600 no route available [Berkeley:Swarm-GDP:600]
Creating log as mor@io
_gdp_req_unsend: req 0x191fef0 has NULL GCL
_gdp_req_unsend: req 0x191fef0 has NULL GCL
Assertion failed at gdp_gcl_cache.c:466: require:
    (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)
Aborted

Regardless, the assertion error still stands.

#2 Updated by Anonymous almost 7 years ago

Is gdp-03 up? I'm getting failures from both Mac OS and RHEL:
~~~
bash-3.2$ ./gcl-create -D*=10 -k none -G gdp-03.eecs.berkeley.edu -s edu.berkeley.eecs.gdp-03.gdplogd edu.berkeley.eecs.cxh.aug15.test2
_gdp_chan_open(gdp-03.eecs.berkeley.edu)
Trying gdp-03.eecs.berkeley.edu
Talking to router at gdp-03.eecs.berkeley.edu:8007
_gdp_advertise => OK
gdp_init: OK
Couldn't open GCL QU606BrplvT6p4HCZ5exUwhNUonhm8CyO6BQGCzrumE:
ERROR: 600 no route available [Berkeley:Swarm-GDP:600]
Creating log as cxh@terramac1.local
gdp_gcl_ops.c:250:_gdp_gcl_create: create a new pseudo-GCL for the daemon so we can correlate the results
gdp_gcl_ops.c:255:_gdp_gcl_create: create the request
gdp_gcl_ops.c:260:_gdp_gcl_create: send the name of the log to be created in the payload
gdp_gcl_ops.c:264:_gdp_gcl_create: add the metadata to the output stream
gdp_gcl_ops.c:268:_gdp_gcl_create: about to call gdp_invoke()
_gdp_req_unsend: req 0x7feb12e00360 not on GCL list
gdp_pdu_proc_resp: no req for incoming response
PDU@0x7feb12d058e0:
v=2, ttl=0, rsvd1=0, cmd=240=NAK_R_NOROUTE
dst=C584dLeKIDgrPdi2ZemGCp4kLXl1RDMz7FFjIK6ypNM
src=ra41V0YihrYsPkUji64k0Z_i6KMfgOQcbUmHDo4-vMs
rid=2, olen=0, chan=0x7feb12d04f50, seqno=0
flags=0
datum=0x7feb12d06020, recno=(none), dbuf=0x7feb12d060a0, dlen=0
ts=(none)
sigmdalg=0x0, siglen=0, sig=0x0
total header=80
GCL@0x7feb12f005b0: ra41V0YihrYsPkUji64k0Z_i6KMfgOQcbUmHDo4-vMs
iomode = 0, refcnt = 2, reqs = 0x0, nrecs = 0
flags = 0xa
freefunc = 0x0, gclmd = 0x0, digest = 0x0, x = 0x0
utime = 2016-08-15T22:34:26
_gdp_req_unsend: req 0x7feb12e00360 has NULL GCL
gdp_pdu_proc_resp: no req for incoming response
PDU@0x7feb12d058e0:
v=2, ttl=0, rsvd1=0, cmd=240=NAK_R_NOROUTE
dst=C584dLeKIDgrPdi2ZemGCp4kLXl1RDMz7FFjIK6ypNM
src=ra41V0YihrYsPkUji64k0Z_i6KMfgOQcbUmHDo4-vMs
rid=2, olen=0, chan=0x7feb12d04f50, seqno=0
flags=0
datum=0x7feb12d06020, recno=(none), dbuf=0x7feb12d060a0, dlen=0
ts=(none)
sigmdalg=0x0, siglen=0, sig=0x0
total header=80
GCL@0x7feb12f005b0: ra41V0YihrYsPkUji64k0Z_i6KMfgOQcbUmHDo4-vMs
iomode = 0, refcnt = 3, reqs = 0x0, nrecs = 0
flags = 0xa
freefunc = 0x0, gclmd = 0x0, digest = 0x0, x = 0x0
utime = 2016-08-15T22:34:41
_gdp_req_unsend: req 0x7feb12e00360 has NULL GCL
gdp_gcl_ops.c:283:_gdp_gcl_create: fail0!
gdp_gcl_ops.c:285:_gdp_gcl_create: fail0, calling _gdp_gcl_decref
gdp_gcl_ops.c:289:_gdp_gcl_create: fail0, calling _gdp_recfref
Could not create GCL: ERROR: Operation timed out [EPLIB:errno:60]
gdp_gcl_ops.c:299:_gdp_gcl_create: fail0, returning estat
gdp_api.c:313:gdp_gcl_create: _gdp_gcl_create() returned
gdp_api.c:318:gdp_gcl_create: about to print estat
gdp_gcl_create: ERROR: Operation timed out [EPLIB:errno:60]
exiting with status ERROR: Operation timed out [EPLIB:errno:60]
bash-3.2$
~~~

#3 Updated by Anonymous almost 7 years ago

  • Category set to libgdp
  • Priority changed from Normal to Low

I can't reproduce this. It was probably because some of the daemons were not working, or else there was a fix. I'm moving this to low priority. It could be closed, though the bug might still exist. I don't seem to have the ability to close bugs?

#4 Updated by Anonymous almost 7 years ago

  • Assignee changed from Eric Allman to Anonymous

Assigning this to myself to see if I can get a close button.

#5 Updated by Anonymous almost 7 years ago

  • Status changed from New to Closed
  • Assignee changed from Anonymous to Eric Allman

I'm going to close this because I can no longer reproduce it. The bug probably still persists though.

#6 Updated by Anonymous almost 7 years ago

I can't reopen this, but it is happening again:

bash-3.2$ $PTII/vendors/gdp/gdp/apps/gcl-create -D*=10 -k none -G gdp-01.eecs.berkeley.edu -s edu.berkeley.eecs.gdp-03.gdplogd $LOGNAME.5
_gdp_lib_init(NULL)
        @(#)libgdp 0.7.0 (2016-08-22 09:08) 88929fffe1c744d41409d42683e4760b25d2102e
My GDP routing name = _TdAJ6UhFwCjwkxFyEWeZbYrSYukss1su42ED0sTZRU
gdp_lib_init: OK
_gdp_chan_open(gdp-01.eecs.berkeley.edu)
Trying gdp-01.eecs.berkeley.edu
_gdp_chan_open: talking to router at gdp-01.eecs.berkeley.edu:8007
_gdp_advertise => OK
gdp_init: OK
Couldn't open GCL ZljkBm2KG9yw17pzPnRg8GEC9owAJWXx4DtQi30Cdfc:
        ERROR: 600 no route available [Berkeley:Swarm-GDP:600]
Creating log as cxh@terramac1.local
_gdp_req_unsend: req 0x7fd2c1408a80 has NULL GCL
_gdp_req_unsend: req 0x7fd2c1408a80 has NULL GCL
Assertion failed at gdp_gcl_cache.c:466: require:
        (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)
Abort trap: 6
bash-3.2$

#7 Updated by Eric Allman almost 7 years ago

  • Status changed from Closed to In Progress

#8 Updated by Eric Allman over 6 years ago

  • Status changed from In Progress to Resolved

I believe this is related to (or the same as) issue #83, which has been fixed.

#9 Updated by Eric Allman over 6 years ago

  • Related to Bug #83: Assertions should not crash the calling process: assertion in gdp_gcl_close() causes the application to exit added

#10 Updated by Eric Allman about 6 years ago

  • Status changed from Resolved to Closed

Closed due to duplication (#83).
