Bug #84

Assertion in gdp_gcl_create() crashes the calling process

Added by Anonymous over 3 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
libgdp
Start date:
10/23/2016
Due date:
% Done:
0%


Description

The GDP should not crash programs that use the GDP.

Here's an assertion in gdp_gcl_create().

Is there a way to turn these off?

Here's the log:

JavaScriptGDPLogName: JavaScriptGDPLogName: ptolemy.actor.lib.jjs.modules.gdp.test.auto.GDPLogSubscribeJS.0.5884172563617972 (Thread[GDPLogSubscribeJS,1,main])
GDPLogCreate: GDPLogCreate.js: create() Start. (Thread[GDPLogSubscribeJS,1,main])
GDPHelper.GDPHelper(ptolemy.actor.lib.jjs.modules.gdp.test.auto.GDPLogSubscribeJS.0.5884172563617972, 3, edu.berkeley.eecs.gdp-01.gdplogd):
GDPManager: Using configuration files in /Users/cxh/.ep_adm_params
GDPManager: gdp settings:
swarm.gdp.routers=gdp-03.eecs.berkeley.edu; gdp-02.eecs.berkeley.edu
GDP_GCL.java: newGCL(org.terraswarm.gdp.GDP_NAME@42a3cfbc, 3, org.terraswarm.gdp.GDP_NAME@6acf3338)
GDP_GCL.java: GDP_GCL(org.terraswarm.gdp.GDP_NAME@42a3cfbc(gEwMhqmS_9hcQEX9-U05nsYdwUCmnsRWcsd-H9wYhtU^@), RA, org.terraswarm.gdp.GDP_NAME@6acf3338(m5FiAV65ufV8Oe3SinfrOZPcbzXtlNOG-lmD-KSsE1U^@)
_gdp_lib_init(NULL)
        @(#)libgdp 0.7.2 (2016-10-22 21:40) ae1cebb1a5135c3ec22f4a3222e12a60cc76bb3b
My GDP routing name = uWkPiA1k94Pujw10-A4YDx4pHjbafUPYozGvvEszG9E
gdp_lib_init: OK
Trying Zeroconf:
create_new_simple_poll_client: avahi_simple_poll_get: Daemon not running
Failed to create client object: Daemon not running
gdp_zc_scan: create_new_simple_poll_client failed: No such file or directory
gdp_zc_scan failed
_gdp_chan_open(gdp-03.eecs.berkeley.edu; gdp-02.eecs.berkeley.edu)
Trying gdp-03.eecs.berkeley.edu
_gdp_chan_open: trying host gdp-03.eecs.berkeley.edu port 8007
Trying gdp-02.eecs.berkeley.edu
_gdp_chan_open: trying host gdp-02.eecs.berkeley.edu port 8007
_gdp_chan_open: could not open channel: ERROR: Network is unreachable [EPLIB:errno:51]
gdp_init: ERROR: Network is unreachable [EPLIB:errno:51]

>>> gdp_gcl_open(gEwMhqmS_9hcQEX9-U05nsYdwUCmnsRWcsd-H9wYhtU)

>>> _gdp_invoke(req=0x7ffab7719e40 rid=1): CMD_OPEN_RA (75), gcl@0x7ffab7735690
        datum @ 0x7ffab7735850: recno -1, len 0, no timestamp
<<< _gdp_invoke(0x7ffab7719e40 rid=1) CMD_OPEN_RA: ABORT: lost connection to GDP [Berkeley:Swarm-GDP:20]
Couldn't open GCL gEwMhqmS_9hcQEX9-U05nsYdwUCmnsRWcsd-H9wYhtU:
        ABORT: lost connection to GDP [Berkeley:Swarm-GDP:20]
_gdp_req_lock: req @ 0x7ffab7719e40 freed
2016-10-23 16:01:47.171041 -0700 java: _gdp_req_freeall: couldn't acquire req lock: ERROR: request freed while in use [Berkeley:Swarm-GDP:31]

<<< _gdp_req_freeall(0x7ffab77356e8): ERROR: request freed while in use [Berkeley:Swarm-GDP:31]
<<< gdp_gcl_open(gEwMhqmS_9hcQEX9-U05nsYdwUCmnsRWcsd-H9wYhtU): ABORT: lost connection to GDP [Berkeley:Swarm-GDP:20]
GDP_GCL: gdp_gcl_open() failed, trying to create the log and call gdp_gcl_open() again.
GDP_GCL.java: create(org.terraswarm.gdp.GDP_NAME@42a3cfbc, , org.terraswarm.gdp.GDP_NAME@6acf3338, {})
gdp_lib_init: OK
Trying Zeroconf:
create_new_simple_poll_client: avahi_simple_poll_get: Daemon not running
Failed to create client object: Daemon not running
gdp_zc_scan: create_new_simple_poll_client failed: No such file or directory
gdp_zc_scan failed
_gdp_chan_open(gdp-03.eecs.berkeley.edu; gdp-02.eecs.berkeley.edu)
Trying gdp-03.eecs.berkeley.edu
_gdp_chan_open: trying host gdp-03.eecs.berkeley.edu port 8007
Trying gdp-02.eecs.berkeley.edu
_gdp_chan_open: trying host gdp-02.eecs.berkeley.edu port 8007
_gdp_chan_open: could not open channel: ERROR: Network is unreachable [EPLIB:errno:51]
gdp_init: ERROR: Network is unreachable [EPLIB:errno:51]

>>> gdp_gcl_create
_gdp_gcl_create: gcl=gEwMhqmS_9hcQEX9-U05nsYdwUCmnsRWcsd-H9wYhtU
        logd=m5FiAV65ufV8Oe3SinfrOZPcbzXtlNOG-lmD-KSsE1U
Assertion failed at gdp_req.c:150: assert:
        !EP_UT_BITSET(GDP_REQ_ON_GCL_LIST, req->flags)
Abort trap: 6
bash-3.2$

Here's the important part of the Java stack trace:

Thread 30 Crashed:: Java: GDPLogSubscribeJS
0   libsystem_kernel.dylib          0x00007fff96037f06 __pthread_kill + 10
1   libsystem_pthread.dylib         0x00007fff83c0f4ec pthread_kill + 90
2   libsystem_c.dylib               0x00007fff8af666df abort + 129
3   jna2395229788052593588.tmp      0x000000017a8c7ac1 ep_assert_failure + 273
4   jna2395229788052593588.tmp      0x000000017a8c32f8 _gdp_req_new + 984
5   jna2395229788052593588.tmp      0x000000017a8bc7e1 _gdp_gcl_create + 273 (gdp_gcl_ops.c:261)
6   jna2395229788052593588.tmp      0x000000017a8b78c5 gdp_gcl_create + 245 (gdp_api.c:319)
7   jna1907739429753297179.tmp      0x000000011c4af514 ffi_call_unix64 + 76
8   jna1907739429753297179.tmp      0x000000011c4af42d ffi_call + 781
9   jna1907739429753297179.tmp      0x000000011c4a6565 0x11c4a2000 + 17765
10  ???                             0x00000001053b49d4 0 + 4382738900
11  ???                             0x00000001053a52bd 0 + 4382675645
12  ???                             0x00000001053a5040 0 + 4382675008

Related issues

Duplicated by GDP - Bug #83: Assertions should not crash the calling process: assertion in gdp_gcl_close() causes the application to exit Closed 10/23/2016

History

#1 Updated by Nitesh Mor over 3 years ago

  • Description updated (diff)

#2 Updated by Nitesh Mor over 3 years ago

This crashed because I had just rebooted the servers while doing some upgrades.

As for "assertion failure" crashing things: yes, we will fix that after the review is over and Eric is back in good shape.

#3 Updated by Anonymous over 3 years ago

Ok, feel free to close these two and open a different one that says "Production code should not use asserts to crash the process".

I'd be happy if I could compile with assertions disabled, or modified to only print a warning message.
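
As a rough illustration of the "only print a warning" option, here is a minimal sketch in C. The names SOFT_ASSERT, soft_assert_check, soft_assert_failures, and SOFT_ASSERT_FATAL are hypothetical and are not part of libgdp or libep; the sketch only assumes that a failed check can be reported and handed back to the caller instead of aborting.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names for illustration only; not part of libgdp or libep. */
static int soft_assert_failures = 0;    /* count of failed checks so far */

/* Report a failed check on stderr and return the truth value of the check.
 * Building with -DSOFT_ASSERT_FATAL restores the abort-on-failure behavior. */
static int soft_assert_check(int ok, const char *expr,
                             const char *file, int line)
{
    if (!ok) {
        soft_assert_failures++;
        fprintf(stderr, "warning: assertion failed at %s:%d: %s\n",
                file, line, expr);
#ifdef SOFT_ASSERT_FATAL
        abort();
#endif
    }
    return ok;
}

/* Evaluates to 1 if the condition holds and 0 otherwise, so the caller
 * can take a recovery path instead of crashing the process. */
#define SOFT_ASSERT(cond) \
    soft_assert_check((cond) ? 1 : 0, #cond, __FILE__, __LINE__)
```

A caller could then write `if (!SOFT_ASSERT(ptr != NULL)) return error;` and decide at build time whether a violation is fatal.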

#4 Updated by Nitesh Mor over 3 years ago

  • Duplicated by Bug #83: Assertions should not crash the calling process: assertion in gdp_gcl_close() causes the application to exit added

#5 Updated by Eric Allman over 3 years ago

  • Status changed from New to Feedback

This is not a bug. If you look carefully, gdp_init failed:

gdp_init: ERROR: Network is unreachable [EPLIB:errno:51]

... but the program went on to try to use the (uninitialized) library. It's not surprising that didn't work.

As it turns out, gdp_gcl_create attempts an initialization if the caller has not done one, and it ignored the result of that attempt (that is a legitimate bug that will be fixed). You really should call gdp_init explicitly; that's part of the library semantics. In particular, not all libgdp and libep entry points try to "auto-initialize". For this reason it may make sense to remove the auto-initialization entirely in the future.
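
To make the pattern concrete, here is a small self-contained sketch in C of an entry point that auto-initializes but propagates the initialization result instead of ignoring it. Everything in it (demo_init, demo_create, demo_stat, demo_network_up) is a hypothetical stand-in, not the libgdp API; the network flag only simulates the "Network is unreachable" condition from the log.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; these are NOT libgdp functions. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_NETWORK        /* initialization could not reach a router */
} demo_stat;

static bool demo_initialized = false;
static bool demo_network_up = false;    /* simulates network reachability */

/* Explicit initialization; safe to call more than once. */
demo_stat demo_init(void)
{
    if (demo_initialized)
        return DEMO_OK;
    if (!demo_network_up)
        return DEMO_ERR_NETWORK;
    demo_initialized = true;
    return DEMO_OK;
}

/* Entry point that auto-initializes, but checks the result and returns
 * the failure to the caller rather than continuing uninitialized. */
demo_stat demo_create(const char *name)
{
    demo_stat st = demo_init();
    if (st != DEMO_OK)
        return st;
    printf("created %s\n", name);
    return DEMO_OK;
}
```

With the simulated network down, demo_create returns DEMO_ERR_NETWORK and never touches uninitialized state. A caller that runs demo_init first sees the same failure earlier and can stop there.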

#6 Updated by Anonymous over 3 years ago

  • Status changed from Feedback to In Progress

If people are to use the GDP library, then it must be robust. Part of that robustness is to properly handle any input that is supplied by the user.

Crashing the process with assert is not robust. If you want to have asserts in your development code, that is fine, but asserts that cause the process to exit must be disabled if the GDP is to be used as a library.

One possibility is that I could create a branch of the GDP repository that has asserts disabled.

#7 Updated by Eric Allman over 3 years ago

I don't think you appreciate that assertions are not bugs themselves; they are symptoms of bugs. Feel free to create a private version that disables all assertions, but that won't fix the underlying bugs that are causing the assertion failure in the first place. Also beware that any bug reports you file with assertions disabled may be de-prioritized, since there is a likelihood that the bugs you see simply do not exist (that is, one bug that is undetected can trigger another bug, which would never occur if the first bug were fixed).

That said, I have been working on a version that tries to deal more forgivingly with some (not all) assertions. That's not in the public tree yet, since in many cases it is unclear how any recovery can occur at all; in some cases, recovery is possibly harder than fixing the real bug that caused the assertion failure in the first place. It's an uphill slog.

#8 Updated by Anonymous over 3 years ago

Your version that deals more forgivingly with assertions sounds like a good thing.

Having preconditions is a good practice. If the precondition is not met, then an error code should be returned.
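
A minimal sketch in C of what that could look like, loosely modeled on the "request is not already on a list" condition in the failed assertion. The type and function names (struct request, request_enlist, req_stat, REQ_ON_LIST) are hypothetical and are not taken from the libgdp sources.

```c
#include <stddef.h>

/* Hypothetical types and names for illustration; not the libgdp sources. */
#define REQ_ON_LIST 0x01u               /* flag: request is linked into a list */

typedef enum {
    REQ_OK = 0,
    REQ_ERR_NULL,                       /* a required pointer was NULL */
    REQ_ERR_ALREADY_LISTED              /* precondition violated */
} req_stat;

struct request {
    unsigned flags;
    struct request *next;
};

/* Push req onto the list at *head.
 * Precondition: req is not already on a list. A violation is reported
 * through the return value instead of aborting the process. */
req_stat request_enlist(struct request **head, struct request *req)
{
    if (head == NULL || req == NULL)
        return REQ_ERR_NULL;
    if (req->flags & REQ_ON_LIST)
        return REQ_ERR_ALREADY_LISTED;

    req->next = *head;
    *head = req;
    req->flags |= REQ_ON_LIST;
    return REQ_OK;
}
```

The caller receives REQ_ERR_ALREADY_LISTED on a double insert and can log or recover. Whether recovery is actually possible at that point is a separate question.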

#9 Updated by Eric Allman over 3 years ago

I believe this assertion has been fixed. Since it's a "can't happen" assertion, it's hard to reproduce. Nonetheless, that particular assertion is now non-fatal, although if it triggers you will have a memory leak.

#10 Updated by Eric Allman over 3 years ago

  • Status changed from In Progress to Resolved

#11 Updated by Eric Allman about 3 years ago

  • Status changed from Resolved to Closed

Closing because it appears to have been resolved.
