Project

General

Profile

Bug #69

Read, Write Errors

Added by Syed Hasnain almost 7 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
High
Assignee:
Category:
-
Start date:
10/07/2016
Due date:
% Done:

0%


Description

Get the following errors while trying to write or read to GDP , have successfully written and read these logs earlier. Please advise, as we are trying to test our setup for annual review

The writer error:

sbuser@swarmnuc015:~/gdp/apps$ sudo ./gdp-writer edu.tamu.watch1.test 1 4567
2016-10-07 12:57:40.005047 -0500 gdp-writer: _gdp_req_freeall: couldn't acquire req lock: ERROR: request freed while in use [Berkeley:Swarm-GDP:31]

gdp-writer: exiting with status ERROR: Connection timed out [EPLIB:errno:110]

While trying to read:

The command I used to read log (multiple times)

debian@swarmgateway:/var/log$ gdp-reader -s -t edu.tamu.watch1.test
Reading GCL tl0B06__BR4LdVoJnv5b-4BVqq3IurevgNdfD-Gf_FY
 >>> recno 72580, len 207, ts 2016-10-07T17:39:56.878161000Z +/- 0.500000
{"DEVICE": "9096", "ACC_X": 2.392578125, "ACC_Y": 0.0244140625, "ACC_Z": -0.048828125, "GYRO_X": 0.48828125, "GYRO_Y": -0.732421875, "GYRO_Z": 0.9765625, "PPG": 10520065, "HEAR_RATE": 70, "TIME": 2688123835}
 >>> recno 72581, len 207, ts 2016-10-07T17:39:56.881148000Z +/- 0.500000
{"DEVICE": "9096", "ACC_X": 2.392578125, "ACC_Y": 0.0244140625, "ACC_Z": -0.048828125, "GYRO_X": 0.48828125, "GYRO_Y": -0.732421875, "GYRO_Z": 0.9765625, "PPG": 10520065, "HEAR_RATE": 70, "TIME": 2688123835}
 >>> recno 72583, len 207, ts 2016-10-07T17:39:56.881206000Z +/- 0.500000
{"DEVICE": "9096", "ACC_X": 2.392578125, "ACC_Y": 0.0244140625, "ACC_Z": -0.048828125, "GYRO_X": 0.48828125, "GYRO_Y": -0.732421875, "GYRO_Z": 0.9765625, "PPG": 10520065, "HEAR_RATE": 70, "TIME": 2688123835}
 >>> recno 72582, len 207, ts 2016-10-07T17:39:56.881337000Z +/- 0.500000
{"DEVICE": "9096", "ACC_X": 2.392578125, "ACC_Y": 0.0244140625, "ACC_Z": -0.048828125, "GYRO_X": 0.48828125, "GYRO_Y": -0.732421875, "GYRO_Z": 0.9765625, "PPG": 10520065, "HEAR_RATE": 70, "TIME": 2688123835}
 >>> recno 72584, len 207, ts 2016-10-07T17:39:56.924673000Z +/- 0.500000
{"DEVICE": "9096", "ACC_X": 2.392578125, "ACC_Y": 0.0244140625, "ACC_Z": -0.048828125, "GYRO_X": 0.48828125, "GYRO_Y": -0.732421875, "GYRO_Z": 0.9765625, "PPG": 10520065, "HEAR_RATE": 70, "TIME": 2688123835}
 >>> recno 72585, len 207, ts 2016-10-07T17:39:56.927736000Z +/- 0.500000
{"DEVICE": "9096", "ACC_X": 2.392578125, "ACC_Y": 0.0244140625, "ACC_Z": -0.048828125, "GYRO_X": 0.48828125, "GYRO_Y": -0.732421875, "GYRO_Z": 0.9765625, "PPG": 10520065, "HEAR_RATE": 70, "TIME": 2688123835}

Running the exact same command a minute later:

debian@swarmgateway:~/gdp$ gdp-reader -s -t edu.tamu.watch1.test
Reading GCL tl0B06__BR4LdVoJnv5b-4BVqq3IurevgNdfD-Gf_FY
Assertion failed at gdp_gcl_cache.c:466: require:
        (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)
Aborted

Even If I try to read just the records already present from the beginning:

debian@swarmgateway:~/gdp$ gdp-reader -t edu.tamu.watch1.test
Reading GCL tl0B06__BR4LdVoJnv5b-4BVqq3IurevgNdfD-Gf_FY
Assertion failed at gdp_gcl_cache.c:466: require:
        (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)
Aborted

If i try to read an old log, that is not involved in my code at all:

debian@swarmgateway:~/gdp$ gdp-reader -t edu.esp.Tamu.device.MotionNetBKP

Reading GCL MerBLVlMI8-syn_mtadS2eLkAxCsjVvSsZAoo1SMWOw
Assertion failed at gdp_gcl_cache.c:466: require:
        (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)
Aborted

The gcl-create remains unresponsive, hangs when run.


Related issues

Duplicated by GDP - Bug #76: ERROR: request freed while in use Closed 10/15/2016

History

#1 Updated by Syed Hasnain almost 7 years ago

Getting this on trying to read now.

sbuser@swarmnuc015:~$ gdp-reader -t edu.tamu.watch1.test
Reading GCL tl0B06__BR4LdVoJnv5b-4BVqq3IurevgNdfD-Gf_FY
2016-10-07 15:13:04.350007 -0500 gdp-reader: _gdp_req_freeall: couldn't acquire req lock: ERROR: request freed while in use [Berkeley:Swarm-GDP:31]

[ERROR] gdp-reader: Cannot open GCL:
ERROR: Connection timed out [EPLIB:errno:110]
exiting after 0 records with status ERROR: Connection timed out [EPLIB:errno:110]

#2 Updated by Eric Allman almost 7 years ago

  • Status changed from New to In Progress
  • Assignee set to Eric Allman
  • Priority changed from Normal to High

As near as I can tell, the log server (gdplogd) is losing synchronization with the router layer under high load. I think this is exasperated by some corrupt logs, which I suspect were created on an earlier version of the server. However, I have evidence that at least one log is being written by multiple programs, which will definitely cause anomalies (in particular, lost data). Unfortunately, this is unavoidable owing to the (still partially implemented) leaderless replication algorithm. In any case, this is clearly a bug, and I've upped the Priority since it is clearly serious.

Some of the messages lead me to believe that you are running an old version of the code. Other than the error messages that you are seeing (e.g., Assertion failed at gdp_gcl_cache.c:466: require: (gcl) != NULL && EP_UT_BITSET(GCLF_INUSE, (gcl)->flags)) should be gone (or at least different) now, and only occur after another error has occurred (notably a timeout). But I don't understand the command sudo ./gdp-writer edu.tamu.watch1.test 1 4567 — the gdp-writer from the distribution doesn't allow the additional parameter (1 4567). I can only assume that you have built a custom version.

I'm logging a lot more information from the log server now in the hopes of getting more information, but since I don't know how to reproduce it, I have to wait until it happens again. I'm keeping an eye on it over the weekend.

#3 Updated by Syed Hasnain almost 7 years ago

Hi Eric,

I thought I was running the latest gdp version, after cloning it from the repo, i will check again. Yes The gdp-writer is a custom version, modified to take input from UDP ports and work with Moto 360.

#4 Updated by Nitesh Mor almost 7 years ago

  • Description updated (diff)

Edited description for readability.

#5 Updated by Syed Hasnain almost 7 years ago

Nitesh Mor wrote:

Edited description for readability.

Seeing this error on local GDP as well. Restarting the gdplogd and router fixes the problem. I am pretty sure I was running only one writer.

#6 Updated by Nitesh Mor almost 7 years ago

  • Assignee changed from Eric Allman to Nitesh Mor

#7 Updated by Nitesh Mor almost 7 years ago

  • Duplicated by Bug #76: ERROR: request freed while in use added

#8 Updated by Eric Allman almost 7 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from Nitesh Mor to Eric Allman

Syed, could you try this again, preferably using the latest gdplogd from the repo? I think it may be fixed, and I don't want to waste time trying to reproduce it if it is.

#9 Updated by Eric Allman over 6 years ago

  • Status changed from Feedback to Closed

I'm closing this out since I think it's fixed and I haven't heard from Syed in six months.

Also available in: Atom PDF