AFAIK this issue is only present in the kplex/SK interaction, so I propose waiting for the fix in the SK node server that tkurki proposed. If this happens again in another scenario, we could apply the solutions proposed by e-sailing and stripydog.
To summarize...
Provisional fix
Manually edit /home/pi/.kplex.conf
Code:
###defaults
[udp]
name=system
direction=in
port=10110
[tcp]
name=signalk
direction=out (remove this line or replace it with direction=both)
mode=server
port=30330
###end of defaults
Fix in SK
https://github.com/SignalK/signalk-server-node/pull/581
If this happens again
changes in kplex?
heartbeat in kplex
heartbeat in openplotter
watchdog in openplotter closing connections
Signal K server 1.4.3 is already available in npm, please could you test and report?
Type:
sudo npm install --unsafe-perm -g signalk-server
and reset OP
(2018-08-10, 05:45 PM)Sailoog Wrote: Signal K server 1.4.3 is already available in npm, please could you test and report?
Updated and this from running overnight >
Code:
kplex 1436 pi 5u IPv4 17290 0t0 TCP *:30330 (LISTEN)
kplex 1436 pi 9u IPv4 19599 0t0 TCP localhost:30330->localhost:57696 (CLOSE_WAIT)
kplex 1436 pi 10u IPv4 19603 0t0 TCP 10.10.10.1:30330->10.10.10.1:40634 (ESTABLISHED)
kplex 1436 pi 11u IPv4 19734 0t0 TCP localhost:30330->localhost:45436 (CLOSE_WAIT)
kplex 1436 pi 12u IPv4 83155 0t0 TCP localhost:30330->localhost:51624 (CLOSE_WAIT)
kplex 1436 pi 13u IPv4 104581 0t0 TCP localhost:30330->localhost:53076 (CLOSE_WAIT)
kplex 1436 pi 14u IPv4 107982 0t0 TCP localhost:30330->localhost:53594 (CLOSE_WAIT)
kplex 1436 pi 15u IPv4 111659 0t0 TCP localhost:30330->localhost:56174 (ESTABLISHED)
node-red 1453 pi 20u IPv4 17367 0t0 TCP 10.10.10.1:40634->10.10.10.1:30330 (ESTABLISHED)
node 10278 pi 48u IPv4 119362 0t0 TCP localhost:56174->localhost:30330 (ESTABLISHED)
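To track these numbers over time rather than by eye, here is a minimal Python sketch that tallies TCP states per command from a listing in the format above. It assumes the listing came from something like `lsof -i :30330` and that each line has the ten whitespace-separated fields shown; `count_tcp_states` and `SAMPLE` are illustrative names, not part of kplex or Signal K.

```python
from collections import Counter

def count_tcp_states(lsof_output, command="kplex"):
    """Count TCP states (LISTEN, ESTABLISHED, CLOSE_WAIT, ...) for one command."""
    states = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        # Assumed layout: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME (STATE)
        if len(fields) < 10 or fields[0] != command or fields[7] != "TCP":
            continue
        state = fields[-1]
        if state.startswith("(") and state.endswith(")"):
            states[state[1:-1]] += 1
    return states

# A few lines copied from the listing above, used as sample input.
SAMPLE = """\
kplex 1436 pi 5u IPv4 17290 0t0 TCP *:30330 (LISTEN)
kplex 1436 pi 9u IPv4 19599 0t0 TCP localhost:30330->localhost:57696 (CLOSE_WAIT)
kplex 1436 pi 15u IPv4 111659 0t0 TCP localhost:30330->localhost:56174 (ESTABLISHED)
node 10278 pi 48u IPv4 119362 0t0 TCP localhost:56174->localhost:30330 (ESTABLISHED)
"""

if __name__ == "__main__":
    for state, count in sorted(count_tcp_states(SAMPLE).items()):
        print(state, count)
```

Running it periodically and logging the CLOSE_WAIT count would show whether the number keeps growing or stays at a handful.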
(2018-08-08, 03:36 PM)tkurki Wrote: How about a watchdog thread that closes connections that the client has closed (that are in CLOSE_WAIT), no matter what?
Not that simple. Each connection in kplex has its own thread, and each thread is responsible for all the data associated with that connection. If you simply close the socket, the next client connection will re-use the closed file descriptor, and the (old) controlling thread, which doesn't realise the descriptor has been closed under it, gets you into a world of pain. Better to signal the controlling thread to die, but then this becomes a non-trivial architectural change.
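The descriptor re-use is easy to see outside kplex. The following is a minimal Python sketch, not kplex code: it uses pipes instead of sockets and a single thread, and `demonstrate_fd_reuse` is a made-up name. It relies on the POSIX rule that a newly opened descriptor gets the lowest free number.

```python
import os

def demonstrate_fd_reuse():
    """Close a descriptor, open a new one, and return both numbers."""
    old_r, old_w = os.pipe()
    os.close(old_r)            # stands in for a watchdog closing it behind the owner's back
    new_r, new_w = os.pipe()   # stands in for the next connection being accepted
    reused = (old_r, new_r)
    for fd in (old_w, new_r, new_w):
        os.close(fd)
    return reused

if __name__ == "__main__":
    old_fd, new_fd = demonstrate_fd_reuse()
    print("closed fd", old_fd, "and the next open returned fd", new_fd)
```

A thread still holding the old number would now be reading from and writing to a different connection, which is the situation described above.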
Again, this is not ideal and, hands up, no: an output with thousands of unfulfilled clients was not a scenario I considered. But a kludge as a workaround while awaiting a complete re-architecture is probably the most straightforward solution here.
(2018-08-11, 10:21 AM)PaddyB Wrote: Updated and this from running overnight >
So better but not perfect: still a few of those pesky CLOSE_WAIT connections.
A scenario that could produce this is SK server crashing for some unrelated reason and getting restarted by systemd.
This is actually a realistic case re: kplex and CLOSE_WAIT in general: a crashing client that gets restarted will produce dangling connections.
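To make the mechanism concrete, here is a minimal Python sketch of a client that goes away while the server side keeps its socket open. It is not how kplex or the SK server is implemented; `peer_has_closed` and `simulate_vanishing_client` are hypothetical helpers, and the connection runs over loopback on an OS-assigned port.

```python
import select
import socket

def peer_has_closed(conn, timeout=1.0):
    """Return True if the peer has shut down its end of the connection."""
    readable, _, _ = select.select([conn], [], [], timeout)
    if not readable:
        return False  # nothing pending: peer is still there, just quiet
    try:
        # An empty read means the peer sent FIN; MSG_PEEK leaves real data queued.
        return conn.recv(1, socket.MSG_PEEK) == b""
    except ConnectionResetError:
        return True

def simulate_vanishing_client():
    """Accept one loopback connection, drop the client, check the server side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    client.close()  # client is gone; conn sits in CLOSE_WAIT until it is closed
    try:
        return peer_has_closed(conn)
    finally:
        conn.close()  # closing the server side is what clears CLOSE_WAIT
        listener.close()

if __name__ == "__main__":
    print("peer closed:", simulate_vanishing_client())
```

Between `client.close()` and `conn.close()` the server-side socket is exactly what shows up as CLOSE_WAIT in the listing: the kernel knows the peer has gone, but the owning process has not closed its end yet.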
Do you see SK server restarting in syslog?
You can also get SK TCP provider debug logging by running the server with the environment variable DEBUG set:
Code:
DEBUG=signalk-provider-tcp
This will log all connects, disconnects, errors and reconnects.