Restarted test node from wedged state @ +/- few seconds Wed Feb 26 20:30:19 CDT 2020

Catching up to max height... 619146

SetBestChain: new best=00000000000000000002db0a938c02f03e2b4a7537fb7fcde118e034a2812b40 height=619146 work=4043113637057386640608850467

Complete.

Now to let the node get 10 connections, currently has 9...

Done, now has 10 connections.

GDB connected to process, currently in 'continue' mode.

mod6@localhost ~ $ gdb -p 31902
GNU gdb (Gentoo 7.12.1 vanilla) 7.12.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: .
Find the GDB manual and other documentation resources online at: .
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 31902
[New LWP 31913]
[New LWP 31915]
[New LWP 31933]
[New LWP 31934]
[New LWP 31935]
0x000000000079c25e in __syscall ()
(gdb) break net.h:322
Breakpoint 1 at 0x454700: file net.h, line 322.
(gdb) continue
Continuing.

--------------------------------------------------------------------------

Here are the code changes that I've compiled in this time:

mod6@localhost ~/trb-keccak $ vdiff a b
diff -uNr a/bitcoin/src/net.h b/bitcoin/src/net.h
--- a/bitcoin/src/net.h 492c9cc92a504bb8174d75fafcbee6980986182a459efc9bfa1d64766320d98ba2fa971d78d00a777c6cc50f82a5d424997927378e99738b1b3b550bdaa727f7
+++ b/bitcoin/src/net.h d52ec234b0d7a7088aa357b70e0d33a3eaa9b37c0ae64466d09f119bb9d7f2420d89a76e839a832b77fc15c95b9e8ead377989fec991d4f9f6555b087cae5d8f
@@ -106,8 +106,8 @@
     int64 nLastRecv;
     int64 nLastSendEmpty;
     int64 nTimeConnected;
-    unsigned int nHeaderStart;
-    unsigned int nMessageStart;
+    int64 nHeaderStart;
+    int64 nMessageStart;
     CAddress addr;
     int nVersion;
     std::string strSubVer;
@@ -278,6 +278,11 @@
         ENTER_CRITICAL_SECTION(cs_vSend);
         if (nHeaderStart != -1)
             AbortMessage();
+        if (vSend.size() >= SendBufferSize())
+        {
+            printf("Overran send buffer! Abort!\n");
+            AbortMessage();
+        }
         nHeaderStart = vSend.size();
         vSend << CMessageHeader(pszCommand, 0);
         nMessageStart = vSend.size();
@@ -289,7 +294,7 @@
 
     void AbortMessage()
     {
-        if (nHeaderStart == -1)
+        if (nHeaderStart < 0)
             return;
         vSend.resize(nHeaderStart);
         nHeaderStart = -1;
@@ -309,9 +314,14 @@
             return;
         }
 
-        if (nHeaderStart == -1)
+        if (nHeaderStart < 0)
             return;
 
+        // XXX Debug: Check for 'Size wedge' early, before we call Hash()
+        if ((vSend.end() - (vSend.begin() + nMessageStart)) >= 0x100000000) {
+            printf("XXX Debug: Size wedge. Break.\n");
+        }
+
         // Set the size
         unsigned int nSize = vSend.size() - nMessageStart;
         memcpy((char*)&vSend[nHeaderStart] + offsetof(CMessageHeader, nMessageSize), &nSize, sizeof(nSize));
@@ -337,7 +347,7 @@
 
     void EndMessageAbortIfEmpty()
     {
-        if (nHeaderStart == -1)
+        if (nHeaderStart < 0)
             return;
         int nSize = vSend.size() - nMessageStart;
         if (nSize > 0)

--------------------------------------------------------------------------

Test 4: Send 49999 'getdata for block' commands to the node, see how it reacts and if it recovers.

Host, Port, File = 127.0.0.1, 8333, snap_49999.txt

mod6@localhost ~ $ ./wedger.py 127.0.0.1 8333 snap_49999.txt
Alive: V=99999 (/therealbitcoin.org:0.9.99.99/) Jumpers=0x1 (TRB-Compat.) Return Addr=1.2.3.4:8333 Blocks=619147
Sending 1799991-byte message packet...
Now listening for replies (Ctl-C to quit...)
Violated BTC Protocol: Invalid payload length!
....

In the debug.log after all of the request spam:

mod6@localhost ~ $ grep "Size wedge" ~/.bitcoin/debug.log | wc -l
0

Lots of these:

received getdata for: block 000000000000000000342b4b65f2b73f350b4919242e703d81407678ac1d0f04
Overran send buffer! Abort!
02/27/20 02:53:59 sending: block Size large: 678745 (0x7f622727e17b, 0x7f6227323cd4) (678745 bytes)
received getdata for: block 000000000000000000109a2f982ce3dc7d9630e0a17dc25e420563c103a3ac40
Overran send buffer! Abort!
02/27/20 02:53:59 sending: block Size large: 315159 (0x7f6227323cec, 0x7f6227370c03) (315159 bytes)

Reinspect code, think through if we need these changes in net.h and if and where we should put a check for SendBufferSize().

-----------------------------------------------------------------------

Debugging section:

--------------------------------------------------------------------------
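
Side note on Test 4: the 1799991-byte packet size reported by wedger.py is consistent with a single 'getdata' message carrying 49999 block inventory entries, since 24 (message header) + 3 (count varint) + 49999 * 36 (type + hash per entry) = 1799991. The sketch below rebuilds such a message to check that arithmetic. It is only an illustration under assumptions: it is not wedger.py's actual code (which is not shown in this log), the function names `compact_size` and `build_getdata` are hypothetical, and the all-zero hashes are placeholders rather than the contents of snap_49999.txt.

```python
import hashlib
import struct

MAGIC_MAINNET = bytes.fromhex("f9beb4d9")  # mainnet message-start bytes
MSG_BLOCK = 2                              # inventory type: block


def compact_size(n):
    """Encode n as a Bitcoin variable-length integer (hypothetical helper)."""
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def build_getdata(block_hashes):
    """Build one 'getdata' message requesting the given blocks.

    block_hashes: iterable of 32-byte hashes in wire (little-endian) order.
    """
    entries = []
    for h in block_hashes:
        if len(h) != 32:
            raise ValueError("each hash must be exactly 32 bytes")
        # Each inventory entry is a 4-byte type followed by the 32-byte hash.
        entries.append(struct.pack("<I", MSG_BLOCK) + h)

    payload = compact_size(len(entries)) + b"".join(entries)

    # Header: magic (4) + command (12, NUL-padded) + length (4) + checksum (4).
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    header = (
        MAGIC_MAINNET
        + b"getdata".ljust(12, b"\x00")
        + struct.pack("<I", len(payload))
        + checksum
    )
    return header + payload


if __name__ == "__main__":
    # Placeholder hashes only; a real run would use actual block hashes.
    message = build_getdata([bytes(32)] * 49999)
    print(len(message))
```

Note that the 4-byte length field in the header is the same 32-bit quantity that the 'Size wedge' debug check in the patch guards on the sending side.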