Hi everyone,
I'm looking for ideas on possible causes of an issue that has already been fixed.
We got reports of random call drops and delays in one of our offices (Office "A"). The MXeIII logs showed a bunch of packet loss / loss of connectivity errors, like these:
2297 Info 2019/Jul/24 9:23:43 E2tSp "IP Network Dropped 2.82 percent of Packets from phone 10.221.41.18Total Rx: 13522 Lost=392 maxLossBurst=392 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2296 Info 2019/Jul/24 9:23:24 E2tSp "IP Network Dropped 2.88 percent of Packets from phone 10.221.41.96Total Rx: 13232 Lost=392 maxLossBurst=392 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2295 Info 2019/Jul/24 9:21:25 E2tSp "IP Network Dropped 8.31 percent of Packets from phone 10.221.41.26Total Rx: 4315 Lost=391 maxLossBurst=391 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2294 Info 2019/Jul/24 9:17:31 E2tSp "IP Network Dropped 2.02 percent of Packets from phone 10.221.41.57Total Rx: 18929 Lost=391 maxLossBurst=391 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2293 Info 2019/Jul/24 9:17:20 SIP Resiliency "All Resilient Devices associated with peer ICP CEID = 5, IP addr = 10.121.1.216, have been failed-back to their primary controller"
2292 Info 2019/Jul/24 9:16:23 SIP Resiliency Link 5-6 to peer IP address 10.121.1.216 has gone down. Attempting to bring link back up.
2291 Info 2019/Jul/24 9:05:47 E2tSp "IP Network Dropped 9.62 percent of Packets from phone 10.221.41.85Total Rx: 3446 Lost=367 maxLossBurst=367 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2290 Info 2019/Jul/24 9:02:03 E2tSp "IP Network Dropped 6.35 percent of Packets from phone 10.221.41.41Total Rx: 5400 Lost=366 maxLossBurst=366 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2289 Info 2019/Jul/24 9:01:24 E2tSp "IP Network Dropped 5.75 percent of Packets from phone 10.221.41.26Total Rx: 6013 Lost=367 maxLossBurst=367 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2288 Info 2019/Jul/24 8:48:41 E2tSp "IP Network Dropped 4.92 percent of Packets from phone 10.221.41.34Total Rx: 7494 Lost=388 maxLossBurst=388 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2287 Info 2019/Jul/24 8:32:16 E2tSp "IP Network Dropped 2.08 percent of Packets from phone 10.221.41.47Total Rx: 17425 Lost=370 maxLossBurst=370 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2286 Info 2019/Jul/24 8:25:28 E2tSp "IP Network Dropped 2.51 percent of Packets from phone 10.221.41.95Total Rx: 15120 Lost=389 maxLossBurst=389 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
2285 Info 2019/Jul/24 8:23:35 E2tSp "IP Network Dropped 4.31 percent of Packets from phone 10.221.41.85Total Rx: 8651 Lost=390 maxLossBurst=390 1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"
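For anyone wanting to sanity-check entries like these, here is a rough Python sketch that pulls the fields out of an E2tSp loss line. It is only an illustration: the regex is written against the lines pasted above, `parse_loss` is a made-up helper name, and the 20 ms packet interval is an assumption (the logs do not state the packetization), so the burst duration is an estimate at best. The reported percentage appears to be Lost / (Total Rx + Lost), and the sketch recomputes it on that basis.

```python
import re

# Regex written against the E2tSp lines pasted above; other software
# loads may format this message differently.
LOSS_RE = re.compile(
    r"Dropped (?P<pct>[\d.]+) percent of Packets from phone "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+)\s*Total Rx: (?P<rx>\d+) "
    r"Lost=(?P<lost>\d+) maxLossBurst=(?P<burst>\d+)"
)

# Assumption: 20 ms of audio per RTP packet. Not stated anywhere in the logs.
PACKET_MS = 20


def parse_loss(line):
    """Return the loss fields from one log line, or None if it is not a loss line."""
    m = LOSS_RE.search(line)
    if m is None:
        return None
    rx, lost, burst = int(m["rx"]), int(m["lost"]), int(m["burst"])
    return {
        "ip": m["ip"],
        "reported_pct": float(m["pct"]),
        # Recomputed as lost / (received + lost), in percent.
        "computed_pct": round(100 * lost / (rx + lost), 2),
        # True when every lost packet fell in one contiguous gap.
        "single_burst": burst == lost,
        # Estimated gap length under the PACKET_MS assumption.
        "burst_seconds": burst * PACKET_MS / 1000,
    }


sample = (
    '2297 Info 2019/Jul/24 9:23:43 E2tSp "IP Network Dropped 2.82 percent of '
    'Packets from phone 10.221.41.18Total Rx: 13522 Lost=392 maxLossBurst=392 '
    '1 in a row: 0 2 in a row: 0 3-5 in a row: 0 6-10 in a row: 0 11 or more: 1"'
)

if __name__ == "__main__":
    print(parse_loss(sample))
```

In every entry above, Lost equals maxLossBurst and the only non-zero bucket is "11 or more: 1", so each stream lost its packets in one contiguous gap of roughly 370 to 390 packets rather than as scattered drops.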
Obviously Mitel suspected a network issue.
Office "A" has an MXeIII running 8.0PR3, with a PRI where the inbound calls are received. It is clustered over an MPLS WAN to a vMCD (where phones register, no call flows here), and to 2 other MXeIIIs over the WAN as well (Office "B" and "C").
The problem with these logs is that the Office "A" MXe supposedly lost network connectivity to the vMCD, AND to phones in Office "A" (which are local, on the same VLAN), AND to Office "B" phones, AND to Office "C" phones.
Basically, the Office "A" MXe had problems talking to any IP device. We suspected that it was perhaps overwhelmed, but CPU/MEM usage was showing normal.
So after running a few traces and tests on the network, we determined the network was fine and proceeded to reboot the Office "A" MXeIII. Lo and behold, the packet loss messages stopped after the reboot, voice returned to normal, etc.
We've escalated to Mitel for further investigation, and we'll look into a software version upgrade, but has anyone else here seen this before? I've read a few posts here mentioning problems with the MIPS processors in older-gen MXeIIs that affected IP connectivity. Could this be the same thing?