Show Posts

Messages - Phluxed

Pages: [1]
1
Hi Vince - thanks for the idea. We're looking at that now; however, the challenge is that it is VoIP traffic over the WAN. We are also having no call quality issues whatsoever; the call just drops instantly when the phone flips over.

We can't really apply QoS beyond our modem.

2
The heartbeat to the main MBG is lost: 3 beats are missed. We can see the applications loading because the phone fails over to the secondary site; that shows up in the packet captures. The whole time the phone is retransmitting the packet (we cannot figure out why this is happening, which is the original issue at hand), the UDP traffic is still solid. So when the TCP stream breaks, the phone doesn't start a new TCP stream to the same MBG, even though the 2 devices are talking a-o-k over the network and internet all the while.

I guess what I'm wondering is: even if it does fail over in this instance, if the configurations were the same at each datacenter, would the phone still drop the call from the original site's MBG? It doesn't seem to drop the call while the heartbeats aren't getting through; it only does so the instant the phone flips to the second site.

As for the network issue - it appears to happen at probably 10 other sites with different internet connections and router devices. Aruba switches are the constant, in all flavours: S1500, S2500, S3500.

3
Hey Ralph - interesting thought on the config being different per site.

If it's failing over to the secondary site, should the phone be dropping a call if the software is the same on both sides? That would explain why the phone doesn't seem all that resilient to missed heartbeats and immediately drops the call.

We're not sure whether they are clustered in each datacenter they're hosted in.

4
Hi All,

Thanks for taking the time to take a quick read here - any feedback or insight is appreciated.

We are experiencing an issue where a random subset of 5330 and 5320 phones connected to a MiCloud provider reboot at the :01 minute mark of every hour.

In the office in question, let's say there are 200 phones. This appears to happen at multiple offices, but I will focus on the biggest site that is experiencing it. It is also the most frequently impacted, as it is where the call centers are located and the users are regularly on the phones. The phones are connected to a set of stacked Aruba MAS3500 switches and connect to the provider over the public internet (Rogers in Canada, in this case). The bandwidth is 250/20.

We've done some port mirroring and have found that the phones are polled regularly by the controller to send a heartbeat. The heartbeat that is sent on the hour sometimes fails to make it to the MBG - sometimes for 1 phone, sometimes for 10 phones. The behaviour is ALWAYS seen at the :01 minute and 12 second mark. This is because the heartbeat timeout is set to 24 seconds and the phone decides it is in DR mode after 3 missed heartbeats (3 x 24 = 72 seconds). We see the UDP traffic continuously flowing out of the network; it only stops the instant the TCP stream for the phone fails at the primary site and flips to the secondary site that is programmed into the phone.
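
The timing above can be checked with a little arithmetic. This is only a minimal sketch, assuming the missed heartbeat is the one sent exactly on the hour and that the three 24-second timeouts run back to back; the function name and the sample date are made up for illustration.

```python
from datetime import datetime, timedelta

HEARTBEAT_TIMEOUT_S = 24   # per-heartbeat timeout, as described in the post
MISSED_BEATS_FOR_DR = 3    # missed heartbeats before the phone enters DR mode


def failover_time(first_missed: datetime) -> datetime:
    """Return when the phone would give up on the primary MBG.

    Assumption: the timeouts run back to back, starting from the first
    heartbeat that goes unanswered.
    """
    total = HEARTBEAT_TIMEOUT_S * MISSED_BEATS_FOR_DR  # 72 seconds
    return first_missed + timedelta(seconds=total)


# Hypothetical example: the heartbeat sent at 14:00:00 is never ACKed.
top_of_hour = datetime(2024, 1, 1, 14, 0, 0)
print(failover_time(top_of_hour).strftime("%H:%M:%S"))  # 14:01:12
```

That lines up with the drops always landing at :01:12, which suggests the heartbeat being lost is the one sent right at the top of the hour.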

We have done a number of packet captures here (which I unfortunately cannot entirely share) on our WAN interface and see that the heartbeat TCP packet is sent outbound but never gets an ACK, so the phone retransmits it a few times and eventually gives up and fails over (after 72 seconds).
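
For anyone wanting to look for the same pattern in their own captures, here is a rough sketch of the check described above: find outbound segments that were sent more than once and never ACKed. It assumes the capture has already been exported into simple records; the Segment fields, the stream names, and the sample timings are all hypothetical and not taken from our actual captures.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    time: float   # seconds since the start of the capture
    stream: str   # hypothetical stream label, e.g. one per phone
    seq: int      # TCP sequence number of the outbound segment
    acked: bool   # True if a matching ACK was seen in the capture


def unacked_retransmissions(segments):
    """Map (stream, seq) -> (times sent, seconds between first and last send)
    for segments sent more than once and never ACKed."""
    sends = defaultdict(list)
    acked = set()
    for seg in segments:
        key = (seg.stream, seg.seq)
        sends[key].append(seg.time)
        if seg.acked:
            acked.add(key)
    return {
        key: (len(times), max(times) - min(times))
        for key, times in sends.items()
        if len(times) > 1 and key not in acked
    }


# Made-up sample data: one phone keeps resending, another gets its ACK.
sample = [
    Segment(0.0, "phone-17", 1000, False),
    Segment(1.0, "phone-17", 1000, False),
    Segment(3.0, "phone-17", 1000, False),
    Segment(0.5, "phone-18", 500, True),
]
print(unacked_retransmissions(sample))
```

With the sample data only phone-17 is reported, since its segment was sent three times without an ACK.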

Is anyone aware of any sort of automated job that may cause this? Is there something in the switching world that may cause it?
