Mitel Forums - The Unofficial Source

Mitel Forums => Mitel Software Applications => Topic started by: johnp on September 25, 2015, 05:55:08 PM

Title: Resiliency via MBG
Post by: johnp on September 25, 2015, 05:55:08 PM
I was wondering if anyone has set up a backup connection over the internet to an MBG for when the MPLS network is down. If so, how is yours set up?
Title: Re: Resiliency via MBG
Post by: ralph on September 26, 2015, 09:24:43 AM
I believe we did it with clustering. 
For example we have an MBG in NY and another in NJ.   
These are clustered.
We point the phones to one or the other.    The phones actually connect to the systems based on their cluster weighting.

Ralph

Title: Re: Resiliency via MBG
Post by: johnp on September 26, 2015, 01:08:14 PM
Certainly it needs to be a cluster of MBGs, given the way the runtime resiliency values get populated. Boot time is easy enough to handle via DHCP.

The current config I'm working on has a MiCollab in the LAN zone with 2 MBGs, one at the main site and another at the failover site, in the default zone. The main site has a vMCD and an MXe (with PRI), while the failover site has a CX with a second PRI.

My question is more about how best to set up the MBG cluster. I think it may require a third MBG in LAN mode, weighted the highest, to handle the initial connection, followed by the other 2 in either DMZ or server-gateway mode. My hope is to minimize any one-way audio issues that may occur.
Title: Re: Resiliency via MBG
Post by: dilkie on September 26, 2015, 04:41:35 PM
The way I've seen it done is for a CPE router to route the MBG's public internet address over MPLS if it's up, and to route that same address via the internet if MPLS is down. That way the same MBG is reachable over two routes. You can then cluster MBGs to provide resiliency for MBG downtime (upgrades, mostly).

Title: Re: Resiliency via MBG
Post by: dilkie on September 26, 2015, 04:47:59 PM
If you don't want to deploy CPE routers to do what I described above, there is a new feature in MBG that helps. Say you deploy a cluster of two MBGs, one accessible from MPLS but not the internet, and the other accessible from the internet but not MPLS, clustered on their datacenter LAN side. The feature, new in 9.0 (and available in 8.1 as an override), is called "ping before redirect". It tells the MBG to ask the set to ping an address to see if it's reachable. You need this because when the set can't reach its MPLS MBG, it fails over to the internet MBG, but the internet MBG is still connected to the MPLS MBG via the backend LAN and so will normally try to redirect the set back to the MPLS MBG (where the set belongs). This feature, when turned on, keeps the set hosted on the internet MBG until the set can successfully ping the MPLS MBG. It only works for MiNet sets, not SIP, but it works.
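To make the redirect behaviour described above concrete, here is a minimal Python sketch of the decision the internet-facing MBG makes for a MiNet set. It is only an illustrative model, not MBG code: the class and function names (`RedirectContext`, `should_redirect_home`) are hypothetical, and it assumes the only inputs are whether the feature is on and whether the set's ping to its MPLS MBG succeeded.

```python
from dataclasses import dataclass


@dataclass
class RedirectContext:
    """Hypothetical inputs for one MiNet set hosted on the internet MBG."""
    ping_before_redirect: bool  # the feature/override described in the post
    home_mbg_ping_ok: bool      # did the set's ping to its MPLS MBG succeed?


def should_redirect_home(ctx: RedirectContext) -> bool:
    """Return True if the set should be sent back to its MPLS (home) MBG."""
    if not ctx.ping_before_redirect:
        # Feature off: the internet MBG can still see the MPLS MBG over the
        # backend LAN, so it tries to redirect regardless of the set's path.
        return True
    # Feature on: redirect only once the set itself can reach the MPLS MBG.
    return ctx.home_mbg_ping_ok


# MPLS is down and the feature is on, so the set stays on the internet MBG.
print(should_redirect_home(RedirectContext(True, False)))
```

With the feature off, the same situation returns True, which matches the repeated redirect attempts the feature is meant to stop.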
Title: Re: Resiliency via MBG
Post by: johnp on September 26, 2015, 05:31:17 PM
Dilkie, I assume in your second scenario you point the phone to the LAN MBG as first choice. Do you think having 3, 1 LAN-only and 2 with internet access, would work?
Title: Re: Resiliency via MBG
Post by: dilkie on September 26, 2015, 11:18:42 PM
sure, you can have 4, 2 on the lan and 2 with internet if you want. you put them in different cluster zones and assign the sets to the lan zone.
Title: Re: Resiliency via MBG
Post by: johnp on September 27, 2015, 09:06:48 AM
Dilkie, can you elaborate on what the "ping before redirect" does? I assume it stops the attempt to reconnect when in failover until it receives a ping response.
Title: Re: Resiliency via MBG
Post by: johnp on September 28, 2015, 08:42:07 AM
I guess if I read your post fully there would be no need for an elaboration :-)
Title: Re: Resiliency via MBG
Post by: johnp on September 28, 2015, 02:15:44 PM
Dilkie, do you know how to set the override? Tech support hasn't been much help and it behaves just as you describe.
Title: Re: Resiliency via MBG
Post by: dilkie on September 28, 2015, 03:21:39 PM
sure, the tug.ini setting will end up as

[ping_before_redirect]
enabled=1

so in the overrides panel that translates to:

section: ping_before_redirect
parameter: enabled
content: 1

if they are running older set loads they may also want to set

section: ping_before_redirect
parameter: reboot_stuck_set
content: 1

to get around a bug in the older set f/w loads. a set that hits this bug won't respond to the request to ping and has to be rebooted (automatically, via a s/w command to the set). If you are running MBG 9.0 or later, don't set this!

there are a few other settings to fine-tune things, like "send X pings and consider Y responses to be okay to redirect" and the ability to set the ping timeout threshold, but those aren't generally needed. by default a single ping is sent and if there's a response within 800ms, we consider the other mbg reachable.
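The "send X pings and consider Y responses okay" rule can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the MBG implementation: the function and parameter names (`other_mbg_reachable`, `required_responses`, `timeout_ms`) are made up for the example and are not tug.ini parameter names. Only the single-ping default and the 800ms threshold come from the description above.

```python
from typing import Optional, Sequence

DEFAULT_TIMEOUT_MS = 800.0  # from the post: a response within 800ms counts


def other_mbg_reachable(
    rtts_ms: Sequence[Optional[float]],
    required_responses: int = 1,
    timeout_ms: float = DEFAULT_TIMEOUT_MS,
) -> bool:
    """Decide reachability from ping results reported by the set.

    rtts_ms holds one entry per ping sent: the round-trip time in
    milliseconds, or None if that ping got no response at all.
    """
    good = sum(1 for rtt in rtts_ms if rtt is not None and rtt <= timeout_ms)
    return good >= required_responses


# Default behaviour as described: one ping, one timely response is enough.
print(other_mbg_reachable([120.0]))
# Hypothetical tuning: 3 pings sent, at least 2 timely responses required.
print(other_mbg_reachable([100.0, None, 950.0], required_responses=2))
```

Treating a response at exactly 800ms as timely is a choice made for the sketch; the thread does not say how the boundary is handled.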
Title: Re: Resiliency via MBG
Post by: johnp on September 28, 2015, 03:43:37 PM
Looks like my remote 5340 gets version 6.3.0.11, which I think is fairly current. MBG is at 8.1.26.
Title: Re: Resiliency via MBG
Post by: dilkie on September 28, 2015, 03:54:16 PM
you have the fix then, that was the f/w version it showed up in. so you just need the one override to turn it on... I recommend you test it out to make sure it's doing what you expect.
Title: Re: Resiliency via MBG
Post by: johnp on September 28, 2015, 07:10:12 PM
Dilkie,

I did some testing with a remote set and it looks good: it no longer tries to connect to the LAN MBG every 30 seconds. I plan to do more tests onsite, but it looks promising. Thanks.
Title: Re: Resiliency via MBG
Post by: dilkie on September 28, 2015, 07:16:29 PM
you're welcome.
Title: Re: Resiliency via MBG
Post by: johnp on September 28, 2015, 07:31:48 PM
One other thing, where is that found in version 9? I don't have access to one and was looking at the install blade but didn't see where it was. I would also assume that some translations may need to be added.
Title: Re: Resiliency via MBG
Post by: johnp on September 29, 2015, 04:10:52 PM
Did some onsite testing, and with the override everything behaved as expected (desired). I was able to fail over and back between MBGs, and the best thing was that it didn't try a reconnect every 30 seconds, only when the path was restored.

Thanks again for the help.
Title: Re: Resiliency via MBG
Post by: johnp on October 01, 2015, 07:12:56 AM
Dilkie,

I did some more testing, and while my 5340 has been rock solid, I have issues with the 5320e and 5330e. From the logs, they say ping before redirect is enabled, attempt the ping, then: Ping request timer expired, carrying on with redirect-force

Title: Re: Resiliency via MBG
Post by: dilkie on October 01, 2015, 08:18:02 AM
are they running upgraded firmware? check the tug.log, the f/w boot and main version are printed there for each connected set.
Title: Re: Resiliency via MBG
Post by: johnp on October 01, 2015, 09:56:50 AM
I have used 6.3.0.11 but currently have them on 6.3.0.14. I saw some ping lockups at 6.3.0.11. I can revert or use the 6.3.0.12 loads from MBG 9. I have the versions from their MCD running now.
Title: Re: Resiliency via MBG
Post by: dilkie on October 01, 2015, 10:26:22 AM
not sure man, the fix should be in all the .11 versions, at least that's what I was told. I've asked again.
Title: Re: Resiliency via MBG
Post by: johnp on October 01, 2015, 10:33:25 AM
I'll revert back. What I gather from my testing is that the "Ping request timer expired" log entry is due to the set not reporting back in a timely fashion on whether its ping was a success or failure.

I would further add that this set did run fairly well when connected at my office. I currently have it connected at my house.
Title: Re: Resiliency via MBG
Post by: dilkie on October 01, 2015, 10:51:57 AM
that is correct... the set is asked to ping the other mbg's ip address with an 800ms timeout. we give the set 10 seconds to report back, and if mbg doesn't get a response in that time, the set is "stuck". like I said, it's a bug that was fixed. enabling the other option, "reboot_stuck_set", does clear the problem, at the cost of forcing the set through a reboot cycle.

note that mbg doesn't ask a set to ping or redirect if it's "busy", i.e. the set must not be in a call, and must not have been in a call or had a digit, handset or tone (ringing) operation performed in the past 15 seconds.
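The "busy" rule above can be modelled as a simple idle check. The Python below is a rough sketch under stated assumptions: `SetActivity` and `may_ping_or_redirect` are hypothetical names, timestamps are plain seconds, and all the listed operations (call, digit, handset, tone) are folded into a single "last activity" time. Only the 15-second window comes from the description.

```python
from dataclasses import dataclass

BUSY_WINDOW_S = 15.0  # from the post: no activity in the past 15 seconds


@dataclass
class SetActivity:
    """Hypothetical per-set state tracked by the sketch."""
    in_call: bool
    last_activity_s: float  # time of the last call/digit/handset/tone operation


def may_ping_or_redirect(activity: SetActivity, now_s: float) -> bool:
    """Return True if the set is idle enough to be asked to ping or redirect."""
    if activity.in_call:
        return False
    # Assumption: exactly 15 seconds of quiet counts as idle.
    return (now_s - activity.last_activity_s) >= BUSY_WINDOW_S


# A set that rang 5 seconds ago is left alone; one quiet for 20 seconds is not.
print(may_ping_or_redirect(SetActivity(False, last_activity_s=95.0), now_s=100.0))
print(may_ping_or_redirect(SetActivity(False, last_activity_s=80.0), now_s=100.0))
```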
Title: Re: Resiliency via MBG
Post by: johnp on October 01, 2015, 11:24:33 AM
Just an update: I have a static address on the 5320e instead of my home DHCP and it hasn't toggled. There may be something weird with the address it was given and my home equipment. I added a third set, a 5330e, to the mix to monitor.

It looks like it is now operating closer to what I saw in the office. I do have the reboot option running. The strange thing is that my 5340 is handled by DHCP and hasn't had any issue.
Title: Re: Resiliency via MBG
Post by: johnp on October 01, 2015, 03:44:03 PM
My 3 phones seem to be running fine. I'm going to use a different DHCP server and test further. Is there likely to be any adverse effect if I choose to use the firmware from the MCD? I have quite a few phones to pin and get their MAC addresses into the MiCollab. I don't really want to have to upgrade them and then downgrade if not necessary.
Title: Re: Resiliency via MBG
Post by: dilkie on October 01, 2015, 04:02:11 PM
whatever works...
Title: Re: Resiliency via MBG
Post by: johnp on October 01, 2015, 06:51:48 PM
Just want to say thanks again. Looks like this is going to work. I have the MiCollab border gateway set in the default zone, as most remotes will have this as first choice. I created 2 additional zones, one for each backup location. Default is backed up by location 1, and location 1 is backed up by location 2. Location 2 is backed up by location 1.

I see all 3 as available ICP sources, with location 1 as my current, default as the second and location 2 as the third. Being a teleworker, this is what I would expect. An internal set should be default, L1, then L2. I haven't verified this though.