Topic: Fail back phones from the resilient controller

Offline VinceWhirlwind

  • Hero Member
  • *****
  • Posts: 899
  • Country: au
  • Karma: +31/-0
Fail back phones from the resilient controller
« on: September 05, 2017, 10:50:15 PM »
I have a cluster.
My handsets and extensions are resilient to a second controller.
I have a site that had a network outage on its primary link, and the failover took long enough that the phones decided to fail over to their secondary controller.

How do I now fail back individual extensions from their resilient controller? HST CANCEL HANDOFF does nothing.


Offline Ronan

  • Full Member
  • ***
  • Posts: 149
  • Country: fr
  • Karma: +1/-0
Re: Fail back phones from the resilient controller
« Reply #1 on: September 06, 2017, 04:56:05 AM »
That should be automatic.

With the command hsTrace 1 on both controllers (using PuTTY, not in the maintenance commands) you can see what's going on. The secondary controller should be checking the status of the primary. After a few exchanges the secondary should start sending the phones back, a small group at a time.
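The "small groups at a time" behaviour can be pictured as plain batching. This is a minimal Python sketch of the idea only, not the controller's actual logic; the group size of 4, the function name and the DN list are made-up illustration values.

```python
from typing import Iterator, List, Sequence


def in_small_groups(dns: Sequence[str], group_size: int = 4) -> Iterator[List[str]]:
    """Yield the failed-over DNs a few at a time, in their original order."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    for start in range(0, len(dns), group_size):
        yield list(dns[start:start + group_size])


# Hypothetical DNs currently registered on the secondary controller.
failed_over = ["2001", "2002", "2003", "2004", "2005", "2006"]
for group in in_small_groups(failed_over):
    print("send back:", group)
```

With hsTrace 1 on, you would expect to see the phones returning in waves like this rather than all at once.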

Once you have the traces activated you can try these commands to force things:

hsShowIcps              show a list of handoff ICPs
hsShowIcp index         show a given ICP

hsShowSets              show all resilient devices
hsShowSet index         show a given device

hsTrace 1/0             turn on/off handoff process trace on console

                              Case 1: Safely Failover:
                              ------------------------
hsHandoff               do at 1st-ICP: safely handoff ALL DNs to 2nd-ICP
hsTakeback              do at 1st-ICP: undo hsHandoff, safely rehome all phones

                              Case 2: Forced Failover:
                              ------------------------
hsPushAll               do at 1st-ICP: push ALL phones to 2nd-ICP
hsPushOne  "dn"         do at 1st-ICP: push ONE phone given DN to 2nd-ICP
hsTakeback              do at 1st-ICP: undo above both Push commands, safely rehome phones

                              Case 3: Forced Failback:
                              ------------------------
hsPushback              do at 2nd-ICP: forced-failback ALL phones to 1st-ICP

                              Case 4: Repair Failures:
                              ------------------------
hsRun                   do at ANY-ICP: safely clear/repair handoff troubles!
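Read as a lookup table, the help text above comes down to which controller you type each command on. Here is a small Python sketch of that table, transcribed from the listing and nothing more; the dictionary layout and the function name are just an illustration, and nothing here talks to a controller.

```python
from typing import List

# command -> (controller to run it on, what the help text says it does)
COMMANDS = {
    "hsHandoff":  ("primary",   "safely hand off all DNs to the secondary"),
    "hsTakeback": ("primary",   "undo hsHandoff/hsPushAll/hsPushOne, rehome phones"),
    "hsPushAll":  ("primary",   "force all phones to the secondary"),
    "hsPushOne":  ("primary",   "force one phone (by DN) to the secondary"),
    "hsPushback": ("secondary", "force all phones back to the primary"),
    "hsRun":      ("any",       "clear/repair handoff troubles"),
}


def commands_for(controller: str) -> List[str]:
    """Return the commands the help text says to run on the given controller."""
    return sorted(name for name, (where, _) in COMMANDS.items()
                  if where in (controller, "any"))


print(commands_for("secondary"))  # ['hsPushback', 'hsRun']
```

Note that the only per-DN command in this listing, hsPushOne, pushes towards the secondary. The listing shows no per-extension failback, only hsPushback for all phones.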

Offline eugenej

  • Full Member
  • ***
  • Posts: 94
  • Country: 00
  • Karma: +2/-0
Re: Fail back phones from the resilient controller
« Reply #2 on: September 14, 2017, 02:58:07 PM »
Quote from: VinceWhirlwind on September 05, 2017, 10:50:15 PM
I have a cluster.
My handsets and extensions are resilient to a second controller.
I have a site that had a network outage on its primary link, and the failover took long enough that the phones decided to fail over to their secondary controller.

How do I now fail back individual extensions from their resilient controller? HST CANCEL HANDOFF does nothing.

I think the HST command only works when you initiated the failover manually in the first place.
To answer your question though, the phones will fail back automatically after a pre-set period, normally 5 minutes (I think): the heartbeat is checked every 60 seconds and has to pass 5 consecutive times.
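The timing described here (5 checks at 60-second intervals, so roughly 5 minutes) amounts to a consecutive-success counter. The Python sketch below shows that idea, taking the figures from the recollection above. That a single missed heartbeat resets the count is an assumption, and the function name is invented for illustration.

```python
from typing import Iterable


def ready_to_fail_back(heartbeats: Iterable[bool], required: int = 5) -> bool:
    """Return True once `required` heartbeats in a row have succeeded."""
    streak = 0
    for ok in heartbeats:
        streak = streak + 1 if ok else 0  # assumption: any miss resets the count
        if streak >= required:
            return True
    return False


print(ready_to_fail_back([True] * 5))                       # True
print(ready_to_fail_back([True, True, False, True, True]))  # False
```

If the logic really works this way, a link between the controllers that drops even one heartbeat in every five would never reach the threshold, and the phones would stay on the secondary.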


Offline VinceWhirlwind

Re: Fail back phones from the resilient controller
« Reply #3 on: September 18, 2017, 03:47:14 AM »
Yeah, except they aren't failing back, which is why I want to force them.

Offline eugenej

Re: Fail back phones from the resilient controller
« Reply #4 on: September 19, 2017, 08:26:02 AM »
Quote from: VinceWhirlwind on September 18, 2017, 03:47:14 AM
Yeah, except they aren't failing back, which is why I want to force them.

Is it possible that the connectivity between your two systems isn't 100%? The secondary ICP/MiVB will poll the primary, and only if it's happy will it instruct the phones to redirect back to their primary system.

Offline lundah

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1216
  • Country: us
  • Karma: +66/-0
  • Senior Chief Grunt
Re: Fail back phones from the resilient controller
« Reply #5 on: September 21, 2017, 10:21:04 AM »
If they aren't failing back, open a session to the secondary controller's console (either via the serial port or by using PuTTY to connect in RAW mode to port 2002) and run the hsPushback command.
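If you would rather script it than type it into PuTTY, a RAW-mode session is just a plain TCP connection, so something like the Python sketch below may work. Treat it as a rough sketch: the function name is made up, the CRLF line ending is an assumption, and so is the absence of a login prompt. Try the command by hand first.

```python
import socket


def send_console_command(host: str, command: str, port: int = 2002,
                         timeout: float = 5.0) -> str:
    """Open a raw TCP session, send one command line, return any text that comes back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Assumption: the console accepts a CRLF-terminated line.
        sock.sendall(command.encode("ascii") + b"\r\n")
        try:
            while True:
                data = sock.recv(4096)
                if not data:  # the far end closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # console stayed open but went quiet; stop reading
    return b"".join(chunks).decode("ascii", errors="replace")


# Example (192.0.2.10 is a placeholder for the secondary controller's address):
# print(send_console_command("192.0.2.10", "hsPushback"))
```

Because the console normally stays open, the function returns once nothing has arrived for `timeout` seconds, so allow for that delay.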

Offline VinceWhirlwind

Re: Fail back phones from the resilient controller
« Reply #6 on: September 28, 2017, 09:28:31 PM »
OK, thanks for that, I will try it. However, the issue is quite weird.
It *looks* like a connectivity issue - it looks like the failed-over handset can't see the primary controller to fail back to.
Additionally, this handset on the resilient controller cannot dial handsets on the primary *if those extensions were recently dialled*.
If I dial from the failed-over handset to a handset on the primary that I haven't previously dialled, it works fine.
This happens to a small number of handsets whenever a group of handsets on a site loses its connection to the primary. Most come back fine. Some need a reboot.

Ultimately, what this looks like is a networking issue on the handset. The handset may be holding onto routes that are no longer valid. When the secondary controller gives the dialling handset the IP address of the dialled handset, the dialling handset can't reach handsets whose IP addresses it already knows from earlier calls.
I would have thought it would just use the default route, so that calls either all succeed or all fail. I need to see the routing table on the handset. I hope that's possible; I will investigate further.
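For what it's worth, a stale per-host route would explain why only the previously dialled handsets fail: under longest-prefix matching, a host route beats the default route. The Python sketch below shows that rule with made-up addresses from the documentation ranges. Whether the handset really keeps such routes is still only the hypothesis above.

```python
import ipaddress


def pick_route(dest, routes):
    """Longest-prefix match: the most specific route containing dest wins."""
    addr = ipaddress.ip_address(dest)
    matches = [(ipaddress.ip_network(net), gateway)
               for net, gateway in routes
               if addr in ipaddress.ip_network(net)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


routes = [
    ("0.0.0.0/0", "192.0.2.1"),           # default route via the working gateway
    ("198.51.100.20/32", "192.0.2.254"),  # hypothetical stale host route from an earlier call
]
print(pick_route("198.51.100.20", routes))  # 192.0.2.254 (stale path, call fails)
print(pick_route("198.51.100.21", routes))  # 192.0.2.1 (default route, call works)
```

A handset that had never been dialled has no host entry, so it falls through to the default route, which matches the "works if I haven't dialled it before" symptom.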

Offline eugenej

Re: Fail back phones from the resilient controller
« Reply #7 on: October 31, 2017, 05:38:49 AM »

Quote from: VinceWhirlwind on September 28, 2017, 09:28:31 PM
It *looks* like a connectivity issue - it looks like the failed over handset can't see the primary controller to fail back to.

Just bear in mind that the phone does not poll the primary system. The secondary system does that for the phone and, once it is happy, instructs the phone to fail back. As far as I recall, the phone does not make this decision. I haven't been an onsite tech for many years, so this is from memory.





 
