This is the first in a series of (long overdue) posts about odd bugs and behavior I have run into with the Cisco Unified Border Element (CUBE), which is built into Cisco IOS. I will spare you all the details, but at a high level our environment looks like this:
- Cisco Unified Communications Manager (CUCM) – multisite deployment with centralized call processing and geographic diversity
- Contact Center – Cisco CVP including Call Studio, UCCE, Nuance ASR/TTS, Cisco Unified Presence Server (SIP Proxy)
- SIP Trunks with CUBE for Local/Long-Distance and Inbound Toll-Free
Recently, at work, we have had two separate incidents with our SIP service provider in which both their primary and secondary Acme Session Border Controller (SBC) clusters went into a “hung” state, and we were off the air from the outside telephone world’s perspective. Despite all the provisioning precautions of having two geographically diverse carrier SBCs, accessed over two geographically diverse MPLS transport circuits (used exclusively for SIP trunking) that route to two geographically diverse data centers with a dedicated CUBE router in each, we were still hosed. A quick packet capture on the CUBE’s external interface showed the provider’s SBCs responding with SIP 503 “Service Unavailable” to every outbound call attempt. Inbound callers heard an “All Circuits Busy” message, and no signaling was arriving at our CUBEs from the provider.
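To make the "503 on every call attempt" symptom concrete, here is a minimal sketch of tallying SIP response codes from the text of a capture. It is not the tooling we actually used. The function name `tally_sip_responses` and the sample lines (including the `sbc.example.invalid` host) are made up for illustration, and it assumes the capture has already been exported as plain-text SIP messages.

```python
import re
from collections import Counter

# Matches a SIP response status line, e.g. "SIP/2.0 503 Service Unavailable".
# Request lines such as "INVITE sip:... SIP/2.0" do not start with "SIP/2.0",
# so they are ignored.
STATUS_LINE = re.compile(r"^SIP/2\.0 (\d{3}) (.*)$")


def tally_sip_responses(lines):
    """Count SIP response status codes found in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_LINE.match(line.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample data, not taken from a real capture.
sample = [
    "INVITE sip:15555550100@sbc.example.invalid SIP/2.0",
    "SIP/2.0 100 Trying",
    "SIP/2.0 503 Service Unavailable",
    "SIP/2.0 503 Service Unavailable",
]

counts = tally_sip_responses(sample)
for code, n in sorted(counts.items()):
    print(code, n)
```

A tally dominated by 503 with no 180 or 200 responses would match what we saw: the far end was rejecting calls rather than silently dropping them.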