I have an S100 that keeps sending CPU overload emails, but there is no impact on calls.
Is there any way I could find out why the CPU overloads so frequently?
Thanks,
CPU overload is usually attributable to one or more of the following, all of which I have seen in the past:
1. You are using TLS and SRTP to encrypt the calls to and from the provider, and the number of concurrent calls is taxing the CPU, since it must perform both the encryption and the decryption. If the same security is used on the phones, that adds further load. Other IP-PBX manufacturers will state something along the lines of being able to accommodate X calls, reduced to Y when TLS/SRTP is used because of the extra load. Yeastar supports TLS and SRTP but makes no such mention of reduced call capacity. Regardless, the function does add CPU load; it may not tax the S-PBX by itself, but it can be a contributor when other PBX functions are in use.
2. Call Recording - adds load because the PBX has to take a voice stream encoded with a codec, convert it to a file format and then store the file on a device, whether local or external to the PBX. The more calls, the more conversions and transfers, and the more CPU load.
3. Codec transcoding - for example, the provider is using g711 but the phones are set to use g722, so the system has to transcode between the provider and the phones to deliver intelligible audio. A similar, though lighter, load arises when phones behind the NAT are not allowed to exchange RTP streams directly and the audio goes through the PBX, which must then handle the streams. Normally the PBX would set the call up and let the phones pass the voice between themselves, so the PBX does not have to handle it. This is why I limit the number of codecs and typically set the phones to the same codecs the provider supports. g722 may sound great, but not many providers support it and its audio bandwidth is wider than what the PSTN will carry, so it is generally only fully usable for internal calls. This contributes less load than the other items, but it is good practice to manage it.
4. My favorite - the PBX is exposed to the Internet, or sits behind a router/firewall while using a SIP provider, and the firewall (preferably the one on the router) is not set to filter traffic and allow only the desired IPs. The first issue is that while the PBX does have a firewall, the device is first and foremost a PBX. When it relies on its own firewall while directly exposed to the Internet, the PBX has to defend itself against attacks: it must see and handle every packet that comes its way and decide whether to accept, drop or reject it. Even if an IP ends up in the blacklist, the offending party has not necessarily stopped trying; it only means the PBX will not let that IP progress even should it stumble upon a valid extension, auth ID and password or login. If the firewall is set to reject, it tells a hacker that a device is responding, which may invite them to increase their efforts. If it is set to drop, the firewall blocks the packet without responding, so the hacker has no reason to think they reached a device. I never expose any IP-PBX directly to the Internet. Instead I put it behind a decent router/firewall where I can control what gets in by setting up allowed IPs and ports, so that the PBX does not have to spend its time fighting off attacks. You can test for this by temporarily disconnecting the Ethernet cable and watching whether the CPU load drops (use the system resource monitor). I am assuming the overload occurs when there are no active calls; if so, a drop in load once the cable is disconnected would point to network traffic as the cause.
5. User access to the system. If you have set up user accounts that are allowed to log in to the system, they may be a contributor.
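For the codec point in item 3, a quick way to reason about whether a provider/phone pairing forces transcoding is to compare the two codec lists. The sketch below is only an illustration: the function names and the codec lists are hypothetical, and it assumes the simplified rule that the PBX can pass audio through untouched whenever both sides share at least one codec.

```python
def shared_codecs(provider, phone):
    """Return codecs offered by both sides, in the provider's order."""
    phone_set = {c.lower() for c in phone}
    return [c for c in provider if c.lower() in phone_set]


def needs_transcoding(provider, phone):
    """True when the two sides have no codec in common (simplified rule)."""
    return not shared_codecs(provider, phone)


# Hypothetical example lists, not taken from any real provider or phone.
provider_codecs = ["g711u", "g711a"]
phone_codecs = ["g722"]
print(needs_transcoding(provider_codecs, phone_codecs))
```

Adding one of the provider's codecs to the phone list makes `needs_transcoding` return False, which is the situation item 3 recommends aiming for.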
Can you identify the specific cause? Maybe, but I suspect you would need to set up a syslog server, monitor the various modules and try to decipher the output. Since the CPU load is the sum of all modules and the tasks they perform, it may not be a single cause but a combination of several, and as things occur dynamically it could prove daunting to sort it all out. Perhaps Yeastar has a tool they can use to advise further on the matter. You would need to submit a ticket with the request.
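If you do collect the PBX's messages on a syslog server, a first pass at "which module is busiest" can be as simple as counting messages per program tag. This is a minimal sketch, assuming the messages arrive in the classic BSD syslog layout ("<PRI>Mon DD HH:MM:SS host tag[pid]: message"); whether the S100 uses exactly this layout, and the tag names in the sample, are assumptions for illustration only.

```python
import re
from collections import Counter

# Assumed layout: "<PRI>Mon DD HH:MM:SS host tag[pid]: message"
LINE_RE = re.compile(
    r"^(?:<\d+>)?"                 # optional priority value
    r"[A-Z][a-z]{2}\s+\d{1,2}\s"   # month and day
    r"\d{2}:\d{2}:\d{2}\s"         # time of day
    r"\S+\s"                       # host name
    r"(?P<tag>[^\s:\[]+)"          # program tag
    r"(?:\[\d+\])?:"               # optional pid, then a colon
)


def count_by_tag(lines):
    """Count syslog lines per program tag; unmatched lines go to 'unparsed'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        counts[match.group("tag") if match else "unparsed"] += 1
    return counts


# Hypothetical sample lines, invented for this example.
sample = [
    "<134>Jan  5 10:00:01 pbx firewall: rule reload",
    "<134>Jan  5 10:00:02 pbx firewall: rule reload",
    "<134>Jan  5 10:00:03 pbx asterisk[812]: registration ok",
]
for tag, n in count_by_tag(sample).most_common():
    print(tag, n)
```

A tag that dominates the count during an overload window is a reasonable place to start looking, though message volume is only a rough proxy for CPU use.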
Hi Larry,
Thanks for your detailed suggestions. Very helpful.
But my device is for intranet use only, with no remote extensions. There is no recording set up, and neither TLS nor SRTP is in use.
I agree that eliminating the causes one by one would be painful and take a lot of time. I think Yeastar should offer a tool or log that shows more detail.
So, the PBX is not using the Internet for any communications?
Still, what happens if you disconnect the Ethernet cable?
Yes, I will try it. In the meantime, I have submitted a ticket to Yeastar support to see how they respond.
Hi Riddle,
Sometimes a CPU overload issue is related to data flooding or to an application process hogging the CPU. You may log in to the device via PuTTY, then use a command such as "top" to check for any abnormal CPU utilization.
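If the shell you reach is a Linux one, overall CPU utilization can also be worked out from two snapshots of the first line of /proc/stat taken a few seconds apart. The sketch below is meant to run on a workstation with the two lines pasted in; whether the S100 shell exposes /proc/stat is an assumption, and the function names and sample numbers are hypothetical.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def cpu_busy_percent(before, after):
    """Percentage of non-idle time between two 'cpu' line snapshots."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    # Use the first eight counters (user .. steal); guest time is already
    # included in user time, so it is left out to avoid double counting.
    deltas = [b - a for a, b in zip(old, new)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)  # idle + iowait
    return 100.0 * (total - idle) / total


# Hypothetical snapshots, e.g. copied from two runs of: cat /proc/stat
first = "cpu  100 0 100 800 0 0 0 0"
second = "cpu  150 0 150 900 0 0 0 0"
print(cpu_busy_percent(first, second))
```

Taking snapshots with and without the Ethernet cable connected gives a simple number to compare for the disconnect test suggested earlier in the thread.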
Hi Graham, Thanks. I have already contacted Yeastar support, and it is still in the investigation stage.
It seems to be a firewall issue. Yeastar sent me a patch to fix it, and so far it looks fine. They said the firewall got stuck on some specific rules when parsing domains, namely the default rules with Yeastar domains, and that is what overloaded the CPU. Hopefully this is fixed. I will keep watching for a few more days.