
aruba/sdwan-edgeconnect-performance-monitoring


Edge Connect Monitoring

This project offers tools to monitor the performance of Edge Connect appliances, either through the GMS environment or directly from the appliances. It supports monitoring CPU usage, packet drops by reason codes, and more. The system collects consecutive samples at regular intervals and logs alerts to a syslog server.
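The "consecutive samples" idea can be sketched as a small streak counter. This is a minimal illustration, not the project's actual implementation: the class name `ConsecutiveBreachTracker`, its fields, and the choice of "at or above the threshold" as the alert condition are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveBreachTracker:
    """Hypothetical helper: signal once a metric stays at or above a
    threshold for a required number of samples in a row."""

    threshold: float
    required_samples: int
    streak: int = 0

    def update(self, value: float) -> bool:
        """Record one sample; return True while the streak is long enough."""
        if value >= self.threshold:
            self.streak += 1
        else:
            # A single sample under the threshold resets the streak.
            self.streak = 0
        return self.streak >= self.required_samples


tracker = ConsecutiveBreachTracker(threshold=60, required_samples=3)
results = [tracker.update(sample) for sample in [70, 65, 40, 61, 62, 63]]
# results == [False, False, False, False, False, True]
```

Requiring several samples in a row, rather than alerting on a single reading, keeps short spikes from generating syslog noise.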

Project Structure

orch-monitoring/
├── main.py              # Entry point for the monitoring application
├── config.ini           # Configuration file; see the documentation inside config.ini for details
├── requirements.txt     # Python dependencies
└── monitor/
    ├── appliance_helper.py
    ├── edge_connect_monitor.py
    ├── gms_helper.py
    ├── gms_monitor.py
    └── utils.py
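As a rough illustration of how a script might read an INI file such as config.ini, here is a sketch using Python's standard `configparser`. The section names (`orchestrator`, `monitoring`) and every key shown are hypothetical placeholders; the real names are documented in config.ini itself.

```python
import configparser

# Hypothetical sample content; the real config.ini defines its own keys.
SAMPLE_CONFIG = """
[orchestrator]
url = https://orchestrator.example.com

[monitoring]
cpu_threshold = 60
interval_seconds = 10
consecutive_samples = 3
"""


def load_settings(text: str) -> dict:
    """Parse INI text and return typed monitoring settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    monitoring = parser["monitoring"]
    return {
        "url": parser.get("orchestrator", "url"),
        "cpu_threshold": monitoring.getfloat("cpu_threshold"),
        "interval_seconds": monitoring.getint("interval_seconds"),
        "consecutive_samples": monitoring.getint("consecutive_samples"),
    }


settings = load_settings(SAMPLE_CONFIG)
```

Converting values with `getint` and `getfloat` at load time surfaces a malformed setting immediately instead of in the middle of a monitoring run.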

If you authenticate with an API key, the Orchestrator API key must have read-write permission.

Setup and how to run the script

Note: main.py creates the virtual environment and installs the dependencies itself, so to run the script you only need steps 1, 5, 6, and 7. Steps 2, 3, and 4 describe the manual setup, in case you prefer to do it yourself.
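For readers curious what an automatic setup of this kind can look like, the following sketch creates a virtual environment and installs requirements using only the standard library. It is not the code in main.py; the function names `venv_python` and `bootstrap` are hypothetical.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def bootstrap(venv_dir: Path, requirements: Path) -> Path:
    """Create the venv if it is missing, then install the requirements."""
    python = venv_python(venv_dir)
    if not python.exists():
        # with_pip=True makes pip available inside the new environment.
        venv.EnvBuilder(with_pip=True).create(venv_dir)
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", str(requirements)],
        check=True,  # raise CalledProcessError if pip fails
    )
    return python
```

A caller would invoke something like `bootstrap(Path("orch_monitoring/venv"), Path("requirements.txt"))` and then relaunch the application with the returned interpreter.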

Ensure you have the required root permissions to create the venv.

Manual Virtual Environment Creation

  1. Clone the repository
     Download or clone the repository to your local machine.

  2. Create a Python virtual environment
     If you do this manually, comment out the lines in main.py that create the venv and install the dependencies. Open a terminal and run:

         python -m venv orch_monitoring/venv

     This creates a virtual environment in orch_monitoring/venv.

  3. Activate the virtual environment
     On Linux/macOS:

         source orch_monitoring/venv/bin/activate

     On Windows:

         orch_monitoring\venv\Scripts\activate

  4. Install dependencies
     With the virtual environment activated, run:

         pip install -r requirements.txt

  5. Configure your settings
     Edit config.ini to match your environment, credentials, and other monitoring settings.

  6. Run the main application

         python3 main.py

     To also write a local log file, run:

         python3 main.py --logfile=<file_name>

     This starts the monitoring application.

  7. Check syslog
     Log in to the syslog server (Linux). Ensure that port 514 is allowed in the firewall, enable the lines module(load="imudp") and input(type="imudp" port="514") in /etc/rsyslog.conf, and restart the service. Ensure that the syslog service is running on the server, for example:

         systemctl restart rsyslog
         systemctl status rsyslog

     Then follow the log:

         tail -f /var/log/syslog or tail -f /var/log/boot.log

     This displays the latest log entries in real time, for example:

         Jul 30 08:46:00 <client_ip> Orch-monitor: NOTICE hostname: CPU core 2 consecutively below threshold 60 for 30 seconds.
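The syslog and --logfile behavior described in the steps above can be approximated with Python's standard `logging` module. This is a sketch under assumptions, not the project's code: the logger name, the helper names `format_alert` and `build_logger`, and the formatter strings are hypothetical and only mimic the shape of the sample log line.

```python
import logging
import logging.handlers
from typing import Optional


def format_alert(hostname: str, core: int, threshold: int, seconds: int) -> str:
    """Build a message shaped like the sample syslog entry."""
    return (
        f"NOTICE {hostname}: CPU core {core} consecutively below "
        f"threshold {threshold} for {seconds} seconds."
    )


def build_logger(syslog_host: str, syslog_port: int = 514,
                 logfile: Optional[str] = None) -> logging.Logger:
    """Return a logger that sends to syslog over UDP and optionally to a file."""
    logger = logging.getLogger("orch-monitor-sketch")
    logger.setLevel(logging.INFO)
    # Drop handlers left over from an earlier call so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    # SysLogHandler uses UDP by default when given a (host, port) address.
    syslog = logging.handlers.SysLogHandler(address=(syslog_host, syslog_port))
    syslog.setFormatter(logging.Formatter("Orch-monitor: %(message)s"))
    logger.addHandler(syslog)
    if logfile:
        file_handler = logging.FileHandler(logfile)
        file_handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(file_handler)
    return logger
```

A call such as `build_logger("192.0.2.10").info(format_alert("hostname", 2, 60, 30))` would send one datagram to port 514 on that (placeholder) server. UDP delivery is not acknowledged, so watching the server side with tail, as in step 7, is the way to confirm that messages arrive.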

System Requirements

For up to 10K parallel requests:

  • Operating System: Linux (Ubuntu 20.04+), macOS 12+, or Windows 10/11
  • CPU: 2 CPUs
  • Memory: 4 GB RAM and 20 GB free hard disk space
  • Python: Version 3.8 or higher
  • Network: Access to Edge Connect appliances and GMS environments, plus internet access to install dependencies
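Some of these requirements can be checked from Python before starting. The sketch below is a hypothetical preflight helper, not part of the project. It covers the Python version, CPU count, and free disk space only, because the standard library has no portable way to read total RAM.

```python
import os
import shutil
import sys

MIN_PYTHON = (3, 8)
MIN_CPUS = 2
MIN_FREE_DISK_BYTES = 20 * 1024**3  # 20 GB, counted as GiB here


def check_requirements(python_version, cpu_count, free_disk_bytes):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.8 or higher is required")
    # os.cpu_count() may return None when the count cannot be determined.
    if cpu_count is not None and cpu_count < MIN_CPUS:
        problems.append("at least 2 CPUs are required")
    if free_disk_bytes < MIN_FREE_DISK_BYTES:
        problems.append("at least 20 GB of free disk space is required")
    return problems


def check_this_host(path="."):
    """Run the checks against the machine executing this script."""
    return check_requirements(
        sys.version_info, os.cpu_count(), shutil.disk_usage(path).free
    )
```

Keeping `check_requirements` free of system calls makes it easy to test with made-up values, while `check_this_host` supplies the real ones.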

One-Line Explanations of Packet Drop Reason Codes

  1. LAN Rx q full --> Datapath inbound processing queue drop – the inbound packet rate is higher than the datapath packet processing capacity.
  2. LAN Tx fifo full --> Deprecated.
  3. LAN Tx zci fifo full --> Deprecated: PKT_DBG_LAN_TX_ZCI_FIFO_FULL_DEPRECATED
  4. WAN Rx q full --> Inbound fabric tunnel processing queue drop – the inbound packet rate over the fabric tunnel is higher than the packet processing capacity.
  5. WAN Tx zci fifo full --> Deprecated: PKT_DBG_WAN_TX_ZCI_FIFO_FULL_DEPRECATED
  6. ZCI TX fifo full --> Datapath outbound packet drop in the kernel – the kernel is experiencing an internal error.
  7. ZCI rx fifo overflow --> ZCI internal error – inbound packet receive queue drop.
  8. Datapath to ZCI TX LBMQ full --> Datapath to ZCI tx queue drop – ZCI is over capacity and unable to process data packets from the datapath for transmission.
  9. RX inner lbmq full --> Inbound packet from the fabric tunnel is dropped by ZCI – ZCI is over capacity and unable to process data packets received from the fabric tunnel.
  10. QOS packet expired (egress) --> Packet was in the queue longer than “max wait time” – these are packets waiting to be transmitted.
  11. QOS packet expired (ingress) --> Packet was in the queue longer than “max wait time” – these are inbound shaped packets.
  12. WAN Rx decoal q full --> Decoalescing queue full.
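When turning drop counters into alerts, a lookup table like the one below can attach these explanations to log messages. It is an illustrative sketch: the keys copy the names from the list above, but whether the appliance reports exactly these strings is an assumption, and `describe` and `is_deprecated` are hypothetical helpers.

```python
# Short explanations keyed by reason name, condensed from the list above.
REASON_CODES = {
    "LAN Rx q full": "Datapath inbound processing queue drop",
    "LAN Tx fifo full": "Deprecated",
    "LAN Tx zci fifo full": "Deprecated",
    "WAN Rx q full": "Inbound fabric tunnel processing queue drop",
    "WAN Tx zci fifo full": "Deprecated",
    "ZCI TX fifo full": "Datapath outbound packet drop in kernel",
    "ZCI rx fifo overflow": "ZCI internal error, inbound receive queue drop",
    "Datapath to ZCI TX LBMQ full": "Datapath to ZCI tx queue drop",
    "RX inner lbmq full": "Inbound fabric tunnel packet dropped by ZCI",
    "QOS packet expired (egress)": "Queued longer than max wait time (egress)",
    "QOS packet expired (ingress)": "Queued longer than max wait time (ingress)",
    "WAN Rx decoal q full": "Decoalescing queue full",
}


def describe(reason: str) -> str:
    """Return the explanation for a reason name, or a fallback if unknown."""
    return REASON_CODES.get(reason, "Unknown reason code")


def is_deprecated(reason: str) -> bool:
    """True when the reason is listed as deprecated."""
    return REASON_CODES.get(reason) == "Deprecated"


active_reasons = [name for name in REASON_CODES if not is_deprecated(name)]
```

Filtering out the deprecated entries, as `active_reasons` does, leaves the nine reasons that are still worth alerting on.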

Contact

For questions or support, contact the Automation_SP_Team [email protected].

About

EdgeConnect scripts for performance monitoring
