Troubleshooting¶
This guide covers common issues and their solutions when running pybmpmon.
Quick Diagnostics¶
Check Service Health¶
# Check if services are running
docker-compose ps
# Check BMP port
nc -z localhost 11019 && echo "BMP port OK" || echo "BMP port FAILED"
# Check database
docker-compose exec postgres pg_isready
# View recent logs
docker-compose logs --tail=100 pybmpmon
# Check for errors
docker-compose logs pybmpmon | grep -i error
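The individual checks above can be folded into one pass/fail summary. The following is a minimal sketch, not part of pybmpmon: the helper names `check` and `run_health_checks` are illustrative, and the `-T` flag is assumed to be wanted so that `docker-compose exec` works without a terminal (for example from cron).

```shell
#!/bin/sh
# check LABEL COMMAND...: run COMMAND quietly and print "LABEL OK" or
# "LABEL FAILED". Returns non-zero when the command fails.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label OK"
    else
        echo "$label FAILED"
        return 1
    fi
}

# run_health_checks: run the port and database checks from this section
# and report how many of them failed (hypothetical helper).
run_health_checks() {
    failures=0
    check "BMP port" nc -z localhost 11019 || failures=$((failures + 1))
    check "Database" docker-compose exec -T postgres pg_isready || failures=$((failures + 1))
    echo "failed checks: $failures"
    [ "$failures" -eq 0 ]
}
```

Calling `run_health_checks` from the directory that holds the compose file prints one line per check and exits non-zero if any check failed, which makes it usable in a monitoring script.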
Common Log Messages¶
Successful startup:
{"event": "bmp_listener_started", "level": "INFO", "host": "0.0.0.0", "port": 11019}
{"event": "sentry_initialized", "level": "INFO", "environment": "production"}
Router connected:
Routes being processed:
Installation Issues¶
Docker Compose Won't Start¶
Symptom: docker-compose up fails
Causes and Solutions:
- Missing .env file
- Port already in use
- Insufficient permissions
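Each of the three causes above can be tested before running `docker-compose up`. This is a rough preflight sketch under stated assumptions: the function names are made up for illustration, the `.env` file is assumed to live in the current directory, and `docker info` is used only as a convenient way to see whether the current user can reach the Docker daemon.

```shell
#!/bin/sh
# env_file_present DIR: succeed if DIR (default: current directory) has a .env file.
env_file_present() {
    [ -f "${1:-.}/.env" ]
}

# port_in_use PORT: succeed if something on localhost already accepts
# TCP connections on PORT.
port_in_use() {
    nc -z localhost "$1" >/dev/null 2>&1
}

# docker_accessible: succeed if the current user can talk to the Docker daemon.
docker_accessible() {
    docker info >/dev/null 2>&1
}

# preflight: print one line for each likely cause that is detected.
preflight() {
    env_file_present . || echo "Missing .env file in $(pwd)"
    port_in_use 11019 && echo "Port 11019 is already in use"
    docker_accessible || echo "Cannot reach the Docker daemon (check permissions)"
    return 0
}
```

Running `preflight` prints nothing when none of the three problems is detected.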
Container Exits Immediately¶
Symptom: docker-compose ps shows an exited container
Diagnosis:
# View exit logs
docker-compose logs pybmpmon
# Common causes:
# - Configuration error
# - Database connection failure
# - Python import error
Solutions:
- Configuration validation failed
- Database not ready
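When the container exits because the database is not ready yet, one workaround is to wait for `pg_isready` to succeed before starting the application. The `retry` helper below is a hypothetical sketch, and the attempt count and delay in the usage note are arbitrary values rather than project recommendations.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds, sleeping
# DELAY seconds between tries. Fails after ATTEMPTS unsuccessful tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@" >/dev/null 2>&1; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

For example, `retry 30 2 docker-compose exec -T postgres pg_isready && docker-compose up -d pybmpmon` starts pybmpmon only once PostgreSQL accepts connections, and gives up after 30 tries.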
Database Issues¶
Cannot Connect to Database¶
Symptom: ConnectionRefusedError or timeout
Diagnosis:
# Check if PostgreSQL is running
docker-compose ps postgres
# Check database logs
docker-compose logs postgres
# Test connection
docker-compose exec postgres psql -U bmpmon -d bmpmon
Solutions:
- Database not started
- Wrong credentials
- Network issue
Database Schema Not Created¶
Symptom: relation "route_updates" does not exist
Solution:
# Run database initialization
docker-compose exec postgres psql -U bmpmon -d bmpmon -f /docker-entrypoint-initdb.d/01_init.sql
# Or recreate database
docker-compose down -v
docker-compose up -d
Database Fills Up Disk¶
Symptom: No space left on device
Diagnosis:
# Check database size
docker-compose exec postgres psql -U bmpmon -d bmpmon -c "
SELECT pg_size_pretty(pg_database_size('bmpmon'));
"
# Check table sizes
docker-compose exec postgres psql -U bmpmon -d bmpmon -c "
SELECT tablename, pg_size_pretty(pg_total_relation_size(tablename::text))
FROM pg_tables WHERE schemaname = 'public';
"
# Check disk usage
docker-compose exec postgres df -h
Solutions:
- Enable compression (if not already enabled)
- Reduce retention
- Vacuum database
BMP Listener Issues¶
BMP Port Not Accessible¶
Symptom: Routers can't connect to port 11019
Diagnosis:
# Check if port is listening
nc -z localhost 11019
# Check from remote host
nc -z <pybmpmon-ip> 11019
# View firewall rules (Linux)
sudo iptables -L -n | grep 11019
Solutions:
- Firewall blocking
- Docker port mapping issue
- Listening on wrong interface
No Routes Received¶
Symptom: Database empty after router connects
Diagnosis:
# Check if peer connected
docker-compose exec postgres psql -U bmpmon -d bmpmon -c "SELECT * FROM bmp_peers;"
# Check for route updates
docker-compose exec postgres psql -U bmpmon -d bmpmon -c "SELECT COUNT(*) FROM route_updates;"
# View DEBUG logs
# Set LOG_LEVEL=DEBUG in .env and restart
docker-compose logs -f pybmpmon
Solutions:
- Router not configured correctly
  - Verify BMP configuration on router
  - Check router logs for BMP connection status
  - Ensure router is sending route-monitoring messages
- Parse errors
- Batch writer not flushing
Performance Issues¶
High CPU Usage¶
Symptom: Container using 100% CPU
Diagnosis:
# Monitor CPU usage
docker stats pybmpmon
# Check route throughput in logs
docker-compose logs pybmpmon | grep route_stats
Solutions:
- Too many concurrent connections
- DEBUG logging overhead
- Slow database writes
High Memory Usage¶
Symptom: Container using excessive RAM
Diagnosis:
# Monitor memory usage
docker stats pybmpmon
# Check for memory leaks
# Restart and monitor growth over time
docker-compose restart pybmpmon
# docker stats already refreshes continuously; under watch, take one sample per run
watch docker stats --no-stream pybmpmon
Solutions:
- Large batches
- Too many database connections
- Memory leak
Slow Route Processing¶
Symptom: Routes/second much lower than expected
Diagnosis:
# Check throughput in logs (every 10 seconds)
docker-compose logs pybmpmon | grep route_stats
# Expected: throughput_per_sec > 1500
# Check database write latency (requires the pg_stat_statements extension)
docker-compose exec postgres psql -U bmpmon -d bmpmon -c "
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
"
Solutions:
- Database I/O bottleneck
- Network latency
- Too many indexes
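To spot slow intervals without reading every log line, the throughput values can be filtered against the expected rate of 1500 routes/second mentioned above. This sketch assumes that each `route_stats` log line is JSON containing a numeric field written as `"throughput_per_sec": <number>`. The exact log layout is an assumption, and `low_throughput` is an illustrative name.

```shell
#!/bin/sh
# low_throughput THRESHOLD: read log lines on stdin and print every
# throughput_per_sec value that is below THRESHOLD.
# Assumes the field appears as "throughput_per_sec": <number>.
low_throughput() {
    grep route_stats |
        sed -n 's/.*"throughput_per_sec": *\([0-9.]*\).*/\1/p' |
        awk -v min="$1" '$1 + 0 < min + 0 { print $1 }'
}
```

A typical invocation would be `docker-compose logs pybmpmon | low_throughput 1500`. Empty output means no reported interval fell below the threshold.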
Logging Issues¶
No Logs Appearing¶
Symptom: docker-compose logs shows nothing
Diagnosis:
# Check if container is running
docker-compose ps
# Check the log level configured in .env (echo $LOG_LEVEL only shows the host shell's value)
grep LOG_LEVEL .env
# Try running in foreground
docker-compose up pybmpmon
Solutions:
- Container not running
- Logs being sent elsewhere
Logs Too Verbose (DEBUG Mode)¶
Symptom: Huge log files with hex dumps
Solution:
# Change LOG_LEVEL to INFO in .env
LOG_LEVEL=INFO
docker-compose restart pybmpmon
# Save the most recent 1000 log lines to a file
docker-compose logs --tail=1000 pybmpmon > last_1000.log
Sentry Not Receiving Events¶
Symptom: No events in Sentry dashboard
Diagnosis:
# Check Sentry initialization
docker-compose logs pybmpmon | grep sentry_initialized
# Verify DSN is set
docker-compose exec pybmpmon env | grep SENTRY_DSN
# Test network connectivity
docker-compose exec pybmpmon curl -I https://sentry.io
Solutions:
- Sentry not initialized
- Sentry SDK not installed
- Network firewall
Router Compatibility Issues¶
Cisco IOS-XR¶
Issue: Router doesn't send routes
Solution:
! Ensure route-monitoring is configured
bmp server 1
update-source Loopback0
router bgp 65000
bmp server 1
activate
route-monitoring pre-policy ! Important
Juniper Junos¶
Issue: Connection established but no routes
Solution:
! Enable route monitoring
set routing-options bmp station pybmpmon route-monitoring pre-policy
commit and-quit
Arista EOS¶
Issue: BMP session flapping
Solution:
! Increase hold timers
router bgp 65000
bmp server pybmpmon
tcp keepalive idle 120 interval 60 probes 3
Data Integrity Issues¶
Duplicate Routes¶
Symptom: Same prefix appears multiple times with same timestamp
Diagnosis:
Cause: This is expected - denormalized storage stores every route update
Solution: Use DISTINCT ON queries to get the latest state (see Queries)
Missing Routes¶
Symptom: Routes missing that should be present
Diagnosis:
# Check for parse errors
docker-compose logs pybmpmon | grep parse_error
# Check database capacity
docker-compose exec postgres df -h
# Verify router is sending routes
# Check router BMP statistics
Solutions:
- Parse errors: Report issue with DEBUG logs
- Database full: Free up space or shorten the retention period
- Router not sending: Check router BMP configuration
Backup and Recovery¶
Restore from Backup¶
# Stop application
docker-compose stop pybmpmon
# Restore database (the dump path must be readable inside the postgres container)
docker-compose exec postgres pg_restore \
-U bmpmon -d bmpmon \
--clean \
/path/to/backup.dump
# Restart application
docker-compose start pybmpmon
Emergency Database Reset¶
Data Loss Warning
This will delete all data. Only use as last resort.
# Stop services
docker-compose down
# Remove volumes
docker-compose down -v
# Restart (will recreate database)
docker-compose up -d
Getting Help¶
Collect Diagnostic Information¶
Before reporting issues, collect:
- Version information
- Configuration (redact passwords)
- Logs
- System information
- Database stats
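The items above can be gathered with a short script. This is a sketch only: `redact_env` and `collect_diagnostics` are hypothetical helpers, the set of key names treated as secret (PASSWORD, SECRET, TOKEN, DSN) is an assumption about what your `.env` contains, and the output directory name is arbitrary. Review the resulting files by hand before attaching them to an issue.

```shell
#!/bin/sh
# redact_env: read KEY=VALUE lines on stdin and mask the value of any key
# whose name contains PASSWORD, SECRET, TOKEN, or DSN.
redact_env() {
    sed -E 's/^([A-Za-z0-9_]*(PASSWORD|SECRET|TOKEN|DSN)[A-Za-z0-9_]*)=.*/\1=<redacted>/'
}

# collect_diagnostics [DIR]: write version, config, logs, system, and
# database information into DIR (default: pybmpmon-diagnostics).
collect_diagnostics() {
    out=${1:-pybmpmon-diagnostics}
    mkdir -p "$out"
    docker-compose version > "$out/versions.txt" 2>&1
    redact_env < .env > "$out/config.txt"
    docker-compose logs --tail=1000 pybmpmon > "$out/logs.txt" 2>&1
    uname -a > "$out/system.txt"
    docker-compose exec -T postgres psql -U bmpmon -d bmpmon \
        -c "SELECT pg_size_pretty(pg_database_size('bmpmon'));" \
        > "$out/database.txt" 2>&1
}
```

Run `collect_diagnostics` from the directory containing the compose file and `.env`.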
Report Issues¶
Include in issue reports:
- Description of problem
- Steps to reproduce
- Expected vs actual behavior
- Diagnostic information (above)
- Relevant log snippets
- Router vendor and OS version
Next Steps¶
- Configuration: Adjust settings
- Queries: Verify data with SQL
- Logging: Understand log messages
- Sentry: Configure error tracking