[Bjonnh.net]# _

The context

 

SNMP

Devices expose a lot of values that can be monitored; we will call them metrics here. These metrics can be sensitive, so we will protect them. In some places they are not considered sensitive and can be gathered publicly; I do not recommend that.

To solve the problem of every device having its own kind of metrics, its own interfaces to them, and custom formats, a standard called SNMP was developed. It has been through several iterations; the current version is SNMP v3, which is the one we are going to implement.

SNMP metrics can be accessed in two ways. Either the monitored device talks to a trap server that gathers the values, or, as we are going to configure it here, the device runs a service listening on an interface that allows authenticated and encrypted communication.

Many things can be done with SNMP, including configuring devices. In my case here, I didn’t need that, so the user is read-only.

Prometheus and Grafana

I am using Prometheus to collect and store metrics and Grafana to make the dashboards. They are easy to use and set up, and they look great out of the box.

Grafana screen shot

The problem

  • I have several devices that can report their status using SNMP.
  • I want to monitor these devices using nice looking dashboards and have alerts.
  • Documentation on how all these moving parts fit together is somewhere between sparse and non-existent (or my search engine skills are rusty).

The Solution

 

Configuration of an Ubiquiti Router

The test device in question is an EdgeRouter Lite. A nice (and inexpensive) piece of equipment.

In the console, here is the configuration I used. It is really generic: it gives that user full read-only access to all metrics, using SNMP v3 with authentication and privacy.

Replace all the values between < and > with what you want (you can simply remove the < > and it should work for a demo). I couldn’t find a way to generate the encrypted passwords. This should have happened automatically, but it seems that the feature was never added, according to this thread.

For simple tasks, we do not really care about the engineID. It becomes useful in large installations where you want to have different contexts. See RFC 5343 if you want more details on how all of this fits together.

user@machine# show service snmp
 contact <you@provider.com>
 description <myroutersnmp>                                                     
 listen-address <10.0.0.1> {                                                   
 }                                                                              
 location "<Closet>"                                                         
 v3 {                                                                           
     engineid <0x1234>                                                            
     group viewer {                                                             
         mode ro                                                                
         seclevel priv                                                          
         view simpleview                                                        
     }                                                                          
     user <username> {                                                               
         auth {                                                                 
             encrypted-key ""                                                   
             plaintext-key <mysecretpassword>                                            
             type sha                                                           
         }                                                                      
         engineid <0x1234>                     
         group viewer                        
         privacy {                           
             encrypted-key ""                
             plaintext-key <mysecretpassword>
             type aes
         }
     }
     view simpleview {
         oid 1 {
         }
     }
 }
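Before going further, it is worth checking from another machine that the router really answers SNMP v3 requests. Here is a minimal sketch, assuming the net-snmp command-line tools (snmpwalk) are installed on that machine. The function name check_snmpv3 is mine, and it assumes the same password is used for authentication and privacy, as in the configuration above.

```shell
# Hypothetical helper: walk sysUpTime (1.3.6.1.2.1.1.3) over SNMP v3 with
# SHA authentication and AES privacy. Assumes net-snmp's snmpwalk is installed.
check_snmpv3() {
  # $1 = device address, $2 = username, $3 = password (auth and privacy)
  snmpwalk -v3 -l authPriv -u "$2" -a SHA -A "$3" -x AES -X "$3" "$1" 1.3.6.1.2.1.1.3
}

# Example call (placeholders as in the router configuration):
# check_snmpv3 10.0.0.1 username mysecretpassword
```

If the credentials or the security level do not match, the command should fail with a timeout or an authentication error instead of printing the uptime.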

Configuration of the SNMP Prometheus exporter.

Generating the configuration

I’m using the official SNMP exporter. Using it with SNMP v3 requires a little bit of tweaking. The configuration described here is really simple; you will need to add the metrics you want. Adding every metric just because “one day we may need it” is usually not recommended.

Clone the repository, build it (follow the instructions in the README), go into the generator directory, and create a generator.yml file.

Again, customize the values between < and >. You can also add other metrics at this stage under walk.

modules:
  <cpu_net_uptime>:
    walk:
      - 1.3.6.1.2.1.2              # Same as "interfaces"
      - sysUpTime                  # Same as "1.3.6.1.2.1.1.3"
      - 1.3.6.1.2.1.25.3.3.1.2     # CPUs
    version: 3
    max_repetitions: 25
    retries: 3
    timeout: 10s
    auth:
      username: <username>
      security_level: authPriv
      password: <mysecretpassword>
      auth_protocol: SHA
      priv_protocol: AES
      priv_password: <mysecretpassword>

Then run:

MIBDIRS=mibs ./generator generate

This will generate an snmp.yml file that will be used by the SNMP exporter.
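A quick sanity check is to confirm that the generated file mentions the module you defined. This is only a sketch: has_module is a name I made up, and it simply looks for a key named after the module, indented or not, because the layout of snmp.yml may differ between exporter versions.

```shell
# Hypothetical helper: succeed if the generated file has a key named after
# the module. Depending on the exporter version the key may be indented.
has_module() {
  # $1 = path to snmp.yml, $2 = module name
  grep -Eq "^[[:space:]]*$2:" "$1"
}

# Example call, run from the generator directory:
# has_module snmp.yml cpu_net_uptime && echo "module found"
```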

Starting the SNMP exporter

In the root of the repo, run

./snmp_exporter --config.file=generator/snmp.yml
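Once the exporter is running, you can query it by hand before involving Prometheus. The sketch below assumes curl is installed and that the exporter listens on 127.0.0.1:9116; probe_exporter is a hypothetical helper name. It requests the /snmp path with the target and module parameters, which is the same request Prometheus will make.

```shell
# Hypothetical helper: ask the exporter to scrape one device with one module.
probe_exporter() {
  # $1 = exporter host:port, $2 = SNMP device address, $3 = module name
  curl -sf "http://$1/snmp?target=$2&module=$3"
}

# Example call (placeholders as used elsewhere in this post):
# probe_exporter 127.0.0.1:9116 10.0.0.1 cpu_net_uptime | grep -i uptime
```

If this prints metrics, the router, the credentials, and the generated snmp.yml agree with each other.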

Add a job in Prometheus

Add to your Prometheus config (replace the <10.0.0.1>, <127.0.0.1:9116> and <cpu_net_uptime> according to your config):

  - job_name: 'snmp'
    static_configs:
      - targets:
        - <10.0.0.1>  # The SNMP device (you can add more here).
    metrics_path: /snmp
    params:
      module: [<cpu_net_uptime>]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: <127.0.0.1:9116>
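The relabeling rules are the least obvious part, so here is a small trace of what they do for a single target. It is a plain shell illustration, not something Prometheus runs; trace_relabel and its variable names are mine.

```shell
# Hypothetical trace of the three relabel rules for one target.
trace_relabel() {
  # $1 = entry from "targets", $2 = exporter address, $3 = module name
  address=$1                 # initial __address__ is the SNMP device
  param_target=$address      # rule 1: __address__ is copied to __param_target
  instance=$param_target     # rule 2: __param_target is copied to instance
  address=$2                 # rule 3: __address__ is replaced by the exporter
  printf 'url=http://%s/snmp?target=%s&module=%s instance=%s\n' \
    "$address" "$param_target" "$3" "$instance"
}

trace_relabel 10.0.0.1 127.0.0.1:9116 cpu_net_uptime
# prints: url=http://127.0.0.1:9116/snmp?target=10.0.0.1&module=cpu_net_uptime instance=10.0.0.1
```

In other words, Prometheus talks to the exporter, the device address travels as the target parameter, and the instance label still shows the device rather than the exporter.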

Conclusion

It works. It took more work than expected, but…

Some example visualizations in a Grafana dashboard