Wednesday, February 15, 2017

Air Pollution!

Since it has recently become a huge problem in my city and I'm currently working with prometheus, I thought it might be fun to get the pollution data and then plot it with grafana :)

Here's the task list:
- find the required data
- gather the data somehow
- scrape with prometheus
- display with grafana
- admire with satisfaction ;)

First we need to find a source of air pollution data, ideally one that provides an API we can use easily. One such source is the World Air Quality Index. They gather data from more than 9k stations in 800 cities in 70 countries, thanks to the world's Environmental Protection Agencies. The API is easy, available and returns JSON; all you really need is a token. A searchable list of cities is also available: http://aqicn.org/city/all/.
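As a rough illustration of how little is needed, here is a minimal Python 3 sketch that builds a feed URL and pulls a few values out of a response. The URL layout and the data/aqi/iaqi keys are the ones used by the exporter in this post. The helper names, the "status" check and the sample payload are my own assumptions for illustration and are not taken from the official API docs.

```python
import json
from urllib.parse import quote

API_URL = 'http://api.waqi.info/feed/'  # same base URL the exporter in this post uses


def build_feed_url(city, token):
    """Return the feed URL for a city, percent-encoding non-ASCII names."""
    return API_URL + quote(city) + '/?token=' + quote(token)


def parse_feed(body):
    """Return (aqi, {reading: value}) from a JSON response body.

    Assumption: a successful response carries "status": "ok";
    anything else is treated as an error.
    """
    payload = json.loads(body)
    if payload.get('status') != 'ok':
        raise ValueError('API error: %r' % (payload.get('data'),))
    data = payload['data']
    readings = {name: entry['v'] for name, entry in data.get('iaqi', {}).items()}
    return data['aqi'], readings


# Hand-written sample payload in the shape described above (not real API output).
SAMPLE = ('{"status": "ok", "data": {"aqi": 219, '
          '"iaqi": {"pm25": {"v": 219}, "t": {"v": 4.77}}}}')

if __name__ == '__main__':
    print(build_feed_url('Wrocław', 'demo'))
    print(parse_feed(SAMPLE))
```

Percent-encoding the city name keeps the URL valid for names like Wrocław regardless of which HTTP client ends up sending the request.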

Now let's gather the data by creating our own exporter for prometheus. We can use the code created by robustperception and documented by the prometheus team:

#!/usr/bin/python
# -*- coding: UTF-8 -*-
from prometheus_client import start_http_server, Metric, REGISTRY
import json
import requests
import sys
import time

class JsonCollector(object):
  """Custom collector: fetches fresh air quality data on every scrape."""

  def collect(self):
    # Fetch the JSON
    token = 'demo-token-won\'t-work' #apply for a token here: http://aqicn.org/data-platform/token/
    url = 'http://api.waqi.info/feed/'
    city = 'Wrocław' # Cities: http://aqicn.org/city/all/
    response = json.loads(requests.get(url+city+"/?token="+token).content.decode('Utf-8'))

    metric = Metric('aqi_airquality', 'Air Quality Index', 'gauge')
    metric.add_sample('aqi_airquality', value=response['data']['aqi'], labels={})
    yield metric

    metric = Metric('aqi_co', 'CO gases', 'gauge')
    metric.add_sample('aqi_co', value=response['data']['iaqi']['co']['v'], labels={})
    yield metric

    metric = Metric('aqi_humidity', 'Air Humidity', 'gauge')
    metric.add_sample('aqi_humidity', value=response['data']['iaqi']['h']['v'], labels={})
    yield metric

    metric = Metric('aqi_no2', 'NO2 gases', 'gauge')
    metric.add_sample('aqi_no2', value=response['data']['iaqi']['no2']['v'], labels={})
    yield metric

    metric = Metric('aqi_pressure', 'Air Pressure', 'gauge')
    metric.add_sample('aqi_pressure', value=response['data']['iaqi']['p']['v'], labels={})
    yield metric

    metric = Metric('aqi_pm25', 'pm2.5 particles', 'gauge')
    metric.add_sample('aqi_pm25', value=response['data']['iaqi']['pm25']['v'], labels={})
    yield metric

    metric = Metric('aqi_temp', 'Air temperature', 'gauge')
    metric.add_sample('aqi_temp', value=response['data']['iaqi']['t']['v'], labels={})
    yield metric


if __name__ == '__main__':
  # Usage: airq_exporter.py port
  start_http_server(int(sys.argv[1]))
  REGISTRY.register(JsonCollector())

  while True: time.sleep(1)
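
The collector above repeats the same three lines for every measurement, and it raises a KeyError if the station does not report one of them. A possible table-driven variant is sketched below. MEASUREMENTS, to_float and extract_samples are hypothetical names of mine, and the function only prepares (name, help, value) tuples so it can be tried without a token or the prometheus_client library. Inside collect() each tuple would be turned into a Metric the same way as above.

```python
# Metric name, help text and key under response['data']['iaqi'],
# copied from the collector above.
MEASUREMENTS = [
    ('aqi_co', 'CO gases', 'co'),
    ('aqi_humidity', 'Air Humidity', 'h'),
    ('aqi_no2', 'NO2 gases', 'no2'),
    ('aqi_pressure', 'Air Pressure', 'p'),
    ('aqi_pm25', 'pm2.5 particles', 'pm25'),
    ('aqi_temp', 'Air temperature', 't'),
]


def to_float(value):
    """Return value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def extract_samples(data):
    """Return [(name, help, value)] for the readings present in data.

    `data` is response['data'] from the API. Missing or non-numeric
    readings are skipped instead of raising an exception.
    """
    candidates = [('aqi_airquality', 'Air Quality Index', data.get('aqi'))]
    iaqi = data.get('iaqi', {})
    for name, help_text, key in MEASUREMENTS:
        candidates.append((name, help_text, iaqi.get(key, {}).get('v')))

    samples = []
    for name, help_text, raw in candidates:
        value = to_float(raw)
        if value is not None:
            samples.append((name, help_text, value))
    return samples
```

Skipping a missing reading means the corresponding metric simply disappears from the scrape, which is easier to spot in grafana than an exporter that fails the whole request.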

OK, now let's run our exporter on localhost, port 9999:

./airq_exporter.py 9999

Opening http://localhost:9999 should give a result similar to this:

telnet localhost 9999
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /metrics HTTP/1.0

HTTP/1.0 200 OK
Server: BaseHTTP/0.3 Python/2.7.12
Date: Wed, 15 Feb 2017 12:06:22 GMT
Content-Type: text/plain; version=0.0.4; charset=utf-8

# HELP aqi_airquality Air Quality Index
# TYPE aqi_airquality gauge
aqi_airquality 219.0
# HELP aqi_co CO gases
# TYPE aqi_co gauge
aqi_co 12.7
# HELP aqi_humidity Air Humidity
# TYPE aqi_humidity gauge
aqi_humidity 80.0
# HELP aqi_no2 NO2 gases
# TYPE aqi_no2 gauge
aqi_no2 42.6
# HELP aqi_pressure Air Pressure
# TYPE aqi_pressure gauge
aqi_pressure 1037.0
# HELP aqi_pm25 pm25
# TYPE aqi_pm25 gauge
aqi_pm25 219.0
# HELP aqi_temp Air temperature
# TYPE aqi_temp gauge
aqi_temp 4.77
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 132415488.0
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 19132416.0
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1487149538.19
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 6.08
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 7.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024.0
Connection closed by foreign host.

As you can see, we gathered all the important data, and there are also some typical prometheus client metrics (the process_* ones).
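
If you would rather check the endpoint from a script than from telnet, the text format shown above is easy to read: lines starting with # are comments and the rest are "name value" pairs. Below is a minimal sketch, assuming metrics without labels as in this exporter. The function name parse_metrics is mine, and for anything serious the parser module shipped with prometheus_client is the better choice.

```python
def parse_metrics(text):
    """Parse label-less Prometheus text exposition lines into a dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name value" pair
        metrics[parts[0]] = float(parts[1])
    return metrics


# A few lines copied from the output above.
SAMPLE = """\
# HELP aqi_airquality Air Quality Index
# TYPE aqi_airquality gauge
aqi_airquality 219.0
# HELP aqi_temp Air temperature
# TYPE aqi_temp gauge
aqi_temp 4.77
"""

if __name__ == '__main__':
    print(parse_metrics(SAMPLE))
```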

Scraping the data is really simple: just edit your prometheus.conf and add the following under scrape_configs:

  - job_name: 'airq'
    static_configs:
      - targets: ['IP:9999']

Remember to reload prometheus ;)
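
To check that the scrape actually works, one option is to ask prometheus for the latest value through its HTTP query API. The sketch below assumes prometheus listens on localhost:9090. The helper names are mine, and the sample response is hand-written in the shape of an instant-query result, not captured from a real server.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, expr):
    """Return the instant-query URL for a PromQL expression."""
    return base_url.rstrip('/') + '/api/v1/query?' + urlencode({'query': expr})


def latest_values(body):
    """Return {job label: value} from an instant-query JSON response."""
    payload = json.loads(body)
    if payload.get('status') != 'success':
        raise ValueError('query failed: %r' % (payload.get('error'),))
    results = payload['data']['result']
    # Each result holds its labels and a [timestamp, "value"] pair.
    return {r['metric'].get('job', ''): float(r['value'][1]) for r in results}


# Hand-written example of an instant-query response (assumed values).
SAMPLE = json.dumps({
    'status': 'success',
    'data': {
        'resultType': 'vector',
        'result': [{
            'metric': {'__name__': 'aqi_pm25', 'instance': 'IP:9999', 'job': 'airq'},
            'value': [1487160382.0, '219'],
        }],
    },
})

if __name__ == '__main__':
    print(build_query_url('http://localhost:9090', 'aqi_pm25'))
    print(latest_values(SAMPLE))
```

An empty result for aqi_pm25 usually means the job has not been scraped yet or the target address is wrong, so the targets page in the prometheus UI is the next place to look.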

Next step: grafana. This is fairly easy now, since we already have the data and only need to present it. Choose whatever panels suit your needs. The end result could look like this:
[Image: Grafana Dashboard for AirQuality]
Looks like all the TODOs have been checked off by now... so the last thing to do is 'admire with satisfaction ;)'
All the code is available on github.