2017-05-10

Real-time monitoring with netdata and statsd

NetData Metrics

statsd configuration
- netdata.conf
Private charts
- Sample image of private charts
Synthetic charts
- $NETDATA_PREFIX/etc/netdata/statsd.d/sample.conf
- Sample image of synthetic charts
List of metric data types

statsd is now built into netdata.

statsd is a system for collecting metric data. With it built in, netdata can now also provide real-time monitoring of metrics sent via statsd.

statsd configuration

netdata.conf

[statsd]
        # enabled = yes
        # update every (flushInterval) = 1
        # udp messages to process at once = 10
        # create private charts for metrics matching = *
        # max private charts allowed = 200
        # max private charts hard limit = 1000
        # private charts memory mode = save
        # private charts history = 3996
        # histograms and timers percentile (percentThreshold) = 95.00000
        # add dimension for number of events received = yes
        # gaps on gauges (deleteGauges) = no
        # gaps on counters (deleteCounters) = no
        # gaps on meters (deleteMeters) = no
        # gaps on sets (deleteSets) = no
        # gaps on histograms (deleteHistograms) = no
        # gaps on timers (deleteTimers) = no
        # listen backlog = 4096
        # default port = 8125
        # bind to = udp:localhost:8125 tcp:localhost:8125

statsd is enabled by default.

To disable it, edit the file as follows.

[statsd]
    enabled = no
(snip)

Private charts

As long as statsd is enabled, no additional configuration is needed.

You can send metric data by running a command like this.

$ echo "NAME:VALUE|TYPE" | nc -u -w 1 localhost 8125

$ echo "private-metric2:123|c" | nc -u -w 1 localhost 8125
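The same datagram can be sent from application code instead of nc. The following is a minimal Python sketch, assuming netdata's statsd listener is on the default UDP port 8125 on localhost as in the configuration above. The function names format_metric and send_metric are my own, and the gauge value 42 is only illustrative.

```python
import socket


def format_metric(name, value, metric_type):
    """Build a statsd payload in the NAME:VALUE|TYPE form."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type, host="localhost", port=8125):
    """Send one metric over UDP. Fire-and-forget, like `nc -u`."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


if __name__ == "__main__":
    send_metric("private-metric1", 42, "g")   # gauge (illustrative value)
    send_metric("private-metric2", 123, "c")  # counter
```

Because UDP is connectionless, the call does not fail when nothing is listening, so check the dashboard to confirm that the metric arrived.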

Sample image of private charts

[Image: sample private charts]

"private-metric1" is a gauge and "private-metric2" is a counter.

Synthetic charts

Synthetic charts require configuration. Place a file (*.conf) in the $NETDATA_PREFIX/etc/netdata/statsd.d/ directory.

Here we combine metric data sent with the "app." prefix. The following is an example of sending such data.

$ echo "app.metric1:666|g" | nc -u -w 1 localhost 8125

$NETDATA_PREFIX/etc/netdata/statsd.d/sample.conf

[app]
        name = myapp
        metrics = app.*
        private charts = no
        gaps when not collected = no
        memory mode = ram
        history = 60

[area]
        name = mychart1 name
        title = mychart1 title
        family = app area
        units = tests/s
        priority = 91000
        type = area
        dimension = app.metric1 m1
        dimension = app.metric2 m2

[stacked]
        name = mychart2 name
        title = mychart2 title
        family = app stacked
        units = tests/s
        priority = 91000
        type = stacked
        dimension = app.metric1 m1
        dimension = app.metric2 m2

[line]
        name = mychart3 name
        title = mychart3 title
        family = app line
        units = tests/s
        priority = 91000
        type = line
        dimension = app.metric1 m1
        dimension = app.metric2 m2

In the sample above, I created three charts (area, stacked, and line) from the same data.
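Since the three chart sections differ only in the chart type and a counter, they could also be generated rather than written by hand. The following is a rough Python sketch that renders the same [area], [stacked], and [line] sections as text. The helper names render_chart and render_charts are hypothetical, and it does not write the [app] section or the file itself.

```python
CHART_TYPES = ["area", "stacked", "line"]
DIMENSIONS = [("app.metric1", "m1"), ("app.metric2", "m2")]


def render_chart(index, chart_type, dimensions):
    """Render one chart section in the same layout as sample.conf."""
    lines = [
        f"[{chart_type}]",
        f"        name = mychart{index} name",
        f"        title = mychart{index} title",
        f"        family = app {chart_type}",
        "        units = tests/s",
        "        priority = 91000",
        f"        type = {chart_type}",
    ]
    # One "dimension" line per metric: metric name, then the label to show.
    lines += [f"        dimension = {metric} {label}" for metric, label in dimensions]
    return "\n".join(lines)


def render_charts():
    """Render all chart sections, separated by blank lines."""
    return "\n\n".join(
        render_chart(i, chart_type, DIMENSIONS)
        for i, chart_type in enumerate(CHART_TYPES, start=1)
    )


if __name__ == "__main__":
    print(render_charts())
```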

Sample image of synthetic charts

[Image: sample synthetic charts]

Both "app.metric1" and "app.metric2" are sent as gauges.

List of metric data types

type	value
gauges	g
timers	ms
histograms	h
counters	c
meters	m
sets	s
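The table above maps directly onto a lookup dictionary. As a small Python sketch, the following splits a NAME:VALUE|TYPE line and resolves the type code to its name. The function name parse_metric is my own, and the sketch deliberately ignores optional statsd extras such as sampling rates, which it would reject as an unknown type.

```python
STATSD_TYPES = {
    "g": "gauges",
    "ms": "timers",
    "h": "histograms",
    "c": "counters",
    "m": "meters",
    "s": "sets",
}


def parse_metric(line):
    """Split 'NAME:VALUE|TYPE' into (name, value, type name)."""
    name, colon, rest = line.strip().partition(":")
    value, bar, code = rest.partition("|")
    if not (name and colon and value and bar):
        raise ValueError(f"not in NAME:VALUE|TYPE form: {line!r}")
    if code not in STATSD_TYPES:
        raise ValueError(f"unknown type code: {code!r}")
    # The value stays a string because sets may carry non-numeric values.
    return name, value, STATSD_TYPES[code]


if __name__ == "__main__":
    print(parse_metric("private-metric2:123|c"))
    print(parse_metric("app.metric1:666|g"))
```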

2017-04-06

Sending email notifications for errors during ansible-playbook runs

Ansible Python

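The mail callback plugin
Environment
Preparing the callback plugin
- Editing the callback plugin
Editing ansible.cfg
Editing the playbook
- test.yml
Running the playbook and results
Email contents
- Subject
- Body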

The `mail` callback plugin

This post uses the mail callback plugin, which sends an email notification when an error occurs during an ansible-playbook run.

Environment

# cat /etc/redhat-release 
CentOS Linux release 7.3.1611 (Core)

# ansible-playbook --version
ansible-playbook 2.2.1.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides

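ansible is installed from the EPEL RPM package.

Preparing the callback plugin

For the following reasons, this time I download the latest mail callback plugin from GitHub and use that instead.

The plugin has to be edited directly
- It would be nice to set the recipient and sender addresses in a configuration file or similar, but as far as I can tell from the plugin, it currently has no such mechanism.
Bug
- At the time of writing, the mail callback plugin included in the RPM package has a bug.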

# mkdir -p ~/ansible/plugins/callback/
# wget -O ~/ansible/plugins/callback/mail_on_failed.py https://raw.githubusercontent.com/ansible/ansible/devel/lib/ansible/plugins/callback/mail.py

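Save it under a file name other than mail.py. If the name is the same, the mail.py installed by the RPM package is also executed when ansible-playbook runs.

Editing the callback plugin

Edit the arguments of the mail() call inside v2_runner_on_failed() to specify the recipient.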

~/ansible/plugins/callback/mail_on_failed.py

(snip)
    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = 'notification'
    CALLBACK_NAME = 'mail'
    CALLBACK_NEEDS_WHITELIST = True

    def v2_runner_on_failed(self, res, ignore_errors=False):
(snip)
        mail(sender=sender, subject=subject, body=body, to="foobar@example.com")

    def v2_runner_on_unreachable(self, result):
(snip)
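To make clear what the to= argument controls, here is a minimal Python sketch of a mail helper that builds and sends a message with the standard library. This is not the plugin's actual mail() implementation. The names build_message and send_message, and the SMTP host "localhost", are assumptions for illustration only.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, to, subject, body):
    """Build a plain-text email addressed to `to`."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, smtphost="localhost"):
    """Hand the message to an SMTP server (assumed to listen on port 25)."""
    with smtplib.SMTP(smtphost) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    message = build_message(
        sender="root",
        to="foobar@example.com",
        subject="test subject",
        body="test body",
    )
    print(message)  # call send_message(message) to actually deliver it
```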

Editing ansible.cfg

Specify the saved plugin as follows.

callback_plugins=~/ansible/plugins/callback
callback_whitelist=mail_on_failed
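As a quick sanity check, the whitelist can be read back with Python's configparser. This sketch assumes the two settings sit in the [defaults] section of ansible.cfg, which the excerpt above does not show, and the function name whitelisted_callbacks is hypothetical.

```python
import configparser

# Assumed layout: the two lines above placed under [defaults].
SAMPLE_CFG = """\
[defaults]
callback_plugins=~/ansible/plugins/callback
callback_whitelist=mail_on_failed
"""


def whitelisted_callbacks(cfg_text):
    """Return the callback names listed in callback_whitelist."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("defaults", "callback_whitelist", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    print(whitelisted_callbacks(SAMPLE_CFG))
```

The whitelist entry is the plugin's file name without the .py extension, which is why it reads mail_on_failed here.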

Editing the playbook

test.yml

The playbook runs ls on /mail_on_failed_test, which does not exist, to trigger an error.

- hosts: all
  gather_facts: no
  tasks:
    - ping:
    - shell: echo test
    - shell: ls /mail_on_failed_test
    - shell: date

Running the playbook and results

# ansible-playbook -i localhost, -c local test.yml

PLAY [all] *********************************************************************

TASK [ping] ********************************************************************
ok: [localhost]

TASK [command] *****************************************************************
changed: [localhost]

TASK [command] *****************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "ls /mail_on_failed_test", "delta": "0:00:00.004637", "end": "2017-0X-0X 10:32:28.102568", "failed": true, "rc": 2, "start": "2017-0X-0X 10:32:28.097931", "stderr": "ls: cannot access /mail_on_failed_test: No such file or directory", "stdout": "", "stdout_lines": [], "warnings": []}
        to retry, use: --limit @/root/test.retry

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=1

Email contents

The following email is delivered.

Subject

ls: cannot access /mail_on_failed_test: No such file or directory

Body

The following task failed for host localhost:

command:  {"warn": true, "executable": null, "_uses_shell": true, "_raw_params": "ls /mail_on_failed_test", "removes": null, "creates": null, "chdir": null}

with the following output in standard error:

ls: cannot access /mail_on_failed_test: No such file or directory

A complete dump of the error:

{"changed": true, "cmd": "ls /mail_on_failed_test", "delta": "0:00:00.004637", "end": "2017-0X-0X 10:32:28.102568", "failed": true, "rc": 2, "start": "2017-0X-0X 10:32:28.097931", "stderr": "ls: cannot access /mail_on_failed_test: No such file or directory", "stdout": "", "stdout_lines": [], "warnings": []}
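The shape of this email can be reproduced from the failed task's result. The following Python sketch rebuilds a subject and body like the ones shown above from a result dictionary. It is an illustration based only on the email in this post, not the plugin's source. The function name compose_failure_mail and the "Failed: ..." fallback subject are assumptions.

```python
import json


def compose_failure_mail(host, task_description, result):
    """Return (subject, body) shaped like the notification email above."""
    stderr = result.get("stderr", "")
    if stderr:
        # The subject is the last line of standard error.
        subject = stderr.strip("\r\n").split("\n")[-1]
    else:
        subject = "Failed: " + task_description  # assumed fallback
    body = "The following task failed for host %s:\n\n%s\n\n" % (host, task_description)
    if stderr:
        body += "with the following output in standard error:\n\n%s\n\n" % stderr
    body += "A complete dump of the error:\n\n" + json.dumps(result, sort_keys=True)
    return subject, body


if __name__ == "__main__":
    failed = {
        "cmd": "ls /mail_on_failed_test",
        "rc": 2,
        "stderr": "ls: cannot access /mail_on_failed_test: No such file or directory",
    }
    subject, body = compose_failure_mail("localhost", "command", failed)
    print(subject)
    print(body)
```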

2017-04-03

Monitoring configuration for nginx with netdata

NetData Metrics

The nginx plugin
- nginx configuration
- netdata configuration (python.d/nginx.conf)
The web_log plugin
- nginx configuration
- netdata configuration (python.d/web_log.conf)

The following two plugins are relevant to monitoring nginx.

nginx
web_log

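The nginx plugin

This plugin displays charts for nginx metrics such as active connections. Depending on the configuration, it can also chart an nginx instance running on a remote host.

nginx configuration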

location /stub_status {
    stub_status;
    allow 127.0.0.1;
    deny all;
}

netdata configuration (python.d/nginx.conf)

localhost:
  name : 'local'
  url  : 'https://localhost/stub_status'

With the settings above, an "nginx local" chart group is added to the dashboard.
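For reference, the stub_status page that the plugin reads is a few lines of plain text. The following Python sketch parses that text into a dictionary. The sample numbers are made up, and parse_stub_status is my own helper, not netdata's code.

```python
import re

# Made-up sample in the layout that stub_status returns.
SAMPLE = (
    "Active connections: 2 \n"
    "server accepts handled requests\n"
    " 10 10 25 \n"
    "Reading: 0 Writing: 1 Waiting: 1 \n"
)


def parse_stub_status(text):
    """Extract the counters from a stub_status response body."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    if not (active and totals and states):
        raise ValueError("unexpected stub_status output")
    accepts, handled, requests = (int(n) for n in totals.groups())
    reading, writing, waiting = (int(n) for n in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


if __name__ == "__main__":
    print(parse_stub_status(SAMPLE))
```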

The web_log plugin

This plugin monitors nginx access logs.

Besides nginx, the web_log plugin can also monitor logs from apache, lighttpd, and gunicorn.

nginx configuration

log_format netdata '$remote_addr - $remote_user [$time_local] '
                   '"$request" $status $body_bytes_sent '
                   '$request_length $request_time '
                   '"$http_referer" "$http_user_agent"';

access_log  /var/log/nginx/access.log  netdata;

With the settings above, charts such as timings, which cannot be produced from nginx's default log format, also become available.
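To see what the extra fields add, here is a Python sketch that parses one line written in the netdata log_format above and pulls out $request_length and $request_time. The sample log line is invented, and the regular expression is my own approximation, not the pattern the web_log plugin uses.

```python
import re

# Field order follows the log_format definition shown above.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'(?P<request_length>\d+) (?P<request_time>[\d.]+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

# Invented example line.
SAMPLE_LINE = (
    '203.0.113.7 - - [03/Apr/2017:10:32:28 +0900] '
    '"GET /index.html HTTP/1.1" 200 612 345 0.004 "-" "curl/7.29.0"'
)


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["body_bytes_sent"] = int(fields["body_bytes_sent"])
    fields["request_length"] = int(fields["request_length"])
    fields["request_time"] = float(fields["request_time"])  # seconds
    return fields


if __name__ == "__main__":
    print(parse_line(SAMPLE_LINE))
```

The default combined format has neither $request_length nor $request_time, which is why the timing charts need this custom format.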

netdata configuration (python.d/web_log.conf)

nginx_log:
  name: 'nginx'
  path: '/var/log/nginx/access.log'
  all_time: no

On servers with heavy traffic this consumes a lot of memory, so I added all_time: no to disable the client IP address chart.
