**route**
路由块定义路由树中的节点及其子节点。如果未设置,则其可选配置参数将从其父节点继承。
每个警报都会在配置的顶级路由中进入路由树,该路由必须匹配所有警报(即没有任何已配置的匹配器)。然后它遍历子节点。如果`continue`设置为false,则在第一个匹配的子项后停止。如果`continue`在匹配节点上为true,则警报将继续与后续兄弟节点匹配。如果警报与节点的任何子节点都不匹配(没有匹配的子节点,或者不存在),则根据当前节点的配置参数处理警报。
~~~
route:
[ receiver: <string> ]
# 默认的receiver,如果某条告警没有被一个route匹配,则发送给默认的接收器
# The labels by which incoming alerts are grouped together. For example,
# multiple alerts coming in for cluster=A and alertname=LatencyHigh would
# be batched into a single group.
# 这里的标签列表是接受到告警信息后的重新分组标签,
# To aggregate by all possible labels use the special value '...' as the sole label name, for example:
# group_by: ['...']
# This effectively disables aggregation entirely, passing through all
# alerts as-is. This is unlikely to be what you want, unless you have
# a very low alert volume or your upstream notification system performs
# its own grouping.
[ group_by: '[' <labelname>, ... ']' ]
# Whether an alert should continue matching subsequent sibling nodes.
# 警报是否应继续匹配后续兄弟节点
[ continue: <boolean> | default = false ]
# A set of equality matchers an alert has to fulfill to match the node.
# 一组平等匹配警报必须满足以匹配节点
match:
[ <labelname>: <labelvalue>, ... ]
# A set of regex-matchers an alert has to fulfill to match the node.
# 一组正则表达式匹配器必须满足警报以匹配节点
match_re:
[ <labelname>: <regex>, ... ]
# How long to initially wait to send a notification for a group
# of alerts. Allows to wait for an inhibiting alert to arrive or collect
# more initial alerts for the same group. (Usually ~0s to few minutes.)
# 最初等待为一组警报发送通知的时间。 允许等待禁止警报到达或收集同一组的更多初始警报
# 在一个新的告警分组被创建后,等待至少group_wait时间来初始化通知,这种方式可以确保有足够的时间为同一分组获取多条告警,然后一起触发这条警报。
[ group_wait: <duration> | default = 30s ]
# How long to wait before sending a notification about new alerts that
# are added to a group of alerts for which an initial notification has
# already been sent. (Usually ~5m or more.)
# 在第一条警报信息发送后,等待group_interval时间来发送新的一组警报信息。
[ group_interval: <duration> | default = 5m ]
# How long to wait before sending a notification again if it has already
# been sent successfully for an alert. (Usually ~3h or more).
# 如果某条警报信息已经发送成功,则等待repeat_interval时间重新发送它们。
[ repeat_interval: <duration> | default = 4h ]
# Zero or more child routes.
# 上面的所有属性都由所有子路由继承,并且可以在每个子路由上覆盖。
routes:
[ - <route> ... ]
~~~
#### 示例
~~~
# The root route with all parameters, which are inherited by the child
# routes if they are not overwritten.
route:
receiver: 'default-receiver'
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
group_by: [cluster, alertname]
# All alerts that do not match the following child routes
# will remain at the root node and be dispatched to 'default-receiver'.
routes:
# All alerts with service=mysql or service=cassandra
# are dispatched to the database pager.
- receiver: 'database-pager'
group_wait: 10s
match_re:
service: mysql|cassandra
# All alerts with the team=frontend label match this sub-route.
# They are grouped by product and environment rather than cluster
# and alertname.
- receiver: 'frontend-pager'
group_by: [product, environment]
match:
team: frontend
~~~
[https://github.com/prometheus/alertmanager/blob/master/doc/examples/simple.yml](https://github.com/prometheus/alertmanager/blob/master/doc/examples/simple.yml)
```
global:
# The smarthost and SMTP sender used for mail notifications.
smtp_smarthost: 'localhost:25'
smtp_from: 'alertmanager@example.org'
smtp_auth_username: 'alertmanager'
smtp_auth_password: 'password'
# The auth token for Hipchat.
hipchat_auth_token: '1234556789'
# Alternative host for Hipchat.
hipchat_api_url: 'https://hipchat.foobar.org/'
# The directory from which notification templates are read.
templates:
- '/etc/alertmanager/template/*.tmpl'
# The root route on which each incoming alert enters.
route:
group_by: ['alertname', 'cluster', 'service']
group_wait: 30s
# When the first notification was sent, wait 'group_interval' to send a batch
# of new alerts that started firing for that group.
group_interval: 5m
# If an alert has successfully been sent, wait 'repeat_interval' to
# resend them.
repeat_interval: 3h
# A default receiver
receiver: team-X-mails
# All the above attributes are inherited by all child routes and can
# overwritten on each.
# The child route trees.
# 子路由树
routes:
# This routes performs a regular expression match on alert labels to
# catch alerts that are related to a list of services.
# 此路由在警报标签上执行正则表达式匹配,以捕获与服务列表相关的警报。
- match_re:
service: ^(foo1|foo2|baz)$
receiver: team-X-mails
# The service has a sub-route for critical alerts, any alerts
# that do not match, i.e. severity != critical, fall-back to the
# parent node and are sent to 'team-X-mails'
routes:
- match:
severity: critical
receiver: team-X-pager
- match:
service: files
receiver: team-Y-mails
routes:
- match:
severity: critical
receiver: team-Y-pager
# This route handles all alerts coming from a database service. If there's
# no team to handle it, it defaults to the DB team.
# 此路由处理来自数据库服务的所有警报。 如果没有团队可以处理它,则默认为数据库团队。
- match:
service: database
receiver: team-DB-pager
# Also group alerts by affected database.
group_by: [alertname, cluster, database]
routes:
- match:
owner: team-X
receiver: team-X-pager
continue: true
- match:
owner: team-Y
receiver: team-Y-pager
# Inhibition rules allow to mute a set of alerts given that another alert is firing.
# We use this to mute any warning-level notifications if the same alert is already critical.
inhibit_rules:
- source_match:
severity: 'critical'
target_match:
severity: 'warning'
# Apply inhibition if the alertname is the same.
equal: ['alertname', 'cluster', 'service']
receivers:
- name: 'team-X-mails'
email_configs:
- to: 'team-X+alerts@example.org'
- name: 'team-X-pager'
email_configs:
- to: 'team-X+alerts-critical@example.org'
pagerduty_configs:
- service_key: <team-X-key>
- name: 'team-Y-mails'
email_configs:
- to: 'team-Y+alerts@example.org'
- name: 'team-Y-pager'
pagerduty_configs:
- service_key: <team-Y-key>
- name: 'team-DB-pager'
pagerduty_configs:
- service_key: <team-DB-key>
- name: 'team-X-hipchat'
hipchat_configs:
- auth_token: <auth_token>
room_id: 85
message_format: html
notify: true
```
- 笔记
- shell
- 如何才能学好Shell编程之“老鸟”经验谈
- scripts
- 迁移脚本
- centos_install.sh
- https.support.lwork.com.conf
- newbroker.default.lwork.com.conf
- bwnginx.conf
- twnginx.conf
- pre.default.lwork.com.conf
- zabbix_agentInstall
- getcc.sh
- shell脚本调试
- shell学习
- 第一章shell脚本入门
- shell脚本开发的基本规范及习惯
- 脚本规范示例
- 第三章变量的核心知识与实践
- 第四章变量知识进阶和实践
- 4.3 shell变量子串知识及实践
- 4.4 shell特殊扩展变量的知识与实践
- 第五章 变量的数值计算实践
- 第六章 shell脚本的条件测试
- 第七章 if条件的知识与实践
- 第8章 shell函数的基础实践
- 第13章 Shell数组的应用实践
- 经验
- for和while读行的区别
- 一个文件取2个参数
- 重定向正确及错误输出
- linux常用命令
- awk
- 详解
- 例子
- 内置变量
- 实例2
- 实例3
- find/grep
- iostat
- java启动脚本
- ln -s
- nmap
- passwd
- sed
- 详解
- 例子
- ssh-copy-id
- vim
- linux systemd详解
- 常用命令实列
- ss
- rz,sz小文件上传下载
- 文件的合并,排序和分割
- sort,uniq
- sort
- uniq
- cut
- paste
- tr
- curl
- cpu
- scp
- 批量添加注释
- nc
- yarn
- lsof
- tar
- cat
- openssl自签名证书
- pwgen
- logrotate
- 中间件
- mongo
- mongo配置文件详解
- mongo安装
- mongo常用命令
- mongo导入导出
- 导出数据的mongojs
- mongo shell
- mongo异常关闭
- mongo的缺点
- mysql
- 安装
- Gitd
- 主从同步
- 常用命令
- 日志清理
- 连接数,最大并发数,超时
- 错误
- 错误1872
- 错误1236
- 错误1-gitd主从报错
- 一些优化
- 服务器硬件优化
- 编译安装
- mysql配置文件优化
- 根据status优化
- 优化思路
- index
- 查询数据库大小
- ubuntu18.04mysql启动脚本
- pure-ftpd
- rabbitmq
- consul
- redis
- 安装
- 配置
- redis-sentinel
- 常用命令
- supervisor启动redis
- freeipa
- ftp
- 错误530根本原因和解决方法
- vsftp
- sftp
- JDK
- java参数
- zabbix
- 安装
- nginx
- 基础
- 1.基础web配置
- 2.nginx的日志格式
- 3.Nginx的请求限制
- 4.Nginx访问控制
- 进阶
- 1.静态资源web服务
- 2.Nginx作为代理服务
- 3.负载均衡
- 4.rewrite模块
- 5.Geoip
- location与proxy_pass
- proxy_set_header参数
- add_header
- 安装
- 4XX5XX重定向
- Nginx resolver explained
- 关于防止自己网页内容被别人iframe的问题
- nginx全局变量
- nginx错误代码
- 平滑升级nginx
- nginx相关资料网站
- nginx配置下载目录
- 反向代理并发数
- php
- 安装centos6,7
- xtrabackup
- apache
- 常用工具
- SSL证书在线工具SSL
- wordpress
- kafka
- nssm
- GoCD
- gocd简介
- gocd一些概念
- gocd客户端环境变量
- 建立一个piplines
- gocd添加nodejs
- supervisor
- mongo,mysql,hadoop比较
- screen
- python
- minio-私有存储桶
- kubernetes
- YAML格式简单说明
- k8s集群常用命令
- 概念
- k8s组件
- 对象
- workloads
- pods
- overview
- pod lifecycle
- init containers
- env向容器暴露pod信息
- controllers
- rs
- deployments
- daemonset
- StatefulSet
- service
- ingress
- volumes及configmap
- pv和pvc
- serviceaccount及认证
- dashboard及分级授权
- flannel&calico
- 调度器,预选策略及优先函数
- 资源指标API及自定义指标API
- helm
- k8s最佳实践
- 配置kubelet
- 简单命令定位问题
- k8s中日志收集-1
- k8s中日志收集-2
- lxcfs
- v1.24以后镜像问题
- 单控制节点集群v1.24以后适用
- 单控制节点集群v1.24前适用
- K8s.1.11.x阿里云安装HA版
- 国内k8s安装指定版本
- 发布及回滚
- 检查yaml文件格式
- pod分配到指定节点
- k8s跨集群访问
- 在docker中查看对应k8s容器日志
- cert-manager
- 问题定位技巧:容器内抓包
- 为容器设置启动时要执行的命令及其入参
- deploy.yaml文件实例
- kube/config
- 系统守护进程预留资源
- k8s集群证书pki过期处理
- pod跑java时内存的运用
- 从外部访问k8s中的pod
- HPA实战
- Docker
- Docker常用命令
- 基本概念
- 镜像
- 容器
- 仓库
- 安装 Docker
- Ubuntu
- Centos
- 镜像加速器
- 使用镜像
- 获取镜像
- 使用 Dockerfile 定制镜像
- Dockerfile 指令详解
- COPY 复制文件
- CMD 容器启动命令
- ENTRYPOINT 入口点
- ENV
- 其他命令
- 参考文档
- Alpine制作JDK8镜像
- Dockerfile示例
- 访问仓库
- nexus
- 最佳实践
- 镜像删除
- 清理docker磁盘空间
- docker容器日志管理
- 镜像基础上构建镜像
- git
- 公钥私钥免登
- 常用命令
- git pull
- git升级
- jenkins
- jenkins使用git
- 设置构建作业
- General
- Source Code Management
- Build Traggers
- Build Environment
- Build
- Post-build Actions
- 高级构建
- 参数化构建作业
- prometheus
- 监控原则
- 第一章 采集数据
- HPA
- meterics-server
- custom metrics
- kube-state-metrics
- node-exporter
- 第二章 prometheus
- prometheus概述
- prometheus基本架构
- prometheus安装
- prometheus的配置和服务发现
- scrape_configs
- kubernetes_sd_config
- relabel_config
- relabel_config例子
- 服务发现配置
- alertmanager_config
- alerting
- configuration
- route
- receivers
- inhibits_rules
- 第三章 展示与告警
- 第四章 PromQL
- rate,irate和delta的区别
- prometheus-operator
- maven
- maven命令
- maven仓库配置
- openstack
- 网络基础
- 计算机网络原理
- 一个URL请求的过程
- 2.记录
- 3.数据链路层
- 4.网络层
- 网络常用命令
- 命令
- iptables
- nc
- ipset
- mtr
- ss
- lsof
- ip
- 抓包
- tcpdump
- 网络排错与观察
- netstat
- traceroute
- dig与nslookup
- 计算机网络协议
- 负载均衡总结性说明
- NAT
- Tinc
- ubuntu
- ubuntu-var-log-下各个日志文件
- apt和dpkg
- systemctl详解
- 关闭系统更新,有些更新可能影响运行的程序
- ubuntu常用命令
- 基础工具journalctl命令
- za
- 恢复阿里云物理备份
- 域名证书申请和更换
- 正则表达式常用
- 服务器上排查问题得头5分钟
- windows
- winserver关闭事件跟踪程序
- windows常用命令
- win10企业LTSC版激活
- windows通过网卡只开80端口
- debug-tools
- win10-1903及以上版本Realtek高清晰音频管理器
- 彻底解决WPS Office Expansion tool弹出问题
- services延迟启动时间修改
- windows服务器定时重启
- windows sc命令
- 防火墙概述
- iptables
- 简单说明
- 例子
- 项目一
- DevOps简介
- 项目介绍
- 高并发内核优化
- gitlab
- gitlab社区和企业版本区别
- gitlab社区版安装
- gitlab指定版本安装
- gitlab安全设置
- gitlab的备份和恢复
- gitlab容器化安装
- jenkins
- jenkins安装
- ubuntu 16.04 install jenkins
- ubuntu 20.04 install jenkins
- jenkins使用git
- jenkins配置第一个项目
- jenkins发布及制作jar镜像
- jenkins安全策略
- gocd
- gocd安装
- k8s中gocd的server和agent模板
- gocd配置第一个项目
- 脚本
- gocd+ldap
- nexus
- 安装和配置
- freeipa
- 介绍
- 安装
- freeipa集成ocserv
- VPN
- 原理
- vpn部署ocserv
- k8s
- k8s高可用集群
- DNSmasq
- SNIproxy
- Tinc
- prometheus
- 简介
- helm安装prometheus
- 采集数据概览
- 采集数据node-exporter
- 采集数据kube-state-metrics
- 采集数据cadvisor和apiservers
- 指标汇总展示
- 监控
- nginx+lua+waf
- 项目二
- 简介
- nacos
- 简介
- nacos配置管理功能
- tengine
- java
- java参数说明及优化
- github快速访问
- amd和arm区别
- AWS
- 负载均衡ALB
