## Logging and Monitoring

Mesos provides built-in logging and monitoring, and some frameworks additionally offer monitoring for the tasks they run. Through these interfaces, users can follow the state of the cluster in real time.

### Log Configuration

By default, log files are located in the `/var/log/mesos` directory, with different suffixes depending on the log level.

Users can consult these logs to debug problems encountered in use.

In general, it is recommended to set the log path with the `--log_dir` option and to monitor the logs with a log analysis engine.

### Monitoring

Mesos provides convenient monitoring endpoints for checking the state of each node in the cluster.

#### Master Node

The address `http://MASTER_NODE:5050/metrics/snapshot` returns status statistics for the Mesos master node, including resource usage (CPU, disk, memory), system state, slave nodes, frameworks, and task states.

For example, query the status of the master node `10.0.0.2` and parse the returned JSON object with jq.

```
$ curl -s http://10.0.0.2:5050/metrics/snapshot | jq .
{
  "system/mem_total_bytes": 4144713728,
  "system/mem_free_bytes": 153071616,
  "system/load_5min": 0.37,
  "system/load_1min": 0.6,
  "system/load_15min": 0.29,
  "system/cpus_total": 4,
  "registrar/state_store_ms/p9999": 45.4096616192,
  "registrar/state_store_ms/p999": 45.399272192,
  "registrar/state_store_ms/p99": 45.29537792,
  "registrar/state_store_ms/p95": 44.8336256,
  "registrar/state_store_ms/p90": 44.2564352,
  "registrar/state_store_ms/p50": 34.362368,
  ...
  "master/recovery_slave_removals": 1,
  "master/slave_registrations": 0,
  "master/slave_removals": 0,
  "master/slave_removals/reason_registered": 0,
  "master/slave_removals/reason_unhealthy": 0,
  "master/slave_removals/reason_unregistered": 0,
  "master/slave_reregistrations": 2,
  "master/slave_shutdowns_canceled": 0,
  "master/slave_shutdowns_completed": 1,
  "master/slave_shutdowns_scheduled": 1
}
```

#### Slave Node

The address `http://SLAVE_NODE:5051/metrics/snapshot` returns status statistics for a Mesos slave node, including resources, system state, and the state of various messages.

For example, query the status of the slave node `10.0.0.10`.

```
$ curl -s http://10.0.0.10:5051/metrics/snapshot | jq .
{
  "system/mem_total_bytes": 16827785216,
  "system/mem_free_bytes": 3377315840,
  "system/load_5min": 0.11,
  "system/load_1min": 0.16,
  "system/load_15min": 0.13,
  "system/cpus_total": 8,
  "slave/valid_status_updates": 11,
  "slave/valid_framework_messages": 0,
  "slave/uptime_secs": 954125.458927872,
  "slave/tasks_starting": 0,
  "slave/tasks_staging": 0,
  "slave/tasks_running": 1,
  "slave/tasks_lost": 0,
  "slave/tasks_killed": 2,
  "slave/tasks_finished": 0,
  "slave/executors_preempted": 0,
  "slave/executor_directory_max_allowed_age_secs": 403050.709525191,
  "slave/disk_used": 0,
  "slave/disk_total": 88929,
  "slave/disk_revocable_used": 0,
  "slave/disk_revocable_total": 0,
  "slave/disk_revocable_percent": 0,
  "slave/disk_percent": 0,
  "containerizer/mesos/container_destroy_errors": 0,
  "slave/container_launch_errors": 6,
  "slave/cpus_percent": 0.025,
  "slave/cpus_revocable_percent": 0,
  "slave/cpus_revocable_total": 0,
  "slave/cpus_revocable_used": 0,
  "slave/cpus_total": 8,
  "slave/cpus_used": 0.2,
  "slave/executors_registering": 0,
  "slave/executors_running": 1,
  "slave/executors_terminated": 8,
  "slave/executors_terminating": 0,
  "slave/frameworks_active": 1,
  "slave/invalid_framework_messages": 0,
  "slave/invalid_status_updates": 0,
  "slave/mem_percent": 0.00279552715654952,
  "slave/mem_revocable_percent": 0,
  "slave/mem_revocable_total": 0,
  "slave/mem_revocable_used": 0,
  "slave/mem_total": 15024,
  "slave/mem_used": 42,
  "slave/recovery_errors": 0,
  "slave/registered": 1,
  "slave/tasks_failed": 6
}
```

In addition, the address `http://SLAVE_NODE:5051/monitor/statistics.json` exposes network statistics for the containers on that slave node, including inbound and outbound traffic, dropped packets, and queue status. It is retrieved in the same way as above, so it is not demonstrated here.
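
For readers who prefer to process a `/metrics/snapshot` response in a script rather than with jq, the following is a minimal Python sketch. The helper names `fetch_snapshot`, `group_by_prefix`, and `memory_used_fraction` are hypothetical and not part of Mesos; the sketch assumes only that the endpoint returns a flat JSON object of `"prefix/name": number` pairs, as in the output shown earlier.

```python
import json
from urllib.request import urlopen


def fetch_snapshot(host, port, timeout=5.0):
    """Fetch /metrics/snapshot from a Mesos node and return it as a dict.

    Hypothetical helper: host and port are whatever your cluster uses,
    for example port 5050 on a master or 5051 on a slave.
    """
    url = f"http://{host}:{port}/metrics/snapshot"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def group_by_prefix(metrics):
    """Group flat "prefix/name" keys by their first path component."""
    groups = {}
    for key, value in metrics.items():
        prefix, _, name = key.partition("/")
        groups.setdefault(prefix, {})[name] = value
    return groups


def memory_used_fraction(metrics):
    """Rough fraction of system memory in use, from the system/* counters."""
    total = metrics["system/mem_total_bytes"]
    free = metrics["system/mem_free_bytes"]
    return (total - free) / total


if __name__ == "__main__":
    # A few values copied from the slave-node example; no network access needed.
    sample = {
        "system/mem_total_bytes": 16827785216,
        "system/mem_free_bytes": 3377315840,
        "slave/tasks_running": 1,
        "slave/tasks_failed": 6,
    }
    print(group_by_prefix(sample)["slave"])
    print(f"memory in use: {memory_used_fraction(sample):.1%}")
```

Against a live cluster, `fetch_snapshot("10.0.0.10", 5051)` would supply the dictionary in place of `sample`. Grouping by prefix is only one possible convenience; it makes it easy to forward, say, just the `slave` counters to an external monitoring system.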