mirror of
https://github.com/CoderSherlock/CoderSherlock.github.io.git
synced 2026-06-13 08:08:10 -07:00
Compare commits
2 Commits
c59459f1f0
...
2f9718bfbe
| Author | SHA1 | Date | |
|---|---|---|---|
| 2f9718bfbe | |||
| 3bd704c592 |
@@ -27,4 +27,4 @@ TL;DR: Bumping up to level 5 would satisfy most debugging needs.
|
||||
At the time this note was written, in `kubelet`-related code, level 8 was used only in `pkg/kubelet/prober/prober_manager.go` and level 7 only in `pkg/kubelet/logs/container_log_manager.go`. Level 6 was used in 11 places, none of which are related to the workload lifecycle.
|
||||
|
||||
## Further readings
|
||||
[Inotify watcher leaks in Kubelet]
|
||||
[Inotify watcher leaks in Kubelet](/posts/inotify-watcher-leaks-in-kubelet.html)
|
||||
@@ -1,12 +1,156 @@
|
||||
---
|
||||
layout: post
|
||||
title: Inotify watcher leaks in Kubelet
|
||||
date: 2024-04-11 16:35 -0400
|
||||
date: 2024-04-18 16:35 -0400
|
||||
description:
|
||||
cover:
|
||||
cover: '/static/2024-04/kubelet_inotify_leak_logo.png'
|
||||
category:
|
||||
tags:
|
||||
published: false
|
||||
tags: ["Kubernetes", "Kubelet", "Debug", "Inotify"]
|
||||
published: true
|
||||
sitemap: true
|
||||
permalink:
|
||||
author: Pengzhan Hao
|
||||
---
|
||||
|
||||
## Symptom
|
||||
Recently, I faced an issue where Kubelet on a node reported errors about failing to create file descriptors.
|
||||
|
||||
```bash
|
||||
error creating file watcher: too many open files
|
||||
error creating file watcher: no space left on device
|
||||
```
|
||||
|
||||
After a quick check, I found that the node had `max_user_watches` set to 10000, but `TotalinotifyWatches` was beyond this value. (P.S. I'm still not sure why watchers can be created beyond the cap.) To find out which process held the most watchers, I used the following command[^flbit_ino].
|
||||
|
||||
```bash
|
||||
echo -e "COUNT\tPID\tUSER\tCOMMAND"; sudo find /proc/[0-9]*/fdinfo -type f 2>/dev/null | sudo xargs grep ^inotify 2>/dev/null | cut -d/ -f 3 | uniq -c | sort -nr | { while read -rs COUNT PID; do echo -en "$COUNT\t$PID\t"; ps -p $PID -o user=,command=; done; }
|
||||
|
||||
COUNT PID USER COMMAND
|
||||
7491 8412 root /home/kubernetes/bin/kubelet --v=2 --cloud-provide=gce --experi
|
||||
2620 1 root /sbin/init
|
||||
....
|
||||
```
|
||||
|
||||
Surprisingly, Kubelet held more than 7000 inotify watchers, so I suspected an inotify leak in Kubelet.
|
||||
## Leakage check
|
||||
|
||||
### Clean Kubelet
|
||||
To better understand the situation, I created a clean cluster on GKE with a single clean node. Roughly 70 inotify watchers were there. I created a single nginx pod and the number increased by 3. In theory, Kubelet uses these 3 watchers to monitor changes on `rootfs`, `kube-api-access` and `PodSandbox`. But to verify this, we need to check in more detail which inodes Kubelet monitors.
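A before/after comparison like the one above can be made with a small counter. This is only a sketch: `count_watches` is a hypothetical helper name, and it assumes that each watch shows up as one line starting with `inotify` in the process's `fdinfo` files. Reading another user's process needs root.

```bash
# count_watches PID: print how many inotify watches a process holds.
# Hypothetical helper; each watch is one "inotify ..." line in fdinfo.
count_watches() {
    cat /proc/"$1"/fdinfo/* 2>/dev/null | grep -c '^inotify' || true
}

# Example: the current shell itself
count_watches "$$"
```

Running it against the Kubelet PID before and after creating a pod gives the difference in watchers.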
|
||||
### Check inotify file descriptors
|
||||
To do so, let's look at how to track a single inotify file descriptor. Opening a process's `fdinfo` folder, we can examine each entry to find an inotify fd.
|
||||
|
||||
```bash
|
||||
# Find kubelet pid
|
||||
ps -aux | grep kubelet
|
||||
KPID=2430
|
||||
|
||||
# Find an example fd
|
||||
sudo ls /proc/2430/fdinfo
|
||||
|
||||
0 1 10 11 12 13 14 2 3 4 5 6 7 8 9
|
||||
|
||||
...
|
||||
|
||||
sudo cat /proc/2430/fdinfo/8
|
||||
|
||||
pos: 0
|
||||
flags: 02004000
|
||||
mnt_id: 15
|
||||
ino: 1057
|
||||
inotify wd:1 ino:3f327 sdev:800001 mask:fc6 ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:27f30300e5059ea2
|
||||
```
|
||||
|
||||
This is quite confusing, so I relied on `man proc`[^man_proc] to understand each field. For this fd, the information we need sits in the last line: an inotify entry that represents one watched file or folder. The most useful fields are `ino:3f327`, the inode number of the target file (in hexadecimal), and `sdev:800001`, the ID of the device where the inode resides (also in hex).
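As a rough sketch, the two fields can be decoded like this. `parse_inotify_line` is a hypothetical helper, and the major/minor split assumes the kernel's internal device encoding (major number in the bits above the low 20, minor number in the low 20 bits).

```bash
# parse_inotify_line LINE: decode ino and sdev from one fdinfo inotify entry.
# Hypothetical helper; assumes sdev = (major << 20) | minor.
parse_inotify_line() (
    ino_hex=$(printf '%s\n' "$1" | sed -n 's/.* ino:\([0-9a-f]*\) .*/\1/p')
    sdev_hex=$(printf '%s\n' "$1" | sed -n 's/.* sdev:\([0-9a-f]*\) .*/\1/p')
    sdev=$((0x$sdev_hex))
    echo "inode=$((0x$ino_hex)) dev=$((sdev >> 20)):$((sdev & 0xfffff))"
)

parse_inotify_line "inotify wd:1 ino:3f327 sdev:800001 mask:fc6 ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:27f30300e5059ea2"
# prints: inode=258855 dev=8:1
```

Under that assumption, `8:1` is the `MAJ:MIN` pair that `lsblk` prints for the partition holding the inode.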
|
||||
|
||||
`lsblk` shows that the node uses only 1 disk, so finding the target file is easy.
|
||||
|
||||
```bash
|
||||
# Cast to decimal
|
||||
ino=3f327
|
||||
dec="$((16#${ino}))"
|
||||
|
||||
# Find the target file
|
||||
# Run debugfs directly; `sudo eval` fails because eval is a shell builtin
sudo debugfs -R "ncheck ${dec}" /dev/sda1 2>/dev/null
|
||||
|
||||
debugfs 1.46.5 (30-Dec-2021)
|
||||
Inode Pathname
|
||||
258855 /etc/srv/kubernetes/pki/ca-certificates.crt
|
||||
```
|
||||
|
||||
Putting all the steps above into a single script, I can retrieve all target files, which helps to tell whether there is a real leak. I also count how many times each unique inode appears, which shows which inodes are monitored multiple times.
|
||||
|
||||
```bash
|
||||
cat << 'EOF' | sudo tee test.sh
# Kubelet PID: first argument, or looked up with pgrep when omitted
PID="${1:-$(pgrep -o -x kubelet)}"
echo "kubelet pid=${PID}"

# Hex inode number of every inotify watch held by the process
in_fds=$(find /proc/${PID}/fdinfo -type f 2>/dev/null | xargs grep ^inotify | cut -d " " -f 3 | cut -d ":" -f 2)
echo ${in_fds}
echo "Count: $(find /proc/${PID}/fdinfo -type f 2>/dev/null | xargs grep ^inotify | wc -l)"

uniq_fds=$(echo "${in_fds}" | sort | uniq)
echo ${uniq_fds}

while read -r element;
do
    # -x matches whole lines only, so inode 1 is not also counted inside 10b or 1259
    count=$(echo "${in_fds}" | grep -cx "${element}")
    dec="$((16#${element}))"
    # Resolve the inode to a path on the root disk
    loc=$(debugfs -R "ncheck ${dec}" /dev/sda1 2>/dev/null | tail -1 | cut -d " " -f 4)
    printf "%-6s %-10s %-6s %s\n" "${element}" "${dec}" "${count}" "${loc}"
done <<< "${uniq_fds}"
EOF
|
||||
|
||||
sudo bash test.sh
|
||||
|
||||
kubelet pid=2430
|
||||
3f327 3f321 ...
|
||||
Count: 120
|
||||
1 10b 1259 128a ...
|
||||
1 1 72 Inode Pathname
|
||||
10b 267 1 267 /etc/systemd/system/multi-user.target.wants/snapd.service
|
||||
...
|
||||
```
|
||||
|
||||
The output consists of the following parts:
- One line with the Kubelet PID
- One line listing the inode numbers of all watch targets
- One line with the total number of inotify watches (120)
- One line listing the sorted unique inode numbers
- One line per unique inode, showing its hexadecimal inode number, its decimal value, the watch count, the decimal value again (from the `debugfs` output) and the target file path
|
||||
|
||||
I ran the same script on the problematic node, and it showed the following result. In summary, most Kubelet watchers were targeting `ino:1`. There were 6649 of them, which is likely a leak, because there were only 150 pods on this node. Unfortunately, `debugfs` couldn't find any target file, so the output showed only the meaningless string `"Inode Pathname"`.
|
||||
|
||||
```
|
||||
kubelet pid=8412
|
||||
...
|
||||
Count: 7491
|
||||
...
|
||||
1 1 6649 Inode Pathname
|
||||
```
|
||||
|
||||
### Bad apple
|
||||
Why can't `debugfs` help anymore? The reason is simple: each pod uses its own rootfs and volume mounts. This means the watchers sit on different filesystems, each with its own independent inode numbering, while `debugfs` only inspects `/dev/sda1`. There are other ways to do it; I chose the most common one, searching the whole directory tree by inode number.
|
||||
|
||||
```
|
||||
sudo find / -inum 1
|
||||
/home/kubernetes/containerized_mounter/rootfs/dev
|
||||
/home/kubernetes/containerized_mounter/rootfs/proc
|
||||
...
|
||||
/home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/5325873d-f2a0-48df-83e2-0b911df2f77f/volumes/kubernetes.io~projected/kube-api-access-227jg
|
||||
...
|
||||
/dev
|
||||
/boot/efi
|
||||
...
|
||||
```
|
||||
|
||||
This makes things easy, because I can compare the pod IDs in these paths against the running pods to see whether any terminated pods are still there. And it did show that some non-existent pods were still being watched somehow.
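The comparison can be sketched as below. `stale_pod_uids` is a hypothetical helper. It assumes one file with the watched paths (for example, the saved output of the inode search) and one file with the UIDs of the running pods, one per line. That UID list could come from something like `kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.uid}{"\n"}{end}'`, which is also an assumption and not part of the original investigation.

```bash
# stale_pod_uids WATCHED_PATHS_FILE RUNNING_UIDS_FILE
# Hypothetical helper: print pod UIDs that appear in watched paths
# but are not in the list of running pods.
stale_pod_uids() {
    grep -oE 'pods/[0-9a-f-]{36}' "$1" \
        | cut -d/ -f2 \
        | sort -u \
        | grep -vxFf "$2"
}
```

Every UID it prints belongs to a pod directory that is still present in the watched paths but not in the list of running pods.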
|
||||
|
||||
## What to expect next
|
||||
|
||||
- [How Kubelet leaked inotify watchers?]()
|
||||
- [debugfs]()
|
||||
|
||||
## References
|
||||
|
||||
[^flbit_ino]: [Fluentbit error "cannot adjust chunk size" on GKE](https://stackoverflow.com/a/76712244)
|
||||
[^man_proc]: [proc(5)](https://manpages.courier-mta.org/htmlman5/proc.5.html)
|
||||
[^list_ino]: [Listing the files that are being watched by `inotify` instances](https://unix.stackexchange.com/a/646113)
|
||||
+13
-8
@@ -442,7 +442,7 @@ c13 9 26 20 30 26 7 11 -9 26 -27 26 -5 0 -3 -5 5 -10 9 -6 10 -10 3 -10 -24
|
||||
<ul class="menu">
|
||||
<li>
|
||||
<button type="button" class="button button--secondary button--pill tag-button tag-button--all" data-encode="">
|
||||
Show All<div class="tag-button__count">9</div>
|
||||
Show All<div class="tag-button__count">10</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Binghamton+university">
|
||||
<span>Binghamton university</span><div class="tag-button__count">1</div>
|
||||
@@ -450,8 +450,8 @@ c13 9 26 20 30 26 7 11 -9 26 -27 26 -5 0 -3 -5 5 -10 9 -6 10 -10 3 -10 -24
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Charles+proxy">
|
||||
<span>Charles proxy</span><div class="tag-button__count">1</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Debug">
|
||||
<span>Debug</span><div class="tag-button__count">1</div>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-2" data-encode="Debug">
|
||||
<span>Debug</span><div class="tag-button__count">2</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Diary">
|
||||
<span>Diary</span><div class="tag-button__count">1</div>
|
||||
@@ -459,11 +459,14 @@ c13 9 26 20 30 26 7 11 -9 26 -27 26 -5 0 -3 -5 5 -10 9 -6 10 -10 3 -10 -24
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-2" data-encode="Edge+computing">
|
||||
<span>Edge computing</span><div class="tag-button__count">2</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Kubelet">
|
||||
<span>Kubelet</span><div class="tag-button__count">1</div>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Inotify">
|
||||
<span>Inotify</span><div class="tag-button__count">1</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Kubernetes">
|
||||
<span>Kubernetes</span><div class="tag-button__count">1</div>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-2" data-encode="Kubelet">
|
||||
<span>Kubelet</span><div class="tag-button__count">2</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-2" data-encode="Kubernetes">
|
||||
<span>Kubernetes</span><div class="tag-button__count">2</div>
|
||||
</button>
|
||||
</li><li><button type="button" class="button button--pill tag-button tag-button-1" data-encode="Log">
|
||||
<span>Log</span><div class="tag-button__count">1</div>
|
||||
@@ -491,7 +494,9 @@ c13 9 26 20 30 26 7 11 -9 26 -27 26 -5 0 -3 -5 5 -10 9 -6 10 -10 3 -10 -24
|
||||
</button>
|
||||
</li></ul>
|
||||
</div>
|
||||
<div class="js-result layout--archive__result d-none"><div class="article-list items"><section><h2 class="article-list__group-header">2024</h2><ul class="items"><li class="item" itemscope itemtype="http://schema.org/BlogPosting" data-tags="Kubernetes,Kubelet,Debug">
|
||||
<div class="js-result layout--archive__result d-none"><div class="article-list items"><section><h2 class="article-list__group-header">2024</h2><ul class="items"><li class="item" itemscope itemtype="http://schema.org/BlogPosting" data-tags="Kubernetes,Kubelet,Debug,Inotify">
|
||||
<div class="item__content"><span class="item__meta">Apr 18</span><a itemprop="headline" class="item__header" href="/posts/inotify-watcher-leaks-in-kubelet">Inotify watcher leaks in Kubelet</a></div>
|
||||
</li><li class="item" itemscope itemtype="http://schema.org/BlogPosting" data-tags="Kubernetes,Kubelet,Debug">
|
||||
<div class="item__content"><span class="item__meta">Apr 10</span><a itemprop="headline" class="item__header" href="/posts/Debug-kubelet">Debug Kubelet</a></div>
|
||||
</li></ul></section><section><h2 class="article-list__group-header">2022</h2><ul class="items"><li class="item" itemscope itemtype="http://schema.org/BlogPosting" data-tags="Xv6,Teaching,Operating+system,Binghamton+university">
|
||||
<div class="item__content"><span class="item__meta">Feb 22</span><a itemprop="headline" class="item__header" href="/posts/cs350-labs">Labs of CS350</a></div>
|
||||
|
||||
@@ -1 +1 @@
|
||||
window.TEXT_SEARCH_DATA={'posts':[{'title':"STSD: Stop Talking Start Doing",'url':"/posts/welcome-to-my-blog"},{'title':"Using charles proxy to monitor mobile SSL traffics",'url':"/posts/charles-is-not-a-good-tool"},{'title':"Some of my previews experiment works: 2016",'url':"/posts/some-of-my-previews-exper-work"},{'title':"Xv6 introduction",'url':"/posts/intro-xv6"},{'title':"Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries",'url':"/posts/generate-word-cloud-with-chinese-fenci"},{'title':"EDDL: How do we train neural networks on limited edge devices - PART 1",'url':"/posts/eddl-how-do-we-train-on-limited-edge-devices"},{'title':"EDDL: How do we train neural networks on limited edge devices - PART 2",'url':"/posts/eddl-how-do-we-train-on-limited-edge-devices-part2"},{'title':"Labs of CS350",'url':"/posts/cs350-labs"},{'title':"Debug Kubelet",'url':"/posts/Debug-kubelet"}]};
|
||||
window.TEXT_SEARCH_DATA={'posts':[{'title':"STSD: Stop Talking Start Doing",'url':"/posts/welcome-to-my-blog"},{'title':"Using charles proxy to monitor mobile SSL traffics",'url':"/posts/charles-is-not-a-good-tool"},{'title':"Some of my previews experiment works: 2016",'url':"/posts/some-of-my-previews-exper-work"},{'title':"Xv6 introduction",'url':"/posts/intro-xv6"},{'title':"Generate Word Cloud Figures with Chinese-Tokenization and WordCloud python libraries",'url':"/posts/generate-word-cloud-with-chinese-fenci"},{'title':"EDDL: How do we train neural networks on limited edge devices - PART 1",'url':"/posts/eddl-how-do-we-train-on-limited-edge-devices"},{'title':"EDDL: How do we train neural networks on limited edge devices - PART 2",'url':"/posts/eddl-how-do-we-train-on-limited-edge-devices-part2"},{'title':"Labs of CS350",'url':"/posts/cs350-labs"},{'title':"Debug Kubelet",'url':"/posts/Debug-kubelet"},{'title':"Inotify watcher leaks in Kubelet",'url':"/posts/inotify-watcher-leaks-in-kubelet"}]};
|
||||
|
||||
+149
-3
File diff suppressed because one or more lines are too long
+26
-13
@@ -439,7 +439,31 @@ c13 9 26 20 30 26 7 11 -9 26 -27 26 -5 0 -3 -5 5 -10 9 -6 10 -10 3 -10 -24
|
||||
<div class="col-main cell cell--auto"><!-- start custom main top snippet -->
|
||||
|
||||
<!-- end custom main top snippet -->
|
||||
<article itemscope itemtype="http://schema.org/WebPage"><header style="display:none;"><h1>Home</h1></header><meta itemprop="headline" content="Home"><meta itemprop="author" content="Pengzhan Hao"/><div class="js-article-content"><div class="layout--articles"><div class="article-list items items--divided"><article class="item" itemscope itemtype="http://schema.org/BlogPosting"><div class="item__image" style="vertical-align: middle"><img class="image" src="/static/2024-04/Kubelet.webp" /></div><div class="item__content">
|
||||
<article itemscope itemtype="http://schema.org/WebPage"><header style="display:none;"><h1>Home</h1></header><meta itemprop="headline" content="Home"><meta itemprop="author" content="Pengzhan Hao"/><div class="js-article-content"><div class="layout--articles"><div class="article-list items items--divided"><article class="item" itemscope itemtype="http://schema.org/BlogPosting"><div class="item__image" style="vertical-align: middle"><img class="image" src="/static/2024-04/kubelet_inotify_leak_logo.png" /></div><div class="item__content">
|
||||
<header><a href="/posts/inotify-watcher-leaks-in-kubelet"><h2 itemprop="headline" class="item__header">Inotify watcher leaks in Kubelet</h2></a></header>
|
||||
<div class="item__description"><div class="article__content" itemprop="description articleBody">Symptom
|
||||
Recently, I faced an issue where Kubelet on a node reported errors about failing to create file descriptors.
|
||||
|
||||
error creating file watcher: too many open files
|
||||
error creating file watcher: no space left on device
|
||||
|
||||
|
||||
After short checking, I found the node has max_user_watches of 10000, but the TotalinotifyWatches is beyond this value. (P. S...</div><p><a href="/posts/inotify-watcher-leaks-in-kubelet">Read more</a></p></div><div class="article__info clearfix"><ul class="left-col menu"><li>
|
||||
<a class="button button--secondary button--pill button--sm"
|
||||
href="/archive.html?tag=Kubernetes">Kubernetes</a>
|
||||
</li><li>
|
||||
<a class="button button--secondary button--pill button--sm"
|
||||
href="/archive.html?tag=Kubelet">Kubelet</a>
|
||||
</li><li>
|
||||
<a class="button button--secondary button--pill button--sm"
|
||||
href="/archive.html?tag=Debug">Debug</a>
|
||||
</li><li>
|
||||
<a class="button button--secondary button--pill button--sm"
|
||||
href="/archive.html?tag=Inotify">Inotify</a>
|
||||
</li></ul><ul class="right-col menu"><li><i class="fas fa-user"></i> <span>Pengzhan Hao</span></li><li><i class="far fa-calendar-alt"></i> <span>Apr 18, 2024</span>
|
||||
</li></ul></div><meta itemprop="author" content="Pengzhan Hao"/><meta itemprop="datePublished" content="2024-04-18T16:35:00-04:00">
|
||||
<meta itemprop="keywords" content="Kubernetes,Kubelet,Debug,Inotify"></div>
|
||||
</article><article class="item" itemscope itemtype="http://schema.org/BlogPosting"><div class="item__image" style="vertical-align: middle"><img class="image" src="/static/2024-04/Kubelet.webp" /></div><div class="item__content">
|
||||
<header><a href="/posts/Debug-kubelet"><h2 itemprop="headline" class="item__header">Debug Kubelet</h2></a></header>
|
||||
<div class="item__description"><div class="article__content" itemprop="description articleBody">Debug logs
|
||||
|
||||
@@ -546,19 +570,8 @@ In the second half of this post, I will discuss a little bit more on how to debu
|
||||
</li></ul><ul class="right-col menu"><li><i class="fas fa-user"></i> <span>Pengzhan Hao</span></li><li><i class="far fa-calendar-alt"></i> <span>Oct 28, 2016</span>
|
||||
</li></ul></div><meta itemprop="author" content="Pengzhan Hao"/><meta itemprop="datePublished" content="2016-10-28T12:27:33-04:00">
|
||||
<meta itemprop="keywords" content="Research,Log,Miscellanies"></div>
|
||||
</article><article class="item" itemscope itemtype="http://schema.org/BlogPosting"><div class="item__image" style="vertical-align: middle"><img class="image" src="/static/2021-12/charles-proxy-logo.png" /></div><div class="item__content">
|
||||
<header><a href="/posts/charles-is-not-a-good-tool"><h2 itemprop="headline" class="item__header">Using charles proxy to monitor mobile SSL traffics</h2></a></header>
|
||||
<div class="item__description"><div class="article__content" itemprop="description articleBody">In this blog, I will generally talk about how to use proper tools to monitor SSL traffics of a mobile devices. Currently, I only can dealing with those SSL traffics which use an obviously certification. Some applications may not using system root cert or they doesn’t provide us a method to modify their own certs. For these situation, I still did...</div><p><a href="/posts/charles-is-not-a-good-tool">Read more</a></p></div><div class="article__info clearfix"><ul class="left-col menu"><li>
|
||||
<a class="button button--secondary button--pill button--sm"
|
||||
href="/archive.html?tag=Network">Network</a>
|
||||
</li><li>
|
||||
<a class="button button--secondary button--pill button--sm"
|
||||
href="/archive.html?tag=Charles+proxy">Charles proxy</a>
|
||||
</li></ul><ul class="right-col menu"><li><i class="fas fa-user"></i> <span>Pengzhan Hao</span></li><li><i class="far fa-calendar-alt"></i> <span>Oct 27, 2016</span>
|
||||
</li></ul></div><meta itemprop="author" content="Pengzhan Hao"/><meta itemprop="datePublished" content="2016-10-27T22:50:33-04:00">
|
||||
<meta itemprop="keywords" content="Network,Charles proxy"></div>
|
||||
</article></div>
|
||||
</div><div class="layout--home"><div class="pagination"><p>9 post articles, 2 pages.</p>
|
||||
</div><div class="layout--home"><div class="pagination"><p>10 post articles, 2 pages.</p>
|
||||
<div class="pagination__menu">
|
||||
<ul class="menu menu--nowrap"><li><div class="button button--secondary button--circle disabled">
|
||||
<i class="fas fa-angle-left"></i>
|
||||
|
||||
+5
-1
@@ -34,7 +34,11 @@
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://blog.pengzhan.dev/posts/Debug-kubelet</loc>
|
||||
<lastmod>2024-04-10T23:51:19-04:00</lastmod>
|
||||
<lastmod>2024-04-18T21:14:15-04:00</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://blog.pengzhan.dev/posts/inotify-watcher-leaks-in-kubelet</loc>
|
||||
<lastmod>2024-04-18T21:14:15-04:00</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://blog.pengzhan.dev/archive</loc>
|
||||
|
||||
Binary file not shown.
|
After Width: | Height: | Size: 59 KiB |