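Problem description
The etcd image that Kuboard uses by default runs with etcd's default 2 GB backend storage quota. Once the database reaches that quota, etcd raises a NOSPACE alarm.
Solution
Change the image's startup parameters to raise etcd's backend storage quota.
Pull the original image and modify the entrypoint file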
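docker pull eipwork/etcd-host:3.4.16-2
# Create a stopped container so the entrypoint can be copied out, then remove it
CID=$(docker create eipwork/etcd-host:3.4.16-2)
docker cp "$CID":/docker-entrypoint.sh .
docker rm "$CID"
vim docker-entrypoint.sh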
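Append two flags to the etcd command at the end of the file. Keep the explanatory comments above the command: a # line placed between backslash-continued lines ends the command early, so the flags after it would never reach etcd.
# --auto-compaction-retention=1: compact key history automatically, keeping the last hour
# --quota-backend-bytes=8388608000: raise the backend storage quota to 8000 MiB (about 8 GB)
etcd --name ${HOSTNAME} \
--listen-peer-urls http://${HOSTIP}:2382 \
--listen-client-urls http://${HOSTIP}:2381 \
--advertise-client-urls http://${HOSTIP}:2381 \
--initial-advertise-peer-urls http://${HOSTIP}:2382 \
--initial-cluster-token kuboard-etcd-cluster-1 \
--initial-cluster ${PEERS} \
--initial-cluster-state new \
--snapshot-count=10000 \
--log-level=info \
--logger=zap \
--data-dir /data \
--auto-compaction-retention=1 \
--quota-backend-bytes=8388608000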
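The quota flag takes a plain byte count, so it helps to compute the value rather than type it by hand. The sketch below uses two hypothetical helper functions (not part of etcd or the image) to show where 8388608000 comes from and how it compares with an exact 8 GiB and with etcd's 2 GiB default.

```shell
# Convert a size in MiB or GiB to the byte count expected by --quota-backend-bytes.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

mib_to_bytes 8000   # 8388608000, the value passed to --quota-backend-bytes here
gib_to_bytes 8      # 8589934592, exactly 8 GiB
gib_to_bytes 2      # 2147483648, etcd's default quota
```

Either 8000 MiB or 8 GiB works; the point is that the flag must be several times larger than the 2 GiB default for the alarm to stay away.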
Rebuild the image with the following Dockerfile
FROM eipwork/etcd-host:3.4.16-2
COPY ./docker-entrypoint.sh /docker-entrypoint.sh
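A minimal sketch of building and publishing the image from that Dockerfile. The tag registry.example.com/etcd-host:3.4.16-2-quota is a hypothetical placeholder, and the function name is only for illustration; run it from the directory that holds the Dockerfile and the edited docker-entrypoint.sh.

```shell
# Hypothetical image tag; replace it with a path in your own registry.
IMAGE_TAG="registry.example.com/etcd-host:3.4.16-2-quota"

# Build the image from the current directory and push it to the registry.
build_and_push() {
  docker build -t "$IMAGE_TAG" . &&
  docker push "$IMAGE_TAG"
}

# Usage: build_and_push
```

The kuboard-etcd workload then has to reference the new tag before the changed flags take effect.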
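After deploying the modified image, the alarm still has to be cleared manually:
First, raise the liveness probe's initial delay. While the NOSPACE alarm is active, /health reports the member as unhealthy, so the probe would otherwise restart the container before the alarm can be disarmed.
kubectl -n kuboard edit statefulset kuboard-etcd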
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /health
    port: 2381
    scheme: HTTP
  initialDelaySeconds: 30 # raise this value
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
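The same probe change can also be made without an interactive editor, which makes it easier to undo afterwards. This is a sketch under two assumptions: etcd is the first container (index 0) in the pod template, and the workload is the kuboard-etcd StatefulSet in the kuboard namespace. The function names and the value 300 are illustrative only.

```shell
# Build a JSON Patch that sets the liveness probe's initialDelaySeconds.
# Assumption: etcd is the first container (index 0) in the pod template.
probe_patch() {
  printf '[{"op":"replace","path":"/spec/template/spec/containers/0/livenessProbe/initialDelaySeconds","value":%d}]' "$1"
}

# Apply the patch to the kuboard-etcd StatefulSet (requires cluster access).
set_liveness_delay() {
  kubectl -n kuboard patch statefulset kuboard-etcd --type=json -p "$(probe_patch "$1")"
}

# Usage:
#   set_liveness_delay 300   # before disarming the alarm
#   set_liveness_delay 30    # put the original value back afterwards
```

Like kubectl edit, the patch changes the pod template and therefore restarts the etcd pods.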
Enter the container and clear the alarm
kubectl -n kuboard exec -it kuboard-etcd-0 -- sh
ETCDCTL_API=3 etcdctl --endpoints="http://127.0.0.1:2381" --write-out=table endpoint status
ETCDCTL_API=3 etcdctl --endpoints="http://127.0.0.1:2381" alarm disarm
Once the alarm is cleared, restore the probe to its original settings.
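If the database is still close to the quota after the change, it may be worth reclaiming space before disarming. The sketch below follows the compact, defragment, disarm sequence from the etcd maintenance guide, adapted to the 2381 client port used here. The function names are hypothetical, and it assumes grep with -E and -o is available inside the container.

```shell
# Pull the current revision out of `endpoint status --write-out=json` output.
extract_revision() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Compact history up to the current revision, defragment, then clear alarms.
reclaim_space() {
  endpoint="http://127.0.0.1:2381"
  rev=$(ETCDCTL_API=3 etcdctl --endpoints="$endpoint" endpoint status --write-out=json | extract_revision)
  ETCDCTL_API=3 etcdctl --endpoints="$endpoint" compact "$rev"
  ETCDCTL_API=3 etcdctl --endpoints="$endpoint" defrag
  ETCDCTL_API=3 etcdctl --endpoints="$endpoint" alarm disarm
  # An empty list confirms that no alarm is left.
  ETCDCTL_API=3 etcdctl --endpoints="$endpoint" alarm list
}

# Usage (inside the etcd container): reclaim_space
```

Defragmentation blocks the member while it runs, so on a multi-member cluster it is safer to do one member at a time.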