Using kubeadm and flannel over 4 nodes running RHEL 7, I did the following:
- Opened port 10250 on all nodes
- Applied the fix for the "Failed to get kubernetes address: No kubernetes source found to address" issue
- Ran kubectl create -f deploy/1.8+/
- Ran kubectl get pods -n kube-system, which showed CrashLoopBackOff on metrics-server:
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-4q7ct 1/1 Running 10 7d
coredns-78fcdf6894-7tj52 1/1 Running 10 7d
etcd-thalia0.ahc.umn.edu 1/1 Running 0 7d
kube-apiserver-thalia0.ahc.umn.edu 1/1 Running 0 7d
kube-controller-manager-thalia0.ahc.umn.edu 1/1 Running 0 7d
kube-flannel-ds-amd64-78hbk 1/1 Running 0 7d
kube-flannel-ds-amd64-gdttr 1/1 Running 0 7d
kube-flannel-ds-amd64-rzhm2 1/1 Running 0 7d
kube-flannel-ds-amd64-xc2n7 1/1 Running 0 7d
kube-proxy-b86kn 1/1 Running 0 7d
kube-proxy-g27sk 1/1 Running 0 7d
kube-proxy-rtgtp 1/1 Running 0 7d
kube-proxy-x2pp7 1/1 Running 0 7d
kube-scheduler-thalia0.ahc.umn.edu 1/1 Running 0 7d
kubernetes-dashboard-7b7cb74c5c-wgt8f 1/1 Running 0 6d
metrics-server-85ff8f7b84-2x5th 0/1 CrashLoopBackOff 8 23m
- Ran kubectl -n kube-system logs $(kubectl get pods --namespace=kube-system -l k8s-app=metrics-server -o name) and got this output:
I0828 19:26:41.686932 1 heapster.go:71] /metrics-server --source=kubernetes:https://kubernetes.default
I0828 19:26:41.687023 1 heapster.go:72] Metrics Server version v0.2.1
I0828 19:26:41.687360 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version
I0828 19:26:41.687388 1 configs.go:62] Using kubelet port 10255
E0828 19:27:01.692571 1 kubelet.go:331] Failed to load nodes: Get https://kubernetes.default/api/v1/nodes: dial tcp: lookup kubernetes.default on 10.96.0.10:53: read udp 10.244.2.4:34644->10.96.0.10:53: read: no route to host
I0828 19:27:01.692700 1 heapster.go:128] Starting with Metric Sink
I0828 19:27:02.500852 1 serving.go:308] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
W0828 19:27:04.381151 1 authentication.go:222] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
F0828 19:27:04.381187 1 heapster.go:97] Could not create the API server: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 10.96.0.1:443: getsockopt: no route to host
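The "no route to host" errors above (pod 10.244.2.4 failing to reach the cluster DNS at 10.96.0.10, and metrics-server failing to reach the API at 10.96.0.1) are, on RHEL 7, commonly caused by firewalld dropping overlay/pod-to-service traffic rather than by metrics-server itself. A hedged sketch of rules that are often needed on each node (ports per the kubeadm prerequisites; these are not the commands from the original post):

```shell
# Run on each RHEL 7 node (sketch; adjust ports/zones to your environment)
firewall-cmd --permanent --add-port=6443/tcp    # Kubernetes API server
firewall-cmd --permanent --add-port=10250/tcp   # kubelet
firewall-cmd --permanent --add-port=8472/udp    # flannel VXLAN overlay
firewall-cmd --permanent --add-masquerade       # pod-to-service NAT
firewall-cmd --reload
```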
I also looked at the various logs and noticed that the flannel pods were throwing a whole slew of these errors:
E0829 19:41:32.636680 1 reflector.go:201] github.com/coreos/flannel/subnet/kube/kube.go:295: Failed to list *v1.Node: Get https://10.96.0.1:443/api/v1/nodes?resourceVersion=0: net/http: TLS handshake timeout
I was also getting this error on the scheduler pod:
E0829 19:41:32.637368 1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:129: Failed to list *core.Service: Get https://134.84.53.162:6443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
EDIT 1
I rebuilt the cluster after tearing it down, this time adding a rule on the local firewall to allow port 443 (for dealing with kubectl proxy).
The output of kubectl get services --namespace=kube-system is:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 15h
kubernetes-dashboard ClusterIP 10.98.72.170 <none> 443/TCP 20m
metrics-server ClusterIP 10.111.155.9 <none> 443/TCP 1m
Also of note: after the teardown and reinitialization of the cluster, neither the flannel pods nor the scheduler pod are throwing those errors anymore. I'm only getting the error on the metrics-server pod, along with this new error on the apiserver pod:
: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0830 20:43:38.101286 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
I0830 20:45:38.101548 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0830 20:45:38.101757 1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0830 20:45:38.101779 1 controller.go:119] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0830 20:45:44.532250 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get https://10.111.155.9:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0830 20:45:48.894505 1 controller.go:105] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
E0830 20:45:48.894693 1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
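For what it's worth, when metrics-server can't reach kubelets or resolve cluster DNS, its Deployment is often patched with flags like the following. This is a hedged sketch only: these flags belong to metrics-server v0.3+, while the logs above show the older heapster-based v0.2.1, which uses a different flag set.

```yaml
# Fragment of a metrics-server Deployment spec (v0.3+ flag names)
containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.1
  args:
  - --kubelet-insecure-tls                        # skip kubelet cert verification
  - --kubelet-preferred-address-types=InternalIP  # avoid relying on cluster DNS
```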
Furthermore, digging into the authentication.go:222 warning from the metrics-server log above ("Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLE_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'"), I ran kubectl get roles -n kube-system extension-apiserver-authentication-reader -o yaml and got the following output:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
creationTimestamp: 2018-08-30T00:58:35Z
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: extension-apiserver-authentication-reader
namespace: kube-system
resourceVersion: "132"
selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/roles/extension-apiserver-authentication-reader
uid: d2f1c80c-abef-11e8-95cc-005056891f42
rules:
- apiGroups:
- ""
resourceNames:
- extension-apiserver-authentication
resources:
- configmaps
verbs:
- get
Lastly, the output of kubectl get apiservice v1beta1.metrics.k8s.io -o yaml is:
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
creationTimestamp: 2018-08-30T22:41:26Z
name: v1beta1.metrics.k8s.io
resourceVersion: "119754"
selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
uid: d403e18f-aca5-11e8-95cc-005056891f42
spec:
group: metrics.k8s.io
groupPriorityMinimum: 100
insecureSkipTLSVerify: true
service:
name: metrics-server
namespace: kube-system
version: v1beta1
versionPriority: 100
status:
conditions:
- lastTransitionTime: 2018-08-30T22:41:26Z
message: endpoints for service/metrics-server in "kube-system" have no addresses
reason: MissingEndpoints
status: "False"
type: Available
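The MissingEndpoints condition in the status above just means the metrics-server Service currently selects no Ready pods, which is consistent with the CrashLoopBackOff. Assuming the labels from the standard deploy/1.8+ manifests, this can be confirmed with something like:

```shell
# Endpoints stay empty while the backing pod is not Ready
kubectl -n kube-system get endpoints metrics-server

# Events usually show why the container keeps restarting
kubectl -n kube-system describe pod -l k8s-app=metrics-server
```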
This seems like an obvious network problem (firewall?), but I am not sure how to proceed. Is this a flannel or a coredns configuration issue?
Best Answer
I switched the CNI from flannel to calico and that seems to have resolved the problems I was having (I was also unable to get the argo workflow controller to fire up in my cluster).
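For anyone attempting the same switch, the rough sequence would look something like the sketch below. These are not the exact commands from the answer, and kubeadm reset is destructive, so treat this as an outline only:

```shell
# On every node: tear down the existing cluster state (destructive!)
kubeadm reset

# On the master: re-initialize with the pod CIDR that Calico's
# default manifest expects
kubeadm init --pod-network-cidr=192.168.0.0/16

# Install Calico from the manifest matching your Kubernetes version,
# downloaded from the Calico documentation
kubectl apply -f calico.yaml
```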