Saturday, November 7, 2015

Docker: How to setup Swarm with etcd?

Alright, let me start by sharing some information about my test environment:
(i) 3 nodes (hdp1, hdp2 and hdp3) running CentOS 7.1.1503
(ii) Docker v1.9.0
(iii) etcd 2.1.1 (running only on hdp1) listening for client connection at port 4001

Node "hdp1" will be Swarm Master.

(1) Firstly, let's reconfigure the Docker daemon running on ALL the nodes by adding "-H tcp://0.0.0.0:2375".

Eg.
vi /usr/lib/systemd/system/docker.service
[Amend the line] ExecStart=/usr/bin/docker daemon -H fd:// -H tcp://0.0.0.0:2375


(2) You would have to reload systemctl and restart docker on ALL the nodes after the above changes.

systemctl daemon-reload
systemctl restart docker
[To verify] ps -ef | grep docker

(3)  Make sure "etcd" is running. If not, please start it and make sure it's listening on the intended port (4001 in this case).

systemctl start etcd
[To verify] netstat -an | grep [port] | grep LISTEN

(4) On the other nodes (non-swarm-master - in this case "hdp2" and "hdp3"), execute the following command to join them to the cluster:

docker run -d swarm join --addr=[IP of the node]:2375 etcd://[IP of etcd host]:[port]/[optional path prefix]

WHERE (in this example)
[IP of the node] = IP address of node "hdp2" and "hdp3"
[IP of etcd host] = IP address of node "hdp1" where the only etcd instance is running
[port] = Port that etcd uses to listen to incoming client connection (in this example = 4001)
[optional path prefix] = Path that etcd uses to store data about the registered Swarm nodes

The final command:
docker run -d swarm join --addr=192.168.0.171:2375 etcd://192.168.0.170:4001/swarm
docker run -d swarm join --addr=192.168.0.172:2375 etcd://192.168.0.170:4001/swarm


(5) You can verify that the nodes are registered with the following command:

etcdctl ls /swarm/docker/swarm/nodes


(6) If all nodes are registered successfully, you can now start up the Swarm Master (in this example, on node "hdp1").

docker run -p [host port]:2375 -d swarm manage -H tcp://0.0.0.0:2375 etcd://[IP of etcd host]:4001/swarm


WHERE [in this example]
[host port] = 9999 (or any other free network port you selected - you will use this port to communicate with Swarm Master)
[IP of etcd host = IP address of node "hdp1" where the only etcd instance is running

Eg.

docker run -p 9999:2375 -d swarm manage -H tcp://0.0.0.0:2375 etcd://192.168.0.170:4001/swarm

(7) To verify that the Swarm cluster is now working properly, execute the following command:

docker -H [IP of the host where Swarm Master is running]:[port] info

WHERE (in this example)
[IP of the host where Swarm Master is running] = Node "hdp1" (192.168.0.170)
[port] = 9999 (refer to Step (6) above)

NOTE: You can use any Docker CLI as normal with Swarm cluster =>

docker -H [IP of the host where Swarm Master is running]:[port] ps -a
docker -H [IP of the host where Swarm Master is running]:[port] logs [container id]
docker -H [IP of the host where Swarm Master is running]:[port] inspect [container id]

(8) Let's spin up a container and see how your Swarm cluster handles it.

docker -H [IP of the host where Swarm Master is running]:[port] run -d --name nginx-1 -p 80 nginx


(9) Let's check where the container is running.

docker -H [IP of the host where Swarm Master is running]:[port] ps -a

(10) You can stop the running container by issuing:

docker -H [IP of the host where Swarm Master is running]:[port] stop [container id]



Tuesday, November 3, 2015

Mesos/Kubernetes: How to install and run Kubernetes on Mesos with your local cluster?

First of all, let me share with you my test environment:
(1) CentOS 7.1.1503 (nodes = hdp1, hdp2 and hdp3)
(2) HDP 2.3.2 (re-using the installed Zookeeper)
(3) Docker v1.8.3
(4) golang 1.4.2
(5) etcd 2.1.1

The official documentation for Kubernetes-Mesos integration can be found here. It uses Google Compute Engine (GCE), but this blog entry will share about deploying Kubernetes-Mesos integration on a local cluster.

Ok, let's begin...

Prerequisites

(1) A working local Mesos cluster
NOTE: To build one, please refer to this.

(2) Install Docker on ALL nodes.
(a) Make sure yum has access to the official Docker repository.
(b) Execute "yum install docker-engine"
(c) Enable docker.service with "systemctl enable docker.service"
(d) Start docker.service with "systemctl start docker.service"

(3) Install "golang" on the node which you wish to install and deploy Kubernetes-Mesos integration.
(a) Execute "yum install golang"

(4) Install "etcd" on a selected node (preferably on the node that host the Kubernetes-Mesos integration for testing purposes).
(a) Execute "yum install etcd"
(b) Amend file "/usr/lib/systemd/system/etcd.service" (see below):
[FROM]
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd"
[TO]
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --listen-client-urls http://0.0.0.0:4001 --advertise-client-urls http://[node_ip]:4001"
WHERE
[node_ip] = IP Address of the node (hostname -i)

(c) Reload systemctl daemon with "systemctl daemon-reload".
(d) Enable etcd.service with "systemctl enable etcd.service".
(e) Start etcd.service with "systemctl start etcd.service".

***

Build Kubernetes-Mesos

NOTE: Execute the following on the node selected to host the Kubernetes-Mesos integration.
cd [directory to install kubernetes-mesos]
git clone https://github.com/kubernetes/kubernetes
cd kubernetes
export KUBERNETES_CONTRIB=mesos
make

***

Export environment variables

(1) Export the following environment variables:

export KUBERNETES_MASTER_IP=$(hostname -i)
export KUBERNETES_MASTER=http://${KUBERNETES_MASTER_IP}:8888
export MESOS_MASTER=[zk://.../mesos]
export PATH="[directory to install kubernetes-mesos]/_output/local/go/bin:$PATH"

WHERE
[zk://.../mesos] = URL of the zookeeper nodes (Eg. zk://hdp1:2181,hdp2:2181,hdp3:2181/mesos)
[directory to install kubernetes-mesos] = Directory used to perform "git clone" (see "Build Kubernetes-Mesos" above).

(2) Amend .bash_profile to make the variables permanent.
(3) Remember to source the .bash_profile file after amendment (. ~/.bash_profile).

***

Configure and start Kubernetes-Mesos service

(1) Create a cloud config file mesos-cloud.conf in the current directory with the following contents:
$ cat <<EOF >mesos-cloud.conf
[mesos-cloud]
        mesos-master        = ${MESOS_MASTER}
EOF
NOTE:
If you have not set ${MESOS_MASTER}, it should be like (example) "zk://hdp1:2181,hdp2:2181,hdp3:2181/mesos".

(2) Create a script to start all the relevant components (API server, controller manager, and scheduler):

km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --v=1 >apiserver.log 2>&1 &

sleep 3

km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=./mesos-cloud.conf  \
  --v=1 >controller.log 2>&1 &

sleep 3

km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=10.10.10.10 \
  --cluster-domain=cluster.local \
  --contain-pod-resources=false \
  --v=2 >scheduler.log 2>&1 &

NOTE:
Since CentOS uses systemd, you will hit this issue. Hence, you need to add the "--contain-pod-resources=false" to the scheduler (in bold above).

(3) Give execute permission to the script (chmod 700 <script>).
(4) Execute the script.

***

Validate Kubernetes-Mesos services
$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
# NOTE: Your service IPs will likely differ
$ kubectl get services
NAME             LABELS                                    SELECTOR   IP(S)          PORT(S)
k8sm-scheduler   component=scheduler,provider=k8sm         <none>     10.10.10.113   10251/TCP
kubernetes       component=apiserver,provider=kubernetes   <none>     10.10.10.1     443/TCP
(4) Lastly, look for Kubernetes in the Mesos web GUI by pointing your browser to http://[mesos-master-ip:port]. Go to the Frameworks tab, and look for an active framework named "Kubernetes".

Kubernetest framework is registered with Mesos
***

Let's spin up a pod

(1) Write a JSON pod description to a local file:
$ cat <<EOPOD >nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
EOPOD
(2) Send the pod description to Kubernetes using the "kubectl" CLI:
$ kubectl create -f ./nginx.yaml
pods/nginx
Submitted pod through kubectl
(3) Wait a minute or two while Docker downloads the image layers from the internet. We can use the kubectl interface to monitor the status of our pod:
$ kubectl get pods
NAME      READY     STATUS    RESTARTS   AGE
nginx     1/1       Running   0          14s
(4) Verify that the pod task is running in the Mesos web GUI. Click on the Kubernetes framework. The next screen should show the running Mesos task that started the Kubernetes pod.
Mesos WebGUI shows active Kubernetes task

Mesos WebGUI shows that the Kubernetes task is RUNNING
Click through "Sandbox" link of the task to get to the "executor.log"

An example of "executor.log"

Connected to the node where the container is running

Getting Kubernetes to work on Mesos can be rather challenging at this point of time.

However, it is possible and hopefully, over time, Kubernetes-Mesos integration can work seamlessly.

Have fun!