Sunday, July 23, 2017
Hi Guys,
I am back (like Superman :)!!
I have been bogged down with work, family and health issues for the past year or so. But then, I decided that my passion for technology is too great for me to stop sharing :)
So, here we go again...this time, get ready for more interesting stuff that I learned in the past year.
***
Now, let me start by sharing a data platform diagram that I have based my past year's learning on...
** At the DATA STORAGE layer, I am still contemplating which NoSQL database to select.
Before you get too excited or disappointed, let me just say that this is a platform that "I" think is perfect for today (and I am still constantly re-adjusting it by adding/removing technologies).
Each and every selected technology is the cream of the crop. I am not claiming that they are perfect, but they are good enough to perform the work consistently well.
Every one of them plays a specific role from the source to the destination (reporting).
I am still learning most (if not all) of them, so stay tuned for more exciting sharing!
Tuesday, January 26, 2016
Docker: How to start a container on a specific node in a Swarm cluster?
If you are using a Docker Swarm cluster and are wondering how you can start a container on a specific node, this post is for you!
The short answer is to custom label the docker node(daemon):
[root@hdp1 ~]# cat /etc/sysconfig/docker
OPTIONS="-H tcp://0.0.0.0:2375 --label nodename=n1.manfrix.com"
[root@hdp1 ~]#

NOTE: If you want to know more about the "label" option of docker daemon, you can refer here.
Make sure you restart the docker service after the changes:
systemctl restart docker
Verify that the daemon is now started with a label:
[root@hdp1 ~]# ps -ef | grep docker
root     50961     1  0 20:38 ?  00:00:02 /usr/bin/docker daemon -H fd:// -H tcp://0.0.0.0:2375 --label nodename=n1.manfrix.com
Make sure you remember to re-join the nodes to the swarm cluster after you restarted docker:
docker run -d swarm join --addr=[IP of the node]:2375 etcd://[IP of etcd host]:[port]/[optional path prefix]
After you have labeled all the nodes (daemons), then you can proceed to test:
[root@hdp1 ~]# docker -H tcp://localhost:9999 run -d --name centos-1 -p 80 -e constraint:nodename==n3.manfrix.com centos /bin/bash
82d42f3052da181ebb876d79e2aeeb68787c17045c625367cced067107f3cb08
[root@hdp1 ~]# docker -H tcp://localhost:9999 run -d --name nginx-1 -p 80 -e constraint:nodename==n2.manfrix.com nginx
68664b5046b1dc031b015c9241a2f16f1e663f0b384d395d810d36b46f317839
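To confirm where each container actually landed, a quick check against the same manager endpoint (port 9999 from my setup above) should do; note that the exact output format described in the comments below is an assumption based on how standalone Swarm prefixes container names with the node name:

# List containers through the Swarm manager; the NAMES column should show
# entries like "n2.manfrix.com/nginx-1" (node name / container name).
docker -H tcp://localhost:9999 ps

# Or inspect a single container; Swarm adds a "Node" section to the output.
docker -H tcp://localhost:9999 inspect nginx-1 | grep -A 5 '"Node"'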
For more information about Swarm node constraints, you can refer here.
Friday, January 15, 2016
Docker: /etc/default/docker or /etc/sysconfig/docker does not work anymore under systemd?
If you are wondering why, all of a sudden, your OPTIONS under /etc/default/docker or /etc/sysconfig/docker no longer work, I can assure you that you are not alone!
Simply put, it is due to the /usr/lib/systemd/system/docker.service file shipped with newer versions of Docker (most probably v1.7 and above).
The file looks like this now =>
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
Type=notify
ExecStart=/usr/bin/docker daemon -H fd://
MountFlags=slave
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity

[Install]
WantedBy=multi-user.target
You can see clearly that it no longer contains any environment variable for you to customize stuff.
To fix that, we need to create a systemd drop-in file for docker.service.
(1) Create a directory "/etc/systemd/system/docker.service.d" on ALL servers.
mkdir /etc/systemd/system/docker.service.d
(2) In the directory, create a file - "local.conf" - using your favorite text editor with the following lines:
[Service]
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// $OPTIONS \
$DOCKER_STORAGE_OPTIONS \
$DOCKER_NETWORK_OPTIONS \
$BLOCK_REGISTRY \
$INSECURE_REGISTRY
(3) Create or edit the /etc/sysconfig/docker or /etc/sysconfig/docker-storage or /etc/sysconfig/docker-network to specify your customized options.
Eg.
[/etc/sysconfig/docker]
OPTIONS="-H tcp://0.0.0.0:2375 --label nodename=node1.mytestmac.com"
(4) Reload systemd.
systemctl daemon-reload
(5) Verify that docker.service is now aware of its environment files.
systemctl show docker | grep -i env
Eg.
[root@hdp1 ~]# systemctl show docker | grep -i env
EnvironmentFile=/etc/sysconfig/docker (ignore_errors=yes)
EnvironmentFile=/etc/sysconfig/docker-storage (ignore_errors=yes)
EnvironmentFile=/etc/sysconfig/docker-network (ignore_errors=yes)
[root@hdp1 ~]#
(6) Restart docker.
systemctl restart docker
(7) Verify the changes.
ps -ef | grep docker
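Besides the process listing, you can also ask systemd directly for the effective start command; it should now include the options you set in /etc/sysconfig/docker (e.g. the -H tcp://0.0.0.0:2375 and --label values from the example above):

systemctl show docker --property=ExecStart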
Hope that helps to resolve your headaches! :)
Saturday, November 7, 2015
Docker: How to setup Swarm with etcd?
Alright, let me start by sharing some information about my test environment:
(i) 3 nodes (hdp1, hdp2 and hdp3) running CentOS 7.1.1503
(ii) Docker v1.9.0
(iii) etcd 2.1.1 (running only on hdp1) listening for client connection at port 4001
Node "hdp1" will be Swarm Master.
(1) Firstly, let's reconfigure the Docker daemon running on ALL the nodes by adding "-H tcp://0.0.0.0:2375".
Eg.
vi /usr/lib/systemd/system/docker.service

[Amend the line]
ExecStart=/usr/bin/docker daemon -H fd:// -H tcp://0.0.0.0:2375
(2) You would have to reload systemctl and restart docker on ALL the nodes after the above changes.
systemctl daemon-reload
systemctl restart docker

[To verify]
ps -ef | grep docker
(3) Make sure "etcd" is running. If not, please start it and make sure it's listening on the intended port (4001 in this case).
systemctl start etcd

[To verify]
netstat -an | grep [port] | grep LISTEN
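As an extra (optional) sanity check, etcd v2 also exposes a simple HTTP endpoint on the client port, so a quick curl against port 4001 should answer:

curl http://127.0.0.1:4001/version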
(4) On the other nodes (non-swarm-master - in this case "hdp2" and "hdp3"), execute the following command to join them to the cluster:
docker run -d swarm join --addr=[IP of the node]:2375 etcd://[IP of etcd host]:[port]/[optional path prefix]
WHERE (in this example)
[IP of the node] = IP address of node "hdp2" and "hdp3"
[IP of etcd host] = IP address of node "hdp1" where the only etcd instance is running
[port] = Port that etcd uses to listen to incoming client connection (in this example = 4001)
[optional path prefix] = Path that etcd uses to store data about the registered Swarm nodes
The final command:
docker run -d swarm join --addr=192.168.0.171:2375 etcd://192.168.0.170:4001/swarm
docker run -d swarm join --addr=192.168.0.172:2375 etcd://192.168.0.170:4001/swarm
(5) You can verify that the nodes are registered with the following command:
etcdctl ls /swarm/docker/swarm/nodes
(6) If all nodes are registered successfully, you can now start up the Swarm Master (in this example, on node "hdp1").
docker run -p [host port]:2375 -d swarm manage -H tcp://0.0.0.0:2375 etcd://[IP of etcd host]:4001/swarm
WHERE [in this example]
[host port] = 9999 (or any other free network port you selected - you will use this port to communicate with Swarm Master)
[IP of etcd host] = IP address of node "hdp1" where the only etcd instance is running
Eg.
docker run -p 9999:2375 -d swarm manage -H tcp://0.0.0.0:2375 etcd://192.168.0.170:4001/swarm
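A quick way to confirm the manager came up on "hdp1" is to check that its container is running and publishing the chosen host port (9999 in this example):

docker ps | grep swarm
# The PORTS column should include something like 0.0.0.0:9999->2375/tcp for the "manage" container.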
(7) To verify that the Swarm cluster is now working properly, execute the following command:
docker -H [IP of the host where Swarm Master is running]:[port] info
WHERE (in this example)
[IP of the host where Swarm Master is running] = Node "hdp1" (192.168.0.170)
[port] = 9999 (refer to Step (6) above)
NOTE: You can use any Docker CLI as normal with Swarm cluster =>
docker -H [IP of the host where Swarm Master is running]:[port] ps -a
docker -H [IP of the host where Swarm Master is running]:[port] logs [container id]
docker -H [IP of the host where Swarm Master is running]:[port] inspect [container id]
(8) Let's spin up a container and see how your Swarm cluster handles it.
docker -H [IP of the host where Swarm Master is running]:[port] run -d --name nginx-1 -p 80 nginx
(9) Let's check where the container is running.
docker -H [IP of the host where Swarm Master is running]:[port] ps -a
(10) You can stop the running container by issuing:
docker -H [IP of the host where Swarm Master is running]:[port] stop [container id]
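If you also want to remove the stopped container afterwards, the usual remove command works through the Swarm Master as well:

docker -H [IP of the host where Swarm Master is running]:[port] rm [container id]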
Tuesday, November 3, 2015
Mesos/Kubernetes: How to install and run Kubernetes on Mesos with your local cluster?
First of all, let me share with you my test environment:
(1) CentOS 7.1.1503 (nodes = hdp1, hdp2 and hdp3)
(2) HDP 2.3.2 (re-using the installed Zookeeper)
(3) Docker v1.8.3
(4) golang 1.4.2
(5) etcd 2.1.1
The official documentation for the Kubernetes-Mesos integration can be found here. It uses Google Compute Engine (GCE), but this blog entry covers deploying the Kubernetes-Mesos integration on a local cluster.
Ok, let's begin...
Prerequisites
(1) A working local Mesos cluster
NOTE: To build one, please refer to this.
(2) Install Docker on ALL nodes.
(a) Make sure yum has access to the official Docker repository.
(b) Execute "yum install docker-engine"
(c) Enable docker.service with "systemctl enable docker.service"
(d) Start docker.service with "systemctl start docker.service"
(3) Install "golang" on the node which you wish to install and deploy Kubernetes-Mesos integration.
(a) Execute "yum install golang"
(4) Install "etcd" on a selected node (preferably on the node that host the Kubernetes-Mesos integration for testing purposes).
(a) Execute "yum install etcd"
(b) Amend file "/usr/lib/systemd/system/etcd.service" (see below):
[FROM]
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd"

[TO]
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --listen-client-urls http://0.0.0.0:4001 --advertise-client-urls http://[node_ip]:4001"
WHERE [node_ip] = IP Address of the node (hostname -i)
(c) Reload systemctl daemon with "systemctl daemon-reload".
(d) Enable etcd.service with "systemctl enable etcd.service".
(e) Start etcd.service with "systemctl start etcd.service".
***
Build Kubernetes-Mesos
NOTE: Execute the following on the node selected to host the Kubernetes-Mesos integration.
cd [directory to install kubernetes-mesos]
git clone https://github.com/kubernetes/kubernetes
cd kubernetes
export KUBERNETES_CONTRIB=mesos
make
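A quick, optional check that the build produced the binaries used in the rest of this post (the path matches the PATH export in the next section):

ls [directory to install kubernetes-mesos]/_output/local/go/bin/
# You should at least see the "km" and "kubectl" binaries here.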
***
Export environment variables
(1) Export the following environment variables:
export KUBERNETES_MASTER_IP=$(hostname -i)
export KUBERNETES_MASTER=http://${KUBERNETES_MASTER_IP}:8888
export MESOS_MASTER=[zk://.../mesos]
export PATH="[directory to install kubernetes-mesos]/_output/local/go/bin:$PATH"
WHERE
[zk://.../mesos] = URL of the zookeeper nodes (Eg. zk://hdp1:2181,hdp2:2181,hdp3:2181/mesos)
[directory to install kubernetes-mesos] = Directory used to perform "git clone" (see "Build Kubernetes-Mesos" above).
(2) Amend .bash_profile to make the variables permanent.
(3) Remember to source the .bash_profile file after amendment (. ~/.bash_profile).
***
Configure and start Kubernetes-Mesos service
(1) Create a cloud config file "mesos-cloud.conf" in the current directory with the following contents:
$ cat <<EOF >mesos-cloud.conf
[mesos-cloud]
mesos-master = ${MESOS_MASTER}
EOF
NOTE: If you have not set ${MESOS_MASTER}, use the literal ZooKeeper URL instead, e.g. "zk://hdp1:2181,hdp2:2181,hdp3:2181/mesos".
(2) Create a script to start all the relevant components (API server, controller manager, and scheduler):
km apiserver \
  --address=${KUBERNETES_MASTER_IP} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --service-cluster-ip-range=10.10.10.0/24 \
  --port=8888 \
  --cloud-provider=mesos \
  --cloud-config=mesos-cloud.conf \
  --secure-port=0 \
  --v=1 >apiserver.log 2>&1 &

sleep 3

km controller-manager \
  --master=${KUBERNETES_MASTER_IP}:8888 \
  --cloud-provider=mesos \
  --cloud-config=./mesos-cloud.conf \
  --v=1 >controller.log 2>&1 &

sleep 3

km scheduler \
  --address=${KUBERNETES_MASTER_IP} \
  --mesos-master=${MESOS_MASTER} \
  --etcd-servers=http://${KUBERNETES_MASTER_IP}:4001 \
  --mesos-user=root \
  --api-servers=${KUBERNETES_MASTER_IP}:8888 \
  --cluster-dns=10.10.10.10 \
  --cluster-domain=cluster.local \
  --contain-pod-resources=false \
  --v=2 >scheduler.log 2>&1 &
NOTE:
Since CentOS uses systemd, you will hit this issue. Hence, you need to add "--contain-pod-resources=false" to the km scheduler command (as shown in the script above).
(3) Give execute permission to the script (chmod 700 <script>).
(4) Execute the script.
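To check that the three components actually started, a simple process listing plus a look at the log files written by the script is usually enough:

ps -ef | grep km
tail -n 20 apiserver.log controller.log scheduler.log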
***
Validate Kubernetes-Mesos services
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
# NOTE: Your service IPs will likely differ
$ kubectl get services
NAME LABELS SELECTOR IP(S) PORT(S)
k8sm-scheduler component=scheduler,provider=k8sm <none> 10.10.10.113 10251/TCP
kubernetes component=apiserver,provider=kubernetes <none> 10.10.10.1 443/TCP
Lastly, look for Kubernetes in the Mesos web GUI by pointing your browser to http://[mesos-master-ip:port]. Go to the Frameworks tab, and look for an active framework named "Kubernetes".
[Screenshot: Kubernetes framework is registered with Mesos]
Let's spin up a pod
(1) Write a pod description (YAML) to a local file:
$ cat <<EOPOD >nginx.yaml
apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
EOPOD
(2) Send the pod description to Kubernetes using the "kubectl" CLI:
$ kubectl create -f ./nginx.yaml
pods/nginx
[Screenshot: Submitted pod through kubectl]
(3) Wait a minute or two while Docker downloads the image layers from the internet. We can use the kubectl interface to monitor the status of our pod:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 14s
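If you want a bit more detail, e.g. which host the pod was actually scheduled onto, describing the pod should show it (an optional extra check, not part of the original steps):

$ kubectl describe pod nginx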
(4) Verify that the pod task is running in the Mesos web GUI. Click on the Kubernetes framework. The next screen should show the running Mesos task that started the Kubernetes pod.
[Screenshot: Mesos WebGUI shows the active Kubernetes task]
[Screenshot: Mesos WebGUI shows that the Kubernetes task is RUNNING]
[Screenshot: Click through the "Sandbox" link of the task to get to the "executor.log"]
[Screenshot: An example of "executor.log"]
[Screenshot: Connected to the node where the container is running]
Getting Kubernetes to work on Mesos can be rather challenging at this point in time.
However, it is possible, and hopefully, over time, the Kubernetes-Mesos integration will work seamlessly.
Have fun!
Saturday, October 10, 2015
Docker: All Containers Get Automatically Updated /etc/hosts (!?!?!?)
While I have always wanted such a feature (an automatically updated /etc/hosts for all running containers), I understand that Docker does not provide it natively (just yet - or at least AFAIK). I also understand some of the security issues that might come with such a feature (not all running containers want other containers to connect to them).
Anyway, about 2 weeks ago, while I was dockerizing a system automation application that requires at least 2 running nodes (containers), I found that the feature was silently available*.
* I went through the Release Notes of almost all recent releases and could not find such feature being mentioned. If I got it wrong, please point me to the proper Release Notes. Thanks!
Before I forget, let me share with you the reason I have always wanted such a feature. My reason is simple - I need all my running containers to know of the "existence" of other related containers and have a way to communicate with them (in this case, through /etc/hosts).
My test environment was on CentOS 7.1 and Docker 1.8.2.
(1) Firstly, I started a container without any hostname and container name. You can see that the /etc/hosts file was updated with:
(a) the container ID as the hostname
(b) the container name as the hostname
(2) Next, I started another container with an assigned hostname of "node1". You can see that now the /etc/hosts was updated with:
(a) the assigned hostname
(b) the container name as the hostname
(3) To spice it up a little bit more, I started another container with an assigned hostname and gave the container a name. You can see that the /etc/hosts was updated with:
(a) the assigned hostname
(b) the given container name
(4) The next test was to start up 3 containers, each with a different hostname and container name, and leave all of them running. You can see that "node1" was started and its /etc/hosts was updated accordingly.
(5) Next, I started "node2" and its /etc/hosts was updated with details of "node1" too.
(6) What happened to the /etc/hosts of "node1" at this moment? Surprise, surprise...you can see that its /etc/hosts was updated with details of "node2" too.
(7) Just to make sure it was not déjà vu, I started the third container ("node1" and "node2" were still running). This time I wasn't surprised to see that the /etc/hosts of "node3" was updated with details of "node1" and "node2".
(8) Lastly, let's check the /etc/hosts of "node1" and "node2". Voila, they are updated too!
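For anyone who wants to reproduce the tests above, here is a minimal sketch of the commands I would use, assuming Docker 1.8.2 on CentOS 7.1 as in this post; the image and the long-running command are just placeholders, not the actual application I was dockerizing:

# Start three named containers with assigned hostnames.
docker run -d -h node1 --name node1 centos sleep infinity
docker run -d -h node2 --name node2 centos sleep infinity
docker run -d -h node3 --name node3 centos sleep infinity

# Check /etc/hosts inside each running container.
docker exec node1 cat /etc/hosts
docker exec node2 cat /etc/hosts
docker exec node3 cat /etc/hosts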
Seriously, I am not sure whether this feature has been available for some time or whether it is an experimental feature. Anyway, I like it...for the things I do! So, I am not speaking for you :)
Sunday, October 4, 2015
Docker: Sharing Kernel Module
This is more like a request-for-help entry :)
I was working on a small project to dockerize a system automation application that supports clustering. Everything was working fine until the very last minute - during system startup!!!
The database was working fine in a primary-standby setup. The system automation application was running fine on both of the nodes in the cluster. Configuration went fine too.
However, when I started the automation monitoring, KABOOOOMMM, problem happened!
What happened was that the first node came up fine, but the second node seemed to be reporting a weird status. A quick read of the log file told me that the second node was unable to load the "softdog" kernel module and would not be able to initialize. The reason was very likely that the first node had already gotten hold of the kernel module.
Ok, I have to admit that this is a silly mistake I made and an oversight before I started the project.
Anyway, I would like to share what I learned from this experience:
(1) It is possible for a container to have access to a kernel module (see the sketch after this list) if you:
(a) mount the /lib/modules directory read-only into the container (-v /lib/modules:/lib/modules:ro);
(b) make sure the kernel version of the host machine is the same as the kernel version of the image/container;
(c) run the container with the --privileged option.
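Putting (a) to (c) together, here is a minimal sketch of what such a container start could look like; the image and module commands are illustrative only (the image must contain kmod/modprobe), and this is not the actual application setup from this post:

# Mount the host's modules read-only and run privileged so modprobe can load
# the module into the shared host kernel.
docker run -it --privileged -v /lib/modules:/lib/modules:ro centos /bin/bash

# Inside the container:
modprobe softdog
lsmod | grep softdog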
For those of you out there who are trying to dockerize applications that require a kernel module, please be reminded that all containers on the same host share the same underlying (host) kernel. Hence, you have to make sure that the application will still work under such a condition.
I am not a kernel expert and have not ventured into this area in Docker previously, so any help is appreciated - SOS!!!!!