Virtual Machines vs. Containers Revisited - Part 3

23/10/2019 58 min Episode 83

Episode Synopsis


In this episode, we cover the following topics:

- Operating-system-level virtualization = containers
  - Allows the resources of a computer to be partitioned via the kernel
  - All containers share a single kernel with each other AND the host system
  - Containers depend on their host OS to do all the communication and interaction with the physical machine
  - Containers don't need a hypervisor; they run directly within the host machine's kernel
  - Containers use the underlying operating system's resources and drivers
    - This is why you cannot run different OSes on the same host system
    - i.e. Windows containers can run on Windows only, and Linux containers can run on Linux only
  - What we think of as different OSes (RHEL, CentOS, SUSE, Debian, Ubuntu) are not really different...
    - They are all the same core OS (Linux); they just differ in apps/files
  - Based on the virtualization, isolation, and resource management mechanisms provided by the Linux kernel
    - namespaces
    - cgroups
- Container history
  - FreeBSD Jails (2000)
    - BSD userland software that runs on top of the chroot(2) system call
      - chroot is used to change the root directory of a set of processes
      - Processes created in the chrooted environment cannot access files or resources outside of it
    - Jails virtualize access to the file system, the set of users, and the networking subsystem
    - A jail is characterized by four elements:
      - Directory subtree: the starting point from which a jail is entered
        - Once inside the jail, a process is not permitted to escape outside of this subtree
      - Hostname
      - IP address
      - Command: the path name of an executable to run inside the jail
    - Configured via the jail.conf file
  - LXC containers (2008)
    - Userspace interface for the Linux kernel features to contain processes, including:
      - Kernel namespaces (ipc, uts, mount, pid, network and user)
      - AppArmor and SELinux profiles
      - Seccomp policies
      - Chroots (using pivot_root)
      - Kernel capabilities
      - CGroups (control groups)
  - Docker containers (2014)
    - Early versions of Docker used LXC as the container runtime
    - LXC was made optional in v0.9 (March 2014)
      - Replaced by libcontainer
      - libcontainer became the core of runC
    - LXC was dropped in v1.10 (February 2016)
- Container technology
  - Containers are just processes. So what makes them special?
  - Namespaces
    - Restrict what you can SEE
    - Virtualize system resources, like the file system or networking
    - Make it appear to processes within the namespace that they have their own isolated instance of the resource
    - Changes to the global resource are visible only to processes that are members of the namespace
    - Processes inherit their namespaces from their parent
    - Linux provides the following namespaces:
      - IPC (interprocess communications)
        - CLONE_NEWIPC: Isolates System V IPC, POSIX message queues
      - Network
        - CLONE_NEWNET: Isolates network devices, stacks, ports, etc.
      - Mount
        - CLONE_NEWNS: Isolates mount points
      - PID
        - CLONE_NEWPID: Isolates process IDs
      - User
        - CLONE_NEWUSER: Isolates user and group IDs
      - UTS (Unix Timesharing System)
        - CLONE_NEWUTS: Isolates hostname and NIS domain name
      - Cgroup
        - CLONE_NEWCGROUP: Isolates cgroup root directory
    - Syscall interface
      - A system call is the fundamental interface between an app and the Linux kernel
      - i.e. Linux kernel calls to create/enter namespaces for processes
  - Control groups (cgroups)
    - Restrict what you can DO
    - Limit an application (container) to a specific set of resources like CPU and memory
    - Allow containers to share available hardware resources and optionally enforce limits and constraints
    - Creating, modifying, and using cgroups is done through the cgroup virtual filesystem
    - Processes inherit their cgroup from their parent
      - Can be reassigned to different cgroups
    - Resources that can be controlled:
      - Memory
      - CPU / CPU cores
      - Devices
      - I/O
      - Processes
  - Using cgroups
    - To see mounted cgroups:
        mount | grep cgroup
    - To create a new cgroup:
        mkdir /sys/fs/cgroup/cpu/chris
    - To set "cpu.shares" to 512:
        echo 512 > /sys/fs/cgroup/cpu/chris/cpu.shares
    - Now add a process to this cgroup:
        echo <get_pid> > /sys/fs/cgroup/cpu/chris/cgroup.procs
  - Pseudo code: Creating a container
    - Steps:
      1. Create root filesystem for container
         - Spin up busybox in Docker container, and then export filesystem
      2. Run "launcher" process that sets up "child" namespace
         - Launcher process forks new child process (now under new namespaces)
      3. Child process then forks new process for container
         - chroot (to our root filesystem)
         - mount any other FS
         - set cgroups (e.g. apply CPU constraints)

Links

- FreeBSD Jails
- Linux Container Project - LXC, LXD, LXCFS
- namespaces - overview of Linux namespaces
- cgroups kernel documentation
- What Have Namespaces Done For You Lately? - YouTube video

End Song

Bettie Black & Sophia - Something Beautiful

For a full transcription of this episode, please visit the episode webpage.

We'd love to hear from you! You can reach us at:

- Web: https://mobycast.fm
- Voicemail: 844-818-0993
- Email: [email protected]
- Twitter: https://twitter.com/hashtag/mobycast
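
As a rough illustration of the "Using cgroups" commands in the notes above, the three write steps (create the cgroup, set cpu.shares, add a process) can be wrapped in one shell function. This is only a sketch under stated assumptions: the function name add_to_cgroup and the CGROUP_ROOT override are hypothetical and not from the episode, and the paths assume a cgroup v1 hierarchy mounted at /sys/fs/cgroup/cpu, as the commands in the notes imply. Writing to the real hierarchy normally requires root.

```shell
# Sketch only: wraps the "Using cgroups" steps from the notes.
# Hypothetical names (not from the episode): add_to_cgroup, CGROUP_ROOT.
# Assumes a cgroup v1 "cpu" hierarchy; cpu.shares does not exist on cgroup v2.

# Usage: add_to_cgroup NAME SHARES PID
add_to_cgroup() {
    name="$1"
    shares="$2"
    pid="$3"

    # CGROUP_ROOT lets you point at a scratch directory for a dry run;
    # by default it is the cpu controller path used in the notes.
    dir="${CGROUP_ROOT:-/sys/fs/cgroup/cpu}/${name}"

    # On the cgroup filesystem, creating a directory creates the cgroup.
    mkdir -p "$dir" || return 1

    # Set the relative CPU weight for this cgroup.
    echo "$shares" > "${dir}/cpu.shares" || return 1

    # Move the given process into the cgroup.
    echo "$pid" > "${dir}/cgroup.procs" || return 1
}
```

For example, running add_to_cgroup chris 512 $$ as root would place the current shell in the "chris" cgroup from the notes. Setting CGROUP_ROOT to a temporary directory first makes it possible to see which files would be written without touching the real hierarchy.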
