@alejandro | Good evening everybody. |
@alejandro | The next talk is about to start.
@alejandro | Evaldo Gardenali is going to talk about Xen virtualization.
@alejandro | Evaldo studies at the Universidade Estadual Paulista (UNESP) in Bauru-SP, Brazil,
@alejandro | and has contributed to projects like NetBSD, CAcert and freenode.
@alejandro | The questions, as usual, will be in #qc, and the Spanish translation in #redes.
@alejandro | Welcome Evaldo. |
@alejandro | :-) |
@E0x | uff |
@alejandro | Please be patient, technical problems. |
@alejandro | Welcome back Evaldo, your turn. |
@Evaldo | First, we need to consider the reason to virtualize, such as supporting heterogeneous environments on a single server |
@Evaldo | ahhh |
@sarnold | Evaldo: (sorry, you weren't opped, so nothing you said in the last five minutes has made it through.. until "First, we..") |
@Evaldo | channel was moderated, and I was voiceless |
@Evaldo | (19:18) I am going to talk about Xen, a paravirtualization engine
@Evaldo | (19:18) please download the presentation, available at http://evaldo.gardenali.biz/umeet/xen.pdf
@Evaldo | (19:19) well, I am going to give an overview of virtualization
@Evaldo | (19:19) then proceed to Xen specifics and its features
@Evaldo | (19:19) Scalability and Performance considerations
@Evaldo | (19:20) then introduce Quality of Service management, show a "from scratch" implementation of Xen, and discuss a bit about the future of Xen
@Evaldo | currently supported OSes include Linux 2.4 and 2.6, NetBSD 3.0 and Plan9 |
@Evaldo | Xen can also be used to consolidate workloads. Recent research showed that servers run at only 15% of their capacity on average, and Xen can help consolidate underutilized servers onto a single physical machine
@Evaldo | Legacy systems that require old libraries are also a point to consider
@Evaldo | It can also be used to gradually upgrade the offered services |
@Evaldo | By using multiple dedicated domains, one can have a good level of service isolation |
@Evaldo | Quality of Service can be guaranteed on a per-virtual-machine basis, giving more efficient control than traditional operating system priority scheduling |
@Evaldo | the ease of administration of individualized servers and the relocation and migration features also make it very flexible and time-saving |
@Evaldo | Analyzing the virtualization techniques available, we have: |
@Evaldo | Single System Image, as provided by Ensim, Virtuozzo, Solaris Zones |
@Evaldo | These systems group processes and resources in specialized containers, but since the kernel is a common point, a flaw leading to kernel compromise compromises all virtual machines at once |
@Evaldo | Emulation techniques are very flexible, and they can run the most diverse operating systems, but emulation is an inefficient task, leading to poor performance |
@Evaldo | Virtualization, as done by VMWare, is also very flexible, allowing the use of unmodified operating systems, but current virtualization techniques take up to 35% of CPU time to implement the virtualization engine
@Evaldo | User Mode Kernels, like User Mode Linux and CoLinux, have the disadvantage of running as regular processes under the host operating system, which leads to poor scheduling and context-switch problems
@Evaldo | Finally, Paravirtualization offers excellent performance, with the drawback that the guest operating systems must be ported to a special architecture |
@Evaldo | so, the main advantages of Xen for system administrators are Service Isolation, minimizing damage; Failure Isolation, in the case of bad hardware or drivers; ease of administration; and QoS enforcement
@Evaldo | for Datacenters and Hosting Providers, offering "Virtual Private Servers" and dedicated Xen hosts can raise the added value of their services
@Evaldo | and the real benefits of virtualization are cost savings: the purchase and rental of equipment that will occasionally be underutilized, rack space, colocation costs, energy and air conditioning, and the cost of downtime when hardware breaks and its data cannot be relocated
@Evaldo | slide 12 has an overview of the Xen 2.0 architecture, showing that the Xen hypervisor runs in a privileged layer, before any other operating systems |
@Evaldo | and all operating systems run directly on top of Xen |
@Evaldo | on to the paravirtualization technique: it takes advantage of otherwise unused protection structures of the x86 architecture
@Evaldo | the x86 has 4 modes of protection/operation, called rings |
@Evaldo | traditional operating systems run on ring 0, while applications run on ring 3 |
@Evaldo | when systems are ported to the Xen architecture, Xen takes control of the ring 0, the operating system runs on ring 1, and the applications run unmodified in ring 3 |
@Evaldo | privileged operations are requested from the Xen hypervisor in the form of hypercalls, a mechanism analogous to the userland -> kernel syscalls
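(Editor's sketch: the hypercall mechanism described above can be pictured as a numbered dispatch table, much like a syscall table. All names below are invented for illustration; this is not Xen's real ABI.)

```python
# Toy model of hypercall dispatch: the guest kernel traps into the
# hypervisor with a hypercall number plus arguments, and the hypervisor
# looks the number up in a fixed table -- just like a syscall table.
# All names here are hypothetical illustrations, not Xen's real API.

def do_set_timer(when):
    # a real handler would program the virtual timer for this domain
    return f"timer armed for {when}"

def do_update_page_table(entry):
    # a real handler would validate the entry before installing it,
    # since the guest is not trusted with raw page-table writes
    return f"page table entry {entry} validated and installed"

HYPERCALL_TABLE = {
    0: do_set_timer,
    1: do_update_page_table,
}

def hypercall(number, *args):
    """Dispatch a privileged request from a guest kernel."""
    handler = HYPERCALL_TABLE.get(number)
    if handler is None:
        raise ValueError(f"unknown hypercall {number}")
    return handler(*args)
```

The point of the analogy: the guest asks, the hypervisor checks and acts, exactly as a kernel does for userland.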
@Evaldo | the Linux 2.4 port to the Xen architecture required fewer than 3000 lines of code, while the 2.6 port did not need any core modifications
@riel | one question - if no core modifications are needed, why does the 2.6 xen patch modify around a dozen files ? |
@Evaldo | riel: it modifies architecture-specific files, while keeping the "core" intact. 2.6 is structured so that architecture-specific code is kept in separate files
@Evaldo | also, keep in mind that I am talking about Xen 2.0. Xen 3.0 is a recent release, and is my next topic. I have not inspected what 3.0 changes in the Linux kernel yet.
@riel | good point, I've only read the 3.0 code |
@Evaldo | Xen 3.0 improvements over 2.0 include AGP and ACPI support on the administrative domain, SMP-capable guests, more supported hardware architectures, Intel VT-x (Vanderpool) and AMD Pacifica extensions, improvements on the management tools and optimization of the networking code |
@Evaldo | even with the hardware virtualization extensions, there is a need for a "manager", a role which Xen fills well according to the developers. The hardware extensions were created to let virtualization managers run unmodified operating systems, not to replace the managers
@Evaldo | a good example is the z/VM engine for IBM S/390 hardware. the z/VM is very analogous to Xen, while S/390 has virtualization instructions on the hardware |
@Evaldo | Hardware access in Xen systems is made pretty transparent, enabling the privileged operating systems to use their own drivers to handle hardware. |
@Evaldo | this allows for some degree of fault tolerance, since a kernel crash due to a bad driver will not compromise the Xen hypervisor. |
@Evaldo | Guest domains are allowed to access exported virtual devices from the privileged domains |
@Evaldo | these devices are handled through an efficient shared-memory system, avoiding unnecessary data replication. Currently, both Linux and NetBSD can export any kind of block device to unprivileged domains
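(Editor's sketch: the shared-memory device channel mentioned above works as a split driver, a front end in the guest and a back end in the privileged domain, exchanging descriptors through a shared ring. The class below is a minimal toy model under that assumption, not Xen's real ring ABI.)

```python
# Toy sketch of a split-driver I/O ring: the guest (front end) places
# request descriptors in a shared ring; the privileged domain (back end)
# consumes them and posts responses, so bulk data need not be copied
# through the hypervisor.  Purely illustrative, not Xen's real interface.
from collections import deque

class IORing:
    def __init__(self):
        self.requests = deque()   # stands in for the shared-memory page
        self.responses = deque()

    # front end (guest domain) side
    def submit(self, sector, op):
        self.requests.append({"sector": sector, "op": op})

    # back end (privileged domain) side
    def service(self):
        while self.requests:
            req = self.requests.popleft()
            # a real back end would hand the request to its native driver
            self.responses.append({"sector": req["sector"], "status": "OK"})

ring = IORing()
ring.submit(128, "read")
ring.submit(129, "write")
ring.service()
```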
@Evaldo | Optionally, individual PCI cards can be allocated for dedicated access by a guest domain, which then uses its own PCI and device drivers to handle that device
@Evaldo | slide 19 has an example of a domain that handles a single audio card on its PCI space |
@Evaldo | driver failures on such a card will not affect the execution of the other domains, since the hypervisor, domain 0, and all other domains that do not depend on devices from this specific domain have no contact with the PCI device or the bad driver
@Evaldo | < ~Arador> that would allow running different drivers in different virtual machines, which sounds like an interesting idea for monolithic kernels
@Evaldo | with the caveat that PCI devices were not designed with virtualization in mind, so only one domain at a time can access a given PCI device natively
@Evaldo | slide 20 has the currently supported operating systems in black, and the "said-to-work" ones in red.
@Evaldo | there is a working FreeBSD snapshot for Xen 2.0, but it is not integrated into the FreeBSD tree, and is of limited usability and stability
@Evaldo | Sun announced that they have OpenSolaris working, but have not released the source or binaries yet
@Evaldo | Microsoft funded the Xen project, and they made Windows work with it, but Microsoft decided to keep the patches private, so we are not able to run it |
@Evaldo | on to the Features, Xen has dynamic domain memory management |
@Evaldo | one can run "xm balloon Domain-5 48" and domain 5 will adjust itself to 48 MB of RAM, for example
@Evaldo | what is currently missing is some form of automatic control for the memory management
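(Editor's sketch: since that automatic control is missing, here is one naive idea of what such a policy could look like: grow or shrink a domain's memory target based on its reported free memory, then apply the target with the "xm balloon" command from the talk. All thresholds and names below are invented for illustration.)

```python
# Naive auto-balloon policy sketch: grow a domain's memory target when it
# is nearly out of free RAM, shrink it when plenty sits idle.  The talk's
# "xm balloon <domain> <MB>" command would apply the result; here we only
# compute it.  All thresholds are invented for illustration.

def balloon_target(current_mb, free_mb, min_mb=32, max_mb=512):
    if free_mb < current_mb * 0.10:      # under 10% free: grow by 25%
        target = int(current_mb * 1.25)
    elif free_mb > current_mb * 0.50:    # over 50% free: shrink by 25%
        target = int(current_mb * 0.75)
    else:
        target = current_mb              # comfortable: leave it alone
    return max(min_mb, min(max_mb, target))  # clamp to sane bounds
```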
@Evaldo | there is a Pause feature, enabling one to temporarily pause a running system, while keeping it in ram, ready for execution |
@Evaldo | the Save feature, in contrast, saves the virtual machine state to disk, freeing resources, and can be resumed later, using the state file |
@Evaldo | Live Migration is indeed the most brilliant feature of Xen. It allows one to move a running system from one hardware server to another, without losing state, connections or uptime |
@Evaldo | slide 25 has a diagram of the Live Migration schema |
@Evaldo | on Pre-Migration, the destination is selected and flagged for migration, but the system is still running on the first server |
@Evaldo | on the Reservation stage, resources are reserved on the target system to allow migration |
@Evaldo | then the system enters "interactive pre-copy", which copies "dirty" (recently updated) memory pages in successive rounds until the amount to be transferred is minimal |
@Evaldo | Stage 3 finally suspends the virtual machine on the source system, restores the state of the system on the target host, sends unsolicited ARP to update switch ports and other networking equipment, and proceeds to commitment |
@Evaldo | on commitment stage, the system is already running on the second host, and resources on the first host are freed |
@Evaldo | oops, my bad. it starts to run on stage 5 |
@Evaldo | stage 5 resumes virtual machine operations and finishes migration |
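(Editor's sketch: the iterative pre-copy stage described above can be modeled as a loop that keeps resending pages dirtied since the previous round until the remaining set is small enough for the final stop-and-copy. The numbers and the assumption that the dirty set halves each round are invented for illustration.)

```python
# Toy model of live migration's iterative pre-copy (stage 2 above): keep
# re-sending pages dirtied during the previous transfer round until the
# remaining set is below a threshold, then stop-and-copy the rest.
# Page counts and the halving dirty rate are invented for illustration.

def precopy(total_pages, dirty_rate, threshold):
    """Simulate pre-copy; return (rounds, pages in final stop-and-copy)."""
    to_send = total_pages            # the first round transfers everything
    rounds = 0
    while to_send > threshold:
        rounds += 1
        # while we transfer to_send pages, the guest dirties dirty_rate
        # of them; assume the hot set shrinks as the rounds converge
        to_send = min(to_send, dirty_rate)
        dirty_rate //= 2
    return rounds, to_send
```

With a 1000-page guest, a dirty rate of 128 pages per round and a threshold of 10, only 8 pages are left for the brief suspension, which is why the measured outage can be as short as the 165 ms quoted below.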
@Evaldo | slides 26 and 27 have migration examples, one with a heavy-duty webserver and one with a quake3 server, to test latency |
@Evaldo | slide 26 clearly shows the bandwidth reduction caused by the iterative pre-copy stage, and then the service interruption. the service interruption took 165ms
@Evaldo | as for the quake3 server, two migrations were performed, and the users did not experience any form of bad game behaviour |
@Evaldo | the memory footprint of the Xen internal structures is very low, at about 20 KB per running domain
@Evaldo | CPU overhead for the paravirtualization on Xen 2.x is about 5% |
@Evaldo | slides 29-32 have performance comparisons between *L*inux native host, *X*en, *V*MWare ESX, *U*ser mode linux |
@Evaldo | slides 33 and 34 have network performance comparisons. Network performance is supposed to have improved on Xen 3.0 |
@Evaldo | on to QoS management, there is a flexible scheduler management, allowing for custom schedulers |
@Evaldo | the most used is BVT, which is default |
@Evaldo | BVT uses "virtual time" to schedule domains, and has a "warping" trick that reduces the virtual time of a domain when times are compared for scheduling
@Evaldo | there is a detailed reference material on the references page |
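(Editor's sketch: the BVT selection rule described above can be reduced to "run the domain with the smallest effective virtual time, where a warping domain's time is lowered by its warp value". The field names and numbers below are invented for illustration; see the referenced material for the real algorithm.)

```python
# Sketch of BVT's selection rule: each domain accumulates "actual virtual
# time" as it runs; a warping domain has its time reduced by a warp value
# when candidates are compared, so latency-sensitive domains get picked
# sooner.  Field values below are invented for illustration.

def pick_next(domains):
    """domains: list of dicts with 'name', 'avt', 'warp', 'warping'."""
    def effective_vt(d):
        return d["avt"] - (d["warp"] if d["warping"] else 0)
    return min(domains, key=effective_vt)["name"]

doms = [
    {"name": "dom0",  "avt": 100, "warp": 50, "warping": False},  # evt 100
    {"name": "webvm", "avt": 120, "warp": 50, "warping": True},   # evt 70
    {"name": "batch", "avt": 90,  "warp": 0,  "warping": False},  # evt 90
]
```

Even though "webvm" has consumed the most CPU, its warp lets it preempt the others, which is the point of the trick.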
@Evaldo | on to slide 40, the required steps to implement a Xen system from scratch (useful for slackware people ;))
@Evaldo | Xen needs shared storage if one plans to use migration, so iSCSI, NFS or ENBD are good options. otherwise, any kind of storage can be used
@Evaldo | installing Xen requires tuning the GRUB configuration files, as slide 42 shows
@Evaldo | for systems whose package manager does not handle dependencies, the required dependencies are listed in slide 43
@Evaldo | slide 44 shows the installation of the xend and xm userland applications, to control the Xen system |
@Evaldo | slides 45 and 46 show how to build custom domain-0 kernels, for performance tuning |
@Evaldo | slide 47 shows an example domain configuration file. the format is pretty straightforward, and since it is included as a Python file, it can have fancy automation
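(Editor's sketch: for concreteness, here is what such a configuration file might look like. All paths, names and values below are invented; check slide 47 for the real example.)

```python
# Hypothetical Xen domain configuration file: since it is read as Python,
# it is just a set of assignments, and can compute values if needed.
# All paths and names below are invented for illustration.

name   = "example-domU"
kernel = "/boot/vmlinuz-2.6-xenU"
memory = 128                              # MB of RAM for the domain
disk   = ["phy:vg0/example-root,sda1,w"]  # block device exported by dom0
vif    = ["bridge=xen-br0"]               # one virtual network interface
root   = "/dev/sda1 ro"

# "fancy automation": derive further settings from earlier assignments
disk.append("phy:vg0/example-swap,sda2,w")
extra = f"mem={memory}M"
```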
@Evaldo | slide 48 shows a few techniques to install a guest domain: a XenU installer (as NetBSD provides), the regular bootstrap tools used to build "chroots", QEMU to build a system image, or unpacking tarballs directly
@Evaldo | slide 49 shows how to install a NetBSD domain, which brings the NetBSD installer on the virtual machine console |
@Evaldo | sorry, I missed the translation, "Rede" == network |
@Evaldo | slides 51-52 have an example of implementing QoS using the BVT scheduler, without warping
@Evaldo | slides 54-55 show how easy it is to delegate a PCI card to a domain, like I did with the example sound card on this presentation |
@Evaldo | slides 56-58 show how to use back-end domains, that is, using virtual devices exported by a privileged domain other than domain 0
@Evaldo | it is just a matter of including the domain name on the parameter |
@Evaldo | and finally, the roadmap for Xen next releases |
@Evaldo | Balloon auto-control, the automatic management of the memory footprint of a domain
@Evaldo | Load Balancing between xen servers, using the migration advantage |
@Evaldo | node evacuation, in case of trouble |
@Evaldo | implement a storage subsystem that can safely be used in a Xen "cluster", without depending on iSCSI, NFS or ENBD, which might be suboptimal in some cases
@Evaldo | Fault Tolerance, and possibly tracking the execution of a system in two hosts |
@Evaldo | VM Fork, allowing for rapid on-demand performance boost |
@Evaldo | and Secure Virtualization, implementing secure methods of access control and management |
@Evaldo | some references I find interesting for people trying Xen for the first time are available on slide 60
@Evaldo | sorry for the initial delay due to connection problems, and for running into the time slot of my second talk
@Evaldo | I am available for questions :)
@MJesus_ | Evaldo there are some questions in #qc |
@Evaldo | MJesus_: I replied to Arador's questions
@MJesus_ | ah ok.... |
@MJesus_ | more questions ? |
@Evaldo | I did not understand E0x's point though, maybe if he were more specific... but he left, it seems |
@MJesus_ | all right! |
tschwinge | Evaldo: A very interesting presentation! Thanks! |
@MJesus_ | oh, thanks riel ! |
@Evaldo | tschwinge: glad you liked it :) |
@MJesus_ | I always forget the +m
Daniel | (better late than never :)
tschwinge | A question: in what way does an operating system have to be modified so that it is able to run under Xen?
@MJesus_ | more questions please ? |
@MJesus_ | :) Daniel |
@Evaldo | tschwinge: it needs to run in ring 1, and request privileged operations to Xen instead of doing them itself |
@Evaldo | tschwinge: this includes hardware access |
@Evaldo | tschwinge: however, with newer CPUs that have the Intel or AMD virtualization extensions, it is possible to run unmodified OSes on Xen
@MJesus_ | are you too tired, Evaldo, or could you talk about CAcert now?
tschwinge | The thing I'm thinking about is the following: making GNU Mach (a fork of CMU's Mach) run on Xen. GNU Mach is the microkernel which the GNU/Hurd operating system uses. |
@riel | you still need a modified domain 0 though |
@Evaldo | MJesus_: I can do it :) |
tschwinge | I'll have a look at that one day and come back (somewhere else, probably) with more specific questions. :-) |
@Evaldo | riel: indeed, because you need support for the privileged Xen operations on the domain 0 (controlling Xen itself) |
@MJesus_ | in #redes they are translating into Spanish ... with some delay
@riel | Evaldo: more because you need device drivers to talk to the real hardware, and you cannot run those drivers from inside a VMX domain |
@Evaldo | riel: that too, hehe |
@Evaldo | MJesus_: do you prefer that I start with CAcert, or wait a bit for #redes to catch up? |
@Evaldo | MJesus_: it's 20:42 local time, so for me it is not a problem :)
@MJesus_ | hummmm, a good question
Daniel | Evaldo, in some tests of Xen 64 we noticed that the "anfitrión" kernel runs in the same ring as the applications, ring 3. How reliable is that?
@MJesus_ | because here we are on European time, 23:50
@riel | Daniel: x86-64 only has rings 0 and 3, rings 1 and 2 do not exist |
Daniel | host kernel, sorry |
@riel | guest kernel too |
@riel | it context switches between kernel and userspace |
Daniel | thank you riel |