Olive Blog

VMNetX 0.5.1 released

Olive client version 0.5.1 has been released. This release fixes audio playback issues.

For Linux users, the new release is available through your distribution's package manager. Windows users can download the updated installer. The source release is also available.

Changes

  • Fix audio playback
  • Fix server failure to terminate stale VMs in some circumstances
  • Give up on lost thin-client connection after server timeout expires

Spring update on the Olive project

Olive has been getting some very good publicity lately. Vint Cerf, who is a Turing Award winner and Internet pioneer, found out about Olive through the grapevine, and contacted Satya to learn more. Satya gave him an overview and demo some weeks ago. Vint was very impressed with Olive because it solves a key problem that has long concerned him. This is, of course, the problem of preserving executability of software over the long term. This has been the crucial missing piece in his vision of a "digital vellum" to preserve Internet content.

Since learning about Olive and seeing it working, Vint has become something of a champion of Olive. He has invited Satya to do a joint keynote presentation on April 27 at Stanford to the annual IIPC (International Internet Preservation Consortium) meeting. On February 13, he gave a keynote talk to the American Association for the Advancement of Science in San Jose, CA. The official AAAS video of the talk doesn't seem to be out yet, but many news organizations reported the talk. This article by a BBC reporter includes Vint's mention of Olive. A search for "Vint Cerf black hole" will show you many other news articles, some mentioning Olive.

Satya was interviewed on February 17 on radio by BBC World News. It is just a short interview about Olive, working from Vint's talk to AAAS, but very good visibility for Olive.

The BBC also has a link to the Olive website on its web page. Over the past several days, the number of hits on the Olive web page has increased dramatically, thanks to Vint's publicity.


VMNetX 0.5.0 is now required to launch VMs

Launching virtual machines from the Olive website now requires VMNetX 0.5.0 or above. If you have an older version, you will not be able to launch VMs until you upgrade.


VMNetX 0.5.0 released

The Olive client has been updated to version 0.5.0. This release includes support for 64-bit VMs and various other improvements. Note that VMNetX no longer supports Ubuntu 12.04; we recommend upgrading to Ubuntu 14.04 LTS.

For Linux users, the new release is available through your distribution's package manager. Windows users can download the updated installer. The source release is also available.

Changes

  • Support 64-bit VMs
  • Show ETA when loading memory image
  • Permit VM settings needed for Windows 7 VMs with memory images
  • Fix excessive CPU usage
  • Support launching thick-client VMs via vmnetx+http/vmnetx+https URLs
  • Register to handle VMNetX URL schemes
  • Deprecate reference (.netx) files
  • Drop support for Ubuntu 12.04
  • Convert vmnetx-generate modes to subcommands

Object Unraveling: Tracking a missing cursor in QXL guests

After a recent yum upgrade on our RHEL 6 Workstation systems, we noticed that the mouse pointer was no longer visible in certain Olive virtual machines. The pointer still worked (hovering, clicking, and dragging all functioned properly) but, with no direct visual feedback, it was difficult to use the mouse. This effect only occurred on VMs with paravirtualized display (QXL) support -- so, Scientific Linux, but not Windows 3.1.

After some tinkering, we discovered that the problem went away if we forced VMNetX to check for software updates at startup, or if we touched one of VMNetX's Python source files. We didn't have to modify the file to fix the problem -- just update its timestamp so Python would regenerate the bytecode.

So, what happened?

tl;dr: SpiceChannel overrides the GObject connect(...) method for something completely different, so we used channel.connect_object('open-fd', ..., channel). This triggered a long-standing refcounting bug in PyGObject, so when the Python GC happened to run during SPICE channel setup, our open-fd handler ended up receiving a half-destroyed PyGObject.

How we use SPICE

VMNetX uses the SPICE thin-client protocol in both its thick- and thin-client modes. (Thick-client mode just means that the VM executes on the same computer as the user interface.) A SPICE session actually consists of several TCP connections: a "main channel", plus individual channels for display data, audio playback, keyboard/mouse input, cursor updates, and so on.

VMNetX uses the spice-gtk widget as its SPICE viewer. By default, spice-gtk will initiate TCP connections for the channels within a session; the programmer just supplies the remote hostname, port number, and password. Like the venerable VNC protocol, SPICE assumes that every remote thin-client session will listen on a different host/port pair, so there is no mechanism within the protocol for selecting a particular remote session.

Thus, if we used only SPICE's built-in infrastructure, the VMNetX thin-client server would have to open a separate listening port for every QEMU process running a VM; this causes issues with port number reuse and firewall configuration. To avoid these problems, we handle connection setup ourselves: VMNetX connects to a single well-known TCP port and sends a "token" which specifies a particular VM running on the server. If the server accepts the token, we launch the normal SPICE protocol over the TCP connection, with the thin-client server forwarding TCP traffic to the appropriate backend QEMU process.

This additional handshake means that we cannot use spice-gtk's automatic connection setup. Instead, we ask spice-gtk to signal us when it wants to establish a TCP connection. We initiate the connection ourselves and pass its file descriptor back to spice-gtk. While this additional step is only necessary for thin-client mode, it is also executed in thick-client mode.

spice-gtk is written in C and uses the GObject framework to expose an object-oriented API to application code. We use it from Python via the PyGObject binding library. In order to make a GObject accessible from Python, PyGObject creates a Python proxy object which exposes the object's methods and properties as Python bound methods and properties. Thus, in Python, a SpiceChannel is actually two objects which reference each other:

Figure 1

The Python object holds a strong reference (A) to the GObject, and the GObject holds a weak reference (B) to the Python object. The weak reference allows a holder of the GObject to recover the Python object, if it still exists, but does not prevent the Python object from being destroyed.

Object destruction in GObject and Python

Both GObject and Python use reference counting to determine when they can free objects. Since refcounting by itself cannot detect reference cycles, Python also has a special-purpose garbage collector that only handles reference cycles. "Container" objects that can participate in cycles, such as lists and dicts, must be specially registered with the garbage collector from C code. To find unreferenced objects, the GC traverses outward from each container, tracking the inbound references it finds to every other container. At the end of this process, any container whose refcount is larger than the computed number of inbound references must be referenced from outside and so is not garbage. These objects are removed from the set and the process repeats. Any remaining objects are garbage, so the GC forcibly clears their references. This should cause their refcounts to drop to zero and the objects to be freed.

As the PyGObject library is designed, PyGObjects will often have shorter lifetimes than their underlying GObjects. For example, consider building a GTK+ window. Each UI widget is created, configured, and bound to another widget. At that point, the Python object for the widget will often fall out of scope, and what remains is a tree of UI widgets (GObjects) linked together from C code. The PyGObject library handles this common case via ordinary refcounting, without involving the Python GC: when a PyGObject drops out of scope, reference A is dropped and weak reference B is cleared. If the PyGObject is later needed (for example, to pass to a callback function), a new PyGObject is created for the existing GObject.

SPICE session setup

Back to our problem. During SPICE session setup, the sequence of events looked like this:

signal: channel-new <spice_client_gtk.MainChannel object at 0x28cc230 (SpiceMainChannel at 0x29b4000)>
signal: open-fd <spice_client_gtk.MainChannel object at 0x28cc230 (SpiceMainChannel at 0x29b4000)>
callback: set_fd <spice_client_gtk.MainChannel object at 0x28cc230 (SpiceMainChannel at 0x29b4000)>

signal: channel-new <spice_client_gtk.PlaybackChannel object at 0x28cc190 (SpicePlaybackChannel at 0x29bf340)>
signal: channel-new <spice_client_gtk.RecordChannel object at 0x28cc280 (SpiceRecordChannel at 0x29c0a90)>
signal: channel-new <spice_client_gtk.CursorChannel object at 0x28cc2d0 (SpiceCursorChannel at 0x29c1ed0)>
signal: channel-new <spice_client_gtk.DisplayChannel object at 0x28cc320 (SpiceDisplayChannel at 0x29c3710)>

*** GC ***

signal: open-fd <spice_client_gtk.DisplayChannel object at 0x28cc320 (SpiceDisplayChannel at 0x29c3710)>
signal: open-fd <spice_client_gtk.CursorChannel object at 0x28cc2d0 (uninitialized at 0x0)>
signal: channel-new <spice_client_gtk.InputsChannel object at 0x28cc5a0 (SpiceInputsChannel at 0x2a5a010)>
signal: open-fd <spice_client_gtk.InputsChannel object at 0x28cc5a0 (SpiceInputsChannel at 0x2a5a010)>
callback: set_fd <spice_client_gtk.DisplayChannel object at 0x28cc320 (SpiceDisplayChannel at 0x29c3710)>
callback: set_fd <spice_client_gtk.InputsChannel object at 0x28cc5a0 (SpiceInputsChannel at 0x2a5a010)>

Whenever a channel object is created, VMNetX receives a callback via the SpiceSession channel-new signal. The channel-new handler registers callbacks against multiple GObject signals, including open-fd. Except for the DisplayChannel, VMNetX doesn't retain a reference to the channel objects: we only need them in the signal handlers, and signal handlers always receive a reference to the object that emitted the signal.

Notice this line:

signal: open-fd <spice_client_gtk.CursorChannel object at 0x28cc2d0 (uninitialized at 0x0)>

Garbage collection is running while the cursor channel is being set up, and is somehow destroying reference A. That would explain why a yum upgrade started the problem: the Python GC runs after a certain number of allocations, and if one of the depended-on libraries slightly changed its allocation pattern, GC could now be running at a different time. That would also explain why update checking or bytecode compilation fixed the problem: those are both large tasks that would have significantly changed GC timing.

Digging into signal registration

Normally registration of callbacks would be done with the GObject connect() method:

channel.connect('open-fd', self._request_fd)

However, when SpiceChannel subclasses GObject, it overrides this method to perform a completely different function: initiating a TCP connection. (In C, there is no conflict because methods are just functions, and their names include the class name. In Python, of course, the subclass method overrides the superclass one.) Luckily, there's another method, connect_object(), which is not overridden. connect_object() is simply connect(), plus a mechanism for replacing the object passed to the callback function (normally channel) with a different one. So we can equivalently register the callback this way:

channel.connect_object('open-fd', self._request_fd, channel)

The registration is stored in the GObject as a closure. The closure, of course, maintains a reference (C) to the Python object that will be passed to the callback function.

Figure 2

Aha! A reference cycle!

PyGObjects are registered as containers, so they are visible to the Python GC. The GC doesn't know about the underlying GObject -- it's not a Python container -- but it does know about the references to Python objects stored inside the GObject's closures, because the PyGObject tells the GC about them during traversal. (If it didn't, reference cycles involving closures would be uncollectable.) So from the GC's perspective, the landscape looks like this:

Figure 3

Thus, the GC decides the PyGObject is unreferenced and clears its outbound references:

Figure 4

If the GObject was a Python object, its outbound references would be cleared as well. But it isn't, and the GC doesn't know about it, so the PyGObject remains accessible and is not destroyed. Thus, when the open-fd handler is called, it receives a half-destroyed channel PyGObject which is no longer linked to the GObject. When it tries to use this PyGObject, it receives a TypeError, which it then swallows as a workaround for a race condition in spice-gtk:

def _request_fd(self, chan, _with_tls):
    try:
        self.emit('viewer-get-fd', chan)
    except TypeError:
        # Channel is invalid because the session was closed while the
        # event was sitting in the queue.
        pass

The result? VMNetX silently continues running, but the cursor channel is never connected. The session mostly works, but the cursor shape and position data never get to spice-gtk.

Solutions

What can we do about this?

One option would be for VMNetX to keep a list of channel PyGObjects for connected channels. Since these PyGObjects would be reachable from outside the GObject/PyGObject pair, they would not be GC'd. We could register to receive a signal when the channel was torn down, and then remove the channel from the list.

A second option, oddly, is to add a line like this to the channel-new handler:

channel.__dict__

This creates (on demand) the __dict__ dictionary that stores Python attributes on the object. (Equivalently, we could assign a value to some channel attribute, e.g. channel.dummy = None.) This tells the PyGObject library that it is not safe to tear down and recreate the channel PyGObject, because it has important state (the Python attributes) that must be preserved. In this case, PyGObject will convert reference B to a GObject toggle ref, which is maintained as a strong reference whenever the GObject refcount is greater than 1, and a weak reference otherwise.

Figure 5

That works, but involves too much magic.

(Why doesn't PyGObject always use toggle refs? A design concern was to permit GObjects to be referenced simultaneously from Python and another high-level language. Toggle refs don't work if there are multiple language bindings with proxy objects, because the GObject refcount will never go to 1 and so the bindings' proxy objects will never be destroyed. The compromise was to use toggle refs only when it's clearly unsafe to tear down the proxy object.)

A third option (also known as the "right" option) is to stop using channel.connect_object() and call the superclass method directly:

GObject.connect(channel, 'open-fd', self._request_fd)

In this case, unlike with connect_object(), the closure does not include a reference to the PyGObject.

Figure 1

Instead, the PyGObject is accessed via weak reference B whenever it is needed by a callback function. If the PyGObject is dead, a new one is created. That's exactly what we want.

Conclusion

The VMNetX bug was to believe that channel.connect(...) and channel.connect_object(..., channel) were equivalent. Their functionality is equivalent, but their refcounting behavior is not. Of course there's a bug in the PyGObject library as well, since it shouldn't be possible for a PyGObject to get into a half-destroyed state.

It turns out that the PyGObject refcounting system has been through several iterations over the years, as various refcounting issues were discovered and fixed. This particular issue was first discussed and reported in 2008, but due to an error in the provided test case, the bug was erroneously believed to be fixed. Included in the bug report is a patch to fix the issue by allowing the GC to traverse the GObject's closures only when its refcount is 1 -- that is, when there are no references other than the one held by the PyGObject. We have updated the report to indicate that the bug still exists in the current version of PyGObject.

We Python programmers like to think the language releases us from having to think about low-level memory management -- not so much! The low-level issues are still there in the background, and can pop up to trap the unwary. In this case, we naturally assumed that the bug was either in our code or in spice-gtk, and only gradually realized that the problem was a refcounting bug deep within PyGObject. After all, if there had been such a bug, surely it would have been fixed by now! Meanwhile, the unusual connect_object() argument list was so self-evidently correct, and so obviously unrelated to the problem, that it wasn't seriously considered as a cause.

In our experience, technical problems of this subtlety are not uncommon in archival virtualization. When virtualizing an older operating system or application, the first attempt will sometimes "just work" -- but if it doesn't, tracking down the problem can require careful investigation of the archived application, supporting libraries, guest and host operating systems, emulated peripherals, virtual machine monitor, and our own infrastructure!

The fix for this bug is included in VMNetX 0.4.4.

Appendix: Debugging the problem

We tracked down the problem by:

  1. Adding a bunch of print statements to VMNetX code,
  2. Using gdb breakpoints to intercept the creation of important objects and obtain their addresses,
  3. Setting gdb watchpoints on the objects' refcounts and inspecting the stack when they changed, and
  4. Digging through the spice-gtk and PyGObject sources.

In hindsight, enabling GC debugging would have made this a lot easier:

import gc
gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_OBJECTS)

but we didn't do that until fairly late in the process.


VMNetX 0.4.4 released

The Olive client has been updated to version 0.4.4. This is a bugfix release.

For Linux users, the new release is available through your distribution's package manager. Windows users can download the updated installer. The source release is also available.

Changes

  • Fix missing guest mouse cursor in some sessions
  • Fix some minor bugs

VMNetX 0.4.3 released

The Olive client has been updated to version 0.4.3. This release significantly improves startup performance.

For Linux users, the new release is available through your distribution's package manager. Windows users can download the updated installer. The source release is also available.

Changes

  • Improve performance of memory image loading
  • Add example web frontend for vmnetx-server

VMNetX 0.4.2 released

The Olive client has been updated to version 0.4.2. This release brings improvements to Windows support and automatic notification of new VMNetX releases.

For Linux users, the new release is available through your distribution's package manager. Windows users can download the updated installer. The source release is also available.

Changes

  • Add fullscreen button
  • Periodically check for new VMNetX release
  • Linux: Improve error message in thick-client mode if CPU not supported
  • Windows: Draw loading progress bar in taskbar button
  • Windows: Install per-user by default
  • Windows: Improve platform integration
  • Windows: Sign the installer
  • Server: Fix authentication failures after 20 VM launches

VMNetX 0.4.1 released

The Olive client has been updated to version 0.4.1. This is the first release supporting Windows systems, for cloud execution only. Try it out and let us know how it works for you!

For current users, the new release is available through your distribution's package manager. If you installed from source, get it here.

Changes:

  • Support client on Windows (remote execution only)
  • Add administrative interface for terminating a vmnetx-server VM

Project Updates

The Advisory Group met on July 3, 2013 to review the installation of the first piece of content—The Great American History Machine, educational software created at Carnegie Mellon University in the 1980s. The group was satisfied that the software operated as it had when it was originally created.

Only a few members of the Advisory Group have had the time to look at Olive as it is currently accessible only in the Linux environment. Invitations will be sent out again to those who have not accepted and received credentials. Some other individuals who have expressed interest also have access to the sandbox created for the Advisory Group.

Jerry McDonough provided metadata for this piece of educational software in OAI and METS. The Advisory Group will continue to study the optimal metadata approach.

Under the Sloan Grant, the Olive project went through an RFP process to identify a suitable group to create a white paper on the optimal organization to sustain Olive after development. Ithaka won the contract and has become a partner in Olive development. Dr. Deanna Marcum and Ms. Clara Samayoa visited Carnegie Mellon on August 13 and 14 to see a demonstration and to discuss sustainability parameters. The partners plan a presentation at CNI to gather feedback for a winter summit.

Project personnel, Benjamin Gilbert, Dan Ryan, Erika Linke, and Gloriana St. Clair all participated in answering questions about Olive in a demonstration of it for Stanford University colleagues (Henry Lowood, Curator for History of Science & Technology collections; Michael Olson, Project Director of the Cabrinety project; Charlotte Thai, Project Member of the Cabrinety Project, and Peter Chan, Digital Archivst). Dr. Julie Smith, one of the developers of the Great American History Machine, attended and demonstrated how she had used the educational software in the classroom. Smith's talk and notes are now available on the Olive website.

Technical development of Olive continues. Currently, we are ingesting the Chem Collective, another piece of educational software. Work on pulling in Doom, one of the target games has begun.


1 2