We don’t need no virtualization
If you think about it, the main use case for virtualization is isolation: Given that you have several applications running at the same time, how can you make sure they don’t interfere with each other? For example, if a customer deploys an AWS Lambda function, Amazon doesn’t just spin up a Node.js instance and let it go hog wild on their servers. Instead, they provision a Firecracker VM that gives the application a completely isolated compute environment; it has its own virtual file system, memory and network interfaces. However, this has a number of downsides—for one, it’s expensive! You have to allocate virtual address space, load an entire operating system, perform bespoke network routing, and so on.
Let’s take a step back: why do we need virtualization in the first place? The main problem is that the languages and runtimes we use today are not designed for “multi-tenancy”; by default, the runtime may have unfettered access to the entire system it’s running on. In theory, this means one tenant could disrupt the operation of another tenant or the host (accidentally or on purpose), exfiltrate data, manipulate code, etc. These capabilities can be locked down in various ways, e.g. at the user level in the operating system or sometimes in the runtime (Deno is a recent example). However, this level of access control is usually coarse-grained and tricky to get right.
An alternative to virtualization
Is virtualization really a prerequisite for process isolation? What if resource control were built into the language itself instead? Let’s say that a program starts without any access to its environment whatsoever. If the program needs to, say, access a SQL database, the host can provide a “Database” interface, via the runtime, that is completely abstract. The only thing the program can control is what queries to send to the database; TCP connections, database user management and the wire protocol are never exposed to the guest program. This essentially amounts to supporting Dependency Injection at the language / runtime level (or an effect system, if you’re so inclined). This opens up a lot of possibilities, but the most pertinent one is that it allows us to get rid of virtualization completely! You can have several tenants running on the same host, in the same address space, on the same file system, completely isolated from each other. Isolation is achieved by running multiple instances of the runtime, each configured with different capabilities.
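To make the “Database” idea a bit more concrete, here is a minimal sketch in TypeScript. All the names in it (Database, Capabilities, GuestProgram, makeTenantDatabase) are made up for illustration, and an in-memory array stands in for a real database. Note that TypeScript itself does not enforce any of this: a real guest function could still reach for globals like fetch. The sketch only shows the shape of the API that a capability-aware language or runtime would enforce.

```typescript
// Hypothetical capability: the guest can send queries and nothing else.
interface Database {
  query(sql: string): string[];
}

// The complete list of things a guest is allowed to touch (an assumption
// of this sketch; a real host would offer more capabilities).
interface Capabilities {
  db: Database;
}

// A guest program is just a function of the capabilities it was given.
type GuestProgram = (caps: Capabilities) => number;

// Host side: builds a per-tenant Database. The array stands in for the real
// connection, credentials and wire protocol, which never leave the host.
function makeTenantDatabase(rows: readonly string[], audit: string[]): Database {
  return {
    query(sql: string): string[] {
      audit.push(sql); // the host sees every query the guest sends
      return [...rows]; // hand out a copy so the guest cannot mutate host state
    },
  };
}

// Guest side: counts users using only the injected capability.
const countUsers: GuestProgram = ({ db }) =>
  db.query("SELECT name FROM users").length;

// Two tenants, same process, same guest code, different capabilities.
const auditA: string[] = [];
const auditB: string[] = [];
const resultA = countUsers({ db: makeTenantDatabase(["ada", "grace"], auditA) });
const resultB = countUsers({ db: makeTenantDatabase(["linus"], auditB) });

console.log(resultA, resultB);
```

Each tenant only ever sees the rows behind the Database instance it was handed, and the host gets an audit trail of the queries for free.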
Implementation
There are a number of ways in which the idea of “built-in resource control” could be realized:
WebAssembly
WebAssembly (WASM) is built on the idea of decoupling bytecode from VM implementation; WASM itself is just a bytecode spec, and a runtime can be assembled from different components. This means that WASM makes no assumptions whatsoever about the “platform” it will run on: a module can only reach the outside world through functions it explicitly imports, so the host can “opt in” to exactly the capabilities that will be exposed to guest processes.
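The opt-in model can be sketched with the standard WebAssembly JavaScript API. The tiny module below is hand-assembled for this example (the import name host.get and the export name run are my own choices, not part of any real platform); its only link to the outside world is the single function it imports.

```typescript
// Hand-assembled module, equivalent to this text format:
//   (module
//     (import "host" "get" (func $get (result i32)))
//     (func (export "run") (result i32) call $get))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic number + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type 0: () -> i32
  0x02, 0x0c, 0x01, 0x04, 0x68, 0x6f, 0x73, 0x74, // import from module "host"
  0x03, 0x67, 0x65, 0x74, 0x00, 0x00, //   field "get", a function of type 0
  0x03, 0x02, 0x01, 0x00, // one function defined in the module, type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = function 1
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b, // body: call function 0; end
]);

const guestModule = new WebAssembly.Module(bytes);

// The host decides what "host.get" means for each instance.
function runGuest(valueForThisTenant: number): number {
  const instance = new WebAssembly.Instance(guestModule, {
    host: { get: () => valueForThisTenant },
  });
  const run = instance.exports.run as () => number;
  return run();
}

// If the host grants nothing, the module cannot even be instantiated.
function runWithoutCapabilities(): string {
  try {
    new WebAssembly.Instance(guestModule, {});
    return "instantiated";
  } catch {
    return "rejected";
  }
}

console.log(runGuest(42), runGuest(7), runWithoutCapabilities());
```

The same compiled module is instantiated twice with different imports, which is the “multiple instances of the runtime, configured with different capabilities” idea in miniature.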
SpiderLightning is a WASM realization of the ideas presented here; it exposes common web programming building blocks, such as a key-value store and an HTTP client, as WebAssembly components. The downside of using WebAssembly is that it’s a compile target, and not a separate language. As such, there might be a bit of a mismatch between the source language and the WebAssembly target. However, interest in WASM is growing, and there are new languages like MoonBit that are designed with WASM in mind.
“Platformless” languages
Just like WASM, you could design a high-level language from the ground up that doesn’t make any assumptions about what platform it will run on. As far as I know, Roc is the most prominent in this category, but it’s still in its infancy.
Static verification
A third way to solve this problem would be for the host to provide bespoke libraries for an existing language, and then statically verify that the code does not perform any IO outside those libraries. For example, you could let a user run an arbitrary Node.js process on your own machine, as long as you have statically verified that it won’t do anything bad. This could also work with external packages, as you could check each package once and then cache the result.
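As a very rough sketch of what such a check could look like, the snippet below scans source text for import specifiers and rejects anything that isn’t on an allowlist. The module names @host/db and @host/http are hypothetical host-provided libraries. A regex is nowhere near a real verifier: it ignores globals like fetch or process, computed specifiers, eval and so on, so a serious implementation would need a proper parser and much stricter rules. It does show the two moving parts, though: the allowlist check and the per-package cache.

```typescript
// Hypothetical host-provided libraries that guest code may import.
const ALLOWED_MODULES = new Set(["@host/db", "@host/http"]);

// Returns every imported module specifier that is not on the allowlist.
// Crude by design: recognizes `import ... from "x"`, `import "x"`,
// `import("x")` and `require("x")` with literal string specifiers only.
function findForbiddenImports(source: string): string[] {
  const pattern =
    /\bimport\s+(?:[^"']*?\s+from\s+)?["']([^"']+)["']|\b(?:import|require)\s*\(\s*["']([^"']+)["']\s*\)/g;
  const forbidden: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1] ?? match[2];
    if (specifier !== undefined && !ALLOWED_MODULES.has(specifier)) {
      forbidden.push(specifier);
    }
  }
  return forbidden;
}

// Check each package once and cache the verdict, keyed by e.g. "name@version".
const verdictCache = new Map<string, string[]>();

function checkPackage(id: string, source: string): string[] {
  let verdict = verdictCache.get(id);
  if (verdict === undefined) {
    verdict = findForbiddenImports(source);
    verdictCache.set(id, verdict);
  }
  return verdict;
}

const guestSource = [
  'import { query } from "@host/db";',
  'import fs from "node:fs";',
  'const cp = require("child_process");',
].join("\n");

console.log(checkPackage("guest@1.0.0", guestSource));
```

Here the guest would be rejected because it reaches for node:fs and child_process in addition to the host-provided database library.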
Conclusion
In summary, I think it would be better to solve application isolation at the language level rather than at the OS level. Imagine not having to build a Docker image every time you wanted to push a code update. Or, as a hosting provider, not having to run N layers of virtualization to run an application hosting platform. Wouldn’t that be great?