A bit over a year ago, I wrote some notes on a “smaller Rust” – a higher level language
that would take inspiration from some of Rust’s type system innovations, but would be simpler by
virtue of targeting a domain with less stringent requirements for user control and performance.
During my time of unemployment this year, I worked on sketching out what a language like that would
look like in a bit more detail. I wanted to write a bit about what new conclusions I’ve come to
during that time.
The purpose for our language
Re-reading my previous post, I’m struck by how vague my statement of purpose for this language is.
My entire blog post is really focused on differentiating the language from Rust, and I frame the
discussion in terms of what I would remove from Rust, and how the language would not support certain
use cases of Rust. This isn’t really surprising: I was working on Rust, and I had never taken the
time to think of this hypothetical language in itself the way I have now.
The goal of this design was to create a language that could compete as an “application programming
language.” The design goals of this language were:
- It should try not to be notably hard to learn. To the extent possible, it should be familiar to
most programmers. Since this exercise commits me to applying ownership and borrowing
to the application domain, it will necessarily contain some features most programmers find
pretty novel (like Rust “lifetimes”). But in general, we will try to reduce the onboarding ramp
and simplify things.
- It should typecheck and compile quickly. It should not have bad batch compilation performance,
and it should be designed with incremental recompilation in mind, to enable a good experience for
users who integrate their compiler into their development environment (with a full IDE or even
just with a plugin for a text editor). I didn’t even mention this concern in the previous post.
As others have discussed elsewhere, Rust’s poor compile times are not the result of its advanced
type system, but of a combination of other factors. Some are essential, like the runtime
guarantees it makes (e.g. monomorphization) whereas others are accidental, like some aspects of
its module system. None of these factors would be essential for our language, so we would
carefully avoid these pitfalls.
- It should have a runtime which suits it well to the major use cases for application programming
languages today. This means mainly being well suited to developing for the web, both
front-end and back-end. (Being well-suited to the mobile platforms is unrealistic for a language
not sponsored by those platform developers, unfortunately.) Being well-suited to CLIs would also
I want to focus the rest of this post on my thoughts for evolving Rust’s ownership and borrowing
system, but before I do that I want to briefly touch on other design decisions that fell out of this
- I would target WASM, and only WASM, for this language. WASM with reference types is suitable as
an environment for application programming (with shims for future extensions like properly
integrated garbage collection). This way the language designers can piggyback on the work being
done at many companies to establish WASM as a good shared VM platform, instead of being
responsible for things like platform compatibility or using the very slow LLVM. Targeting WASM
would also mean easier FFI integration into other languages that run on the same VM as WASM; that
- I would explore control-flow-capturing closures as a core language abstraction, similar to Kotlin.
As I wrote in an earlier blog post inspired by the design of this hypothetical
language, I think these are a great way to integrate effects well with higher order function
- I would provide syntactic sugar for Option as the way to handle null and errors, similar to
Swift.
- As I wrote in a previous blog post, I would provide green threads as the sole concurrency model,
with language or standard library provided channels and cells (discussed later) as the way of
sharing data between threads. How these green threads are mapped to CPUs is a matter for the
runtime you choose to run the compiled WASM in.
- I didn’t get to the point of designing a polymorphism system; I would probably start with a
strenuous comparison of Rust’s traits and Go’s interfaces, and (knowing the other features of the
language) try to figure out what from Rust’s traits is unimportant.
- I would hope the language could avoid macros, which (in the case of pattern-based macros) add a
second meta language to the language that advanced users need to understand, and in all cases
substantially complicate compilation.
But now onto the meat of this post: the ownership and borrowing model. In my previous post I made
some points that I largely agree with still, but would probably reframe. Here’s what I wrote:
Rust works because it enables users to write in an imperative programming style, which is the
mainstream style of programming that most users are familiar with, while avoiding to an impressive
degree the kinds of bugs that imperative programming is notorious for. As I said once, pure
functional programming is an ingenious trick to show you can code without mutation, but Rust is an
even cleverer trick to show you can just have mutation.
Resource acquisition is initialization: Objects should manage conceptual resources like file
descriptors and sockets, and have destructors which clean up resource state when the object goes
out of scope. It should be trivial to be confident the destructor will run when the object goes
out of scope. This necessitates most of ownership, moving, and borrowing.
Aliasable XOR mutable: The default should be that values can be mutated only if they are not
aliased, and there should be no way to introduce unsynchronized aliased mutation. However, the
language should support mutating values. The only way to get this is the rest of ownership and
borrowing, the distinction between borrows and mutable borrows and the aliasing rules between
In other words, the core, commonly identified “hard part” of Rust – ownership and borrowing – is
essentially applicable for any attempt to make checking the correctness of an imperative program
tractable. So trying to get rid of it would be missing the real insight of Rust, and not building
on the foundations Rust has laid out.
I still think this is Rust’s “secret sauce” and it does mean what I said: the language would have to
have ownership and borrowing. But what I’ve realized since is that there’s a very important
distinction between the cases in which users want these semantics and the cases where they largely
get in the way. This distinction is between types which represent resources and types which
In this mental model, resources are types which represent “a thing” – something with an identity and
a state which can change with time as the program executes. In Rust, almost everything is a
resource: a String is a resource, a HashMap is a resource, most user types are resources. In
contrast, data types are just “information” – a fact, which has no meaningful identity, contains no
state that evolves over time, etc. In Rust, types like integers, &str, and so on – which all
implement Copy – are data types. (However, a mutable reference to those types is a resource: more
on this later.)
In Rust, only types which can be cloned by a memcpy can implement Copy. This is because Rust is
designed to encourage treating all heap memory as a resource, the management of which the end
user can control by selecting when the type representing that memory is dropped. This is very
valuable in the domains which Rust is intended to target. However, for higher level applications
that most programmers write, control over heap memory is not usually important. This is what users
mean when they want to “turn off the borrow checker” – they want to let a garbage collector figure
it out for them when this bit of data is freed, because to them it is “just data” and not a
This hypothetical language would lean into that distinction. Using persistent data structures (like
those from Clojure) and garbage collection, the set of types which could be treated as data types
would not be restricted in this language. The string type would be a data type, rather than
a resource; a dynamically sized array of data types would be a data type as well, as would a map
with keys and values that are data types.
Meanwhile, types representing IO objects would always be resource types. Collections containing
resource types would also be resource types. Composite types (like structs and enums) which contain
a resource type would also have to be a resource type. There would be an easy way to convert data
types to fully owned resource types as well; in the case of persistent data structures, converting
a data type to a resource type would be the point at which the “copy on write” operation occurs.
As a result users can use ownership semantics for things which impact global and external state
(like IO) and for cases where they know it will be an important performance optimization.
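To make the “copy on write at the point of conversion” idea concrete, here is a minimal sketch in today’s Rust rather than in the hypothetical language. It assumes a reference-counted `Rc<Vec<i32>>` stands in for a garbage collected data type, and `push_owned` is a name made up for this illustration. `Rc::make_mut` only copies the contents when some other handle still points at them.

```rust
use std::rc::Rc;

/// Hypothetical helper: takes a shared ("data") vector, obtains an exclusive
/// ("resource") view of it, and appends one value.
fn push_owned(mut shared: Rc<Vec<i32>>, value: i32) -> Rc<Vec<i32>> {
    // `Rc::make_mut` clones the inner Vec only if other handles exist;
    // if this is the sole handle, it mutates in place.
    Rc::make_mut(&mut shared).push(value);
    shared
}

fn main() {
    let original = Rc::new(vec![1, 2, 3]);
    // Cheap copy: only a reference count is bumped.
    let alias = Rc::clone(&original);

    // The copy happens here, at the moment an exclusive view is requested.
    let updated = push_owned(alias, 4);

    assert_eq!(*original, vec![1, 2, 3]);
    assert_eq!(*updated, vec![1, 2, 3, 4]);
    assert!(!Rc::ptr_eq(&original, &updated));
}
```

A real persistent collection would share structure instead of copying the whole vector, so this only shows where the cost lands, not how large it has to be.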
And the difference in how the language treats data and resources would be identical to the
difference between how Rust treats Copy and non-Copy types. Only resources would have affine
“ownership” semantics – in which moving them invalidates the previous binding. Data types would have
the standard non-linear semantics users are familiar with from most languages. This means that
writing algorithms using data types would be functionally the same as writing algorithms in other
imperative languages, easing the onboarding of users to the language and limiting their errors
related to linear types to areas where they are certain to care.
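For readers less familiar with how Rust draws this line today, the following small example shows the Copy versus non-Copy behaviour that the data/resource split would mirror. The `Point` and `Connection` types and the two functions are invented for this illustration.

```rust
/// Plays the role of "data": it is Copy, so passing it by value
/// leaves the original binding usable.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Plays the role of a "resource": it is not Copy, so passing it
/// by value moves it and invalidates the previous binding.
#[derive(Debug)]
struct Connection {
    id: u32,
}

fn shift(p: Point) -> Point {
    Point { x: p.x + 1, y: p.y }
}

/// Consumes the connection; the caller cannot use it afterwards.
fn close(c: Connection) -> u32 {
    c.id
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = shift(p);
    // `p` is still valid because Point has non-affine (Copy) semantics.
    assert_eq!(p, Point { x: 1, y: 2 });
    assert_eq!(q, Point { x: 2, y: 2 });

    let conn = Connection { id: 7 };
    let id = close(conn);
    // Using `conn` here would be rejected by the compiler: it was moved.
    assert_eq!(id, 7);
}
```

In the hypothetical language, strings, arrays and maps of data would behave like `Point` here, which Rust cannot offer because they own heap memory.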
Borrowing and the two reference types
The previous discussion covers the ground of ownership, but what about borrowing? It turns out that
this distinction between resources and data is also the distinction between the two reference
types in Rust. A shared reference implements Copy, and is properly understood as “data” (with no
meaningful identity of its own) and an exclusive/mutable reference does not implement Copy, and
does not make sense to treat as “data” – it is exclusive (meaning it has an identity) and it is
mutable (meaning it has updating state).
This means that these two reference types would function as temporary views of another type as
either a data type or a resource. It doesn’t matter if the underlying type is data or a resource; a
“data view/shared reference” of any type is data, and a “resource view/mutable reference” of any
type is a resource. This allows users to temporarily switch modalities for a particular value,
depending on what they need. Of course, just like in Rust, a “data view” would not give the full
power over the type that a “resource view” has, whereas a “resource view” could always be degraded
into a “data view.” (This is the same semantics that references have in Rust.)
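The Rust behaviour being borrowed here can be shown in a few lines. This is ordinary Rust, not the hypothetical language, and the function names are made up for the example: shared references can be copied freely, while a mutable reference can be degraded into a shared one for the duration of a call.

```rust
/// Needs only a "data view": a shared reference to the contents.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Needs a "resource view": an exclusive, mutable reference.
fn push_and_total(values: &mut Vec<i32>, extra: i32) -> i32 {
    values.push(extra);
    // The exclusive view is implicitly reborrowed as a shared view here.
    total(values)
}

fn main() {
    let mut v = vec![1, 2, 3];

    let a = &v;
    // Shared references are Copy: `a` stays usable after being copied into `b`.
    let b = a;
    assert_eq!(total(a) + total(b), 12);

    assert_eq!(push_and_total(&mut v, 4), 10);
}
```

The reverse direction does not compile: nothing turns a `&Vec<i32>` into a `&mut Vec<i32>`, which matches the point that a data view has less power than a resource view.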
Note also that I’ve said “view” rather than “reference,” because the language is designed to make no
guarantees about the representation of types. Depending on what makes the most sense for the
implementation, either all types are “reference types” unless the compiler can unbox them, or all
types can be automatically boxed by the compiler if it determines it needs to. So these views should
not be imagined as “pointers to” the underlying type, they may have the same representation as that
I previously said that the language would have two primitives to communicate between concurrent
subprocesses: channels and cells. I have nothing interesting to say about channels, but I want to
discuss the Cell type, the language’s only shared state primitive.
The Cell type would be implemented as a garbage collected read/write lock. How this lock is
implemented is the business of the runtime (e.g. on a runtime which runs green threads in parallel,
it would use atomics, whereas on a runtime which runs green threads on a single thread, it would not
need to). A Cell has data semantics, but allows constructing resource views of the underlying type
(in essence, performing a write lock.) Thus the Cell type allows treating resource types as if they
were data, even when calling resource-view methods. It essentially moves the compile time checks on
resource types to the runtime, removing also any guarantees about when the type will be destroyed.
Note that this language would have no Sync traits, because all types would have Sync semantics:
everything can be shared across all green threads. Thus there would be no
restriction at all about what could be put into the cell type.
Ideally, the Cell type could even hand out an unguarded “resource view,” as opposed to a newtype
like the MutexGuard that Rust uses; it would be great if the compiler could somehow insert unlocks
when the resource view goes out of scope. This may require something analogous to monomorphization
though, and so it could impact compile times and implementation complexity; it may be a luxury we’d
have to live without.
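The closest thing in today’s Rust is probably an `Arc<RwLock<T>>`, so here is a rough approximation of the Cell idea under several assumptions: reference counting stands in for garbage collection, OS threads stand in for green threads, and `SharedCell`, `update`, `read` and `count_in_parallel` are names invented for this sketch. Passing a closure is one way to hide the guard type from the caller without any compiler support.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for the Cell type: a cheaply clonable handle
/// to a value behind a read/write lock.
struct SharedCell<T>(Arc<RwLock<T>>);

impl<T> Clone for SharedCell<T> {
    // Cloning copies the handle, not the value inside it.
    fn clone(&self) -> Self {
        SharedCell(Arc::clone(&self.0))
    }
}

impl<T> SharedCell<T> {
    fn new(value: T) -> Self {
        SharedCell(Arc::new(RwLock::new(value)))
    }

    /// Runs `f` with a "resource view"; the write lock is released
    /// when the guard is dropped at the end of this function.
    fn update<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        let mut guard = self.0.write().expect("lock poisoned");
        f(&mut *guard)
    }

    /// Runs `f` with a "data view" under the read lock.
    fn read<R>(&self, f: impl FnOnce(&T) -> R) -> R {
        let guard = self.0.read().expect("lock poisoned");
        f(&*guard)
    }
}

/// Each worker pushes its index into a shared vector; returns the final length.
fn count_in_parallel(workers: usize) -> usize {
    let cell = SharedCell::new(Vec::new());
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let cell = cell.clone();
            thread::spawn(move || cell.update(|v| v.push(i)))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    cell.read(|v| v.len())
}

fn main() {
    assert_eq!(count_in_parallel(4), 4);
}
```

Unlike the Cell described above, this version still needs the contents to be `Send + Sync` before it can cross threads, which is exactly the restriction the hypothetical language would drop.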
I don’t have any intention of working on developing these ideas into a real language, so I thought
publishing the result of my design work would be the best way to give that work a bit of impact. I
hope anyone considering designing a new application language would consider these ideas as a way to
give users guaranteed correctness around resource management. I’m open to hearing from people who
are interested in these ideas; though as with most email I receive I may regrettably fail to
Since this design work was in a spirit of direct contradiction to Rust’s goals, it’s hard to see it
having a big impact on the design of Rust. One thing I’d mention, though, is that the discussion of
an Autoclone trait (types with non-affine semantics even though their clones require code
execution) is relevant to this distinction I’ve made between resources and data. I don’t think most
types should be Autoclone (my list would include: Rc, Arc, and maybe persistent collection types
like Bodil Stokke’s im), but I do think it would benefit users to have the choice to not
treat memory like a resource when the bookkeeping necessary to control that management is at
cross-purposes with their end goals.
Finally, I think if anyone were to pursue this, the big area that I’ve handwaved over is the
“persistent data structures” part. I discuss converting between data types and resource types, even
temporarily: it may require novel research to create collection types that could sometimes have the
performance calculus of persistent collections (cheap to copy, expensive to mutate) and sometimes
have the performance calculus of mutable collections (cheap to mutate, expensive to copy).