My thoughts on TrapC
Today, I came across this proposal for a new language/language extension to C/C++ that claims memory safety. Here’s some of my thoughts on it.
I’ve seen some mixed opinions online about this, but a lot of the discussion lacked real discussion around what this proposal misses, and also discussions about the trade-offs in the new features it proposes. After reading the proposal, I remain unconvinced that the current state of proposal will actually solve as many problems as it claims. Here are some of my concerns.
trap
s are hard to enforce safety around
TrapC
’s eponymous new feature are “trap” handlers. Effectively, an exception
handler, with the added caveat that “exceptions” (traps) don’t propagate more
than one level by default (though users can pass them up the stack). The author
claims that this is preferable to the C++ implementation which requires
constructing an alternate exception handling stack. Returning a trap will trap
to the handler in the caller (if present, and abort otherwise), and provide the
caller a chance to handle the error, or continue. Returned values are zeroed
out.
This seems at odds with the idea of memory safety. What happens in the following scenario?
int* alloc_int() {
int x;
// Supposedly TrapC will hoist local/stack memory allocation if the compiler
// detects that you're returning local addresses?
return &x;
}
int main() {
int *x = alloc_int();
// one must assume that failure to allocate memory raises a trap. The
// proposal didn't mention it. If it doesn't by default, one could implement
// an allocator that does.
trap {
// do nothing - x should be zeroed out
}
// Is this a segfault in cases where we trapped?!
return *x;
}
It’s possible for the compiler to analyze and handle this for this specific
example, the compiler could either refuse to compile this, by saying that it
couldn’t prove the validity of x
, or generate an alternate code path after the
trap that always aborts. But it’s not clear to me if that’s always possible - or
if it’s even desirable in the first place. The author also seems to be
anti-compiler warning/linting preferring to instead have the compiler always
generate the safest possible code so I guess this example would silently be a
abort? Making this memory safe in general requires adding checks on every
deference after a trap which is bound to have a performance impact.
free
no-ops and compatibility with existing C code
The proposal mentions making free
a no-op to avoid use-after-free bugs.
Instead, all free
s are deferred to the end of the function. The proposal
doesn’t mention how this works in the case of nested scopes, which seems
error-prone to me. Consider the following:
char* foo() {
char* retval;
for (int i = 0; i < 100; i++) {
char* data = some_huge_allocating_function(i);
do_something_with(data);
if some_property_of(data) {
retval = data;
} else {
free(data);
}
a:
}
b:
return retval;
}
In the above example, does the free happen at a
or b
? If it’s b
we have
the possibility of running out of memory, which could safely generate a trap,
but it’s unclear from the proposal if that is the case. If it’s a
, how do you
know which pointers to not free? Does the runtime need to track references? The
proposal is lacking clarity around this.
Also, it’s not clear whether or not pointers are fungible. The proposal mentions
striving for compatibility with c
code, but that seems difficult without
blocking casts from pointer to integer, and also disallowing functions like
setjmp
/longjmp
. It does seem like goto
is disallowed.
Thread safety
The proposal focuses on UB stemming from use-after-free and unhandled errors.
However, in my experience, a significant source of memory issues comes from
multi-threaded execution. My reading of the proposal gave me no insights on how
TrapC
alleviates those sources of error.
Conclusion
I’m not a hater - TrapC
has some interesting ideas. The trap
syntax for
error handling seems relatively ergonomic, and the general premise that
compilers and languages standards could do more by default to protect against
programmer errors isn’t a bad one. However, I think linter/compiler errors do a
better job of informing users of execution model which is important in
performance critical scenarios. If performance isn’t critical, then it might be
better to opt for a garbage collected language instead, which can deal with
issues of memory safety far more comprehensively. The proposal lacks critical
details, and leaves many important points as “implementation details”, which is
how we ended up with UB in languages like C over time. I’m interested to see how
TrapC
will evolve over time, but I think there’s already many similar projects
that have come and gone.
My key takeaway here is that there’s some demand and viability for a compiler that provides safe, albeit slower or less efficient execution of existing programs. This might be a good option for safety critical situations where legacy code needs to be run. However, even in those cases, it’s not clear that providing safe/defined alternatives will actually improve safety, because safely reasoning about the intent of the original programs after UB is (by definition) impossible. In those cases, I still think that raising warnings and errors is more likely to lead to positive outcomes.