Log, don't throw
Or, the actual title: My dumb error-handling idea.
Eternal struggle
It's ${CURRENT_YEAR}, and error handling is STILL not a "solved problem", especially when it comes to balancing ergonomics (how good it feels to code with) and performance. Which approach to use is still very much up for debate: exceptions, error codes, or Result/Option types, whether monadic (Rust etc.) or tuple (Go, Odin).
I have made the mistake of reading programmer "Disc Horse" (particularly on Hacker News and Lobsters) where ✨the enlightened✨ make fun of the Blub programmers for having such an error-prone and inefficient way of handling errors, among other things. All of this hasn't added clarity, only confusion. Enough for me to add fuel to the fire, especially as someone completely unqualified to yap about any of this.
So let's first enter these rough points from said discourses into "evidence", which I won't elaborate on or argue for:
- The design of exceptions is bad and you should feel bad, prefer error codes and result types.
- Error handling should be front-and-center and not just afterthoughts, therefore they should be part of your code.
- "Careful thought and strategically-placed print statements" still reign supreme.
- The if err != nil: return nil, err ("simple re-throwing") pattern in Go.
- fmt.Errorf("%w", err) ("error wrapping") in Go.
- Log and rethrow can be considered an anti-pattern.
- Premature abstraction is the root of all evil.
- It is still useful to have different types of error, because not every error is equal.
Next, consider this log of an unhandled exception of a web service:
ValueError exception: Can't parse "blerp" as Date
from: std/parsedate:100
from: /home/zumi/dev/myDumbProject/src/parsetools/object:245
from: /home/zumi/dev/myDumbProject/src/fetch/object:124
from: /home/zumi/dev/myDumbProject/src/controller/getObject:24
from: /home/zumi/dev/myDumbProject/src/handleRequests:24
You can probably infer from this that there has been an error when accessing some Object, and that it has received a nonsense value for a date input. But there are two bits of context I'm missing here:
- What kind of date was it trying to parse, exactly? As in, which database column? (and it'll most likely be from a database)
- Which object ID did it choke on?
In a trivial database, you can probably go to the line and then go to the database to find the odd value. But all "serious" web services will have to talk about scale. You can imagine that plus a wild request, and suddenly it becomes a bit harder to rectify.
And you could argue that I should have caught it at the lowest level I could. A bit easier with a language that has checked exceptions, because the compiler will tell you which ones to catch.
Besides, the whole point of exceptions is so that you can get the error-handling cruft out of the way, right? So then, you'd be likely to wrap your entire function inside of a giant try statement rather than try-catch individually. In some languages, try is its own block. Imagine if you want to initialize variables this way, unless blocks can also be expressions.
But what if I can have this instead:
[!!] can't parse object's creation date. e: "ValueError: Can't parse "blerp" as Date". value = "blerp". [std/parsedate:100]
[!!] can't parse object. [parsetools/object.src:245]
[!!] can't fetch object. id = 99. [fetch/object.src:124]
[!!] can't display object. id = 99. [controller/getObject.src:24]
Right out of the gate, I have the answer to the questions of context, and I can fix the issue faster:
- blerp for some reason was the value of the creation date in the database.
- The issue was with ID 99.
This can be achieved any way you like. But what if the log is the stack trace?
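One way to get log lines shaped roughly like the mock-up above in Go is to let the standard log package attach the call site. This is only a sketch: failLog, parseObject, fetchObject and captureChain are made-up names, and Go prints the file:line right after the prefix rather than at the end of the line.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"
)

// failLog is a hypothetical helper. It writes one log line tagged with the
// file:line of whoever called failLog (calldepth 2 skips failLog itself).
func failLog(l *log.Logger, format string, args ...any) {
	_ = l.Output(2, fmt.Sprintf(format, args...))
}

// parseObject pretends the creation date column held a nonsense value.
func parseObject(l *log.Logger, raw string) bool {
	failLog(l, "can't parse object's creation date. value = %q", raw)
	return false
}

// fetchObject adds the context only it knows: the object ID.
func fetchObject(l *log.Logger, id int) bool {
	if !parseObject(l, "blerp") {
		failLog(l, "can't fetch object. id = %d", id)
		return false
	}
	return true
}

// captureChain runs the failing call chain and returns the logged lines.
func captureChain() []string {
	var buf bytes.Buffer
	// Lshortfile prints "file.go:line:" right after the "[!!] " prefix.
	l := log.New(&buf, "[!!] ", log.Lshortfile)
	fetchObject(l, 99)
	return strings.Split(strings.TrimSpace(buf.String()), "\n")
}

func main() {
	for _, line := range captureChain() {
		fmt.Println(line)
	}
}
```

Each layer logs what it knows (the bad value, the ID) at the point where it knows it, so the log reads top to bottom like a stack trace with context.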
There's a practice I know of that simply has strings as the error value, so basically Result[T, string]. I think the idea is that they're just values, which keeps the function pure and leaves you the option of logging it or not.
What if I, like a dumbass, don't care about "keeping it pure"? I just assume that a logger package is present, initialized, and I utterly depend on it, and I don't want the option of not logging the error?
And instead of having only one or maybe two kinds of error, results or error codes or exceptions or whatever... what if I have all of them??
My dumb idea
I came up with these error types:
- Outcome
- DifferentiatedFail
- Option[T]
- Result[T, DifferentiatedFail]
- Result[T, ContextualFail]
It's important to note that this is NOT a general error-handling strategy. This shouldn't be used to "take over the world", but instead to ONLY be used in YOUR code. The libraries you use can do whatever, exceptions, results, whatever. The strategy described here should be used in code that YOU care about. It's really a way of marking boundaries between app code and "library" code. A.k.a, "app code that faces the user" where this approach can be used vs. "app code that can be used elsewhere" where e.g. exceptions can be used.
And instead of the focus being on the function itself, the focus is on whoever calls it. I think it fits into the "code what you need" mindset instead of prematurely preparing abstractions that will not hold up.
For the pseudo-code in this article let's assume something that looks kind-of-like C, and has exceptions.
Outcome
A value of either Fail or Ok. Implementing this can simply be a boolean true/false; which one denotes the error depends on what kind of standards you have. For example, C functions usually say "return true if there is an error", which results in "non-zero value denotes an error".
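In Go, a minimal sketch of Outcome could be a named bool. The polarity (true means Ok) and the saveDraft function are my own assumptions for illustration.

```go
package main

import "fmt"

// Outcome is a two-valued result. This sketch picks true for success;
// the polarity is a convention, not something the type enforces.
type Outcome bool

const (
	Ok   Outcome = true
	Fail Outcome = false
)

// saveDraft is a hypothetical function with exactly one way to fail.
func saveDraft(text string) Outcome {
	if text == "" {
		return Fail
	}
	return Ok
}

func main() {
	fmt.Println(saveDraft("hello") == Ok)
	fmt.Println(saveDraft("") == Fail)
}
```

Giving the bool a name means call sites compare against Ok or Fail instead of guessing what true stands for.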
DifferentiatedFail
An enum that just says what kinds of errors are possible that you care about. They can be literally anything depending on what you need, perhaps in the form of these constants, which map neatly to HTTP errors:
InternalFail // 500
ExistsFail // 404
PermissionFail // 403
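As a rough Go sketch, the enum and its mapping to HTTP statuses might look like this. The HTTPStatus method name is an assumption of mine; the status constants come from net/http.

```go
package main

import (
	"fmt"
	"net/http"
)

// DifferentiatedFail lists only the failure kinds the caller cares about.
type DifferentiatedFail int

const (
	InternalFail DifferentiatedFail = iota
	ExistsFail
	PermissionFail
)

// HTTPStatus maps a failure kind to the status a handler would reply with.
// Unknown kinds fall back to 500.
func (f DifferentiatedFail) HTTPStatus() int {
	switch f {
	case ExistsFail:
		return http.StatusNotFound
	case PermissionFail:
		return http.StatusForbidden
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	for _, f := range []DifferentiatedFail{InternalFail, ExistsFail, PermissionFail} {
		fmt.Println(f.HTTPStatus())
	}
}
```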
Option[T]
Your bog-standard "safe nullable reference" optionals type that you need to "unwrap" to use, forcing you to check the thing. In absence of this, you could just use the nullable reference type, but you carry the risk that comes with it.
Here it communicates that either there is only one possible way that the call can error out, or that the caller should not care what kind of error occurred. After all it just needs to know whether it would receive the value or not.
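Go has no built-in optional type, so here is one possible sketch using generics. Some, None, Get and the portFor lookup are all names I made up for this example.

```go
package main

import "fmt"

// Option holds either a value (ok == true) or nothing.
type Option[T any] struct {
	value T
	ok    bool
}

func Some[T any](v T) Option[T] { return Option[T]{value: v, ok: true} }
func None[T any]() Option[T]    { return Option[T]{} }

// Get returns the value together with a presence flag, so the caller has to
// at least acknowledge the flag before using the value.
func (o Option[T]) Get() (T, bool) { return o.value, o.ok }

// portFor is a hypothetical lookup with a single failure mode: not found.
func portFor(service string) Option[int] {
	ports := map[string]int{"http": 80, "https": 443}
	if p, found := ports[service]; found {
		return Some(p)
	}
	return None[int]()
}

func main() {
	if p, ok := portFor("https").Get(); ok {
		fmt.Println("port:", p)
	}
	if _, ok := portFor("gopher").Get(); !ok {
		fmt.Println("no port known")
	}
}
```

The caller never learns why the lookup failed, only that there is no value, which is exactly the contract described above.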
Result[T, E]
The ok-or-error monad that Rust convinced me was the One True Way to Go. And speaking of, yeah, Go fits this model, although it's like the "standard nullable reference" version of this model, since you can forget to check it. oooOOOOooOOo BILLION DOLLAR MISTAKE!!!!!! BAD!!!!!! or something. Hate this type of antagonism.
ContextualFail
The closest thing to "exception objects", but it's lighter because it just contains this:
struct ContextualFail
{
    kind: DifferentiatedFail;
    message: string;
};
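Put together in Go, Result[T, E] carrying a ContextualFail could be sketched like this. Success, Failure, Unpack and readSecret are hypothetical names, and a struct with a flag is just one of several ways to fake a sum type in Go.

```go
package main

import "fmt"

type DifferentiatedFail int

const (
	InternalFail DifferentiatedFail = iota
	ExistsFail
	PermissionFail
)

// ContextualFail mirrors the struct above: a kind plus a message.
type ContextualFail struct {
	Kind    DifferentiatedFail
	Message string
}

// Result holds either a value of type T or a failure of type E.
type Result[T, E any] struct {
	value T
	fail  E
	ok    bool
}

func Success[T, E any](v T) Result[T, E] { return Result[T, E]{value: v, ok: true} }
func Failure[T, E any](e E) Result[T, E] { return Result[T, E]{fail: e} }

// Unpack returns the value, the failure, and which of the two is valid.
func (r Result[T, E]) Unpack() (T, E, bool) { return r.value, r.fail, r.ok }

// readSecret is a hypothetical function whose caller needs the message.
func readSecret(user string) Result[string, ContextualFail] {
	if user != "admin" {
		return Failure[string](ContextualFail{
			Kind:    PermissionFail,
			Message: "only admin may read the secret",
		})
	}
	return Success[string, ContextualFail]("hunter2")
}

func main() {
	if _, fail, ok := readSecret("guest").Unpack(); !ok {
		fmt.Println(fail.Kind == PermissionFail, fail.Message)
	}
}
```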
Which one to pick
- Errorless functions obviously either return some value of type T or have nothing to return (void/never; some call these functions "procedures" instead).
- One type of error to care about? Either Outcome or Option[T].
- Multiple types of error? DifferentiatedFail or Result[T, DifferentiatedFail].
- If and only if your caller needs a custom message that they themselves can't generate, then you should go for Result[T, ContextualFail].
In short: [flowchart: which error type to pick]
The thing about DifferentiatedFail is—again—it can be anything, even function-specific:
UsernameVibeCheckFailed
PasswordTooStrong
ConfirmationPasswordMismatch
This is fine for standalone console apps or whatever, but if you're writing a web service, you may as well fold it into a ContextualFail:
ContextualFail{
    .kind: ValidationFail,
    .message: "Invalid user name, must have at least one em-dash in it"
}

ContextualFail{
    .kind: ValidationFail,
    .message: "Password too strong, must have a maximum of 3 characters"
}

ContextualFail{
    .kind: ValidationFail,
    .message: "Password confirmation doesn't match password input"
}
The idea here being that ValidationFail would map to a 400, and then the message can be flashed alongside the re-rendered form.
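Here's a small Go sketch of that folding, using the joke rules from the messages above. validateSignup and statusFor are names I invented, and returning a nil pointer for "no failure" is just one convention you could pick.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

type DifferentiatedFail int

const (
	InternalFail DifferentiatedFail = iota
	ValidationFail
)

type ContextualFail struct {
	Kind    DifferentiatedFail
	Message string
}

// statusFor maps a failure kind to an HTTP status code.
func statusFor(kind DifferentiatedFail) int {
	if kind == ValidationFail {
		return http.StatusBadRequest
	}
	return http.StatusInternalServerError
}

// validateSignup returns the first rule that fails, or nil when all pass.
func validateSignup(username, password, confirm string) *ContextualFail {
	fail := func(msg string) *ContextualFail {
		return &ContextualFail{Kind: ValidationFail, Message: msg}
	}
	switch {
	case !strings.Contains(username, "\u2014"): // U+2014 is the em-dash
		return fail("Invalid user name, must have at least one em-dash in it")
	case len(password) > 3:
		return fail("Password too strong, must have a maximum of 3 characters")
	case password != confirm:
		return fail("Password confirmation doesn't match password input")
	}
	return nil
}

func main() {
	if f := validateSignup("zumi", "abc", "abc"); f != nil {
		// A handler would reply with this status and flash the message.
		fmt.Println(statusFor(f.Kind), f.Message)
	}
}
```

All three function-specific failures collapse into one kind, and the message carries the part only the validator knows.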
Show code pls
Alright. I'm a bit more familiar with Go so that's what I'll use as reference. In Go, you might do:
func callee() (T, error) {
    // ...
    if err != nil {
        // you can choose to bury errors here
        return nil, errors.New("can't do x")
    }
    //
    return ...
}

func caller() error {
    a, err := callee()
    if err != nil {
        // if callee uses this pattern, you can easily
        // get a mile long string...
        return fmt.Errorf("can't get a: %w", err)
    }
    // do stuff with a
    return nil
}

func main() {
    err := caller()
    if err != nil {
        // you could log only in here, but then you
        // don't get WHY it happens...
        // (%w only works in fmt.Errorf, so print with %v)
        log.Fatalf("%v", err)
    }
}
In my dumb error handling proposal, I'd do, in pseudo-code:
Option[T] callee()
{
    try
    {
        x = // ...
        return x.some()
    }
    catch SomeError as e
    {
        log.errorf("cannot get x: %s", e.message)
        return none(T)
    }
}

Outcome caller()
{
#if you_prefer_pattern_matching
    match (callee())
    {
        some(a)
        {
            // do stuff with a
            return Ok
        }
        none()
        {
            // note how you don't return this log itself
            // as a value, but it's just a log
            log.errorf("cannot get a")
            return Fail
        }
    }
#else
    a = {
        result = callee()
        if (result.isNone())
        {
            log.errorf("cannot get a")
            // this is an early *explicit* return,
            // exits the entire function
            return Fail
        }
        // *implicit* return assigns to `a`
        // and continues
        result.get()
    }
    // do stuff with a
    return Ok
#endif
}

int main()
{
    if (caller() != Ok)
    {
        log.errorf("cannot do thing")
        return 1;
    }
    return 0;
}
You'd get:
cannot get x: some internal error
cannot get a
cannot do thing
Notice how you don't need to:
- Carry around strings
- Have an existential crisis over what to log, because you log everything
- Care about telling WHY it failed, because it's already been said
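For comparison, the same callee/caller/main chain in actual Go could be sketched as below. It assumes the hypothetical Option and Outcome types from this post, uses strconv.Atoi as a stand-in for "the library call that can fail", and a package-level logger.

```go
package main

import (
	"log"
	"os"
	"strconv"
)

type Outcome bool

const (
	Ok   Outcome = true
	Fail Outcome = false
)

type Option[T any] struct {
	value T
	ok    bool
}

func Some[T any](v T) Option[T]    { return Option[T]{value: v, ok: true} }
func None[T any]() Option[T]       { return Option[T]{} }
func (o Option[T]) Get() (T, bool) { return o.value, o.ok }

// The logger is global on purpose: every function logs its own failure.
var logger = log.New(os.Stderr, "", 0)

// callee turns the library's error into a log line plus an empty Option.
func callee(raw string) Option[int] {
	x, err := strconv.Atoi(raw)
	if err != nil {
		logger.Printf("cannot get x: %v", err)
		return None[int]()
	}
	return Some(x)
}

// caller logs what it could not do and passes on only Ok or Fail.
func caller(raw string) Outcome {
	a, ok := callee(raw).Get()
	if !ok {
		logger.Println("cannot get a")
		return Fail
	}
	_ = a // do stuff with a
	return Ok
}

func main() {
	if caller("blerp") != Ok {
		logger.Println("cannot do thing")
		// a real program would exit with a non-zero status here
	}
}
```

No error value travels upward; each level only sees Ok/Fail or some/none and contributes its own line to the log.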
How might an error string be used, aside from logging, anyway? Go already has a hard time differentiating errors (you need to preallocate them beforehand and then use errors.Is), so I think that says something about returning error strings as a concept.
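For context, this is roughly what differentiating errors looks like in Go. errors.New, errors.Is and the %w verb are real standard library features; ErrNotFound, find, load and isNotFound are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a sentinel: it has to be declared up front so that callers
// have something to compare against.
var ErrNotFound = errors.New("not found")

// find fails with the bare sentinel for any id other than 1.
func find(id int) error {
	if id != 1 {
		return ErrNotFound
	}
	return nil
}

// load wraps the error from find with extra text using %w.
func load(id int) error {
	if err := find(id); err != nil {
		return fmt.Errorf("can't load %d: %w", id, err)
	}
	return nil
}

// isNotFound looks through the wrapping chain for the sentinel.
func isNotFound(err error) bool { return errors.Is(err, ErrNotFound) }

func main() {
	err := load(99)
	fmt.Println(err)                // the wrapped message
	fmt.Println(err == ErrNotFound) // false: wrapping breaks plain comparison
	fmt.Println(isNotFound(err))    // true: errors.Is unwraps the chain
}
```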
Exception object implementations are varied, but they usually need to have a stack trace. The common complaint is that they need heap allocations for composing the error messages and such. No difference here if you do formatting for everything, but you could define a bunch of strings in .data to mitigate that. Again, what do you plan on using all of that for?
In some languages you might even want to create new Exception types because at a high level you don't need to care about the internals of whatever it is you're calling, e.g. a controller shouldn't need to care about a DbError. I think here even that would fit.
If you want to use this in a web service in particular, there's another point that could be in favor of this:
Users don't need to know the precise error logs. But the server admin does.
"Precise error logs" in this model, then, is an explicit opt-in, rather than an opt-out. Let's assume the standard handler-repository pattern.
Here's some rough, contrived "repository" code. In an exception-laden language, you might want to try-and-catch at the most granular level. But sometimes you don't want to handle things like DbError which can happen at every step of the way.
// as in this tiny example the kind of error we're expecting is only
// a server-side error, we can use an Option[T] here.
Option[T] getPostDate(int postId)
/** error kind: internal **/
{
    db = getGlobalDatabase()
    try
    {
        q = db.prepareStatement(makePostQuery(postId))
        s = db.execute(q)
        r = db.getColumn(s, "post_date")
        try
        {
            return r.asDate().some()
        }
        catch ValueError as e
        {
            // you can tell the problem value here
            log.errorf("cannot parse post date: %s. value=%s", e.message, r)
            return none()
        }
    }
    catch DbError as e
    {
        // since this is a catch-all, we won't get which one
        // of these db calls (prepareStatement, execute, getColumn)
        // errored out. so we might wanna print its stack trace anyway.
        log.errorf("%s", e.getStackTrace())
        log.errorf("db error: %s", e.message)
        return none()
    }
}
And here's the "handler" that takes this function.
void endpointGetPost(HttpServer s, int id)
{
    // assume the post's availability is handled differently
    date = {
        i = getPostDate(id)
        if (i.isNone())
        {
            log.errorf("unable to get post date of id %d", id)
            s.reply(status=500, body=makeErrorPage(500))
            return
        }
        i.get()
    }
    // ...
}
The user only sees the 500 error, while you see:
[!] cannot parse post date: Can't parse "null" as Date. value=null [controller/post:163]
[!] unable to get post date of id 19 [controller/post:145]
Or:
std/dbtools/private/backwarddb:140
std/dbtools:1100
fetch/object:154
[!] db error: The database server didn't respond [fetch/object:174]
[!] unable to get post date of id 19 [controller/post:145]
From the handler code's PoV, you as a caller don't need to know WHY the lower layer failed, like a DbError or whatever. But you want to know WHAT the effects of it are, like whether it's an "internal error" or a "does not exist error", so you can return a 500 or a 404.
From the administrator's PoV, you want to know WHY the lower layer failed, because WHAT the effects of it are is already reported to the user of your service.
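To make the handler/repository split concrete, here's a rough Go version using net/http. Everything specific is an assumption for the sketch: the postDates map stands in for the database, the date layout is arbitrary, and statusOf only exists so the handler can be exercised without a running server.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
	"strconv"
	"time"
)

type Option[T any] struct {
	value T
	ok    bool
}

func Some[T any](v T) Option[T]    { return Option[T]{value: v, ok: true} }
func None[T any]() Option[T]       { return Option[T]{} }
func (o Option[T]) Get() (T, bool) { return o.value, o.ok }

var logger = log.New(os.Stderr, "[!] ", log.Lshortfile)

// postDates is a stand-in for the database: post ID -> raw post_date column.
var postDates = map[int]string{1: "2024-05-01", 19: "null"}

// getPostDate is the "repository". Error kind: internal.
func getPostDate(postID int) Option[time.Time] {
	raw := postDates[postID]
	d, err := time.Parse("2006-01-02", raw)
	if err != nil {
		// the admin gets the WHY, including the problem value
		logger.Printf("cannot parse post date: %v. value=%s", err, raw)
		return None[time.Time]()
	}
	return Some(d)
}

// endpointGetPost is the "handler": it logs its own context and replies 500.
func endpointGetPost(w http.ResponseWriter, r *http.Request) {
	// a missing or malformed id becomes 0, which is not in the fake table
	id, _ := strconv.Atoi(r.URL.Query().Get("id"))
	date, ok := getPostDate(id).Get()
	if !ok {
		logger.Printf("unable to get post date of id %d", id)
		http.Error(w, "Internal Server Error", http.StatusInternalServerError)
		return
	}
	fmt.Fprintln(w, date.Format("2 Jan 2006"))
}

// statusOf runs the handler against an in-memory request and returns the status.
func statusOf(id int) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/post?id="+strconv.Itoa(id), nil)
	endpointGetPost(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusOf(1), statusOf(19))
}
```

The response body never contains the parse error; that detail only ever reaches the log.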
Drawbacks
But in case you're convinced this is a good idea, consider the following.
Again, you need some global log object that is always available, and accessible by every function. If you work in something that expects pure functions, tough luck, you need to pass that logger object around like hot potatoes.
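A quick Go sketch of the two options, with made-up function names: the global logger keeps signatures short but hides the dependency, while the passed-in logger makes every function in the chain carry an extra parameter.

```go
package main

import (
	"log"
	"os"
	"strconv"
)

// appLog is the global logger that every function is allowed to assume.
var appLog = log.New(os.Stderr, "[!] ", 0)

// Global flavor: nothing in the signatures hints that logging happens.
func parsePortGlobal(raw string) (int, bool) {
	p, err := strconv.Atoi(raw)
	if err != nil {
		appLog.Printf("cannot parse port: %v", err)
		return 0, false
	}
	return p, true
}

func startServerGlobal(raw string) bool {
	if _, ok := parsePortGlobal(raw); !ok {
		appLog.Println("cannot start server")
		return false
	}
	return true
}

// Hot-potato flavor: the logger is threaded through every call.
func parsePort(l *log.Logger, raw string) (int, bool) {
	p, err := strconv.Atoi(raw)
	if err != nil {
		l.Printf("cannot parse port: %v", err)
		return 0, false
	}
	return p, true
}

func startServer(l *log.Logger, raw string) bool {
	if _, ok := parsePort(l, raw); !ok {
		l.Println("cannot start server")
		return false
	}
	return true
}

func main() {
	startServerGlobal("eighty")
	startServer(log.New(os.Stderr, "[!] ", 0), "eighty")
}
```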
For Outcome and Option[T] in particular, since they communicate ONE kind of error, you still need to diligently document that yourself.
// wait, does this error or not?
Option[T] derpity()
{

// ahh okay, i see.
Option[T] derpity()
/** error kind: internal **/
{
You find that you suddenly need to differentiate between an internal error and a validation error, but what you called was an Outcome or Option[T]. Okay, let's change that to DifferentiatedFail or Result[T, DifferentiatedFail]. Oops, turns out there's 10 other functions that use it, and you have to change them too. That's right, it's viral! But the solace is in the fact that it's not happening in libraries, only your own code. Besides, abstractions that work do tend to take a while.
Depending on how granular you made your functions, you can't easily shut off some errors from your callers. Your only option here is to set the log levels from the called functions themselves. Incidentally, that also means you can't just "swallow" errors like you might be tempted to do with reg'o exceptions.
And let's not forget the fact that you now have FIVE types to choose from before writing anything, instead of just the one. In which case, you should refer to the flowchart of what to choose.
In conclusion
XKCD 2119.

I'll have to put this into practice for my aforementioned web app, and see how I like it. But until then it's just a theory.
A GAME THEORY. THANKS FOR FLAMING.