Ditch Pros and Cons: Use a Utility Function

If you’ve ever met me in person, you know I talk a lot about utility. My close friends are used to answering questions like “does this have net positive utility to you?” and “is that a strongly weighted component of your utility function?”. Every time I make a decision – what to do with my evening, or what to do with my life – I think about it in terms of utility.

I didn’t always do this, but I’ve adopted this way of thinking because it forces me to clarify everything that’s going on in my head and weight all my thoughts appropriately before making a decision. When I decide things this way, I genuinely feel like I’ve made the best possible choice I could, given everything I knew at the time.

What on earth is utility?

Utility is just a fancy word for “value”. If you enjoy chocolate ice cream, then eating chocolate ice cream has positive utility. If you don’t enjoy carrot cake, then eating carrot cake has negative utility.

One action can have multiple kinds of utility. You can add together all the utility types to get the action’s net utility. For example, if I assign positive utility to eating ice cream but a negative utility to gaining weight, there will be a certain optimal point where I eat as much ice cream as I can without gaining weight. Maybe, if I assign eating ice cream +5 utility, not gaining weight +5 utility, and exercising -5 utility, then it would make sense for me to hit the gym more often so that I can eat more ice cream without gaining weight.
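
To make that arithmetic explicit, here's a tiny sketch (all the numbers are made up for illustration, and I'm treating weight gain as -5, the mirror of the +5 for not gaining weight):

```python
# Made-up utility weights from the example above.
ICE_CREAM = +5       # per serving
NO_WEIGHT_GAIN = +5
WEIGHT_GAIN = -5     # assumed mirror of NO_WEIGHT_GAIN, for illustration
GYM_TRIP = -5

# Eat two servings and skip the gym: the ice cream catches up with me.
skip_gym = 2 * ICE_CREAM + WEIGHT_GAIN               # 10 - 5 = 5

# Eat two servings but burn them off with a gym trip.
hit_gym = 2 * ICE_CREAM + GYM_TRIP + NO_WEIGHT_GAIN  # 10 - 5 + 5 = 10

print(skip_gym, hit_gym)  # 5 10 -> hitting the gym comes out ahead
```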

The set of utilities I assign to all outcomes also tells me the optimal possible outcomes, with the highest net utility. In this example, that would be either modifying ice cream so that it doesn’t make me gain weight, or modifying my body’s energy processing system to get it to process ice cream without storing any of it as fat.

Having consistent numbers is sometimes helpful, but it isn’t strictly necessary. When I need to make quick, relatively straightforward decisions, I typically just make up some utility numbers. Utility calculations in a small, isolated system are basically relative: it doesn’t matter exactly how much utility I assign to something, only that if outcome X comes out 5 points ahead of outcome Y, X is preferable.

Forcing yourself to make up numbers and compare them to each other reveals what you care about. If you initially thought you didn’t care much about something, but then realize that if you calculated net utility with a low number assigned to that thing, you’d be unsatisfied with the result, then you care more than you thought you did.

It might be somewhat unclear, with my super-simple examples so far, what you can assign utility to. So, here are some examples of things that I assign positive utility to:

  • Reading good books
  • Doing new things
  • Increasing utility according to the utility functions of people I care about
  • Building neat software
  • Drawing and painting
  • Writing stories and blog posts
  • Improving/maintaining my mental and physical health
  • Having interesting conversations
  • Improving the quality of life of all sentient beings
  • Running
  • Riding my bike
  • Taking walks with my girlfriend
  • Eating ice cream

If you enjoy doing it, if you think you should do it, if it makes you happy, if it’s a goal you have for some reason, or anything else like that, you assign it some amount of positive utility.

If you’d like to figure out how much relative utility you assign to different options, compare them: if I had to either give up on improving the quality of life for all sentient beings, or give up on ice cream, the ice cream has gotta go.

You can even assign positive utility to things you don’t end up doing. That’s because the net utility, after accounting for circumstances or mutually exclusive alternate actions, comes out negative. Knowing that you would do something, barring XYZ condition, is useful for dissecting your own thoughts, feelings, goals, and motivations. The converse is true, too: you can assign negative utility to things you end up doing anyway, because the net utility is positive.

So if that’s utility, what’s a utility function?

A utility function is a comprehensive set of everything that an agent assigns any utility to. “An agent” is anything capable of making decisions: a human, an AI, a sentient nonhuman alien, etc. Your utility function is the set of everything you care about.

The inputs to a utility function are the quantities of certain outcomes; each quantity is multiplied by the utility assigned to that outcome, and the products are added together to get the total expected utility of a given course of action. In an equation, this is:

Ax + By + Cz + ...

Where A, B, C, and so on are the quantities of individual facets of an outcome, and x, y, z, and so on are the utilities assigned to them.
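
Here's a minimal sketch in Python of what evaluating that sum might look like. The function name and the example numbers are my own illustration, not anything canonical:

```python
def net_utility(outcomes, utilities):
    """Multiply the quantity of each outcome by the utility assigned to it, then sum."""
    return sum(quantity * utilities[name] for name, quantity in outcomes.items())

# Hypothetical utility assignments (the x, y, z above).
utilities = {"ice_cream_servings": +5, "kg_gained": -10, "gym_trips": -5}

# One candidate course of action, as quantities of each outcome (the A, B, C above).
plan = {"ice_cream_servings": 3, "kg_gained": 0, "gym_trips": 1}

print(net_utility(plan, utilities))  # 3*5 + 0*(-10) + 1*(-5) = 10
```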

Say I’m with my sister and we’re going to get food. I personally assign a strong net positive to getting burgers and a weak net negative for anything else. I also assign a positive utility to making my sister happy, regardless of where we go for food. If she has a strong net negative for getting burgers, and a weak net positive for sushi, I can evaluate that situation in my utility function and decide that my desire to make her happy overpowers the weak negative I have for anything besides burgers, so we go get sushi.
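
One way to model that trade-off is to fold a weighted copy of her preferences into my own function. A rough sketch with made-up numbers:

```python
# Made-up utilities for the dinner example.
my_food = {"burgers": +4, "sushi": -1}   # strong positive for burgers, weak negative otherwise
her_food = {"burgers": -6, "sushi": +2}  # her strong negative for burgers, weak positive for sushi

SISTER_WEIGHT = 1.0  # how heavily her happiness counts in my own utility function

def my_net_utility(place):
    return my_food[place] + SISTER_WEIGHT * her_food[place]

for place in my_food:
    print(place, my_net_utility(place))
# burgers -2.0, sushi 1.0 -> we go get sushi
```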

When evaluating more complex situations (such as moving to a job across the country, where positives include career advancement and increased income, and negatives include having to leave your home and make new friends), modeling your own utility function is an excellent way to parse out all the feelings that come from a choice like that. It’s better than a simple list of pros and cons because you have (numeric, if you like) weights for all the relevant actions.

How to use your utility function

I don’t keep my entire utility function in my head at one time. I’ve never even written it down. But I make sure I understand large swaths of it, compartmentalized to situations I often find myself in. That said, if you want to actually write down your utility values, make them all consistent, and calculate utility every time you make a decision, there’s nothing stopping you.

In terms of the optimal way to think about utility calculations, I have one piece of advice. If you come out of a utility calculation thinking “gotcha, I can do this”, “alright, this seems reasonable”, or even “ugh, okay, I don’t like it but this is the best option”, then that’s good. That’s the utility function doing its job. But, if you come out of one thinking “hmmm… I guess, but what about XYZ contingency? I really don’t want to do ABC…”, or otherwise lingering on the point of decision, then you’ve forgotten something.

Go back and ask “what’s wrong with the ‘optimal’ outcome?”. It might be something you don’t want to admit to yourself, but you don’t gain anything by having an inaccurate perception of your own utility function. Remember that, in the absence of a verbal reason, “I don’t wanna” is still a perfectly valid justification for assigning negative utility to an action or outcome. In order for this process to work, you need to parse out your desires/feelings/goals from your actions, without beating yourself up for it. Your utility function already is what it is, and owning up to it doesn’t make it worse.

Once you have a pretty good handle on your own utility function, you can go ahead and mentally model other people’s. Humans are calculating utility all the time in the form of preferences and vague intuitions, so even if other people don’t know their own utility functions, you can learn them by a combination of watching their actions and listening to their words.

The discrepancy between those two, by the way, indicates one of two things: either the person is choosing an action with suboptimal utility, or they don’t actually assign the utility they claim aloud (perhaps for social reasons). You can point out this discrepancy politely, and perhaps help them make better decisions in the future.

Once you begin to use utility functions for both yourself and others, you might be surprised at how much easier it is to make decisions. When considering possible courses of action for yourself, you’ll be able to choose the best option and know it was the best. And, in a group, having an accurate model of other people’s utility functions can let you account for their preferences, perhaps even better than they themselves do.

Language: A Cluster Analysis of Reality

Cluster analysis is the process of quantitatively grouping data in such a way that observations in the same group are more similar to each other than to those in other groups. Picture a scatter plot whose points fall into three loose clumps: each clump is a cluster.

Whenever you do a cluster analysis, you do it on a specific set of variables: for example, I could cluster a set of customers against the two variables of satisfaction and brand loyalty. In that analysis, I might identify four clusters: (loyalty:high, satisfaction:low), (loyalty:low, satisfaction:low), (loyalty:high, satisfaction:high), and (loyalty:low, satisfaction:high). I might then label these four clusters to identify their characteristics for easy reference: “supporters”, “alienated”, “fans” and “roamers”, respectively.
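
Here's roughly what that customer example looks like in code, using scikit-learn's KMeans on fabricated data (the scales, group centers, and labels are invented for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Fake customers: columns are (loyalty, satisfaction), each on a 0-10 scale.
customers = np.vstack([
    rng.normal([8, 2], 1.0, size=(50, 2)),  # loyal but unsatisfied    -> "supporters"
    rng.normal([2, 2], 1.0, size=(50, 2)),  # disloyal and unsatisfied -> "alienated"
    rng.normal([8, 8], 1.0, size=(50, 2)),  # loyal and satisfied      -> "fans"
    rng.normal([2, 8], 1.0, size=(50, 2)),  # satisfied but disloyal   -> "roamers"
])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(customers)
print(kmeans.cluster_centers_)  # four centroids, one near each group above
```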

What does that have to do with language?

Let’s take a word, “human”. If I define “human” as “featherless biped”, I’m effectively doing three things. One, I’m clustering an n-dimensional “reality-space”, which contains all the things in the universe graphed according to their properties, against the two variables ‘feathered’ and ‘bipedal’. Two, I’m pointing to the cluster of things which are (feathered:false, bipedal:true). Three, I’m labeling that cluster “human”.

This, the Aristotelian definition of “human”, isn’t very specific. It’s only clustering reality-space on two variables, so it ends up including some things that shouldn’t actually belong in the cluster, like apes and plucked chickens. Still, it’s good enough for most practical purposes, and assuming there aren’t any apes or plucked chickens around, it’ll help you to identify humans as separate from other things, like houses, vases, sandwiches, cats, colors, and mathematical theorems.

If we wanted to be more specific with our “human” definition, we could add a few more dimensions to our cluster analysis—add a few more attributes to our definition—and remove those outliers. For example, we might define “human” as “featherless bipedal mammals with red blood and 23 pairs of chromosomes, who reproduce sexually and use syntactical combinatorial language”. Now, we’re clustering reality-space against seven dimensions, instead of just two, and we get a more accurate analysis.
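
You can picture each definition as a filter over a tiny, invented slice of reality-space; the entities and attribute values below are purely illustrative:

```python
# A toy slice of reality-space: each thing is a bundle of attributes.
things = [
    {"name": "human",           "feathered": False, "bipedal": True, "chromosome_pairs": 23},
    {"name": "plucked chicken", "feathered": False, "bipedal": True, "chromosome_pairs": 39},
    {"name": "chimpanzee",      "feathered": False, "bipedal": True, "chromosome_pairs": 24},
    {"name": "sparrow",         "feathered": True,  "bipedal": True, "chromosome_pairs": 38},
]

# The Aristotelian two-variable cluster: featherless bipeds.
aristotelian = [t["name"] for t in things if not t["feathered"] and t["bipedal"]]
print(aristotelian)  # ['human', 'plucked chicken', 'chimpanzee'] -- too leaky

# Add one more dimension and the outliers fall away.
narrower = [
    t["name"] for t in things
    if not t["feathered"] and t["bipedal"] and t["chromosome_pairs"] == 23
]
print(narrower)  # ['human']
```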

Despite this, we really can’t create a complete list of everything that the members of most real categories have in common. Our generalizations are leaky around the edges; our analyses aren’t perfect. (This is the case with every other cluster analysis, too.) There are always observations at the edges that could plausibly belong to any of several clusters. Picture the scatter plot again: should the points on the boundary of one clump really be assigned to it, or to one of its neighbors? Are there really three clusters, or would it be more useful to say there are two, or four, or seven?

We make these decisions when we define words, too. Deciding which cluster to place an observation in happens all the time with colors: is it red or orange, blue or green? Splitting one cluster into many happens when we need to split a word in order to convey more specific meaning: for example, “person” trisects into “human”, “alien”, and “AI”. Maybe you could split the “person” cluster even further than that. On the other end, you combine two categories into one when sub-cluster distinctions don’t matter for a certain purpose: the base-level category “table” substitutes for more specific terms like “dining table” and “kotatsu” when the specifics don’t matter.

You can do a cluster analysis objectively wrong. There is math, and if the math says you’re wrong, you’re wrong. If your within-cluster sum of squares (WCSS) is so high that you have a cluster you can’t label more distinctly than “everything else”, or so low that you’ve segregated your clusters beyond the point of usefulness, then you’ve done it wrong.
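
For what it's worth, scikit-learn exposes WCSS as the inertia_ of a fitted KMeans model, so you can watch it trade off against the number of clusters. A quick sketch on fabricated data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Fabricated 2-D data with three genuine clumps.
data = np.vstack([rng.normal(center, 0.5, size=(60, 2))
                  for center in ([0, 0], [5, 0], [0, 5])])

# WCSS (what scikit-learn calls inertia_) for a range of cluster counts.
for k in range(1, 7):
    wcss = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data).inertia_
    print(k, round(wcss, 1))
# WCSS plunges up to k=3 and only creeps down afterwards: the "elbow"
# suggests three clusters is the useful level of description.
```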

Many people think “you can define a word any way you like”, but this doesn’t make sense. Words are cluster analyses of reality-space, and if cluster analyses can be wrong, words can also be wrong.


This post is a summary of / is based on Eliezer Yudkowsky’s essay sequence, “A Human’s Guide to Words”.