# NashCoding: Yet Another Artificial Intelligence Blog

23 Aug 2011 (11 comments)

## How Karma Should Be Measured

Measuring karma was a heavily debated topic for a while on HackerNews. The goal is to provide a measurement that both accurately reflects overall contribution to the community and encourages consistent engagement. Several solutions were discussed and a few were even tried. For example, pg tried to replace the overall karma score with an average score. All three combinations (total only, average only, and both) were juggled around in the top right corner, until eventually the simple total was used.

While all these attempts were good ideas, I think there is an even better metric that should be used here: the Sharpe ratio.

When you look to invest in an asset, it's often important to consider the risk of an investment as well as the overall historical returns. Assuming you have no external knowledge of the asset, looking at the volatility of its returns is a pretty good estimate of risk. Ideally, you'd like to see super high returns with no volatility: just a steady stream of money rolling in day after day.[1] The Sharpe ratio is a way to combine the returns of an asset and its historical risk into a single number.

Simply defined, the Sharpe ratio is the return of the asset minus the risk free rate, all divided by the standard deviation of returns. For instance, if we have an asset which has returned an average of 10% per year and an annualized volatility of 20%, and we assume that we could park our money in a savings account for a risk-free 1% per year, then the Sharpe ratio for this asset is (10 - 1) / 20 = 0.45.
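
To make the arithmetic concrete, here is a minimal Python sketch of that definition. The function name `sharpe_ratio` and its parameter names are my own for illustration; all three inputs are assumed to be in the same units (percent per year here).

```python
def sharpe_ratio(mean_return: float, risk_free_rate: float, volatility: float) -> float:
    """Excess return over the risk-free rate, per unit of volatility."""
    if volatility <= 0:
        raise ValueError("volatility must be positive")
    return (mean_return - risk_free_rate) / volatility


# The example from the text: 10% average return, 1% risk-free, 20% volatility.
ratio = sharpe_ratio(10.0, 1.0, 20.0)
print(ratio)  # 0.45
```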

# Naive Sharpe karma

How does this translate to a social news site like HackerNews? First, let's consider each user to be an investable asset. Each time a comment or story submission is made, that user is generating a return.[2] Since each comment starts with a default score of 1, let's go ahead and make that the risk-free rate of return. So the karma metric I'm proposing would be:

$f(x) = \frac{\sum\limits_{i=1}^{n^{x}}{(k^{x}_{i} - 1)}}{n^{x}\sigma^{x}}$

where:
$x$ is the user
$n^{x}$ is the number of comments for user $x$
$k^{x}_{i}$ is the karma score for the $i^{th}$ comment of user $x$
$\sigma^{x}$ is the standard deviation of the karma scores for user $x$

In programmer-speak, the pseudo-code for this is[3]:

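    function karmaSharpe(user) {
        // A user with no comments has no returns to measure.
        if (user.comments.length == 0)
            return 0;

        // Average excess return over the default score of 1.
        var numerator = 0;
        foreach (var comment in user.comments)
            numerator += (comment.karma - 1) / user.comments.length;

        // Volatility of the user's comment scores, floored at 1.
        var denominator = stdev(user.comments.select(c => c.karma));
        if (denominator < 1)
            denominator = 1;

        return numerator / denominator;
    }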
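
For anyone who wants something runnable, here is a rough Python sketch of the same idea. It takes a plain list of per-comment scores rather than a user object, and it assumes the population standard deviation (`statistics.pstdev`); the post does not say which flavor of standard deviation is intended, so treat that choice as mine.

```python
from statistics import pstdev

DEFAULT_SCORE = 1  # every comment starts at 1 point: the "risk-free rate"


def karma_sharpe(scores):
    """Naive Sharpe karma for a list of per-comment karma scores."""
    if not scores:
        return 0.0
    # Average excess return over the default score.
    mean_excess = sum(s - DEFAULT_SCORE for s in scores) / len(scores)
    # Assumption: population standard deviation, floored at 1 as in the pseudo-code.
    volatility = max(pstdev(scores), 1.0)
    return mean_excess / volatility


steady = karma_sharpe([3, 3, 3])  # mean excess 2, volatility floored to 1
swingy = karma_sharpe([1, 9])     # mean excess 4, volatility 4
print(steady, swingy)
```

The floor of 1 on the denominator keeps a user whose scores never vary from dividing by zero.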


# A better metric

The Sharpe ratio is certainly an improvement over the previous two metrics. Totals give a very low-fidelity view of the user's contribution, and averages make no distinction between consistent, high-quality members and more skewed users who have a couple of really highly upvoted submissions. The Sharpe ratio addresses both of these shortcomings by including an additional dimension in the calculation: volatility.
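
A quick, hypothetical illustration of that point in Python: two made-up users with identical totals and averages, scored with a compact Sharpe-style helper. The helper `sharpe_karma`, the sample scores, and the use of the population standard deviation floored at 1 are all assumptions for the sake of the example.

```python
from statistics import mean, pstdev


def sharpe_karma(scores):
    # Hypothetical helper: mean excess over the default score of 1, divided by
    # the population standard deviation (an assumption), floored at 1.
    return (mean(scores) - 1) / max(pstdev(scores), 1.0)


consistent = [5, 5, 5, 5]  # made-up steady contributor
skewed = [1, 1, 1, 17]     # made-up user with a single big hit

for name, scores in (("consistent", consistent), ("skewed", skewed)):
    print(name, sum(scores), mean(scores), round(sharpe_karma(scores), 2))
```

Both users have a total of 20 and an average of 5, so neither of the older metrics can tell them apart. The Sharpe-style score comes out at 4.0 for the steady user and roughly 0.58 for the skewed one.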

The metric is still not perfect, however. A good metric would include time in the formula, since a high-quality user would be consistently contributing every day. An improved metric is below:

$f(x)=\frac{\sum\limits_{i=1} ^{d^{x}} { (\sum\limits_{j=1} ^{g^{x}_{i}} {\frac{k^{x}_{ij}-1}{g^{x}_{i}}} -1)}} {d^{x}\sigma^{x}}$

where:
$x$ is the user
$d^{x}$ is the number of days user $x$ has been registered
$g^{x}_{i}$ is the number of comments for user $x$ made on day $i$
$k^{x}_{ij}$ is the karma score for the $j^{th}$ comment of the $i^{th}$ day for user $x$
$\sigma^{x}$ is the standard deviation of the karma scores for user $x$

And in pseudo-code:

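    function timeAdjustedKarmaSharpe(user) {
        // A user with no comments has no returns to measure.
        if (user.comments.length == 0)
            return 0;

        var excessReturns = dailyAverageExcessReturns(user);
        // Shorthand: the same Sharpe calculation, applied to the daily
        // values instead of the per-comment scores.
        return karmaSharpe(excessReturns);
    }

    function dailyAverageExcessReturns(user) {
        // One average excess return (score minus 1) per day with comments.
        var groups = user.comments.groupBy(c => c.Date);
        var excessReturns = groups.select(g => g.sum(c => c.karma - 1) / g.count);
        return excessReturns;
    }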
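
Here is a hedged Python sketch of the time-adjusted version. It follows the pseudo-code rather than the formula: only days with at least one comment are counted, comments are assumed to arrive as `(date, karma)` pairs, and the population standard deviation floored at 1 is again my assumption. The function names simply mirror the pseudo-code.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def daily_average_excess_returns(comments):
    """comments: iterable of (date, karma) pairs.

    Returns one average excess return (karma - 1) per day with comments.
    """
    by_day = defaultdict(list)
    for day, karma in comments:
        by_day[day].append(karma - 1)
    return [mean(excess) for _, excess in sorted(by_day.items())]


def time_adjusted_karma_sharpe(comments):
    daily = daily_average_excess_returns(comments)
    if not daily:
        return 0.0
    # Sharpe ratio of the daily values; volatility floored at 1 (assumption).
    return mean(daily) / max(pstdev(daily), 1.0)


sample = [
    (date(2011, 8, 1), 3),
    (date(2011, 8, 1), 5),
    (date(2011, 8, 2), 2),
]
print(daily_average_excess_returns(sample))  # one value per active day
print(time_adjusted_karma_sharpe(sample))
```

In this sample the two comments on 1 August average to an excess of 3 and the single comment on 2 August gives 1, so the mean daily excess is 2 with a volatility of 1.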


# Conclusion

It may not look very simple, but it is actually straightforward. We are effectively saying that each user should generate an average comment score of 1 point per day to break even. Anything you make beyond 1 point is considered an excess return for the day. We then simply take the Sharpe ratio of the average daily excess returns. The resulting metric ensures that users are incentivized to make consistent, high-quality submissions and punishes one-hit wonders and those who take a spray-and-pray approach.

##### Footnotes
1. Well, assuming that there is no way you're getting into a Ponzi scheme. This is pretty much the guarantee that people expected from Madoff's fund.
2. Note that I'll use the term "comment", but story submissions work the same and are thus analogous.
3. The pseudo-code is a lazy bastardization of C#.

##### Comments
1. First Post!

Sorry, I couldn’t help myself.

Thanks for the article.

2. Very nice work. These are interesting formulas. Are they and the pseudo-code free to use, i.e. under the MIT license, or something similar?

3. Hi Bill,

Yes, all of the code is available under the MIT license. Thanks for asking.

4. I really like the idea of using a rate-of-return model, but I’m not satisfied with the final formula. When I’m super-busy at work (or on an extended vacation), I may go weeks without “generating a return”, but once the workload eases, I get more active. You need something so that I’m not seriously hurt by the days when I’m just lurking. Maybe $d^{x}$ should be the number of days on which a user has posted, although that would actually encourage me to not post at all when I’m busy. I guess I’d really like some sort of decay, so that more recent returns count more than the distant past.

5. Hi,
Thank you. I implemented a PHP version of this that loops over a data table of our app users’ karma, since the data there was well suited for this. In your 2nd pseudo-code snippet, the following line:
return karmaSharpe(excessReturns);
really uses a different karmaSharpe function than the function in the first snippet, but it is of similar form of course (looping over the number of days and summing excess returns for each day).

@samwyse: Good idea. After reading your post, I added a simple decay to the summing of excess returns by multiplying by a simple weighting for the i-th day, basically of the form: (i-th day - day registered) / (total days registered).

6. Really like your article. I am having a big headache designing a karma system for an iPhone app that I am working on. (check out my contest @ prizes.org/wowwao)

Wondering if it’s possible for me to integrate under MIT license?

7. Thanks! Yes, the karmaSharpe call at the end of the second snippet is not technically proper, but it kept it brief. Thanks for catching that. Do you have a link to your implementation?

@Wyn, yes all code is released under the MIT license.

8. Hi, it’s not live yet; it’s in beta and part of a branch in development right now. It’s an app using mysql, php & jquerymobile, where people ask questions and can thumb up or thumb down the answers/replies, in real time (approximately). So a karma point can be negative or positive. Also people can choose not to reply, and the asker can choose not to rate the reply, so I took the default score to be 0 (not sure if this is correct).
It still needs more user testing, but so far it looks like it is working. The nice thing is that with the data, we can show the best responders for a date range (like for the month of August etc.).
Will post back if we decide to use this when we go live.

9. So what is comment.karma? I don’t see it defined anywhere, is that just the total (up – down)? Is it % of up votes? The example with the market asset seems to imply the comment.karma = %up, but I want to make sure.

10. @Alex In this context, comment.karma is the total # of upvotes – total # of downvotes. If it were finance, it might be %.

11. Wes,

Using just your second equation… Doesn’t the Sharpe formula also penalize the user for attaining more votes than the mean of all of his votes? Example dataset: item_a: 2 votes, item_b: 3 votes, item_c: 2 votes. This would provide a karma of 2.31, assuming you take off one vote as being the item owner’s.

If item B received one more vote, the karma score would drop to 1.44

It would seem to me that the formula returns a more favorable karma the closer items are to having the same number of votes. This does not seem right with regard to a user’s karma, because if they have one item that receives many more votes than the rest of their items, they are substantially penalized for it. The idea seems to be that since not all of their submissions are ranked as high, they must somehow be seen as having poor karma.

Is that correct or am I just not getting it?