NashCoding Yet Another Artificial Intelligence Blog


Walkthrough of the UCB1 Optimality Proof

The Finite-time Analysis of the Multiarmed Bandit Problem paper is a seminal work in bandit research. It was the first paper to present and prove an algorithm for selecting a bandit with finite-time optimality guarantees. If you're not familiar with bandits, Wikipedia has a decent entry.

In this post, I'm going to assume you have made an attempt at reading the UCB paper and tried to understand the proof for UCB-1. My goal is to provide an in-depth explanation for how they arrive at their final finite-time optimality guarantee for the algorithm.


Hacker News Needs Honeypots

There has been a lot of recent debate regarding how to improve quality control on HackerNews (HN), and to his credit, Paul Graham (pg) has tried a lot of tactics. There is a very clear set of HN guidelines, which very few members these days probably read. For a while, pg tried playing around with the karma formula and, even if I disagree about the way karma should be measured, at least he gave it an effort. He also hid comment karma from everyone but the author, to help slow the demonstrable deterioration of the discussion section; apparently this has been successful in pg's observations. Nevertheless, I do believe that we are seeing a continuing trend downward in overall article quality on the front page1.

In this post, I present a honeypot approach to combating group-think and quality deterioration in article selection on social news sites.

  1. For instance, see this front-page article from 10/23/11.

How Karma Should Be Measured

Measuring karma was a heavily-debated topic for a while on HackerNews. The goal is to provide some measurement that both accurately measures overall contribution to the community and encourages consistent engagement. Several solutions were discussed and a few were even tried. For example, pg tried to replace the overall karma score with an average score. All three combinations (total only, average only, and both) were juggled around in the top right corner, until eventually the simple total was used.

While all these attempts were good ideas, I think there is an even better metric that should be used here: the Sharpe ratio.


Building a Startup – 12 Priceless Tools for Launching Your MVP

Building a startup is hard. Really hard. The last thing you want to do is make it harder. Through the trials and tribulations I've undergone in launching my startup, I've discovered several gems that are indispensable. If you're trying to launch a startup, here are the tools you need.

Filed under: Startups Continue reading

Building a Startup – Part 2 – Defending .NET

There has been a lot of trash talk recently about .NET for startups. People make ridiculous claims about how .NET killed MySpace and you should never hire .NET programmers. Well I'm building my startup in .NET and for a lot of good reasons.


Building a Startup – Part 1 – Introduction

I've joined with David Fogel and Yanon Volcani to build EffectCheck. This post is the first in a series detailing how we're building this startup from the ground up.


Using the Forms Authentication Membership Provider on AppHarbor

I love AppHarbor. I've been waiting for a year for someone to do "Heroku for .NET", and it's finally here! One of the basic use cases for an ASP.NET MVC web app is membership support. This post will be a very quick walk through on using the default (forms authentication) membership provider for an AppHarbor web app.


Tutorial – Evolving Neural Networks with SharpNEAT 2 (Part 3)

In parts 1 and 2 of this tutorial series, we evolved neural networks using the standard NEAT algorithm. In part 3, we're going to use the HyperNEAT algorithm.


A C# Blackjack Simulation Framework

Since I haven't had time to sit down and write new code lately, I decided to dig into the closet and start pulling out code that I've worked on but haven't released yet. Today's gem is a framework for simulating Blackjack (also known as 21) strategies.

Filed under: Blackjack, C# Continue reading

Evolving Nash Equilibria – A Quick Correction

As I noted in my first post, I was a little skeptical about whether a Nash Equilibrium (NE) could be evolved by taking the squared loss of each hand. My conclusions were that, given an expected value evaluation function, it was possible using a best-opponent fitness but not using a squared-loss fitness. There turns out to be a small bug in the squared loss code which caused a big change in results. Below are the results for the correct fitness function implementation.