Understanding Bayes’ Theorem

Disclaimer: I am writing this because I finally understood how Bayes’ Theorem works. Therefore, this is much more a reference note to myself than anything else, but I am publishing the information here because I think it might be useful for other people. I am no math expert (as can be deduced from the fact that I only understood the theorem now, after having seen it many times), so please let me know if you find any mistakes in my explanation.

Bayes’ Theorem states that

Bayes' Theorem

where P(B|A) is the probability of event B occurring, given event A occurred; P(A|B) is the probability of event A occurring, given event B occurred; P(B) is the probability of only event B occurring; and P(A) is the probability of only event A occurring.

I have seen, and even used, this formula a number of times in my life, but I had never really understood what it was saying. But before I explain it, I will introduce a few basic concepts.

Relative frequency

Given an experiment with two, non-mutually exclusive possible outcomes, A and B, and n repetitions of that experiment, and let n1 be the number of occurrences of event A alone, n2 the number of occurrences of event B alone and n3 the number of occurrences of events A and B simultaneosly, the relative frequency of the occurrence of event A, or probability P(A) of event A, is given by

Probability of an event A

where nA is the number of times event A occurred. Likewise, the relative frequency of the occurrence of event B, or probability P(B) of event B, is given by

Probability of an event B

where nB is the number of times event B occurred. Finally, the relative frequency of the occurrence of both events, or probability P(AB) of events A and B, is given by

Probability of two events A and B

Conditional probability

The relative frequency of event A occurring, given event B occurred, is given by

Number of occurrences of  event A, given B

Notice that in the denominator, we account for the occurrences of event B alone and of event B alongside with event A. The lower the value of nB (recall nB is the relative frequency of event B occurring alone), the higher the ratio of the equation above will be (yelding 1 when nB is zero, i.e., event B occurs only when event A occurs). If B occurs more often alongside with A than alone, then the probability of A occurring when B occurs will be higher.

Likewise,

Number of occurrences of  event B, given A

nA|B and nB|A may also be denoted P(A|B) and P(B|A), respectively. P(A|B) is read as “the probability of event A, given B”.

Given the above equations, we observe that

Number of occurrences of events A and B

or

Probability of events A and B

Now we’re ready to understand Bayes’ Theorem

Bayes’ Theorem

As mentioned in the beggining of this post, Bayes’ Theorem states that:

Bayes' Theorem

Now, here’s how you should interpret it. A better way to visualize it is to write it as

Another way to visualize Bayes' Theorem

given the last equation from the previous section. The explanation is similar to the one given for the equation for nA|B in the previous section. If P(A) is close to P(AB) (recall P(A) is the probability of event A alone or alongside with event B), that means most occurrences of event A happen when event B also occurs. From that, we can intuitively conclude that event B will very likely occur given event A occured, since the occurrence of event A is strongly related to the occurrence of both A and B.

The explanation above can be expressed in terms of very informal (but I believe reasonable) logical statements:

1. A and B may occur
2. A tends to occur only when B also occurs
3. A occurred
4. It’s likely B will also occur

Note: this explanation is strongly based on the online tutorials for Digital Image Processing, by Gonzales & Woods. I used the same notation and terms. However, my objective here was to add the clarifying (at least for me!) explanations.

Ternary operator in Python

Here’s how you translate the following C code

int max(int a, int b)
{
    return (a > b ? a : b);
}

to Python:

def max(a, b):
    return a if a > b else b

The general syntax is:

TRUEVAL if CONDEXPR else FALSEVAL

Note: I’m not 100% sure, but I think that only works starting from Python 2.5.

Fibonacci

Um
Coelho
Mais dois
E ainda outro
Seguem assim até o infinito

Using feof() and fread()

When reading single bytes from a file in C, one must pay attention to the correct usage of feof() and fread(). At first, the following piece of code seems to work correctly:

const char *filename = "hello";
unsigned char byte;
FILE *fp;
 
fp = fopen(filename, "rb");
 
if (!fp) {
    printf("could not open file\n");
    return 1;
}
 
while(!feof(fp)) {
    fread(&byte, 1, 1, fp);
    printf("%02x\n",byte);
}
 
fclose(fp);

Suppose the file “hello” has the following contents:

0000000: 68 65 6c 6c 6f 0a                                hello.

(which is the string “hello” followed by an LF)

When the code above is run, the following output is produced:

68
65
6c
6c
6f
0a
0a

Notice the last character seems to be read twice. The problem is that feof() only returns true after attempting to read past the end of the file. In order to fix this “read-twice” behavior, the return value of fread() must be checked:

if(!fread(&byte, 1, 1, fp)) {
    break;
}

Note: Using feof() as the while condition is kind of redundant here. In this situation, one could simply use while(1) and the behavior would be the same.

Update: A much better solution was given by my friend Bryan:

const char *filename = "hello";
unsigned char byte;
FILE *fp;
 
fp = fopen(filename, "rb");
 
if (!fp) {
    printf("could not open file\n");
    return 1;
}
 
fread(&byte, 1, 1, fp);
while(!feof(fp)) {
    printf("%02x\n",byte);
    fread(&byte, 1, 1, fp);
}
 
fclose(fp);

Don’t forget that virtual

I am not really a C++ programmer. I usually code in C, and I think all C++ I’ve ever written involved a couple of vectors and maybe one or two classes. So what I’m writing here is certainly old news for C++ programmers.

The other day, while I was reading this essay about the Liskov Substitution Principle, I was intrigued by a situation presented in page 4, which deals with class inheritance and method overriding.

Methods that can be overriden must be declared virtual in C++. I really didn’t know about that. I am used to Java’s behaviour, which I illustrate below:

/*
 * A.java
 */
 
public class A {
 
	public void bar() {
		System.out.println("bar");
	}
}
 
/*
 * B.java
 */
 
public class B extends A {
 
	public void bar() {
		System.out.println("Bar!");
	}
}
 
/*
 * C.java
 */
 
public class C {
 
	public void call(A o) {
		o.bar();
	}
}
 
/*
 * Main.java
 */
 
public class Main {
 
	public static void main(String [] args) {
		A a = new A();
		B b = new B();
		C c = new C();
 
		c.call(a);
		c.call(b);
	}
}

The code above will produce the following output:

bar
Bar!

If I wanted to do the same thing in C++, I would have to write it this way:

#include <iostream>
 
class A {
public:
	A() {}
	virtual ~A() {}
	virtual void bar() { std::cout << "bar" << std::endl; }
};
 
class B : public A {
public:
	B() {}
	virtual ~B() {}
	void bar() { std::cout << "Bar!" << std::endl; }
};
 
class C {
public:
	C() {};
	virtual ~C() {};
	void call(A &o) { o.bar(); }
};
 
int main(int argc, char *argv[]) {
	A a;
	B b;
	C c;
 
	c.call(a);
	c.call(b);
 
	return 0;
}

The output will be the same as the Java program, but pay attention to the virtual modifier placed before the declaration of the bar method in A. If I remove it, the output will be:

bar
bar

Even though B declares a method called bar, it is completely shadowed by the implementation inherited from A. In my opinion, Java’s behaviour is a lot more intuitive. I wonder how many people might have spend hours, maybe days, looking for a bug in a C++ program when the problem was that a method was not declared virtual and method calls weren’t occuring as expected.

I talked about this with Otávio, and he found out C# has a similar characteristic. In C#, a method must be declared as virtual for its subclasses to override it AND the subclasses must use override when overriding the method. Here’s the code he provided me:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
 
namespace ConsoleApplication1
{
   class Alice
   {
       public virtual void sayHello()
       {
           Console.WriteLine("Hello, world");
       }
   }
 
   class Bob : Alice
   {
       public override void sayHello()
       {
           Console.WriteLine("Hello, world!");
       }
   }
 
   class Echo
   {
       public void say(Alice toto)
       {
           toto.sayHello();
       }
   }
 
   class Program
   {
       static void Main(string[] args)
       {
           Alice a = new Alice();
           Bob b = new Bob();
           Echo e = new Echo();
 
           e.say(a);
           e.say(b);
       }
   }
}

Some people argue that this makes code clearer, but I’m not convinced. Why not make overriding implicit?

SWI-Prolog Easter Egg

swi-prolog-easteregg

Found it by accident :)

New toys

Yesterday I bought myself three new toys:

f-22-raptorAn F-22 Raptor model

ka-52-alligatorA KA-52 Alligator model

rc-f1-ferrariAn RC F1 Ferrari

This week’s links

I’m gonna start posting links to news, articles or blog posts which I find interesting on a weekly basis. I expect to do this every Friday or Saturday from now on. Here’s the first batch:

  • 13 things that do not make sense: 13 things that still puzzle the minds of scientists around the world.
  • It’s not magic!: Microsoft’s Eric Lippert on “magic code” (code that people don’t realize someone actually had to implement)
  • Orthogonality: a long but interesting post about programming language orthogonality

Not much for this week I guess, but I hope to collect more links in the next weeks :)

QuickSynergy 0.9.0 released

Yesterday I released version 0.9.0 of QuickSynergy for Linux. Here are some screenshots:

QuickSynergy 0.9.0 Screenshot

Share mode

QuickSynergy 0.9.0 Use mode

Use mode

QuickSynergy 0.9.0 Settings

Settings

QuickSynergy is a graphical user interface for Synergy developed by me and Otávio Cordeiro, which aims to ease the task of setting up Synergy on Linux and Mac OS X systems, for which there are no official GUIs. Synergy is an application that allows you to share a single mouse and keyboard between two or more computers across a TCP/IP network. It is much like a “software KVM switch“, but without the V in KVM.

Release notes for version 0.9.0:

  • Allow setting client name in Use mode.
  • Configurable path to synergy binaries.
  • New configuration file format (key-value).
  • New Share mode image in main window.
  • Minimum required GTK version now 2.10.
  • Bug fixes.

Fractals in Logo

Today I was reading a blog post about Logo and I decided to play a little with UCBLogo (Berkeley Logo) and try to draw some fractal (more specifically, L-Systems) with it.

Here is a Koch Curve:

Koch Curve

The code:

to M n
  ifelse notequal? :n 0 [
    M :n - 1
    lt 90
    M :n - 1
    rt 90
    M :n - 1
    rt 90
    M :n - 1
    lt 90
    M :n - 1
  ] [
    fd 10
  ]
end

cs
pu
lt 90
fd 150
rt 180
pd
M 3

And here is a Fractal Plant:

Fractal Plant

The code:

make "stackpos []
make "stackhead []

to store
  push "stackpos pos
  push "stackhead heading
end

to restore
  pu
  setpos pop "stackpos
  setheading pop "stackhead
  pd
end

to F n l
  ifelse notequal? :n 0 [
    F :n - 1 :l
    F :n - 1 :l
  ] [
    fd :l
  ]
end

to X n l
  if notequal? :n 0 [
    F :n :l
    lt 25
    store
    store
    X :n - 1 :l
    restore
    rt 25
    X :n - 1 :l
    restore
    rt 25
    F :n :l
    store
    rt 25
    F :n :l
    X :n - 1 :l
    restore
    lt 25
    X :n - 1 :l
  ]
end

setbackground 2
setpencolor 0

cs
pu
bk 50
lt 90
fd 150
rt 90
pd
rt 45
X 5 2

The code could be a lot better, but that’s what I was able to grasp in about 3 hours. The last time I had really used logo was in primary school, when I was 8 or 9-years old.

I don’t know if it’s a problem with my Fractal Plant function code or if it’s a bug in UCBLogo, but sometimes execution froze completely and I had to hit Alt+S to stop it and call the function again until it worked. I also don’t know if that’s a limitation of UCBLogo or if I just haven’t studied it enough, but I couldn’t find out a way to unset a variable or redefine a function.

Anyways, Logo is absolutely fun to play with. As the author of the previously mentioned post said, it is kind of addicting to think of a set of steps and order the turtle to repeat them a few hundreds of times to see what happens. I consider the language itself a bit confusing at first, but I guess I should read more about it before forming any strong opinions on it.