Thursday, September 30, 2010

Blasphemy Day: Bad Faith

Today is Blasphemy Day. In honor of that, here's some blasphemy: There is no god. Jesus was just a man. Mohammed was not a prophet. A cracker cannot turn into flesh. Thunder and lightning are not caused by Thor or Zeus. Heaven and hell do not exist. Sex is a good thing. The presents under your tree weren't left there by Santa Claus.

Doubtlessly, you believe at least one of these things. Which is why I support Blasphemy Day. Because something you believe is blasphemy to someone. The price you pay for being able to express it is that others can express something that you may disagree with or find personally offensive. To oppose that is to oppose the very concept of free speech.

And so here's another piece of blasphemy: Faith is a bad thing.

In this context, by faith, I mean belief without reason. The word is sometimes used to mean something more like trust. A common arguing technique by some believers is to deliberately confuse the two separate meanings. ("You have faith that your brakes will work, which is the same as my faith in god.") Let me be clear, I am talking about belief without reason. If you have a reason for believing something, it's not faith.

Why is faith a bad thing? Because it's fundamentally irrational. You can't draw a map without first looking at the territory. Something you believe by faith might be, by pure chance, correct. But given all the different mutually exclusive things you could believe by faith, that chance isn't very good. If you're actually interested in having your beliefs be correct, faith is a very bad approach to take.

Now, evidence can be misleading. People believe incorrect things that are backed by evidence all the time. For example, scientists at the end of the nineteenth century believed in luminiferous aether and with good reason. Light's a wave, it has to travel through something, right? But by always evaluating new evidence, you can correct your mistakes and make your beliefs closer to the truth. With faith, there is no such recourse.

Faith is a bad thing, but what's even worse than faith is the belief that faith is a good thing. Human irrationality is a bad thing, but it's an inevitable thing. We are not rational beings by nature, and a certain amount of irrationality is to be expected. But it is not to be desired. We must strive to be as rational as possible, because that is the only way we can progress. It is the only way to find our mistakes and fix them.

Promoting faith as a paramount good is perhaps one of the most evil things religion has ever done (well, ok, crusades, pogroms, fatwas, etc. are pretty bad too).

Wednesday, September 29, 2010

An Open Letter to the Ottomans

In retrospect, I should have expected it. For the whole game, you had been aggressively militaristic. You conquered several city-states and the Aztecs and Persians. But I thought you liked me. We had always been friendly in our dealings. But no, you attacked me, invoking your need for "living space".

I don't like war. In fact, I hate it. I rarely declare war on others, and when others force me, I'm usually only trying to make them stop. That was what I was thinking when you took over Rostov. I would take it back, and give you whatever gold or resources you wanted to stop the invasion.

But you didn't stop there. No, you couldn't. You burned Rostov to the fucking ground, killing every last man, woman and child. Not only that, you conquered Brussels, my longest and closest ally. Russia protects her smaller allies, and for these atrocities, we shall not rest. There shall be no peace as long as we both exist.

You may have legions far larger than mine. But I am generations ahead of your technology. Your piddly little janissaries will scream as they are crushed under the treads of my tanks, and I am, as we speak, researching atomic theory. I am not afraid of nuclear winter, and you will be able to offer no counter-attack or mutually assured destruction.

Mark my words: I will not rest until each and every one of your cities is a ruined, radioactive crater, marring the face of this planet.

Monday, September 20, 2010

The Road to San Francinattle

A problem I've encountered in making games and random map generators is giving places names. If it's a randomly generated map, then it needs randomly generated names. Doing it by hand would defeat the purpose. Another possibility would be to compile a large list of city names and then have the program pick names randomly. But that would be less interesting, as it could never come up with a new name. It would be great if a program could know the rules for what makes a good city name. But since I'm not about to invent artificial intelligence, something a bit simpler is in order.

And this weekend, I came up with the perfect solution. A Markov chain random text generator. A Markov chain is a random process in which the next state is dependent only on the current state. In the case of a Markov chain random text generator, the current state is the last n characters (or words) generated. My program works with characters, but the concept applies exactly the same to words.

A Markov chain text generator first reads some sample text. From this it builds a kind of probability table of which letters follow which other letters more or less frequently. Then it generates the letters (considering only the last n letters generated), weighting them so the more common letters and letter combinations occur more frequently.
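To make the probability table concrete, here is a minimal Python sketch of one way to build it. This is an illustration, not the program described in this post: the names `build_table` and `probabilities` are made up for the example, and the table is assumed to be a dictionary keyed by the preceding letters.

```python
from collections import Counter, defaultdict

def build_table(names, order):
    """Count which character follows each run of up to `order` characters."""
    table = defaultdict(Counter)
    for name in names:
        for i, ch in enumerate(name):
            # The context is the (at most) `order` characters before position i.
            context = name[max(0, i - order):i]
            table[context][ch] += 1
    return table

def probabilities(counter):
    """Turn raw follower counts into relative frequencies."""
    total = sum(counter.values())
    return {ch: count / total for ch, count in counter.items()}

table = build_table(["San Francisco", "Cincinatti", "Seattle"], order=3)
print(probabilities(table["nci"]))  # the letters seen after "nci" and how often
```

Generating a letter then comes down to a weighted random pick from the counts stored for the current context.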

My particular generator is slightly different than most others because it knows how to begin and end. Most other Markov text generators are used to make large blocks of text based off of entire books. Mine is to be used for short names. The beginning and end matter more, and it's good to have a name that ends in a logical way such as "ton" or "ville" instead of just being cut off in the middle. So I treat the end of the name as a character the same as the rest. It gets put in the chart the same as the others and it gets generated the same as the others, except that when it's generated, the name is done and it stops generating more.
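The begin-and-end handling could be sketched in Python roughly like this. It is a sketch under assumptions, not the actual generator: the `END` marker, `build_table` and `generate` are hypothetical names, and the start of a name is modelled simply as an empty context.

```python
import random
from collections import Counter, defaultdict

END = "\n"  # assumed end-of-name marker; relies on no name containing a newline

def build_table(names, order):
    """Count followers for each context, treating END as just another character."""
    table = defaultdict(Counter)
    for name in names:
        padded = name + END
        for i, ch in enumerate(padded):
            table[padded[max(0, i - order):i]][ch] += 1
    return table

def generate(table, order, rng=random):
    """Generate one name, stopping as soon as the END marker is drawn."""
    name = ""
    while True:
        # Only the last `order` characters matter (fewer at the very start).
        context = name[max(0, len(name) - order):]
        followers = table[context]
        ch = rng.choices(list(followers), weights=list(followers.values()))[0]
        if ch == END:
            return name
        name += ch

table = build_table(["San Francisco", "Cincinatti", "Seattle"], order=3)
print(generate(table, order=3))
```

Because the empty context only ever holds first letters, and END only ever follows real endings, names start and finish the way the sample names do.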

Here's an example of how it works, using a name it actually generated: San Francinattle. For early testing, I used a list of the 100 largest cities in the US, including San Francisco, Cincinatti and Seattle. I'll only consider those three for simplicity. More names affect the probabilities involved, but not the principles. In this example, I'm using order = 3.

For the first letter, two names have "S" and one has "C", so it has a 66.7% chance of generating "S". With just "S" it has a 50-50 shot of producing "e" or "a" next. With our limited dataset, once "Sa" is generated, the only next option is "n". That continues until we've generated "San Franci". The order is 3, so it only considers "nci" when generating the next letter. That letter combination was in Cincinatti, in addition to San Francisco. It doesn't care about what came before this, so it has a 50-50 chance of generating an "s" or an "n", and in this case it went with "n". Now with this combination of letters, Cincinatti is the only game in town, so it has to stick with it for the next four letters, but when the last generated letters are "att" then it matches Seattle. At which point, it finishes up the city name.
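The walk-through can be checked with a short Python sketch that uses only those three names, spelled exactly as in this post. It assumes an end-of-name marker of "$" and case-sensitive contexts; `build_table` and `step_probabilities` are names invented for the illustration.

```python
from collections import Counter, defaultdict
from fractions import Fraction

END = "$"  # assumed end-of-name marker; none of the sample names contain "$"

def build_table(names, order):
    """Map each context (up to `order` preceding characters) to follower counts."""
    table = defaultdict(Counter)
    for name in names:
        padded = name + END
        for i, ch in enumerate(padded):
            table[padded[max(0, i - order):i]][ch] += 1
    return table

def step_probabilities(table, order, target):
    """Return (context, next_char, probability) for every step that spells `target`."""
    padded = target + END
    steps = []
    for i, ch in enumerate(padded):
        context = padded[max(0, i - order):i]
        followers = table.get(context, Counter())
        total = sum(followers.values())
        prob = Fraction(followers[ch], total) if total else Fraction(0)
        steps.append((context, ch, prob))
    return steps

table = build_table(["San Francisco", "Cincinatti", "Seattle"], order=3)
steps = step_probabilities(table, 3, "San Francinattle")

overall = Fraction(1)
for context, ch, p in steps:
    overall *= p
    if p != 1:
        print(repr(context), "->", repr(ch), p)  # only the steps with a real choice
print("overall:", overall)  # overall: 1/12
```

With this tiny dataset there are four real choices (the first letter, the letter after "S", the letter after "nci" and the letter after "att"), so this particular name comes up about one time in twelve.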

And so, a brand new city name, inspired by other city names, and following English spelling convention.

After some successful test runs, I compiled a list of over 19,000 US city names (information provided by your friendly neighborhood census bureau (and a little bit of programming to strip out unnecessary parts)).

Here are some of the names generated with order = 1:
Orrdak Lanson
Pachelug
Mior Juengsperes Beleckbumonty
Qubrt Blle
Nesoala Isvid
Splllay
War Hilahe Crgeyrtosn
Rionnengh Fr
Ibbon
Sove
Order = 2:
St. Cimandsonisond
Rippnevillsbortham
Mase
Vinge
Easton
Blue Haveroe
Peenvilley
Mintond
Forwing
Willinbold
Wod
Bly
New Citeradermon
Order = 3:
Tonkeen
Chur
Big Vall
Pleadis
Summitol
McRae
Roscomb
McBainfieldersvillenwood
Meside
Uphalittles
Glens
South Burleve Porth
Runa
Hess Hillage Placket
Belph
Warway
Order = 4:
Washton
Brook
Iona
Denver
Graham
Buncombes
Dewey
Mario
Dolanagan
Onster
Hampton
Magner
Greenview
Bee City
Saling Gretna

It's amazing how few orders there are between unpronounceable gobbledygook and reproducing sample names exactly. Orders 2 and 3 seem pretty good to me though. They produce names which are easy to imagine could be real, but probably aren't.

I also got a list of about 2,000 German cities. Here are some names generated using those (order = 3):
Alzenbach
Coswintern
Völklingen
Ludwig
Kötheim
Güstralsrode
Löwentha
Pappenhüttenburg
Kalbernburg
Wolgau
Geislingen

Tuesday, September 14, 2010

Some Musings on Redundancy

I was talking about the DRY principle before. Don't Repeat Yourself. From reading programming books, it seems like redundancy is the worst possible sin, for which one should be immediately banished to the ninth ring of hell (though it seems like it's frequently committed in practice). But I have to wonder if redundancy is always a bad thing.

Well, it's pretty obvious that redundancy isn't always a bad thing. Leaving the world of computer science, and looking at engineering, redundancy is frequently a good thing. To take an extreme case, life support systems on spacecraft are multiply redundant. Which is good, because life is awfully fragile in low-Earth orbit. Even here on Earth, redundancy in engineering tends to be a good thing. It has the downside of costing more, but has the advantage of preventing one thing going wrong from destroying everything.

But redundancy in engineering has little (if anything) to do with redundancy in programming. So, let's look at something a little more information based -- linguistics. Language is chock-full of redundancy. Here's a simple example: "I run", "He runs". What's with that extra s? We know who the sentence is talking about from the pronoun. Why bother with subject-verb agreement at all? Because the world is a noisy place. There's always some amount of background noise around. Frequently, people talk to each other in the middle of a crowd, where everyone else is talking too, which is a pretty incredible feat if you think about it. And in a noisy environment, some amount of spoken information is going to be lost in the background. And a little bit of redundancy can help you make sure you actually heard what you thought you heard. This form of redundancy still has its cost: it takes longer to get a complete message across.

And after this cross-discipline trek, I'll finally step back into programming, from human languages, to programming languages. I've been learning some Groovy for work. Groovy is a language that's built on top of Java. It does everything Java does and adds in some cool features of its own. One of the things it does, and part of its core philosophy, is to remove Java's unnecessary and redundant fluff. For example, Java requires a semicolon at the end of every statement. But most of the time, a single statement is on a single line. So, Groovy lets you use a new line to end the statement. You can still use the semicolon if you want, but it's not necessary. It's redundant. Stuff like that is all over the place. It makes the code shorter, but it also makes it (at least for someone new to Groovy) more difficult to understand. Finding a method's return type, for example, is no longer necessarily as simple as looking at the method's declaration. It's still unambiguous, but it's harder to find.

And not only is it harder for a human, it makes mistakes more difficult for the compiler to catch. If you have information stored in two places, and you change one (intentionally or accidentally), the compiler can alert you to the inconsistency. If you changed it intentionally, you'll be reminded to change the other. If you changed it accidentally, you'll be reminded to fix it. If the information is stored in only one place, the compiler has no way of checking if you really meant the change. Here's an example with Python (since I haven't been using Groovy enough to encounter a good example of this yet). In Python, functions can be treated just like any other variable. I was writing a program in which I wanted to get the result of one function (which took no arguments) and then pass that to another function. Simple code like this: x = funcA; funcB(x); See the problem? It should have looked like this: x = funcA(); funcB(x); Those parentheses after funcA make a big difference. Without them, funcA itself is passed to funcB instead of the result of funcA. If you consider a theoretically ideal language which has absolutely no redundancy (brainfuck comes close), then any arbitrary string would compile and run. Which means if you make a single typo, it will still work, it just won't do what you want it to do.
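Here is a small runnable version of that Python slip. The function bodies are made up for the illustration (the original funcA and funcB did something else), and the names are written as `func_a` and `func_b` to follow Python convention.

```python
def func_a():
    """Stand-in for the no-argument function; the return value is invented."""
    return 42

def func_b(value):
    """Stand-in for the function that consumes func_a's result."""
    return value + 1

x = func_a        # missing parentheses: x is now the function object itself
try:
    func_b(x)     # Python accepts the assignment; the problem only shows up here
except TypeError as err:
    print("caught:", err)

x = func_a()      # with parentheses: x is the result of calling func_a
print(func_b(x))  # 43
```

Both assignments are perfectly valid Python, so nothing complains until `func_b` actually tries to use the value, and only then because this stand-in happens to do arithmetic on it.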

I was gonna talk more about other forms of redundancy in programming. Higher level stuff in the overall design of the program rather than the nuts and bolts of the language. But this post is plenty long enough as it is, so I'll wrap up with conclusions now. Is redundancy a bad thing? Not necessarily. The advantages can outweigh the disadvantages. But there are always disadvantages. In most of my examples, the costs were pretty small compared to the benefits. But, especially in the higher level, more abstract stuff, the costs can be significant. So, I'll bring up another coding principle: Code by intention. If you're going to do something redundant, do it for a good reason. Do it intentionally.

Wednesday, September 1, 2010

Today is the First Day of Autumn

Why? Because I say so and I really like autumn. That's why.

There are those who say that solstices and equinoxes mark the beginning of each season. In which case we're still three weeks away from autumn. Others say that solstices and equinoxes mark the midpoint of each season.

Nonsense, I say! Solstices and equinoxes have to do with the alignment of Earth's tilt to the Sun. Seasons have to do with the weather and the temperature. The two are related, but not the same.

If you say solstices mark the beginning of seasons, then a week before Christmas isn't winter yet, which seems kind of ridiculous. If you say equinoxes mark the midpoint, then Valentine's Day is already spring, which is even more ridiculous.

And more importantly, there is no first day of any season. As my brother and I were discussing on Facebook, everything is continuous, and seasons are a prime example of that. It's not like there is one single day where the temperature drops from 30°C to 20°C and all the trees change color. It's a gradual transition, like red turning to orange in a rainbow. Any line of demarcation is going to have to be arbitrary.

And since it has to be arbitrary, it might as well line up with another, well established arbitrary date-point. In this case, the first of the month.

And so, I decree: The first day of autumn is September 1st. By extension, the first day of winter is December 1st, the first day of spring is March 1st and the first day of summer is June 1st. And it lines up much better with the weather that way.