Wednesday, September 29, 2004

Almost Back

The blog lives again - almost. My comment system didn't survive the transition to the new hosting service, but otherwise it's nice to have a voice on the web again. I'll try to find some time to fix the comment system soon, but it may have to wait until the weekend.

Tuesday, September 28, 2004

This really stinks

My hosting service has died and I'm no longer connected. These are posts I've written to ease the frustration...

It's been five days since I was last able to post to my blog. Things wouldn't be so bad, but my hosting company has restored an old copy and not given me access to it. Of course, to make matters even worse, my old co-worker Ned linked to my blog this week. Now everyone who reads his blog and follows the link is going to think I'm a blog zombie or worse. Then, on top of all this, just this afternoon I received an email from an avid reader/friend who was lamenting my lack of postings. It's truly painful to be this isolated.

Saturday, September 25, 2004

My hosting company has been having issues

Since Thursday afternoon I have not been able to access the server hosting this site. As I write this on Saturday afternoon it's still unreachable. It's been a frustrating couple of days. The hosting company finally got back to me this morning with an explanation (the server died) and a timetable for being back online (real soon), so I can be patient and wait.

The funny thing is I didn't realize how attached I had become to this medium of expression. Sure, sometimes it's a pain, but when you see a story and want to talk about it and can't because your domain is unreachable, it stinks. For example, they spotted a great white shark off Cape Cod this week and tagged it. For some reason it's still hanging around. I wanted to blog about it and couldn't. Sure, I'm writing about it now, but it's old news.

Thursday, September 23, 2004

Brain Quakes

We live in a cool age of ever expanding understanding of the physical and biological processes around us. We hear of new theories and ideas about all sorts of things nearly every week. I just read an article that suggests all the stuff we've learned about how oil and methane gas are produced, by the decay of ancient biological matter, is wrong. The theory also suggests that there's a giant microbial biosphere deep underground that lives off of and produces these products.

This theory is being applied to explain processes not only on Earth but on Mars as well. Recent measurements of the Martian atmosphere indicate a level of Methane present that cannot be explained by simple Martian volcanic activity. Some scientists are suggesting that microbial life deep underground on Mars is the source of this gas.

I want these theories to be true. Not just because knowing there's life on Mars would be cool, or because thinking about this new biosphere living in dark fissures of the earth feeding on methane and/or oil is kind of bizarre. I like the experience of the fundamental brain quake associated with paradigm shifts. News like this is like crack for nerds.

Wednesday, September 22, 2004

Ruby on Rails

Rails is a cool web application framework for Ruby. To promote how simple it is to use they've made a demonstration movie that includes installation, configuration and the creation of a blogging application backed by a SQL database, all in just 10 minutes. Obviously the person doing the work knows the platform, but this isn't a scramble of mouse clicks and speed typing. They don't even use a pre-built template to speed creation of the application. They do it from scratch. It's a great demo.

Tuesday, September 21, 2004

The future of PE and other hobby projects

I believe my time of working on PE is nearing its end. When I set out to write PE I had in mind a replacement for my beloved Brief. It now serves that purpose for me admirably. The only feature from Brief that I miss and haven’t reproduced is block column selection. While that would be cool to have, I can live without it. There are plenty of other Brief features that I didn’t copy; I just don’t care about them. I will continue to fix bugs and tweak PE over time as it moves to version 1.0 but I no longer envision any significant feature work.

Going forward I would like to finish the Java scripting library I started a while back. I will probably roll the verbose regular expression parser I talked about recently into it. I think both are cool little ideas and worthy of some level of completion. I don’t envision either of those tasks being significant.

Once those are completed, that leaves me without a hobby project to hack away at. I’ve thought about trying to get into doing some open source work but my free time is very scattered. I think most active open source projects move too fast for someone with my limited availability.

I will need to ponder this and come up with something new and fun to work on while I complete the scripting library.

Monday, September 20, 2004

PE version 0.91

I've made available a new version of my old-school text editor PE. PE version 0.91 adds some basic programmer features that it's been missing: syntax coloring, macro recording and playback, as well as user preference retention for stuff like window size, file filters, etc....

George Lucas on the Future of Star Wars

I missed this interview published last week.

Lucas: Ultimately, I'm going to probably move it into television and let other people take it. I'm sort of preserving the feature film part for what has happened and never go there again, but I can go off into various offshoots and things. You know, I've got offshoot novels, I've got offshoot comics. So it's very easy to say, "Well, OK, that's that genre, and I'll find a really talented person to take it and create it." Just like the comic books and the novels are somebody else's way of doing it. I don't mind that. Some of it might turn out to be pretty good. If I get the right people involved, it could be interesting.

I hope some young Frank Miller type gets a hold of the genre and turns it back onto a darker path.

Sunday, September 19, 2004

Super Emulator

Platform emulators have gotten pretty good over the years. I don't run Windows stuff on my wife's little iBook but I hear there's good software if I wanted to try. If you believe this article from Technology Review, the tiny startup Transitive Software has some breakthrough technology that allows you to emulate a platform with almost no loss of performance. It certainly sounds too good to be true.

While skimming the company's technology overview document I was struck by the similarity between one of their breakthrough optimizations and what the Sun HotSpot JVM attempts to do for Java.

The optimizing kernel reads the intermediate representation and optimizes the code. At first, simple optimizations are performed. In most applications, however, a 90/10 rules holds where 10% of the code is executed 90% of the time. The optimizing kernel looks for blocks of code that are executed often, spends increasing amounts of time improving the optimization of this code, and then stores this optimized code in memory. Each time a frequently used block of code needs to be executed, the highly optimized code stored in memory is used instead of optimizing that block of code again. Because the blocks of code that are executed change frequently, the optimizing kernel flushes old optimized blocks and generates new ones. The optimizing kernel produces superior code optimization compared to static binary translators or compilers. It optimizes code based on how an individual user is using that application and does not need to optimize code for the general case.
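To make the hot-block idea concrete for myself, here is a toy Java sketch of it: count how often a block runs, and once it crosses a threshold keep an "optimized" copy in memory and reuse it. Everything here is my own invention for illustration (the class name, the threshold of 3, and the fact that "optimizing" just stores the same function); it says nothing about how Transitive or HotSpot actually do it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

/** Toy model of the "optimize only the hot blocks" idea; every name here is hypothetical. */
final class HotBlockCache {
    private static final int HOT_THRESHOLD = 3; // assumed cutoff for "executed often"

    private final Map<String, Integer> runCounts = new HashMap<>();
    private final Map<String, IntUnaryOperator> optimizedBlocks = new HashMap<>();

    /** Runs a block, promoting it to the optimized cache once it has run often enough. */
    int run(String blockId, IntUnaryOperator block, int input) {
        IntUnaryOperator cached = optimizedBlocks.get(blockId);
        if (cached != null) {
            return cached.applyAsInt(input); // reuse the stored code, no re-optimization
        }
        int count = runCounts.merge(blockId, 1, Integer::sum);
        if (count >= HOT_THRESHOLD) {
            // Stand-in for real optimization: we simply store the same function.
            optimizedBlocks.put(blockId, block);
        }
        return block.applyAsInt(input);
    }

    boolean isOptimized(String blockId) {
        return optimizedBlocks.containsKey(blockId);
    }

    /** Drops a block that is no longer hot so its slot can be reused. */
    void flush(String blockId) {
        optimizedBlocks.remove(blockId);
        runCounts.remove(blockId);
    }

    public static void main(String[] args) {
        HotBlockCache cache = new HotBlockCache();
        for (int i = 0; i < 5; i++) {
            cache.run("double", x -> x * 2, i);
        }
        System.out.println(cache.isOptimized("double")); // prints true
    }
}
```

The interesting part of the real thing is obviously the optimizer itself; the sketch only shows the bookkeeping around it.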

Bug complexity and high level languages

I was reading Bob Congdon's recent post on James Tauber's Inverse Law of Bug Complexity:

"The harder a bug is to track down, the simpler the fix tends to be."

I don't agree this is a law at all. Mr. Tauber doesn't even seem too sure. The word 'tends' leaves too much wiggle room for a law. But the observation is not without merit. However, I'd say this is a rule of perception not reality. Most bugs are easy to fix once you know what the problem is. Bugs that are super hard to find just seem easier in relation to the task of finding them.

What really got me thinking about this however was the feeling that I haven't really had to chase a super hard bug since I started working in Java. Sure I've been confused by some class loading issues and hit a few JVM problems that were a pain but these pale in comparison to the stray pointer issues in C you would encounter, especially before protected memory and NuMega's Soft-Ice came on the scene.

Of Love and Other Demons

I just finished Gabriel García Márquez's book Of Love and Other Demons. It's only the second book of his I've read. The first was One Hundred Years of Solitude which is quite amazing and one of my favorite novels. That's a hard act to follow and I have to admit some disappointment with this newer novel. Don't get me wrong, it's a good book, I just was hoping for more.

Of Love and Other Demons is the story of a girl who is accused of being possessed by demons. The story takes place in an unnamed Latin American port city in the time of slave trade and Spanish colonialism. As the story unfolds the girl is imprisoned in a convent to await her exorcism. There she encounters a young scholarly priest and an illicit and tragic love affair follows.

What really makes this book isn't the story, however; it's Gabriel García Márquez's writing. Even though it's translated from Spanish, the sentences are alive and full of color. I can only imagine how good it must be in its original form.

Saturday, September 18, 2004

Why's Ruby Revisited

I had almost forgotten about Why's Poignant Guide to Ruby (think of it what you will). It appears to have been expanded since everyone first talked about it. More importantly, at least to this post, the author has published a nice article on what's new in Ruby 1.8. I have to confess my Ruby is a bit rusty, and I don't have any immediate plans to use the new features. Regardless, it's interesting to see what's changed.

New Look

I was getting tired of the old look of this blog so I made a new logo and rearranged my style sheet. Not much has really changed, but I did finally make a proper index page. I haven't got around to updating the PE page yet but I will soon. (There's a much enhanced version of PE on the way too). I also retired the artwork page. I didn't like the technology I was using for the gallery and to be honest I was sick of looking at that old stuff. I'll probably resurrect something similar but with a focus on my current work doodles. My co-workers have been encouraging me to do that for a while and it might be funny. We will see.

Friday, September 17, 2004

Sorry IE Users

I've been peppering my posts with &apos; without realizing IE doesn't support it. I've switched to the more general &#39;.

Parody out of Control

I just stumbled across the political parody site Rowboat Veterans for Truth. It's not especially clever. I'm blogging about it because the real Rowboat Veterans were actually from my home town of Marblehead, MA.

The men who rowed Washington across the Delaware and also away from Long Island were Marblehead fishermen from then Col. John Glover's Massachusetts Regiment. Glover later became a General. Just as important but less well known, Glover assembled the first ships of a deep water force for Gen. Washington from Marblehead and surrounding towns. Although it's not nationally recognized, Marbleheaders still claim their home is the birthplace of the US Navy.

If you ever visit Marblehead, visit Abbot Hall in the old section of town. Hanging in the hall is the original Spirit of 76 by Archibald Willard. It's very impressive both in historical significance and scale - it's pretty darn big.

Finally, while looking for some historical information about Glover and the Marblehead fishermen I found an online copy of The History and Traditions of Marblehead by Samuel Roads Jr. – 1880. I doubt I'll ever read it but it's an interesting piece of history.

Wednesday, September 15, 2004

Man on Fire

I usually don't take movie reviews too seriously; reviewers can have their own opinions just like everyone else. I said usually because for some reason my blood is boiling after having read the Boston Globe's review of the Denzel Washington flick "Man on Fire". On a scale of A to F the Globe reviewer gave the film a D-. A D-. I've seen plenty of horrible films in my day: Gymkata, Battlefield Earth and Speed 2 come to mind. Those films merit a D-; Man on Fire certainly does not.

Man on Fire isn't a perfect film; I can see why some reviewers complain about the second half dipping a bit too deeply into cliché, and the editing being a bit choppy, but beyond that it's very, very good.

Denzel does a great job bringing life to a flawed and troubled character. I think his Creasy character is the perfect post 9/11 anti-hero. His brutal revenge has a clarity that eschews any moral ambiguity about killing those who have wronged him. It's a complex and powerful performance.

In the final analysis I give the movie only a B+, but a D- is just crazy.

Blog maintenance

I've done a little blog maintenance. The front page now has my mug on it as does the updated About page. I've also added the correct link specification so Firefox picks up my RSS feed.

Tuesday, September 14, 2004

Living without here-documents

Since Java won't have here-documents any time soon, what's the best alternative for embedding blocks of formatted text in your code? Obviously you can use regular concatenated strings, but that places a dual burden of needing to maintain the entwined syntaxes of Java and whatever's being wrapped. It may be expedient but it's always going to be ugly.

I think a better mechanism is to use the class loader's getResourceAsStream("filename") call to load a file containing the preformatted text. My only concern with this approach is that there's no standard way of dividing multiple multi-line strings in a single text document. I'd rather not have one file per block of text.

So there's no standard solution, but there are examples in the SDK of Java solving the same problem. The policy documents under the jre/lib/security directory use a simple curly brace delimited syntax for encompassing multiple lines.

grant codeBase "file:${java.home}/lib/ext/*" {

Simply using curly braces isn't a good choice for a generic solution though because the text being wrapped may well contain curlies and you would need to start escaping the content; and that's pretty much what this exercise is trying to avoid.

For my proposed solution I turn back to the syntax of the aforementioned here-documents as the best generic solution to this problem.

random content

The benefits of this approach include the following:

1.) The contained blocks of text would be accessible in a manner consistent with regular property files.

   String s = BlockProperties.get("NAME");

2.) The syntax is close to that of a standard property file.

3.) Content would not need escaping.

4.) Super simple to parse.
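To back up the "super simple to parse" claim, here is a rough sketch of what such a loader could look like. The file layout is an assumption I'm making for the sketch: a header line of the form NAME = <<<TERMINATOR, then the raw text, then a line containing only the terminator. The class is the hypothetical BlockProperties from point 1, written with an instance get() rather than a static one, and none of it is an existing library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical loader for named here-document blocks (assumed syntax: NAME = <<<END). */
final class BlockProperties {
    static final String MARKER = "<<<";

    private final Map<String, String> blocks = new LinkedHashMap<>();

    /** Reads every block from the source; lines outside a block are ignored. */
    static BlockProperties load(Reader source) throws IOException {
        BlockProperties props = new BlockProperties();
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            int equals = line.indexOf('=');
            int marker = line.indexOf(MARKER);
            if (equals < 0 || marker < equals) {
                continue; // not a block header
            }
            String name = line.substring(0, equals).trim();
            String terminator = line.substring(marker + MARKER.length()).trim();
            if (terminator.isEmpty()) {
                continue; // a header needs a terminator word
            }
            // Copy lines verbatim until the terminator; no escaping is required.
            StringBuilder body = new StringBuilder();
            while ((line = in.readLine()) != null && !line.trim().equals(terminator)) {
                body.append(line).append('\n');
            }
            props.blocks.put(name, body.toString());
        }
        return props;
    }

    /** Convenience wrapper for text that is already in memory. */
    static BlockProperties parse(String text) {
        try {
            return load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the block's text, or null if no block has that name. */
    String get(String name) {
        return blocks.get(name);
    }

    public static void main(String[] args) {
        String file = "GREETING = <<<END\nHello {world}\nsay \"hi\"\nEND\n";
        System.out.print(BlockProperties.parse(file).get("GREETING"));
    }
}
```

In real use the Reader would wrap the stream returned by getResourceAsStream, so the blocks live in a resource file next to the classes. Note that the curly braces and quotes in the sample pass through untouched, which is the whole point.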

Monday, September 13, 2004


One of my favorite little features of scripting languages, like perl, ruby and groovy, is their support for here-documents. A here-document is a lot like a <pre> tag in html, but instead of declaring text for display you are declaring the contents of a string - line endings and all.

Instead of writing:

static String test =
"This is line 1\r\n"+
"This is line 2\r\n" +
"This is line 3\r\n" +
"This is line 4\r\n";

You get to write:

static String test = <<<END_OF_BLOCK
This is line 1
This is line 2
This is line 3
This is line 4
END_OF_BLOCK;

This is a great little feature for embedding little languages like HTML, XML or the alternate regular expression syntax I've been talking about within another language.

There's a JSE for the feature on Sun's Java developer site. Please vote for it if you agree it's a good idea.

Sunday, September 12, 2004

Down Time Project

I strained some lower back muscles on Thursday and since then I've been taking muscle relaxants and ibuprofen, sleeping a lot and generally taking it easy. I took advantage of the down time to start writing the alternate regular expression parser I recently blogged about. The syntax has changed a little but the spirit of the original idea is intact. The following are some samples of what it can generate.

Standard Regex:
Verbose Regex:
define restricted ('<>()[]\\.,;:@"')
anchor begin
oneOrMore(notAny(restricted)) +
zeroOrMore('.' + oneOrMore(notAny(restricted)))
or group('"' + oneOrMore(any) + '"')

Standard Regex:
Verbose Regex:
define ALPHANUM (range('a','z','A','Z','0','9'))
define ALPHA (range('a','z','A','Z'))
anchor begin
group(zeroOrMore(zeroOrOne('_.-') + oneOrMore(ALPHANUM)))
+ '@' +
group(oneOrMore(ALPHANUM)) +
group(zeroOrMore(zeroOrOne(any('.-')) + oneOrMore(ALPHANUM))) +
'.' +
group(repeat(ALPHA,2, ))
end anchor

The expressions (Standard and Verbose) in each box are equivalent. The result of running the verbose syntax parser is the standard expression.
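For anyone curious how the verbose-to-standard translation could work, here is a small Java sketch that skips the parser entirely and uses plain static helpers that each return a fragment of standard regex text. This is not my actual parser; the helper names mirror the verbose keywords, but atLeast, anchored and literal are names I made up for the sketch, and range() assumes its bounds are plain letters or digits.

```java
import java.util.regex.Pattern;

/** Hypothetical helpers that emit standard regex text from verbose-style calls. */
final class VerboseRegex {
    /** Escapes every non-alphanumeric character so it is matched literally. */
    static String literal(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    static String any(String chars)       { return "[" + literal(chars) + "]"; }
    static String notAny(String chars)    { return "[^" + literal(chars) + "]"; }
    static String group(String expr)      { return "(" + expr + ")"; }
    static String zeroOrOne(String expr)  { return "(?:" + expr + ")?"; }
    static String zeroOrMore(String expr) { return "(?:" + expr + ")*"; }
    static String oneOrMore(String expr)  { return "(?:" + expr + ")+"; }
    static String anchored(String expr)   { return "^" + expr + "$"; }

    /** Stands in for repeat(x, n, ) with an open upper bound. */
    static String atLeast(String expr, int n) {
        return "(?:" + expr + "){" + n + ",}";
    }

    /** range('a','z','0','9') becomes [a-z0-9]; bounds are assumed alphanumeric. */
    static String range(char... bounds) {
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i + 1 < bounds.length; i += 2) {
            out.append(bounds[i]).append('-').append(bounds[i + 1]);
        }
        return out.append(']').toString();
    }

    /** Roughly the second sample above, rebuilt with these helpers. */
    static String mailPattern() {
        String alphanum = range('a', 'z', 'A', 'Z', '0', '9');
        String alpha = range('a', 'z', 'A', 'Z');
        return anchored(
            group(zeroOrMore(zeroOrOne(any("_.-")) + oneOrMore(alphanum)))
            + literal("@")
            + group(oneOrMore(alphanum))
            + group(zeroOrMore(zeroOrOne(any(".-")) + oneOrMore(alphanum)))
            + literal(".")
            + group(atLeast(alpha, 2)));
    }

    public static void main(String[] args) {
        String regex = mailPattern();
        System.out.println(regex);
        System.out.println(Pattern.matches(regex, "john.doe@example.com"));
    }
}
```

The generated text is uglier than a hand-written expression because every quantifier wraps its argument in a non-capturing group, but that keeps the helpers composable without any precedence analysis.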

I haven’t added the named groups feature I talked about yet. I wanted to get the basic parser done first. I also need to expand my set of unit tests and flesh out the capabilities. I’ve seen certain expressions that I cannot even read so it’s quite difficult to know how to parse them. If nothing else this project is honing my regular expression skills.

Friday, September 10, 2004

Blogger hates quotes and apostrophes

Sometimes blogger correctly encodes quotes (apostrophe too) other times it doesn't. My last post used a lot of apostrophes and double quotes and of course it didn't encode correctly. Arggg. I've hand added all the &apos; and &quot;s now so it should be readable.

More Regular Expression Complaints

Bob responded to my previous comment on regular expressions with a post of his own. He didn't seem to care for the idea of a new syntax for writing regular expressions and instead focused on how you can use technique and tools to make the current syntax more usable.

I agree that there are better ways to write a complex regular expression, like the mail address parsing example, using common language features; simply breaking a complex expression like this into named blocks would go a long way to improving its understandability. However, my issues with regular expressions go deeper than what can be done with simple syntax substitution.

Take for example the way regular expressions use simple parentheses. In a regular expression parentheses serve a dual purpose. They act as scope boundaries for operators like the quantifiers "?", "+", "*" and "{x}".

Quantified Expression: "^([a-f][0-9]-){3}[a-f][0-9]$"
Matches: "a8-b2-c3-f6"
Doesn't match: "a9-b3-c8-x8"

They also act as group delimiters to capture subsets of the matched string for back references or for extraction by the caller.

Sub Expression: "^([a-f][0-9])-([a-f][0-9])-([a-f][0-9])-([a-f][0-9])$"
Matches: "a8-b2-c3-f6"
Group 1: "a8"
Group 2: "b2"
Group 3: "c3"
Group 4: "f6"

Back Reference Expression: "^(.{4})-\1$"
Matches: "aaaa-aaaa"
Group 1: "aaaa"
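All three behaviours are easy to check with java.util.regex. The following is just a small test harness of my own (the class and method names are made up); the sub expression is written with each pair in its own balanced set of parentheses.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Small harness showing the two jobs parentheses do in a regular expression. */
final class ParenDemo {
    /** Parentheses as a scope boundary for the {3} quantifier. */
    static boolean quantifiedMatches(String input) {
        return Pattern.matches("^([a-f][0-9]-){3}[a-f][0-9]$", input);
    }

    /** Parentheses as capture groups; returns groups 1..n, or an empty array on no match. */
    static String[] captureGroups(String input) {
        Matcher m = Pattern
            .compile("^([a-f][0-9])-([a-f][0-9])-([a-f][0-9])-([a-f][0-9])$")
            .matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i); // group 0 is the whole match, so start at 1
        }
        return groups;
    }

    /** \1 must repeat exactly what group 1 captured. */
    static boolean backReferenceMatches(String input) {
        return Pattern.matches("^(.{4})-\\1$", input);
    }

    public static void main(String[] args) {
        System.out.println(quantifiedMatches("a8-b2-c3-f6"));              // true
        System.out.println(quantifiedMatches("a9-b3-c8-x8"));              // false
        System.out.println(Arrays.toString(captureGroups("a8-b2-c3-f6"))); // [a8, b2, c3, f6]
        System.out.println(backReferenceMatches("aaaa-aaaa"));             // true
    }
}
```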

If you have a complex expression with a lot of parentheses you either have to count them very carefully to determine the correct group number or else you have to do some experimentation to determine what number matches which group. If the expression ever changes or you need to add or remove parentheses, the back references and the code that uses the expression will break.

A first step to fixing this would be named groups. Rather than having to use group numbers you would be able to address the group by name. Here's a short example of what I mean that recreates the examples above using the basic syntax I created in my previous post.

# Equivalent to "^([a-f][0-9])-([a-f][0-9])-([a-f][0-9])-([a-f][0-9])$"





# Equivalent to "^(.{4})-\1$"




Named groups as I described them above wouldn't fix one class of problem however. But it's a problem standard regular expressions have too. You can't use quantifier blocks and groups simultaneously. Examine this expression:

Quantified Expression: "^(([a-f][0-9])-){3}([a-f][0-9])$"

The block "(([a-f][0-9])-)" must repeat three times. This block, however, contains the parenthetic group "([a-f][0-9])". It seems like you should be able to access each group of the repeating block. But you can't. Given the input, the actual results are:

Grouped quantified expression: "^(([a-f][0-9])-){3}([a-f][0-9])$"
Matches: "a8-b2-c3-f6"
Group 1: "c3-"
Group 2: "c3"
Group 3: "f6"
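Here is a quick Java check of that behaviour, plus the workaround I would probably reach for: validate the whole string with the quantified expression, then walk it a second time with the inner pattern and find(). Groups are numbered by their opening parenthesis, so group 1 is the outer repeating block and group 2 is the pair inside it, and both only remember the last repetition. The class and method names are mine, purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Shows that a quantified group only keeps its last repetition. */
final class RepeatedGroupDemo {
    private static final Pattern WHOLE =
        Pattern.compile("^(([a-f][0-9])-){3}([a-f][0-9])$");
    private static final Pattern PAIR = Pattern.compile("[a-f][0-9]");

    /** Returns groups 1..n of the quantified expression, or an empty list on no match. */
    static List<String> groupsOf(String input) {
        List<String> groups = new ArrayList<>();
        Matcher m = WHOLE.matcher(input);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    /** Workaround: validate with WHOLE, then collect every pair with a second pass. */
    static List<String> everyPair(String input) {
        List<String> pairs = new ArrayList<>();
        if (WHOLE.matcher(input).matches()) {
            Matcher m = PAIR.matcher(input);
            while (m.find()) {
                pairs.add(m.group());
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(groupsOf("a8-b2-c3-f6"));  // [c3-, c3, f6]
        System.out.println(everyPair("a8-b2-c3-f6")); // [a8, b2, c3, f6]
    }
}
```

The second pass works, but it means maintaining two expressions that have to agree with each other, which is exactly the kind of thing a better syntax ought to handle.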

Thursday, September 09, 2004

Java Tokenizers

I'm always needing to parse strings and streams and always bumping into weirdness with Java's default tokenizers. Here's a short list of my pet peeves with these classes. There are others but these are the biggest.

1. If you’re going to have two classes called Tokenizers (StringTokenizer, StreamTokenizer) it would be nice if they had the same semantics. Besides the fact they both break up a sequence of characters into tokens these classes have nothing in common.

2. In Java 1.1 when they introduced all the Reader IO classes, why didn't they make a ReaderTokenizer instead of adding a Reader constructor to StreamTokenizer and deprecating the InputStream constructor? They added BufferedReader, FilterReader, etc. to replace BufferedInputStream and FilterInputStream, so why stop there?

3. The operation of StreamTokenizer is not well documented. In order to use the class you really need to understand its implementation model.

I've never seen the code but from playing with the API it appears the class keeps an array in the background that mirrors the character set. Each slot in that array has attributes that describe its corresponding character. The attributes determine whether the character is whitespace, a word character, a string delimiter, etc. The class then provides you with a bunch of methods that let you change the attributes in the slots.

However, if you don't understand the underlying model I just described the methods like the following are hard to understand:

public void wordChars(int low, int hi)
Specifies that all characters c in the range low <= c <= high
are word constituents. A word token consists of a word constituent
followed by zero or more word constituents or number constituents.

There's nothing in the doc or the method signature that lets you know you can call this repeatedly to set different ranges of characters as word characters.

4. The Javadoc for StringTokenizer calls it a 'legacy class' and recommends people use String.split(String regex) instead. This advice is fine for certain uses of StringTokenizer but for others a more appropriate statement would be to use StreamTokenizer with a StringReader.
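To illustrate points 3 and 4 together, here is a short sketch that tokenizes a String by wrapping it in a StringReader and calling wordChars() several times to build up the set of word characters. The class name, the character classes chosen and the sample input are just my choices for the example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Tokenizes a String with StreamTokenizer after configuring its character table by hand. */
final class TokenizerDemo {
    static List<String> tokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();          // start from a blank table: every character is "ordinary"
        st.wordChars('a', 'z');    // each call marks one more range as word characters
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');
        st.wordChars('_', '_');
        st.whitespaceChars(0, ' ');
        st.quoteChar('"');

        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD || st.ttype == '"') {
                    out.add(st.sval);                       // word or quoted string body
                } else {
                    out.add(String.valueOf((char) st.ttype)); // single ordinary character
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // a StringReader should not actually fail
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [name, =, Ned Smith, user_42, ;]
        System.out.println(tokens("name = \"Ned Smith\" user_42;"));
    }
}
```

Getting the quoted string back as a single token is the kind of thing String.split can't easily do, which is why I think the Javadoc advice is incomplete.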

Wednesday, September 08, 2004

Regular Expressions

Regular expressions are without doubt the most cryptic set of commands a modern programmer is bound to face. We've built generations of languages that have hidden the complexities of machine code but we still use these horrible jumbles of characters for text processing. Take for example this expression for parsing email addresses I found at


Please, there's got to be a better way. I know Python supports verbose regexes, but even that's not much of an improvement. Why isn't there a higher level regular expression language? If such a thing exists I couldn't find it doing some basic Google searches.

What I would like to see is a language that's more verbose, less cryptic and supports block reuse. I've invented a little syntax below and attempted to translate the above expression into it. I think it's a lot easier to read.


RANGE(a-z), (ZERO-OR-MORE RANGE(a-z,0-9),RANGE(a-z,0-9) )

OR (
OR (

ONE-OR-MORE NOT-IN SET(<>(\)[]\\.,;:@"),
ZERO-OR-MORE('.', ONE-OR-MORE NOT-IN SET(<>(\)[]\\,;:@"))
('"', ONE-OR-MORE ANY,'"')


Sunday, September 05, 2004

C++ after Eclipse

I've been hacking on PE some lately. It's been an odd switch going back to c++ and MSDEV 6.0 with all its oddities and its bare bones IDE environment. I've been working in Java and Eclipse for a couple of years now and the environment is just so much more productive. I knew that before but going back really brings it home.

Given I'm such an Eclipse hound now and MSDEV 6.0 is getting long in the tooth, I thought I'd give the Eclipse CDT project a spin. It's certainly easy to install and setup. On Windows I just had to install the Cygwin and Gnu dev tools and unzip the project into my Eclipse 3.0 directory.

Once I had the CDT running I attempted to configure the editor to my liking. Here's where I hit my first sign of disappointments to come. The preferences panel only offers basic color coding options and has none of the automatic formatting goodness of the Java IDE.

Once I had the editor setup I wrote my first little hello world program and tried to compile it. It failed when it couldn't find a make target. This got me a little nervous because I've happily purged most of my make knowledge from memory and didn't relish diving in and refreshing those brain cells. Luckily the CDT offers two modes of working with make: managed projects and standard projects. With managed projects it does all the work and builds the make file. With the standard project you need to do the make file work, you just teach the IDE about the build targets. I had started out creating a standard project (it sounded simpler) when I should have created a managed one. I just created a new managed one and everything worked great.

Once I had some simple code working I switched over and went looking to explore the CDT refactoring capabilities. If you don’t already know, this is where the Java IDE really shines. You can rename classes, methods and attributes and have those changes propagated throughout the project. It’s really an amazing time saver. However, to my great disappointment none of this magic has been carried over to the CDT.

The Eclipse CDT is pretty cool but currently it's not that much different from my old MSDEV. Given that work on the CDT continues I hope to see some of the Java magic make its way into the c++ environment. Until then, given their similarities I probably won’t bother to port the PE project, but I may choose the CDT for any new projects I start.

Saturday, September 04, 2004

Surviving Corporate Silence

I found this paper that nicely summarizes how corporations should deal with corporate change. I've seen my share of corporate reinvention so I've seen a lot of this advice put into play, however, there's one recommendation made that I don't think I've ever seen followed:

Create new communication channels and use them. There can be a
temptation to hold back information, especially bad news, until you are
sure. The result can be decreased trust, since people usually can tell
when something is going on. Often the information void will be filled with anxiety-producing rumors.

While it's annoying not having all the facts, I usually make the best of these times by making a game out of it. I gather all the rumors I can before the transition and then compare the rumors with the final reality. There are usually some pretty strange and off base rumors. The most extreme version of this survival strategy I've heard of came from a friend of mine who told me about his experience in the army. He and his coworkers would routinely entertain themselves by creating new rumors and tracking whether, and how long, they took to come back to them.

Black Triangle Sightings

This article on mysterious triangular aircraft is pretty interesting. Given its reported slow speed I doubt it's the oft mentioned and mysterious Aurora, but who knows.

On a more humorous note, I've dug a little deeper and found folks speculating it's one of the "incredibly advanced gravity-defying triangular super-secret aerial platforms at least partially derived from ET technology" called the TR-3B. It's amazing how much detail people claim to know about the thing:

A circular, plasma filled accelerator ring called the "Magnetic Field Disrupter" surrounds the rotatable crew compartment and is far ahead of any imaginable technology. Sandia and Livermore laboratories developed the reverse engineered MFD technology. The government will go to any lengths to protect this technology.

The MFD generates a magnetic vortex field that disrupts or neutralizes the effects of gravity on mass within proximity by 89 percent. This is not antigravity. Anti-gravity provides a repulsive force that can be used for propulsion. The MFD creates a disruption of the Earth's gravitational field upon the mass within the circular accelerator.

The mass of the circular accelerator, and all mass within the accelerator, such as the crew capsule and the nuclear reactor, are reduced by almost 90%. This causes the effect of making a vehicle extremely light and able to outperform and outmaneuver any craft yet constructed--except, of course, those UFOs we did not build. The TR-3 is a reconnaissance platform with an indefinite loiter time. "Indefinite" because it uses a nuclear reactor for power

Very Busy

Between work and vacation days I've not had much time to blog lately. Hopefully I will have more time soon. Meanwhile, here are a couple of short items to fill the void.

1. The blogger Troutgirl was terminated from Friendster for blogging. While the posts that caused her termination do mention work and the technology they employ, they hardly seem cause for termination. I've always been fairly cautious in this vein; I guess it's the smart thing to do.

2. While reading about the horrible terrorist attack in Russia I came across the word denouement.

With hospitals overflowing and many bewildered relatives still seeking news of missing loved ones after Friday's bloody denouement, President Vladimir Putin ordered a security crackdown in the Caucasus.

It's not very often that a word in a news article stumps me, but this one did. According to Hyperdictionary denouement means:

    1. [n] the final resolution of the main complication of a literary or dramatic work

    2. [n] the outcome of a complex sequence of events
