Saturday, September 30, 2006

JavaScript -- crazy old friend

I actually learned a lot of JavaScript around 1996, when I was doing what turned out to be a horrible redesign of the Daily Cal's website. At first I learned it to do a few browser tricks, like popup menus that instantly change the page you're on (told you it was horrible). But eventually I was using JavaScript to publish a whole newspaper issue's worth of stories. I don't remember how it worked, whether I fed text files in through some kind of form or just pasted in a big block of text, but the whole thing ran in Netscape 3 and of course I had to save the output (I think it was just the home page index) by hand.

I soon gladly left JavaScript behind, deciding that Real Men script on the server side. That put me on a very strange path that included AppleScript (blech), Frontier (way less blech) and Perl (way, way less blech). I kept using Perl. As an amateur, I did not have time to invest in learning another language. Besides, Perl had tons of libraries, a very important strength for a hobby programmer without a lot of time on his hands. And you really can write clean, well documented Perl.

I've used Perl to write a lot of tools for my work as a newspaper reporter. There are two Perl-based Web apps I use every day, in fact, one of which has been in use and under iterative development since 2000. I did eventually learn Ruby and set up a single, simple Rails application. Ruby is a lot like Perl, in fact is descended directly from Perl, and Rails is a fast way to write Web apps.

Anyway, over the past year or so I've waded back into JavaScript. Google Suggest knocked my socks off when I first saw it, as did Google Maps. I knew I'd eventually want to use AJAX techniques in my Web apps, so after consulting the folks in the Joel on Software Discussion Group, I picked up O'Reilly's JavaScript: the Definitive Guide, 4th edition, by David Flanagan. It's a great book, with clean writing and a straightforward, honest approach. Very thorough and has a great reference section.
Its only fault is that it's four years old, but, hey, Microsoft hasn't released a new Web browser in that time.

My latest app is based on the Google Maps API, so I've been working heavily with JavaScript since completing the functional spec in June and, especially, since completing the initial base of Perl code around early August.

JavaScript has become an impressive language and has some fans. Some people even use it on the server side, where there are lots of other languages available. Myself, I'm quite impressed with how far it's come in the ten years since I worked seriously with it.

But god if it doesn't bug me sometimes. Here are the top things that just suck about JavaScript:

1. No foreach loops

Perl has foreach loops:


foreach my $item (@items){ -do stuff- }
Ruby has the array.each{-code-} construct:



items.each{|item| -do stuff-}

JavaScript has, well, for loops. So every. Single. Time. You want to iterate over an array, you have to do:


for (var i = 0; i < items.length; i++){ var item = items[i]; -do stuff- }
God that gets old. And error prone. There are the typos. There are plenty of opportunities, since you type the name of the index variable four times and the name of the array twice. Also, nested loops tend to break. Since JavaScript doesn't have true lexical variables, a nested loop has to have its own index variable (usually, "j"). Something like this won't work:


for (var i = 0; i < items.length; i++){ var item = items[i]; for (var i = 0; i < item.things.length; i++){ var thing = item.things[i]; -do stuff- } }
... because the changes to "i" in the inner loop will affect the "i" used in the outer loop. Even though each is declared with var, the second var has no effect because lexicals aren't supported. And instead of dying with an error message as it should, JavaScript just allows you to redeclare "i" -- redeclaration is silently ignored as a matter of policy. So your program will start acting buggy and it will take you a while to figure out why. Then when you go back and change "i" to "j" you have to change it four times, which means you're practically guaranteed to miss at least one.

It's insane that JavaScript doesn't have foreach. I've written about 600 lines of JavaScript for my Google Maps application, and I've not once used a for loop for anything other than iterating over an array.

Oh, and don't let anyone tell you that JavaScript's for/in loops are any better for arrays. They have the same issues, since you still have to deal with index variables:


for (var i in items){ var item = items[i]; -do stuff- }
I guess maybe that buys you slightly less typo opportunity, since you only use the index variable twice. But that's not much of a gain. Since the for loop is bog standard -- Perl has one, C has one, I'm pretty sure Ruby has one -- I just use that instead.

2. No map

In Perl or Ruby, the map operator provides an easy way to transform an array. It will apply a given code block to every single element in an array. JavaScript's lack of foreach might be forgivable if it had map, a truly awesome tool. But despite its shiny, "everything is an object" architecture, it does not provide this basic capability. Here's map in Perl, capitalizing every word in an array:


@items = map { ucfirst $_ } @items;
Or in Ruby:


items = items.map{|item| item.capitalize }
In JavaScript, you have to break out a bloody for loop again:


var newItems = []; for (var i = 0; i < items.length; i++){ var item = items[i]; newItems.push(ucFirst(item)); } items = newItems;
And that's IF you already wrote a ucFirst function, since JavaScript ships with no native facility for capitalizing words.

3. No keys

In Perl, you can quickly get a list of all the keys in a hash -- and thus all the properties in 99% of objects -- via the keys function. This provides a nice way to see if a hash is empty or not:


if (not keys %bad_stuff){ have_fun(); }
JavaScript has no similar function for providing the properties of an object. You have to bust out a for/in loop:


var properties=[]; for (var property in badStuff){ properties.push(property); } if (properties.length == 0){ haveFun(); }
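
The loops in the three complaints above can at least be hidden behind small helpers. What follows is a minimal sketch, not anything built into the language: each, map, keys and ucFirst are hypothetical names chosen for illustration. Because every loop body becomes a function with its own scope, nested iterations no longer fight over "i".

```javascript
// Hypothetical helpers for illustration; none of these names are built in.

// Call fn(item, index) for every element of an array.
function each(items, fn) {
  for (var i = 0; i < items.length; i++) {
    fn(items[i], i);
  }
}

// Return a new array holding fn(item) for every element.
function map(items, fn) {
  var result = [];
  each(items, function (item) {
    result.push(fn(item));
  });
  return result;
}

// Return the names of an object's own properties, like Perl's keys.
function keys(object) {
  var names = [];
  for (var name in object) {
    if (Object.prototype.hasOwnProperty.call(object, name)) {
      names.push(name);
    }
  }
  return names;
}

// Uppercase the first character of a string, like Perl's ucfirst.
function ucFirst(word) {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

// The map example, without a visible loop.
var items = map(["perl", "ruby"], ucFirst);

// Nested iteration: each callback has its own scope, so no "i" versus "j".
var things = [];
each([{ things: ["a", "b"] }, { things: ["c"] }], function (item) {
  each(item.things, function (thing) {
    things.push(thing);
  });
});

// The emptiness check from the keys example.
var badStuff = {};
var isEmpty = keys(badStuff).length == 0;
```

The cost is one function call per element, which hardly matters for arrays of the size a typical Web page deals with.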
4. No lexical variables

Variables can be global or scoped to a function, but not scoped to a block. Is it just me, or is it 2006? This wouldn't be such a huge deal if we didn't need index variables for our for loops. Since we do, we really need lexical variables so we can nest those loops without keeping track of which index variables are used up. Ack.

5. Object functions can't ever pose as object properties

This is kind of a nit. But: In JavaScript, it kind of sucks that you have to know whether you are calling an object's property or its function. It would be nice if I could replace a property with a function and not have to change this:


var communists = thinkPad.maker;
... into this:


var communists = thinkPad.maker();
Perl's not really any better in this regard, it's just that any Perl coder who wants to say he is writing object oriented code has to use functions for all object property access, and so you never have to think about whether or not you're accessing a function. And you never, ever have to do those stupid trailing parens.

Hmmm

You know, really, what's surprising about JavaScript isn't the sucky parts, but how good it is, actually. The roughly 600 lines I've done on my current project constitute a fairly well designed collection of objects and classes. There is plenty to like about JavaScript:

1. Object oriented

Unlike Perl, JavaScript is object oriented down to its core. This means that strings, numbers, arrays and other native types can all be manipulated as objects, and functions can be added to those core, native types. For example, I added a method to capitalize strings, and I can call it on any string.

2. Prototype-based OO

Prototypes are a concept originating, if I'm not mistaken, from Self, a descendant of Smalltalk. The idea is that you define a class by creating an object, and all instances of that class are essentially copies of that object. An important corollary is that you can add methods to an object on the fly and, thus, define new classes at runtime. JavaScript's object system is based on prototypes, which is nifty, since prototypes are well suited to Web programming. It's sort of like the duct tape of object orientation.

The big caveat here is that JavaScript does not allow you to assign a new prototype to a particular object, only to a whole class. Per-object prototypes are very handy, and something I had gotten used to in Perl using Class::Prototyped (usually via the truly excellent CGI::Prototype). I tried it once in JavaScript, when I wanted to create a ProjectEvent object that had all the properties of a particular Event object but with some other, additional properties particular to the ProjectEvent.
It didn't work -- the ProjectEvent class could inherit the prototype from the Event class, but one object could not do so with another object. It might have been possible to create ad-hoc, intermediate classes to accomplish this -- one for each ProjectEvent -- but it just wasn't worth it.

3. Hashes

JavaScript calls them Objects, but they can be used just like hashes.


object[key] = value;
You may know them as "associative arrays." I don't know how languages ever got on without them. Oh, you can even call "delete!"


delete object[key];
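
A slightly fuller sketch of an Object used as a hash, with made-up keys and values: set a few entries, test for one, delete it, and walk what is left with for/in. The hasOwnProperty check is there so that anything added to Object.prototype elsewhere on the page does not show up as a key.

```javascript
// A plain Object used as a hash; the keys and values are made up.
var lineCounts = {};
lineCounts["perl"] = 3000;
lineCounts["ruby"] = 200;
lineCounts["javascript"] = 600;

// Test whether a key is present.
var hadRuby = lineCounts.hasOwnProperty("ruby");

// Remove an entry, then test again.
delete lineCounts["ruby"];
var hasRuby = lineCounts.hasOwnProperty("ruby");

// Walk the remaining entries. for/in promises no particular order,
// so sort the result before relying on it.
var remaining = [];
for (var key in lineCounts) {
  if (lineCounts.hasOwnProperty(key)) {
    remaining.push(key + "=" + lineCounts[key]);
  }
}
remaining.sort();
```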
4. Ubiquity

I thought Perl was ubiquitous, but JavaScript is in a whole other league. It's baked into every Web browser, in one implementation or another, so many, many people know it. This means you can readily find answers to your questions on Google or Google Groups. Of course, much of the information is of poor quality. You definitely need a good book at your side, especially for basic questions, where you want to learn how to do things the right way, and understand why you do things that way.

5. Intelligent handling of references

You don't have to think about references in JavaScript. Better yet, it gives you optimal performance by always passing by reference. This conserves memory and avoids unneeded copying. Yet it will cleverly pretend to be passing by value in cases where that's what you expect, like handing a string to a function or concatenating two strings together. If you come from a language other than Perl you're probably shrugging at this. Well, it's a big deal, at least to me. Really nice not to have to distinguish between @items and \@items.
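
A short sketch of the reference behavior described above, using made-up function names: an array handed to a function is shared with the caller, while a string behaves like a value.

```javascript
// Hypothetical functions for illustration.

// The array is not copied; push changes the caller's array.
function addItem(list) {
  list.push("new item");
}

// Concatenation builds a new string and rebinds the local name only.
function shout(text) {
  text = text + "!";
  return text;
}

var items = ["old item"];
addItem(items);

var greeting = "hello";
var louder = shout(greeting);
// items now holds two elements; greeting is unchanged.
```

The flip side is that a function wanting a private copy of an array has to make one itself, for example with items.slice(0).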

Thursday, May 11, 2006

Google plug ins have arrived

Six months ago, Dave Winer suggested a plug-in architecture for search engines. He wanted to mix results from specialized search engines like Sphere and memeorandum in with all his searches.


Well, Google apparently listened. It just rolled out a plug-in interface, called "Subscribed Links." A Web publisher can point Google at a special XML feed containing a series of "ResultSpecs." Each ResultSpec (example) is a user query string ("extraordinary rendition") plus the URL, title etc. to return when that text is entered ("Outsourcing torture," http://www.newyorker.com/fact/content/?050214fa_fact6, " ... had been sent to Syria on orders from the U.S. government, under a secretive program known as 'extraordinary rendition.' This program had been devised ... ").


This looks promising. At work, I'd really love to plug results from WSJ.com, Factiva and Lexis Nexis into my Google results. Which is why I loved Dave's plug-in idea when he first proposed it.


But there are some serious limitations that give me pause about this architecture:



  • Only one result per plug-in per query, it appears. This is silly. If WSJ.com, for example, spits up three good hits, I want all of them, not just one. If WSJ.com starts spamming me with too many hits, I'll just unsubscribe, problem solved.


  • If you want a published document to be a result for more than one query term, the interface gets a lot less simple.


  • I'm not sure about this, but it appears as though you have to specify the exact query terms for each result, instead of just telling Google the various keywords associated with a particular result. So if a subscription newspaper, for example, tokenizes a typical news story, it would have to associate that story not just with the dozens of keywords contained in the news story, but also with the exponentially larger possible combinations of keywords. For example, one for "torture," one for "citizen," one for "torture citizen" and one for "citizen torture." The publisher may be able to get around this with a clever regular expression -- Perl regexes are supported -- but that's a little funky.
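
To get a feel for how quickly exact query strings pile up under that reading, here is a rough sketch that lists every ordered combination of a story's keywords. The function name and the extra keywords are made up for illustration, and nothing here touches Google's actual feed format.

```javascript
// Hypothetical helper: list every ordered, non-empty combination of keywords,
// i.e. every exact query string a publisher would have to enumerate.
function queryStrings(keywords) {
  var results = [];
  function extend(prefix, remaining) {
    for (var i = 0; i < remaining.length; i++) {
      var query = prefix.concat([remaining[i]]);
      results.push(query.join(" "));
      // Recurse with the chosen keyword removed from the pool.
      var rest = remaining.slice(0, i).concat(remaining.slice(i + 1));
      extend(query, rest);
    }
  }
  extend([], keywords);
  return results;
}

var two = queryStrings(["torture", "citizen"]);
// two holds: "torture", "torture citizen", "citizen", "citizen torture"

var four = queryStrings(["torture", "citizen", "syria", "rendition"]);
// four.length is 64; ten keywords would already give 9,864,100
```

Strictly speaking that growth is factorial rather than exponential, which is worse.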



By the way, in my original October post I noted as an aside: "search has become social software and we just have not noticed it yet. PageRank is social software in a crude form." As it turns out, Google's plug-in architecture was rolled out as part of a social search system called Google Co-op.

About this site

Programming is not a particularly interesting topic for my family or friends. Writing about programming also happens to be a Web cliche. And I am not a professional software writer, so it's not like my musings on software are going to be particularly useful to a broad audience.


Still, sometimes it is nice to get thoughts written down, in order to stop thinking about them, and in order to record them for future reference. I write software for personal use in Perl and Ruby.


So I am confining my technical writing to this little ghetto. By the way, "hack" can describe a person or a thing. Hmmm.


(Thanks as always to Blogger for the publishing software and OCF for the hosting.)