Tuesday, 2 September 2008

Google Chrome - what, no coffee?

Well, it was here when they said it would be, and I downloaded it. It installed, then crashed right at the end, just as the browser had started up and was importing settings and bookmarks from IE and Firefox; it looked like the bookmarks were responsible.

But, just like it says on the box, only the tab that was doing the imports fell over, and I was able to carry on and start browsing the web immediately. Later, I went to the options menu and got it to load the bookmarks again, and this time it completed without incident. So score +1 for Chrome.

So what about this blazingly fast Javascript engine then? I've deliberately refrained from running any official tests, on account of being lazy, but let's just say that I'm once again impressed by just how important network and server latency are to the browser experience. That is to say, working with Gmail and Google Reader, sure, it seemed a little bit faster at opening the pages, but even in these Javascript-heavy apps, it's the I/O that's the bottleneck.

I decided to look at plugin support. Flash/Flex is there as you might expect (what with having to support YouTube and all!) but, very surprisingly, Java was absent. I managed to find some pages with applets on them by going to the Sun website (what a blast from the past!) and got a message saying "No plug-in available to display this content". Here's one of the pages I tried, see what you get.

At first I was a little staggered. Could the status of Java applets have fallen so low that Google weren't even going to bother supporting Java in the browser? (And see here for what I found out about current antipathy to Java in the browser.) Well yes it could, I suppose. Maybe Google have simply concluded that Java in the browser has had its day.

I expect though that the explanation is simpler: Chrome is a beta after all, and there's no compelling need for Google to support applets from day one in the same way that YouTube makes it necessary for them to support Flash. Also maybe something technical about the way the JRE integrates with the browser makes it harder to support than Flash? Maybe. But Safari 3.1.2 (as kindly downloaded for me by iTunes when I wasn't paying attention during a product update) seems to have managed it without any problem. Hmm, this smells kinda bad, kinda fishy...

Of course, Google have got a JVM of their own, haven't they? Maybe instead of incorporating the Sun product, they'll simply port Dalvik to run inside Chrome? That would unify two of their platforms very nicely indeed, thank you. Android games running in Chrome tabs? I'll have some of that!

If I were Google though, I'd keep it very quiet if that was indeed my plan, since it would undoubtedly cause a massive outcry (assuming anyone still cares, which is moot, but I bet a lot of people who didn't really care would still enjoy complaining for its own sake). In fact, the best way to do it might be to release Chrome without any Java support at all, wait for annoyed voices to demand it, and then say something like "Well licensing restrictions mean we can't support yer actual Sun Java in Chrome, but we got something 'ere that's just as good, honest guv'nor."


Update: OK, panic over, Chrome does support Java! You just have to have the absolute latest, bleeding-edge development release, version 6 update 10. If you click on the toolbar menu and then on the Help submenu, it'll take you to a page where you can search for Java support, which in turn leads to a page explaining what's going on. Or just take my word for it and go to http://java.sun.com/javase/downloads/ea.jsp and download and install Java SE 6 Update 10. Phew!

Still think Android apps in Chrome would be a great idea though.

Google Chrome. OH. MY. GOD.

Apologies for the quality of this post. There's so much to say and my thought processes are just running all over the place as various connections are being made. I've just read about Google's new Chrome web browser tonight, it's all over the web.

My first reaction was, "How odd!" Why would Google want to bring out a new browser? There are new browsers appearing all the time, and it would be much more in keeping with G's modus operandi to date for them to simply help out with advice, code and a bit of cash here and there, rather than to up-end the whole apple-cart like this.

Then I read the 38-page cartoon that they sent out explaining things. And my second reaction was, "Oh. My. God."

It seems obvious now that development of current browsers was either not going in the right direction for Google, or just wasn't getting there fast enough. Things are scrappy. They're fragmented. Google have big plans for the browser, and it looks like they've decided to start bringing all the strands of their work together, so that we can begin to see the shape of what's coming.

Strands? Heck, let's change metaphor. It's like when the tide starts to come in on a nice warm beach. At first all you can see is tiny rivulets of water coming from all directions and going in all directions. It's only later you realise that THE SEA is on its way and your little spot in the sun is soon going to be under six feet of water (and yes, Microsoft, it is you on that towel).

So they've made all these little moves. And they looked a bit odd and a bit disconnected. Google Apps — a bit slow, a bit underpowered, but they would be, see, 'cos they're running in a browser. GWT — what's the point of a development environment that has you writing web apps like they were desktop apps? Gmail — nice example of what you can do with Ajax; was it written using GWT? Android — what browser does it use?

But now Google are bringing out Chrome, whose intent seems to be to run applications as complicated as the most complicated ones that you run natively on your operating system, and to run them just as fast (or at least, in the same ball-park). Hmm, Google Apps, they're going to be a bit snappier now, aren't they? Hmm, I can see the point of a big-iron development environment based on a typed language now! And Android, currently sporting the browser that Chrome is based on, will likely be running Chrome or a Chrome-alike in the next release (after the one that we still haven't had yet).

That's enough hot air and pontificating. The rest of this post is specific reactions to things in the cartoon, which you may not understand unless you follow the link above and read the cartoon.


They are using the WebKit code base. Not Mozilla. By my reckoning that makes about a million billion important new browsers built on WebKit, versus ... erm ... (I can't think of any) built on the Mozilla codebase. OK, so I'm using "important" in a very particular sense: "big", that is to say, backed by an organisation (probably a commercial company) and guaranteed a large user base. (And I know that there are lots of browsers based on Mozilla, but together they must have a user base approaching, what, 10,000 people?) [Yes, other than Mozilla itself and Firefox.]

Mozilla are #?*&ed! Now the flow of money from Google to the Mozilla foundation is not charity, it's a deal whereby Mozilla preferentially funnels its searches to Google. So that can stay in place. As long as Mozilla users search on Google, Mozilla can get money out of that deal, there's no sense in Google just killing it. So Mozilla is not #?*&ed immediately then, but stand by to see it lose market share vertiginously if Chrome is as good as Google thinks it's going to be.

Stand by also to see Microsoft scramble to match Chrome in terms of features. This comes at a particularly bad time for Microsoft, with IE 8 code very likely closed to new functionality, and the release only a few months away [GOOGLE SMACKS MICROSOFT, #1]. What do MS do now? Do they stick to the original release timeframe and release it as-is, and smart when nobody notices because Google released a better browser a few months back [and that's TOMORROW folks!] and everybody's using it? Or do they pull the release and desperately try to match Chrome, feature for feature?

Omnibox. I can see this running into trouble very quickly. This business of remembering what site-based search boxes you've used, and allowing you to reuse them by typing in a site identifier and then a tab and then your search terms? Think of the controversy caused by deep linking a few years back. This is an excellent way to cut a website's search page out of the loop. So now, instead of first going to Amazon's home page and having to skim over all the stuff they've kindly prioritised for you as your eye hunts for their search box, you'll go straight to their results page. Hmm. Site publishers are going to regard this as kidnapping their search boxes, and I would be surprised if there weren't a few legal challenges to it soon.

Interesting to see the places in the cartoon where they have obviously decided to put the wind up the competition. Some of them really made me chuckle.

On page 4 they say that each tab is a separate OS process. If memory serves, Unix/Linux processes used to be lighter weight than Windows ones. Assuming that's still the case, Chrome may be a bit sprightlier and more performant on Linux than on Windows [GOOGLE SMACKS MICROSOFT, #2] — just the thing for those Linux-powered net-tops that are springing up all over the place.

On page 5 they point out that this means that the sort of badly-behaved page that used to make your entire browser crash will now only affect the one tab. This must happen to me about once a day at least: four separate browser windows open, themed for work-related stuff (several pages of documentation from assorted sites), news (Google Reader for scanning, then I open up any interesting stories in their own tabs), mail, and one for anything else; that's twenty or thirty pages all open at once, some of them regularly updating in the background. When a bad page takes down that lot it's annoying and I thank heaven for Firefox's auto-reopen feature. When the bad page is really bad, and Firefox goes down again straight away as soon as it tries to reopen it, that's when I get annoyed.

Pages 9-11 must be putting the fear of God into Microsoft right now. Google are showing off how they can push automated Chrome testing out over their famous distributed server network, testing tens of thousands of web pages per hour [GOOGLE SMACKS MICROSOFT, #3] and making sure that they cover them in order of importance, as indicated by their very own PageRank algorithm.

Page 13 is very interesting. They mention no names, but I immediately thought of Adobe's Tamarin VM for Javascript, donated to Mozilla a while back. Were they thinking of Tamarin? Did they look at it and reject it, or was it not open source back when they decided to write one themselves? I need to look at the timescale for that more carefully. One thing: Tamarin is built for the version of Javascript that didn't make it into the new standard, and work is apparently under way to convert it for the version that did. Good luck with that. Google probably thought it was better to start from scratch [GOOGLE SMACKS ADOBE]. And if the people who did the new Javascript VM are more or less the same ones that did the Dalvik VM for Android, then Google probably thinks it can do a damn good job on its own, and rightly so.

Interesting also that they seem to be JIT-compiling Javascript to machine code. That's been a perilous way to go in the past, partly because of what can happen with variables. Javascript variables are untyped, but the values that they hold do have types (number, string, object, ...). Now there's nothing to stop me writing a loop where the value held in some variable changes type on each pass through, and in the past that's either killed efforts to compile Javascript or put serious constraints on the efficiency of the resulting code (by forcing it to be too general).
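To make that concrete, here's the sort of type-bending loop I mean (the names and values are invented, obviously):

```javascript
// A variable whose value changes type on every pass: perfectly legal
// Javascript, but poison for a compiler trying to emit specialised
// machine code for the loop body.
function typeUnstable(n) {
  var v = 0;               // starts life as a number
  for (var i = 0; i < n; i++) {
    if (i % 2 === 0) {
      v = i * 2;           // a number on even passes
    } else {
      v = "pass " + i;     // a string on odd passes
    }
  }
  return v;
}
```

A JIT that had specialised this loop for numbers would have to throw that work away on the very next pass, or fall back to completely general code.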

In this context, it's especially interesting to look at the latest release of the Google Web Toolkit (GWT). GWT, you will remember, lets you write your web application in Java, a heavy-duty, strongly-typed language, which GWT then "compiles" to Javascript for actual execution in the web page. The release notes for the latest version of GWT observed that this "compilation" phase effectively throws away the valuable type information in the transformation from typed Java to untyped Javascript, and that in previous releases this hurt performance. But the current release takes advantage of the fact that any Javascript variable in a web page produced by GWT is guaranteed to have come from a typed Java variable! In other words, you can guarantee that that sort of type-bending naughtiness isn't going to happen in a respectable GWT application, so you can do type inference based on the first value of a variable that you see... And then the release notes went on to sundry improvements that were beyond my understanding, because all I could think of was that Javascript was still untyped.
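By way of contrast, here's my own guess at the general shape of Javascript that a Java-to-Javascript compiler would emit — an illustration, mind, not actual GWT output: every variable holds values of a single type for its whole life, so a VM could infer the type from the first assignment and stick with it.

```javascript
// Type-stable code: because it came from typed Java, each variable
// only ever holds one type of value, so a VM can safely specialise
// the whole loop for numbers.
function sumOfSquares(n) {
  var total = 0;             // always a number
  for (var i = 0; i < n; i++) {
    total = total + i * i;   // still a number on every pass
  }
  return total;
}
```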

So what's the betting that GWT-produced web applications will run especially well in Chrome, because of the good behaviour of their variables (and, no doubt, for many other reasons way above my head)?


Michael Arrington at TechCrunch says:

Make no mistake. The cute comic book and the touchy-feely talk about user experience is little more than a coat of paint on top of a monumental hatred of Microsoft.

I hope this doesn't mean that MS have got so far under Google's skin that they are letting hatred guide their actions. That would be a colossal mistake. So far, Google have been the nimble players. They are the ones who, in every case [May not be true. I have a terrible memory!], have led the way with an unexpected paradigm-shift, leaving others scrambling to catch up. Letting Microsoft-hatred guide your actions is a mistake other companies have made in the past, and it's ruined them because it hands the initiative to MS, who are not slow to capitalise on the opportunity.


Update: Dave Methvin over at Information Week points to where Google may have got some of the technology they are using to sandbox Chrome tabs.

Saturday, 30 August 2008

Single-child families and population decline

There's worry in certain circles right now about population decline in much of Europe. Families are becoming smaller as investment in individual children (supposedly) continues to rise. And if average family size is below, what is it, 2.1, then voila, it's below replacement level.

I wonder if there's another factor that's contributing, making the effect larger and more self-reinforcing. With very few exceptions, all the only-children I know are single in their adult lives, and all those who had at least one brother or sister are now partnered, again with very few exceptions.

If this is true, then as average family size decreases, logically, the number of single-child families increases as a proportion of the whole. If most of those children are not going to reproduce, then you have a runaway decline.

Those enterprising cloud warriors

It seems that no sooner does a technology appear than it is subverted by those who want to abuse it.

A while back I started renting a virtual server to which I was (am) going to move a website that I own. The site is currently on a small hosting account, and it's starting to need a bit more than that. Anyway, I didn't do anything for a while, and just let the virtual server gather dust. Then, the other night, I got a burst of energy and started to look at what I could port over first.

As I was doing that, I was also idly browsing through the log files in /var/log, and I noticed an enormous messages file and an enormous secure file. Their backups were big too... This is a server that, while it's on the internet, didn't even have a domain name until the last few days; it was just a raw IP address. Why were those files so big?

It seems that someone is trying an automated dictionary attack on the server. As far as I can tell, each login attempt, via ssh, is supplying a username but no password. Each name is being tried once, and then it's on to the next. So it's looking for unprotected accounts rather than trying to guess passwords. (I think a "dictionary attack" really refers to using a dictionary to guess passwords, but I'll stick with the term, since they must be storing their list of usernames in a dictionary too.) It seems fairly primitive, but it's still immensely worrying, especially since I really don't want to have to become a Linux security expert.

Each attack starts at about the same time every evening, and lasts about eight minutes. That's it, but in that time it's making a couple of login attempts every second. Of course, the source IP address of each attempt is logged, so I've been busily adding them to /etc/hosts.deny whenever I see a new one. Last night was quiet, the first time in a couple of weeks, apparently. Time will tell whether just adding addresses to hosts.deny will work; in the short run maybe, but I rather doubt it in the long run.
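For what it's worth, here's a sketch of the part I've been doing by hand: pulling the unique source IPs out of the sshd log lines so they can go into /etc/hosts.deny. The log line format matched below is an assumption based on a typical /var/log/secure, so check your own logs before trusting the pattern.

```javascript
// Extract the unique attacking IPs from sshd log lines.
// Assumes lines shaped like:
//   "Sep  2 21:04:01 host sshd[1234]: Invalid user oracle from 75.101.154.0"
function attackerIps(logLines) {
  var seen = {};
  var ips = [];
  var pattern = /sshd\[\d+\]: (?:Invalid user|Failed password for) .* from (\d+\.\d+\.\d+\.\d+)/;
  for (var i = 0; i < logLines.length; i++) {
    var m = logLines[i].match(pattern);
    if (m && !seen[m[1]]) {
      seen[m[1]] = true;
      ips.push(m[1]);    // one candidate hosts.deny entry per unique source
    }
  }
  return ips;
}
```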

So what has any of this got to do with "the cloud"? Just that the attacking IP address sometimes resolves to an Amazon AWS instance, ec2-75-101-154-0.compute-1.amazonaws.com to be exact. Either some nerveless criminal is renting EC2 instances with the intent of using them specifically to crack whatever insecure hosts they can find or, perhaps more likely, some of the startup images that EC2 instances use — and these images are huge, containing a whole operating system as well as whatever applications they are going to run — have been compromised.

Looking at the way Amazon run EC2, they provide a number of basic instance images, but there are a lot of others mentioned on their forums, created by "helpful" users and containing just the right applications for people to find them attractive. An obvious security hole you might think, but people must be using them, or they wouldn't exist. Maybe Amazon need to virus-scan their EC2 images before starting them up, what a horrible thought.


Update: I see that at least one other person has noticed the same thing happening too: David in Sweden. I wonder if he ever got an answer out of Amazon?

Saturday, 23 August 2008

Amazon Elastic Block Store

This has been presented as a small, incremental step forward for Amazon's cloud-based computing. It's not. It's a big deal.

First, some grossly-simplified terminology. In the beginning there was Amazon Elastic Compute Cloud (EC2), which is your server(s). There was also Amazon Simple Storage Service (S3), which you can think of as your network-attached file system. A little later they also brought out Amazon SimpleDB (though it's still in beta a.t.m.), which is your database, and Amazon Simple Queue Service (SQS), which is your messaging infrastructure. So what's the problem? Why do we need EBS?

Well I said "servers" above, but it's more accurate to think of EC2 instances as being like a CPU + RAM, where the RAM has enough space for a virtual disk. Your server does have some disk space attached ("instance storage" in the lingo) to which it can write files, but remember that the whole point of EC2 is that servers can get started automatically in response to demand, and that when demand falls, they can get automatically switched off. At that point, anything they've written to their local disk just disappears. I suppose it's the same as an object going out of scope and getting garbage-collected.

So if your web application, running on your EC2 instance, has done some work and updated some records, then it needs to salt them away somewhere outside the server-instance itself, and that's where S3 and SimpleDB come in. And of course, once you realise that your EC2 instance is probably going to need to save some data, it's but a step to see that it's also probably going to have to initialise itself, once it's up and running, from a data store too. So data passes between EC2 and S3 and SimpleDB in both directions.

So again, where's the problem? Well, I said that S3 was like a file system. It is; it's like some old, clunky PC-DOS 1.x floppy-disk-only file system, before you had directories and certainly before you had subdirectories. Each user account gets, I think it's 100 buckets (they're the floppies), and each of those buckets can hold vast numbers of objects, but all at the top level, and all in the same namespace. If you want to impose any kind of structure on top of that, then you must do it by appropriately decorating the names (keys) that you use. It's as though your whole hard disk could only have a single folder and you had to keep everything organised by using tremendously long filenames.
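To illustrate the key-decorating trick (the "files" here are invented): you fake a directory tree by putting the path into the key itself, and "listing a directory" becomes a prefix match.

```javascript
// S3's namespace is flat, so the keys themselves carry the structure.
var keys = [
  "site/images/logo.png",
  "site/images/banner.png",
  "site/css/main.css"
];

// "Listing a directory" is then just a prefix match over all the keys.
function listPrefix(allKeys, prefix) {
  return allKeys.filter(function (k) {
    return k.indexOf(prefix) === 0;
  });
}
```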

Again, I said SimpleDB was the database, but it's not an SQL database. In fact it doesn't even have tables. It's more like a set of objects and a set of attributes, where any object can have (or not have) any of those attributes, and you can search on the attributes. So imagine that you could add some kind of metadata tags to S3 files and search on the tags, and that's SimpleDB.
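Here's a toy model of the idea, with invented data — my illustration of the concept, not the actual SimpleDB API: items are just bags of attributes, there's no schema, and a query is a match on attribute values.

```javascript
// Schemaless items: each one carries whatever attributes it likes.
var items = [
  { name: "photo1.jpg", type: "image", event: "holiday" },
  { name: "notes.txt",  type: "text" },                    // no "event" attribute at all
  { name: "photo2.jpg", type: "image", event: "wedding" }
];

// A "query" is just: give me every item whose attribute has this value.
function withAttribute(all, attr, value) {
  return all.filter(function (item) {
    return item[attr] === value;
  });
}
```

Note that notes.txt simply lacks the event attribute altogether, which is exactly the kind of raggedness an SQL table wouldn't allow.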

I'm not criticising them, the constraints of replication, timeliness, availability etc. etc. very likely imposed all these limitations. And in an odd kind of way, the objects you get in S3 and SimpleDB remind me more of the sort of constructs you get in programming languages like Java or Javascript, so you can see that as building blocks they might be very powerful and useful.

But it's probably true to say that the software that would run well in this environment would have to be written or adapted specially for it. A lot has been written too, in the couple of years that EC2 and S3 have been available. But there's a whole class of more traditional applications that haven't, and which won't work in this environment, and they represent 99.9999% of all the software that's out there.

So now Amazon introduce EBS, and you can think of that as an infinite array of network-attached hard disks (but this time they are proper ones, with subdirectories and everything :) ). Previously, if I wanted to store my web application's data in say a PostgreSQL database, to get the benefit of SQL based access, I had to put that database on the EC2 instance's hard disk (instance storage) and had to make sure that the data in that database got pickled to S3 before the instance was shut down. Now I can just put the database on the EBS disk and forget about it.

Of course, for the very largest, most fault-tolerant and distributed of applications, EC2 + S3 + SimpleDB + SQS was probably architecturally the right way to go anyway; the sheer size of such applications might make exactly the same constraints necessary. The difference is in the area where I live: the web site that can fit comfortably on a single server, or even a single virtual server, with an SQL database on the same machine or on a different machine on the same network. That's to say, the space where the smallest to the medium sized database driven websites live.

These are well served at the moment by the numerous small hosting providers who'll give you a virtual server or two, or your own box. But if your web site does well, then the upgrade path can be difficult and expensive. And if your web site very suddenly does well then you likely won't be able to resource-up in time and it may fall over. And the pricing models are very coarse-grained in comparison to Amazon's.

For those web sites, Amazon EC2 wasn't an option because they would have had to rewrite their data layer and still end up with something that wasn't as flexible as SQL. But with EBS, those kinds of sites can be ported directly.

Monday, 11 August 2008

South Ossetia

It's anyone's guess what the Russians' ultimate goals are, but I suppose there are three possibilities, in ascending order of seriousness.

First, simply to defend the (independence of the) South Ossetians, which is their ostensible motive. If that's the case, one might reasonably ask if invading Georgia is the right way to go about it. However the ultimate consequences might not be all bad.

Second, to annexe South Ossetia. No doubt pliant S.O. nationals could be found to publicly thank them for doing so. This would definitely qualify as adventurism, and its successful accomplishment would likely increase the Kremlin's appetite for more such moves - Abkhazia probably being the first.

Third, a full-scale and prolonged occupation of Georgia, leading eventually perhaps to: the incorporation of large tracts of its territory (South Ossetia, Abkhazia, and the corridor between them) into Russia; the emergence of rump Georgia as a vassal state; or even the complete annexation of Georgia. This would be extremely bad, and would likely lead to instability in the surrounding areas.

Something that would worry me very much if it happened after successful Russian action in S.O. (defined as the second or third possibilities above) would be a coup in Serbia, followed by an invitation to Russia to safeguard Serbian borders. Conflict between Russia and the EC would seem likely at that point, followed almost immediately by a Russian energy embargo and a collapse of European will.

Wednesday, 6 August 2008

Absinthe

Good to see that Absinthe is making a comeback. Interesting that the American market is where all the new action is, and that the products available there are higher quality than in Europe:

To date, there are four brands on US shelves: Lucid (Breaux's formula), Kuebler, Green Moon, and St. George Absinthe Verte. "The US is lucky in that its first absinthes are high-quality products, distilled from whole herbs," Breaux says. "In the European market, 80 to 90 percent is industrial junk."

I want my grave-photo to show me sitting at a table, having just finished an excellent repast, smoking a Cuban cigar and drinking a fine Absinthe.*

* Does one drink Absinthe after dinner or before? Perhaps I could make do with a fine champagne cognac.