Design Pattern: Wrapper with Composable Actions

Name And Classification

Wrapper with Composable Actions. An extension to the Wrapper pattern that allows operations to be composed on wrapped objects.

This pattern is well known in another context, but its usual name and presentation are not very descriptive, so I'm deliberately avoiding them until the end.

Problem

An object has been placed in a wrapper but it must still be possible to use it as input to a chain of operations.

Context

As well as the common uses for wrappers (changing one interface to another), there are other situations where it’s useful to wrap objects, for example:

  • if you need to keep track of a result plus extra information about the actions you’ve carried out as you perform them.
  • if the value you are wrapping is not simple in some way – perhaps it is going to be computed asynchronously and the wrapper represents a promise to provide the result.
  • if you want to add decision points or extra actions based on data not included in the wrapped object, but available in the wrapper that should be done on every operation. (This is like the decorator pattern).
  • if you want to indicate that something fundamental has changed, you can use a wrapper object that doesn’t provide the ability to get at the original object.

Actions (operations that take a value) are usually written to operate on the unwrapped versions of objects, which means that, in the general case, wrapped objects can be difficult to compose actions on. This pattern specifies two functions that can be implemented to make composing actions on wrapped objects straightforward.

In general, this pattern is useful when considering a sequence of operations where each step depends to some extent on the previous value and produces a new value. Because of this it is commonly employed in functional languages; its implementation in languages without first-class functions is generally awkward. This pattern can provide extra value in languages with a strong (stronger than Java's) type system.

Solution

To ensure composability, two functions need to be provided, wrap and chain (although their exact names can vary based on situation). I have shown them here on the interface Wrapper:

/* This is analogous to an operation that takes something of type T and returns something of type A, except
 * that A is provided in wrapped form.
 * In a language with first-class functions, Action describes a function rather than an object interface,
 *    i.e. T -> Wrapper<A>
 */
interface Action<T, A> {
	Wrapper<A> call(T value);
}

interface Wrapper<T> {
	// This is likely to be provided as a static method or constructor.
	Wrapper<T> wrap(T value);

	/* This is normally more complex than simply returning the result of the action, as there
	 * is likely to be some behaviour or state embedded in *this* Wrapper that needs to be appropriately combined
	 * with the Wrapper that results from executing the action.
	 */
	<A> Wrapper<A> chain(Action<T, A> action);
}

Example 1: Promises

Promises are wrappers for values that may not yet be available because they are still being computed. Promises are typically instances of this pattern. Here is a (much simplified) example in javascript:

function Promise() {
	this.onResolve = [];
};
Promise.prototype.chain = function(thenFunc) {
	// return a new promise that will resolve at the same time
	// and with the same value as the result of the passed function
	var result = new Promise();
	this.onResolve.push(function(val) {
		var resultingPromise = thenFunc(val);
		resultingPromise.chain(function(val) {
			result.resolve(val);
		});
	});
	return result;
};
Promise.prototype.resolve = function(value) {
	// tell all functions waiting for a value that we have one
	this.onResolve.forEach(function(then) {
		then(value);
	});
	// clear down (to avoid memory leaks)
	this.onResolve = null;
	// from now on, chain gets called immediately with the value.
	this.chain = function(then) {
		return then(value);
	}
};
Promise.wrap = function(value) {
	var result = new Promise();
	result.resolve(value);
	return result;
};

This can then be used like so:

function parseJSON(json) {
	return Promise.wrap(JSON.parse(json));
};

function getURL(url) {
	var result = new Promise();
	var xhr = new XMLHttpRequest();
	xhr.onreadystatechange = function() {
		if (xhr.readyState == 4 && xhr.status == 200) {
			result.resolve(xhr.responseText);
		}
	}
	xhr.open("GET", url, true);
	xhr.send();
	return result;
}

function postStatistics(obj) {
	var xhr = new XMLHttpRequest();
	xhr.open("POST", "http://statserver", true);
	xhr.send(obj.stats);
	return Promise.wrap();
};

Promise.wrap("http://myserver/data.json")
	.chain(getURL)
	.chain(parseJSON)
	.chain(postStatistics);

The big win here is that actions that would otherwise have had to be nested can be composed into a data processing chain, and synchronous and asynchronous actions can be composed together transparently.

Example 2: History

Supposing you want to store the steps taken in a computation as you go along. You can use this pattern to achieve this:

function History(val) {
	this.val = val;
	this.commands = [];
}
History.prototype.chain = function(action) {
	var result = action(this.val);
	result.commands = this.commands.concat(result.commands);
	return result;
};
History.wrap = function(val) {
	return new History(val);
};
History.prototype.toString = function() {
	return this.commands.join("\n")+"\n"+this.val;
};

With some actions defined:

function addTen(val) {
	var result = History.wrap(val + 10);
	result.commands.push("added ten");
	return result;
}

function divideTwo(val) {
	var result = History.wrap(val / 2);
	result.commands.push("divided by two");
	return result;
}

You can now do a calculation and then print out the result and the steps taken to achieve it like so:

History.wrap(8).chain(addTen).chain(addTen).chain(divideTwo).chain(addTen).toString()
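
Tracing that chain through by hand, the string you get back should look something like this:

added ten
added ten
divided by two
added ten
24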

This time, extra information was threaded through the whole process, but the individual steps in the chain remained composable.

Extensions

Often there is already a body of functions that operate on unwrapped values and return unwrapped values. An 'apply' function can be defined in terms of the two functions already added, allowing them to be used easily:

Wrapper.prototype.apply = function(func) {
	return this.chain(function(val) {
		return Wrapper.wrap(func(val));
	});
};
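
For example, using the History wrapper from Example 2 (a sketch: here apply is defined directly on History rather than on a shared Wrapper prototype), an ordinary function that knows nothing about wrappers can be dropped into a chain:

History.prototype.apply = function(func) {
	return this.chain(function(val) {
		return History.wrap(func(val));
	});
};

function triple(val) {
	return val * 3;
}

History.wrap(8).chain(addTen).apply(triple).toString()
// -> "added ten" on the first line, then 54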

Big Reveal

As you may have realised already, this pattern is also known as ‘Monad’ in functional programming, where the wrap function is called ‘unit’ (or ‘return’) and the ‘chain’ function is called ‘bind’ (or >>=).

Despite the fact that Monads are commonly associated with Haskell and functional programming, they’re used and useful in many places and I think describing them as a Design Pattern is much more accessible than describing them by their similarity to category theory constructs.

I came to this conclusion after realising that I needed promises to make a reasonable version of the javascript FileSystem API (which is available on github) and then suddenly realising that Promises are actually Monads, something that nobody had mentioned to me before.


Dancing-links: Understanding how the algorithm works

I’ve been writing a little recently about a program I wrote to solve sudoku and other problems using the dancing links algorithm.

This was how I explained it when someone asked for information about how it worked:


Firstly you have to understand Exact Cover. An exact cover problem is a problem where you're given a set of choices and a set of constraints, and your challenge is to select some of the choices such that every constraint is satisfied exactly once.

For example, consider the case of someone creating their ice dance routine. They have a number of tricks that they need to show the judges, and don’t want to perform any trick more than once. They have a number of sequences which are groups of tricks that can be put together and they want to choose the ideal selection of sequences to cover all the tricks once. In this example, the constraints are that they must perform every trick. The choices are the possible sequences they could incorporate into their routine.

A nice way to represent problems of this sort is to draw out a table where the constraints are columns and the choices are rows, and you have a big X in cells where a particular choice fulfills that constraint.

As it turns out, given the right constraints and choices, sudoku can be described as an Exact Cover problem. It does involve 729 rows and 324 columns (729 choices: one for each possible digit in each of the 81 cells; 324 constraints: each cell filled, plus each digit appearing once per row, once per column and once per box), which is perhaps why people tend not to use this technique when solving sudoku by hand.


Ok, assuming you’ve got that, now you need to understand Algorithm X. Knuth said of it “Algorithm X is simply a statement of the obvious trial-and-error approach. (Indeed, I can’t think of any other reasonable way to do the job, in general.)”. You can actually work through this by hand if you have pencil and eraser and the table drawn out in pen as I described in the previous section. Here’s my description of algorithm X:

  1. If your table has no columns, stop – you’ve solved it. If you’ve got a partial solution stored, then it’s actually a real solution, return it.
  2. Select a column (representing a constraint).
  3. Find a row with a cross in that column (representing a choice that fulfills that constraint). Add it to some kind of structure where you’re storing potential solutions. If you can’t find a row, give up – there are no solutions.
  4. Assume that the row you found in 3 is in the solution, so remove all the columns that it has an X in. While removing each of those columns, also remove all the rows that have an X in that column (because that constraint is already satisfied, so you're barred from choosing anything that would satisfy it again).
  5. Now recursively try to solve the reduced table. If you can’t, remove the row you tried from the potential solution structure, restore all the rows and columns you removed in steps 3 and 4 and try a different row. If you run out of rows, then give up – there are no solutions.
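
If it helps to see that written down as code, here is a naive sketch in javascript (the table representation is my own assumption for illustration, and it copies the reduced table at each step rather than removing and restoring anything in place, which is exactly the cost that dancing links avoids):

// columns is an array of constraint names; rows is an array of
// {name: ..., cols: [...]} objects saying which constraints each choice fulfils.
function algorithmX(columns, rows, solution) {
	solution = solution || [];
	if (columns.length === 0) {
		return solution;                               // 1. no columns left: the partial solution is real
	}
	var col = columns[0];                              // 2. select a column
	var candidates = rows.filter(function(row) {       // 3. rows with a cross in that column
		return row.cols.indexOf(col) !== -1;
	});
	for (var i = 0; i < candidates.length; i++) {
		var choice = candidates[i];
		// 4. remove every column the chosen row covers, and every row that
		//    has an X in any of those columns (including the chosen row itself)
		var reducedColumns = columns.filter(function(c) {
			return choice.cols.indexOf(c) === -1;
		});
		var reducedRows = rows.filter(function(row) {
			return row.cols.every(function(c) {
				return choice.cols.indexOf(c) === -1;
			});
		});
		// 5. recursively try to solve the reduced table
		var result = algorithmX(reducedColumns, reducedRows, solution.concat([choice.name]));
		if (result) {
			return result;
		}
	}
	return null;                                       // no row worked (or none existed): give up
}

// the ice dance example: constraints are tricks, choices are sequences
algorithmX(
	["spin", "lift", "twizzle"],
	[{name: "sequence A", cols: ["spin", "lift"]},
	 {name: "sequence B", cols: ["twizzle"]},
	 {name: "sequence C", cols: ["lift", "twizzle"]}]
);   // -> ["sequence A", "sequence B"]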

Now that you understand that, you can understand dancing links. Dancing Links is a way of implementing that algorithm efficiently. The key point of dancing links is that in a linked list, when you remove a node (which can be done efficently by modifying the pointers of its neighbours), the node that you’ve removed has all the information you need to add it back to the linked list (in the case that it turns out you were wrong when you guessed it was part of the solution). That plus the fact that if you make all your linked lists circular then suddenly you lose a lot of special cases is pretty much all dancing-links is.
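
Expressed as code, the central trick is tiny (a sketch, assuming doubly linked nodes with prev and next pointers):

// unlink a node: its neighbours forget about it, but it still remembers them
function removeNode(node) {
	node.prev.next = node.next;
	node.next.prev = node.prev;
}

// so if the guess turns out to be wrong, putting it back is just as cheap
function restoreNode(node) {
	node.prev.next = node;
	node.next.prev = node;
}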

Dancing-links: The JS ecosystem

I recently created a sudoku and polyomino solver using javascript, node.js, jasmine, jsdoc, git & github, cloud9 and heroku.

These programs and services are not perfect, but the fact that you can slot them together to provide a complete, production-ready, web-based development infrastructure for free is amazing.

Javascript

When you’re working on javascript code, there are a number of environments you might want to be able to deploy to. For Dancing-links, I wanted the code I wrote to work in the browser, in node.js and, to allow me to click ‘debug’ in eclipse, I wanted it to work in Rhino too.

As far as I’m concerned Javascript is the only real write-once-run-anywhere language.

Having said that, there are also a number of ways in which it is deficient, and perhaps the worst is loading the code you depend on. Each of the three environments I mentioned requires a different solution. I approached this by creating a piece of library code that fakes up the way node.js does it, for use in the browser and in Rhino: a sort of fake 'require' function. In node.js the order of inclusion doesn't matter, because everything is cached, and if a required module isn't already there it goes away and gets it.
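
A minimal sketch of that idea (not the actual Utils.js, and the function names here are made up) looks something like this: each file registers what it exports against a path, and require simply hands it back, so code written against node's require keeps working in the browser and in Rhino:

var modules = {};

function provide(path, exported) {
	modules[path] = exported;
}

function require(path) {
	if (!(path in modules)) {
		throw new Error("module must be included before it is required: " + path);
	}
	return modules[path];
}

// e.g. at the bottom of CircularList.js:
//     provide("./CircularList", CircularList);
// and in code that uses it:
//     var CircularList = require("./CircularList");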

Unfortunately, the way most people write javascript, the order does matter in the browser, which means you can't just concatenate files together without understanding what they mean. To reduce this complexity, and make it easier to use the result in more places, I've made a concerted effort to remove ordering requirements. That means never actually constructing anything at 'definition time', which in turn means removing patterns that look like statics and singletons. Where you can't avoid them, lazy instantiation can get around some of the problems, as sketched below.
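
The lazy instantiation trick looks something like this (a sketch; Registry is a made-up class standing in for whatever would otherwise have been a singleton built at definition time):

// nothing is constructed when this file is loaded, so include order stops mattering;
// the shared instance only comes into existence the first time somebody asks for it.
var getRegistry = (function() {
	var instance = null;
	return function() {
		if (instance === null) {
			instance = new Registry();   // Registry is assumed to be defined by the time this runs
		}
		return instance;
	};
})();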

The other problem that crops up is inheritance. I allow myself the luxury of having a single ordering requirement: Utils.js will be included before anything else. In Utils.js I fake up require where it doesn’t exist and also provide an inheritance mechanism that sets up the inheritance prototype chain immediately or sets a trigger to set up the inheritance mechanism when the super class is included. This is a first cut at this, so I’m not completely happy with it yet, but I think something like this approach is the right one.
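
The shape of that mechanism is roughly as follows (a sketch only; the real Utils.js differs and all of the names here are invented):

var waitingForSuperclass = {};   // subclasses that arrived before their superclass

function linkPrototypes(child, parent) {
	var ownMethods = child.prototype;
	child.prototype = Object.create(parent.prototype);
	for (var name in ownMethods) {
		if (ownMethods.hasOwnProperty(name)) {
			child.prototype[name] = ownMethods[name];   // keep methods already defined on the child
		}
	}
	child.prototype.constructor = child;
}

function extend(child, superclassName, knownClasses) {
	if (knownClasses[superclassName]) {
		linkPrototypes(child, knownClasses[superclassName]);   // superclass already included: wire up now
	} else {
		// superclass not included yet: set a trigger for when it arrives
		(waitingForSuperclass[superclassName] = waitingForSuperclass[superclassName] || []).push(child);
	}
}

function defineClass(name, constructor, knownClasses) {
	knownClasses[name] = constructor;
	(waitingForSuperclass[name] || []).forEach(function(child) {
		linkPrototypes(child, constructor);                    // wire up anyone who was waiting
	});
	delete waitingForSuperclass[name];
}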

node.js

Node.js makes writing servers extremely easy. It has a large library of modules publicly available and easily accessed through the node package management tool.

It can be a bit of a shock for people used to the Java ecosystem though. Firstly in terms of cross platform, as things stand at the moment windows is definitely a second class citizen. Many important APIs don’t work correctly on windows – for example requesting the inode of a file will always return 0, which breaks code that uses that to check to make sure it doesn’t process files more than once. Jasmine used to fail on windows because of this, but they have graciously accepted my patch to use a library that doesn’t fall for this problem. Another nasty issue I hit was that the API for watching a filesystem for changes that is best specified and should be most compatible (because it uses a fallback behaviour where it is not available) is completely unsupported on windows – it just throws an exception telling you to use a much less specified API:

Providing filename argument in the callback is not supported on every platform (currently it’s only supported on Linux and Windows). Even on supported platforms filename is not always guaranteed to be provided. Therefore, don’t assume that filename argument is always provided in the callback, and have some fallback logic if it is null.

Like what? If you aren’t passed a filename how are you supposed to deal with a rename? The documentation is silent on this matter.

The fs.watch API is not 100% consistent across platforms, and is unavailable in some situations….. If the underlying functionality is not available for some reason, then fs.watch will not be able to function. You can still use fs.watchFile, which uses stat polling, but it is slower and less reliable.

…except of course you can’t, because on windows it just throws an error. Maybe if you were writing the code yourself this wouldn’t be quite so big an issue, but some of that large library of thirdparty modules I mentioned require this functionality to work.

My solution was to monkey patch it:

var fs = require('fs');
var os = require('os');

if (os.platform() == 'win32') {
	// change fs.watchFile so that it doesn't throw an error on windows.
	
	fs.watchFile = function(filepath, callbackfunc) {
		var old = fs.statSync(filepath);
		fs.watch(filepath, function() {
			fs.stat(filepath, function(err, newStat) {
				callbackfunc(old, newStat);
			});
		});
	};
}

which is of course far from ideal, but works well enough for my use case. Another issue: the documentation talks about filenames and paths, but the code actually expects stat objects.

All this highlights another difference between the node community and other communities you may be familiar with. You will need to look at the source code of some of the modules you use. That’s just the way it is. Fortunately most of them are pretty reasonably coded. A common complaint of mine is that modules that expect you to use them from the command line tend not to provide a reasonable programmatic way of invoking them, and many tool-like modules don’t expect to be run more than once in the same VM and don’t provide the necessary API to clean up.

Node.js is amazing for writing servers very easily (and for integrating with some of the other tools I’m going to mention later), but I used it in dancing-links for two main reasons – the speed (much faster than rhino in eclipse) and as a build language.

When you are writing javascript you typically want to see it in a browser. You want the manual part of the build process to be at most control-s in the js file you’re editing and F5 in the browser you’re viewing your page in. However, you generally want a bunch of other stuff done in between the two, potentially like concatenation of separate js files, building the jsdoc, running the tests, obfuscation, perhaps building the .css files from some other format like less/sass/stylus. Then you need to serve all this to the browser. Setting up a node server to do all this is relatively easy (less than 70 lines) using mature libraries like express and connect-assetmanager.
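
The skeleton of such a server might look something like the following (a sketch, not the actual dancing-links build file; the middleware hook is just a placeholder for the concatenation/jsdoc/test steps):

var express = require('express');
var app = express();

app.use(function(req, res, next) {
	// placeholder: regenerate concatenated js, jsdoc, spec output etc. here
	next();
});
app.use(express.static(__dirname + '/public'));   // then serve the results

app.listen(3000);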

For less webby builds, you might prefer to use Travis CI.

Node provides the node package manager, npm which makes it easy to declare and grab your dependencies as well as publish your own code to the public repository.

Jasmine

I’m not a great believer in making test code read like English. It seems that everyone is trying to do that these days with various contortions of the host language. I particularly hate fluent interfaces where the normal semantics of what functions and objects mean are broken. Code is code and should read like code. What I do like unequivocally though is test code that generates an English description of the code under test.

A CircularList,
  when newly created without any data,
    is empty.
    and a data item is pushed after it,
      toArray returns an array with only the new item.
      is not empty.
  when newly created with some data,
    has a next and a previous of itself.
    toArray returns an array with only itself.
    is not empty.
    and a data item is pushed after it,
      toArray returns an array with itself and the new item.
      returning a node that contains the data item
  when created with an (empty) header node and 5 data items,
    forEach calls its callback for all items (and their nodes) until false is returned.
    and the third node is hidden,
      toArray returns only the non hidden items.
      calls the onNodeHidden function once on the listener
      then restored,
        toArray returns all the data items.
        calls the onNodeRestored function on the listener
      and hidden is called a second time, it makes no difference;
        toArray returns only the non hidden items.
        calls the onNodeHidden function once on the listener
        then restored,
          toArray returns all the data items.
          calls the onNodeRestored function on the listener
    and a two item chain is spliced into it after the third node,
      inserts the new chain into the right place.

One place where jasmine is a bit of a pain is that you would really like everything at the higher nesting levels to be recreated for every test (as JUnit does for instance variables); instead, the lower nesting levels share things defined higher up. This means that to be safe you should, at most, declare variables in the outer scope, and do all the actual instantiation in a beforeEach().
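
In practice that looks something like this (a sketch; isEmpty and the constructor arguments are assumptions about the CircularList API):

describe("A CircularList", function() {
	var list;                          // declared once in the outer scope...

	beforeEach(function() {
		list = new CircularList();     // ...but instantiated fresh before every spec
	});

	describe("when newly created without any data", function() {
		it("is empty", function() {
			expect(list.isEmpty()).toBe(true);
		});
	});
});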

What I’d like to do is generate HTML from the tests and serve the generated ‘spec’ with the documentation.

I did look at a number of other test frameworks, like Buster.js. My ideal requirements are

  • Can be run entirely in node.
  • Can be run entirely in the browser (but not capture mode) which means it can’t rely on node.js require behaviour.
  • Can be run by my webserver every time a file changes.
  • Creates a nice, readable spec document.
  • Works on windows as well as other platforms.

jsdoc

It’s lovely to have jsdoc automatically built from your code and served from the build web server. My experience is that if you aren’t doing something like this it’s very difficult for developers to take jsdoc seriously. Jsdoc implementations for node tend to be a bit fragmented, and none of them seem to have a nice programmatic API. I ended up copying a lot of code from the cli runner of one implementation and writing my own.

If you’ve already got a js project on github, you may already have a jsdoc site without realising it. Have a look at jsdoc.info.

git & github

I’ve mainly been using git for personal projects rather than for large distributed projects, so I haven’t had to get into it properly. I’m disappointed with the very flaky integration into eclipse and the apparent lack of decent visual merging tools. Hopefully these things will improve given time.

Github on the other hand is an amazing service, I’ve heard it called ‘facebook for nerds’. Providing public repositories for free, you can use it as a central point to share your code, easily clone and submit patches to other projects, serve the project website from it, use the wiki and issue tracker, view diffs, make smaller changes directly through the site…. The list goes on. Partly because it’s so useful, ecosystems have grown around it. The django guys said it best: Github is Rome.

I should mention too that bitbucket is also a good service, which might be worth checking out, particularly if you have a small group of friends working on a private project.

cloud9

Cloud9 is a web based IDE. What is most impressive is its excellent integration with node and git. I’d go so far as to say that for developing a project with node and git, you might well be better off developing it in your web browser on cloud9 than in eclipse on a windows machine. You can do all your normal git commands in it, and also run node servers and npm commands. You can edit the code then run the unit tests through the built in console. It also gives you a single click deployment (assuming you’ve got all the files set up correctly) to cloud production level servers like heroku.

heroku

If you have your files set up right, heroku will take your node server and run it on a single instance. You can scale up by configuring heroku to give you more processes, but that costs money. Still, you get a single node.js process for free, and that is plenty for even a moderate load. I don’t expect it to survive getting posted to reddit (although you could also easily set up cloudflare for free….), but for almost anything else it’s going to be fine.

the ecosystem

What amazes me is that you can develop the whole thing in cloud9, using github as your source repository, running your tests and dev server on cloud9 too, then deploy with a single click to a production grade server all for free and without installing a single piece of software on the machine you develop from (except possibly Chrome). You could run major projects from a netbook, or an internet cafe from anywhere in the world. The cost now for setting up and running experimental services is probably as near to free as it can be made, and I look forward to seeing what happens as people discover how empowering that is.

Playing on the CouchDB

Let’s say you’re playing with CouchDB and javascript. It’s great fun, but you’re in development mode, you want to be able to change a file and then reload in your browser without having to mess around uploading and deploying stuff.

The ‘proper’ way of doing this is probably to set up a reverse proxy so that your couchdb is visible through your development webserver, and all is good, but that sounds like a lot of configuration for a playing around kind of setup.

What you'd like to do is run a normal couchdb install on your machine, use whatever you normally do for webserving (for just playing around, I tend to use a jetty jar I've created as it's very light and easy, but you could equally use tomcat or whatever), and edit the javascript and html files in place. With this set up, though, you can't usually use the javascript couchdb api from your own pages, since it uses XHR to communicate with the couchdb server, and XHRs are restricted to exactly the host and port that the page was loaded from. The fact that the couchdb server is on a different port to your webserver stops it from working.

You can get around this by saving a simple html document as an attachment in your couchdb (any database). This html document is then loaded in an iframe, sets the document.domain to the hostname (removing the port), and then it passes the api object up to the parent window.
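
The shape of the two halves is roughly this (a sketch only; the real code is in the gists linked below, and the callback and variable names here are made up):

// inside the attached html document, which the browser loads in an iframe from the couchdb port:
document.domain = location.hostname;        // hostname without the port, so the two frames count as the same origin
parent.couchApiLoaded(couchApi);            // hand the api object built inside the iframe up to the parent page

// in the parent page, served by your own webserver on a different port:
document.domain = location.hostname;        // must match what the iframe set
function couchApiLoaded(api) {
	window.couch = api;                     // from here on, this page can use the couchdb api normally
}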

https://gist.github.com/942047.js?file=couch-x.html

If you’ve attached that file to a document with id ‘document’ in a database called ‘webaccess’, you would then use it as below:

https://gist.github.com/942063.js?file=example.html

Presence in Virtual Worlds

There have been a lot of experiments conducted to establish how much people feel “present” in virtual worlds. Loosely, this is usually defined as how much a person reacts to the virtual objects as if they were real (this paper (pdf) gives a good sense of it in its first paragraph). One of my friends in a 3d virtual environment recently wanted to sit on a chair. He’d been so taken in by the quality of the illusion that for a moment he treated the virtual chair as if it actually could support his weight. This is presence, and it’s the zenith of what we aspire to.

Actually, I don’t think it is. I don’t have a nebulous feeling of ‘presence’ in VR, but then I don’t when I’m in a shop, or at work, and sometimes I specifically feel unpresent in the real world. If I’ve been reading a good book, I feel that I’m more in its world than in this one. I think that almost all of these experiments suffer from a way of thinking that almost negates what the experimenters themselves are hoping for. They all start from the assumption that the virtual world is not real, and there’s a difference between it and the real world.

In fact, when I’m in a new environment, real or virtual, I treat it the same. I explore, I try to work out the rules of this new space. I experiment. If I find that the chair in this environment is insubstantial and I can pass my hand right through it, I don’t think that the chair somehow isn’t real, I simply realise that it’s a new kind of real chair, a chair I can’t sit on. I’d be the same in reality if someone gave me an insubstantial chair.

Nobody reacts in a 3d environment as if they are simply being shown patterns of projected colours, just like nobody treats the TV they’re watching Schindler’s List on as if it’s just a mesh of colours, or for that matter, nobody treats a book as a bunch of squiggles on sheets. Art is a virtual reality that we explore, experience and decode. In neither simulators nor the walking-around world do we experience unfiltered reality. What we experience are the concepts that our environment suggests to us.

There’s no question in my mind that Crayola land, where some of my other friends taunted crudely drawn ducks, or the library, where I was briefly menaced by a ghost child while books flew around are real places. They are just spaces with different rules to spaces I’m used to, and spaces where the assumptions I’m used to don’t always hold. We spend a lot of our lives, particularly when we’re young, working out what is real and what isn’t, and it’s exhilarating to be in that place of possibilities. Reality is a puzzle, and we love to try to solve it.

One of the presence experiments that has been carried out many times is the Pit experiment (pdf). People are asked to conduct a simple task in the virtual environment, and then relatively suddenly are confronted with a huge hole in the floor. Physiological measures show that their heart rate goes up, and their skin conductance changes. Is it Presence? Have people responded to the ‘virtual pit’ in the same way as they would to a ‘real pit’?

The Pit Experiment

I don’t think so, even though when I was there, I nearly lost my balance stepping out over it. (Did me talking about it as if it was really ‘there’ bother you?) I responded to it exactly as I would respond to a real pit – that I knew I couldn’t fall into, but that’s because it was a real pit that I knew I couldn’t fall into. There’s a difference between a ‘real pit’, which contains the notion of something you can fall into and a real pit that you can’t fall into. I’ve jumped out of an airplane, strapped into a parachute, knowing I was completely safe, but that doesn’t mean my heart rate didn’t go up. I responded to it not as I would respond to a real fall, but as I would respond to a real fall where I was safe – a virtual fall.

Of course, it’s difficult to run the experiments to prove this – you’d have to have an insubstantial chair to give me (and how would you carry it?), or run an experiment with a pit that people couldn’t fall into (perhaps using wires?) and compare it with a pit that they really could fall into. It’s difficult to get ethics approval for truly potentially life threatening experiments, even to try to establish a baseline.

And here we find a more interesting brand of presence experiments in virtual worlds: doing experiments virtually that are too unethical to do in the real world. There’s the famous Milgram experiment, for example, which would be difficult to get approval for these days, but the virtual version is fine (I think it should be a compulsory part of education to watch the video of some of the 35% of people refusing to give the shocks in the original experiment. It’s inspiring). I saw an experiment where a virtual woman chatted up men, some of whom found her attractive and felt guilty, with who knows what effects on the long term health of their relationships. The Milgram experiment is not ethically challenging because of its effect on the actor who is pretending to be shocked; it’s ethically challenging because of its effect on the participants. If people do respond in the same way to the virtual one as they do to the real one, then it’s just as unethical. There is nothing more or less real in the Virtual Milgram experiment than in the “real” one.

In some ways it’s more disturbing. Just like the man in a relationship who wondered if he was a freak because he found himself fancying a virtual woman, it is challenging to our sense of what it means to be human to discover that our emotions and cherished empathy responses can be tricked and played so easily. And perhaps this is where the most interesting work is: searching the psyche for what makes us tick, because, after all, finding out how we respond to what is inside our own and other people’s heads is even more important than finding out how we respond to various forms of the outside world.

That is Easy, but It Will Still Take You a Full Week To Do.

I used to program BASIC V on the Archimedes, and you could at any time drop into assembly and out again, within the BASIC program. I want to be able to do the same thing in Eclipse with java.

I should be able to specify @lang for each method/class/package. Ruby, Groovy, Nice, Scheme, Visual Basic, BASIC, Lisp, Prolog, Smalltalk, Ada, Javascript, BeanShell, Python, Assembly are all available in ways that compile to the JVM, so why can’t I say “I want to implement this method in Scheme, and this other method in Nice, and this other method in assembly” within the same class?

This should be transparent, with all code highlighting, refactoring tools, javadoc, etc, available.

When native code has to be called, this is something else that should be straightforward. I should just be able to specify language C or C++ in the same way as all these others, and it should do everything it needs to automatically. Why should I have to fiddle with build scripts or header generators? Mindless repetitive tasks are things that computers are supposed to be good at.

During debug, I want to see an object graph of the whole system that I can drill down through and edit the state of (the whole time it’s running, not just at breakpoints), and have a console window, with all scripting languages available to manipulate the system.

Eclipse seems to spend half of the time I use it recompiling projects. Sometimes when it doesn’t need to. It loves to randomly pause in the middle of something and leave me stuck for ages. It’s just way too slow for large projects.

And why, when I create servlets, do I have to edit interminable xml, and wait for hours while simple changes are deployed here and there. The amount of configuration you have to do to create a HelloWorld servlet that talks to a database is utterly crazy. The amount of time I’ve spent waiting for eclipse to redeploy a servlet on every tiny change is bizarre.

It seems that java, with its myriad libraries, and IDEs, makes everything easy, but nothing trivial. No wonder people are jumping ship to more dynamic languages.

Contrast this with Ruby on Rails, or even .NET. Just put [WebMethod] in front of your normal C# method declaration and drop the file into the web server, and suddenly you’ve got a SOAP service and a form to test it. It’s so easy.

As I read in a Perl book once, simple things should be simple.

Interesting Lives Are A Waste Of Space

Well, you know, some people lead pretty interesting lives, and maybe it would be nice to experience their experiences, or to record everything, not just the odd thing. How much space would it take to fit an entire human life on a hard drive?

In 1986, T. K. Landauer did a study that estimated that people only take in and remember about 1 byte per second. At that rate, assuming an 80 year life span, you’d only need 80 x 365.25 x 24 x 60 x 60 bytes or a 2.4 gigabyte harddrive to store your entire life experiences.

That seems a bit low to me, so I thought, what about if you wanted to store everything the nervous system receives, not just what it remembers, or just the stuff that gets to the brain, but everything, so that if you rigged someone up to a replay device they would have exactly the same experiences for their entire life.

By current estimates, there are approximately 380 million receptors in the human body. These are just the neurons that take something from the outside world and turn it into a signal that can go into the nervous system.

Usefully enough, these receptors, although they fire many peaks when excited are only on or off, so the analog to digital conversion is as simple as you can get – only 1 bit is needed per receptor.

The next task was to decide on a sample rate for the life dump recording. The sample rate should be high enough so that two stimuli in quick succession that the body can tell apart would appear as two stimuli in the life dump. The receptors can only fire at a maximum of 1200 Hz. Initially I thought that I would use this figure as the sample rate but then I realised, although this is the maximum firing rate, the receptors can fire at different times – they aren’t clocked synchronously at 1200 Hz. The figure I really need is the smallest time interval that two stimuli can happen together for the biological system to perceive them as two separate events. It’s kind of a resolution in time. Bear in mind the auditory system can distinguish between sounds even near the 10kHz mark.

The figure that I decided to use in the end was the transduction time in visual receptors, taking this as an average across the whole body (seems reasonable, approximately 70% of all receptors are visual ones anyway). The transduction time is the time it takes for the receptor to turn an external signal (heat/light/pressure) into a physiological one, so I figure this is plenty quick enough. It’s probably a lot faster than I need, but it’s the best figure I’ve got so I’m going to use it. The visual transduction time is about 6 femtoseconds, or 6×10⁻¹⁵ seconds, which gives us a sample rate of 1/(6 femtoseconds), or 1.7×10¹¹ kHz.

Now multiplying those values together gives us the number of bits per second we need.

380×10⁶ (receptors) × 1 (bit per receptor) × 1.7×10¹¹ kHz (sample rate)

which is 7.4 zettabytes per second. That’s about 1470 million times as big as google cache. Too big you say? Well according to one estimate, there will be 1 zettabyte of information on the world wide web by 2010.

Well, let’s work out the figure for an 80 year life: 7.4 zettabytes s⁻¹ × 80 × 365.25 × 24 × 60 × 60 = 18 700 000 000 zettabytes for a lifedump.
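
For anyone who wants to check the arithmetic, the sums come out roughly like this (a quick script, taking a zettabyte as 10^21 bytes; the exact values depend on rounding, but they land in the same ballpark as the figures quoted above):

var secondsInLife = 80 * 365.25 * 24 * 60 * 60;      // about 2.5e9 seconds
console.log(secondsInLife / Math.pow(1024, 3));      // about 2.4 -> gigabytes needed at 1 byte per second

var receptors = 380e6;                               // 1 bit each
var samplesPerSecond = 1 / 6e-15;                    // about 1.7e14, from the 6 femtosecond figure
var bytesPerSecond = receptors * samplesPerSecond / 8;
console.log(bytesPerSecond / 1e21);                  // a little under 8 zettabytes per second
console.log(bytesPerSecond * secondsInLife / 1e21);  // about 2e10 zettabytes for a lifedump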

Bearing in mind the kind of data, I think I can safely say we would expect to get some pretty fantastic compression rates on it, but even without compression I reckon we’ll have the kind of storage necessary to make a life dump within 150 years.

Please contact me if you can improve this calculation in any way – I’d particularly appreciate a better figure for temporal resolution in receptors.

First published on everything2.com (hence the linking) under the name delfick.

Halting dog problem

Here is the story of how I discovered that if you believe in psychic dogs, you must be barking.

Ok, I’ve got a bunch of dogs that bark. Now these dogs aren’t just any dogs, oh no, they’re pretty clever. So, I’ve got a dog (rex) that barks at cats, and a dog (gnasher) that barks at fish, and a dog (fido) that barks at black things, a dog (pig) that barks at white things and some others. But am I happy? No. I want more!

One day, I start playing with a mirror. Most of the dogs look in the mirror and aren’t even remotely interested, but my dog that barks at black things (fido) happens to be black, so when he looks in a mirror, he goes off on one. Some of the other dogs bark at themselves in the mirror as well.

Unfortunately, my mirror is one of those full length ones on the front of a wardrobe, so whenever I want to play with the mirror and the dogs I have to carry the whole wardrobe downstairs (the dogs aren’t allowed upstairs).

So, I think for some time, and I come up with a plan. I spend months training an amazing psychic dog (meg), which barks if it sees a dog that will bark at itself. This makes it a bit easier, as I don’t need to cart the wardrobe around anymore: meg barks at fido but not at gnasher, so I can take meg around with me instead of going to all that furniture-moving effort.

One day, just for fun, I retrain Meg to bark at the opposite of what she used to bark at. This is fairly easy – she’s pretty intelligent. Now, she barks at dogs that won’t bark at themselves in a mirror. She barks at gnasher now but not at fido anymore.

So everything was fine, until one day, in contravention of the rules, Meg went upstairs and looked in the mirror…


First published on everything2 (hence the linking style) under the name delfick, with the title “Halting Dog Problem”