THIMBL Keyboard

January 2, 2013

I recently developed a prototype music keyboard for the iPad, in order to play around with the idea. It’s called the THIMBL Keyboard and it looks like this.

Each octave-row maps the twelve semitones to six positions on each hand: Thumb, Half (between Thumb and Index), Index, Middle, Big (ring finger), and Little, thus the THIMBL acronym. This keyboard is very interesting because it has no diatonic bias like a standard piano keyboard, but it does have a bias toward certain keys, i.e. the Left Index position is always a C note. The player moves up and down octaves by moving the hands vertically, so chord inversions are very easy to find. However, this layout means that a C Major scale is not especially simple to play without knowing the right sequence of steps, or memorizing finger positions.
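To make the idea of learning "the intervals between each pair of finger positions" a bit more concrete, here is a rough sketch. It rests on an assumption the post does not state: that the twelve positions in a row rise one semitone at a time from left to right across both hands. The only fact taken from the description above is that the Left Index position is a C. The position labels and function names are made up for illustration.

```javascript
// Hypothetical layout: positions listed left to right across both hands.
// ASSUMPTION (not stated in the post): pitch rises one semitone per position.
const POSITIONS = [
  "L-Little", "L-Big", "L-Middle", "L-Index", "L-Half", "L-Thumb",
  "R-Thumb", "R-Half", "R-Index", "R-Middle", "R-Big", "R-Little"
];
const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// The one anchor given in the post: Left Index is always a C.
const C_INDEX = POSITIONS.indexOf("L-Index");

// Semitone offset of a position relative to the C in the same row.
function semitoneOffset(position) {
  const i = POSITIONS.indexOf(position);
  if (i < 0) throw new Error("Unknown position: " + position);
  return i - C_INDEX;
}

// Note name for a position, wrapping around the twelve semitones.
function noteName(position) {
  const offset = semitoneOffset(position);
  return NOTE_NAMES[((offset % 12) + 12) % 12];
}

// Interval in semitones from one position to another within a row.
function interval(from, to) {
  return semitoneOffset(to) - semitoneOffset(from);
}
```

Under that assumed layout, interval("L-Index", "R-Index") is 5 semitones, so the two index fingers would sit a perfect fourth apart in every row. A different real layout would only change the POSITIONS array.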

I’ve been practicing some basic technique with this prototype, and have discovered a few things about how it behaves. It seems unorthodox at first, but after learning the intervals between each pair of finger positions, playing music by ear becomes much easier. There are some expected problems with touchscreen controls, as the fingers can’t rest on the key surfaces, and the keys don’t overlap in tiers, but in general this prototype is more durable and easier to maintain than the last version. I’d still like to build a production-quality model but this works surprisingly well in the meantime. Check it out if you’re interested!

I’ve also put together some vertically-oriented notation paper which helps with transcribing and playing music. Time is measured in rows and finger positions correspond to columns of cells. You’ll have to find some way to indicate the octave of each note in this grid; I’d recommend color-coding notes to match the octave colors on the keyboard.

Non-Player Characters

December 24, 2012

Here’s an interesting idea. This article mentions “non-player characters” in the context of a role-playing game, and proposes something rather unsettling:

Many of us approach the other people in our lives as NPCs.

I’ve been thinking along similar lines. People often imagine strangers as incidental scenery, part of the social environment. This is understandable given the limits of human knowledge – there simply isn’t enough time or mental capacity to understand very much about very many people. However, we often forget that this perspective is only a necessary convenience that allows us to function as individuals. For example, if you’ve ever been in a rush to get somewhere on public transportation, you’ve probably felt that bit of guilty disappointment while waiting to accommodate a wheelchair-bound passenger. Here comes some person into my environment to take another minute of my time, right? If you use a wheelchair yourself, this delay happens every time you catch a ride, and that frustration simply does not exist. If anything, I would imagine that disabled passengers feel self-conscious every time they are in a situation where their disability affects other people’s lives, even in an insignificant way.

Has this always been true? Probably to some degree, but the modern media environment seems to especially promote it. Good fiction writing communicates the thoughts and motivations of relevant characters, unless they are complete unknowns. This means that any meaningfully observable character has some kind of hypothesized history and experience informing their participation in the story. Film is different, in that a script can describe an “evening crowd” in two words, but the realization of that idea can involve hundreds of extras, living entire lives and working day jobs that barely relate to their final appearance on the screen. We can assume that their real lives intersected with the production of that scene on that day, but it’s really the only significance that their identities have in context.

With interactive media, the idea of a “non-player character” has appeared in many forms, and academics study how they can design the best (read: most believable) fictional characters for interactive environments. Here the limited reality of these characters is even more pronounced. In video games, non-player characters have lower polygon counts, fewer animations, and generally use less code and data. This is a consequence of the limited resources available for building a virtual environment, but the effect is readily apparent and forced.

Does this mean video games shouldn’t include background characters? Not really. What I’m suggesting is that we should be careful to see this phenomenon for what it is: an information bias in favor of the protagonist, which necessarily happens while producing media. It shouldn’t ever be mistaken for a relevant characteristic of the real world. This holiday season, when you’re waiting an extra minute or two for a disabled stranger, or expecting better service from a tired professional, remember that he or she probably has lived a life as rich and complicated as your own, and try not to react as if he or she is just some kind of annoying scenery. Whoever it is might return the favor, even if you never realize it.

MixBall

December 12, 2012

Music is a big part of my life. I have a voracious appetite for recorded music, and I’m working on my own humble contribution to the universe of sound. Like many aspiring composers, I’ve dreamed of creating songs that touch many lives. It hasn’t been easy – profound communication through music is an especially difficult task. In today’s world where every musical idea is measured, recorded, licensed, and purchased, that task is harder than it has ever been. I attended school with several people who are now working musicians, struggling for excellence in a craft which has been commoditized to the point of disposability.

Maybe digital distribution and piracy didn’t cause this, but many of us have still forgotten to respect the artistic process. If I were to release an original music demo, it would almost certainly be lost in a sea of other free legal or illegal content, and the prospect of eventually making usable money in this way without staging live events is not great. It feels like a step backwards, if not an unexpected development.

Knowing this, I decided to make a demo which subverts the trend. My first release of original music is now available, but only in an interactive format. MixBall is a special game that mixes the music while you play. I’m not charging money yet, but you’ll have to spend time and energy “beating” each mix, so it might make you think a bit about value. If it achieves that, I will consider it a success. The catchy tunes are a bonus; hopefully you’ll enjoy them too.

You might be wondering why this countercultural experiment is hosted on Apple’s App Store and requires an iOS device. The answer has to do with hardware limitations. I was interested in building for Android as well, but low-latency sound requires considerable effort, and Apple has the whole portable music pedigree to boot. Hopefully that doesn’t offend anyone.

Get MixBall today in the App Store!

Degrees and Freedom

August 6, 2012

Here’s my idea of a good math lesson. I want to explain Euler’s formula, the cornerstone of multidimensional mathematics, and one of the truly beautiful ideas from history. In school this formula appears as a useful trick, and is not commonly understood. I think that is because students are denied enough time to wonder what the formula actually means (it doesn’t describe how to pass an exam). Here is Euler’s formula:

e^(ix) = cos(x) + i*sin(x)

This idea was introduced to me after a review of imaginary and complex numbers. Once the history and definition were out of the way, we completely freaked out at the idea of putting ‘i’ in the exponent, then practiced how to use it in calculations. I might have had a brief moment of clarity in that first class, but by the AP exam Euler’s formula was nothing more than a black box for converting rectangular coordinates to polar coordinates.

Many years later, I came across the introduction to complex numbers from Feynman’s Lectures on Physics, and suddenly the whole concept clicked in a way that it never had in school. Explained here, I don’t think it is really that difficult to understand, but then I’ve already managed to understand it, so I’ll try to communicate my understanding and then you can tell me whether it makes sense.

We need to start by generalizing the concept of a numeric parameter. The number line from grade school is an obvious way to represent a system with one numeric parameter. If we label the integers along this line, each mark corresponds to a grouping of whole, countable things, and the value of our integer parameter must refer to one of these marks. If we imagine a similar system where our parameter can “slide” continuously from one integer to the next, the values that we can represent are now uncountable (start counting the numbers between 0.001 and 0.002 if you don’t believe me) but opening up this unlimited number of in-between values allows us to model continuous systems that are much harder to represent with chunks.

Each system has a single numeric parameter, even though the continuous floating-point parameter can represent numbers that the integer parameter cannot. In physics, the continuous parameter can represent what is called a “degree of freedom,” basically a quantity that changes independently of every other quantity describing the system. Sometimes a “degree of freedom” is just like one of the three dimensions that you can see right… now, but this is not always the case. Wavefunctions in particle physics can have infinite degrees of freedom, even though the objects described by these esoteric equations follow different laws when we limit our models to the four parameters of spacetime.

Anyway, the imaginary unit or ‘i’ is just some different unit that identifies a second numeric parameter. If we multiply an integer by ‘i’, we’re basically moving a second parameter along its own number line that same distance. Apply the “sliding” logic from before and we can use the fractional parts between each imaginary interval. If this sounds new and confusing, just remember that any “real” number is itself multiplied by the real unit, 1. Personally, I don’t think that the word “imaginary” should be used to describe any kind of number, because all numbers are obviously imaginary. However, this convention exists regardless of how I feel about it, and nobody would know what to put in Google if I used a different word.

Why do teachers use this system where one implicit unit is supplemented by a second explicit unit? Simple – it was added long before anyone fully understood what was going on. The imaginary unit was the invented answer to a question, that question being:

Which number yields -1 when multiplied by itself?

The first people to ask this question didn’t get much further than “A number called ‘i’ which is nowhere on the number line, and therefore imaginary.” If those scholars had described their problem and its solution in a different way, they might have realized some important things. First, this question starts with the multiplicative identity (1) and really asks “which number can we multiply 1 by twice, leaving -1?” Thinking about it like this, it soon becomes clear that the range of values we can leave behind after multiplying 1 by another value on the same number line, twice, cannot include -1! We can make 1 bigger, twice, by multiplying it by a larger integer, or smaller, by multiplying it by a value between 0 and 1. We can also negate 1 twice while scaling it up or down, but none of these options allow for a negative result!

A clever student might point out that this is a stupid answer and that we might as well say there is none, but we still learn about it because amazing things happen if we assume that some kind of ‘i’ exists. We can imagine a horizontal number line, and then a second number line going straight up at 90° (τ/4 radians, a quarter turn) from the first. Moving a point along one line won’t affect its value on the other line, so we can say that the value of our ‘i’ parameter is represented on the vertical line and the value of our first (“real”) parameter is represented on the horizontal line. That is, a complex number (a*1+b*i) imagined as a single point on a 2-dimensional plane. In this space, purely “real” or purely “imaginary” numbers behave just like complex numbers with zero for the value of one parameter.

Now think about the answer to that question again. If our candidate is ‘i’ or some value up “above” the real number line, it’s easy to imagine a vector transformation (which we assume still works like multiplication) that can change 1 to ‘i’ and then ‘i’ to -1 in this 2D number space. Just rotate the point around the origin by 90°. When our parameters are independent like this, multiplying by ‘i’ some number of times (that is, multiplying by an integer power of ‘i’) is exactly like rotating the imagined “point” a quarter turn around zero that same number of times. I don’t really know why it works, but it works perfectly!
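The quarter-turn claim is easy to check by hand, or with a few lines of JavaScript. This is only a sketch: the {re, im} object shape and the function names are my own choices for illustration, and the multiplication rule is the usual one for complex numbers, (a + b*i)(c + d*i) = (ac - bd) + (ad + bc)*i.

```javascript
// Represent a complex number a*1 + b*i as a plain object {re: a, im: b}.
// (Hypothetical representation, chosen only for this example.)
function multiply(p, q) {
  return {
    re: p.re * q.re - p.im * q.im,
    im: p.re * q.im + p.im * q.re
  };
}

const ONE = { re: 1, im: 0 };
const I = { re: 0, im: 1 };

// Multiply a starting point by i the given number of times,
// which is the same as multiplying it by i raised to that power.
function quarterTurns(start, times) {
  let point = start;
  for (let n = 0; n < times; n++) {
    point = multiply(point, I);
  }
  return point;
}

// One turn takes 1 to i, a second takes i to -1,
// and four turns bring the point back to where it started.
const afterOne = quarterTurns(ONE, 1);
const afterTwo = quarterTurns(ONE, 2);
const afterFour = quarterTurns(ONE, 4);
```

Starting from any other point on the plane works the same way: each multiplication by ‘i’ swaps the two parameters and negates one of them, which is exactly a 90° rotation about zero.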

We’ve seen that imaginary units simply measure a second parameter, and how this intuitively meshes with plane geometry. Now let’s review what is actually going on. Numbers multiplied by ‘i’ behave almost exactly like numbers multiplied by 1, but the important thing about all ‘i’ numbers is that they are different from all non-‘i’ numbers and therefore can’t be meaningfully added into them. The ‘i’ parameter is a free parameter in the two-parameter system that is every complex number. It can get bigger or smaller without affecting the other parameter.

Bringing this all together, let’s try to understand what Euler was thinking when he wrote down his formula, and why it was such a smashing success. He noticed that the Taylor series definition of the exponential function:

e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + ...

becomes this:

e^(ix) = 1 + i*x - x^2/2! - i*x^3/3! + x^4/4! + i*x^5/5! - ...

when ‘i*x’ is the exponent, because the integer powers of ‘i’ go round our complex circle from 1 to i to -1 to -i and back. Grouping the real terms and the ‘i’ terms together suddenly and unexpectedly reveals perfect Taylor series expansions of the cosine and sine:

e^(ix) = (1 - x^2/2! + x^4/4! - ...) + i*(x - x^3/3! + x^5/5! - ...)

As each expansion is multiplied by a different free parameter, the two expansions don’t add together, naturally separating the right side of our equation into circular functions! We can just conclude that those functions really are the cosine and sine of our variable, remembering that the sine is an ‘i’ parameter, and it works! Because these expressions are equivalent, having a variable in the exponent allows us to multiply our real base by ‘i’ any fractional number of times (review your exponentials), and thus rotate to any point in the imagined complex plane. There are other ways to prove this formula, but I still do not understand exactly why any of the proofs happen the way they do. It’s not really a problem, because Euler probably didn’t understand it either, but I’d still like to come across a good answer someday. What I know right now is that any complex number can be encoded as a real number rotated around zero by an imaginary exponent:

e^(ix) = cos(x) + i*sin(x)

Here is proof that certain systems of two variables can be represented by other systems of one complex variable in a different form, and the math still works! Euler’s formula is a monumental, paradigm-shattering shortcut, and it made the modern world possible. I’m not overstating that point at all: everything from your TV to the Mars rover takes advantage of this trick.
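If you would rather see the Taylor series argument with numbers than take it on faith, the following sketch adds up the series for e^(ix) term by term and compares the result with the cosine and sine. The function name, the {re, im} object shape, and the choice of 30 terms are all assumptions made for this example; it is a numerical sanity check and not a proof.

```javascript
// Sum the first `terms` terms of the series for e^(ix):
//   1 + (ix) + (ix)^2/2! + (ix)^3/3! + ...
// The running total and the current term are kept as {re, im} pairs.
function expImaginary(x, terms) {
  let sum = { re: 0, im: 0 };
  let term = { re: 1, im: 0 }; // (ix)^0 / 0! = 1
  for (let n = 0; n < terms; n++) {
    sum = { re: sum.re + term.re, im: sum.im + term.im };
    // Next term = current term * (ix) / (n + 1).
    // Multiplying (a + b*i) by (i*x) gives (-b*x) + (a*x)*i.
    term = {
      re: (-term.im * x) / (n + 1),
      im: (term.re * x) / (n + 1)
    };
  }
  return sum;
}

// Compare against the built-in circular functions for one sample angle.
const angle = Math.PI / 3;
const approx = expImaginary(angle, 30);
const realError = Math.abs(approx.re - Math.cos(angle));
const imagError = Math.abs(approx.im - Math.sin(angle));
```

Both errors come out far smaller than anything you would notice on a plot, and the real and ‘i’ parts never mix: the even-powered terms only ever land in re, and the odd-powered terms only ever land in im.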

MixBall Preview

July 3, 2012

It’s about time to take the wraps off my latest project:

Introducing MixBall, the first dedicated interactive music platform! Tilt your iDevice to control how the music unfolds, and don’t hit any hazards if you want to survive all the way to the end! It gets rather difficult once all 3 tracks are in play…

Check out the sample video, and visit mixball.com to get on the mailing list. I’ll be sending out an update as soon as MixBall is available in the App Store!

World Wide Ouija

May 5, 2012

(Skip to the demo!)

I have a new open-source software project to share today. It is an idea that has been bouncing around my head for a while: a working, real-time, multiplayer Ouija board game that runs in a web browser. I decided to go ahead and make it last week because there are some amazing new JavaScript tools maturing out of the NodeJS madness, and this kind of project has suddenly become much easier to complete than it would have been just a few years ago.

Specifically, I’m using MeteorJS, a brand-new full-stack NodeJS/MongoDB framework which abstracts away the intimidating problem of integrating a client program with its server resources. To oversimplify, a fast web app can’t wait for every bit of program logic to download from the server, and a centralized server can’t keep every bit of data organized on every client without its own reference copies, so a lot of work gets duplicated. MeteorJS attempts to minimize this issue by setting up a framework where application data can be accessed from either the client or the server at the same logical address or “collection,” using the same JavaScript semantics on either end. When it is possible to synchronize each client’s copy of its application data with the server’s version, MeteorJS automatically passes updates to the server and fetches fresh data, algorithmically eliminating conflicts that the other clients might have introduced.

It’s all wonderfully intricate stuff, but the point is that somebody else is building it and we don’t have to. So let’s cut the techno-babble and make a multiplayer Ouija game! Start by installing MeteorJS. Then, to create a new project and populate it with basic HTML, CSS, and JS files, type this at a command line:

meteor create worldwideouija

To turn off “easy mode” and better control the data that gets shared, disable autopublish:

meteor remove autopublish

Our game needs to store three kinds of “things,” so we’ll start by initializing three collections. Open the “worldwideouija.js” file in the “worldwideouija” folder and add this at the top:

Rooms = new Meteor.Collection("rooms");
Forces = new Meteor.Collection("forces");
Messages = new Meteor.Collection("messages");

The Rooms collection will store metadata about each game room, the Forces collection will store temporary impulse values that push the Ouija marker around the board, and the Messages collection will store chat messages.

We can access these collections from either client or server, but the distinction is still important for the purposes of our application, as we need our Ouija server to manage the data for each room, and update each player’s client as necessary. MeteorJS allows us to conditionally execute JavaScript depending on whether the current environment is the client or the server, which allows all our application logic to be contained in one script. Let’s declare what the client should do now:

if (Meteor.is_client) {
  Meteor.startup(function() {
    Meteor.subscribe("allrooms");
    Meteor.subscribe("allmessages");
    //...
  });
  //...
}

Meteor.startup() takes a callback function that should contain all the code we want to run when the application is ready to start execution. This callback will contain our game loop.

Inside, the Meteor.subscribe() methods tell the client that it should ask the server for access to “allrooms” and “allmessages” that the server hopefully has available. While the Rooms variable always refers to the collection, this “allrooms” subscription tells the client which records to actually synchronize with the server. Because autopublish is disabled, we will have to make these subscriptions available to the client. At the end of the file, add a block that tells the server to publish all Rooms to “allrooms” subscribers and all Messages to “allmessages” subscribers. Security is not a concern when consulting with spirits:

if (Meteor.is_server) {
  Meteor.startup(function () {
    Meteor.publish("allrooms", function() {
      return Rooms.find();
    });
    Meteor.publish("allmessages", function() {
      return Messages.find();
    });
    //...
  });
}

I’ve neglected to publish the Forces collection, but this is on purpose. At first, I imagined that each client could just update the server’s marker position for the current room at any time and the server would sort everything out, but whenever the delay between page updates got worse, the movement became jerky and weird. The Forces collection is how I solved this. It acts as a buffer where each client can write control data. When the server is asked to update the marker position, it reads from this collection, updates the position according to all the available Forces, and then deletes them, before sending the new position back. This means that no client ever has to read directly from this collection. Each client can update its own Forces collection and send new records to the server without subscribing to any published objects. It’s probably better not to waste bandwidth synchronizing this data if the clients don’t need it.
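The buffering idea is easier to see without any framework in the way. Here is a minimal sketch in plain JavaScript, using an ordinary array as a stand-in for the Forces collection. The function names and the in-memory room object are hypothetical, and the sketch skips details of the real method such as clamping the marker to the board.

```javascript
// A client drops its current impulse into the shared buffer.
// (Plain array used as a hypothetical stand-in for the Forces collection.)
function pushForce(buffer, roomId, x, y) {
  buffer.push({ room: roomId, x: x, y: y });
}

// The server drains every force for one room, averages the impulses,
// moves the marker by that average, and reports the new position.
function drainAndMove(buffer, theRoom) {
  let dx = 0;
  let dy = 0;
  let count = 0;
  // Walk backwards so that removing an entry doesn't skip the next one.
  for (let i = buffer.length - 1; i >= 0; i--) {
    if (buffer[i].room === theRoom.id) {
      dx += buffer[i].x;
      dy += buffer[i].y;
      count++;
      buffer.splice(i, 1); // each force is used exactly once
    }
  }
  if (count > 0) {
    theRoom.x += dx / count;
    theRoom.y += dy / count;
  }
  return { x: theRoom.x, y: theRoom.y };
}

// Two players push in different directions between server updates.
const buffer = [];
const room = { id: "demo", x: 480, y: 320 };
pushForce(buffer, "demo", 10, 0);
pushForce(buffer, "demo", -4, 6);
const position = drainAndMove(buffer, room); // marker moves by the average impulse, (3, 3)
```

Because the buffer is emptied on every update, a slow client simply contributes fewer forces instead of dragging the marker around with stale positions.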

So let’s write some code! Define a game loop that executes every 500 milliseconds by adding a Meteor.setInterval() method to the client block, inside the startup callback function:

if (Meteor.is_client) {
  Meteor.startup(function() {
    Meteor.subscribe("allrooms");
    Meteor.subscribe("allmessages");
    // Game loop: runs every 500 ms while the player is in a room
    Meteor.setInterval(function() {
      if (Session.get("room")) {
        // Start with no impulse, then let the current impulse decay
        if (isNaN(Session.get("dx"))) Session.set("dx", 0);
        if (isNaN(Session.get("dy"))) Session.set("dy", 0);
        Session.set("dx", Session.get("dx") * 0.9);
        Session.set("dy", Session.get("dy") * 0.9);
        // Write this client's impulse into the buffer
        Forces.insert({
          room: Session.get("room"),
          x: Session.get("dx"),
          y: Session.get("dy")
        });
        Meteor.call("updateMarker", Session.get("room"), function(e, r) {
          // r is undefined if the call failed, and empty if nothing moved
          if (!e && r && r.x && r.y) {
            Session.set("posX", r.x);
            Session.set("posY", r.y);
          }
        });
      }
    }, 500);
  });
  //...
}

This loop uses Session.set() and Session.get() to manage temporary session variables, used to save the client’s current impulse value, along with the client’s copy of the marker position. Meteor.call() tells the server to execute its “updateMarker” method that we will have to define, and the callback updates the client’s position using whatever response data comes back.

The rest of the client application uses HTML templates integrated with JavaScript, like this function which returns the current Room ID:

Template.main.currentRoom = function() {
  return Session.get("room") || false;
};

This return value can be accessed from the “main” template by including {{currentRoom}} anywhere in context. We won’t go into detail about the rest of the templates but they are all included with the source code and should make enough sense if you can follow this logic. Look in the .html file for all the template HTML.

Templates can also handle jQuery events, and we can use this functionality to read the mouse position and update the client’s impulse values:

Template.room.events = {
  "mousemove .gameBoard": function(e) {
    var theRoom = Rooms.findOne(Session.get("room"));
    // Mouse position relative to the board's left margin
    var trueX = e.pageX - parseInt($('.gameBoard').css('margin-left'), 10);
    // Impulse is 1/25 of the distance from the marker to the mouse
    Session.set("dx", (trueX - Session.get("posX"))/25);
    Session.set("dy", ((e.pageY - 50) - Session.get("posY"))/25);
  }
  //...
};

This is the events object for the “room” template, with keys that are strings containing a jQuery event type and often a CSS selector that declares which DOM element(s) to bind. Inside, we use a little hackery to compute an impulse vector from the mouse and marker positions, and then we save the session variables that the game loop will read from.
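Stripped of the DOM details, the impulse arithmetic is small enough to sketch as two pure functions. The names here are hypothetical, and the sketch assumes the mouse coordinates have already been converted to board coordinates (the real handler does that with the margin and the 50-pixel offset). The divisor of 25 and the damping factor of 0.9 are the ones used in the handler and the game loop.

```javascript
// Impulse is a fixed fraction of the distance from the marker to the mouse.
// (Hypothetical helper; coordinates are assumed to be board-relative.)
function computeImpulse(mouse, marker, divisor) {
  return {
    dx: (mouse.x - marker.x) / divisor,
    dy: (mouse.y - marker.y) / divisor
  };
}

// Each tick of the game loop scales the stored impulse down a little.
function damp(impulse, factor) {
  return { dx: impulse.dx * factor, dy: impulse.dy * factor };
}

// A mouse 250 px right of and 100 px below the marker gives an impulse of (10, 4).
const impulse = computeImpulse({ x: 730, y: 420 }, { x: 480, y: 320 }, 25);

// If the mouse then stops moving, the impulse fades over the following ticks.
let fading = impulse;
for (let tick = 0; tick < 10; tick++) {
  fading = damp(fading, 0.9);
}
```

The further the mouse is from the marker, the harder that player pulls, and a player who stops moving gradually lets go instead of cutting out abruptly.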

Finally, we have to implement an “updateMarker” method on the server:

if (Meteor.is_server) {
  Meteor.startup(function () {
    // Publication names must match the client's subscriptions
    Meteor.publish("allrooms", function() {
      return Rooms.find({});
    });
    Meteor.publish("allmessages", function() {
      return Messages.find({});
    });
    Meteor.methods({
      updateMarker: function(id) {
        var theRoom = Rooms.findOne(id);
        var position = {};
        if (theRoom) {
          // Starting position if none has been stored yet
          if (isNaN(theRoom.x)) theRoom.x = 480;
          if (isNaN(theRoom.y)) theRoom.y = 320;
          var dx = 0;
          var dy = 0;
          var numForces = 0;
          // Add up every buffered force for this room, then clear the buffer
          var theForces = Forces.find({room: id});
          theForces.forEach(function(force) {
            // parseFloat, because parseInt misreads tiny values such as 1e-7
            dx += parseFloat(force.x) || 0;
            dy += parseFloat(force.y) || 0;
            numForces++;
          });
          Forces.remove({room: id});
          Rooms.update(id, {$set: {players: numForces}});
          if (numForces > 0) {
            // Move by the average force and keep the marker on the board
            var newX = theRoom.x + dx/numForces;
            var newY = theRoom.y + dy/numForces;
            if (newX < 100) newX = 100;
            if (newX > 860) newX = 860;
            if (newY < 100) newY = 100;
            if (newY > 540) newY = 540;
            Rooms.update(id, {$set: {x: newX, y: newY}});
            position.x = newX;
            position.y = newY;
          }
        }
        return position;
      }
    });
  });
}

And that’s (almost) all!

Visit http://github.com/guscost/worldwideouija for the complete source code, and try it out at worldwideouija.meteor.com or worldwideouija.com! I’m still tinkering around so uptime and code quality are not 100% guaranteed.

And before I forget, many thanks to AVGP for the missing chatroom example, which helped a lot.

Last time, we imagined that a cognitive “confabulation” process (and therefore all intelligence) happens in the brain as an interference phenomenon, or a sort of nonlinear convolution, among complicated modes of oscillation on a neural network.

But this idea is immature and unfunded, and experiments are not prepared at the moment to rigorously test some kind of hard prediction.

So instead, let us wave our hands, consider the typical living person as an empirical phenomenon, and attempt to describe a basic theory of idea genesis by thinking about it/him/her. A spoken sentence is commonly defined in English class as a “complete thought,” and we hypothesize that this definition can be closely correlated with some specific thing that might be called an “understood idea” as it enters or exits a conscious person, given the following conditions:

1) The person is arriving spontaneously at each output word, i.e. composing sentences as they are being spoken. This is different from a “memorized idea” which could instead be modeled as a sort of large word in a person’s vocabulary. It is also different from a “perceived idea” like this sentence that you are reading, because in this case a large percentage of the processing devoted to “finding” each word is cut out and replaced with less-intensive processing devoted to parsing each word and, in a typical case, “sounding it out” internally as your eye scans the page. Incidentally, that is why it takes much longer to write a book than it takes to read it.

2) The person really understands each input word, a philosophical dead end which can only be assumed from a given reply.

So where do these understood ideas come from? We tend to agree what is a coherent sentence, and far chaining mellow peninsula no binder what. But how do we arrive at the correct series of words for each idea? It is not really possible to identify the physical source of any particular word that I might say myself, because to do so would require me to say new and different words (at least internally), and so on. But it is still possible to theorize a mechanism by which this can happen in a general sense, that is consistent with the principles of analog confabulation.

A good place to start is with the acknowledgment that words are not guaranteed to mean exactly the same things to different people, and it is only by assuming a considerable amount of shared experience that we can rely on these labels to signify approximately what we intend to communicate. It would also be wise to acknowledge the fact that most things that can be understood by intelligent beings aren’t easily translated into words, as the arrival of creatures with “large” vocabularies was not very long ago, and therefore we have a rather naive understanding of what a “large” vocabulary actually is.

With that in mind, let’s get right to the core of the matter: what makes a certain word or pattern part of a person’s vocabulary? What is its function in relation to other words, and the people who use them? I consider it logical and correct to describe each word as a reminder of some shared experience. Why does the word “apple” mean what it does? Because it has been associated with the experience of an apple since before any one of us was alive. I know what the word means because I have experienced it so many times in the presence of apples. I can communicate this to other people, because when dealing with apples, I am strongly inclined to spontaneously arrive at that word, and externalize it.

The paradox, then, is this: if every word in a given vocabulary has to refer to some common feature of experience, how do people communicate new things? Well, there are several other factors to consider. First, it is possible to arrange familiar words in a way that reveals some previously unfamiliar aspect of the relationship between them. When these arrangements are particularly witty or profound, they are often called “jokes.”

Second, it is sometimes possible and even necessary to create new, completely unfamiliar words when they are required by a new idea. In these cases, if the new words are particularly appropriate or useful, they must refer to some common feature of experience that has not been named, and so they are assimilated into the shared vocabulary of those who understand the new idea. That is how language evolves.

Third, human communication has always been imprecise at best and useless at worst, so there is hardly any guarantee that listeners will ever understand anything I say in the same way that I do. This imprecision is usually ignored by humans, yet it causes the evolution of communicated ideas in unpredictable and not necessarily unhelpful directions. On the other hand, when we are inclined to read and write precise, executable computer code, it is often found that simply reading the code like one would read a book does not provide any useful insight. To rigorously understand a computer program or a mathematical proof, one must essentially construct a perfect imitation of some discrete state of mind achieved by its original creator, and it is not a coincidence that our relatively primitive machines can be readily configured to make use of these same ideas. We should also not be surprised that drilling children in the efficient execution of algorithms does little to produce creative adults.

Luckily, none of these factors lead to contradiction when imagining a neural network as an analog phenomenon, and in fact the reality seems much more consistent with this framework than with typical digital and discrete-time neural networks. The idea requires a rather uncompromising philosophy once it is extrapolated far enough, but that’s a common problem with any broad scientific theory. The most difficult point to accept will be that in this view, there is no further control system or homunculus that sits “behind” the interference phenomenon in any sense, as the phenomenon itself is the only control mechanism present. This challenging idea might lead some to conclude that insanity is only one unfortunate circumstance away, or even that free will itself does not exist. I would caution those who go that far to be aware of exactly what it is they are defining – if “free will” means the capacity for human beings to make decisions that contradict every rational force or expectation in the known universe, then explaining in scientific terms how this condition arises only serves to reinforce its reality.

It is trivial to cover edge cases (read: far from the cortex) with this model, because for example, medical science already knows that the force conveyed through a muscle is proportional to the frequency of the nerve pulses, not amperage or anything like that. Considering this, “reflex actions” and “muscle memories” can be explained as progressively longer signal paths that penetrate farther toward the cortex proper, but are quickly reflected back at the muscles that will perform the response. The difficulty comes with explaining more sophisticated animal behaviors, and finally with accounting for the nature of introspective consciousness. The signal paths for these actions are certainly orders of magnitude more complex than any of those which we can directly observe at present, but it is not impossible or even implausible that the underlying physical mechanism should essentially be the same.

The central hypothesis linking analog confabulation with intelligence suggests that in reality, conscious thought is only ever quantized or digitized in the sense that a given signal either “resonates with” or “does not resonate with” the rest of the brain. It would not be elementary to add or multiply these signals in a linear fashion, as the space of human ideas is not affine. Thus, a set of words grouped together in a specific order can encode much more information than the set of information gathered from each word when considered on its own. Furthermore, ideas beyond a certain elementary complexity level are never 100% guaranteed to persist. A common annoyance called a “brain fart” happens typically when one word or phrase from an idea that “should” be resonating with the others fails to enter the feedback condition, due to unexpected interference from any number of sources. This condition is not usually permanent, but people can and do permanently forget ideas that don’t resonate with anything for the rest of their lives.

Is it really possible to understand intelligence if this much ambiguity is required? Analog systems have characteristics that make them very useful for certain tasks related to intelligence, so it is in our best interest to try. After it has stabilized, a neural network arrives at a sort of “temporary solution” where the weightings of its connections are each configured such that no (or few) weightings change on the next recurrence of network activity. It would seem that an analog system could be stabilized in this manner to much greater precision, and possibly in much less time, especially if any “aliasing” effect of the digitized neurons causes disruptive oscillatory behavior to persist longer than it would otherwise. The improvement over coarse digital algorithms would likely be significant, as evidenced by the fact that bees can reliably discover the best routes to maximize food collection using very little available hardware. A digital simulation of physically precise or effectively “continuous” neural networks is possible and has been attempted, but the complexity and price of such a system are still prohibitive, to say the least. The alternatives would appear to be either an enormously complicated analog computer, or the convenient discovery of some mathematical trick that makes efficient modeling with Turing machines possible.

Therefore, at present this perspective on high-level behavior and intelligence might be developed further in a qualitative field like psychology. One intriguing theory of mind, originally published by Julian Jaynes in 1976, suggests that humans went through a phase of “bicameral” mentality in which one part of the brain that generated new ideas was perceived by the other part as a member of the external universe. Jaynes suggests that this “bicameralism” was similar in principle to what we call “schizophrenia” today, and can account for all sorts of historical oddities that we call religions, myths and legends. The theory is based on the core epiphany that logical and learned behaviors predate consciousness and indeed provide some of the necessary conditions for its existence. This is used to push the idea that the human “phenomenon” emerged from interactions between sophisticated, organized animals and the external environment after a special phase of “bicameral society” in which most humans were not even self-aware.

Jaynes’s historical analysis touches on many interesting ideas, and provides enough evidence to demand a serious consideration, but its most obvious shortcoming is the manner in which it skips from an initial, abstract consideration of the separation between behavior and consciousness, to a discussion of Gilgamesh and the Iliad. We pick up the story of mankind there, and nothing is said of the millions of years of evolution leading to that point. Any complete theory of intelligence has to account for canine and primate societies as well as early human ones, and Jaynes’s bold assertions leave the reader wondering if there are any self-aware apes leading their mindless troops through the jungle.

In the framework of analog confabulation, we can ignore some of these hairier philosophical challenges for the moment, as the bicameral mind simply bears striking similarities to one intuitive model of a general pre-conscious condition. When a stimulus enters the kitty cat, it responds immediately and predictably. This is the behavior of a system that is not considerably affected by feedback. It can be characterized as involving a sort of linear path from the senses into the cortex, with one or two “bounces” against the cortex and then back out through the muscles as a reaction. It’s really quick and works wonderfully for climbing and hunting, but it means that the cat will never sit down and invent a mousetrap.

Self-aware creatures, on the other hand, behave as if there is significant feedback, at least while introspecting, and their brains might be characterized as having a great number of “loops” in the neural pathways. It means that the resonances theorized by analog confabulation can be extremely sophisticated, but naturally sophisticated resonating structures would have to develop before any of that could happen. The critical threshold must obviously involve enough complexity to process a vocabulary of a certain size, but it could include communication of any kind, using any of the senses.

The question of when or whether bicameral human societies existed is unaffected by any of this, but at the same time that possibility cannot be ruled out. It might even be valid to say that, for example, dogs have “bicameral minds” as Jaynes claims ancient humans did, except that their vocabulary is limited and not fully understood by us. Much of it could be roughly translated into simple, impulsive ideas like “I’m hungry!” or “come play!” or “squirrel!” like the dogs in Up, but a dog could never say “I’m thinking so therefore I exist!” in the same manner. Most dogs have not discovered that their brains are the source of their own ideas, and even if they did they would not have any good word for “think.”

So what solid logic supports this theory in the end?

– Wernicke’s area and Broca’s area are two topologically complex parts of the brain that are active in understanding and forming words, respectively. A high-bandwidth neural loop connects them.

– A large body of circumstantial evidence, some of which will be included here:

– Uniquely “human” behaviors like laughter, dancing, singing, and aesthetic choices all can be said to have a certain “rhythmic” component that describes the behavior intuitively and at a low level. Each behavior would then involve periodic signals, by definition.

– More specifically, if laughter really does betray some “inner” resonance that happens involuntarily when a human encounters the right kind of new idea, that phenomenon suddenly makes a whole lot more sense in an evolutionary context.

– Meditation reveals how new ideas arrive as unexpected, sudden, and sharp feedback loops, which often take some time to deconstruct and translate into the appropriate words, but are nevertheless very difficult to erase or forget. That is of course, unless an idea arrives in the middle of the night, in which case the noise of REM sleep can overwrite anything that is not written down.

– The fact that words have to “happen” to a person several times before they are useful means that each has a periodicity, even if it is irregular. And some words like “day” and “night” occur with regular periodicity.

– Music has profound effects on the mind. Duh.

– Light also affects mood, and too much of the wrong spectrum can make you SAD.

I’ll try to keep this list updated as I remember more circumstantial evidence that should be written down in a notebook already, but it seems like there would be a lot. In any case, if you *ahem* have thoughts about this theory, please do share them. Nobody really knows the answer to any of these questions yet so all ideas are appreciated.

Here’s an updated version of a paper I wrote for Amit Ray’s class last quarter.

ABSTRACT

We assume that intelligence can be described as the result of two physical processes: diffraction and resonance, which occur within a complex topology of densely and recurrently connected neurons. Then a general analog operation representing the convolution of these processes might be sufficient to perform each of the functions of the thinking brain. In this view of cognition, thoughts are represented as a set of “tuned” oscillating circuits within the neural network, only emerging as discrete symbolic units in reality. This would pose several challenges to anyone interested in more-efficient simulation of brain functions.

INTRODUCTION

In the early years of computer science, intelligence was theorized to be an emergent property of a sufficiently powerful computer program. The philosophy of Russell and Wittgenstein suggested that conscious thought and mathematical truths were both reducible to a common logical language, an idea that spread and influenced work in fields from linguistics to computational math. Early programmers were familiar with this idea, and applied it to the problem of artificial intelligence. Their programs quickly reproduced and surpassed the arithmetical abilities of humans, but the ordinary human ability to parse natural language remained far beyond even the most sophisticated computer systems. Nevertheless, the belief that intelligence can be achieved by advanced computer programs persists in the science (and science fiction) community.

Later in the twentieth century, others began to apply designs from biology to computer systems, building mathematical simulations of interacting neural nodes in order to mimic the physical behavior of a brain instead. The earliest of these, called perceptrons, were only able to learn a limited set of behaviors and perform simple tasks (Rosenblatt 1958). More powerful iterations with improved neural algorithms have been designed for a much wider range of applications (like winning at Jeopardy!), but a model of the human brain at the cellular level is still far from being financially, politically or scientifically viable. In response to this challenge, computationalists have continued the search for a more-efficient way to represent brain functions as high-level symbol operations.
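
The limitation of early perceptrons can be illustrated with a minimal sketch of the classic perceptron learning rule. The function name train_perceptron, the learning rate and the epoch limit are assumptions chosen for illustration, not details taken from Rosenblatt's paper. A single-layer perceptron settles on weights that solve the AND function, but it never stabilizes on XOR, which cannot be separated by a single line.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a single-layer perceptron with a step activation.

    Returns (weights, bias, converged), where converged is True if a
    full pass over the samples produced no weight changes.
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            delta = target - output
            if delta != 0:
                # Nudge the weights toward the correct answer.
                weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
                bias += lr * delta
                errors += 1
        if errors == 0:
            return weights, bias, True
    return weights, bias, False


# Toy truth tables (illustrative data, not from the cited paper).
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_weights, and_bias, and_converged = train_perceptron(AND)
xor_weights, xor_bias, xor_converged = train_perceptron(XOR)
print(and_converged, xor_converged)
```

With these settings the AND run reports convergence and the XOR run does not, no matter how large the epoch limit is made.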

Confabulation Theory is a much newer development: it proposes a universal computational process that can reproduce the functions of a thinking brain by manipulating symbols (Hecht-Nielsen 2007). The process, called confabulation, is essentially a winner-take-all battle that selects symbols from each cortical substructure, called a cortical module. Each possible symbol in a given module is population-encoded as a small set of active neurons, representing one possible “winner” of the confabulation operation. Each cortical module addresses one attribute that objects in the mental world can possess. The mental-world objects are not separate structures, but are rather encoded as the collection of attributes that consistently describe them. Ideas are encoded in structures called knowledge links, which are formed between symbols that consistently fire in sequence. It is proposed that this process can explain most or all cognitive functions.

THEORY

The confabulation operation happens as each cortical module receives a thought command encoded as a set of active symbols, and stimulates each of its knowledge links in turn, activating the target symbols in each target module that exhibit the strongest response. This operation then repeats over and over as the conscious entity “moves” through each word in a sentence, for example. Confabulation Theory seems to affirm the Chomskian notion of an emergent universal grammar, but the specific biological mechanism that enables the process is not fully understood. However, it must be efficient enough to achieve intelligence with the resources available to the brain.
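
A toy sketch may help to fix the idea of a winner-take-all selection. Everything in it is hypothetical: the function name confabulate, the tiny vocabulary, the link strengths, and especially the scoring rule, which simply sums the strengths of the knowledge links arriving at each candidate symbol. It is a simplification for illustration and should not be read as the formulation used in Confabulation Theory itself.

```python
def confabulate(active_symbols, knowledge_links):
    """Return the target symbol with the strongest summed input.

    knowledge_links maps a source symbol to {target symbol: strength}.
    Summing strengths is an assumed, simplified scoring rule.
    """
    support = {}
    for source in active_symbols:
        for target, strength in knowledge_links.get(source, {}).items():
            support[target] = support.get(target, 0.0) + strength
    if not support:
        return None  # nothing was stimulated, so there is no winner
    return max(support, key=support.get)


# Hypothetical link strengths between word symbols.
links = {
    "the": {"dog": 0.2, "sat": 0.05},
    "cat": {"sat": 0.6, "dog": 0.1},
}

winner = confabulate(["the", "cat"], links)
print(winner)
```

Here the symbol with the most combined support from the active symbols wins, and repeating the call with the winner added to the active set would mimic the step-by-step movement through a sentence.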

Research indicates that population encoding by itself cannot account for the bandwidth of cognitive functions when considering the available hardware, and some have proposed the idea that information must also be encoded within the relative timing of neural events (Jacobs et al. 2007). Recent experimental data suggests that some information is encoded in the “phase-of-firing” when an input signal interferes with ongoing brain rhythms (Masquelier 2009). These rhythms are generated by resonant circuits of neurons that fire in a synchronized fashion, and some circuits can be effectively “tuned” to resonate at a range of frequencies between 40 and 200 Hz (Maex 2003).
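
As a rough illustration of what phase-of-firing means, the sketch below computes where a spike falls within the cycle of an ongoing rhythm. The function name firing_phase, the 40 Hz rhythm, the spike times, and the assumption that a cycle begins at time zero are all chosen for illustration and are not taken from the cited studies.

```python
import math


def firing_phase(spike_time_s, rhythm_hz):
    """Phase of a spike within an ongoing rhythm, in radians from 0 to 2*pi.

    Assumes the rhythm is perfectly periodic and a cycle starts at t = 0.
    """
    period = 1.0 / rhythm_hz
    return 2.0 * math.pi * ((spike_time_s % period) / period)


# Hypothetical spike times in seconds, measured against a 40 Hz rhythm.
spikes = [0.0, 0.00625, 0.0125, 0.03125]
phases = [firing_phase(t, 40.0) for t in spikes]
for t, phase in zip(spikes, phases):
    print(f"{t:.5f} s -> {phase:.3f} rad")
```

The second and fourth spikes occur a full cycle apart but land at the same phase, so a code based on phase would treat them as the same message even though their absolute times differ.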

We now consider the possibility that these “tuned” neural circuits are an essential condition for language processing and other intelligent behavior: their dynamics implement an analog operation similar to the hypothesized confabulation. Sensory information arrives as a wave of excitation that propagates through the topology of the nervous system, and is sorted into harmonic components as it is diffracted by the structure of the brain. Each configuration of neurons exhibits resonance with specific frequencies or longer patterns of activation when exposed to this input, and can therefore process a continuous signal from any or all of the senses at once. The frequency-domain representation of the time-domain data is, in a rough sense, transposed into the specific resonant characteristics of various neural populations, which can then be trained to produce the same cascade of resonances that was caused by the original signal and thus recall the information from memory. In the case of a spoken word, the resonance cascade encoded in the brain structure is activated, and the waves of excitation move down the nerves to the muscles in a way that generates the sound. Speaking a word and listening to the same word would necessarily activate some common circuitry, as suggested by the existence of mirror neurons (Rizzolatti 2004).

The confabulation operation, and all cognition, could then be understood as an emergent property of suitably resonant neural populations, activated by and interfering with appropriate sensory signals. It has been hypothesized for some time that frequency-tuned circuits are essential to the process of sensory data binding and memory formation (Buzsaki 1995). If thoughts are indeed generated by an analog of the confabulation process, these “tuned” configurations would probably correspond to the hypothesized symbols more closely than simple populations of neurons. This points to a few looming challenges. First, the exact resonant properties are different for each of a neuron’s 1,000-plus connections, and vary with both time and interference in an analog fashion. Second, these resonances would need to be “sharp” enough to accurately identify and mimic the minute variations in sensory data generated by slightly different objects in the real world. Cochlear hair cells do exhibit behavior that suggests a finely-tuned response, each one activating in a constant, narrow band of input frequencies (Levitin 2006).
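
The notion of a “sharp” resonance can be made a little more concrete with a stand-in model. The sketch below uses the steady-state response of a driven, damped linear oscillator, which is an assumption made purely for illustration and not a model of a neuron or of a hair cell. The names resonant_gain and selectivity, the 40 Hz tuning and the quality factors are likewise hypothetical.

```python
import math


def resonant_gain(drive_hz, natural_hz, q):
    """Steady-state amplitude gain of a driven, damped linear oscillator.

    The gain is normalized to the static response, so it is roughly q
    when the drive frequency matches the natural frequency.
    """
    ratio = drive_hz / natural_hz
    return 1.0 / math.sqrt((1.0 - ratio ** 2) ** 2 + (ratio / q) ** 2)


def selectivity(detuned_hz, natural_hz, q):
    """Response to a detuned input relative to the on-frequency response."""
    return resonant_gain(detuned_hz, natural_hz, q) / resonant_gain(
        natural_hz, natural_hz, q
    )


# Compare a broadly tuned and a sharply tuned circuit, both centered on 40 Hz,
# when the input is detuned by 10 percent.
broad = selectivity(44.0, 40.0, q=2.0)
sharp = selectivity(44.0, 40.0, q=50.0)
print(f"broad: {broad:.2f}, sharp: {sharp:.2f}")
```

In this stand-in model the broadly tuned circuit still responds strongly to the detuned input, while the sharply tuned one largely ignores it, which is the kind of discrimination the second challenge calls for.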

If confabulation is to emerge from a resonating neural network, this network must be able to process arbitrary periodic activation patterns, along with simpler harmonic oscillations, and arrive at meaningfully consistent results each time it is exposed to the sensory data generated by a real-world object. Considering the mathematical properties of analog signals, this does not seem like an impossible task. As Joseph Fourier demonstrated in his seminal work on the propagation of heat, any reasonably well-behaved periodic signal can be represented as the sum of an infinite series of harmonic oscillations. This result suggests that it is at least possible to “break down” periodic or recurring sensory signals into a set of harmonic oscillations at specific frequencies, and within this framework, those frequencies would determine exactly where the sensory data is encoded on the topology of the brain. We can imagine that recurring harmonic components of the signals generated by the appearance, sound, smell, taste or texture of an apple would all contribute to the mental-world category of “apple-ness,” but that hypothesis doesn’t immediately suggest a method to determine exactly which frequencies persist, where they originate or where they are encoded in the cortex (aside from the “red” or “green” frequencies, I suppose).
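
Fourier's result can be demonstrated numerically with a discrete Fourier transform, which recovers the strength of each harmonic from one sampled period of a signal. The sketch below is a plain, unoptimized version written for clarity. The function name harmonic_amplitudes and the test signal (a fundamental plus a weaker third harmonic) are assumptions chosen for illustration and say nothing about how a brain might perform the decomposition.

```python
import cmath
import math


def harmonic_amplitudes(samples):
    """Amplitude of each harmonic in one sampled period of a real signal.

    Uses a direct discrete Fourier transform. Index k of the result is the
    amplitude of the k-th harmonic (index 0 is the constant offset).
    """
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        # Every harmonic except the offset and the Nyquist term appears
        # twice in the full spectrum, so its magnitude is doubled.
        is_edge = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 if is_edge else 2.0
        amplitudes.append(scale * abs(coeff) / n)
    return amplitudes


# One period of a hypothetical signal: fundamental plus a third harmonic.
n_samples = 64
signal = [
    1.0 * math.sin(2 * math.pi * 1 * i / n_samples)
    + 0.5 * math.sin(2 * math.pi * 3 * i / n_samples)
    for i in range(n_samples)
]

amps = harmonic_amplitudes(signal)
strongest = [k for k, a in enumerate(amps) if a > 0.01]
print(strongest, round(amps[1], 3), round(amps[3], 3))
```

Only the first and third harmonics survive the decomposition, with the amplitudes that were used to build the signal, and every other component is negligible.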

Within this purely qualitative framework, thought is simply produced by the unique configuration of neural circuits that exhibit the strongest resonance as signals from the senses interfere with concurrent network activity. Dominant circuits can even “shut off” other circuits by altering the firing threshold and delay of their component neurons, thus destroying the resonant behavior. This phenomenon would seem to prevent an overwhelming number of ideas from destructively interfering with each other, making normal linear thought generation possible.

Memory is reliable because the recurring harmonic components of experience are gradually “tuned” into the brain structure as sensory signals mold the emerging resonance into the cortex, effectively integrating the new idea with a person’s existing knowledge base. The overwhelming majority of information is eventually discarded in this process, as it does not recur in the decomposed signal. Only those harmonic components that persist are remembered.

IMPLICATIONS

A quantitative description of this framework is beyond the scope of this paper, but it would probably include a way to discretely represent as many of the individual resonances as possible. A generalized model could be built in a higher-dimensional “circuit space,” but it is unclear whether this approach would prove significantly faster than, or meaningfully different from, the latest data compression and signal processing algorithms. Programming truly intelligent behavior into this kind of machine would probably require considerable effort, as humans generally learn things like basic arithmetic over a long period of time, eventually processing simple equations by associating categories of numbers with their sums and products.

An investigation of human and animal behavior within this framework might yield better results for now. The obvious place to start is with music cognition, as Daniel Levitin has done in his book. Further research on the connections between music theory and induced brain rhythms is advisable.

The framework is also interesting because it would require very little demarcation between the mental and physical realms, with information entering and exiting the body seamlessly through sensory perception and behavior, respectively. If we imagine that the signal generated by an individual’s name might exhibit a characteristic interference with the signal generated by the personal pronouns “I” and “me,” then self-awareness might only emerge in a community that passes around the right messages. Philosophically, all conscious thought would then be intimately dependent on reality, as trained intelligent brains would only be able to reproduce the various harmonic patterns that recur in reality.

As broad pattern-matching devices, humans perform precise computations rather inefficiently, and Turing machines will probably remain the most appropriate tool for that specific job. However, imprecise computations like those required for effective facial recognition might be greatly optimized by the subtle oscillatory characteristics of neural circuitry. Those attempting to achieve artificial intelligence will benefit from a careful evaluation of the data that their models can represent and preserve.

SOURCES

Buzsaki, Gyorgy and Chrobak, James. “Temporal structure in spatially organized neuronal ensembles: a role for interneuronal networks.” Current Opinion in Neurobiology, 5(4):504-510, 1995.

Fourier, Jean Baptiste Joseph. The Analytical Theory of Heat. New York: Dover Publications, 1955.

Gibson, William. Neuromancer. New York: Ace, 1984.

Hecht-Nielsen, Robert. “Confabulation theory.” Scholarpedia, 2(3):1763, 2007.

Jacobs, Joshua, Michael Kahana, Arne Ekstrom and Itzhak Fried. “Brain Oscillations Control Timing of Single-Neuron Activity in Humans.” The Journal of Neuroscience, 27(14):3839-3844, 2007.

Levitin, Daniel. This is Your Brain on Music. New York: Plume, 2006.

Maex, R. and De Schutter, Erik. “Resonant synchronization in heterogeneous networks of inhibitory neurons.” The Journal of Neuroscience, 23(33):10503-10514, 2003.

Masquelier, Timothee, Etienne Hugues, Gustavo Deco, and Simon J. Thorpe. “Oscillations, Phase-of-Firing Coding, and Spike Timing-Dependent Plasticity: An Efficient Learning Scheme.” The Journal of Neuroscience, 29(43):13484-13493, 2009.

Rizzolatti, Giacomo and Craighero, Laila. “The Mirror-Neuron System.” Annual Review of Neuroscience, 27:169-192, 2004.

Rosenblatt, Frank. “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain.” Psychological Review, 65(6):386-408, 1958.