AI In The Future
The plausible, the implausible and the plain eccentric

No journey through the AI landscape would be complete without pausing to peer into the distance – to see what’s up ahead and to wonder what lies over the horizon.

So in this module we will cover topics such as:

The technology of quantum computing: Here we will look at how quantum computers and a quantum-based Internet could change everything – if the technology can be made to work in a practical way, and at scale.

Towards a new paradigm for computing: We will also ask what lies beyond the von Neumann model of computing by trying to re-imagine what we mean by programming and asking whether it is possible to remove the human-defined intellectual constructs (e.g. APIs, programming languages) that presently limit how computational structures interact with each other.

One of the many insights gleaned from Module 2, which is focused on neuroscience, is that the brain seems to process information in a complex web of ‘overlapping, dynamic loops’ which may not be directly replicable within our present paradigm of binary computing.

For example, the only way we know how to represent a given unit of information (e.g. a defined idea, like a fish or the speed of a car) is by using some pattern of 1s and 0s. The pattern might be very simple, like 01000101, or it might be a complex data structure containing many interconnected tables.
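To make this concrete, here is a minimal, purely illustrative Python sketch (the variable names, field names and values are invented for illustration) showing both extremes: a bare bit pattern and a richer, but still rigid, data structure.

```python
# A single unit of information ("the speed of a car") as a raw bit pattern.
speed_kmh = 69
bit_pattern = format(speed_kmh, "08b")   # '01000101', nothing but 1s and 0s

# The same idea wrapped in a more complex, but still fixed, data structure.
car_reading = {
    "quantity": "speed",
    "value": 69,
    "unit": "km/h",
    # Links to other records are static references, not the dynamic,
    # overlapping relationships described below for the brain.
    "linked_records": ["gps_track_0042", "engine_log_0042"],
}

print(bit_pattern, car_reading["unit"])
```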

But the brain seems to be able to represent a given unit of information in a way that includes a multitude of dynamic, overlapping relationships to other units of information that are in different categories (e.g. sound, vision, memories, spontaneous ideas etc.).

This suggests that there may be another paradigm of computing that lies beyond what we currently understand, where information is represented and processed using a much looser, but ultimately far richer and more powerful, approach than that used in computing today.

We’ll be taking a look at this question in some detail, in an effort to determine whether the computational model used in the brain is even addressable with our current concept of computing at all.

Things we don’t know: Here we will probe the gaps in our theoretical understanding of AI by discussing the ‘known unknowns’ and the possible existence of important ‘unknown unknowns.’

The material in this section will be bold and controversial because we will be road testing and, when required, challenging some of the prevailing scientific consensus as it relates to AI.

One of the several key problems in AI is that we do not yet have a scientific definition for the very idea of “intelligence”, which might come as a surprise considering how entrenched the term “artificial intelligence” has become in science and popular culture.

The elephant in the boardroom here is that “intelligence” and therefore “artificial intelligence” are not actually scientific ideas. Instead, they are vague concepts that currently have no scientific basis.

We do have a long list of actual scientific ideas – like mass, angular momentum, electric field strength, gravitational attraction and many, many more – which are precisely defined quantities that can be accurately and reliably measured. All of these ideas are rigidly connected to a sound axiomatic foundation by well-proven and robust mathematical structures. This is what real, hard science looks like.

In comparison, “intelligence” is just not in the same class – which is why I said that it is not a scientific idea. We look at this in much greater detail in Module 3.

This fact has somewhat disturbing implications for what certain parts of the AI community are doing and where they are headed. Many people who are working in AI might not want to hear this message, but it is a fact that the field within which they work has no axiomatic basis.

One of the practical problems that arises from this is that we do not yet know how to encode abstract ideas, like “imagine”, “think” or “understand” in a way that correlates with how those ideas are manifested in a biological brain.

It is easy to create an arbitrary model for what “think” means in a given AI system, but we currently have no way of knowing whether that is how “think” is implemented in the human brain – the reason being that we do not yet have a sufficiently advanced understanding of how the brain works.

At a practical level, our inability to codify abstract ideas means that we do not know how to define the idea of, say, a “gorilla” – so that we can efficiently analyse a picture and reliably identify a gorilla based on the semantic ideas that define a gorilla. We simply don’t know how to do this – and the hard reality is that nobody currently even knows where to start.

Because we don’t yet know how to answer this hard question, all we can currently do is use a brute force approach, which in this case means building a hugely complex computational structure that has millions of adjustable knobs and enough functional complexity so that it can analyze every single pixel in the picture in order to reliably identify a gorilla. But what if the picture is of a gorilla hanging upside down? Or an artist’s stylized rendition of a gorilla? Or a man half dressed in a gorilla suit?
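As a rough illustration of what “millions of adjustable knobs” means in practice, here is a minimal sketch of such a brute-force classifier, assuming PyTorch is available; the layer sizes, the 224x224 input assumption and the gorilla / not-gorilla framing are arbitrary choices made for illustration, not a recipe.

```python
import torch
import torch.nn as nn

# A deliberately small convolutional classifier of the 'brute force' kind
# described above: it looks at every pixel and exposes a large number of
# adjustable knobs (weights and biases).
class TinyGorillaNet(nn.Module):
    def __init__(self, num_classes: int = 2):  # gorilla / not-gorilla
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # assumes 224x224 inputs

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinyGorillaNet()
logits = model(torch.zeros(1, 3, 224, 224))           # one blank 'image'
num_knobs = sum(p.numel() for p in model.parameters())
print(f"Adjustable knobs: {num_knobs:,}")
```

Even this toy network has a couple of hundred thousand adjustable parameters; production image classifiers typically have tens of millions.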

A 5-year-old child would not be fooled by these examples, but a complex neural network might well be.

Even though the resulting deep neural network is very powerful and useful, it is based on a totally crap technical solution. There must be a better, smarter way to do this!

In years to come I believe we will look back on today’s CNNs much as we now look back on IBM’s Deep Blue – both will be viewed as different generations of the ‘brute force’ approach, realised at different levels of abstraction (chess being an easier problem than image recognition, though you can’t solve image recognition without first knowing how to solve chess).

I really do think that we now need to accept that the ideas contributed decades ago by people like Alan Turing, John von Neumann and Marvin Minsky, and more recently by people like Geoffrey Hinton, have taken AI about as far as it can go.

There is of course still plenty of room for steady incremental improvement for years to come – maybe decades even – but AI as it is currently defined is stuck at a certain intellectual level.

The good news is that based on our current, limited understanding of neuroscience, it is obvious that there must exist levels of intellectual abstraction that lie far above the one we’re using to build AI systems today.

I firmly believe that there are ways to be far, far smarter about building AI systems. The technical challenge is to discover those ways.

If AI is to progress to the next level then we are going to have to confront hard questions like those identified above head-on.

And we will need to be prepared to deal with the consequences – even if that means revisiting some fundamental ideas that form part of so-called “settled science”. This, by itself, will prove to be an insurmountable barrier for many credentialed, respected scientists.

AI awaits the arrival of a fresh mind, maybe an outsider, who has the right balance of chutzpah, intellectual horsepower and utter genius to clearly see what others in the field cannot: I am talking about someone in the class of Michael Faraday, Isaac Newton, Marie Curie, Kurt Gödel or Albert Einstein.

Making sense of media ‘hot button’ topics: We’ll also look at Artificial General Intelligence (AGI), brain uploads, machine sentience, superintelligence, the singularity, ‘sim theory’, SkyNet scenarios and whether we should give robots human rights because they might have feelings.

I’ll just say at this point that some of these ideas are utterly silly but you’ll need to get into the content to find out which is which.

Here’s a bit more detail on some of the topics that are covered in this Module:

Quantum computing

Although still embryonic, quantum computing currently sits somewhere between science and technology.

It is currently positioned as a way to solve a certain class of computational problems that lie beyond the reach of even the most powerful conventional computers. There is controversy about what proportion of the problems that can be addressed by conventional computers can also be addressed by quantum computers.

More tantalizing – and still highly controversial – is the extent to which quantum computing will eventually replace conventional computing (putting to one side the current need for cryogenic cooling which would somewhat complicate mobile applications…).

But the really big idea, and one that is even more controversial, is whether it is possible to instantaneously send information using quantum entanglement.

Mainstream science says that this is impossible, but clever engineering implementations have suggested that some form of information transfer over long distances – without the need for any intervening transmission infrastructure and without sending any matter – might be possible after all.

This is a very controversial idea, however, which may or may not be something we can do.

But if this turns out to be possible then it would open the door to a new form of computational structure where the ‘neurons’ in each neural network would be instantaneously connected to each other – as well as to all of the neurons in any number of additional neural networks. This seems to be what might be happening in a biological brain.

Beyond hyperparameter selection: New directions for AI theory

We will also look at how the shift to an AI-centric world could precipitate a sea change in our conception of computing.

For instance, the von Neumann architecture forms the basis of every single computer in operation today – but this brilliant idea is now more than 70 years old and is not well suited to efficiently executing AI algorithms, as we will see.

Another important task is to identify and fill gaps in our theoretical understanding of AI.

One example of this is that important types of AI system have no short-term memory ability, at least not in the same way as we understand it: when a robot falls over it has to start from scratch and train itself all over again.

Perhaps surprisingly, there is presently no theory that would allow the robot to say “This position is similar to one I got myself into three weeks ago” and then re-use that learning in the new, slightly different situation.

This is a bit like the situation where we train a computer to ‘learn’ to recognise a rhesus monkey by showing it lots of pictures of rhesus monkeys. A child would only need a few examples and answers to a few questions. The child somehow understands the ‘idea’ of a rhesus monkey, but the computer does not.

Another example is that, as the building blocks used to build AI systems become more numerous and more powerful, leading AI engineers and scientists are focusing more on experimentation – which means devising clever ways to bolt together AI systems, rather like how you might try to create a range of different models from a kit of Lego parts – and deciding on the optimum set of “hyperparameters” for a given neural network:

There are essentially two sets of knobs that need to be adjusted to convert a raw neural network that doesn’t actually do anything into something that does something useful: on the left is a control panel of maybe 100 switches and knobs which all need to be set into specific positions before the training process starts. The settings on this panel do not change during training, but if you get the selection wrong the training process might fail. These knobs and switches are called hyperparameters.

On the right is a second, much larger control panel that might contain millions of knobs (two for each ‘neuron’, or unit, in the neural network). The designer uses a software programme to make iterative adjustments to these knobs in order to try to converge on a set of values that allows the network to reliably recognise a given pattern (e.g. gorilla) that forms part of a given data structure (e.g. image file).
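Here is a minimal sketch of the two panels, assuming Python and NumPy, and using a toy linear model in place of a real neural network purely to show the mechanism: the hyperparameters are fixed up front, while the trainable knobs are nudged iteratively by software.

```python
import numpy as np

# Left-hand panel: hyperparameters, chosen before training and then left alone.
learning_rate = 0.01      # 'alpha'
num_iterations = 1000

# Right-hand panel: the trainable knobs (weights), adjusted iteratively.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                                   # toy inputs
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)  # toy targets
w = np.zeros(5)                                                 # knobs start in arbitrary positions

for _ in range(num_iterations):
    grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of the mean squared error
    w -= learning_rate * grad               # nudge each knob a little

print(np.round(w, 2))   # the knobs converge towards the underlying pattern
```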

Here are a few examples of the hyperparameters (see the sketch after this list):

  • Number of hidden layers: how many layers of neurons (units) between the input and the output?
  • Number of units per layer: how many neurons (units) on each layer?
  • Activation function: which non-linearity should each unit use (e.g. sigmoid, tanh, ReLU, leaky ReLU etc.)?
  • Learning rate: what value of ‘alpha’ should you choose?
  • Training strategy: do you randomize the training examples, segment them into logically distinct groups, or split them into mini-batches?
  • Minibatch size: how many training examples per mini-batch?
  • Number of iterations: how many iterations should you run to achieve convergence for a given training set?
  • Regularization technique: which approach are you going to use to help the network generalize – L2 regularization, dropout regularization, early stopping or data augmentation?
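As a sketch only, here is one possible way to collect the choices listed above into a single configuration; the names and values are invented for illustration, not recommendations.

```python
# Illustrative hyperparameter configuration, fixed before training begins.
hyperparameters = {
    "num_hidden_layers": 3,
    "units_per_layer": [128, 64, 32],
    "activation": "relu",            # could be "sigmoid", "tanh", "leaky_relu", ...
    "learning_rate": 0.001,          # 'alpha'
    "training_strategy": "shuffle",  # or split into logically distinct groups
    "minibatch_size": 32,
    "max_iterations": 10_000,
    "regularization": "dropout",     # or "l2", "early_stopping", "data_augmentation"
}
```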

All of these decisions have a bearing on the performance of the final neural network and an incorrect set of decisions can easily result in the network failing to converge during the training phase.

But how do we know that a given neural architecture or set of hyperparameters is sufficient to solve a given AI problem?

The answer is that we don’t: when struggling to solve a hard AI problem, we have no way of knowing whether we just need to be more creative – essentially just try harder – or whether the problem will remain fundamentally unsolvable without one or more new functional blocks.

This applies when designing a neural network, as well as at a higher level where we get into the question of whether the idea of a neural network is the right one for a given problem.

As we will see in Module 2 (Neuroscience), the neural architectures of the cerebellum and the cerebral cortex are radically different, and it is likely that this is for a reason.

But so far we have just one class of neural network and no theory for calculating the optimum neural architecture for a particular problem.

Hot buttons

We will lastly spend some time on the ‘hot button’ topics that are so favored by the media.

Thinking on AI has been self-organizing into two rival factions, each of which has a very different view on what AI is and where it will take us.

Given that some of the doomsday scenarios feared by people like Elon Musk can be linked to the dystopian worlds portrayed in sci-fi movies like Terminator and The Matrix, it is easy to see why the media has been busy whipping things up.

Misleading headlines like “Google AI creates its own ‘child’ that’s more advanced than systems built by humans” do a great job of scaring people or making them angry, which is another way of saying that they drive traffic and ad revenue.

But they also polarise opinion – a situation that is made worse when certain commentators make outlandish statements that have no basis in fact and cannot even be supported by serious logical arguments.

In this module we will use our understanding of AI to get a proper grip on all of these topics.