Friday, 11 March 2011

Preadaptation Challenged By Functionality Discreteness

Evgeny Selensky

In this paper we consider some interesting and controversial issues in biology which, in our opinion, present considerable challenges to the contemporary neo-Darwinian paradigm. We do not claim to provide deep insight into the problems discussed. The experimental results presented are due to others. We only attempt to analyse those issues in light of the available empirical evidence from two competing standpoints: the widely accepted theory of macroevolution and the alternative theory of Intelligent Design. These problems are at the centre of a heated discussion because they have a bearing on the origin of the world and of life and consequently touch upon religion and philosophy.

Preliminary Considerations

Intelligent Design (ID) is a novel mathematical (probabilistic) theory which claims that for a complex object it is possible to establish, with an acceptable level of assurance, whether this object has been influenced (configured or made) by an intelligent agent. In other words, in the space of possible configurations of complex systems there may exist zones bearing a signature of intelligent agency which can be reliably identified. As regards our discussion below, ID claims that living organisms are an example of intelligently configured systems.
The theory of macroevolution (ME) rests upon the concept of preadaptation that plays a central role in arguments against Intelligent Design.

It is important to realise that ID does not question the existence of preadaptation or microevolution since both of them are observable. However, one should ask two important questions: to what extent can organisms adapt and, likewise, what are the limits of microevolution? What is presented below shows that neither of these questions is trivial.

In addition, a certain asymmetry of ME has to be pointed out in relation to ID. While ID in principle does not contradict preadaptation or self-organisation, requiring only the correct identification of their limits, ME completely rules out any intelligent cause or interference in the origin or development of life. A detailed discussion of whether the assumptions of ME or ID can be regarded as purely scientific, or of what the notion of scientific should really mean, is beyond the scope of this paper. Here we shall only say the following:
  • ID is scientific to the extent that it operates with observable phenomena, formulates and tests its own predictions (see e.g. [UncommonDescent, ID Predictions]) and is falsifiable in the sense of Karl Popper [Wikipedia, Falsifiability]. For a detailed account, see [UncommonDescent, Frequently Asked Questions]. To analyse an outcome of some process it is not necessary to be able to replicate the process itself. There exists the scientifically legitimate option to analyse the information content of the available outcome [Wikipedia, Kolmogorov complexity]. 
  • The principles of ID can successfully be used in forensics, sociology, psychology and other disciplines.
  • Evidence such as fossil records that have traditionally been seen as supporting ME is inconclusive and can also be interpreted in terms of ID. E.g. the sudden emergence of a wide diversity of body plans in Cambrian strata (the so-called Cambrian explosion) does not fit into the classical Darwinian scheme of drastic complexity changes as a result of infinitesimal steps over vast periods of time.
  • Of a number of theories available we should choose the one that best fits experimental data. In our estimation, Occam's Razor is much better satisfied by ID than by ME [Wikipedia, Occam's Razor].

Preadaptation (also referred to as exaptation) is a shift in the function of a trait at different stages of its development. It becomes possible as a result of an ancestor randomly acquiring a characteristic that after some generations becomes useful for progeny. Usually it is interpreted in light of evolution [Wikipedia, Exaptation]. It is commonly believed that preadaptation allows one to explain the increase of structural and functional complexity of organisms in the context of common descent. Attempts have been made at presenting certain processes in biological systems as those caused by preadaptation. We do not question the possibility of any kind of preadaptation, be it biochemical, anatomic or behavioural. The purpose of this paper is to informally show that the probability of preadaptation and, consequently, that of ME are tightly bounded by biochemical complexity.

A popular example used in support of ME puts in one sequence in ascending complexity the vision organs of contemporary organisms: from light sensing cells of worms to eye cavities to human eyes. An illustration of this example was published in [Ichas 1994]. To facilitate discussion, we present a variant of it here in Fig.1.

Fig.1. Different organs of vision shown in ascending complexity interpreted as stages of a single evolutionary process (taken from [Wikipedia, Eye]). 

It is generally believed that the organs in Fig.1 are different stages of the same evolutionary process that consists in gradual acquisition of biological complexity in descendant species. Another example of this kind is a supposed functional switch of plumage from thermoregulation to flight. It is commonly believed that examples like the above are sufficient to prove that there are no complexity gaps not only between different species but also between inorganic matter and living things.

As an aside, the weakness of this argumentation can readily be seen from the very slim survival probability of transitional forms in which some critical organs are under-developed, as was first pointed out by George Mivart [Wikipedia, Exaptation]. By contrast, every organ of vision in Fig.1 serves its purpose quite satisfactorily: the light-sensitive cells of worms detect the slightest changes in brightness, while the human eye helps analyse distances to objects, their size and colour in great detail. Moreover, we think it much more probable for quite the opposite to happen in practice, namely the reduction of unused functionality due to significant changes in living conditions over time. Examples are the almost complete loss of eye-sight in moles, or the degeneration of limbs in some inhabitants of the ocean deep and in other species which do not need much locomotion. For another example, see [Gauger et al 2010], which shows that organisms often favour functionality reduction over the more costly adaptations that acquire new functions, even if the paths leading to new functionality are very short.

It is also known that microevolutionary adaptations (often caused by just a few mutations) usually work at the protein level. For instance, the hemoglobin of mountain species such as the llama binds oxygen more strongly than that of species living on the plains; the ability is even stronger in llamas before they are born because it is during prenatal development that they need it most. This adaptation is a direct result of several mutations slightly altering the protein structure. Having said this, we have little ground to claim that protein ME takes place: the folds of protein domains are largely the same for all organisms; proteins (or at least protein domains) of higher organisms have virtually the same complexity level as bacterial ones [Finkelstein, Ptitsyn 2002]. This suggests that macroevolution requires substantial changes at the higher levels of biological organisation: cellular, tissue or organ. But this is not all.

Complexity of Life: irreducible or redundant?

The principal argument of ID is the irreducible complexity of organisms. Life is something much more complex than inorganic matter and its complexity cannot be taken away without destruction of life itself. The removal of a single component from the whole set of interacting parts in a living organism more often than not results in mutilation or death. The notion of irreducible complexity was first proposed by Lehigh University biochemistry professor Michael Behe in his "Darwin's Black Box" [Behe 1994] and then developed by [Dembski & Wells 2007], see also [Behe 2000].

There are many known natural phenomena that include a number of co-acting factors whose joint operation results in something quite different from the operation of each factor in isolation. Resonance is one such example whereby the amplitude of oscillations of a mechanical system quickly increases as soon as the frequency of the external force coincides with an eigenfrequency of the system. Another example is a catalytic reaction occurring at much higher rates in the presence of special chemical agents called catalysts than without them. Sometimes the presence of catalysts is necessary even to start a reaction.

According to ID, life is characterised by irreducible complexity. This manifests itself in the existence of cyclic causal relations in the functioning of an organism: in their simplest form, failure-free functioning of component A depends upon the presence of component B and vice versa. The theory of evolution attributes this to gradual switching from one function to another via multiple preadaptations. Furthermore, there are many examples of redundantly complex biosystems which might have been acquired as a result of a series of undirected modifications. What Behe calls irreducibly complex systems might then have come about as a result of subsequent simplifications, which I think is a plausible scenario as long as we are within a microevolutionary framework. In other words, once a formal function as per [Abel 2011] is present in the biosystem, its modification towards either greater complexity or simplification is in principle possible via microevolutionary pathways. However, even though preadaptational scenarios are in principle possible, Darwinian gradualism faces multiple particular challenges. As [Violovan 2004] points out, evolution of an irreducibly complex system by preadaptational rearrangement of existing functional blocks is extremely unlikely because it requires the concurrence of many random processes (simultaneous mutations of receptors and their ligands, enzymes and their substrates/modulators, etc.). At the same time, natural selection must be directed towards the full function all the way through.

I agree with David Berlinski who, in his review of Michael Behe's book in the press (see the quote on the book cover), likened this problem to quantum physics, where energy transfer, which on the macrolevel appears to be a continuous process, is in fact discrete on the quantum level. Behe in essence claims that something similar happens on the biochemical level of life, where Darwinian-type gradualism does not appear to be satisfactory because the probability of multiple gradualistic preadaptations in a number of biochemical systems is prohibitively small.

Let us try to develop a better understanding of the argument: is there a place for the concept of irreducible complexity in biology? The main idea of this paper is taken from [Behe 1994, Behe 2007]. Here we discuss the concept of complexity of life from the computer science perspective.

Let us return to the illustration of the evolution of eye-sight in Fig.1. This picture is meant to demonstrate the smoothness of transition from each preceding form to its successor. Unfortunately, this illustration, like many other examples of this kind used by proponents of ME, does not provide enough concrete detail, whereas it is in considering such details that irreducible complexity is revealed. The omission of details creates an impression of simplicity. Consequently it appears unnecessary to work out how each particular component of an organ takes part in the functioning of the whole at each stage of the presented "transition" from the simple to the complex. As Michael Behe rightly points out, this and other similar examples can only be valid if they are able to demonstrate convincingly the ease (i.e. an acceptable mathematical probability) of appearance of every new component that makes the system more complex than its predecessor. It is not that the human eye cannot function without the lens; it can indeed. Rather, the eye is a system composed of irreducibly complex biochemical systems responsible for sensitivity to light, focusing the image, etc. Just ensuring the right curvature of the retina cell layout requires a complex biochemical mechanism.

Interestingly, Darwin in "On the Origin of Species" [Darwin 1859] did not want to discuss the question of the origin of life, though he vaguely mentioned Creation. He considered the existence of such incredibly complex and perfectly functioning biological systems as the human eye hard to explain with the help of his famous theory. As Michael Behe argues, that was to be expected because the biology of Darwin's time could not address those issues. In other words, the state of the art in biology of the 19th century would interpret Fig.1 as a gradual evolutionary development and such an interpretation would then be regarded as plausible. Contemporary biology, however, requires us to carefully review explanations of this kind because they are now much less credible than before. The complexity of the initial point in this series, i.e. of a light sensing cell, is already very high. But even if we disregard this, we still cannot see that each next stage of the supposed process is simple enough in the Darwinian sense. Consequently such a process is highly improbable, as we shall see through some figures below. Before we expand on this subject, let us make a few preliminary remarks.

Firstly, since ME claims there is no need to employ any kind of decision making agents at any stage in the development of life, including its origins, it relies solely on unintelligent search in multi-parameter spaces that is automatically directed by natural selection. Natural selection is activated by changes in the environment of a biological system. Search parameters are height, weight of an organism, the ratio of the area of the wing to body mass for birds, birth rate, structure of various vital organs, the complexity of the nervous system, behavioural features, etc. It is clear that the number of parameters necessary for the search quickly grows with the increase of structural complexity of organisms, from the simplest to the primates.

Secondly, a step in the search is a very small change in the magnitude of one or more biological parameters caused by mutations of one or more genes. The most probable scenario is a successful change of one parameter at each step. The probability of successfully changing more than one parameter at a time quickly reduces as the number of parameters increases. A change in parameter values is considered successful if, as a result, an organism/population is better adapted to its current environment.
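The drop in probability described above can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that each parameter changes successfully with the same probability p in a given step and that the changes are independent; neither assumption comes from the experimental literature, and the value of p is hypothetical.

```python
def joint_success_probability(p: float, k: int) -> float:
    """Probability that k independent changes all succeed in one step.

    Assumes every change succeeds with the same probability p and that
    the changes are independent; both are simplifying assumptions.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return p ** k


# Hypothetical per-parameter success probability, chosen only for illustration.
p = 0.01
for k in (1, 2, 3, 5):
    print(k, joint_success_probability(p, k))
```

Under these assumptions the joint probability shrinks geometrically with the number of parameters that must change together.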

To summarise, evolution is nothing but optimisation of the structure and functioning of biosystems in their habitats. The question, however, is how wide the limits of evolution are. While micro-changes (roughly, those taking place within species) are easily observable, macroevolutionary changes such as the emergence of new species have not been observed even in the simplest living forms, whose generations are short enough. However, experiments on aphids reported by the Soviet biologist G. Shaposhnikov in the 1950s and 1960s do describe speciation. Shaposhnikov achieved reproductive isolation of aphid populations over as short a period of time as a few months. Note that reproductive isolation constitutes an important characteristic of a species. [Rasnitsyn 2002] argues that the incredibly high speciation rate could have been due to the selection pressure stemming from a single criterion, the type of host plant. There are also reasons to believe that epigenesis might have contributed to the result in this case [Chaikovsky 2006]. Furthermore, it is conjectured that in his subsequent experiments Shaposhnikov might even have crossed the boundaries of a genus, but this has unfortunately never been published and is less certain [ibid]. Nonetheless, even though speciation is possible, in practice the formation of higher taxa is still extremely unlikely, as we argue herein.

Consequently, ME is much less obvious. As a matter of fact, in order for such complex mechanisms as eye-sight, blood-clotting, metabolism, photosynthesis, DNA transcription/translation and the like to form as a result of Darwinian evolution it is necessary to assume that search must have taken place over a large number of parameters simultaneously, which severely limits the probability of finding an optimum without the help of intelligent decision making agents.

An Illustration from Computer Science

Let us illustrate this with an example from computer science. Assume we are required to find an optimal configuration S of a known system under certain given conditions. Assume also that S depends on two parameters, X and Y. Mathematically, this is expressed as S = S(X,Y). The optimisation function Q(S(X,Y)), which is often called a fitness or objective function, reflects the quality of a given configuration. In this particular case of two arguments, the fitness function is an ordinary 3D surface (Fig.2). If more parameters are involved, we will have a hyper-surface while all our reasoning will remain unchanged.

Fig.2. An objective function of two arguments.

For biological systems, the objective function expresses the degree of their adaptation to the environment. Without loss of generality, assume that small values of the function correspond to better adaptation and consequently to preferred parameter value combinations (known as tuples). In this context, preferable configurations will be represented by pits on the fitness surface, and configurations that should be avoided by peaks. In other words, we are interested in the minimisation of Q (for maximisation all we need to do is negate Q). Our algorithm that mimics Darwinian evolution will roam the landscape looking for pits, ideally for the deepest pit, which corresponds to the best configuration S*(X,Y). The quality of adaptation of a given biological system is expressed by a point on the fitness landscape. The neighbourhood N(S) of a configuration S is the collection of other configurations reachable from S by one move of the search algorithm. Quite clearly, the biological system itself does not have any information about the preferred tuples because searching is done automatically.

Such computational schemes are called local search algorithms and are well studied in combinatorial optimisation. It is well known that local search is prone to getting trapped in local optima, i.e. in solutions that are best in a given neighbourhood but not necessarily globally [Michalewicz & Fogel 2002]. If our search algorithm always selects only improving moves - which is what microevolution is doing - then once it finds a local attraction basin, it will not be able to move out of there to continue looking for better solutions. Clearly, any point around the bottom of the basin is higher, and consequently the respective parameter tuples are worse, than at the bottom itself. It is not possible therefore to get out of this basin without controlling the search. This is known as the problem of local minima.
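The trapping behaviour can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: the landscape q, the integer grid, the step size and the starting point are all hypothetical and were chosen so that one basin is shallower than the other. It is not a model of any biological system.

```python
def q(i: int, j: int) -> float:
    """Hypothetical fitness landscape on an integer grid (lower is better).

    x = i / 10 and y = j / 10. The x-part is a tilted double well with a
    local minimum at x = 1 and a deeper, global minimum at x = -1.
    """
    x, y = i / 10, j / 10
    return (x * x - 1) ** 2 + 0.3 * x + y * y


def neighbours(i: int, j: int):
    """N(S): states reachable in one move (one parameter changed by one step)."""
    return [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]


def greedy_descent(start):
    """Accept only improving moves; stop when no neighbour is better."""
    current = start
    while True:
        best = min(neighbours(*current), key=lambda s: q(*s))
        if q(*best) >= q(*current):
            return current  # no improving move left: a local minimum
        current = best


trapped = greedy_descent((20, 10))
print(trapped)                      # (10, 0), the shallower pit
print(q(*trapped), q(-10, 0))       # the pit at (-10, 0) is deeper
```

Starting on the right-hand slope, the search settles in the pit at x = 1 and never reaches the deeper pit at x = -1, because every path between them passes through worse states.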

Local search behaviour can be likened to the motion of a heavy ball (Fig.3) from an initial uphill position (shown dashed) on an uneven terrain acted upon by gravity and friction only. The ball, once in the pit on the left (labelled as the local minimum), will not have enough momentum to get over the hill to the right of it. Therefore it will not be able to get into the globally best pit (global minimum) beyond that hill. Instead, it will only oscillate around the bottom of the left pit and eventually stop there due to friction.

Fig.3. A heavy ball moving under gravity and friction from its initial state (dashed line) is trapped in a local minimum (the pit on the left). The global minimum (on the right) is unreachable due to insufficient momentum.

Let us now assume the algorithm can somehow deal with the problem of local optima. Even that is not enough. Michael Behe states the following. In order for biochemical search to find another local minimum, it is necessary to successfully vary X and Y simultaneously, which has a much smaller probability than varying a single parameter as long as we do so by Darwinian blind local search. This is because, of all possible (X,Y) parameter tuples, far from every one is biologically feasible. For example, human eye-sight is ensured by incredibly delicate tuning of a whole number of retina cell parameters (physical and chemical properties of its proteins, 11-cis-retinals, transducin, rhodopsin and others, as well as the magnitude of the difference between the concentrations of sodium ions on either side of the cell membrane, which is necessary for an electric signal to be generated and sent to the brain). A small deviation from the fine-tuned values can cause the relevant complex biochemical reactions to fail, which will result in loss of eye-sight. Here Behe's mousetrap effect manifests itself: the particulars glossed over by ME lead to insurmountable peaks on the fitness landscape, which prevent Darwinian local search from finding progressively better parameter value combinations for biological systems.

The mousetrap analogy was presented in "Darwin's Black Box" and is briefly as follows. In order to construct a mechanism such as an ordinary mousetrap we are required to set the values of many parameters simultaneously: the strength and orientation of the spring, the angle of action, the mechanical properties of the clamps which attach the spring to the base, the relative position of the bait, etc. The complexity of the whole structure is thereby increased in comparison with the complexity of its individual components (the spring, clamps and base) in order to make sure the mechanism works as desired. So we have two options: we either assume the purposeful setting of the many parameters by the person who designed and built the mousetrap (an intelligent agent) or deal with the incredibly small probabilities of multiple coincidences. This argument, in our opinion, speaks clearly in favour of ID in relation to biological systems.

Let us now return to local search. Usually the search is driven out of local optima using special algorithmic controls known as meta-heuristics. A meta-heuristic kicks in as soon as it detects that the search is cycling, having come back to a previously visited state. There are many meta-heuristics but they all serve one purpose. They "persuade" the search to temporarily accept worse states in the hope that after a certain number of moves the algorithm will leave the local attraction basin and continue searching for better solutions. Some meta-heuristics (notably, tabu search [Glover & Laguna 1994]) in addition hold information about the N previous moves in order to avoid repeating past mistakes. What is important to note is that meta-heuristics require much more control than blind local search. They would consequently be even stronger evidence in favour of ID if anyone could show that search in biological systems had any meta-heuristic component. By contrast, classical Darwinian search completely rules out any temporary compromises that would lead to accepting worse configurations in an attempt to get out of local minima. So it cannot cope with local optima by itself. But maybe there are scenarios where it can be driven out of local optima automatically by some other means? Let us consider how probable those scenarios are.
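A minimal sketch of the tabu idea follows, assuming a hypothetical one-parameter landscape q with a shallow pit at i = 10 and a deeper one at i = -10. The tenure (the number N of remembered states) and the move budget are arbitrary illustrative values, and the sketch is a simplification of tabu search rather than the full method of [Glover & Laguna 1994].

```python
from collections import deque


def q(i: int) -> float:
    """Hypothetical one-parameter landscape (lower is better), x = i / 10.

    Tilted double well: local minimum at i = 10, global minimum at i = -10.
    """
    x = i / 10
    return (x * x - 1) ** 2 + 0.3 * x


def tabu_search(start: int, tenure: int = 5, max_moves: int = 40) -> int:
    """Move to the best non-tabu neighbour, even if it is worse.

    The tabu list holds the `tenure` most recently visited states, so the
    search cannot simply fall back into the state it has just left.
    """
    current = best = start
    tabu = deque([start], maxlen=tenure)
    for _ in range(max_moves):
        candidates = [s for s in (current - 1, current + 1) if s not in tabu]
        if not candidates:
            break  # every neighbour is tabu: give up
        current = min(candidates, key=q)  # may be worse than the state left
        tabu.append(current)
        if q(current) < q(best):
            best = current  # remember the best state seen so far
    return best


print(tabu_search(20))  # -10: the search climbed out of the pit at i = 10
```

A search that accepts only improving moves would stop at i = 10 on this landscape, since both of its neighbours are worse. The memory and the willingness to accept worse states are exactly the extra control that the text attributes to meta-heuristics.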

Is There A Way For Darwinian Search To Escape From Local Minima?

We pointed out earlier that preadaptation may take place at the genetic level. One possible example is gene duplication, whereby one of the two identical genes is thought to be able to accumulate any mutations without restraint because it is released from the pressure of natural selection [Kimura 1983]. It can then fortuitously acquire a new function, in which case it is said that a functional switch has occurred. The new function can subsequently be subject to natural selection. Theoretically this could help the search to overcome local minima, acting as a meta-heuristic. However, the probability of multiple gene switches is extremely low, as we shall see later on. Even if we accept a very rare functional switch as possible, which it is indeed, Darwinian search may require too many of them in order to get through to another local attraction basin. In this context, it is natural to ask why Darwinian search did not evolve towards definitely beneficial algorithmic adjustments, such as those that enable the past search history to be recorded in order not to repeat previous mistakes, as in tabu search. In ME such memory would have allowed organisms to save time, which is so critical for survival. On the other hand, this could have been a consequence of the fact that there is no free lunch: on average over all problems no single search algorithm is better than any other. We will discuss this in a moment.

Another theoretic possibility for Darwinian search to be able to escape from local optima is to change the fitness landscape itself. This is possible as a result of big enough changes in the environment: configurations that are optimal in given environmental conditions may not be optimal in other conditions. But again, the probability of multiple coincidental parameter adaptations is extremely low. The amount of time necessary for an organism to respond to changes in the environment is tightly bounded from above by the requirement of survival. But, as we shall see later, the published lower bounds concerning just protein preadaptation are too high to allow the possibility of parameter variations big enough to cause substantial (phylogenetic) changes in biological systems.

A new group of local search algorithms called very large neighbourhood search has been proposed recently [Wikipedia, Very Large-Scale Neighborhood Search]. These methods depend very little, if at all, on meta-heuristics in finding solutions. This is because in any single move a very large number of parameter values are varied. The problem of local minima is removed at the expense of a combinatorial explosion in the neighbourhood size, which requires sophisticated algorithms to search efficiently among promising neighbour states. We believe that ME-type blind search could be very similar to these algorithms, were it not for the prohibitively low probability of finding favourable states in its huge neighbourhoods, which we shall discuss in the next section.
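The combinatorial explosion in neighbourhood size can be made concrete with a simple count. The sketch below assumes, for illustration only, a configuration of n parameters, each of which can take v discrete values, and counts the neighbours obtained by changing exactly k parameters in one move. The numbers n = 100 and v = 10 are hypothetical.

```python
import math


def neighbourhood_size(n: int, k: int, v: int) -> int:
    """Number of states reachable by changing exactly k of n parameters.

    Each parameter is assumed to take one of v discrete values, so a
    changed parameter has v - 1 alternative values to move to.
    """
    return math.comb(n, k) * (v - 1) ** k


# Hypothetical system: 100 parameters with 10 possible values each.
for k in (1, 2, 5, 10):
    print(k, neighbourhood_size(100, k, 10))
```

Under these assumptions the neighbourhood grows roughly exponentially in k, which is why such methods need dedicated algorithms to explore it.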

We should finally say a few words about the possibility of modelling macroevolutionary processes with non-point search algorithms based on co-operation of multiple local agents. It is especially important due to the fact that these algorithms can provide a substantial gain in terms of execution times [Huberman et al, 2010], which may appear to endanger our previous conclusions. Before we do so, it is worth pointing out that for life as a whole any algorithm averaged over all possible problems is as good or as bad as any other, for an explanation see the so-called No Free Lunch (NFL) theorems [Wikipedia, No Free Lunch theorems].

Such algorithms as particle swarm optimisation or game-theoretic schemes can nicely model the co-operation of intelligent biosystems (the behaviour of insects, birds or fish). However, these algorithms, as well as the so-called genetic algorithms, follow a priori rules and need to be tuned. Their function is to drive the search towards areas in solution space with relatively high solution density. In practice this is ensured by intelligently tuning their parameters [Ewer et al. 2012]. Usually this important point is overlooked. Apart from this, the existence of rules which do not depend on, and therefore cannot be reduced to, physico-chemical constraints is already a very good empirical pointer to intelligence [Abel 2011].

Assume that we want to do a thought experiment aimed at demonstrating that a group of agents is able to move towards a joint win by ordering their mutual behaviour or increasing the complexity of their co-operation. In order to reliably show that these algorithms can model Darwinian unintelligent evolution, it is necessary to set the initial conditions correctly. This means we should require that the agents be initially unintelligent and that there be no information held in the system about how the structural/behavioural advances should be achieved. In other words, any form of intelligence (in particular, animal instincts in the case of the swarm algorithms or the agents' knowledge of the rules of the game) must be ruled out prior to the start of the search. [Dembski & Marks, 2009] express the same requirement by stipulating that the agents should not be allowed to use complex specified information a priori (see below).

As far as efficient genetic algorithms are concerned, the fact that they exist does not, in our opinion, speak for the plausibility of phylogenesis because we believe that they can only model microevolutionary processes and therefore cannot lead to a dramatic increase in the functionally specified complexity of the system. It is known that DNA stores ontogenetic information about the future development of an organism. This information is prescriptive and formal by nature. An example of such formal prescriptions could be a written recipe or any other algorithm, i.e. a sequence of actions to achieve the desired result. So to suppose that new, sufficiently different taxa (e.g. classes or phyla) could emerge by means of random mutations coupled with selection would be equivalent to supposing that prescriptive information can be generated spontaneously. However, there is no empirical evidence of this happening whatsoever [Abel, 2011]. More fundamentally, the notion of an evolutionary algorithm is itself oxymoronic since evolution has neither a formal language nor a goal. By contrast, processes modelled by genetic algorithms require the a priori existence of some formal representation of future ontogenetic choices or a recordation of wise choices already made [ibid]. Consequently, there separately arises the extremely important question about the origins of prescriptive information in biosystems.

Known Bounds on Protein Functional Switches

Having said all that, it would be very interesting to come up with a lower bound on the amount of time necessary for one step of phylogenesis. To achieve this, we could do the following. First, calculate the expected number of preadaptational functional switches for a single protein per population of the simplest organisms (species A). Second, knowing the life expectancy of A, calculate the amount of time necessary to build a protein which the immediate descendants of A have but A themselves do not. We conjecture that the most optimistic bounds will be too high to consider phylogenesis a probable phenomenon, even taking into account the currently accepted bounds on the age of the Earth (about 4.5 billion years according to [Wikipedia, Age of the Earth]).

We now present some research findings which lend grounds to believe that the probability of phylogenesis is prohibitively low.

[Behe 2007] reports the number of protein binding sites produced by random mutation. The results presented as functions of species and population size demonstrate that the number of binding sites over a substantial number of generations (equivalent to supposed millions of years of human evolution from apes) is 3 orders of magnitude smaller than what is used in a single cell. The article [Axe 2010a] is also very interesting in this respect (see also a blog entry [Deyes 2011] discussing [Axe 2010a], as the latter is, unfortunately, available on the web only as an abstract). In [Axe 2010a, Deyes 2011] a bound is computed on the size of the search space in relation to protein structures that enable life-critical chemical reactions in bacteria. The search space contains 10^390 states for a medium size protein of 300 amino acid residues.
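The order of magnitude of this search space can be checked by simple counting. The sketch below assumes that each of the 300 positions can hold any of the 20 standard amino acids independently, which gives 20^300 possible sequences; this counting argument is our own reading of the figure and is not taken from [Axe 2010a].

```python
import math

AMINO_ACIDS = 20  # assumption: the 20 standard amino acids


def log10_sequence_count(length: int, alphabet: int = AMINO_ACIDS) -> float:
    """Base-10 logarithm of the number of sequences of the given length."""
    return length * math.log10(alphabet)


print(log10_sequence_count(300))  # about 390.3, i.e. roughly 10^390 sequences
print(len(str(20 ** 300)))        # 391 decimal digits
```

Python integers have arbitrary precision, so 20 ** 300 is computed exactly and the digit count confirms the logarithmic estimate.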

It has to be mentioned that since biological systems do not solve the problem of searching for all solutions, they clearly do not need to search over the entire state space. The state space size helps determine the likelihood of finding an improved state by Darwinian means alone. To achieve this, apart from the state space size, we also need to know the statistical characteristics of the solution distribution over the search space, such as solution density. [Reidhaar-Olson & Sauer 1990] report the extreme rarity of functionally non-degenerate amino acid residue substitutions. On average, only 1 in every 10^63 substitutions proves to be useful as regards certain limited regions of proteins. A very close estimate of 1 in every 10^64 was experimentally obtained in [Axe 2004], where the objective was to assess how amino acid sequence changes influence the ability of certain types of protein to maintain a given function. In that context, functionality was understood as certain levels of hydropathy of protein molecules in order to maintain antibiotic resistance of the cell. In that paper, an estimate of the probability of obtaining any kind of protein was made as well, given a primary structure of 150 amino acid residues (the maximum length of a protein domain). This latter estimate is 1 useful sequence out of every 10^74. It is important to note that in his response to criticisms of his work Douglas Axe points out that rarity and isolation of functional sequences in the sequence space are highly correlated [Axe 2011]. Indeed, the more complex a function is, the more rare and isolated the corresponding base sequence becomes in the space of all possible sequences. This can be illustrated using some meaningful non-redundant string A of symbols: the longer it gets, the more rare and isolated the sequences preserving the meaning of A are in the massive space of all possible strings of the same length. At the same time, an overwhelming majority of these strings will be gibberish.
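To give a feeling for what such densities mean, the following sketch computes how many draws would be needed for an even chance of at least one hit. It rests on an assumption that is ours and not that of the cited authors: draws are independent and uniform over the sequence space, which ignores selection and any clustering of functional sequences. The function name is hypothetical.

```python
import math

def trials_for_even_odds(density: float) -> float:
    """Draws needed for a 50% chance of at least one hit.

    Solves 1 - (1 - p)^n = 0.5 for n under the assumption of independent,
    uniform draws; log1p keeps the result accurate for very small p.
    """
    return math.log(0.5) / math.log1p(-density)

# Densities quoted in the text: 1 in 10^63, 1 in 10^64 and 1 in 10^74.
for exponent in (63, 64, 74):
    n = trials_for_even_odds(10.0 ** -exponent)
    print(f"density 10^-{exponent}: about 10^{math.log10(n):.1f} draws")
```

Under this simplified model the required number of draws is of the same order as the reciprocal of the density.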

It is clear therefore that the probability of finding improved states by Darwinian search is prohibitively small. In fact, the amount of time necessary to find a favourable protein fold according to [Axe 2010a, Deyes 2011] exceeds the commonly accepted bound on the age of the Universe, i.e. 13 billion years according to [Wikipedia, Age of the Universe]. To compare, the number of atoms in the observable part of the universe is just 10^80 [Wikipedia, Observable Universe].

As regards protein preadaptation, [Axe 2010a] shows that the probability of functional switches in proteins as a result of unintelligent search is prohibitively low. Since this is of importance to us, let us briefly reproduce here the reasoning of the authors of [Axe 2010a]. A population of 10^10 bacteria on average gives birth to 10^4 generations a year, which leads to an upper bound of 5×10^23 new genotypes over 5 billion years (an approximation of the age of the Earth). It is an upper bound because by no means every permutation of amino acids will lead to new proteins. Since there are only 20 possible amino acids, we have from 20^70 to 20^150 possible permutations for an average protein domain (a functional unit of protein molecules) containing from 70 to 150 amino acid residues (in [Deyes 2011] the bound is actually overestimated because the whole protein molecule is used instead of just one domain). It appears that the population size is assumed there to be limited. However, even when this assumption does not hold and the number of generated bacteria is 10^12 or more per litre (an average density of bacteria in a stationary phase is about 10^9 per millilitre), it is still much less than 20^70.
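The arithmetic of this bound is easy to reproduce. The sketch below uses only the figures quoted above (population size, generations per year, time span and domain lengths); the variable names are ours.

```python
import math

population = 10**10            # bacteria in the population
generations_per_year = 10**4   # generations per year, as quoted
years = 5 * 10**9              # rounded age of the Earth used in the text

# Upper bound on the number of new genotypes ever tried.
trials = population * generations_per_year * years

smallest_space = 20**70        # sequences for the shortest domain (70 residues)
largest_space = 20**150        # sequences for the longest domain (150 residues)

print(f"trials ~ 10^{math.log10(trials):.1f}")
print(f"20^70  ~ 10^{math.log10(smallest_space):.1f}")
print(f"20^150 ~ 10^{math.log10(largest_space):.1f}")
```

Python integers have arbitrary precision, so the comparison between the number of trials and the size of the sequence space is exact rather than approximate.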

The number of ways one can obtain functionally different domains is therefore astronomically large. However, it is known that proteins are extremely selective in relation to amino acid substitutions. It has been demonstrated experimentally that mutagenesis of beta-lactamases and bacterial ribonucleases leads to functional failure as a result of substitutions in only 10% of amino acid residues from conserved regions of the protein molecule. The authors conclude that finding new functional switches systematically using only Darwinian blind search is highly improbable. It is also interesting to note that, according to [Axe 2010a], functional switches usually presented as proof of protein preadaptation were found in vitro, i.e. in a laboratory setting, and some of them have since been shown to be unstable in the range of physiological temperatures.

Very recent experimental findings demonstrate that the number of base changes in paralogous genes (i.e. genes related to an ancestral gene via duplication) must be severely limited in order to allow substantial genetic innovation. In other words, experiments show that mutational jumps leading to genetic innovation must not have been very large. For bacteria these limits are as follows. The number of simultaneous possible base changes for functionally neutral genes must not exceed 6, while in the maladaptive case it must not exceed 2 [Axe 2010b]. This seriously limits the neighbourhood size (see our earlier discussion about Very Large Neighbourhood Search) and consequently limits the abilities of blind Darwinian search.

A Discussion of Philosophical Ideas Behind Biological Research

In this paper we have made an attempt to present the available biological research findings as a case against the theory of macroevolution which maintains that all biodiversity is a result of random mutations followed by natural selection. We argue that the probability of phylogenesis is tightly bounded by the probability of multiple preadaptations at the biochemical level. The low probabilities are a critical issue for dynamic systems, which biological systems are, because they have a limited time to react. It appears that the prohibitively low probabilities for anything more complex than micro-evolution essentially preclude or at least substantially limit phylogenesis such that a single connected tree of life does not seem feasible.

Historically, there have been different schools of philosophical thought. Darwinism is a descendant of the Epicurean stream of thinking, whereby the existence of everything is attributed to chance. An alternative fatalist view advocates for necessity. In contemporary science it is represented by such scientists as Stuart Kauffman. It was not our intention to discuss his ideas in great detail above just because we focused our attention on standard Darwinian views. Here though we shall briefly touch upon Kauffman's ideas.

Kauffman's work contributes to the development of the theory of self-organisation in highly non-equilibrium states. A remarkable feature of non-equilibrium processes is the spontaneous emergence of strong correlations between previously uncorrelated states of system components. This is reflected in a fast growth of state fluctuation magnitudes and a strong dependence of fluctuations on the type of non-linearity of the system's behaviour [Prigogine & Stengers, 1984]. Self-organisation (i.e. the emergence of order out of chaos) is observable in inorganic matter (see e.g. [Wikipedia, The chemical clock]) or in swarm intelligence in living system populations, which we discussed earlier. According to Kauffman, life and phylogenesis (and, consequently, the associated information) are a result of spontaneous self-organisation of matter on the verge of chaos. He represents optimisation of biological systems as a dynamic process with all parties co-evolving. In [Kauffman 1991] he argues that organisms can be modelled by Boolean networks, whereby species act as attraction basins separated by chaos. Mutagenesis can give rise, Kauffman states, to phylogenetic trajectories from one basin to another through chaos. Biological research suggests, however, that mutagenic chaos corresponds to malformation or death rather than to anything useful. As the research cited above demonstrates, mutations can lead to something new and useful only extremely rarely. It is only in the latter case that we can legitimately speak about preadaptation. Also, we have seen above that there are tight limitations on the number of possible adaptational paths between species or, in Kauffman's language, of phase trajectories between attraction basins.
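For readers unfamiliar with Boolean networks, the notions of attractor and attraction basin can be made concrete with a toy example. The sketch below is not Kauffman's own model or code: the network size, wiring, rules and random seed are all hypothetical choices made for illustration, and the update scheme (synchronous, two inputs per node) is simply the most common textbook variant.

```python
import random
from itertools import product

N = 6                    # number of nodes (hypothetical toy size)
K = 2                    # inputs per node
rng = random.Random(0)   # fixed seed so the toy network is reproducible

# Each node reads K randomly chosen nodes and applies a random Boolean rule,
# stored as a lookup table over the 2**K possible input patterns.
inputs = [rng.sample(range(N), K) for _ in range(N)]
rules = [[rng.randint(0, 1) for _ in range(2 ** K)] for _ in range(N)]

def step(state):
    """Synchronously update every node from the current values of its inputs."""
    nxt = []
    for node in range(N):
        index = 0
        for src in inputs[node]:
            index = (index << 1) | state[src]
        nxt.append(rules[node][index])
    return tuple(nxt)

def attractor_of(state):
    """Follow the trajectory until a state repeats; return the cycle reached."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    return frozenset(trajectory[seen[state]:])

# Basin size = number of initial states whose trajectory ends on a given attractor.
basins = {}
for start in product((0, 1), repeat=N):
    attractor = attractor_of(start)
    basins[attractor] = basins.get(attractor, 0) + 1

for attractor, size in basins.items():
    print(f"attractor of length {len(attractor)} with a basin of {size} states")
```

Because the dynamics are deterministic, every one of the 2^N states belongs to exactly one basin; a "mutation" in this picture is a change of state or of rule that may move the system from one basin to another.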

Yet another major school of thought is Stoicism, which attributes everything to an intelligent cause. Today it is represented by Intelligent Design. Of course, it is extremely hard to scientifically prove that something is not possible. We are only speaking about probabilities. The experimental evidence cited above demonstrates that biological systems seem to be more likely to reduce functionality than to choose anything more complex and more advantageous, even if the path to this new functionality is very short. So in a sense, nature is more economical than we may be accustomed to think. In addition to this, we must not forget that there has been no experiment showing that a new species can emerge even for the simplest living organisms such as bacteria, taking into consideration the very fuzzy borders between bacterial species. The number of generations observed for bacteria since the beginning of biological experiments is huge and is equivalent to millions of years for species as complex as primates [Behe 2007].

Quite often proponents of the origin of life as spontaneous self-organisation of matter say the following. One way or another, something like life as we know it was bound to arise sooner or later because, they argue, this was "preordained" by the fine tuning of our universe. E.g. the properties of elementary particles, electrons, nuclei, etc. are such that chemical bonding and other phenomena, notably non-equilibrium processes, become feasible. And our universe is just one of multiple possible realisations of a multiverse with various ensembles of characteristics. Therefore, evolutionists believe, it is incorrect to present estimates of the probability of the emergence of life, which is in fact incredibly low, as a means to infer intelligent agency. Against this view, we can say the following.

Firstly, it is true that the theory of evolution in the wide sense (including abiogenesis) tells us how life could have originated and evolved. However, to our knowledge, it has not yet been satisfactorily demonstrated how abiogenesis can generate the rich information content characteristic of living systems (specified complexity).
  • Complexity is understood in the sense of [Wikipedia, Kolmogorov complexity] as a minimum description length over all possible descriptions of some input data.  
  • "The basic point of a specification is that it stipulates a relatively small target zone in so large a configuration space that the reasonably available search resources — on the assumption of a chance-based information-generating process — will have extremely low odds of hitting the target"[UncommonDescent, FAQ]. In some contexts, specificity refers to a given functional requirement or requirements: specifically complex systems must be functional. "Living organisms are distinguished by their specified complexity. Crystals fail to qualify as living because they lack complexity; mixtures of random polymers fail to qualify because they lack specificity" [Orgel 1973].
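The complexity half of this definition can be illustrated with a rough numerical proxy. Kolmogorov complexity itself is not computable, so the sketch below substitutes the length of a zlib-compressed string, which is only an upper bound on description length; this substitution, the sample strings and the fixed seed are our assumptions. Note that compression says nothing about specificity: a random string scores as highly complex without being functional.

```python
import random
import zlib

def compressed_length(data: bytes) -> int:
    """Length in bytes after zlib compression (a crude upper bound on description length)."""
    return len(zlib.compress(data, 9))

rng = random.Random(0)  # fixed seed, for reproducibility only

ordered = b"AB" * 500                                        # regular, crystal-like string
disordered = bytes(rng.getrandbits(8) for _ in range(1000))  # random-polymer-like string

print("ordered:   ", compressed_length(ordered), "bytes")
print("disordered:", compressed_length(disordered), "bytes")
```

The regular string compresses to a small fraction of its length while the random one does not compress at all, which mirrors Orgel's contrast between crystals and mixtures of random polymers.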
Interestingly, in recent publications (see e.g. [Abel 2009a]) it has been emphasised that self-organisation should be distinguished from mere self-ordering:

  1. Self-organisation assumes hierarchical relationships between components of a system while self-ordering does not;
  2. Self-organisation is characterised by formal processes as opposed to the physico-dynamic processes that allow for self-ordering;
  3. While spontaneous self-ordering without help from intelligent agency, examples of which we saw earlier, is observable in systems "on the edge of chaos", self-organisation is not. At least, it has not been observed anywhere to date.

Secondly, staying within the boundaries of pure science, we cannot seriously consider the possibility of the existence of multiple worlds, because it is both untestable and unfalsifiable.

Thirdly, as we mentioned in the beginning, even if we cannot replicate a process, it is scientifically legitimate to analyse the complexity of the information content of the outcome of that process. As far as we know, one of the earliest scientists to propose such ideas was Andrei Kolmogorov. E.g. [Kolmogorov 1965], in addition to statistical and combinatorial methods to quantify information, introduces an algorithmic method using recursive functions. ID develops similar ideas and maintains that living organisms are characterised by specific information content.

Now, the basic idea behind design inference is very simple [Dembski 2007]. Assume there is a complex enough system A consisting of a large number of interacting components. Assume also that a fixed configuration of those components corresponds to some function. In this case we say that A bears a certain amount of complex specified information (CSI). A large enough amount of CSI in a system allows one to assume with an acceptable probability that A is a result of design. ID claims that the generation of large quantities of CSI cannot be explained as a result of only chance and/or necessity with a practically acceptable probability in the spirit of [Abel 2009a], which presents universal plausibility bounds for a given hypothesis or theory (see also [Borel 1962]).

An example of a complex specified system is the text of this paper, which of course is not a random sequence of symbols but is a result of a number of intelligent actions on the part of the author. The existing bounds on the probability and time needed to randomly generate a long enough meaningful text [Wikipedia, infinite monkey theorem] do not allow us to seriously consider scenarios of spontaneous generation of CSI in nature.
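The bound in question is elementary. As a minimal sketch, assume each keystroke is drawn independently and uniformly from a fixed alphabet; the 27-symbol alphabet (26 letters plus a space) and the sample phrase are our illustrative choices.

```python
import math

def log10_probability(text: str, alphabet_size: int) -> float:
    """log10 of the chance that uniformly random typing reproduces `text` in one attempt."""
    return -len(text) * math.log10(alphabet_size)

phrase = "intelligent design"   # hypothetical 18-character target
ALPHABET_SIZE = 27              # assumption: 26 letters plus the space character

exponent = log10_probability(phrase, ALPHABET_SIZE)
print(f"P(one attempt) is about 10^{exponent:.1f}")
```

The exponent falls linearly with the length of the text, so even a single short phrase is far out of reach of a modest number of attempts, and a full paper far more so.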

Another example is somebody walking in the forest [Behe et al 2010] and seeing tree branches randomly scattered on the ground. No non-trivial inference is possible. Then some fifty yards away they notice that similar branches are laid out in a pattern, say, to cover a snare.

Yet another example is you finding that the door which you had previously locked using a 10-digit electronic device has been unlocked. It does not take you a long time to realise that to unlock that door a sufficient amount of intelligence was required so it must have been done by a human. Of course, it is possible to say that the pattern of branches was self-organised or the door got spontaneously unlocked by itself but an explanation involving design has a better chance to be correct.

Using the No Free Lunch (NFL) theorems, [Dembski 2007] demonstrates the impossibility of generating CSI as a result of co-evolution in the sense of Kauffman.

  • It has to be mentioned that the example of computing CSI in [Dembski 2007] for the flagellum of E. coli assumes an incredibly low probability of spontaneous simultaneous generation of all constituents of the flagellum, which does not take preadaptation into consideration. However, [Behe 1994] and [Dembski 2007] claim that the flagellum is irreducibly complex and that any possibility for preadaptation of subsets of the flagellum components is purely theoretical.
The disputes around ID are not so much about probability estimates as they are about the methodological legitimacy of choice contingency as a scientifically valid cause of the observed phenomena. Opponents of ID maintain that including it in the scope of science would invalidate the scientific method, which they understand exclusively in the framework of materialistic naturalism. For some criticisms of ID, see [Wikipedia, Specified Complexity]. Proponents of ID reasonably point to the methods of intelligence inference that are de facto used in science, technology and medicine. Examples of such methods are semantic analysis of messages in information theory and coma diagnostics (see [Wikipedia, Glasgow Coma Scale]).

In everyday life we do similar things so commonly that we do not realise we may in fact be using ID reasoning. For instance, we think it quite natural that we can easily tell an answering machine from a human telephone operator. So the question is: if we can see the soundness of ID reasoning on a regular basis, why can we not inductively apply it to the origin of the world and assume that it came into being as a result of intelligent agency and not by chance and/or necessity?

The question, from the standpoint of ID, is therefore not whether information can or cannot be generated spontaneously, which is possible within certain limits as a result of only chance and/or necessity. For example, in a message some information increase is possible due to an accidental correction of an error. However, in all practical cases a long enough initial message is a result of intelligent agency.

A given system, after it has been placed in a specific isolated zone in its phase space, can rely on built-in adaptational mechanisms that allow it to move around the zone towards an attraction basin. This microevolutionary behaviour is routinely observed in artificial and biological systems. However, ID claims that in the vast configuration spaces of complex systems, reaching those islands of functional meaning isolated by oceans of non-function is practically impossible without careful and purposeful fine-tuning by an intelligent agent.

We are very far from disregarding the importance of Darwin's contribution to biology. However, the logic of the development of scientific knowledge requires systematic reconsideration of former views as soon as empirical data that do not fit the existing models become available. For this reason, it is quite natural that sometimes our understanding is challenged as a result of empirical investigations. In the 20th century theoretical physics experienced two critical periods: the first was due to the work of H. Poincaré and A. Einstein on the theory of relativity, the second followed the formulation of quantum mechanics based on the works of E. Schrödinger, M. Born and others. As a result, former models were rethought as special cases of more general laws.

We believe that biology is in a similar situation today. The period from the mid-20th century until now has been a time of major discoveries that resulted in the emergence of modern genetics and biochemistry. These two disciplines put biology, which was only a descriptive science prior to that, in an experimental framework. It has to be pointed out that at the microevolutionary level Darwin's theory nicely matches experiments, while the advances of genetics over the 20th century allow us to explain the mechanism of intra-species variability. Nonetheless, the extremely complex processes taking place at the biochemical level of living matter, the ground level of life, no longer fit explanations relying exclusively on random unintelligent search, the basis of Darwinism. We think that the emerging theory of Intelligent Design is capable of adequately addressing the many challenges of today's biology.


I am grateful to Dr Oleg Kovalevskiy (University of York, Great Britain),   Andrei Talianskiy (Moscow, Russia), my former line manager and an Artificial Intelligence expert Dr Patrick Prosser (Glasgow University, Scotland, Great Britain), Eric Stewart (USA) and, last but not least, members of the readers' forum at for their valuable comments and detailed discussions.


  1. D. Abel (2011), The First Gene, LongView Press Academic: Biolog. Res. Div.: New York, NY.
  2. D. Abel (2009a), The Capabilities of Chaos and Complexity, Int. J. Mol. Sci. 2009, 10, 247-291; doi:10.3390/ijms10010247.
  3. D. Abel (2009b),  The Universal Plausibility Metric (UPM) & Principle (UPP). Theoretical Biology and Medical Modelling, 6:27. 
  4. D. Axe (2004), Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds, Journal of Molecular Biology,Volume 341, Issue 5, 27 August 2004, Pages 1295-1315.
  5. D. Axe (2010a), The Case Against a Darwinian Origin of Protein Folds, Biocomplexity Journal.
  6. D. Axe (2010b),  The Limits of Complex Adaptation: An Analysis Based on a Simple Model of Structured Bacterial Populations. Biocomplexity Journal.
  7. D. Axe (2011), Correcting four misconceptions about my 2004 article in Journal of Molecular Biology, Blog post, Biological Institute website.
  8. M. Behe (1994), Darwin's Black Box: The Biochemical Challenge to Evolution.
  9. Michael J. Behe, Self-Organization and Irreducibly Complex Systems: A Reply to Shanks and Joplin, Philosophy of Science 67 (March 2000), University of Chicago Press, August 31, 2000.
  10. M. Behe (2007), The Edge of Evolution: The Search for the Limits of Darwinism.
  11. M. Behe, W. Dembski, S. Meyer (2010), Science and Evidence for Design in the Universe. Proc. of the Wethersfield Inst., vol.9., Ignatius Press, 2010. 
  12. E. Borel (1962), Probabilities and Life, Dover. 
  13. Yu. Chaikovsky (2006), Nauka o rasvitii zhizni. Opyt Teorii Evolutsii (In Russian), Moscow, KMK.
  14. C. Darwin (1859), On the Origin of Species.
  15. W. Dembski (2007), No Free Lunch: Why Specified Complexity Cannot be Purchased without Intelligence, Rowman and Littlefield Publishers, 2007.
  16. W. Dembski, J. Wells (2007), The Design of Life.
  17. W. Dembski, R. Marks (2009), Conservation of Information in Search: Measuring the Cost of Success, Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions, Sept. 2009, 39(5), pp. 1051-1061.
  18. R. Deyes (2011), Proteins Fold As Darwin Crumbles, 2011.
  19. W.Ewert, W. Dembski, R. Marks, Climbing the Steiner Tree: Sources of Active Information in a Genetic Algorithm for Solving the Euclidean Steiner Tree Problem, Biocomplexity Journal, 2012(1).
  20. A. Gauger, S. Ebnet, P. Fahey, R. Seelke  (2010), Reductive evolution can prevent populations from taking simple adaptive paths to high fitness. Biocomplexity Journal, 2010(2):1-9.
  21. A. Finkelstein, O. Ptitsyn (2002), The Physics of Protein: Lectures, Moscow,  Knizhni Dom Universitet (in Russian).
  22. F. Glover and M. Laguna (1997), Tabu Search. Kluwer, Norwell, MA.
  23. B. Huberman, J. Mihm, C. Loch, and D. Wilkinson (2010), Hierarchical Structure and Search in Complex Organizations, Management Science, Vol. 56, 831-848.
  24. M. Ichas (1994), On the Nature of Life: Mechanisms and Meaning, Moscow, Mir, translated into Russian.
  25. S. Kauffman (1991), Antichaos and Adaptation, Scientific American. 
  26. M. Kimura (1983), The neutral theory of molecular evolution. Cambridge. 
  27. A. Kolmogorov (1965), Three Approaches to the Definition of Information Quantity, Probl. Peredachi Informatsii, 1:1, 3-11 (In Russian).
  28. Z. Michalewicz, D.B. Fogel (2004), How to Solve It: Modern Heuristics, Springer.
  29. L.E. Orgel (1973), The Origins of Life. New York: John Wiley, p. 189.
  30. I. Prigogine, I. Stengers (1984), Order out of Chaos: Man's New Dialogue with Nature, Heinemann, London.
  31. A. Rasnitsyn (2002), Protsess evolutsii i metodologia systematiki, Trans. Russkoe Entomologicheskoe Obschestvi. S. Petersburgh. Vol. 2002(73), pp. 4-21 (In Russian).
  32. J.F. Reidhaar-Olson, R.T. Sauer (1990), Functionally acceptable substitutions in two alpha-helical regions of lambda repressor, Protein Science, 1990;7(4):306-16. 
  33. K. Violovan (2004), Anti-Perakh: Blind Chance or ... Blind Chance? In Russian.