Lambda counting

From the LAMA wiki (UMR 5127)

Revision as of 21 October 2008, 09:22

Introduction

This paper addresses the following question. Given a (theoretical) programming language and a property, what is the probability that a random program satisfies the property? In particular, is it the case that almost every random program satisfies the desired property, i.e. that the probability is 1? Questions of this kind have some known answers for Turing machines and cellular automata. Some of them will be given in section ??.

In this introduction we assume that this notion of probability is, at least intuitively, sufficiently clear but, of course, it will have to be made precise.

We will concentrate in this paper on functional programming languages and, more specifically, on the lambda calculus, the simplest such language. Various properties can be studied. Some concern the structure of a term, some concern its behaviour.

The first question for which it would be desirable to have an answer is the following: give a "simple" asymptotic equivalent for the number of terms of size n. This question is, at present and as far as we know, unsolved, and the usual techniques of generating functions do not work. See section ??? for more details. We give here upper and lower bounds for this number. This estimate will be enough for our purpose, but the gap between the lower and the upper bound is too big to hope for an equivalent.

For other questions, some experiments have already been done (see for example ...) which clearly indicate the desired result. For example, they "show" that almost every closed lambda term begins with a <math>\lambda</math> but, as far as we know, there was no "proved" result of this form.

This paper proves some non-trivial results on the structural form of a lambda term. In particular we show that almost every closed lambda term begins with "many" <math>\lambda</math> (the precise meaning of this is given in theorem ??), that these <math>\lambda</math> bind "many" occurrences of the corresponding variables (theorem ??) and that, given any fixed closed term, almost no <math>\lambda</math>-term has this term as a sub-term (theorem ??).

Our original motivation was to consider the property of being terminating. Is a random term strongly normalizable (SN for short), i.e. does every sequence of reductions terminate? This question is, at present, unsolved and the experiments we have done do not even give an idea of what the result should be. It is known that being SN is an undecidable property and it is thus not easy to count the number of SN terms of big size. It is clear that having <math>(\delta \ \delta)</math> as a sub-term is a sufficient (but not necessary) condition for being non SN but, as the experiments have shown (and as we have proved), almost no term contains <math>(\delta \ \delta)</math>, so this condition is useless for obtaining a result on non-SN terms. In the opposite direction, it is known that, if a term t is not SN, then a pattern "looking like" <math>(\delta \ \delta)</math> (the precise meaning will be given in section ??) must appear in t, but the experiments seem to show that almost every term possesses this pattern (actually, we have not been able to count for big enough terms to have a clear idea of the probability of possessing the desired pattern).

Another question, which is also undecidable, and for which finding the probability seems to be even more difficult, is to get the probability of being weakly normalizable, i.e. that there is a sequence of reductions that terminates.


Combinatory logic is another programming language related to the lambda calculus. It is a way of coding lambda terms without using bound variables. The coding is fair for the questions we are concerned with, in the sense that there are translations in both directions which, for example, preserve the property of being SN. We have also studied this language and, surprisingly, the results are very different from those for the lambda calculus. For example we show that, for every fixed term <math>t_0</math>, almost every term possesses <math>t_0</math> as a sub-term, and this of course implies that almost every term is non SN. Note that the increase of size in the translation from the lambda calculus to combinatory logic is not known (we only have a lower bound) and thus results for combinatory logic cannot be used to get results for the lambda calculus.

The organization of the paper is as follows. Section ?? briefly recalls known results for Turing machines and cellular automata. Section ?? shows that, in combinatory logic, every fixed term appears in almost every term. In section ?? we recall the basic definitions of the lambda calculus and we discuss the various possibilities for counting the size of a term and the probability measure we put on sets of terms. Section ?? gives the combinatorial results we will need in our proofs and, in particular, the lower and upper bounds for the number of terms of size n. It also introduces Catalan and Motzkin numbers and the so-called Lambert function. Our main results, i.e. theorems ??, ?? and ??, appear in section ??. Section ?? gives experimental results for questions for which we have no proof. Finally, section ??? gives open questions and future work. The detailed proofs are given in an appendix.

Known results for Turing machines and cellular automata

Lambert function, Catalan and Motzkin numbers

Catalan numbers

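  • <math>C_n = \frac{1}{n+1}\binom{2n}{n}</math> : Catalan numbers

Usual equivalent: <math>C_n \sim \frac{4^n}{\sqrt{\pi}\, n^{3/2}}</math>, which is obtained using Stirling's formula. However, using the Stirling series: , we get that for we have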

Thus, using this and , we have:

for all but also for .
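As a quick numerical check of the usual equivalent, the following minimal sketch computes Catalan numbers exactly and compares them with <math>4^n / (\sqrt{\pi}\, n^{3/2})</math>. The function names `catalan` and `catalan_equivalent` are illustrative and do not come from the paper. The sharper bounds stated above are not reproduced here.

```python
import math


def catalan(n: int) -> int:
    """Exact n-th Catalan number, C_n = binom(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)


def catalan_equivalent(n: int) -> float:
    """Usual asymptotic equivalent 4^n / (sqrt(pi) * n^(3/2)), for n >= 1."""
    return 4.0 ** n / (math.sqrt(math.pi) * n ** 1.5)


if __name__ == "__main__":
    # The ratio exact / equivalent should approach 1 as n grows.
    for n in (1, 10, 100, 500):
        print(n, catalan(n) / catalan_equivalent(n))
```

The ratio stays below 1 and approaches it slowly, which is why the plain equivalent alone does not give explicit inequalities.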

Motzkin numbers

Let us define the number of unary-binary trees with inner nodes and leaves. We get

Then, by summing we define the number of unary-binary trees with inner nodes and give an equivalent:
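The count of unary-binary trees can be checked by machine. The sketch below counts trees with n inner nodes and a given number of leaves by a recurrence on the root, and compares the result with one possible closed form: choose a binary tree with (leaves - 1) binary nodes, then distribute the unary nodes over the 2·leaves - 1 positions above its nodes. This closed form is an assumption of the sketch and is not necessarily the formula intended in the text. All function names are illustrative.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def trees(n: int, leaves: int) -> int:
    """Unary-binary trees with n inner nodes and `leaves` leaves (recurrence on the root)."""
    if n == 0:
        return 1 if leaves == 1 else 0
    if leaves < 1:
        return 0
    total = trees(n - 1, leaves)  # the root is a unary node
    for n1 in range(n):  # the root is a binary node: split the other n - 1 inner nodes
        n2 = n - 1 - n1
        for l1 in range(1, leaves):
            total += trees(n1, l1) * trees(n2, leaves - l1)
    return total


def trees_closed_form(n: int, leaves: int) -> int:
    """Assumed closed form: Catalan(leaves - 1) * binom(n + leaves - 1, n - leaves + 1)."""
    if leaves < 1 or leaves > n + 1:
        return 0
    binary = leaves - 1          # number of binary nodes
    unary = n - binary           # number of unary nodes
    catalan = comb(2 * binary, binary) // (binary + 1)
    return catalan * comb(n + leaves - 1, unary)


def trees_total(n: int) -> int:
    """Unary-binary trees with n inner nodes, summed over the number of leaves."""
    return sum(trees(n, leaves) for leaves in range(1, n + 2))


if __name__ == "__main__":
    print([trees_total(n) for n in range(8)])
```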

Lambert W function

The Lambert function <math>W</math> is defined by the equation <math>W(x)e^{W(x)} = x</math> which, for <math>x \geq 0</math>, has a unique solution in <math>[0, +\infty)</math>.

For , we have which implies that near . To prove this, it is enough to remark that

This is not precise enough for our purpose. Using one step of the Newton method from , we can find a better upper bound for because is increasing and convex. This gives:

Indeed, if we define , we have and therefore Newton's method from gives a point at position:

Finally, we show that for , we have:

Indeed, for , we have , which implies and therefore .
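The behaviour of the Lambert function can be explored numerically. The sketch below makes several assumptions that are not confirmed by the text: it applies Newton's method to f(w) = w·e^w - x, it starts from ln(x), and it restricts itself to x ≥ e. Since f is increasing and convex for w ≥ 0 and f(ln x) ≥ 0 when x ≥ e, every Newton iterate stays above W(x), so a single step already gives an upper bound. The names `lambert_w` and `newton_upper_bound` are illustrative.

```python
import math


def lambert_w(x: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Approximate W(x) for x >= e by Newton's method on f(w) = w * exp(w) - x."""
    if x < math.e:
        raise ValueError("this sketch only handles x >= e")
    w = math.log(x)  # starting point, assumed here; f(w) >= 0 at this point
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * max(1.0, abs(w)):
            break
    return w


def newton_upper_bound(x: float) -> float:
    """One Newton step from ln(x): ln(x) - (ln(x) - 1) / (ln(x) + 1)."""
    lx = math.log(x)
    return lx - (lx - 1.0) / (lx + 1.0)


if __name__ == "__main__":
    for x in (math.e, 10.0, 1e3, 1e6):
        lower = math.log(x) - math.log(math.log(x))
        print(x, lower, lambert_w(x), newton_upper_bound(x), math.log(x))
```

For x ≥ e the printed columns are ordered: ln(x) - ln(ln(x)) ≤ W(x) ≤ one-step Newton bound ≤ ln(x).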

Combinatory logic

Basically the paper already written by Marek

+ the following


As we will see in section ??, theorem ?? does not hold for the lambda calculus. This may be surprising since there are translations between these systems which respect many properties (for example the property of being terminating). However, these translations do not preserve the size.

The translation T from combinatory logic to lambda calculus is linear, i.e. there is a constant k such that, for all terms t, <math>size(T(t)) \leq k \cdot size(t)</math>, but the translation T' in the other direction is not linear. As far as we know, there is no known bound on the size of T'(t) but it is not difficult to find examples where size(T'(t)) is of order .

The point is that T' has to code the binding in some way and this takes space. It will be interesting to compare the size of T'(t) with that of t using notions of size other than the usual one. See section ?? for further discussion.
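To see the size increase concretely, here is a minimal sketch of a translation from lambda terms to combinatory terms. It uses the textbook bracket-abstraction rules over S and K, which is an assumption: the text does not specify which translation T' is meant. Terms are encoded as tuples, and the helpers `var`, `lam`, `app`, `to_cl`, `leaves` and `nested` are hypothetical names introduced for this illustration.

```python
S, K = "S", "K"


def var(name):
    return ("var", name)


def lam(name, body):
    return ("lam", name, body)


def app(f, a):
    return ("app", f, a)


def occurs(x, t):
    """True if the variable x occurs in the combinatory term t."""
    if t == S or t == K:
        return False
    if t[0] == "var":
        return t[1] == x
    return occurs(x, t[1]) or occurs(x, t[2])


def abstract(x, t):
    """Bracket abstraction [x]t on a combinatory term t (S, K, variables, applications)."""
    if not occurs(x, t):
        return app(K, t)                      # [x]t = K t when x does not occur in t
    if t[0] == "var":
        return app(app(S, K), K)              # [x]x = S K K (the identity)
    return app(app(S, abstract(x, t[1])), abstract(x, t[2]))  # [x](u v) = S [x]u [x]v


def to_cl(t):
    """Translate a lambda term into a combinatory term, innermost lambdas first."""
    if t[0] == "var":
        return t
    if t[0] == "app":
        return app(to_cl(t[1]), to_cl(t[2]))
    return abstract(t[1], to_cl(t[2]))


def leaves(t):
    """Number of leaves (S, K or variable occurrences) of a term."""
    if t == S or t == K or t[0] == "var":
        return 1
    if t[0] == "lam":
        return leaves(t[2])
    return leaves(t[1]) + leaves(t[2])


def nested(n):
    """The lambda term  \\x1 ... \\xn. x1 x2 ... xn  (n >= 1)."""
    body = var("x1")
    for i in range(2, n + 1):
        body = app(body, var(f"x{i}"))
    for i in range(n, 0, -1):
        body = lam(f"x{i}", body)
    return body


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, leaves(nested(n)), leaves(to_cl(nested(n))))
```

The second column grows linearly with n while the third grows much faster, which illustrates why coding the bindings costs space under this particular translation.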

Generalities on the lambda calculus

definition

The set <math>\Lambda</math> of lambda terms (or, simply, terms) is defined by the following grammar

<math>t ::= x \mid \lambda x. t \mid (t \ t)</math>

To be able to define the notion of a random term we have to define a distribution law on <math>\Lambda</math>. There are many possibilities for that. We choose here the simplest one. Note that this is the one for which, at least at present, we are able to prove some results. It is based on densities. For that we first have to define the size of a term.

The usual definition is the following.

definition

The size (denoted as ) of a term is defined by the following rules.

- if is a variable.

-

-


In the rest of the paper we will use another definition (denoted as ) which is similar but gives simpler computations. We believe (but we have not yet checked the details) that, with we would have similar results. The computation, with , of the upper and lower bounds of the number of terms of size will be done in section ??

definition

The size (denoted as or, more simply ) of a term is defined by the following rules.

- if is a variable.

-

-
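The two size definitions can be compared on examples with the following sketch. Since the defining equations are missing from this revision, the sketch assumes that both definitions count 1 for each abstraction and each application and differ only in the cost of a variable: 1 in the usual definition, 0 in the alternative one. The classes `Var`, `Lam`, `App` and the parameter `var_size` are illustrative names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def size(t: Term, var_size: int = 1) -> int:
    """Size of a term; `var_size` is the assumed cost of one variable occurrence."""
    if isinstance(t, Var):
        return var_size
    if isinstance(t, Lam):
        return 1 + size(t.body, var_size)
    return 1 + size(t.fun, var_size) + size(t.arg, var_size)


if __name__ == "__main__":
    # \x. \y. (x y)
    term = Lam("x", Lam("y", App(Var("x"), Var("y"))))
    print(size(term, var_size=1), size(term, var_size=0))
```

With `var_size=0` the size is the number of inner nodes of the syntax tree, which matches the way trees are counted in the combinatorial sections.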


These definitions of the size are, from the implementation point of view, not realistic because, if a term has a lot of distinct variables, it is not realistic to use a single bit to code them. The usual way to implement this coding is to replace the names of variables by their so-called de Bruijn indices: a variable is replaced by the number of <math>\lambda</math> that occur, on the path from the variable to the root, between the variable and the <math>\lambda</math> that binds it. Note that, in this case, different occurrences of the same variable may be represented by different indices.

Choosing the way we code these de Bruijn indices gives several other ways of defining the size of a term. This can be done in the following ways:

- Use unary notation, i.e. the size of the index i is simply i itself.

- Use optimal binary notation, i.e. the size of the index i is <math>\log_2(i)</math>, i.e. the logarithm of i in base 2.

- Use uniform binary notation, i.e. the size of an index is the logarithm, in base 2, of the number of leaves of the term.
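The de Bruijn coding and the index-based sizes can be sketched as follows. Named terms are tuples `("var", name)`, `("lam", name, body)` and `("app", f, a)`; the index of a variable is the number of lambdas strictly between it and its binder, as described above. The cost functions are assumptions: `unary` charges i for index i, and `binary_bits` charges the number of bits of i (at least 1) as an integer stand-in for the logarithm. None of these names come from the paper.

```python
def to_de_bruijn(t, binders=()):
    """Replace variable names by de Bruijn indices; binders[0] is the innermost lambda."""
    kind = t[0]
    if kind == "var":
        # index() returns the innermost binder with this name, so shadowing is handled;
        # it raises ValueError on a free variable.
        return ("idx", binders.index(t[1]))
    if kind == "lam":
        return ("lam", to_de_bruijn(t[2], (t[1],) + binders))
    return ("app", to_de_bruijn(t[1], binders), to_de_bruijn(t[2], binders))


def size(t, index_size):
    """Size of a de Bruijn term: 1 per lambda or application, index_size(i) per index i."""
    if t[0] == "idx":
        return index_size(t[1])
    if t[0] == "lam":
        return 1 + size(t[1], index_size)
    return 1 + size(t[1], index_size) + size(t[2], index_size)


def unary(i):
    """Unary coding: the index i costs i."""
    return i


def binary_bits(i):
    """Assumed binary coding: number of bits needed to write i, at least 1."""
    return max(1, i.bit_length())


if __name__ == "__main__":
    # \x. (x (\y. x)): the two occurrences of x get the indices 0 and 1.
    term = ("lam", "x", ("app", ("var", "x"), ("lam", "y", ("var", "x"))))
    coded = to_de_bruijn(term)
    print(coded, size(coded, unary), size(coded, binary_bits))
```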

Remark

See section ?? for a discussion of these different sizes.


definition

Let n be an integer. We denote by <math>\Lambda_n</math> the set of terms of size n.


definition

Let A be a set of terms.

1) We denote by <math>|A|</math> the cardinality of A.

2) We denote by <math>d(A)</math> the limit, for n going to <math>\infty</math>, of <math>\frac{|A \cap \Lambda_n|}{|\Lambda_n|}</math>.

Remark

Note that d is not exactly a measure since <math>d(A)</math> is undefined if the previous limit does not exist.

definition

Let P be a property of terms. We will say that almost every term satisfies P (this will also be stated as P holds a.e.) if <math>d(\{t \mid P(t)\}) = 1</math>.
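The density of a property can be estimated for small sizes by exact counting. The sketch below makes the following assumptions, none of which is fixed by this revision: terms are closed and counted up to renaming of bound variables (via de Bruijn indices), and the size counts 1 per lambda and per application and 0 per variable. It counts closed terms of size n and the proportion of them that begin with a lambda. The names `count`, `closed_terms` and `starting_with_lambda` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def count(n: int, free: int) -> int:
    """Number of terms of size n whose free variables are among `free` available binders."""
    if n == 0:
        return free                      # a single variable, bound by one of the binders
    total = count(n - 1, free + 1)       # the root is a lambda: one more binder below
    for i in range(n):                   # the root is an application
        total += count(i, free) * count(n - 1 - i, free)
    return total


def closed_terms(n: int) -> int:
    """Number of closed terms of size n."""
    return count(n, 0)


def starting_with_lambda(n: int) -> int:
    """Number of closed terms of size n whose root is a lambda."""
    return count(n - 1, 1) if n >= 1 else 0


if __name__ == "__main__":
    for n in range(1, 16):
        ratio = Fraction(starting_with_lambda(n), closed_terms(n))
        print(n, closed_terms(n), float(ratio))
```

Such a table only suggests the value of the limit; it proves nothing about the density itself.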

generating functions

this does not work (for now) because the radius of convergence is 0

no known results for the number of terms of size n (denoted )

our results

(the proof of result of section k needs the result of section (k-1))

Upper and lower bounds for

For the lower bound, we will first count the number of lambda-terms of size n starting with k lambdas and having no other lambda below. This means that the lower part of the term is a binary tree of size n - k with k possibilities for each leaf. Therefore we have:

And therefore, for , using our lower bound for and , we get:

with

Now, for fixed, we define (so ) and look for the maximum of this function. We have . Thus, is equivalent to . The Lambert function being increasing, this means that is equivalent to . Therefore, reaches a maximum for .

This means that reaches its maximum for fixed when is near which is likely not to be an integer. However, there are at least integer between and . Indeed, using our inequalities on the Lambert W function, we have:

Thus, we get the following lower bound for :

To simplify, using the fact that and taking large enough, we have the following lower bound:

We now compute an upper bound for the number of lambda-terms of size with exactly lambdas (that is, with leaves) using the Motzkin numbers and allowing any lambda to bind any variable (regardless of the real scope):

If we sum this for all possible and get an upper bound of using Lambert function as for the lower bound, we get the following upper bound for :

The ratio between our upper bound and lower bound is equivalent to (NEEDS FURTHER CHECKING):
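The two counting arguments above can be compared with the exact number of terms for small sizes. The sketch below is written under stated assumptions, since the formulas themselves are missing from this revision: terms are closed and counted up to renaming, the size counts inner nodes only, the lower bound is the best value of Catalan(n - k) · k^(n - k + 1) over k, and the upper bound is the sum over k of (number of unary-binary trees with k unary and n - k binary nodes) · k^(n - k + 1). All function names are illustrative.

```python
from functools import lru_cache
from math import comb


def catalan(m: int) -> int:
    """Number of binary trees with m inner nodes."""
    return comb(2 * m, m) // (m + 1)


@lru_cache(maxsize=None)
def count(n: int, free: int) -> int:
    """Exact number of terms of size n with `free` available binders (de Bruijn counting)."""
    if n == 0:
        return free
    total = count(n - 1, free + 1)
    for i in range(n):
        total += count(i, free) * count(n - 1 - i, free)
    return total


def lower_bound(n: int) -> int:
    """Best k for: k head lambdas, then a binary tree of size n - k, k choices per leaf."""
    return max(catalan(n - k) * k ** (n - k + 1) for k in range(1, n + 1))


def upper_bound(n: int) -> int:
    """Sum over k of (tree shapes with k lambdas) * (k choices for each of n - k + 1 leaves)."""
    total = 0
    for k in range(1, n + 1):
        shapes = catalan(n - k) * comb(2 * n - k, k)  # unary-binary trees, k unary nodes
        total += shapes * k ** (n - k + 1)
    return total


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, lower_bound(n), count(n, 0), upper_bound(n))
```

The printed columns make the gap between the two bounds visible already for small n.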

upper and lower bounds for number of lambdas in a term of size n

Jakub's trick: at least 1 lambda in head position

at least lambdas in head position and number of lambdas in one path

Remark: maybe 4) can be done directly without 3)

each of the head lambdas really bind "many" occurrences of the variable

every fixed closed term (including the identity !) does not appear in a random term (in fact we have much more than that)

comment : so different situation in combinatory logic and lambda calculus ; the coding uses a big size so need to count variables in a different way

Experiments

results of the experiments we have done

some experiments that have to be done : e.g. density of terms having or big Omega pattern ...

to be done

Upper and lower bounds for with other size for variables especially one, binary with fixed size

Open questions and Future work

.....