« Lambda counting » : différence entre les versions
| Ligne 82 : | Ligne 82 : | ||
| ⚫ | |||
| ⚫ | |||
simply, terms) is defined by the following grammar |
simply, terms) is defined by the following grammar |
||
<math>t, u := Var \ \mid \ \lambda x.t \ \mid \ (t \ u)< |
<math>t, u := Var \ \mid \ \lambda x.t \ \mid \ (t \ u)</math> |
||
definition |
definition |
||
| Ligne 93 : | Ligne 92 : | ||
To be able to define the notion of a {\it random} term we |
To be able to define the notion of a {\it random} term we |
||
have to define a distribution law on <math>\Lambda< |
have to define a distribution law on <math>\Lambda</math>. There are many |
||
possibilities for that. We choose here the simplest one. Note that this is the one for which, at least at present, we are |
possibilities for that. We choose here the simplest one. Note that this is the one for which, at least at present, we are |
||
able to prove some results. It is based on densities. For that we |
able to prove some results. It is based on densities. For that we |
||
| Ligne 102 : | Ligne 101 : | ||
definition |
definition |
||
The size (denoted as <math>size_1(t)< |
The size (denoted as <math>size_1(t)</math>) of a term <math>t</math> is defined by the following |
||
rules. |
rules. |
||
- <math>size_1(t)=1< |
- <math>size_1(t)=1</math> if <math>t</math> is a variable. |
||
- <math>size_1(\lambda x.t)=size_1(t)+1< |
- <math>size_1(\lambda x.t)=size_1(t)+1</math> |
||
- <math>size_1((t \ u))=size_1(t)+size_1(u)+1< |
- <math>size_1((t \ u))=size_1(t)+size_1(u)+1</math> |
||
In the rest of the paper we will use another definition (denoted |
In the rest of the paper we will use another definition (denoted |
||
as <math>size_0(t)< |
as <math>size_0(t)</math>) |
||
which is similar but gives simpler computations. |
which is similar but gives simpler computations. |
||
We believe (but we have not yet checked the details) that, with |
We believe (but we have not yet checked the details) that, with |
||
<math>size_1< |
<math>size_1</math> we would have similar results. The |
||
computation, with <math>size_1< |
computation, with <math>size_1</math>, of the upper and lower |
||
bounds of the number of terms of size <math>n< |
bounds of the number of terms of size <math>n</math> will be done |
||
in section ?? |
in section ?? |
||
definition |
definition |
||
The size (denoted as <math>size_0(t)< |
The size (denoted as <math>size_0(t)</math>) of a term <math>t</math> is defined by the following |
||
rules. |
rules. |
||
- <math>size_0(t)=1< |
- <math>size_0(t)=1</math> if <math>t</math> is a variable. |
||
- <math>size_0(\lambda x.t)=size_0(t)+1< |
- <math>size_0(\lambda x.t)=size_0(t)+1</math> |
||
- <math>size_0((t \ u))=size_0(t)+size_0(u)+1< |
- <math>size_0((t \ u))=size_0(t)+size_0(u)+1</math> |
||
| Ligne 137 : | Ligne 136 : | ||
The usual way to implement this coding is to replace the names of |
The usual way to implement this coding is to replace the names of |
||
variables by their so called de Bruijn indices: a variable is |
variables by their so called de Bruijn indices: a variable is |
||
replaced by the number of <math>\lambda< |
replaced by the number of <math>\lambda</math> that occur, on the |
||
path from the variable to the root, between the variable and the |
path from the variable to the root, between the variable and the |
||
<math>lambda< |
<math>lambda</math> that binds it. Note that, in this case, |
||
different occurrences of the same variable may be represented by |
different occurrences of the same variable may be represented by |
||
different indices. |
different indices. |
||
| Ligne 147 : | Ligne 146 : | ||
following ways |
following ways |
||
- Use unary notation, i.e. the size of the index <math>n< |
- Use unary notation, i.e. the size of the index <math>n</math> |
||
simply is <math>n< |
simply is <math>n</math> itself |
||
- Use optimal binary notation, i.e. the size of the index |
- Use optimal binary notation, i.e. the size of the index |
||
<math>n< |
<math>n</math> is <math>log_2(n)</math> i.e. the logarithm of |
||
<math>n< |
<math>n</math> in base 2. |
||
- Use uniform binary notation, i.e. the size of an index is the |
- Use uniform binary notation, i.e. the size of an index is the |
||
logarithm, in base 2, of the number of leaves of the term. |
logarithm, in base 2, of the number of leaves of the term. |
||
Version du 20 octobre 2008 à 13:11
Introduction
The question is: among programs, what is the probability of having a fixed property.
what kind of program : turing machines, cellular automata, combinatory logic, lambda calculus
what kind of properties : structural (for functional programs), behaviour (SN, weakly normalizable, ...
references to known results on : turing machines, cellular automata
we concentrate on combinatory logic, lambda calculus
Lambert function, Catalan and Motzkin numbers
Catalan numbers
- : Catalan numbers
Usual equivalent: which is obtained using Strirling formula. However, using stirling series: , we get that for we have
Thus, using this and , we have:
for all Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle n\geq1} but also for Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle n=0} .
Motzkin numbers
Let us define Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle M(n,k)} the number of unary-binary trees with inner nodes and Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle k} leafs. We get
Then, by summing we define Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle M(n)} the number of unary-binary trees with Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle n} inner nodes and give an equivalent:
Lambert W function
The Lambert function Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle W(x)} is defined by the equation which has a unique solution in .
For , we have which implies that near . To prove this, it is enough to remark that
This is not precise enough for our purpose. Using one step of the Newton method from , we can find a better upper bound for because is increasing and convex. This gives:
Indeed, if we define , we have and therefore, newton's method from gives a point at position:
Finally, we show that for , we have:
Indeed, for , we have , which implies and therefore .
combinatory logic
results on combinatory logic
Generality on lambda calculus
definition The set of lambda terms (or, simply, terms) is defined by the following grammar
definition
To be able to define the notion of a {\it random} term we
have to define a distribution law on . There are many
possibilities for that. We choose here the simplest one. Note that this is the one for which, at least at present, we are
able to prove some results. It is based on densities. For that we
first have to define the {\it size} of a term.
The usual definition is the following.
definition
The size (denoted as ) of a term is defined by the following rules.
- if is a variable.
-
-
In the rest of the paper we will use another definition (denoted
as )
which is similar but gives simpler computations.
We believe (but we have not yet checked the details) that, with we would have similar results. The computation, with , of the upper and lower bounds of the number of terms of size will be done in section ??
definition The size (denoted as ) of a term is defined by the following rules.
- if is a variable.
-
-
These definitions of the size are, for the implementation point of
view, not realistic because, in case a term has a lot of distinct
variables, it is not realistic to use a single bit to code them.
The usual way to implement this coding is to replace the names of
variables by their so called de Bruijn indices: a variable is
replaced by the number of that occur, on the
path from the variable to the root, between the variable and the
that binds it. Note that, in this case,
different occurrences of the same variable may be represented by
different indices.
Choosing the way we code these de Bruijn indices gives different other ways of defining the size of a term. This can be done in the following ways
- Use unary notation, i.e. the size of the index simply is itself
- Use optimal binary notation, i.e. the size of the index is i.e. the logarithm of in base 2.
- Use uniform binary notation, i.e. the size of an index is the logarithm, in base 2, of the number of leaves of the term.
generating functions
this does not work (by now) because radius of convergence 0
no known results for the number of terms of size n (denoted )
our results
(the proof of result of section k needs the result of section (k-1))
Upper and lower bounds for
For the lower bound, we will first count the number of lambda-terms of size starting with lambdas and having no other lambda below. This means that the lower part of the term is a binary tree of size with possibility for each leaf. Therefore we have:
And therefore, for , using our lower bound for and , we get:
Now, for fixed, we define (so ) and look for the maximum of this function. We have . Thus, is equivalent to . The Lambert function begin increasing this means that is equivalent to . Therefore, reaches a maximum for .
This means that reaches its maximum for fixed when Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle k} is near to Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle \frac{n+1}{W(4e(n+1))}} which is likely not to be an integer. However, there are at least Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle \left\lfloor \frac{n (\ln(\ln(4en)) - 1)}{\ln^2(4en)}\right\rfloor} integer between Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle \frac{n+1}{W(4e(n+1))}} and Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle \frac{n+1}{\ln(4e(n+1))}} . Indeed, using our inequalities on Lambert W function, we have:
Thus, we get the following lowerbound for Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle L_n} :
To simplify, using the fact that Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle \lim_{n\to +\infty}\left(\frac{\ln(n)}{\ln(4en)}\right)^n = 0} and taking Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle n} large enough, we have the following lowerbound:
We now compute an upper bound Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle UB(n,k)} for the number of lambda-terms of size Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle n} with exactly Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle k} lambdas (that is with Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle n - k + 1} leaves using the Motzkin numbers and allowing any lambda to bind any variable (regardless of the real scope):
If we sum this for all possible Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle k} and get an upper bound of Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle k^{n-k+1}} using Lambert function as for the lower bound, we get the following upper bound for Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle L_n} :
The ration between our upper bound and lower bound is equivalent to (NEEDS FURTHER CHECKING):
upper and lower bounds for number of lambdas in a term of size n
Jakub's trik : at least 1 lambda in head position
at least Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle o(\sqrt{n/\ln(n)})} lambdas in head position and number of lambdas in one path
Remark: (may be 4) can be done directly without 3))
each of the Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle o(\sqrt{n/\ln(n)})} head lambdas really bind "many" occurrences of the variable
every fixed closed term (including the identity !) does not appear in a random term (in fact we have much more than that)
comment : so different situation in combinatory logic and lambda calculus ; the coding uses a big size so need to count variables in a different way
Experiments
results of the experiments we have done
some experiments that have to be done : e.g. density of terms having Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle \lambda x.y} or big Omega pattern ...
to be done
Upper and lower bounds for Échec de l’analyse (SVG (MathML peut être activé via une extension du navigateur) : réponse non valide(« Math extension cannot connect to Restbase. ») du serveur « https://wikimedia.org/api/rest_v1/ » :): {\displaystyle L_n} with other size for variables especially one, binary with fixed size
Open questions and Future work
.....