« ANR PML » : différence entre les versions

Version du 18 novembre 2008 à 23:58

Liste des participants

Cette liste sera supprimée du document final puisque les mêmes information s'y trouverons. Pour l'instant ceci est la liste des personnes contactées qui ont répondu au mail de contact et qui n'ont pas encore dit non (ou qui ont dit oui):

Christophe Raffalli (coordinateur) : 10 mois par ans (voire plus) (50% sauf année 1 si délégation ?)
Pierre hyvernat : 3 mois par an (50%)
Lionel Vaux : 3 mois par an (50%)
Frederic Blanqui : 3 mois par an (100%)
Pierre Weis : 2 mois par an (100%)
Pascal Manoury : 2 mois par an (50%)
Marianne Simonot : 2 mois par an (50%)
Olivier Pons : 2 mois par an (50%)
Maria-Virginia Aponte : 2 mois par an (50%)
Julien Forest : 2 mois par an (50%)
Colin Riba : 3 mois par an (50%)
Alexandre Miquel : 2 mois par an (50%)

Total sans tenir compte des pourcentage : 36 par an

Total : 20,5 mois/an (en tenant compte des enseignant chercheurs) Total si 3 ans : 61,5 mois Total si 4 ans : 82 mois

Budget humain possible

sur 4 ans : 1 thèse = 36 mois, 2 post-doc de 2 ans = 48 mois => total = 84 mois (OK)
sur 4 ans : 2 thèses = 72 mois, 1 post-doc de 1 ans = 12 mois => total = 84 mois (OK)
sur 3 ans : 1 thèse = 36 mois, 1 post-doc de 2 ans = 24 mois => total = 60 mois (OK)
Où

Context and positionning of the proposal

Ever since FORTRAN appeared fifty years ago, Programming languages have been evolving rapidly. These languages now include more and more sophisticated concepts like "object", "type inference", "modules"... This richness makes it more and more complex to train programmers and makes it difficult for them to follow the inovations and changes in programming languages.

Another phenomenon is the emergence of formal methods used in conjonction with various programming languages. This generaly introduces another layer to languages in order to write specifications, and sometimes yet another one for proofs. Learning a programming language together with the associated specification/proof languages can take an important effort.

Projects such as ACL2 (the successor of nqthm, the Boyer-Moore theorem prover) use a rather simple language (namely LISP) both as a programming language and specification language, allowing to keep a unity in the system. Unfortunatly, LISP is limited both as a programming language (no good treatment of sum type, no module system) and specification language (very limited quantification). Moreover, LISP has no compile-time type-check and these could be very useful to detect bugs and automaticaly assert properties.

The aim of this project is to build a very powerful language (with no loss of expressive power compared to state of the art languages) based on a very small number of simple features. We think this will be possible thanks to recent progress both in the semantics of programming language and the apparition of new algorithms for type inference based on constraints-solving. In fact we propose in [RAF1] an innovative concept derived from the later to enable this: contraints-checking.

Moreover, our language will be used not only as a programming language and as a specification language (like in ACL2), but also as a proof language. This is natural for an ML-like language because, for instance, pattern-matching is a natural and powerfull way to make a case analysis in a proof.

The idea of a new language arised from the discovery of a new typing algorithm [RAF1] whose implementation gave birth to a first experimental implementation of PML (Proved ML) by Christophe Raffalli. This implementation is already available from the web page of the language. The basic idea (described below) for type-checking are also implemented. However, turning PML into a real tool requires a lot of research and implementation work and this is why we request the help of the ANR. Here are some of them (more details are given in the next sections).

PML is different from the other modern programming languages because its design focuses on a few powerful concepts. One problem this raises is that it is more difficult for a compiler to produce efficient code. In particular, since PML unifies several notions of products (modules, tuples and records), there is no simple way to choose the internal representation of a product, expecially with implicit suptyping. Writing a good compiler for PML will require more complex optimisations schemes (probably driven by typing) than languages like OCaml or Haskell.

Here is a simple example, accepted by the current version of PML, demonstrating product type, sum type and subtyping. We define ternary tree as an extension an binary tree with an implicit suptyping relation (all function accepting binary_tree will accept ternary tree, by ignoring the middle_son):

type rec binary_tree (A) =
  [ Nil[] | Node[A with val left_son : binary_tree(A) val right_son : binary_tree(A)] ]
type rec ternary_tree(A) =
  binary_tree({ A with val middle_son : ternary_tree(A)})

PML requires a termination criterion because a proof by induction will just be a normal recursive program. Such a program has to be well-founded in order to correspond to a valid proofs. A lot of languages (Haskell, SML) can now use Aprove to establish termination of simple programs. However, we want the test to be fully integrated with the language, and be compatible with very modular programs. At the moment, these goals seem difficult to achieve with Aprove or other external termination checkers.

PML's approach to proofs is very innovative because it can be seen as an intermediate level between the fully automated approach of ACL2 or PVS and the aproach based on a specific formalism for proof like COQ, ALF, Isabelle, ... The former proof assistants accept to use any automated theorem prover, with one main drawback: the correction of such a system are postulated a priori. The later proof assistants define proofs in a precise framework such as natural deduction, calculus of construction, ... and provide tactics to build complex proofs. The proof are usually tedious to write and impossible to read or write without a specific interface. Even with a declarative language like Mizar (or the Mizar mode for Isabelle or coq), it is not very easy to build proofs, let alone concise proofs!

PML logic is in fact the equationnal theory of its programming language and we plan to use variants of Knuth-Bendix completion as a proof-checker. The first experiments with the current implementation are promising. However, Knuth-Bendix procedure in the context of ML is a complex and new research problem and it needs a lot of work, for instance to handle proof in arithmetic which are often needed. Here is an example of a proof in arithmetic, checked in the current version, which is not completely satisfactory (because too hard to write), but shows the use of a language for both proofs and programs and the use of recursive function for inductive proof:

 val rec mul_associative x y z |- (x * y) * z == x * (y * z)
   proof match x with
     Z[] -> 8<
   | S[x'] -> 
     let hr = mul_associative x' y z in
     let _ = mul_right_distributive y (x' * y) z in
     8<

Scientific and technical description

Background, state of the art

Programming language

The ML programming language, created by Robin Milner et al in the 80th has two major distinctive features [FIXME]:

Algebraic data-types and pattern matching: data types are basically all constructed using inductive definition, cartesian products (product type) and disjoint union (sum type).
Statical type inference: the type of every piece of code is automatically infered using Hindley-Milner algorithm (HM) and by construction, once compiled, an ML program can not crash (no segmentation fault). More precisely, when we do not use unsafe features of the language (like interface with unsafe libraries written in C), if an ML program crashes, then there is a bug in the type-checker or the compiler.

Recent progress in type inference algorithm uses constraint solving. This means that the type system can be described in first-order predicate logic in such a way that a type inference problem is a formula written in a decidable fragment of first-order predicate logic (often the purely existential fragment). Hence, any constraint solver can be turned into a type-check for ML. These approach is know as HM(X) [FIXME].

There are two problems with this approach:

The complexity of constraint solving can be too high for practical use, especially using general purpose constraint solver. Currently (to our knowledge), there are no mainstream implementation of ML based on HM(X).
HM(X) does not completely solve all problems of subtyping. The language to express the type constructed by the algorithm is the same as the language used by programmers to write types. Hence, if two constraints $\alpha \subseteq \beta$ and $\alpha \subseteq \gamma$ for three types $\alpha ,\beta$ and $\gamma$ . Then, the most natural solution would say that $\alpha =\beta \cap \gamma$ . This imples that intersection of types needs to be part of the language for type. This means that constraints like $\beta \cap \gamma \subseteq \delta$ may also appear and they are problematic to deal with. Moreover, the same reasonning apply for union and then, constraints of the form $\beta \cap \gamma \subseteq \beta '\cup \gamma '$ may appear and at least increase a lot the complexity of the constraint solving.

PML's approach is to replace type-inference by just constraint checking: we just check that a solution exists for the constraints in some model. The syntax to write the solution is not available to write types in the language and is even not defined. Type-checking in the current implementation of PML can be seen as a black box ensuring that nothing can go wrong during execution. Furthermore, the type used by the programmer are in fact only syntactic sugar for the projection operator onto the intended type (giving for free nice feature like higher-oder parametric type, that is type with parameters which may be themselves type with parameters).

With this approach, we loose type-inference and the ability to display types in error message. Nevertheless, in case of error, we display three positions in the code and error message like this error at position 1, label "id" projected at position 2 do not appear in the value constructed at position 3. This kind of error messages are in fact of bounded length and often more useful that OCaml or SML messages. We can understand this as showing three points in the slice of the error, which is introduced by Joe Wells in [WELLS1]

Proof assistant

Proof assistyants have evolved rapidly since Automath in the 70th. Two main threads exist: automated proof assistant such as ACL2, PVS and safe ones such as Coq, Isabelle, PhoX, Lego, HOL, ... The former incorporate very sophisticated automated deduction strategy, with no certificate for the validity of the proof, while the later require all proofs to be done in a specific framework (like natural deduction or type theory) allowing for a simple check of the proof. The gap between the two approaches tend to be reduced, by the emergence of complex tactics (for Coq or Isabelle mainly) that build proofs for you. For instance Zenon is an almost state of the art first-order theorem prover that output a Coq proof tree.

The common defect of all this proof assistants is that proof can not be written nor understtod without running the proof assistant. Some proof assistants do not follow exactly the above scheme: for instance Mizar and Alf. Mizar has a declarative style for proof which is humanly readable and checked by a limited checker (this proof style has been adapted to Coq and Isabelle). Unfortunatly, there is no formal description of Mizar logic available. Alf on the other hand is based on proof theory and requires the user to basicaly write the complete proof tree just leaving out a few details. The logic is very well formalized and simple, but writing proof is tedious and not similar to usual practice in informal mathematics.

This pictures of the world of proof assistants shows that some fundamental work should be done here. In the current version of PML, the logic is just an equationnal theory of the underlying programming language which should be easy to describe formally when it is stabilized. And the proof engine, is like in Mizar, a limited automated checker inspired by the Knuth-Bendix completion algorithm (KB). But KB has to adapted to the higher-order construct of ML like languages and also to be limited to always terminate and possibly be fast. The current limitation is much too strong (it is limited to closed terms and can not use universal theorems automatically as in the second example of section 1 where one has to make precise how we use distributivity).

Nevertheless, examples in the current version shows that our approack is worth trying, because our first proof are concise and readable (when you know the language). A very encouraging point is that all examples where written without interface. This really means that proof are readable without the help of a computer.

Rationale highlighting the originality and novelty of the proposal

The final objective of our project could be a full PML compiler, bootstrapped and completely proved with itself. This does not exist for any language but is probably too ambitious in four year (who knowns ...). We plan to produce in four years (only) a compiler for PML, written in PML, but not proved in PML yet. On the other hand, we want to develop proof-checking to allow very elegant proofs, supporting this claim by various examples, some of them being near to industrial application, some others being algorithms coming from implementation of programming languages (to check that full bootstrapping is possible).

We really want to focus on the quality of the language both for proofs and programs. Here quality should be easy to understand and write (and therefore, easy to learn). We think that using a programming language for proof could make formal method more attractive to the industry without the defect of system like PVS and ACL2 where the automated tactic replace the need for a proof language, but are sometimes hard to control and use. For instance, finding the right lemmas to make a proof possible in ACL2 is quite difficult.

Comparison with the GALLIUM project at INRIA: This project has very similar goals. We have probably a more fundamental approach of the problem and are trying more advanced (and risked solutions). Some choices of each project are incompatible: the GALLIUM team is exploring the use of sytem F to enhance the power of ML typing. However, if the logic in fully mixed with the programming language, then we loose coherence (in fact we can encode the paradox of system U). So we can highlight PML and GALLIUM difference by one project keeping the logic seperated from the programming language: GALLIUM team uses Coq either to extract program either to prove proof obligation, while our project tries to merge the logic and the programming language.

Scientific and technical programme, project management

Specific aims of the proposal

As said in the previous section, the final objectif would be to have a fully bootstrapped PML language, which would mean that PML is entirely written ans proved in PML is too ambitious as first. We choosed to focus here on the design of the language plus a proof of concept, that is compilation and proof of various examples, searching to do our best on the following points:

Natural way of writing programs (Task 1,Task 5)
Efficiency of the code generated by the compiler, despite the heavely use of subtyping (Task 3)
Readable and short proof (Task 1, Task 4, Task 5)
Efficiency of type-checking and compilation (Task 2)
Efficiency of proof-checking (Task 4)

Project management

We plan to have 3 meetings per year, 3 days each, with almost all members of this projects. We think tihs meeting are fundamental to keep the project running, inform everybody of the project progress and problems.

We want also to organise visits of one or two members of the project to the site of antother member. We plan a total of 40 weeks (almost 10 month) for the four years of the project. This visit could be fundamental for finalising or get started either papers of implementation. A lot of tasks involve many people coming from different horizon. We think this could be very fruitful, but requires good communication.

The structure with only one partner with a coordinator devoting most of its time to the project (84%), should simplify a lot other management issues.

Detailed description of the work organised by tasks

Task 1 - theoretical work, design of the language

Coordinateur: Christophe Raffalli

Participants: Alexandre Miquel, Pierre Hyvernat, Pierre Weis

1.a Correctness of the constraint checking algorithm to check typing (delivered 09/2009): This is work in progress [RAF1] and is likely to be terminated before the beginning of the project. Nevertheless, we include this goal here for clarity.

1.b Consistency of proof-checking (delivered before 09/2010 for the core of the language): This is essential for define clearly the logic of PML and prove it is concistant. This should not be too hard for the core of the language. However, this proof has to be extended to take in account all future extension of the language and could be seen as a permanent task.

1.c Adaptation of the uniqueness typing to the context of constraint checking (delivered 09/2011): The current version of PML is a pure functional programming language, with no imperative feature. This is problematic, because input/output are necessary for real programs and affectation are required for efficiency especially when using large arrays. We plan to adapt the approach of the Clean language [FIXME]. In clean, all program can be analized as purely functional programs, but the type system can check that some data are not use any more and reuse the place in memory for other data (allowing affectation). For instance, in such a context wrting in a file f can be written as let f' = write f str in ..., but the compiler much check that f will not be used anymore implying the equivalence between the standard imperative semantics of writing to file and the purely functional semantics used by proofs.

1.d Private and abstract data types. (delivered 09/2010 for private type, 09/2011 for abstract type)

Abstract data types hides the definition of a data type and allow the user of a library to be sure that his code will continue to work event if the internal representation of data are changed. In the context of constraint-checking in PML, added abstract data types seems a to be a chalenging task. Moreover, abstract data-types is a form of existential quantification over types in types and could raise some consistency issues. We hope to find a way to incorporate abstract/existential types in PML without loosing coherence.

Private data types are a very simple yet very powerful mechanism for easily ensuring invariants on all the values of a data type. This mechanism is as follows. The compiler simply checks that the constructors of a data type are not used for constructing values. Values are constructed by using construction functions like with abstract data types. However, unlike with abstract data types, constructors can still be used as patterns for defining functions by pattern-matching. Hence, a library on a private data type is not closed but can be extended easily. Private data types are therefore an important and very useful feature for defining data structures with complex invariants and proving their correctness more easily. They have been implemented in OCaml by Pierre Weis and are described in Frédéric Blanqui, Thérese Hardin and Pierre Weis ESOP'07 paper.

Task 2 - termination criterion

Coordinator: Pierre Hyvernat

Participants: Frédéric Blanqui, Colin Riba, Christophe Raffalli, Olivier Pons, Marianne Simonot , Julien Forest

Essential task, running during the 4 years: In this field we will always find embarassing example that do not work, but could work ... both for the core and uxilliary criterion (see below for the distinction)

Even if it might be counter-intuitive at first, it is often necessary to write programs whose execution can be infinite. For example, any kind of "server", or almost any interactive program might have infinite executions. Even in purely mathematical programming, it can be interesting to have intermediary non-terminating functions. Consider a function outputting the stream of prime numbers : even if this function is non-terminating, we might use it in a terminating manner in a proof by requesting the n first prime numbers.

Since PML uses full recursion (keyword rec), writing such programs is easy. On the other hand, the notion of "terminating", or "well-founded" recursive function isn't part of the core of PML: such programs are just special cases of recursive programs. The user will have to specify that such and such function are in fact terminating and might have to prove it himself.

Proofs of specifications are just PML programs, but those cannot be allowed to run infinitely. In fact, proof will not run at all: they are only required to be well-founded; the consistency of PML relies on this. In order to relieve the user from proving that all proofs are in fact terminating, PML should offer a way to check automatically that (some) functions are terminating. Because the halting problem is undecidable, it is hopeless to do that in all generality, but this generality is seldom necessary. PML should only work for most of the functions, most of the time. (Rather than work for all the functions, all the time...)

Technically speaking, PML can infer an error called Loop when it encounters a program which, it thinks, may not terminate. Such an error can of course be raised, but cannot be captured: this is an error rather than an exception. The property we need to guarantee is that if PML doesn't raise the error Loop, then the program in question does indeed terminate. Whatever happens, if the error Loop is raised for a terminating function, the user can still provide PML with a proof that this error is never raised. (PML current syntax for that is [p proof ... ] where p is a term and ... is a proof that p doesn't raise any exception nor error: this is the desired property for proofs.)

2.a Core termination criterion (delivered 09/2010)

This first test will be part of PML's proof-checker hence the consistency of PML will depend upon the correctness of its implemntation. We could just accept primitive recursive functions as terminating, but this is just not strong enough for any practical use. (Users of the COQ proof system know that using structural ordering on a single argument isn't practical...). Currently, there has been little work on this aspect of PML. A simple yet effective termination checker has been implemented, but this test isn't plugged to the rest of PML. The termination criterion used is based on "size change criterion" of Lee, Jones and Ben-Amram. This test successfully checks termination for primitive recursion, lexicographic ordering and permutation of arguments. This covers most simple practical cases.

Once this is totally integrated in PML, several things need to be done:

2.b Test and improvement of the core termination criterion (delivered 09/2011)

We shoud test and improve the first criterion: in particular, the size change criterion depends on a notion of size of terms. Two natural choices are the size (approximately the total number of constructors) and the depth. Those two notions are not equivalent as far as the criterion is concerned: the size of an argument may decrease while its depth increases, and vice and versa. It is not obvious which one is best, and it might be possible for the user to provide his own notion of size. Meanwhile, there is a practical workaround which consists in adding a "dummy" integer argument to our functions which will act as the overall size of the other arguments.

Here, we plan to adapt to PML a termination criterion based on the technique of type-based termination, which allow for recursive call through size preserving function such as List.map. There are several possibilities, ranging from very simple systems such as the one developed by Abel (RAIRO'04), Barthe et al (MSCS'04) or Blanqui (RTA'04, CSL'05), to the very rich system of Blanqui and Riba (LPAR'06). In the latter, given for each function some formula in Presburger arithmetic describing how the size of the output is related to the size of the inputs (this can be checked automatically), the termination follows from the fact that recursive calls are done on strictly decreasing arguments, using for instance lexicographic or multiset comparisons together with linear combinations of the arguments. Intermediate systems, such as the one of Barthe, Grégoire and Riba (CSL'08) which is powerful but with a lower complexity than Presburger arithmetic, have also to be considered.

However, the above test might be considered too error prone for their integration in the core criterion. We hope to be able to stabilized here the core criterion in september 2011 using a compromise between power and simplicity (which implies security), but all rejected criterion could then be reused in the next task:

2.c develop external termination criterion (delivered 09/2013)

Here, we want to be able to use any possible kind of algorithm, but we must find a way to output from the success of the criterion, a modified program (for instance with extra arguments, or adding an explicit termination proof or a combination of both) that can be proved terminating using only the core criterion. To do this, we propose to extend the CoLoR platform which basically do a similar task for Coq. This adaptation might be difficult mainly because the proof language of PML and Coq are very different. However, the fact we can use the core criteria, rewrite programs, produce proofs ... offer many possibility that are not present in Coq. Moreover, the equational nature of PML's proof-checking may be well adapted for this task.

Among the possible tests we are planning we can cite:

The higher-order recursive path ordering, developed by Jouannaud, Rubio and Blanqui (LPAR'07) for automatically proving the termination of higher-order rewrite systems.

The techniques developed by Giesl et al for automatically proving the termination of Haskell programs by using the powerful state-of-the-art termination techniques developed for rewrite systems (RTA'06) and implemented in Giesl et al automated termination prover Aprove (winner of last year competition on termination).

Task 3 - compilation

Coordinateur: Christophe Raffalli

Participants: Maria-Virginnia Aponte, Pascal Manoury, Pierre Weis

3.a A first compiler using LLVM (delivered 09/2010) LLVM is very attractive, because it is rather simple to use (it even has an OCaml interface) and can compile for a bytecode interpreter, can be used as a JIT compiler of a standard compiler. Moreover, it support a lot of platforms. It also provide some optimisations, which is important. We think that writing a compiler, with no optimisation, for PML using LLVM should not be too hard (this is important that this task be easy, because this is not really reasearch ...)

Then, we would like to improve our compiler in various direction. We mention here the ones that are innovative in this domain (we should also consider more standard optimisation, but we do not mention them specifically).

3.b Representation of cartesian product and disjoint sum (delivered 03/2011 for product and 09/2011 for sum types) PML allows only one kind of cartesien product which in general (because of multiple inheritance and implicit subtyping) must be represented as a table (hashtable or maps based on binary search trees). These can impact performance. We plan to generate extra constraints for each occurrence of a constructor of a cartesian product in your code. Then, solving this constraint in a way that maximise speed (or size) should allow a representation of cartesian product as efficient (or even more efficient for complex object oriented program) than using, for instance, OCaml. The same kind of optimisation (with different optimisation criterion) should be done for sum type and the implementation of pattern matching.

3.c Unboxing (depend upon 3.b delivered 09/2012) In general, 32 bits integer and floating point number are boxed (that is represented by a pointer). This allows a more ellegant language. This can lead to major impact on performance especially when arrays are involved. We think that constraint-checking is a good framework to propagate type information and allow efficient unboxing. Nevertheless, doing enough unboxing to try to match the performance of low level language like C is very hard. We hope that we can reuse some of the work of task 3.b, because unboxing can be seen also as the optimisation of the représentation of a cartesian product with only one field.

3.d compilation of affectation (reference and arrays) and IO (depend upon 1.c delivered 09/2013) After adapting uniqueness typing to PML (task 1.c), we will be able to compile affectation and IO imperatively as in any imperative programming language.

3.e How to unify the mathematical hierarchy of numbers with the current practice in computer science (depend upon 3.c, delivered 09/2013) This task can be seen as a generalisation of unboxing (task 3.c). One solution would be to propose only one type for integer, and let the compiler choose the representation, possibly using specifications bounding values. This means that we should be able to propagate typing information about the size of numbers to use efficient representations as much as possible. This seems to be a very challenging task and will probably not work well enough because the propagated bound will be based on the worst case. Nevertheless, we think it is worth trying to see how far we can go.

Task 4 - equational reasonning

Coordinator: Christophe Raffalli

Participants: Frédéric Blanqui, Pierre Weis

The kernel of the proof engine will be based on completion techniques. Knuth-Bendix completion tries to transform a set of unoriented equations into a set of (inter-reduced and) convergent, that is, terminating and confluent, set of rewrite rules. It can therefore be used for proving that some equality is the equational consequence of some equational theory. Indeed, when an equational theory can be completed into a convergent rewrite system, two terms are equivalent in this equational theory if their normal in the convergent rewrite systems are equal.

4.a Adaptation of the Knuth-Bendix completion algorithm to PML (delivered 09/2013) As explained just before, Knuth-Bendix completion is based on rewriting. However, in PML or in Moca, programs are not rewrite systems and program evaluation is not rewriting. We therefore need to adapt Knuth-Bendix completion. We need both generalisation, to use the notion of constructtor present in ML and take care of the higher-order nature of ML (even if we can not hope much here). As said above, something is already implented, but it is trivial because completely restricted to closed terms. A first version should be able at least to rewrite a closed term modulo some simple equational (and universal) theory. An important point here is to ensure termination of this algorithm and even a low complexity. The price to pay, will be incompleteness. A lot of care should also be devoted to the corectness of the implementation, because like for task 3.a, the consitency of PML rely on it.

This task in one of the maor task for PML and as already said, a first version already exist (far too weak), and we will have to improve this during the four year either to treat new cases, to improve performance, to make the code more safe regarding its correctness. This is why we consider this to be a priori a permanent task of this project.

Moreover, we include other task not only related to PML, but that concern equationnal reasonning and that could be directly or inderectly related to task 4.a.

4.b Development of a stand-alone tool for completion modulo associative and commutative symbols (delivered 09/2011), like it is the case with addition and multiplication on integers. We want to develop a stand-alone tool so that it can be easily reused by other people in other projects or tools. Apart from PML, it will be in particular interesting in Europa, a proof assistant based on the lambda-pi-calculus modulo rewriting developed by Gilles Dowek and Mathieu Boespflug; etc. Note that, for implementing this tool, we could start from the CiME library.

4.c Improvement of Moca and adaptation to PML (depend upon task 1.d and 4.b, delivered 09/2013) Completion is at the heart of Moca, it is used for automatically generating construction functions for private data types with algebraic properties as described in Frédéric Blanqui, Thérese Hardin and Pierre Weis ESOP'07 paper. As an example, consider a type with a constructor taking as argument a list. If this constructor is declared "commutative", then Moca will generate construction functions for this type that will guarantee that the list argument is always sorted (using a user-defined comparison function).

Moca also suffer from some limitation due, because it really works for commutative and associative symbols and a few other specific equation (neutral element, inverse, ...). The task 4.b should allow to generalize the applocation of tool.

Moca currently generates OCaml code without guarantee on its correctness. Using PML instead, Moca could also generate a proof of the correctness of the generated construction functions. Then, later, when trying to prove the correctness of a function defined on this private date type, one could use the invariants satisfied by the values of the private data type as assumptions, since these invariants are satisfied by construction.

Task 5 - validation (transversal task)

Coordinator: Maria-Virginia Aponte

Participants: all members.

The validation requires to write numerous examples to check that we fullfill our main goal, which is that all programs (with or without proof) are written in the best possible way. This work being fundamental research, we think that it is really important that any piece of code written in PML that does not look right should be carefully examined to check that this is not due to a defect in language design.

Marianne Simonot, Maria-Virginia Aponte, Olivier Pons, julien Forest which are all member of the team SUR whose thematic is to make it possible to integrate proof assistant in the industrial process of software conception. They will try to check if this is possible and even better with PML than with other tool. We think that the main goal of our project is to obtain an important progress in this respect and all failure here shoud be carefully analysed and possibly fixed.

This task should deliver between 10.000 and 100.000 lines of PML code to have a sufficient corpus to say in which respect we fulfilled our goals.

Planning of tasks, deliverables and milestones

Présenter sous forme graphique un échéancier des différentes tâches et leurs dépendances (par exemple, utiliser un diagramme de Gantt). Présenter un tableau synthétique de l'ensemble des livrables du projet (numéro de tâche, date, intitulé, responsable).

PAS SUR LE WIKI ...

Data management, data sharing, intellectual property and results exploitation

Results in each of the tasks will be published in journal (APAL, TCS) and international conferences as usual (LICS, TLCA, CSL, CIE).

PML language is already distributed as open source software under the Cecill-B license. We think that for such a research platform, this is the only possible way to ensure that people will try it. As soon as a first compiler is available, we plan to produce easy-to-install package for most linux distribution (Debian, Ubuntu, ...).

Consortium organisation and description

Relevance of the partner for the project

This project involve only one partner, the LAMA (UMR 5127) where the coordinator of the project already developped (mostly alone) the proof assistant PhoX. The main characteristic of PhoX is to ve rather simple to use due to a set of tactics which is limited (less than 20 for the principal ones), but powerful. Moreover, tactic are extensible by incorporating theorems inside the tactics.

Clearly, this means that efficiency was the main goal of PhoX. Unfortunately, like all tactical theorem prover, PhoX proof are not readable. After a few try [FIXME: citation demonat] with a mizar like mode for PhoX, Christophe Raffalli decided to move to a new theorem prover, starting from scratch.

Moreover, many new collegues arriver recently in the laboratory: Pierre Hyvernat, Lionel Vaux, Laurent Vuillon, Jacque-Olivier Lachaud, Guillaume Thessier (only the two first are members of this ANR, but we have frequent discussion with other about PML, in particular with Guillaume Thessier). This created a strong dynamic very suitable to host such a project.

Qualification of the project coordinator and members

The coordinator and various members of the project comes from various horizon (see section 7), but they have a common background around the use and development of programming language and/or formal methods. We think that this variety, the number of members, should allow for good communication and should be very fruitful.

We think, that compared with the development of PML by Christophe Raffalli alone, such a team should speed up the development of PML at least by a factor 5, making it possible to deliver a very innovative and useful tool at the end of the project. The lack of support for such a team would certainly limit the tool to an experimental toy with many development only partially (or even not at all) integrated in the project.

Scientific justification of requested budget

meetings (total 86800€, considered as a coordination cost for task 0 in document A) :

4 meetings per year for 10 people (duration 3 days). Each meeting costs 1 train ticket in France, 3 days and 2 nights: 10*(300 + 3*20 + 2*80) = 5200€. Then for all meetings in 4 years, this gives 83200€.

Moreover, Frederic Blanqui being in China, his travel costs 900€ instead of 300€. If we consider that due to the distance, he comes only to 6 meetings, this is an over cost of 3600€.

visits (total 52800€)

Visits in France : We plan for 8 visits of one week for each task (2 visits per year and per task, 40 visits for the whole project). The cost per visit is: 300 + 8*20 + 7*80 = 1020. The total cost is 40800€ (8160€ per task).

Visits in China: We plan 4 one-month visits for Frederic Blanqui (Pierre Weis, Christophe Raffalli and twice Cody roux). The cost of such a visit is 900€ for the travel, 1500€ for the hotel and 600€ for the meals. This means 3000€ per visit and a total of 12000€ (6000€ for task 2 and 4 involving Frederic Blanqui).

conferences (total 18000€)

The member of the ANR will submit papers to international conferences and journals (for an average of two publications per man.year, this gives 20 publications). We estimate around 12 of these publications to be in international conferences with an average cost of 1500€. This gives a total of 18000€ (3600€ per task)

computers (total 10000€)

We consider that the ANR has to participate the renewal of the computers of its participants. The lifetime of a computer being 3 years and the total number of month for permanent members of the project being 120, we think that we should pay for 4 computers with an average value of 2500€ each (we need powerful computers and laptops, because automated reasoning requires a lot of computations and memory).

human resources (1 phd student and 2 post-docs)

This project involves many tasks and a lot of members. However, all members appart from the coordinator can only devote 2-3 month per year to this project (they are invoved in other ongoing reasearch). This is far from neglectable. But we think that we will need more human power and we think that 1 phd and 2 post-doc are reasonnable. This would give 124 month (40 for the coordinator, 36 for the phd and 48 for the 2 post-doc) by reasearcher spending most of their time on the project and [FIXME ] month by people with 2-3 month per year. We consider that this is a good balance.

Annexes

CV, Resume of all project members

Christophe Raffalli (project coordinator)

Age 41, MCF at the LAMA (UMR 5127) since september 1997.

After his phd in Paris VII (defended in februry 1994), Christophe Raffalli spent 1 and a half year at the LFCS in Edinburgh, 2 years at Chalmers university in Göteborg (post-doc of the TYPES european project) and 1 year as ATER in Creteil university.

Research administration: For ten years, the LAMA was subsite of the Paris VII site inside the TYPES project which was renewed several times and Christophe Raffalli was the coordinator for this subsite. Currently the project is not financed bu E.U. Nevertheless, Christophe Raffalli is in charge of the organisation of the next TYPES meeting in Aussois in mai 2009.

Is research interest includes

Theory and implementation of proof assistant.
Proof theory.
Implementation of programming language (especially type-systems)

Maria-Virginia Aponte (Cedric, CNAM, PARIS)

Frederic Blanqui (INRIA, Rocquencourt et LIAMA Beijin en 2009-2010)

Age 36, permanent full-time INRIA researcher.

Since September 2008, Frederic Blanqui is at LIAMA, the Sino French Laboratory in Computer Science, Automation and Applied Mathematics, and Tsinghua University, Beijing, China.

Between 2003 and 2008, Frederic Blanqui was at LORIA, Nancy, in the Protheo research team led by Pr Claude Kirchner, focusing on the use of rewriting techniques for programming, as well as specifying and proving program properties.

Frederic Blanqui is expert in the following areas: type theory, rewriting theory, termination, functional programming, proof assistants, formal certification of program properties.

Frederic Blanqui supervised 2 master thesis and 3 PhD students on the extension of type theory with decision procedures, the termination of typed higher-order rule-based programs, and the certification of termination proofs.

Since 2004, Frederic Blanqui is leading the CoLoR project which aims at providing tools for automatically certifying the termination of programs. In 2007, CoLoR won the first international competition on certified termination provers.

In 2007 and 2008, Frederic Blanqui led the INRIA Quotient project which aims at extending the OCaml programming language with types whose values automatically satisfy equational invariants using the Moca tool.

Frederic Blanqui did his PhD with Pr Jean-Pierre Jouannaud at University Paris Sud between October 1998 and September 2001 on the combination of type theory and rewriting theory.

Between October 2001 and August 2002, he worked on the certification of cryptographic protocols with Pr Larry Paulson at Cambridge, UK.

Between September 2002 and September 2003, he worked at Ecole Polytechnique in the Coq development team on the extension of the proof assistant Coq with rewriting.

More details on his activities and publications can be found on his web page and in his CV.

International journals with reading committee: 3

International conferences with reading committee: 12

Other publications (thesis, workshops, invited papers, reports, etc.): 18

Prizes: 2001 SPECIF Award for his PhD thesis by the French national society of teachers and researchers in computer science; 2001 Kleene Award for the best young researcher paper at the IEEE Symposium on Logic in Computer Science (LICS).

5 most significant publications in the last 5 years:

F. Blanqui. Definitions by rewriting in the Calculus of Constructions. Mathematical Structures in Computer Science, 15(1):37–92, 2005.

F. Blanqui. Inductive types in the Calculus of Algebraic Constructions. Fundamenta Informaticae, 65(1-2):61–86, 2005.

F. Blanqui and C. Riba. Combining typing and size constraints for checking the termination of higher-order conditional rewrite systems. In Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Lecture Notes in Computer Science 4246, 2006.

F. Blanqui, J.-P. Jouannaud, and P.-Y. Strub. Building decision procedures in the calculus of inductive constructions. In Proceedings of the 21th International Conference on Computer Science Logic, Lecture Notes in Computer Science 4646, 2007.

F. Blanqui, J.-P. Jouannaud, and A. Rubio. HORPO with computability closure : A reconstruction. In Proceedings of the 14th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Lecture Notes in Computer Science 4790, 2007.

Julien Forest (Cedric, ENSIIE, Evry)

Age 33, "maître de conférences" at ENSIIE since september 2007, member of the CEDRIC.

He obtained his PhD thesis in September 2003, under the supervision of Dela (Universite Paris 7, Paris, France).

His research interests include

Pattern matching in lambda calculi
Proof theory (he is the co-developper of the functionnality Function of Coq with Y. Bertot)
Automatic certification of proofs generated by automatic provers into Proof assistants

J. Forest is also a member of the A3Pat ANR project.

Interesting publications:

"Réécriture d'ordre supérieur avec motifs". Thèse de doctorat, Université Paris-Sud, Orsay, France, September 2003.
with E. Contejean, P. Courtieu, O. Pons, and X. Urbain. "Certification of automated termination proofs.", Proceedings of FROCOS'07, volume 4720 of LNAI, pages 148-162
with P. Courtieu and X. Urbain. "Certifying a Termination Criterion Based on Graphs, without Graphs" . TPHOLs 2008, 2008
with G. Barthe, D. Pichardie, and V. Rusu. "Defining and reasoning about recursive functions: a practical tool for the coq proof assistant.", Proceedings of FLOPS'06, volume 3945 of LNCS, pages 114-129. 2006.

Pierre Hyvernat (LAMA, Chambéry)

Age 30, "maître de conférences" at the Université de Savoie (in Chambéry) since september 2006, member of the LAMA.

He obtained his PhD thesis in December 2005, under the supervision of Thierry Coquand (Chalmers, Göteborg, Sweden) and Thomas Ehrhard (at the time, IML, Luminy, Marseille, France)

His research interests include

denotational semantics
type theory and constructive mathematics

Pierre Hyvernat is also a member of the ANR "CHOCO : Curry Howard for Concurrency" (coordinator: Thomas Ehrhard, Paris 7).

Interesting publications:

with P. Hancock: "Programming Interfaces and Basic Topology". "Annals of Pure and Applied Logic", volume 137, January 2006.
"Synchronous Games, Simulations and lambda-calculus", proceedings of the "GaLoP" workshop, ETAPS 2005. (journal version submitted to Annals of Pure and Applied Logic.)
"Predicate Transformers and Linear Logic: yet another Denotational Model", Lecture Notes in Computer Science, vol. 3210, September 2004.

Pascal Manoury (PPS, Paris 7)

Alexandre Miquel (PPS/LIP, Paris 7)

Age 37, "maître de conférence" at the Université Paris-Diderot (Paris 7) since september 2003, member of PPS. Currently CNRS research associate ("délégation") in LIP (Plume) since september 2008.

He obtained his PhD thesis in December 2001, under the supervision of Hugo Herbelin (INRIA/LIX) in the Coq team (now TypiCal). From oct. 2001 to aug. 2002 he was postdoc in the Chalmers Institute of Technology (Göetborg, Sweden) under the supervision of Thierry Coquand, and from sept. 2002 to aug. 2003 he was "ATER" in LRI (Orsay).

He is a specialist of the models of type theory (especially the calculus of constructions) and of the correspondence between set theory and type theory. His research interests also include:

logic, proof-theory, rewriting
denotational semantics (set- and domain-theoretic)
realizability in classical logic

His most significant publications are:

A Model for Impredicative Type Systems with Universes, Intersection Types and Subtyping (LICS'00)
A Strongly Normalising Curry-Howard Correspondence for IZF Set Theory (CSL'03)
Lambda-Z: Zermelo's Set Theory as a PTS with 4 Sorts (TYPES'04)
with A. Arbiser and A. Ríos. A lambda-calculus with constructors (RTA'06)
Classical program extraction in the calculus of constructions (CSL'07)

Olivier Pons (Cedric, CNAM, Paris)

39 ans "maître de conférences" at the [Conservatoire National des Arts et Métiers (CNAM) ] since september 2001. He is member of the [CPR team] of the [CEDRIC laboratory].

Before that he held a post-doctoral position at [Universidade do Minho ] (Portugal), in the [Logic and formal methodes team] (01/09/2000 – 31/08/2001) and he has been ATER at Ensiie (01/09/1999 – 31/08/2000) and at CNAM (01/09/1998 – 31/08/1999).

He obtained in July 1999 his PhD done under the supervision of Yves Bertot (Inria Sophia-Antipolis) and Véronique Donzeau-Gouge (CNAM).

His research interests include:

Automated proof, and its mechanised certiﬁcation.

Rewriting techniques, Termination

Type theory

Type isomorphism, proof reuse

User Interfaces of theorem provers.

Graphical interfaces for interactive theorem proving.

Natural language processing

Question answering and text retrivial

He is involved in the following project:

[A3PAT], ANR-05-BLAN-0146 (Cedric ; LRI-PCRI ; LaBRI ;INRIA-Sophia Antipolis) Mechanical certiﬁcation of proofs from automated provers, automation for proof assistants. (12/2005 – 06/2009)
[Focal] (Cedric ; CNRS ; Lip6 ; INRIA-Rocquencourt), certiﬁed development, compiler and framework.

His main significant publications include 1 International Journal (JFP ’03 ) 5 International Conferences ( UITP(2), Type 00, Fossacs’O1, Frocoss ’07 ) and some National Conferences (JFLA, CORIA)

His last publications relevant for PML are:

Evelyne Contejean, Pierre Courtieu, Julien Forest, Olivier Pons and Xavier Urbain. Certiﬁcation of automated termination proofs. In B. Konev, F. Wolter ed., Proc of FroCos 07, vol 4720 of LNAI, pp 148–162, Liverpool, UK.(Sept. 2007)
Gilles Barthe, Olivier Pons and Venanzio Capretta. Setoids in type theory. Journal of Functional. Programming, Vol. 13(2) : p. 261-293. November 2003

He is Co-author of the [Cime 3 ], rewriting toolboxes.

Cime 3, uses parts of the termination engine of Cime 2, and produce certiﬁcates for COQ.

Colin Riba (ENS Lyon)

G. Barthe, B. Grégoire, and C. Riba. Type-Based Termination with Sized Products. In Proceedings of the 22nd International Conference on Computer Science Logic, Lecture Notes in Computer Science 5213, 2008.