ANR PML
Liste des participants
Cette liste sera supprimée du document final puisque les mêmes information s'y trouverons. Pour l'instant ceci est la liste des personnes contactées qui ont répondu au mail de contact et qui n'ont pas encore dit non (ou qui ont dit oui):
- Christophe Raffalli (coordinateur) : 10 mois par ans (voire plus)
- Pierre hyvernat : 3 mois par an
- Lionel Vaux : 3 mois par an
- Frederic Blanqui : 3 mois par an
- Pierre Weis : 2 mois par an
- Pascal Manoury : 2 mois par an
- Emmanuel Chailloux (réponse en attente)
- Marianne Simonot : 2 mois par an
- Olivier Pons : 2 mois par ans
- Maria-Virginia Aponte : 2 mois par an
- Julien Forest : 2 mois par ans
Total : 31 mois/an Total si 3 ans : 93 mois Total si 4 ans : 124 mois
Budget humain possible
- sur 3 ans : 1 thèse = 36 mois, 2 post-doc de 2 ans = 48 mois => total = 84 mois (OK)
- sur 3 ans : 2 thèses = 72 mois, 1 post-doc de 1 ans = 12 mois => total = 84 mois (OK)
- sur 4 ans : 2 thèses = 72 mois, 1 post-doc de 2 ans = 24 mois => total = 96 mois (OK)
- Autre ?
- Où
Context and positionning of the proposal
Ever since FORTRAN appeared fifty years ago, Programming languages have been evolving rapidly since FORTRAN. These languages now include more and more sophisticated concepts like "object", "type inference", "modules"... This richness makes it more and more complex to train programmers and makes it difficult for them to follow the inovations and changes in programming languages.
Another phenomenon is the emergence of formal methods used in conjonction with the programming language. This generaly introduces another layer to the languages in order to write specifications, and sometimes proofs. Learning a programming language together with the associated specification/proof language can take an important effort.
Project such as ACL2 (the successor of nqthm, the Boyer-Moore theorem prover) use a rather simple language (namely LISP) both as a programming language and specification language, allowing to keep a unity in the system. Unfortunatly, LISP is limited both as a programming language (no good treatment of sum type, no module system) and specification language (very limited quantification). Moreover, LISP has no compile-time type-check and these could be very useful to detect bugs and automaticaly assert properties.
The aim of this project is to build a very powerful language (with no loss of expressive power compared to state of the art languages) based on a very small number of simple features. We think this will be possible thanks to recent progress both in the semantics of programming language and the apparition of new algorithms for type inference bases on constraints solving or contraints checking.
Moreover, our language will be used not only as a programming language and as a specification language (like in ACL2), but also as a proof language. This is natural for an ML-like language because, for instance, pattern-matching is a natural and powerfull way to make a case analysis in a proof.
The idea of a new language arised from the discovery of a new typing algorithm (CITATION de l'article en cours) whose implementation gave birth to a first experimental implementation of PML (Proved ML) by Christophe Raffalli. This implementation is already available from the web page of the language. The basic idea (described below) for type-checking are also implemented. However, turning PML into a real tool requires a lot of research and implementation work and this is why we request the help of the ANR. Here are some of them (more details are given in the next sections).
PML is different from the other modern programming languages because its design focuses on a few powerful concepts. One problem this raises is that it is more difficult for a compiler to produce efficient code. In particular, since PML unifies several notions of products (modules, tuples and records), there is no simple way to choose the internal representation of a product. Writing a good compiler for PML will require more complex optimisations schemes (probably driven by typing) than languages like OCaml or Haskell.
PML requires a termination criterion because a proof by induction will just be a normal recursive program. Such a program has to be well-founded in order to correspond to a valid proofs. A lot of languages (Haskell, SML) can now use Aprove to establish termination of simple programs. However, we want the test to be fully integrated with the language, and be compatible with very modular programs. At the moment, these goals seem difficult to achieve with Aprove or other external termination checkers.
PML's approach to proofs is very innovative because it can be seen as an intermediate level between the fully automated approach of ACL2 or PVS and the aproach based on a specific formalism for proof like COQ, ALF, Isabelle, ... The former proof assistants accept to use any automated theorem prover, with one main drawback: the correction of such a system are postulated a priori. The later proof assistants define proofs in a precise framework such as natural deduction, calculus of construction, ... and provide tactics to build complex proofs. The proof are usually tedious to write and impossible to read or write without a specific interface. Even with a declarative language like Mizar (or the Mizar mode for Isabelle or coq), it is not very easy to build proofs, let alone concise proofs!
PML logic is in fact the equationnal theory of its programming language and we plan to use variants of Knuth-Bendix completion as a proof-checker. The first experiments with the current implementation are promising. However, Knuth-Bendix procedure in the context of ML is a complex and new research problem and it needs a lot of work, for instance to handle proof in arithmetic which are often needed.
Scientific and technical description
Background, state of the art
The ML programming language, created by Robin Milner et al in the 80th has two major distinctive features:
- Algebraic data-types and pattern matching: data types are basically all constructed using inductive definition, cartesian products and disjoint union.
- Statical type inference: the type of every piece of code is automatically infered using Hindley-Milner algorithm (HM) and by construction, once compiled, an ML program can not crash (no segmentation fault). More precisely, when we do not use unsafe features of the language (like interface with unsafe libraries written in C, if an ML program crashes, then there is a bug in the type-checker or the compiler.
Recent progress in type inference algorithm uses constraint solving. This means that the type system can be described in first-order predicate logic in such a way that a type inference problem is a formula written in a decidable fragment of first-order predicate logic (often the purely existential fragment). Hence, any constraint solver can bu turned into a type-check for ML. These approach is know as HM(X).
There are to problem with this approach:
- The complexity of constraint solving can be too high for practical use, especially using general purpose constraint solver. Currently (to our knowledge), there are no mainstream implementation of ML based on HM(X).
- HM(X) does not completely solve all problems of subtyping. The language to express the type constructed by the algorithm is the same as the language used programmers to write type. Hence, if two constraints and for three types and
A WORD ABOUT OTHER PROJECT (OCAML, GALLIUM, HASKELL, ...)
Rationale highlighting the originality and novelty of the proposal
Décrire les objectifs scientifiques/techniques du projet. Présenter les avancées scientifiques attendues. Préciser l’originalité et le caractère ambitieux du projet. Détailler les verrous scientifiques et techniques à lever par la réalisation du projet. Décrire éventuellement le ou les produits finaux développés à l’issue du projet montrant le caractère innovant du projet. Présenter les résultats escomptés en proposant si possible des critères de réussite et d’évaluation adaptés au type de projet, permettant d’évaluer les résultats en fin de projet. Le cas échéant, démontrer l'articulation entre les disciplines scientifiques et le caractère interdisciplinaire du projet.
Scientific and technical programme, project management
Scientific programme, specific aims of the proposal
Présentez le programme scientifique, la méthodologie et la structuration du projet. Justifiez la décomposition en tâches du programme de travail en cohérence avec les objectifs poursuivis. Les tâches représentent les grandes phases du projet. Elles sont en nombre limité. Présenter les liens entre les différentes tâches (si possible, utilisez un diagramme ou un organigramme technique).
Project management
Préciser les aspects organisationnels du projet et les modalités de coordination (si possible individualisation d’une tâche coordination : cf. tâche 0 du document de soumission A).
Detailed description of the work organised by tasks
Pour chacune d’entre elle, décrire : son responsable et les partenaires impliqués (si possible, sous forme graphique), ses objectifs, le programme détaillé des travaux1, la description des méthodes, des choix techniques et des solutions envisagés, les risques et les solutions de repli envisagées, les indicateurs de succès associés aux objectifs et les livrables, les contributions des partenaires (le « qui fait quoi »).
Task 1 - theoretical work
- Correctness of the constraint checking algorithm
- Consistency of proof-checking
Task 2 - termination criterion
Coordinator: Pierre Hyvernat
Participants: Frédéric Blanqui, (Colin Riba, to be confirmed), Christophe Raffalli, Olivier Pons (to be confirmed), Marianne Simonot (to be confirmed), ...
Even if it might be counter-intuitive at first, it is often necessary to write programs whose execution can be infinite. For example, any kind of "server", or almost any interactive program might have infinite executions. Even in purely mathematical programming, it can be interesting to have intermediary non-terminating functions. Consider a function outputting the stream of prime numbers : even if this function is non-terminating, we might use it in a terminating manner in a proof by requesting the n first prime numbers.
Since PML uses full recursion (keyword rec), writing such programs is easy. On the other hand, the notion of "terminating", or "well-founded" recursive function isn't part of the core of PML: such programs are just special cases of recursive programs. The user will have to specify that such and such function are in fact terminating and might have to prove it himself.
Proofs of specifications are just PML programs, but those cannot be allowed to run infinitely. In fact, proof will not run at all: they are only required to be well-founded; the consistency of PML relies on this. In order to relieve the user from proving that all proofs are in fact terminating, PML should offer a way to check automatically that (some) functions are terminating. Because the halting problem is undecidable, it is hopeless to do that in all generality, but this generality is seldom necessary. PML should only work for most of the functions, most of the time. (Rather than work for all the functions, all the time...)
Technically speaking, PML can infer an error called Loop when it encounters a program which, it thinks, may not terminate. Such an error can of course be raised, but cannot be captured: this is an error rather than an exception. The property we need to guarantee is that if PML doesn't raise the error Loop, then the program in question does indeed terminate. Whatever happens, if the error Loop is raised for a terminating function, the user can still provide PML with a proof that this error is never raised. (PML current syntax for that is [p proof ... ] where p is a term and ... is a proof that p doesn't raise any exception nor error: this is the desired property for proofs.)
The least PML needs to do automatically is check for primitive recursive functions, but this test is just not strong enough for any practical use. (Users of the COQ proof system know that using structural ordering on a single argument isn't practical...) Currently, there has been little work on this aspect of PML. A simple yet effective termination checker has been implemented, but this test isn't plugged to the rest of PML. The termination criterion used is based on "size change criterion" of Lee, Jones and Ben-Amram. This test successfully checks termination for primitive recursion, lexicographic ordering and permutation of arguments. This covers most simple practical cases.
Once this is totally integrated in PML, several things need to be done:
- test and improve this criterion: in particular, the size change criterion depends on a notion of size of terms. Two natural choices are the size (approximately the total number of constructors) and the depth. Those two notions are not equivalent as far as the criterion is concerned: the size of an argument may decrease while its depth increases, and vice and versa. It is not obvious which one is best, and it might be possible for the user to provide his own notion of size. Meanwhile, there is a practical workaround which consists in adding a "dummy" integer argument to our functions which will act as the overall size of the other arguments.
- Adapt to PML and implement the termination criterion developed by Blanqui and Riba in a series of papers (RTA'04, CSL'05 and LPAR'06). Given for each function some formula in Presburger arithmetic describing how the size of the output is related to the size of the inputs (this can be checked automatically), the termination follows from the fact that recursive calls are done on strictly decreasing arguments, using for instance lexicographic or multiset comparisons together with linear combinations of the arguments. This would allow for recursive call through size preserving function such as List.map
- Adapt to PML the techniques developed by Giesl et al for automatically proving the termination of Haskell programs by using the powerful state-of-the-art termination techniques developed for rewrite systems (RTA'06) and implemented in Giesl et al automated termination prover Aprove (winner of last year competition on termination).
- To make sure that these termination proofs are correct, we propose to extend the CoLoR platform in order to certify these termination proofs in the highly secure proof checker and proof development system Coq.
Task 3 - compilation
- Representation of cartesian product and disjoint sum
- Unboxing
Task 4 - programming with effect
- Adaptation of the uniqueness typing to the context of constraint checking
- Compilation of this scheme for affectation (reference and arrays) and IO
Task 5 - equational reasonning
Coordinator:
Participants: Blanqui
The kernel of the proof engine will be based on completion techniques.
- Development of a stand-alone tool for completion modulo associative and commutative symbols.
We think that it is indeed interesting for the community to have such a tool well maintained so that it can be used and included in other projects. It is in particular useful for automatically generating correct construction functions for relational data types (ESOP'07) like in Moca.
Note that, for implementing such a tool, we could start from the CiME library.
- Adaptation of the Knuth-Bendix completion algorithm
- Terminaison and complexity of our completion algorithm
Task 6 - numerical data
- How to unify the mathematical hierarchy of numbers with the current
practice in computer science
- What number to use for index in arrays
Planning of tasks, deliverables and milestones
Présenter sous forme graphique un échéancier des différentes tâches et leurs dépendances (par exemple, utiliser un diagramme de Gantt). Présenter un tableau synthétique de l'ensemble des livrables du projet (numéro de tâche, date, intitulé, responsable).
Data management, data sharing, intellectual property and results exploitation
Présenter les stratégies de valorisation des résultats : la communication scientifique; la communication auprès du grand public, le cas échéant; la valorisation des résultats attendus; les retombées scientifiques et techniques, éventuellement les retombées industrielles, économiques … la place du projet dans la stratégie industrielle des entreprises partenaires du projet autres retombées (normalisation, information des pouvoirs publics, ...) pour les bases de données, indiquer les modes de stockage et de maintenance ainsi que les communautés bénéficiaires.
Présenter les grandes lignes des modes de protection et d’exploitation des résultats. Pour les projets partenariaux organismes de recherche/entreprises, les partenaires devront conclure, sous l’égide du coordinateur du projet, un accord de consortium dans un délai de un an si le projet est retenu pour financement. Pour les projets académiques, l’accord de consortium n’est pas obligatoire mais fortement conseillé.
Consortium organisation and description
Relevance and complementarity of the partners within the consortium
Décrire brièvement chaque partenaire et fournir ici les éléments permettant d’apprécier la qualification des partenaires dans le projet (le « pourquoi qui fait quoi »). Il peut s’agir de réalisations passées, d’indicateurs (publications, brevets), de l’intérêt du partenaire pour le projet… (il ne s’agit pas de fournir ici le C.V. du responsable scientifique de chaque partenaire). Fournir en annexe 7.2 une présentation plus détaillée des partenaires, de leur savoir- faire et de leurs apports et attentes dans le projet. Montrer la complémentarité et la valeur ajoutée des coopérations entre les différents partenaires. L’interdisciplinarité et l’ouverture à diverses collaborations seront à justifier en accord avec les orientations du projet. (une page maximum)
Qualification of the project coordinator
Christophe Raffalli : 30 mois sur 36 ou 40 mois sur 48.
Fournir les éléments permettant de juger la capacité du coordinateur à coordonner le projet.
Contribution and qualification od each project participant
Pour chacune des personnes dont l’implication dans le projet est supérieure à 25% de son temps sur la totalité du projet, une biographie d’une page maximum sera placée en annexe 7.2 du présent document qui comportera : Nom, prénom, âge, cursus, situation actuelle Autres expériences professionnelles Liste des cinq publications (ou brevets) les plus significatives des cinq dernières années, nombre de publications dans les revues internationales ou actes de congrès à comité de lecture. Prix, distinctions Si besoin, pour chacune des personnes, leur implication dans d'autres projets (Contrats publics et privés effectués ou en cours sur les trois dernières années) sera présentée selon le modèle fourni en annexe 7.3. On précisera l'implication dans des projets européens ou dans d’autres types de projets nationaux ou internationaux. Expliciter l’articulation entre les travaux proposés et les travaux antérieurs ou déjà en cours.
Scientific justification of requested budget
On présentera ici pour chaque partenaire, la justification scientifique et technique des moyens demandés dans le document de soumission A. Ces moyens sont synthétisés à l’échelle du projet dans la fiche «Tableaux récapitulatifs » dans ce document de soumission A. Chaque partenaire justifiera les moyens qu’il demande en distinguant les différents postes de dépenses selon le canevas suivant :.
(Une sous-section par partenaire)
- Une/deux bourse de thèse (où ? avec qui ?)
- Un/deux post-doc (où ? avec qui ?)
- des missions pour financer 6 réunions + des échanges
- organisation d'une école d'été ?
- achat de matériel (joindre vos devis ?)
Blanqui: 2 sejours d'une semaine par an en France pour rencontrer les autres membres du projet et assister a des conferences du domaine (JFLA)
Weis: 1 sejour de deux semaines par an a Pekin pour travailler avec Blanqui sur le projet et donner des cours sur la programmation fonctionnelle en general et PML en particulier a l'Universite de Tsinghua.
Annexes
References
CV, Resume
- Frederic Blanqui
Age 36, permanent full-time INRIA researcher.
Since September 2008, Frederic Blanqui is at LIAMA, the Sino French Laboratory in Computer Science, Automation and Applied Mathematics, in Tsinghua University, Beijing, China.
Between 2003 and 2008, Frederic Blanqui was at LORIA, Nancy, in the Protheo research team led by Pr Claude Kirchner, focusing on the use of rewriting techniques for programming, as well as specifying and proving program properties.
Frederic Blanqui is expert in the following areas: type theory, rewriting theory, termination, functional programming, proof assistants, formal certification of program properties.
Frederic Blanqui supervised 2 master thesis and 3 PhD students on the extension of type theory with decision procedures, the termination of typed higher-order rule-based programs, and the certification of termination proofs.
Since 2004, Frederic Blanqui is leading the CoLoR project which aims at providing tools for automatically certifying the termination of programs. In 2007, CoLoR won the first international competition on certified termination provers.
In 2007 and 2008, Frederic Blanqui is leading the Quotient project which aims at extending the OCaml programming language with types whose values automatically satisfy equational invariants using the Moca tool.
Frederic Blanqui did his PhD with Pr Jean-Pierre Jouannaud at University Paris Sud between October 1998 and September 2001 on the combination of type theory and rewriting theory. His PhD got in 2001 the award of SPECIF, the French national society of teachers and researchers in computer science. He also got the Kleene award the same year for his paper at the IEEE Symposium on Logic in Computer Science.
Between October 2001 and August 2002, he worked on the certification of cryptographic protocols with Pr Larry Paulson at Cambridge, UK.
Between September 2002 and September 2003, he worked at Ecole Polytechnique in the Coq development team on the extension of the proof assistant Coq with rewriting.
More details on his activities and publications can be found on his web page and in his CV.
International journals with reading committee: 3
International conferences with reading committee: 12
Other publications (thesis, workshops, submitted papers, reports, etc.): 18
5 most significant publications in the last 5 years:
F. Blanqui. Definitions by rewriting in the Calculus of Constructions. Mathematical Structures in Computer Science, 15(1):37–92, 2005.
F. Blanqui. Inductive types in the Calculus of Algebraic Constructions. Fundamenta Informaticae, 65(1-2):61–86, 2005.
F. Blanqui and C. Riba. Combining typing and size constraints for checking the termination of higher-order conditional rewrite systems. In Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Lecture Notes in Computer Science 4246, 2006.
F. Blanqui, J.-P. Jouannaud, and P.-Y. Strub. Building decision procedures in the calculus of inductive constructions. In Proceedings of the 21th International Conference on Computer Science Logic, Lecture Notes in Computer Science 4646, 2007.
F. Blanqui, J.-P. Jouannaud, and A. Rubio. HORPO with computability closure : A reconstruction. In Proceedings of the 14th International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Lecture Notes in Computer Science 4790, 2007.