# Exploiting Type Annotations

## Comments of the reviewers on the 20061013 ESOP 2007 variant (extended technical report variant)

Dear author(s),

this is to inform you that your paper submitted for presentation at ESOP 2007 could not be accepted.

Competition this year was particularly tough. We had something like 120 papers and could accept only 34. Because of this, many good papers had to be rejected.

Herewith I enclose the reviews of your paper; I sincerely hope that they will be of some help for your future research.

Many many thanks for submitting; I hope you will come to ETAPS 2007 anyway.

Best Regards, Rocco De Nicola

---------------------------------------------

Paper: 10
Title: Exploiting Type Annotations

-------------------- review 1 --------------------

OVERALL RATING: 4 (Negative, will argue for rejection)
REVIEWER'S CONFIDENCE: 4 (expert)
----------------------- REVIEW --------------------

---------------------------------------------------------------
- Summary

The authors discuss a type inference algorithm for Hindley/Milner
extended with higher-rank types. The algorithm is formulated in
two stages. In the first (global quantifier propagation) stage,
the types of polymorphically used identifiers are inferred from
user-provided type annotations. The novelty here is that the
algorithm collects type information from surrounding nodes in the
abstract syntax tree (AST). The propagation of type information from
the outside to the inside (top to bottom in the AST) has already
been studied by Odersky/Laeufer (POPL '96) and Pierce/Turner.
The second (local) stage is fairly standard and uses the types found
to type check the program.

- Overall Impression

Points in favor:

- Interesting, important problem.
- A "global" extension to Odersky/Laeufer's and Pierce/Turner's "local" type inference algorithms.
- A prototype has been implemented (which I haven't checked, though).

Negative points:

- The inference approach is sketched by example only; it is impossible to assess/verify the algorithm.
- No formal results.
- No comparison with Vytiniotis/Weirich/Peyton Jones (ICFP'05) or Laeufer/Odersky. Which examples require fewer type annotations?

In summary:
I appreciate the authors' effort to provide numerous examples to motivate
the general idea behind their inference algorithm. But the technical
development is simply disappointing. The standard approach towards type
inference is to generate constraints out of the program text which
are then solved. None of these concepts is sufficiently described.
The reader is referred to a technical report which I haven't checked.
In a scientific paper the authors will need to provide at least some of
the technical details. For example, it's fine to focus on the constraint
solver and discuss the constraint generator in the appendix/technical report.
To conclude, I recommend rejection, but I'd like to encourage the authors
to continue their work. The specific comments given below may be helpful.

On page 6, the workings of global quantifier propagation are sketched.
The authors use intersection types to collect the different type instances
of a (potentially) polymorphic identifier. To make this paper more
convincing, the authors will need to show how to solve
such type constraints.

On page 8, the authors introduce union types, which are necessary in case
polymorphic types appear in contra-variant position. In my opinion,
neither intersection nor union types are necessary, if only the authors
would adopt a constraint-based approach where we generate instantiation
constraints out of the program text. Example 5 then gives rise to

t_h <= (forall a. a -> a) -> Int
t_h <= (Int -> Int) -> Int
t_h <= (forall a. a -> a) -> c

where t_h refers to the type of identifier h and c is fresh. (Notice how
we can make partial type information more precise by generating fresh
type variables!)

The above implies t_h = t_h' -> t_h'' for some t_h' and t_h'',
which then yields (by co-/contra-variance of the function arrow "->")

(forall a. a -> a) <= t_h'
(Int -> Int)       <= t_h'

t_h'' <= Int
t_h'' <= c

All this is fairly standard and directly follows from work by
Pottier/Remy and Stuckey/Sulzmann/Wazny. It's clearly a matter
of taste whether we choose to use either intersection/union types
or instantiation constraints. In either case there must be
a description of the underlying constraint solver.
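
To make the reviewer's sketch concrete, here is a minimal Haskell rendering of instantiation constraints and the arrow-decomposition step. The type grammar and the names (Ty, Constraint, decompose) are simplifications chosen for this note, not anything from the paper or the review.

```haskell
-- A minimal sketch of instantiation constraints over a toy type grammar;
-- an illustration of the reviewer's suggestion, not the paper's algorithm.
data Ty
  = TVar String        -- inference variable, e.g. t_h
  | TCon String        -- constant, e.g. Int
  | Forall String Ty   -- forall a. ...
  | Ty :-> Ty          -- function type
  deriving (Eq, Show)

-- t :<=: s  reads  "t <= s":  t must be at most as general as s.
data Constraint = Ty :<=: Ty
  deriving Show

-- Decompose a constraint between two arrow types, flipping the relation
-- in the contravariant (argument) position, as in the derivation above.
decompose :: Constraint -> [Constraint]
decompose ((a1 :-> r1) :<=: (a2 :-> r2)) = [a2 :<=: a1, r1 :<=: r2]
decompose c                              = [c]
```

For instance, once t_h is refined to t_h' -> t_h'', evaluating `decompose ((TVar "t_h'" :-> TVar "t_h''") :<=: (Forall "a" (TVar "a" :-> TVar "a") :-> TCon "Int"))` yields exactly the two constraints (forall a. a -> a) <= t_h' and t_h'' <= Int shown above.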

On page 9, mid page, the authors mention that their system can deal
with type constraints of the form

(forall a. (Int,a) -> (Int,a)) /\ (forall b. (b,Int) -> (b,Int))

which I'd prefer to write

t_h <= forall a. (Int,a) -> (Int,a)

t_h <= forall b. (b,Int) -> (b,Int)

I believe it's fairly obvious that we need some form of anti-unification
here to find the most general type, but the authors refer again to the
accompanying TR for the technical details. In my opinion, this is not
acceptable. I'm not asking for the *full* technical details. I know this
is impossible in a 15 page conference paper. But I demand to see some of
the technical issues explained in detail. Looking at my sketch of how I
would tackle the problem, I believe this is possible.
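
As an illustration of the anti-unification the reviewer has in mind, here is a small self-contained Haskell sketch (names are hypothetical; the forall-bound variables of the two input types are treated as opaque constants):

```haskell
import qualified Data.Map as M

-- A toy type grammar; the bound variables a and b of the inputs are
-- represented as constants, since anti-unification may not instantiate them.
data Ty = TV String | TC String | TPair Ty Ty | Ty :-> Ty
  deriving (Eq, Ord, Show)

-- Least general generalization of two types. A table of already-seen
-- mismatches ensures that the same pair of subterms is generalized by
-- the same fresh variable.
lgg :: Ty -> Ty -> Ty
lgg s t = fst (go s t (M.empty, 0))
  where
    go a b st
      | a == b = (a, st)
    go (TPair a1 b1) (TPair a2 b2) st =
      let (a, st1) = go a1 a2 st
          (b, st2) = go b1 b2 st1
      in (TPair a b, st2)
    go (a1 :-> r1) (a2 :-> r2) st =
      let (a, st1) = go a1 a2 st
          (r, st2) = go r1 r2 st1
      in (a :-> r, st2)
    go a b (seen, n) =
      case M.lookup (a, b) seen of
        Just v  -> (TV v, (seen, n))
        Nothing -> let v = "g" ++ show n
                   in (TV v, (M.insert (a, b) v seen, n + 1))
```

On the example above, lgg applied to (Int,a) -> (Int,a) and (b,Int) -> (b,Int) yields (g0,g1) -> (g0,g1), which after generalization is the expected forall c d. (c,d) -> (c,d).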

Page 9, global quantifier propagation overview: I believe that we cannot
strictly separate the two stages. Consider Example 5 again. The
expression h id gives rise to

t_h <= (forall a. a -> a) -> c

or ((forall a. a -> a) -> ...) /\ ... in your notation,
BUT only if we first infer the type of id!

I assume we first divide the program into strongly connected components
and analyze each component separately. Well, this is the standard
approach, but I believe that there can be subtle interactions between
the two stages which need to be explained better.
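
For reference, the standard binding-group analysis the reviewer alludes to can be sketched in a few lines using Data.Graph from the containers package (the Binding type and freeVars field are hypothetical placeholders):

```haskell
import Data.Graph (SCC, stronglyConnComp)

type Name = String

-- A let-binding together with the names it mentions (both hypothetical).
data Binding = Binding { name :: Name, freeVars :: [Name] }

-- Order bindings by the strongly connected components of the dependency
-- graph; stronglyConnComp returns dependencies before their dependents,
-- so e.g. id is analyzed before a group containing the use  h id.
bindingGroups :: [Binding] -> [SCC Binding]
bindingGroups bs = stronglyConnComp [ (b, name b, freeVars b) | b <- bs ]
```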

There are several references to Haskell and Haskell98, but I fail to
see the connection, because the authors exclude the treatment of type classes.

What about impredicativity? It appears that the system is predicative.
Or are you saying that you can mimic impredicativity by applying
the Odersky/Laeufer trick and putting a polymorphic type inside
a data type?

Minor:

p3, mid page, "Our approach uses a two variants of algorithm", typo
-------------------------------------------------------------

-------------------- review 2 --------------------

OVERALL RATING: 3 (Strongly negative, will argue for rejection)
REVIEWER'S CONFIDENCE: 3 (high)
----------------------- REVIEW --------------------

The topic of the paper is partial type inference for predicative System F.
The authors' claim is that the problem can be solved in practice by an
algorithmic specification of type inference. The authors informally describe
a two-phase approach to type inference. The first phase gathers existing
type annotations and propagates them to other program points. This phase
uses intersection and union types, but treats them in a very ad hoc way with
_intentionally_ incomplete reduction rules. The second phase is supposed to be
a more or less standard DM type inference approach, but I did not really
understand it.

The proposal is quite informal. While the authors claim that their approach
is algorithmically defined they do not actually define it but only
_describe_ it via a few examples that all look the same.

Moreover,

- The comparison with existing work is really poor---and thus misleading.

- The main claim that an algorithmic approach allows for fewer and still
intuitively located type annotations is not demonstrated. There is no
experimental evaluation of the proposal on a significant series of varied
examples or a representative set of program schemes.

- The severe limitations of predicative polymorphism are not discussed.

In summary, the technical quality of the paper is far below the standard of
the ESOP conference.

Note:

The authors mention a companion technical report, which I did not read.

-------------------- review 3 --------------------

OVERALL RATING: 4 (Negative, will argue for rejection)
REVIEWER'S CONFIDENCE: 3 (high)
----------------------- REVIEW --------------------

--------------------------------------------------------------
- Summary

The authors present, by example, an algorithm for type inference in
the presence of higher-ranked types and partial type annotations.  The
algorithm proceeds in two phases: In a first phase, type annotations
are propagated through the expressions to construct a preliminary
shape of the to-be-inferred type with possibly several alternatives
for some of its parts.  These alternatives are then reconciled (if
possible) by choosing the most general one.  The alternative-free
shape is then fed into an ordinary HM-type inference phase.

- Overall Impression

The problem addressed is important and has even become a "hot topic"
very recently in the functional programming community due to the
(re)discovery of Generalized Algebraic Data Types (GADTs). The issue is
"transcending Hindley-Milner type inference". The approach of the
authors does look promising; at least it is not subsumed by Remy's
recently published calculus of shape inference, which also propagates
type annotations (ICFP 2005).

However, the paper, although quite nicely written, is technically very
weak. The algorithm is only presented in prose and pictures, and not
in its full generality. There is no soundness proof of the algorithm
(meaning that it only infers correct types), and nothing is shown about
the shape inference phase. In contrast, related work (Pottier,
POPL'06; Remy, ICFP'05) does show soundness of type inference
algorithms and does formally compare shape inference systems.
In my opinion, the approach of the present paper is not rigorous
enough; it is not even made precise what the theory is whose types
should be partially inferred: Is it System F, System F-eta, or
predicative System F? Another issue that could be addressed
rigorously but has not been is: Is shape inference (phase 1) guaranteed
not to destroy possible typings?

The authors refer to a technical report, which does contain a formal
specification of the algorithms, but no soundness proof either.

In conclusion, I did not find enough evidence for the soundness and
relative completeness (wrt. other approaches) of the method of type
annotation propagation presented in this paper. I therefore cannot
recommend acceptance.

The paper is well written and I did not find mistakes, but it feels
more like a tutorial than a hard scientific contribution.  I encourage
you to go back to work and deliver the theorems and proofs the paper
currently lacks.

One sentence caught my attention: "We stick to algorithm W because it
has proven itself over the years."  Algorithm W has been designed for a
specific problem (ML-inference), and if the problem changes, there is
no reason to stick to an old algorithm.  Of course, you might find
that a patched W performs very well on that task, but then you have to
prove it.  You might find that it supports very well giving good error
messages, but so far, there is no evidence for this.  It seems to me
that Pottier and Remy's constraint-based approach is more appropriate.
For instance, it is not order dependent, a problem of W you point out
yourself.

Reference 1.  Isn't the name of this author "Steffen van Bakel"?

--------------------------------------------------------------


## Comments of the reviewers on the 20060602 Haskell Workshop 2006 variant

===========================================================================
Authors:   Atze Dijkstra and S. Doaitse Swierstra
Title:     Exploiting Type Annotations

Score:     A
Expertise: Y

Legend:

A:  Good paper. I will champion it at the PC meeting.
B:  OK paper, but I will not champion it.
C:  Weak paper, though I will not fight strongly against it.
D:  Serious problems. I will argue to reject this paper.

X:  I am an expert in the subject area of this paper.
Y:  I am knowledgeable in the area, though not an expert.
Z:  I am not an expert. My evaluation is that of an informed outsider.

---------------------------------------------------------------------------
Summary of the paper:

The paper presents an approach to integrating type annotations into type
inference for rank-n types. This is done by separating the inference process
into two phases - one for collecting constraints arising from explicitly
specified signatures and one for the usual Hindley-Milner inference extended
with these constraints.

Judgement:

The described approach is certainly interesting and studying it seems to be
very worthwhile. Overall, the paper is fairly easy to understand, although
some parts might benefit from more detailed explanations. A major drawback is
the lack of a correctness proof, which is an important direction for future
research.

General suggestions:

While the proposed approach is described in a lot of detail, the introductory
sections don't give enough of the "big picture" for the subsequent
presentation to be easily understandable when reading the paper for the first
time. In particular, you say that the inference algorithm is split into two
phases; but what information is produced by the first phase and consumed by
the second one is not really described until deep in the formal presentation.
A short overview at the beginning would be very helpful here.

Notation is introduced in too much detail sometimes (especially in Section 3).
It would perhaps be better to put more emphasis on explaining the rules
here.

Detailed suggestions:

The rules for /\ in Section 2 look somewhat suspect. For instance, if a
function f is used at the types (forall a. (forall b. b -> b) -> a) and
((Int -> Int) -> Int), the rules would choose the first type as the more
general; but shouldn't it be (forall a. (Int -> Int) -> a) in this case? Also,
while /\ and \/ appear in the introductory sections, they don't seem to be
used in the actual formalisation. This is somewhat confusing.

The option-based approach introduced in Section 4 is somewhat hard to grasp
initially. Perhaps a short example would help here.
===========================================================================

===========================================================================
Authors:   Atze Dijkstra and S. Doaitse Swierstra
Title:     Exploiting Type Annotations

Score:     C
Expertise: Z

Legend:

A:  Good paper. I will champion it at the PC meeting.
B:  OK paper, but I will not champion it.
C:  Weak paper, though I will not fight strongly against it.
D:  Serious problems. I will argue to reject this paper.

X:  I am an expert in the subject area of this paper.
Y:  I am knowledgeable in the area, though not an expert.
Z:  I am not an expert. My evaluation is that of an informed outsider.

---------------------------------------------------------------------------
Summary of the paper:

The paper describes a mechanism for inferring System F types from
partially type-annotated source programs.  The basic idea is to use a
two-phase approach: the first phase propagates annotations to produce
a more heavily annotated program, making use of intersection and union
types; the second applies a standard Hindley-Milner algorithm to infer
all remaining types.  Higher-ranked polymorphism is introduced only as
necessitated by the initial annotations.  The paper gives details of
the propagation and inference mechanisms and some examples of their
behavior.  Although presented as an "algorithm," there is no
independent specification of what the mechanism is supposed to
achieve, i.e., there is no presentation of a type system for which the
mechanism might be proven to do sound and complete inference.  Thus
it is probably better thought of as a "heuristic" for obtaining System F
programs from partially annotated erasures in some cases.

Judgement:

The goals of this paper are certainly worthwhile (although the paper
itself lacks any motivating examples); as the authors note, there has
been a good deal of similar research published over the past few
years.  My major problem with the present paper is that the authors
make no real attempt to show how their work is distinguished from
their competition -- either in terms of expressivity/ease of use for
the programmer or formal methodology/ease of implementation for the
compiler writer.  The authors say that they have implemented the
mechanism as an "experiment," but it is not clear what question the
experiment is intended to test, much less what the outcome is.  In
particular, the paper gives neither an independent definition of the
type system for which the method does inference, nor realistic
examples showing the power and the limitations of the method.  This
gave me little motivation to fight through the formal presentation of
the mechanism, especially as this uses rather heavy and non-standard
notation.  In summary, although a lot of effort evidently went into
the research reported here, the paper doesn't tell us why we should care.

General suggestions:

- Please give more detailed comparison with existing work,
particularly from the "stratified type inference" school.

- Please provide more realistic examples, including examples where
inference fails; these can be very helpful for developing an intuition
for the mechanism's behavior.

- Even if you don't provide proofs, you should at least sketch some of
the desirable properties you would *like* to prove if you could.  The
"boxy types" paper has a good selection of these...

- With any type system, particularly one lacking a specification, an
important question is whether it will be *predictable* for
programmers; that is, will they be able to develop a usable mental
model for how inference works.  Can you comment on this?

Detailed suggestions:

p. 1, col. 1.  The gmapT example is completely mysterious unless the
reader is very familiar with the SYB paper.  You should at least
explain what it is for!

p. 1, col. 2. Example 1.  The definition of g is missing. (So are the
semicolons you use in all the other examples.)

p. 1, col 2. last bullet. Your emphasis on the order-dependence of HM
seems a bit bogus.  If HM is cast as constraint generation followed by
constraint solving, there is no order-dependence.  Admittedly,
Algorithm W uses an explicit order, but then why would you use it for an
order-sensitive problem?

p. 4, col 1, below Example 6.  "both of which are not" -->
"neither of which is"

p. 5, section 4 first sentence: "types get known to expressions" -->
"types for expressions become known"

p. 6, col 1. The discussion in this column is a bit repetitive of
section 2.
===========================================================================

===========================================================================
Authors:   Atze Dijkstra and S. Doaitse Swierstra
Title:     Exploiting Type Annotations

Score:     B
Expertise: Y

Legend:

A:  Good paper. I will champion it at the PC meeting.
B:  OK paper, but I will not champion it.
C:  Weak paper, though I will not fight strongly against it.
D:  Serious problems. I will argue to reject this paper.

X:  I am an expert in the subject area of this paper.
Y:  I am knowledgeable in the area, though not an expert.
Z:  I am not an expert. My evaluation is that of an informed outsider.

---------------------------------------------------------------------------
Summary of the paper:

Judgement:

A good read, but I have two problems with this paper:

1) The need for this extension is poorly motivated.  In general I
understand the need for more expressive and explicit types in
Haskell, but the opening example is sufficiently baroque that the
reader will likely ask, "Why would I ever want to write that?"

2) The paper, like many about type system extensions, has no
correctness proof of any kind.  So even the soundness of the
inference rules -- which are arguably complex -- is open to
question.  HM itself is notoriously fragile, so any extension will
be suspect.

Both of these problems could be ameliorated somewhat in practice if
the extension were being used by lots of people with lots of success,
but I don't see the evidence for that either.

General suggestions:

Detailed suggestions:

===========================================================================
===========================================================================
Authors:   Atze Dijkstra and S. Doaitse Swierstra
Title:     Exploiting Type Annotations

Score:     A
Expertise: X

Legend:

A:  Good paper. I will champion it at the PC meeting.
B:  OK paper, but I will not champion it.
C:  Weak paper, though I will not fight strongly against it.
D:  Serious problems. I will argue to reject this paper.

X:  I am an expert in the subject area of this paper.
Y:  I am knowledgeable in the area, though not an expert.
Z:  I am not an expert. My evaluation is that of an informed outsider.

---------------------------------------------------------------------------
Summary of the paper:

This paper is an exploration of the art of the possible in
post-Hindley-Milner type inference. The idea is to propagate type
information supplied by the programmer in order to infer polytypes
which would otherwise require explicit signatures. The presentation
is inevitably algorithmic. The approach is effectively to recover
programs currently not accepted for lack of a signature by computing
annotations which would render them acceptable with the current
approach to higher-rank types.

Judgement:

I think this paper makes a useful contribution to our understanding of
the capabilities and limitations of type inference for programs involving
higher-rank polymorphism. I would therefore be pleased to accept it.

Higher-rank polymorphism means the end of the Hindley-Milner
division of labour---people write programs; machines write
types. Now people write type signatures to indicate that a function
has polymorphism beyond basic generalisation at 'let', and the
machine infers the details of each use. This paper lifts the 'an
inferred type is a monotype' restriction by allowing the inference
of polytypes for arguments used in a context which requires them to
be polymorphic. Whether or not this delivers a pragmatically significant
benefit, it is certainly a useful contribution to show that it can be
done.

The initial example is rather contrived, but it does show clearly what the
problem is. A lambda-bound variable h is applied to an Int and a Char:
clearly this is in error if h has a monotype, but a third usage of h
requires it explicitly to have a polytype, and that polytype is flexible
enough to legitimise the otherwise incompatible special cases.
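
A hypothetical GHC rendering of the shape of example the reviewer describes (the names f, g, h and the bodies are a reconstruction, not the paper's actual Example):

```haskell
{-# LANGUAGE RankNTypes #-}

-- g's signature is what forces its argument to be a polytype.
g :: (forall a. a -> a) -> Int
g k = k 0

-- h is used at Int, at Char, and passed to g; only a rank-2 polytype
-- such as this one makes the three uses compatible. The paper's aim is
-- to propagate this polytype from the application  g h  instead of
-- requiring the annotation on f.
f :: (forall a. a -> a) -> (Int, Char, Int)
f h = (h (3 :: Int), h 'c', g h)
```

As the reviewer notes below, typing h as forall a b. a -> b would also typecheck f's body, but the application f id would then fail.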

I don't think the lack of formal proofs is a serious problem. For
one thing, this isn't a presentation of a new type system which must
be shown type-sound. Rather it's a new type inference algorithm
which targets annotated terms in a type system already known to be
type-sound. It's nice to know that the rules given in the paper are
guaranteed by Ruler to be what is implemented, but the obligation is
now to show that these rules respect System F typability (or some such):
this should ideally be obvious.

The biggest thing wrong with this paper is ephemeral: the rules presented
are really hard to make sense of. I have some suggestions in this respect
below.

General suggestions:

In the initial example, it may be worth remarking that forall a. a -> a
is not the only possible type for h. If h were typed forall a b. a -> b,
then f's definition would also typecheck, but the application f id
would fail. It should be clear that the degree of polymorphism is
suggested by the application g h, not the application f id. The remark
that the basic strategy is to gather type alternatives and choose the
most or least general according to variance deserves greater prominence.

It is not quite right to talk of the HM type inference
algorithm. The Damas-Milner algorithm is not the only implementation
of the HM type system, and HM is not inherently order-biased. Bruce
McAdam has given an unbiased version of HM, based on unifying the
substitutions generated from notionally concurrent type inference
tasks. However, you're right to point out the negative consequences
of choosing an ordered implementation of constraint solving when the
constraints become more complex. Retaining unsolved constraints in the
hope that later revelations will prove helpful is vital in many post-HM
type systems.

I fear I must say that the presentation of the rule systems is really
quite intimidating. I strongly encourage you to think a bit more about
how much is going on at once. It would be kind if you could find a way
to separate the formal presentation of this two-stage process into
separate rule sets. Figure 18 reads like the output of UUAG: I'd much
rather see the layers separated.

If you could see a way to make the threading of constraints less
noisy, you could focus more clearly on what actually changes. Some
of your parametrisation really doesn't help matters: figure 14 is
complicated enough already. Perhaps if you separated the different
kinds of alternative spatially rather than by label and treated one
direction at a time, the result would be more readable, albeit
longer. Also, the various judgment forms look far too similar when
they're just built from punctuation marks. Aspiration: you should be
able to distinguish judgment forms whilst holding the paper at arm's
length. Some thoughtful recrafting of the way rules are written
could have a major positive impact on the comprehensibility of this
paper.  I think that's really the most important thing to spend time
on improving.

Detailed suggestions:

Sec 1.

arbitrary complex types --> arbitrarily complex types
wave the above restriction --> waive the above restriction
"all possible types for type variables" doesn't meta-typecheck; do you
mean 'all necessary instances for type variables'? I guess you don't
mean 'all possible kinds for type variables'.

Sec 2.

rewrite system for /\
It's not quite obvious what's going on here: is it really the intention
to pick one of the polytypes and hope it generalises the others? This
may be reasonable, if the intuition is that polytypes are never
invented by the machine, only propagated.

Sec 3.

By convention type rules have the form -->
By convention typing judgments have the form
we assumee --> we assume

Fig 4 Type matching
Should you be distinguishing rigid variables (skolem constants) from
flex variables here? Should there be an occur check?

Sec 4.

types get known to expressions --> types get ascribed to expressions  ?
actual/expected/known types
a little confusing, perhaps; how about inferred/required types?
known types of these expression --> known types of these expressions
If the application 'g h' would be removed -->
If the application 'g h' were removed

Sec 4.2

The H and N markers are a bit confusing. Might it make sense to make
the distinction spatially, writing something like

hards {softs} <= v <= hards {softs}

?

Sec 4.3

for different purposed --> for different purposes
The remark about the result type from matching is really confusing.
The remark about 'skolemized type variables' seems slightly wrong to me.
If I recall correctly, Skolemization is the transformation
from  all x. ex y. R x y  to ex f. all x. R x (f x)  and we say that
y has been Skolemized. I think these local rigid variables are
sometimes called Skolem constants. I'd avoid the term entirely, and
just use the flex/rigid distinction standard from unification
literature.
===========================================================================

===========================================================================
Authors:   Atze Dijkstra and S. Doaitse Swierstra
Title:     Exploiting Type Annotations

Score:     C
Expertise: X

Legend:

A:  Good paper. I will champion it at the PC meeting.
B:  OK paper, but I will not champion it.
C:  Weak paper, though I will not fight strongly against it.
D:  Serious problems. I will argue to reject this paper.

X:  I am an expert in the subject area of this paper.
Y:  I am knowledgeable in the area, though not an expert.
Z:  I am not an expert. My evaluation is that of an informed outsider.

---------------------------------------------------------------------------
Summary of the paper:

This paper extends Algorithm W with higher-rank (and perhaps
impredicative) polymorphism using user annotations that are propagated
"globally" in the first phase and used locally in the second.

Judgement:

This paper is timely, as several extended type systems based on user
annotations have been recently proposed. Furthermore, I think that the
general idea of some sort of global propagation of type annotations is
good.

However, I can't recommend this paper for publication. I can't
understand the proposed algorithm well enough to determine the
consequences of this idea nor see whether it does indeed improve on
previous work.

A part of the problem is that the type system presented by this paper
is specified only as a type inference algorithm. That both constrains
the implementation of the type system (to this particular algorithm)
and requires more from the user, who must understand the algorithm
in order to determine whether their program will type check. All other
work in this area provides some sort of declarative specification.

Furthermore, this algorithm is very complicated. (It is specified by
13 figures!)

This would not be so bad, except the description of the algorithm is
hard to follow.  Sometimes, the specification of the algorithm is hard
to understand because of over-generalization. For example,
subsumption, matching, meet and join are specified through a common
relation with many special cases---specifying them individually would
help me to compare them to existing work. This is particularly
important for the subsumption relation.

In other places, the explanation is just confusing. The "Detailed
suggestions" section below tracks my reading of the paper and
indicates some of the places where I had trouble.

Finally, no basic results are stated about the
algorithm. Is it a conservative extension of Algorithm W? Can it
express all System F programs? In the end, these basic results don't
tell me that much, but they do give a baseline assurance that
everything is well.

For the reasons above, after spending some time with the
paper, I don't think I can understand how it compares to existing
work. So it is difficult for me to see the contributions of this
paper.

General suggestions:

I can't figure out whether this system includes impredicative
polymorphism or not. The motivation only talks about higher-rank
polymorphism and no examples are given; there are rules defined that
support it, but in the end the final figure doesn't seem to use
them. There needs to be a more consistent story here.

I think the paper would be much improved if the technical details of
"strong local type inference" were explained separately from
"quantifier propagation", instead of both being specified together in
Figure 18.

Detailed suggestions:

Introduction.

I don't understand the goal: "A programmer can specify all types which
are allowed by the type system". What do you mean? Why isn't this
trivial? In this setting, do you perhaps mean that programmers can use
type annotations to embed all System F programs in this type system?

Is there a way to determine if a type system "exploits type
annotations as well as possible"? How do you know if the algorithm
proposed by this paper succeeds in this goal?

Section 2.

Although this paper eventually distinguishes between the Hindley
Milner type system and Algorithm W (in Section 3), that distinction is
not made early enough. As a result, the paper includes false claims
such as "HM is order biased", when what is meant is "Algorithm W is
order biased". The introduction of the paper should be revised to
refer to "Algorithm W" almost every place that "HM type inference" is
mentioned.

What about examples that demonstrate impredicative polymorphism?

Section 3.

"HM type inference, the classical inference algorithm" Here you cite
Milner's 1978 paper, which does include Algorithm W. However Damas and
Milner, POPL 82 showed the completeness of the algorithm and should
also be cited.

"HM local type system + inference"---you cite Pierce/Turner and
Odersky et al. here, yet I'm not sure of the connection. Neither of
these works extend the HM type system. (And also, Pierce & Turner
does not encode checking/inference mode in the types.)

"HM strong local type inference". I'm confused. Is that what is
proposed in this paper? If not, what is it, and how does it relate? Is
HM strong local type inference an algorithm or a specification?  Note
that "encoding checking/inference mode in types" really only makes
sense for a specification.

"We have omitted rules related to Char..." Why include Char in the
paper in the first place?

"A type variable is fresh when it doesn't appear in contextual
information c". Which contexts?  I see that "fresh" appears in rules
where there are several judgements, and several contexts. Is the
variable fresh for all of them? (That doesn't seem to make sense for
E.LET_HM.)

Section 4.

"the generalized matching relation simeq, in Figure 7." Where is the
generalized matching relation defined? It is not in Figure 7.

It took me a long time to figure out what you mean by the boolean
options in the generalized matching relation. The rule in Figure 7
treats them like a list---i.e. that o is just a set that could include
any combination of flags such as <\mapsto^+>, <\mapsto^->, etc.
However, I see now, that o is a record with four boolean components,
and the notation <\leq^+>,o means, take that record and update the leq
component to be true. Please explain this better in the text!

Now, are all combinations of flags meaningful? What if o contained
<fit+, bind+, meet+, join+>?  What relation does generalized matching
compute then? I guess Figures 9 and 10 are specifying the four
combinations that make sense. But again, it took me a lot thought to
figure that out. I'm guessing that fit+, meet+ and join+ would be
mutually exclusive, but that is never specified anywhere. Perhaps
there is a better way to specify these relations? Is the structure you
need for o is something like:

data Rel = Fit | Meet | Join
data O = O { matching :: Bool, rel :: Rel }

Figure 7 box: Is tau the same as sigma? Your language in Figure 6
doesn't include tau.

Figure 11. M.ALT_I2:
What is the variable name in the alternative type for? For example,
I find it odd that v1 is forgotten when two alternative types are combined.

M.ALT_L1_I2
This rule overlaps with M.ALT_I2. Should there be a precondition that
sigma_2 not be an alternative type?

Are Figures 8 and 11 the only rules that specify fit?

Meet and join: Have you proven that the operations
o |- sigma1 triangle-up/down sigma2 : sigma -> C
actually compute the meet and join?

"In case of absence of a quantifier in sigma_1 <= sigma_2, both types
must match." What rule enables this behavior? Are all of the rules for
generalized matching included in the paper?

Type alternative elimination: What is a type with a quantifier? Can
the quantifier be anywhere, or just at the top level?

Example 1 walk-through: can you revise the example to use simpler
variable names than v_23_0 and v_38_0? How about v1? I.e., write
forall a. a -> a  as  forall a. (a -> a).

Meet/join computation: OK, I give up. You've lost me here.

4.5: I'm having a hard time understanding how a two-pass algorithm can
be defined by Figure 18.

Figure 18. What are the components of the expression type rules
judgment? What is the sigma^k to the left of the turnstile?

Where is impredicative inference used? I don't see o_im anywhere in
Figure 18.

E.APP_I2: In general, are the output constraints a superset of the
input constraints of a judgment? If so, why does this rule output both
CC_a and CC_A? If not, why does it not also include CC_f and CC_F?
(Where CC are the constraints from the first stage.)

E.LAM_I2: What is fi^-_{l,r} ?

E.LET_I2: Is the CC_E input on the third premise a typo?
===========================================================================



## Comments of the reviewers on the 20060407 ICFP 2006 variant

On the positive side, this is more ambitious than many of the current
attempts at propagating polymorphism information from annotations in
type inference for System F.  Also, it seems to introduce some novel
mechanisms for propagating polymorphism information.

Unfortunately, the paper seems to have lots of gibberish in it.  There
are too many mistakes; see details below.  Maybe all of these mistakes
are easily fixable, but the rate at which I encountered them caused me
to lose confidence in the paper before reaching the end of section 2.

Overall, I recommend rejection.  I hope the authors can clean the
paper up and resubmit it (but please don't rush it), as they seem to
be trying some promising ideas.

The details:

In figure 3, because of the way your system threads the substitution
through the rules, and because you missed some spots, I can derive
that (\lambda z. \lambda y. \lambda x. y (x int) (z x)) has type ((v_2 \to
v_5) \to (v_3 \to v_5 \to v_6) \to (Int \to v_3) \to v_6).  In this
type, v_2 should really be (Int \to v_3), but the typing rules don't
work correctly.  One place to fix this is to change inst(\sigma) in
E.VAR_{HM} to inst(C^k \sigma).  However, I think it would be better to
change the way the rules work to be less delicate and easier to
understand.
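
A generic Algorithm-W-style sketch of the fix the reviewer suggests (toy types and names of my own; the paper's C^k corresponds to the accumulated substitution here):

```haskell
import qualified Data.Map as M

type Name   = String
data Type   = TVar Name | Type :-> Type deriving Show
data Scheme = Scheme [Name] Type            -- forall vs . t
type Subst  = M.Map Name Type

applyTy :: Subst -> Type -> Type
applyTy s t@(TVar v) = M.findWithDefault t v s
applyTy s (a :-> b)  = applyTy s a :-> applyTy s b

-- Apply a substitution under the quantifier, dropping the bound names
-- from its domain (capture issues ignored in this sketch).
applyScheme :: Subst -> Scheme -> Scheme
applyScheme s (Scheme vs t) = Scheme vs (applyTy (foldr M.delete s vs) t)

-- In the variable rule, instantiate  applyScheme c sigma  -- i.e.
-- inst(C^k sigma) -- rather than sigma itself, so information the
-- substitution has learned about sigma's free variables is not lost.
```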

In figure 3, your E.LET_{HM} rule is completely missing the condition
that checks that any recursive uses of the let-bound variable are at
the same type inferred for it.  So we can type non-HM typable stuff
like (let i = i j in e); in this example, the variable i gets type
(forall v.v) in e.

In figure 3, your inference rules have broken freshness conditions.
Consider for example E.LAM_{HM}.  Remember that in these kind of
inference rules, the horizontal line is just another notation for
implication.  That is, your E.LAM_{HM} rule is of the form ((v fresh
and judgement_1) => judgement_2).  This rule is a condition on the set
of judgements that hold, and by convention we assume the valid
judgements are the smallest set that satisfies all the rules.  Now in
this situation, what does "v fresh" mean, logically?  Does it mean "v
does not occur in judgement_1", or maybe "v does not occur in
judgement_1 and judgement_2"?  Clearly the latter is useless, as it is
pointless to exclude v from the rest of the rule.  So we can guess it
means "v does not occur in judgement_1".  However, in this case, we
can not use these rules to derive that (\lambda i \to i) has result
type "v \to v", which is sheer nonsense.  Of course, you actually
probably mean "v does not occur in C^k or \Gamma", but you didn't
write that.

The rules in figure 4 can't handle the case of "\vdash \tau_1 \to
\tau_2 \cong v \leadsto C", although there is a rule for "\vdash v
\cong \tau_1 \to \tau_2 \leadsto C".

When you say, "the rules in figure 3 from now on yield constraints",
what do you mean?  Are you saying the reader should pretend the rules
have been redefined differently?  If so how?  Can you write the
changes?

The use of an overline to indicate a sequence is completely undefined.
Furthermore, you misuse this notation to indicate a _set_ in some
places.  In any case, I strongly recommend not using it when the
overbar is over a compound expression (e.g., over "a \mapsto \sigma"),
as this is difficult to define in a way that is clear and precise to
the reader.  In particular, there is no clear way for the reader to
tell which different overlines near each other must be for sequences
of the same length.  Also, if an overline is over a compound
expression mentioning \sigma, and nearby there is an overline just
over \sigma by itself, there is no clear way for the reader to know if
these \sigma's are related.

Figures 2 and 3: The typing rules in figure 3 only make sense if every
\tau is also a \sigma.  Right now, a \sigma is of the form "\forall,
sequence of variables, full stop, \tau".  No \tau is of this
shape.  You seem to be taking for granted that "\forall, empty
sequence, full stop, \tau" is somehow equal to \tau.  This is bad.
For example, if you do this in a theorem prover, then usually you will
find that all of mathematics collapses with the result that 1 = 0,
true is false, etc.  I suggest adding "| \tau" to the grammar for
\sigma.
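
The suggested grammar fix, rendered as a Haskell sketch (constructor names are hypothetical):

```haskell
-- Monotypes and polytypes, with the reviewer's "| tau" alternative made
-- explicit so that every monotype is a polytype by construction.
data Tau   = TV String | Tau :-> Tau
data Sigma = Forall [String] Tau   -- forall as . tau
           | Mono Tau              -- the added  "| tau"  case
```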

"ftv" is undefined and the name is unexplained.  The extension of ftv
to environments is also undefined, and not obvious to a reader who
does not already know the definition.

You do not define "\equiv" and use it in places where "=" would make
more sense.

"C_1 C_2" (composition of substitutions) is undefined.

You use \in with \Gamma, although \Gamma is not defined as a set.

It is strange that you use the word constraint for what is more
commonly called a substitution.

Your use of the words "subsume", "subsumption", etc., is unclear.

You remark that your types are somewhat like intersection types.
Actually, it is more like both intersection and union types, although
very restricted versions.  The most similar previous system is the
system of constrained types of Eifrig, Smith, and Trifonov.  The
system of Aiken, Wimmers, and Lakshman is also similar.  I suggest you
compare with them.

I like that your system will avoid any arbitrary left-to-right bias
while performing type inference.

In figure 5, it is confusing that you appear to be redefining \cong.
Earlier it seemed like a fixed symbol.

Figure 6 is very confusing.  Just what is the actual definition of
\sigma?  You seem to have several alternative definitions and it is
not clear which ones are used where.

You've repeated part of the definitions in figure 6 in the text in
section 3.2.

For the implementability of higher-rank intersection types it is
better to cite [Kfoury & Wells, TCS 2004] and [Carlier & Wells, PPDP
2004].

"inst" is undefined.

"loose" should be "lose".

-------------------- review 24 --------------------

The topic of the paper is partial type inference for System F.  While
the authors seem to have very clear ideas of what they are doing,
they completely fail to transmit them. After more than two hours spent
on the paper, I could not yet understand the main ideas (if any), nor
the least technical detail of the authors' approach.  There is
definitely something wrong in the presentation and most probably in
the approach itself. Therefore, the paper cannot be accepted.

The few things I understood from the informal discussion are that the
authors tend to use user-provided type annotations and propagate them
to other parts of the program. For example, explicitly annotated
function definitions can transmit their domain type to the arguments
to which they are applied.  All type annotations that could thus be
transmitted to the same program point are collected in a set called
"type alternatives" and resolved later by a procedure called "type
alternative elimination", using meets and joins to produce a best
annotation (so they claim). However, I don't understand how this
really works. I did not find any discussion of the propagation
strategy, which seems to be ad hoc and very fragile: there is no
example showing what happens if the argument is annotated and the
function is not.  Worse, I have serious difficulties understanding how
type alternatives can be resolved in a principal way.

Actually, I don't really understand the authors' definition of type
instance (subsumption), or even where exactly type instance is being
defined. Type instance is a binary relation between types, but the rules
of Figure 8 define a complex relation of arity 5. In fact, judgments
of Figure 8 do not match the Figure's title prototype. The same
judgment also appears in Figure 7, but with a different prototype, and
reappears later in Figure 11. If I collect all rules involving <=
together, I get lost in the amount of symbols and the lack of
explanations on how to read them.  The authors mention
contra-variance, but none of the above rules mention arrow types,
hence I do not see any contra-variance involved. Is contra-variance,
then, hard-wired into the type-inference process itself? This
would be very ad hoc and fragile, as well-typedness would not be very
robust with respect to small program transformations.

Explanations in the rest of the paper are similarly loose.  The final Figure
18, which is supposed to describe what programs are well-typed, is also
unreadable. Worse, rule E.Let does not seem to include type annotations,
which should be a key to the approach.

Another important issue that remains completely unclear to me is how type
annotations with free type variables would be handled. No hint. Are these
forbidden? Can all of System F still be reached, then?

I also don't understand how the phases can be split into propagation and
then inference a la ML without any further interaction between them. In
particular, what about rebinding of user-annotated programs such as in:

let f :: s1 = e1 in
let f' = f in
fun y -> g y

and its small variant

let f :: s1 = e1 in
let f' = f v in
fun y -> g y

where v is a value and f a multi-parameter function?

Actually, nor do I understand the final phase of typing a la ML in the
presence of type annotations. For instance, assume that we switch off
propagation of type annotations. What programs are then accepted? What
would be the derived (simplified) typing rules?

Some related works are listed.  However, no comparison with them is
actually made. So even if I could make sense of the authors' approach, I
would have no hint of why I should follow this approach rather than
previous approaches such as [2, 15, 16, 18, 25]. A closer comparison with
existing work is mandatory.

Not to mention type soundness: what assures us that accepted programs are
correct with respect to evaluation, or well-typed in System F?

At the end of the reading, I am still completely unable to say which simple
programs would be typable or not, and why it would be so.

-------------------- review 211 --------------------

This paper offers an approach to the difficult problem of type inference
for higher-ranked types. It requires that the programmer supply some
annotations, and then tries to exploit them as much as possible to infer
correct typings for program fragments involving higher-ranked types. The
approach proceeds in two phases, each based on a variant of the standard
Hindley-Milner type inference algorithm. A working prototype of the
proposed system is publicly available.

I found this paper rather impenetrable. Aside from copious grammatical
errors, which compromise comprehension of the technical content of the
paper in some places, I found it difficult to discern from the
presentation exactly what the "algorithm" being proposed is. Perhaps
including a flow chart or some other visual aid would help.

There is also a plethora of notation in this paper! And a zillion
inference rules, each seemingly more complex than the last! (The first
rule in Figure 14 is particularly daunting!) Perhaps the proposed
algorithm "works" and perhaps it doesn't. But, for one thing, I can't
even begin to imagine determining exactly when it terminates, and is
thus a true "algorithm", let alone formulating the properties that one
would want the algorithm to have in order to be able to prove that it is
correct in any sense.  Even the authors don't attempt any of this,
instead saying, in essence, "Here are some complicated manoeuverings
that we claim make type inference possible for higher-ranked types. It
'works'." It would be helpful to at least formulate the pertinent
properties of the algorithm presented, even if they aren't proved here.
At least the reader could then try to construct plausibility arguments
for himself.

Perhaps a reader could benefit somewhat from playing with the prototype,
although that probably wouldn't help very much with understanding the
internals of the algorithm on which it is based, and shouldn't be
required in order to understand the paper...

I wish I could say something more positive, since I'm sure a lot of work
has gone into developing this technique. But it is hard to see how this
paper, as currently presented, obviously crammed into far too little
space, is very valuable. Perhaps these kinds of detailed "algorithms"
just don't lend themselves very well to conference papers. There may
just be too much detailed manoeuvering required in such technical
developments to allow for proper exposition, and thus for proper
understanding. In my opinion, this is certainly the case for the present
paper.

page 1, column 1, line 21: "Literature is abound" -> "The literature
abounds".

page 1, column 2, line -2: "Specified type signatures are" seems like it
should be "A specified type signature is". Actually, this whole sentence
seems strange, since a type signature is not itself a type... Please
reword.

page 1, column 2, line -2: Missing comma after "required".

page 2, column 1: It would be helpful to say, in the first bullet
point's text, explicitly what the two stages of your algorithm are, just
to give the reader something concrete to hang on to.

page 2, column 1, line 23: "one phase type inference in" -> "one-phase
type inference algorithm in".

page 2, column 1, line -18: One doesn't implement a paper... Please
reword.

page 2, column 1, line -14: I don't understand this comment "thereby
providing... you get".

page 2: Do you really mean to have the definition of type expressions
in Figure 1 AND the definition in Figure 2?

page 3: Can you say what the function inst in Figure 3 is, at least
briefly?

page 3, column 1, line -3: "sofar" -> "so far".

page 3, column 2: I can't parse the first sentence in the first bullet
point's text.

page 3, column 2: There is a missing period in the text of the second
bullet point.

page 3, column 2: Is there a missing bullet point that should deal with
abstractions?

page 3, column 2, line 17: "loose" -> "lose".

page 4, column 2, line 3: "passed to by" -> "passed to".

page 4, column 2, line 8: "passed by" -> "passed to"

page 4, column 2, line -16: "from it" -> "from them".

page 4, column 2, line -5: I don't see how v1 is bound to v1 in the
example on the next page.

page 4, column 2, line -5: "a type" -> "type".

page 5, column 1, line -4: Shouldn't some of these designations be H_h?

page 5: What is (I2) in the figure captions throughout the paper?

page 6, column 1, line -21: Indicate that rule M.VAR.L3 is in Figure
11.

page 7, column 1, line 5: Missing comma after "fit".

page 7, column 1, line -17: how does something actually GET marked as
hard?

page 7: What is "toggle" in Figure 14?

page 7, column 2, line -13: If h is a function, then it cannot be a
polymorphic type. There are numerous errors of this kind riddling this
paper. Please be careful to say what you mean. Two lines later: how can
h be the most general type? It is a function, is it not?

page 8: Introduce notation with meet and join triangles in it,
appearing in Figure 16.

page 11, column 1, line -17: I don't see how you could really be 100%
"confident" that the approach presented here doesn't break normal type
inference. I don't see how one could really be completely sure that this
process has any particular properties without proving them formally.
The situation just seems too complex to have infallible intuitions and
be willing to rely completely on them.

page 11, column 1, line -3: "thesis" -> "paper".

page 11, column 1, line 2: "message" -> "messages".

page 12: Update reference [25].



-- AtzeDijkstra - 09 Apr 2008

Topic attachments:

- 20060407-2233-icfp06-impred.pdf (172.9 K, 09 Apr 2008, AtzeDijkstra)
- 20060602-2040-hw06-impred.pdf (183.8 K, 09 Apr 2008, AtzeDijkstra)
- 20061013-1745-esop07-impred-tr.pdf (209.0 K, 09 Apr 2008, AtzeDijkstra)
- 20061013-1745-esop07-impred.pdf (134.6 K, 09 Apr 2008, AtzeDijkstra)