You have described an effect system. It's true that there are effect systems other than monads, but in practice monads give you a lot of expressive power that you would have to reinvent in any practical effect system you might devise.
For example:
- I start by tagging my I/O procedures with an `io` effect.
- My pure functions can still throw exceptions, but throwing exceptions isn't `io`, so I add an `exn` effect. Now I need to be able to combine effects.
- I want to have non-I/O functions that use mutation internally on a local heap `h`, and I want the compiler to make sure that my mutable references don't escape their scope, so I add an `st<h>` effect. Now I need parameterised effects.
- For higher-order functions, I need to write multiple versions for every combination of effects I want to support, for example `map` with a pure function versus `map` with an impure function. Now I need effect polymorphism.
- I notice that my user-defined type could be used to model effects (benign failure, nondeterminism) that the language doesn't have natively. Now I need user-defined effects.
Monads support all of these use cases:
- Functions that may return errors can use the `Either` monad.
- Effects can be combined with monad transformers: if I need exceptions and I/O, I can use the `EitherT IO` monad.
- Monads can be parameterised like any other type; the `ST h` monad lets you do local mutation on `STRef`s in the scope of a heap `h`.
- Because of the `Monad` typeclass, I can overload an operation to work on any kind of effect. For example, `mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]` works in `IO`, but it also works in `Either` or `ST` or any other monad.
- All monads are implemented as user-defined types, except for things like `IO` and `ST`, which must be implemented in the Haskell runtime.
There are of course some significant warts in Haskell's use of monads for side effects.

For one thing, monad transformers are generally not commutative, so `StateT Maybe` is different from `MaybeT State`, and you have to be careful to choose the combination that you actually mean.

The bigger problem is that a function can be polymorphic in *which* monad is being used, but it cannot be polymorphic in *whether* a monad is being used. So we get loads of duplicated code: `map` vs. `mapM`, `zipWith` vs. `zipWithM`, etc.
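The non-commutativity is easy to demonstrate with the `transformers` package; in this sketch, whether a state update survives a failure depends entirely on the stacking order:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..), runMaybeT)
import Control.Monad.Trans.State (State, StateT, modify, runState, runStateT)

-- StateT s Maybe: a failure discards the state entirely.
bump1 :: StateT Int Maybe ()
bump1 = modify (+ 1) >> lift Nothing
-- runStateT bump1 0  ==>  Nothing

-- MaybeT (State s): the state update survives the failure.
bump2 :: MaybeT (State Int) ()
bump2 = lift (modify (+ 1)) >> MaybeT (pure Nothing)
-- runState (runMaybeT bump2) 0  ==>  (Nothing, 1)
```

Same two effects, opposite semantics, so you really do have to pick the order you mean.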
Now, in answer to your actual question…
Koka, from Microsoft Research, is a language with an effect system that does not suffer from these problems. For example, the type of Koka's `map` function includes a type parameter `e` for effects:

    map : (xs : list<a>, f : (a) -> e b) -> e list<b>

And this parameter can be instantiated to the null effect, making `map` work for pure and impure functions alike. In addition, the order in which effects are specified is immaterial: `<exn,io>` is the same as `<io,exn>`.
It’s also worth mentioning that Java’s checked exception mechanism is an example of an effect system that people have widely accepted, but I don’t feel it adequately answers your question because it’s not general.
Your problem is that you link method calls before the program is fully type-checked. In a complicated language with subtype polymorphism and parametric polymorphism, you cannot do any kind of linking before you know the type (or in the case of subtyping, the type bounds) of an expression. The solution is to introduce type variables for expressions of (initially) unknown type, and properly unify type variables with concrete types. If you cannot resolve a type variable, that's an error and not an invitation to deduce the most general type.
A proper type inferrer would look at the code

    var list2 = list.mapped { s => s.toUpperCase }

in the static environment

    list : List[String]
    List[E]#mapped[U] : (E => U) => List[U]
    String#toUpperCase : () => String

like this:
- infer `var list2 = list.mapped { s => s.toUpperCase }`
  - infer `list.mapped { s => s.toUpperCase }`
    - infer `list`
      - known to be `List[String]`
      - save `E = String` to the static environment for this expression.
    - resolves to `List[E]#mapped[U]`
      - introduce variable `$U` for `U`.
      - known to be `(E => $U) => List[$U]`
    - unify arguments: `(E => $U) >: typeof({ s => s.toUpperCase })`
      - infer `{ s => s.toUpperCase }`
        - introduce variables `$a, $b`
        - known to be `$a => $b`
        - unify `(E => $U) >: ($a => $b)`: `E <: $a` and `$b <: $U`. Technically, these are just type bounds, but let's consider them equal.
          - save `E = $a`, `$b = $U` to the static environment for this expression
          - save `s : $a` to the static environment for this expression
        - infer `s.toUpperCase`
          - infer `s`
          - infer `.toUpperCase`
            - matches `String#toUpperCase`
            - known to be `() => String`
          - is `String`
          - unify `String <: $b`, but let's consider them equal
            - save `$b = String` to the static environment for this expression
        - is `($a => $b) = (String => String)`
      - already unified `(E => $U) >: ($a => $b)`.
    - is `List[$U] = List[String]`
  - infer `var list2`
    - introduce variable `$c`
    - save `list2 : $c` into the static environment for this scope
    - is `$c`
  - unify `$c >: List[String]`, but let's consider them equal
    - save `$c = List[String]` into the static environment
After this round of type inference, asking the static environment for the type of `list2` gives us `List[String]`. The above steps roughly follow the type-inference steps for a Hindley-Milner type system, but note that you are probably dealing with subtyping, which makes type inference far more complicated: unifying a type only provides bounds on the type, and very rarely a concrete type. In the above example, I ignored these details and always unified the types as equals.
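The "introduce a variable, unify, save a binding" loop in the walkthrough can be condensed into a toy equality unifier. This is a sketch that, like the walkthrough, treats all bounds as equalities; the names (`Ty`, `Subst`, `demo`) are invented for illustration:

```haskell
import qualified Data.Map as M

-- Types are variables ($U, $a, ...) or constructors (String, List[...], A => B).
data Ty = TVar String | TCon String [Ty] deriving (Eq, Show)

type Subst = M.Map String Ty

-- Chase variable bindings so we always unify fully-resolved types.
walk :: Subst -> Ty -> Ty
walk s (TVar v)    = maybe (TVar v) (walk s) (M.lookup v s)
walk s (TCon c ts) = TCon c (map (walk s) ts)

occurs :: String -> Ty -> Bool
occurs v (TVar w)    = v == w
occurs v (TCon _ ts) = any (occurs v) ts

unify :: Ty -> Ty -> Subst -> Maybe Subst
unify a b s = case (walk s a, walk s b) of
  (TVar v, t)
    | t == TVar v -> Just s
    | occurs v t  -> Nothing                 -- would build an infinite type
    | otherwise   -> Just (M.insert v t s)   -- "save $v = t"
  (t, TVar v)     -> unify (TVar v) t s
  (TCon c ts, TCon d us)
    | c == d && length ts == length us ->
        foldl (\ms (x, y) -> ms >>= unify x y) (Just s) (zip ts us)
  _ -> Nothing

-- Unifying (E => $U) with (String => String) saves E = String and $U = String:
demo :: Maybe Subst
demo = unify (TCon "=>" [TVar "E", TVar "$U"])
             (TCon "=>" [TCon "String" [], TCon "String" []])
             M.empty
```

Here `demo` yields a substitution binding both `E` and `$U` to `String`, exactly the bindings saved in the trace above.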
The second major problem is inferring the type of the object on which to dispatch a method call. In this example, I always tried to infer the type of the object before resolving the method call. However, this is not necessary. A method `List[E]#mapped[U] : (E => U) => List[U]` could also be viewed as a free function `.mapped[E, U] : (List[E], E => U) => List[U]`. In other words, the implicit `this` argument is turned into an explicit argument. This allows us to use type variables for the invocant, and makes it easier to deal with a set of overloads. Consider the following example:
Static environment:

    .add[E]    : (List[E], E) => void
    .add[E]    : (List[E], List[E]) => void
    .add[K, V] : (Map[K, V], K, V) => void
    .add       : (Integer, Integer) => Integer
    a : $a
    b : $b

Expression:

    a.add(b)
When type-inferring that expression, we initially only know that it is some kind of `.add` call, but we do not know which one; we cannot resolve it at this point. We have to unify each of the possible types with the given argument types. If one unification fails, we remove that possibility from the set of overloads.
- infer `a.add(b)`
  - candidate `(List[E], E) => void`
    - variable `$c` for `E`
    - unify arguments: `$a = List[$c]`, `$b = $c`
    - is `void`
  - candidate `(List[E], List[E]) => void`
    - variable `$c` for `E`
    - unify arguments: `$a = List[$c]`, `$b = List[$c]`
    - is `void`
  - candidate `(Map[K, V], K, V) => void`
    - variable `$c` for `K`
    - variable `$d` for `V`
    - unify arguments: `$a = Map[$c, $d]`, `$b = $c`
    - ERROR: no argument for the `V` parameter
    - discard candidate
  - candidate `(Integer, Integer) => Integer`
    - unify arguments: `$a = Integer`, `$b = Integer`
    - is `Integer`
So after doing a round of type inference, we are still left with a number of choices. Some languages employ a ranking mechanism, e.g. methods on subtypes are preferred over methods on supertypes. In our example, there is no sensible ranking. If you want your type checker to perform reasonably well, you should issue a compile error here rather than searching for a resolution, e.g.:

    foo.sourcefile:42:3 error: could not resolve call to method `add`
        a.add(b)
          ^
    candidates are:
      List[E]#add(E elem)
      List[E]#add(List[E] elems)
      Integer#add(Integer rhs)
Likewise, you should display an error when the set of possible methods is empty after attempting unification.
Note that for a different static environment, the set of possible solutions would contain only one method, which you could then bind to. E.g.:

    a : List[$c]
    b : $c

Since the call can be resolved unambiguously in that case, we can continue doing type inference for the rest of the compilation unit in hopes of resolving the `$c` type variable.
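The candidate-filtering step can be sketched as follows: keep the overloads whose parameter lists unify with the argument types, and report an ambiguity error if more than one survives. The representation and names (`Ty`, `match`, `viable`) are made up for illustration, and the matcher deliberately skips the occurs check and substitution chasing that a real unifier needs:

```haskell
data Ty = Var String | Con String [Ty] deriving (Eq, Show)

-- One-shot matching: bind a variable on first sight, then require
-- consistency. (No occurs check or substitution chasing; sketch only.)
match :: [(String, Ty)] -> Ty -> Ty -> Maybe [(String, Ty)]
match bs (Var v) t = case lookup v bs of
  Just t' -> if t' == t then Just bs else Nothing
  Nothing -> Just ((v, t) : bs)
match bs t (Var v) = match bs (Var v) t
match bs (Con c ts) (Con d us)
  | c == d && length ts == length us =
      foldl (\mbs (p, a) -> mbs >>= \bs' -> match bs' p a) (Just bs) (zip ts us)
  | otherwise = Nothing

-- A candidate is its parameter list, invocant first.
viable :: [[Ty]] -> [Ty] -> [[Ty]]
viable candidates args =
  [ ps
  | ps <- candidates
  , length ps == length args   -- drops the Map overload: no argument for V
  , Just _ <- [foldl (\mbs (p, a) -> mbs >>= \bs -> match bs p a)
                     (Just []) (zip ps args)]
  ]

addOverloads :: [[Ty]]
addOverloads =
  [ [Con "List" [Var "E"], Var "E"]                  -- (List[E], E) => void
  , [Con "List" [Var "E"], Con "List" [Var "E"]]     -- (List[E], List[E]) => void
  , [Con "Map" [Var "K", Var "V"], Var "K", Var "V"] -- (Map[K,V], K, V) => void
  , [Con "Integer" [], Con "Integer" []]             -- (Integer, Integer) => Integer
  ]
```

With the arguments `[Var "$a", Var "$b"]`, three candidates survive, so the checker should stop and report the ambiguity with those three candidates, as in the error message above.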
Depending on how you implement generics, you could link the method call directly after successfully inferring the type of the call. If you need to know the value of the type variable `$c`, you need to wait until type inference for the compilation unit has completed, and do the linking in a second pass. This is easy at that point, since all expressions are already annotated with their inferred types.
Best Answer
First, the GHC error. GHC is attempting to unify a few constraints on `x`: first, we use it as a function, so its type must have the shape `t2 -> t1 -> t0`; next, we use it as the value passed to that function, so it must also have type `t2`. So `x x` becomes an attempt to unify `t2` with `t2 -> t1 -> t0`. We can't unify this, since it would require unifying `t2`, the first argument of `x`, with the type of `x` itself, producing an infinite type. Hence our error message.

Next, why not general recursive types? The first point worth noting is the difference between equi- and iso-recursive types. With equi-recursive types, `mu X . Type` is exactly equivalent to its expansion: you can fold and unfold it arbitrarily, implicitly. With iso-recursive types, you instead get a pair of explicit operations, `fold` and `unfold`, which fold and unfold the recursive definitions of types.

Now, equi-recursive types sound ideal, but they are absurdly hard to get right in complex type systems; they can actually make type checking undecidable. I'm not familiar with every detail of OCaml's type system, but fully equi-recursive types in Haskell could cause the typechecker to loop arbitrarily while trying to unify types, and by default, Haskell makes sure that type checking terminates. Furthermore, in Haskell, type synonyms are dumb: the most useful recursive types would be defined like `type T = T -> ()`, but synonyms are inlined almost immediately, and you can't inline a recursive type, because it's infinite! Recursive type synonyms in Haskell would therefore demand a huge overhaul of how synonyms are handled, probably not worth the effort even as a language extension.

Iso-recursive types are a bit of a pain to use: you more or less have to tell the type checker explicitly how to fold and unfold your types, making your programs more complex to read and write.
However, this is very similar to what you're doing with your `Mu` type: `Roll` is fold, and `unroll` is unfold. So actually, we do have iso-recursive types baked in. Equi-recursive types are just too complex, so systems like OCaml and Haskell force you to pass recursion through explicit type-level fixpoints.

Now, if this interests you, I'd recommend Types and Programming Languages. My copy is sitting open in my lap as I'm writing this, to make sure I've got the right terminology :)
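For reference, here is the `Mu` encoding under discussion, with `Roll` as fold and `unroll` as unfold, applied to natural numbers as an example fixpoint; `NatF` and `toInt` are illustrative names:

```haskell
newtype Mu f = Roll { unroll :: f (Mu f) }

-- Natural numbers as the fixpoint of a non-recursive base functor:
data NatF r = ZeroF | SuccF r

type Nat = Mu NatF

zero :: Nat
zero = Roll ZeroF

suc :: Nat -> Nat
suc = Roll . SuccF

-- unroll peels off one layer of the recursion at a time:
toInt :: Nat -> Int
toInt n = case unroll n of
  ZeroF   -> 0
  SuccF m -> 1 + toInt m
-- toInt (suc (suc zero))  ==>  2
```

Every place the recursion is entered or exited is marked by an explicit `Roll` or `unroll`, which is precisely the iso-recursive discipline.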