Haskell – When is -XAllowAmbiguousTypes appropriate

ambiguous-typesghchaskell

I've recently posted a question about syntactic-2.0 regarding the definition of share. I've had this working in GHC 7.6:

{-# LANGUAGE GADTs, TypeOperators, FlexibleContexts #-}

import Data.Syntactic
import Data.Syntactic.Sugar.BindingT

data Let a where
    Let :: Let (a :-> (a -> b) :-> Full b)

share :: (Let :<: sup,
          sup ~ Domain b, sup ~ Domain a,
          Syntactic a, Syntactic b,
          Syntactic (a -> b),
          SyntacticN (a -> (a -> b) -> b) 
                     fi)
           => a -> (a -> b) -> b
share = sugarSym Let

However, GHC 7.8 wants -XAllowAmbiguousTypes to compile with that signature. Alternatively, I can replace the fi with

(ASTF sup (Internal a) -> AST sup ((Internal a) :-> Full (Internal b)) -> ASTF sup (Internal b))

which is the type implied by the fundep on SyntacticN. This allows me to avoid the extension. Of course this is

a very long type to add to an already-large signature
tiresome to manually derive
unnecessary due to the fundep

My questions are:

Is this an acceptable use of -XAllowAmbiguousTypes?
In general, when should this extension be used? An answer here suggests "it is almost never a good idea".
Though I've read the docs, I'm still having trouble deciding if a constraint is ambiguous or not. Specifically, consider this function from Data.Syntactic.Sugar:
```
sugarSym :: (sub :<: AST sup, ApplySym sig fi sup, SyntacticN f fi) 
         => sub sig -> f
sugarSym = sugarN . appSym
```
It appears to me that fi (and possibly sup) should be ambiguous here, but it compiles without the extension. Why is sugarSym unambiguous while share is? Since share is an application of sugarSym, the share constraints all come straight from sugarSym.

Best Answer

I don't see any published version of syntactic whose signature for sugarSym uses those exact type names, so I'll be using the development branch at commit 8cfd02^, the last version which still used those names.

So, why does GHC complain about the fi in your type signature but not the one for sugarSym? The documentation you have linked to explains that a type is ambiguous if it doesn't appear to the right of the constraint, unless the constraint is using functional dependencies to infer the otherwise-ambiguous type from other non-ambiguous types. So let's compare the contexts of the two functions and look for functional dependencies.

class ApplySym sig f sym | sig sym -> f, f -> sig sym
class SyntacticN f internal | f -> internal

sugarSym :: ( sub :<: AST sup
            , ApplySym sig fi sup
            , SyntacticN f fi
            ) 
         => sub sig -> f

share :: ( Let :<: sup
         , sup ~ Domain b
         , sup ~ Domain a
         , Syntactic a
         , Syntactic b
         , Syntactic (a -> b)
         , SyntacticN (a -> (a -> b) -> b) fi
         )
      => a -> (a -> b) -> b

So for sugarSym, the non-ambiguous types are sub, sig and f, and from those we should be able to follow functional dependencies in order to disambiguate all the other types used in the context, namely sup and fi. And indeed, the f -> internal functional dependency in SyntacticN uses our f to disambiguate our fi, and thereafter the f -> sig sym functional dependency in ApplySym uses our newly-disambiguated fi to disambiguate sup (and sig, which was already non-ambiguous). So that explains why sugarSym doesn't require the AllowAmbiguousTypes extension.

Let's now look at sugar. The first thing I notice is that the compiler is not complaining about an ambiguous type, but rather, about overlapping instances:

Overlapping instances for SyntacticN b fi
  arising from the ambiguity check for ‘share’
Matching givens (or their superclasses):
  (SyntacticN (a -> (a -> b) -> b) fi1)
Matching instances:
  instance [overlap ok] (Syntactic f, Domain f ~ sym,
                         fi ~ AST sym (Full (Internal f))) =>
                        SyntacticN f fi
    -- Defined in ‘Data.Syntactic.Sugar’
  instance [overlap ok] (Syntactic a, Domain a ~ sym,
                         ia ~ Internal a, SyntacticN f fi) =>
                        SyntacticN (a -> f) (AST sym (Full ia) -> fi)
    -- Defined in ‘Data.Syntactic.Sugar’
(The choice depends on the instantiation of ‘b, fi’)
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes

So if I'm reading this right, it's not that GHC thinks that your types are ambiguous, but rather, that while checking whether your types are ambiguous, GHC encountered a different, separate problem. It's then telling you that if you told GHC not to perform the ambiguity check, it would not have encountered that separate problem. This explains why enabling AllowAmbiguousTypes allows your code to compile.

However, the problem with the overlapping instances remain. The two instances listed by GHC (SyntacticN f fi and SyntacticN (a -> f) ...) do overlap with each other. Strangely enough, it seems like the first of these should overlap with any other instance, which is suspicious. And what does [overlap ok] mean?

I suspect that Syntactic is compiled with OverlappingInstances. And looking at the code, indeed it does.

Experimenting a bit, it seems that GHC is okay with overlapping instances when it is clear that one is strictly more general than the other:

{-# LANGUAGE FlexibleInstances, OverlappingInstances #-}

class Foo a where
  whichOne :: a -> String

instance Foo a where
  whichOne _ = "a"

instance Foo [a] where
  whichOne _ = "[a]"

-- |
-- >>> main
-- [a]
main :: IO ()
main = putStrLn $ whichOne (undefined :: [Int])

But GHC is not okay with overlapping instances when neither is clearly a better fit than the other:

{-# LANGUAGE FlexibleInstances, OverlappingInstances #-}

class Foo a where
  whichOne :: a -> String

instance Foo (f Int) where  -- this is the line which changed
  whichOne _ = "f Int"

instance Foo [a] where
  whichOne _ = "[a]"

-- |
-- >>> main
-- Error: Overlapping instances for Foo [Int]
main :: IO ()
main = putStrLn $ whichOne (undefined :: [Int])

Your type signature uses SyntacticN (a -> (a -> b) -> b) fi, and neither SyntacticN f fi nor SyntacticN (a -> f) (AST sym (Full ia) -> fi) is a better fit than the other. If I change that part of your type signature to SyntacticN a fi or SyntacticN (a -> (a -> b) -> b) (AST sym (Full ia) -> fi), GHC no longer complains about the overlap.

If I were you, I would look at the definition of those two possible instances and determine whether one of those two implementations is the one you want.

Related Solutions

Haskell – When is memoization automatic in GHC Haskell

GHC does not memoize functions.

It does, however, compute any given expression in the code at most once per time that its surrounding lambda-expression is entered, or at most once ever if it is at top level. Determining where the lambda-expressions are can be a little tricky when you use syntactic sugar like in your example, so let's convert these to equivalent desugared syntax:

m1' = (!!) (filter odd [1..])              -- NB: See below!
m2' = \n -> (!!) (filter odd [1..]) n

(Note: The Haskell 98 report actually describes a left operator section like (a %) as equivalent to \b -> (%) a b, but GHC desugars it to (%) a. These are technically different because they can be distinguished by seq. I think I might have submitted a GHC Trac ticket about this.)

Given this, you can see that in m1', the expression filter odd [1..] is not contained in any lambda-expression, so it will only be computed once per run of your program, while in m2', filter odd [1..] will be computed each time the lambda-expression is entered, i.e., on each call of m2'. That explains the difference in timing you are seeing.

Actually, some versions of GHC, with certain optimization options, will share more values than the above description indicates. This can be problematic in some situations. For example, consider the function

f = \x -> let y = [1..30000000] in foldl' (+) 0 (y ++ [x])

GHC might notice that y does not depend on x and rewrite the function to

f = let y = [1..30000000] in \x -> foldl' (+) 0 (y ++ [x])

In this case, the new version is much less efficient because it will have to read about 1 GB from memory where y is stored, while the original version would run in constant space and fit in the processor's cache. In fact, under GHC 6.12.1, the function f is almost twice as fast when compiled without optimizations than it is compiled with -O2.

Haskell tuple constructor (GHC) and the separation between a language and its implementation

Alas, there is no magic in the tuples. Here's the implementation GHC uses, and to give you some idea of what's going on here's the source for the last definition:

data (,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,) a b c d e f g h i j k l m n o p q r s t u v w x y z a_ b_ c_ d_ e_ f_ g_ h_ i_ j_ k_ l_ m_ n_ o_ p_ q_ r_ s_ t_ u_ v_ w_ x_ y_ z_ a__ b__ c__ d__ e__ f__ g__ h__ i__ j__
  = (,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,) a b c d e f g h i j k l m n o p q r s t u v w x y z a_ b_ c_ d_ e_ f_ g_ h_ i_ j_ k_ l_ m_ n_ o_ p_ q_ r_ s_ t_ u_ v_ w_ x_ y_ z_ a__ b__ c__ d__ e__ f__ g__ h__ i__ j__

...yeah.

So, would it be possible to define this family of operators/functions from within the haskell implementation itself, using nothing but the type system and existing language features (declarations, type signatures, function definitions etc.)? And if so, how? Or is it impossible and you have to instead look into the compiler to find the supporting framework for this collection of functions?

No, there's no way to define the tuples like that in a generic way. The common pattern is purely syntactic, nothing that can be done recursively in the type system or otherwise. You could generate such definitions using Template Haskell, certainly, but you'd still be generating each individually with string manipulation to create the name, not using any sort of shared structure.

There's also the matter that tuple syntax is built-in and not something that can be imitated, but that's a separate issue. You might imagine types like:

data Tuple2 a b = Tuple2 a b
data Tuple3 a b c = Tuple3 a b c

...etc., which don't use special syntax but still can't be defined generically for the reasons above.

This leads to an even more general question: How much of Haskell is supported by Haskell itself, through type and function definitions, declarations etc; and how much is supported by the compiler/implementation? (I am aware that GHC was written in Haskell, that doesn't answer the question)

Almost all of it is defined in Haskell. Certain things have special syntax you can't imitate, but in most cases that only extends as far as the compiler giving special attention to certain definitions. Otherwise, there's no difference between this:

data [] a = [] | a : [a]

...and any equivalent type you define yourself.

That is, if you were to abandon the standard libraries (including the prelude) and do everything from the ground up in raw Haskell; would it be possible to build a complete implementation that has all the features of GHC, using only that minimal set of features? What are the mimimum set of language features that you need in order to build a haskell implementation using Haskell? Would I be able to abandon the prelude and then completely rebuild it manually from within GHC? If you abandon the prelude and never import anything, what is left over for you to work with?

You may find it enlightening to read about GHC's NoImplicitPrelude and RebindableSyntax extensions, which let you, among other things, change the definitions used to interpret do notation, how numeric literals are handled, what the if then else syntax does, etc.

Suffice it to say that very, very little can't be reimplemented. Most things that can't are only special due to syntax, and could be replaced with equivalent stuff (like lists and tuples, above).

In the end there's a limited set of things that have very special behavior--the IO type being an obvious example--that you can't replace at all, because they're hooked directly into something in the runtime system that you can't replace.

Best Answer

Related Solutions

Haskell – When is memoization automatic in GHC Haskell

Haskell tuple constructor (GHC) and the separation between a language and its implementation

Related Topic