Functional Programming – What is Referential Transparency?

functional programmingimperative-programming

I have seen that in imperative paradigms

f(x)+f(x)

might not be the same as:

2*f(x)

But in a functional paradigm it should be the same. I have tried to implement both cases in Python and Scheme, but for me they look pretty straightforward the same.

What would be an example that could point out the difference with the given function?

Best Answer

Referential transparency, referred to a function, indicates that you can determine the result of applying that function only by looking at the values of its arguments. You can write referentially transparent functions in any programming language, e.g. Python, Scheme, Pascal, C.

On the other hand, in most languages you can also write non referentially transparent functions. For example, this Python function:

counter = 0

def foo(x):
  global counter

  counter += 1
  return x + counter

is not referentially transparent, in fact calling

foo(x) + foo(x)

and

2 * foo(x)

will produce different values, for any argument x. The reason for this is that the function uses and modifies a global variable, therefore the result of each invocation depends on this changing state, and not only on the function's argument.

Haskell, a purely functional language, strictly separates expression evaluation in which pure functions are applied and which is always referentially transparent, from action execution (processing of special values), which is not referentially transparent, i.e. executing the same action can have each time a different result.

So, for any Haskell function

f :: Int -> Int

and any integer x, it is always true that

2 * (f x) == (f x) + (f x)

An example of an action is the result of the library function getLine:

getLine :: IO String

As a result of expression evaluation, this function (actually a constant) first of all produces a pure value of type IO String. Values of this type are values like any other: you can pass them around, put them in data structures, compose them using special functions, and so on. For example you can make a list of actions like so:

[getLine, getLine] :: [IO String]

Actions are special in that you can tell the Haskell runtime to execute them by writing:

main = <some action>

In this case, when your Haskell program is started, the runtime walks through the action bound to main and executes it, possibly producing side-effects. Therefore, action execution is not referentially transparent because executing the same action two times can produce different results depending on what the runtime gets as input.

Thanks to Haskell's type system, an action can never be used in a context where another type is expected, and vice versa. So, if you want to find the length of a string you can use the length function:

length "Hello"

will return 5. But if you want to find the length of a string read from the terminal, you cannot write

length (getLine)

because you get a type error: length expects an input of type list (and a String is, indeed, a list) but getLine is a value of type IO String (an action). In this way, the type system ensures that an action value like getLine (whose execution is carried out outside the core language and which may be non-referentially transparent) cannot be hidden inside a non-action value of type Int.

EDIT

To answer exizt question, here is a small Haskell program that reads a line from the console and prints its length.

main :: IO () -- The main program is an action of type IO ()
main = do
          line <- getLine
          putStrLn (show (length line))

The main action consists of two subactions that are executed sequentially:

  1. getline of type IO String,
  2. the second is constructed by evaluating the function putStrLn of type String -> IO () on its argument.

More precisely, the second action is built by

  1. binding line to the value read by the first action,
  2. evaluating the pure functions length (compute length as an integer) and then show (turn the integer into a string),
  3. building the action by applying function putStrLn to the result of show.

At this point, the second action can be executed. If you have typed "Hello", it will print "5".

Note that if you get a value out of an action using the <- notation, you can only use that value inside another action, e.g. you cannot write:

main = do
          line <- getLine
          show (length line) -- Error:
                             -- Expected type: IO ()
                             --   Actual type: String

because show (length line) has type String whereas the do notation requires that an action (getLine of type IO String) be followed by another action (e.g. putStrLn (show (length line)) of type IO ()).

EDIT 2

Jörg W Mittag's definition of referential transparency is more general than mine (I have upvoted his answer). I have used a restricted definition because the example in the question focuses on the return value of functions and I wanted to illustrate this aspect. However, RT in general refers to the meaning of the whole program, including changes to global state and interactions with the environment (IO) caused by evaluating an expression. So, for a correct, general definition, you should refer to that answer.

Related Topic