For the purposes of this answer I define "purely functional language" to mean a functional language in which functions are referentially transparent, i.e. calling the same function multiple times with the same arguments will always produce the same results. This is, I believe, the usual definition of a purely functional language.
Pure functional programming languages do not allow side effects (and are therefore of little use in practice because any useful program does have side effects, e.g. when it interacts with the external world).
The easiest way to achieve referential transparency would indeed be to disallow side effects and there are indeed languages in which that is the case (mostly domain specific ones). However it is certainly not the only way and most general purpose purely functional languages (Haskell, Clean, ...) do allow side effect.
Also saying that a programming language without side effects is little use in practice isn't really fair I think - certainly not for domain specific languages, but even for general purpose languages, I'd imagine a language can be quite useful without providing side effects. Maybe not for console applications, but I think GUI applications can be nicely implemented without side-effects in, say, the functional reactive paradigm.
Regarding point 1, you can interact with the environment in purely functional languages but you have to explicitly mark the code (functions) that introduces them (e.g. in Haskell by means of monadic types).
That's a bit over simplifying it. Just having a system where side-effecting functions need to be marked as such (similar to const-correctness in C++, but with general side-effects) is not enough to ensure referential transparency. You need to ensure that a program can never call a function multiple times with the same arguments and get different results. You could either do that by making things like readLine
be something that's not a function (that's what Haskell does with the IO monad) or you could make it impossible to call side-effecting functions multiple times with the same argument (that's what Clean does). In the latter case the compiler would ensure that every time you call a side-effecting function, you do so with a fresh argument, and it would reject any program where you pass the same argument to a side-effecting function twice.
Pure functional programming languages do not allow to write a program that maintains state (which makes programming very awkward because in many application you do need state).
Again, a purely functional language might very well disallow mutable state, but it's certainly possible to be pure and still have mutable state, if you implement it in the same way as I described with side-effects above. Really mutable state is just another form of side-effects.
That said, functional programming languages definitely do discourage mutable state - pure ones especially so. And I don't think that that makes programming awkward - quite the opposite. Sometimes (but not all that often) mutable state can't be avoided without losing performance or clarity (which is why languages like Haskell do have facilities for mutable state), but most often it can.
If they are misconceptions, how did they come about?
I think many people simply read "a function must produce the same result when called with the same arguments" and conclude from that that it's not possible to implement something like readLine
or code that maintains mutable state. So they're simply not aware of the "cheats" that purely functional languages can use to introduce these things without breaking referential transparency.
Also mutable state is heavily discourages in functional languages, so it isn't all that much of a leap to assume it's not allowed at all in purely functional ones.
Could you write a (possibly small) code snippet illustrating the Haskell idiomatic way to (1) implement side effects and (2) implement a computation with state?
Here's an application in Pseudo-Haskell that asks the user for a name and greets him. Pseudo-Haskell is a language that I just invented, which has Haskell's IO system, but uses more conventional syntax, more descriptive function names and has no do
-notation (as that would just distract from how exactly the IO monad works):
greet(name) = print("Hello, " ++ name ++ "!")
main = composeMonad(readLine, greet)
The clue here is that readLine
is a value of type IO<String>
and composeMonad
is a function that takes an argument of type IO<T>
(for some type T
) and another argument that is a function which takes an argument of type T
and returns a value of type IO<U>
(for some type U
). print
is a function that takes a string and returns a value of type IO<void>
.
A value of type IO<A>
is a value that "encodes" a given action that produces a value of type A
. composeMonad(m, f)
produces a new IO
value that encodes the action of m
followed by the action of f(x)
, where x
is the value produces by performing the action of m
.
Mutable state would look like this:
counter = mutableVariable(0)
increaseCounter(cnt) =
setIncreasedValue(oldValue) = setValue(cnt, oldValue + 1)
composeMonad(getValue(cnt), setIncreasedValue)
printCounter(cnt) = composeMonad( getValue(cnt), print )
main = composeVoidMonad( increaseCounter(counter), printCounter(counter) )
Here mutableVariable
is a function that takes value of any type T
and produces a MutableVariable<T>
. The function getValue
takes MutableVariable
and returns an IO<T>
that produces its current value. setValue
takes a MutableVariable<T>
and a T
and returns an IO<void>
that sets the value. composeVoidMonad
is the same as composeMonad
except that the first argument is an IO
that does not produce a sensible value and the second argument is another monad, not a function that returns a monad.
In Haskell there's some syntactic sugar, that makes this whole ordeal less painful, but it's still obvious that mutable state is something that language doesn't really want you to do.
Having authored several projects in functional style JavaScript, allow me to share my experience. I will focus on high level considerations, not specific implementation concerns. Remember, there are no silver bullets, do what works best for your specific situation.
Pure Functional vs Functional Style
Consider what you want to gain from this functional refactoring. Do you really want a pure functional implementation, or just some of the benefits a functional programming style can bring, such as referential transparency.
I have found the latter approach, what I call functional style, to mesh well with JavaScript and is usually what I want. It also allows selective non functional concepts to be carefully used in code where they make sense.
Writing code in a more purely functional style is also possible in JavaScript, but is not always the most appropriate solution. And Javascript can never be purely functional.
Functional Style Interfaces
In functional style Javascript, the most important consideration is the boundary between functional and the imperative code. Clearly understanding and defining this boundary is essential.
I organize my code around functional style interfaces. These are collections of logic that behave in a functional style, ranging from individual functions to entire modules. The conceptual separation of interface and implementation is important, a functional style interface may be implemented imperatively, as long as the imperative implementation does not leak across the boundary.
Unfortunately in Javascript, the burden of enforcing functional style interfaces falls entirely on the developer. Also, language features, such as exceptions, make a truly clean separation impossible.
Since you will be refactoring an existing codebase, start by identifying existing logical interfaces in your code that can be cleanly moved to a functional style. A surprising amount of code may already be functional style. Good starting points are small interfaces with few imperative dependencies. Everytime you move over code, existing imperative dependencies are eliminated and more code can be moved over.
Keep iterating, gradually working outwards to push large parts of your project onto the functional side of the boundary, until you are happy with the results. This incremental approach allows testing individual changes and makes identifying and enforcing the boundary easier.
Rethink Algorithms, Data Structures, and Design
Imperative solutions and design patterns often don't make sense in functional style. Especially for data structures, direct ports of imperative code to functional style can be ugly and slow. A simple example: would you rather copy every element of a list for a prepend or cons the element onto the existing list?
Research and evaluate existing functional approaches to problems. Functional programming is more than just a programming technique, it is a different way of thinking about and solving problems. Projects written in the more traditional functional programming languages, such as lisp or haskell, are great resources in this regard.
Understand the Language and Builtins
This is an important part of understanding the functional imperative boundary and will effect interface design. Understand what features can be safely used in functional style code and which can not. Everything from which builtins may throw exceptions and which, like [].push
, mutate objects, to objects being passed by reference and how prototype chains work. A similar understanding of any other libraries you consume in your project is also required.
Such knowledge allows making informed design decisions, like how to handle bad input and exceptions, or what happens if a callback function does something unexpected? Some of this can be enforced in code, but some just requires documentation on the proper usage.
Another point is how strict your implementation should be. Do you really want to try enforcing the correct behavior on maximum call stack errors or when the user redefines some builtin function?
Using Non Functional Concepts Where Appropriate
Functional approaches are not always appropriate, especially in a language like JavaScript. The recursive solution may be elegant until you start getting maximum call stack exceptions for larger inputs, so perhaps an iterative approach would be better. Similarly, I use inheritance for code organization, assignment in constructors, and immutable objects where they make sense.
As long as the functional interfaces are not violated, do what makes sense and what will be easiest to test and maintain.
Example
Here is an example of converting some very simple real code to functional style. I don't have a very good large example of the process, but this small one touches on a number of interesting points.
This code builds a trie from a string. Ill start with some code based on the top section of John Resig's trie-js build-trie module. Many simplifications / formatting changes have been made and my intent is not to comment on the quality of original code (There are much clearer and cleaner ways to build a trie, so why this c style code comes up first in google is interesting. Here's a quick implementation that uses reduce).
I like this example because I don't have to take it all the way to a true functional implementation, just to functional interfaces. Here is the starting point:
var txt = SOME_USER_STRING,
words = txt.replace(/\n/g, "").split(" "),
trie = {};
for (var i = 0, l = words.length; i < l; i++) {
var word = words[i], letters = word.split(""), cur = trie;
for (var j = 0; j < letters.length; j++) {
var letter = letters[j];
if (!cur.hasOwnProperty(letter))
cur[letter] = {};
cur = cur[ letter ];
}
cur['$'] = true;
}
// Output for text = 'a ab f abc abf'
trie = {
"a": {
"$":true,
"b":{
"$":true,
"c":{"$":true}
"f":{"$":true}}},
"f":{"$":true}};
Let's pretend this is the entire program. This program does two things: convert a string to a list of words and build a trie from the list of words. These are the existing interfaces in the program. Here is our program with the interfaces more clearly expressed:
var _get_words = function(txt) {
return txt.replace(/\n/g, "").split(" ");
};
var _build_trie_from_list = function(words, trie) {
for (var i = 0, l = words.length; i < l; i++) {
var word = words[i], letters = word.split(""), cur = trie;
for (var j = 0; j < letters.length; j++) {
var letter = letters[j];
if (!cur.hasOwnProperty(letter))
cur[letter] = {};
cur = cur[ letter ];
}
cur['$'] = true;
}
};
// The 'public' interface
var build_trie = function(txt) {
var words = _get_words(txt), trie = {};
_build_trie_from_list(words, trie);
return trie;
};
Just by defining our interfaces, we can move _get_words
to the functional style side of the boundary. Neither replace
or split
modifies the original string. And our public interface build_trie
is functional style too, even though it interacts with some very imperative code. In fact, this is a fine stopping point in most cases. More code would get overwhelming, so let me overview a few other changes.
First, make all the interfaces functional style. This is trivial in this case, just have _build_trie_from_list
return an object instead of mutating one.
Handling Bad input
Consider what happens if we call build_trie
with an array of characters, build_trie(['a', ' ', 'a', 'b', 'c', ' ', 'f'])
. The caller assumed this would behave in a functional style, but instead it throws an exception when .replace
is called on the array. This may be the intended behavior. Or we could do explicit type checks and make sure the inputs are as we expected. But I prefer duck typing.
Strings are just arrays of characters and arrays are just objects with a length
property and non-negative integers as keys. So if we refactored and wrote new replace
and split
methods that operates on array-like objects of strings, they don't even have to be strings, our code could just do the right thing. (String.prototype.*
will not work here, it converts the output to a string). Duck typing is a totally separate from functional programming, but the larger point is that bad input should always be considered.
Fundamental redesign
There are also more fundamental considerations. Assume that we want to build the trie in functional style as well. In the imperative code, the approach was to construct the trie one word at a time. A direct port would have us copy the entire trie every time we need to make an insertion to avoid mutating objects. Clearly that is not going to work. Instead, the trie could be constructed node by node, from the bottom up, so that once a node is completed, it never has to be touched again. Or another approach entirely may be better, a quick search shows many existing functional implementations of tries.
Hope the example clarified things a bit.
Overall, I find writing functional style code in Javascript is a worthwhile and fun endeavor. Javascript is a surprisingly capable functional language, albeit too verbose in its default state.
Good luck with your project.
A few personal projects written in functional style Javascript:
- parse.js - Parser Combinators
- Nu - Lazy streams
- Atum - Javascript interpreter written in functional style Javascript.
- Khepri - Prototype programming language I use for functional Javascript development. Implemented in functional style Javascript.
Best Answer
It is certainly possible to program in a purely functional style in an imperative language. In fact, if you look at books like Effective Java or Java Concurrency in Practice, much of the advice in those books basically boils down to "don't use mutable state", "don't use side-effects", etc.
However, it may not always be a pleasant experience, and you may not be able to do everything you want to do.
You mentioned C specifically as an example. Unless I'm missing something, the purely functional subset of C is not Turing-complete, because there is no way to express iteration. C does not have Proper Tail Calls, so, you will eventually overflow the stack when trying to express something like iterating over an infinite list. You will need to use a loop for iteration, but loops rely on mutable state and side-effects.
Of course, the standard Turing-tarpit argument applies … you can do functional programming C by implementing a Haskell interpreter in C and then program in Haskell. But that's not how I interpret the OP's question.