What you describe with FooBuilder and Foo is actually the well-established Builder pattern. The other approach you mention, creating a Foo instance based on an existing instance, is called the Prototype pattern. Both are well-known object-creation alternatives, and several books (most notably "Design Patterns", a.k.a. the GoF book) have been written describing them. (I'm not sure I follow your second, mutable example, so I'm comparing #1 to #3 here.)
Each pattern obviously has its own advantages and drawbacks, and only you can decide which one is better for your specific situation. I typically use builders when there are many different ways of initializing Foo. For example, right now I'm working on a query builder class, and clients use it in different ways to build query objects: different clients want different fields returned, some want only the last 10 rows, some want sorting while others don't. A builder is perfect here because it can look something like this:
qb = QueryBuilder()
query = qb.fields("name", "address")
          .sort("zip", "asc")
          .limit(10)
          .build()
So multiple setters, each returning the builder instance itself so you can daisy-chain calls, basically serve as a very flexible constructor. Making one class a friend of another class has its time and place, and this could potentially be one of those places.
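The daisy-chaining idea above can be sketched in C++ roughly like this; the `Query` and `QueryBuilder` names and fields are illustrative, not from any real library:

```cpp
#include <string>
#include <vector>

// Illustrative result type the builder produces.
struct Query {
    std::vector<std::string> fields;
    std::string sortField, sortOrder;
    int rowLimit = -1;  // -1 means "no limit"
};

class QueryBuilder {
public:
    // Each setter returns *this by reference, which is what makes
    // chaining calls possible.
    QueryBuilder& field(const std::string& name) {
        q_.fields.push_back(name);
        return *this;
    }
    QueryBuilder& sort(const std::string& field, const std::string& order) {
        q_.sortField = field;
        q_.sortOrder = order;
        return *this;
    }
    QueryBuilder& limit(int n) {
        q_.rowLimit = n;
        return *this;
    }
    Query build() const { return q_; }

private:
    Query q_;
};
```

A client can then write `QueryBuilder().field("name").field("address").sort("zip", "asc").limit(10).build()` and simply skip the setters it doesn't need.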
Alternatively, you could give Foo one constructor that takes a rather complex set of params, or you could define another class, FooData, that FooBuilder initializes and passes into Foo's creation. Then again, maybe Foo and FooBuilder being friends isn't such a bad thing in this case. Just keep in mind that when one class is a friend of another, you are expanding the encapsulation boundary, which can be just as bad as piling more and more code into a single class (i.e., more code knows about the internals).
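A minimal sketch of the FooData alternative, with hypothetical members, might look like this: the builder fills a plain data struct and Foo's constructor consumes it, so no friendship is needed at all.

```cpp
#include <string>

// Plain data carrier; members here are made up for illustration.
struct FooData {
    std::string name;
    int size = 0;
};

class Foo {
public:
    explicit Foo(const FooData& d) : data_(d) {}
    const std::string& name() const { return data_.name; }
    int size() const { return data_.size; }

private:
    FooData data_;
};

class FooBuilder {
public:
    FooBuilder& name(const std::string& n) { d_.name = n; return *this; }
    FooBuilder& size(int s) { d_.size = s; return *this; }
    // Only Foo's public constructor is used, so FooBuilder never needs
    // to be a friend of Foo.
    Foo build() const { return Foo(d_); }

private:
    FooData d_;
};
```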
At the end of the day, you can weigh all these different options and it may still boil down to 60/40. If there's no clear winner, you can always let a coin pick and just go with it. Most likely either design choice would work just fine, and by going through the motions and seeing your code in action you will learn a valuable lesson for the future. Sometimes, when I can't decide between design A and design B, I pick B, and if I later come across a similar situation I intentionally pick the other choice. The work gets done either way and the software does what I want, but you get a ton of benefit from actually seeing your decisions in action and being able to compare two working approaches, rather than just discussing and thinking about them.
I hope it's ok to answer my own question.
I believe I have found the optimal (without overcomplicating the problem) data structure for my problem. There was at least minor idiocy on my part for not recognising this earlier. The data doesn't need to be accessed by (x, y, z) but instead by (x, y, range of z (say 0 to 3)). This gives a C++ struct as follows:
struct node {
    struct node *next;
    int zGroup;
    int z;
    // ~50 bytes of misc data
};
I can then address this through a 3D dynamic array (vectors):
vector< vector< vector<node*> > > Data;
Any given Data[x][y][zGroup] points to the first element of a linked list, the entirety of which is needed every time any one element of it is needed. No entry of this array is NULL; every one contains a linked list of at least one element.
The third dimension of the array, zGroup, is jagged, but with dynamic arrays this isn't an issue. Given the data and the computations performed on it, I know that the maximum x and y values are set when the file is read and do not change; neither does the number of z-groups on any given (x, y) line. The actual z-values of nodes may change, but they remain inside the same z-groups, giving a constant-sized, fully populated array.
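A small sketch of how this structure is traversed, with made-up dimensions: `Data[x][y][zGroup]` resolves to the head of a linked list, which is then walked in full.

```cpp
#include <vector>

struct node {
    node* next;
    int zGroup;
    int z;
    char misc[50];  // stand-in for the ~50 bytes of misc data
};

// Walk the whole list at a given (x, y, zGroup) cell and count its
// nodes; any per-node work would go in the loop body.
int countNodesAt(const std::vector<std::vector<std::vector<node*>>>& Data,
                 int x, int y, int zGroup) {
    int count = 0;
    for (node* p = Data[x][y][zGroup]; p != nullptr; p = p->next)
        ++count;  // visit every element of this z-group's list
    return count;
}
```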
With the way the file is structured, it is also easy enough to page it in and out of memory if much larger data sets ever force me to.
Best Answer
Allowing iteration without leaking the internals is exactly what the iterator pattern promises. Of course, that is mainly theory, so here is a practical example:
You provide standard begin and end methods, just like the sequences in the STL, and implement them simply by forwarding to the vector's methods. This does leak some implementation detail, namely that you're returning a vector iterator, but no sane client should ever depend on that, so it is IMO not a concern. I've shown all the overloads here, but of course you can start by providing just the const versions if clients should not be able to change any People entries. Using the standard naming has benefits: anyone reading the code immediately knows it provides 'standard' iteration and as such works with all common algorithms, range-based for loops, etc.
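The forwarding idea described above can be sketched like this; the People/Person names are taken from the discussion, and the member layout is an assumption:

```cpp
#include <string>
#include <vector>

struct Person {
    std::string name;
};

class People {
public:
    void add(Person p) { people_.push_back(std::move(p)); }

    // Forward to the underlying vector. Clients get 'standard'
    // iteration (range-based for, <algorithm>, etc.) without the
    // vector itself being exposed as part of the interface.
    std::vector<Person>::iterator begin() { return people_.begin(); }
    std::vector<Person>::iterator end() { return people_.end(); }
    std::vector<Person>::const_iterator begin() const { return people_.begin(); }
    std::vector<Person>::const_iterator end() const { return people_.end(); }

private:
    std::vector<Person> people_;
};
```

With this in place, `for (const Person& p : people)` just works, and so does anything from `<algorithm>` that takes an iterator pair.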