Class Design – Can a Class Contain Its Own Class?

Tags: class-design, composition, design-patterns, object-oriented-design, uml

Suppose I have the following class structure:

[Image: UML class diagram of the Forest, Tree, Branch and Leaf classes]

  1. A forest can have any number of trees, but each tree can belong to only one forest. If the forest is deleted, the tree is deleted.

  2. A tree must have at least one branch but can have many more; each branch can belong to only one tree. If the tree is deleted, the branch is deleted.

  3. A branch can have any number of leaves, and each leaf can belong to only one branch. If the branch is deleted, the leaf is deleted.

I believe this is a 'composition' structure, where each child depends on its parent for existence.

How might I represent a structure where each branch can contain other branches, which can contain other branches, and so on and so forth, essentially like this:

[Image: the same class diagram, with a self-referencing relationship added to Branch]

Notice the infinite loop I've added to Branch, suggesting that a branch can contain any number of 'child' branches, which can in turn contain any number of child branches, and so on. This is similar to a real tree, where each branch can carry smaller branches, which can in turn carry smaller branches:

[Image: drawing of a tree whose branches carry smaller branches and leaves]

How might I therefore create a structure where:

  1. Each branch class can contain an infinite (and unknown) number of 'child' branch classes

  2. Each branch can have access to itself and its child branches only (i.e. it cannot see parent data)

  3. Each branch has the ability to make decisions for itself and its child branches only (i.e. the chain of power over subordinates increases as you move up the chain towards the parent)

In other words, a simple hierarchy structure, but with a flexible hierarchy depth.

It is this unknown number of parent/child generations which is causing me difficulty. If I knew that a 'grandparent' branch can only contain 'parent' branches, which in turn can only contain 'child' branches, then I could easily hardcode 3 classes to represent grandparent/parent/child branches.

However since the number of 'generations' is unknown, I can't seem to get my head around how this might be represented in a UML Class Diagram, and later implemented.

How might I best approach this?

Best Answer

Can a class contain its own class?

Yes. Case in point:

public class Foo
{
    public Foo Parent { get; set; }
}

var parent = new Foo();
var child = new Foo() { Parent = parent };

However, a class should not have only constructors that require a parameter of the same class, such as:

public class Foo
{
    public Foo Parent { get; }

    public Foo(Foo parent)
    {
        this.Parent = parent;
    }
}

The language allows it, but you can never create such an object from a real instance, because constructing the argument requires another instance first, which leads to an infinitely recursing chain of constructor calls:

var parent = new Foo(new Foo(new Foo(new Foo(...)))); // to infinity!

This just never ends.

Edit: user1937198 correctly pointed out that you can simply pass in null (barring constraints such as C#'s non-nullable reference types), but that is, in my opinion, off-label usage. It is conceptually contradictory to (a) have a constructor ask for an object, (b) not provide any constructor that doesn't ask for that object, and then (c) not pass an object.

However, if other constructors are available as well, there is no problem. The first object created is simply limited to the constructors that do not take the same class as a parameter:

public class Foo
{
    public Foo Parent { get; }

    public Foo(Foo parent)
    {
        this.Parent = parent;
    }

    public Foo()
    {

    }
}

var parent = new Foo();
var child = new Foo(parent);

As a general rule, there are three ways to model this relationship:

  1. Parent-oriented:
public class Foo
{
    public Foo Parent { get; set; }
}

var parent = new Foo();
var child = new Foo() { Parent = parent };

  2. Child-oriented:
public class Foo
{
    // Initialized so that Children.Add does not throw a NullReferenceException.
    public List<Foo> Children { get; set; } = new List<Foo>();
}

var child = new Foo();
var parent = new Foo();
parent.Children.Add(child);

  3. Both (this is commonly used for navigational properties, such as in Entity Framework):
public class Foo
{
    public Foo Parent { get; set; }
    // Initialized so that Children.Add does not throw a NullReferenceException.
    public List<Foo> Children { get; set; } = new List<Foo>();
}

var parent = new Foo();
var child = new Foo() { Parent = parent };
parent.Children.Add(child);
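
With the option that stores both directions, the two sides of the relationship have to be kept consistent by hand. One possible way to avoid forgetting one side is a small helper that sets both at once. This is only a minimal sketch: the `Node` class and its `AddChild` method are hypothetical names and not part of the examples above, and the sketch does not detect cycles.

```csharp
using System;
using System.Collections.Generic;

// Usage first: top-level statements (C# 9+) must precede type declarations.
var parent = new Node();
var child = new Node();
parent.AddChild(child);

Console.WriteLine(ReferenceEquals(child.Parent, parent)); // True
Console.WriteLine(parent.Children.Count);                 // 1

// Hypothetical class, named Node to keep it apart from the Foo examples.
public class Node
{
    // Only AddChild can change the parent, so both sides stay in sync.
    public Node Parent { get; private set; }

    private readonly List<Node> _children = new List<Node>();
    public IReadOnlyList<Node> Children => _children;

    public void AddChild(Node child)
    {
        if (child == null) throw new ArgumentNullException(nameof(child));
        if (child.Parent != null)
            throw new InvalidOperationException("The child already has a parent.");

        child.Parent = this;
        _children.Add(child);
    }
}
```

Exposing the children as a read-only list is a design choice in this sketch: callers can read the collection but have to go through `AddChild` to modify it.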

How might I therefore create a Branch?

I believe this is a 'composition' structure, where each child depends on its parent for existence.
How might I represent a structure where each branch can contain other branches, which can contain other branches, and so on and so forth

For reference, what you're looking for here is a 'recursive' structure, as opposed to a 'composite' structure.

Note that 'recursive' doesn't inherently mean a self-relationship (i.e. Branch->Branch->Branch->...), it could also be a longer chain without self-relationships (e.g. A->B->C->A->B->C->...), though that distinction is irrelevant for your specific use case.

  1. Each branch class can contain an infinite (and unknown) number of 'child' branch classes

This is describing a collection, e.g. List<Branch>. Other collection types exist but I'm using List<> for simplicity's sake.

  2. Each branch can have access to itself and its child branches only (i.e. it cannot see parent data)

Therefore, a branch needs to contain such a collection of (child) branches:

public class Branch
{
    // Initialized so that child branches can be added right away.
    public List<Branch> Branches { get; set; } = new List<Branch>();
}

This is the child-oriented option from the list above.
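
Because each branch only holds its children, any operation on "a branch and everything below it" can be written as a recursive method, which copes with an unknown depth without extra classes. The following is a minimal sketch using a variant of the `Branch` class whose list is initialized up front. The `Name` property and the `CountDescendants` method are hypothetical additions for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Usage first: top-level statements (C# 9+) must precede type declarations.
var trunk = new Branch { Name = "trunk" };
var limb = new Branch { Name = "limb" };
var twig = new Branch { Name = "twig" };
trunk.Branches.Add(limb);
limb.Branches.Add(twig);

Console.WriteLine(trunk.CountDescendants()); // 2
Console.WriteLine(limb.CountDescendants());  // 1
Console.WriteLine(twig.CountDescendants());  // 0

public class Branch
{
    public string Name { get; set; }
    public List<Branch> Branches { get; } = new List<Branch>();

    // Counts every branch below this one, at any depth.
    public int CountDescendants()
    {
        int count = 0;
        foreach (var child in Branches)
        {
            // One for the child itself, plus everything below it.
            count += 1 + child.CountDescendants();
        }
        return count;
    }
}
```

Note that a branch in this sketch has no reference to its parent, so it can only reach itself and its own subtree.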

  3. Each branch has the ability to make decisions for itself and its child branches only (i.e. the chain of power over subordinates increases as you move up the chain towards the parent)

Point 3 is a logical consequence of point 2, so it is automatically achieved when point 2 is achieved.

I can't seem to get my head around how this might be represented in a UML Class Diagram

Branch-to-branch

Your UML is perfectly fine with regard to the self-relationship. You drew a relationship from Branch to Branch, which does express it.

This kind of relationship is inherently recursive (i.e. what you call "infinite depth"). It's actually impossible to avoid recursion here, because that would require using multiple types to denote branches that can/can't have further children.

Note that while the relationship itself is inherently recursive, it's possible that your business logic might enforce a maximum depth. Maybe not necessary for your case, but it is a general possibility.
However, that sort of depth limit is not visible on a UML diagram, and shouldn't be, as UML diagrams do not contain validation or business logic.
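
If such a limit were needed, one way to enforce it in business logic is to let each branch know its own depth and refuse to add children beyond the limit. This is a sketch under assumptions: the `MaxDepth` value of 2, the `Depth` property, and the `CreateRoot` and `AddBranch` methods are all hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Usage first: top-level statements (C# 9+) must precede type declarations.
var root = Branch.CreateRoot();
var level1 = root.AddBranch();
var level2 = level1.AddBranch();
Console.WriteLine(level2.Depth); // 2

try
{
    level2.AddBranch(); // would be depth 3, which is above MaxDepth
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // Maximum depth of 2 reached.
}

public class Branch
{
    public const int MaxDepth = 2; // assumed limit, purely illustrative

    public int Depth { get; }

    private readonly List<Branch> _branches = new List<Branch>();
    public IReadOnlyList<Branch> Branches => _branches;

    // Private so that depth values can only come from CreateRoot and AddBranch.
    private Branch(int depth)
    {
        Depth = depth;
    }

    public static Branch CreateRoot() => new Branch(0);

    public Branch AddBranch()
    {
        if (Depth >= MaxDepth)
            throw new InvalidOperationException($"Maximum depth of {MaxDepth} reached.");

        var child = new Branch(Depth + 1);
        _branches.Add(child);
        return child;
    }
}
```

The class diagram for this version would look the same as the unlimited one: still a single Branch-to-Branch relationship.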

Tree-to-branch

However, do note that the relationship a branch has with its tree has changed. The "child" branches won't be related to a tree, they'll be related to their parent branch. And the "top" branch will have the relationship to the tree.

This means that your tree-to-branch relationship is now a 0..1 to many relationship.
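
In code, this could look roughly as follows: the tree only holds its top-level branches, and every other branch is reachable only through its parent branch. This is a minimal sketch. The `TopBranches` property and the `AddTopBranch` and `AddBranch` methods are hypothetical names, and the rule that a tree needs at least one branch is not enforced here.

```csharp
using System;
using System.Collections.Generic;

// Usage first: top-level statements (C# 9+) must precede type declarations.
var tree = new Tree();
var top = tree.AddTopBranch();
var child = top.AddBranch();

Console.WriteLine(tree.TopBranches.Count); // 1
Console.WriteLine(top.Branches.Count);     // 1
Console.WriteLine(child.Branches.Count);   // 0

public class Tree
{
    // Only the top-level branches are related to the tree directly.
    private readonly List<Branch> _topBranches = new List<Branch>();
    public IReadOnlyList<Branch> TopBranches => _topBranches;

    public Branch AddTopBranch()
    {
        var branch = new Branch();
        _topBranches.Add(branch);
        return branch;
    }
}

public class Branch
{
    // Child branches are related to their parent branch, not to the tree.
    private readonly List<Branch> _branches = new List<Branch>();
    public IReadOnlyList<Branch> Branches => _branches;

    public Branch AddBranch()
    {
        var branch = new Branch();
        _branches.Add(branch);
        return branch;
    }
}
```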


Complexity vs security

I just want to point out that there are circumstances which can change the outcome.

If only a branch which has no child branches is allowed to have leaves, and branches with child branches are not allowed to, your UML is flawed, as all branches can have leaves.
If this is what you need, then you're going to need to model different types for a "branch-having branch" and a "leaf-having branch".
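
A sketch of what such separate types might look like is shown here. The names `InnerBranch` and `LeafBranch` are hypothetical, as is the empty `Leaf` class. The point is only that the compiler, rather than business logic, then prevents a branch from holding both child branches and leaves.

```csharp
using System;
using System.Collections.Generic;

// Usage first: top-level statements (C# 9+) must precede type declarations.
var inner = new InnerBranch();
var outer = new LeafBranch();
inner.Branches.Add(outer);
outer.Leaves.Add(new Leaf());

Console.WriteLine(inner.Branches.Count); // 1
Console.WriteLine(outer.Leaves.Count);   // 1
// inner.Leaves and outer.Branches do not exist, so mixing them cannot compile.

public class Leaf { }

// Common base type, so that an InnerBranch can hold either kind of branch.
public abstract class Branch { }

// A "branch-having branch": child branches only, no leaves.
public sealed class InnerBranch : Branch
{
    public List<Branch> Branches { get; } = new List<Branch>();
}

// A "leaf-having branch": leaves only, no child branches.
public sealed class LeafBranch : Branch
{
    public List<Leaf> Leaves { get; } = new List<Leaf>();
}
```

The structure stays recursive because `InnerBranch` holds a list of the base type `Branch`, which may again contain `InnerBranch` instances.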

Note: I'm aware that in your drawing of a tree you drew leaves on all kinds of branches, but I also suspect that your tree scenario is an example and not indicative of what you may be trying to build in reality.

Similarly, your UML is relying on your business logic ensuring that only the ancestor branch is connected to the tree, not any other branch. If you wish to model this more securely, then you would also need to distinguish between a "tree-related branch" and a "branch-related branch".

However, there is some degree of freedom here. You could model that in your UML, but it will complicate the diagram. You could also stick to simple modeling with only a Branch type which could all be related to a tree, child branches, and leaves; and then rely on your business logic to ensure that you build your graph the right way.

This very much depends on whether you're willing to take on the added model complexity in order to outright prevent "bad" graphs, which in turn depends on how costly an error in your application would be.

  • For example, if it could kill people by malfunctioning, prioritize security (enforcing it through the diagram) over simplicity (relying on business logic).
  • On the other hand, if this is a simple app that e.g. tracks cooking recipes, at the very worst you're going to get some dirty data that needs cleaning, which is a negligible issue and is not worth the headaches that such a complex UML diagram would cause.