Machine Learning – Decision Trees vs Neural Networks

data structures, machine learning

I'm implementing a machine learning model to predict fraud in financial systems such as banks. This means there is a lot of different data that can be used to train the model, e.g. card number, card holder name, amount, country, etc.

I'm having trouble deciding which structure is best for this problem. I have some experience with decision trees, but I have started to wonder whether a neural network would be better for this kind of problem. If some other method would be better still, please feel free to enlighten me.

What are the pros and cons of each structure, and which would be best for this problem?

I'm not sure about this, but I think decision trees have a big advantage over neural networks in terms of execution speed. This matters because speed is also a key factor in this project.

Best Answer

There are many differences between these two, but in practical terms, there are three main things to consider: speed, interpretability, and accuracy.

Decision Trees

  • Should be faster once trained (although both algorithms can train slowly depending on the exact algorithm and the amount/dimensionality of the data). This is because a decision tree inherently "throws away" the input features that it doesn't find useful and only tests the features on one root-to-leaf path per prediction, whereas a neural net will use them all unless you do some feature selection as a pre-processing step.
  • If it is important to understand what the model is doing, trees are very interpretable.
  • Can only model functions built from axis-parallel splits of the data, which may not match the true decision boundary.
  • You will probably want to prune the tree to avoid over-fitting.
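
To make the speed and interpretability points concrete, here is a minimal sketch of how a trained tree classifies one transaction. The tree itself, the feature names (amount, country_risk, hour, merchant_id) and the thresholds are all made up for illustration; a real tree would be learned from your data.

```python
# Hypothetical hand-built tree: each internal node tests ONE feature
# against a threshold (an axis-parallel split); leaves hold a label.
TREE = {
    "feature": "amount", "threshold": 1000.0,
    "left": {"label": 0},                      # amount <= 1000 -> not fraud
    "right": {
        "feature": "country_risk", "threshold": 0.5,
        "left": {"label": 0},                  # low-risk country -> not fraud
        "right": {"label": 1},                 # large amount + risky country -> fraud
    },
}

def predict(tree, row):
    """Walk one root-to-leaf path; return (label, features actually read)."""
    node = tree
    features_read = []
    while "label" not in node:
        name = node["feature"]
        features_read.append(name)
        if row[name] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"], features_read

# The row carries four features, but this prediction only reads one of them.
row = {"amount": 250.0, "country_risk": 0.9, "hour": 3, "merchant_id": 42}
label, used = predict(TREE, row)
print(label, used)
```

The list of features read doubles as an explanation of the decision, which is where the interpretability comes from.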

Neural Nets

  • Slower (both for training and classification), and less interpretable.
  • If your data arrives in a stream, you can do incremental updates with stochastic gradient descent (unlike decision trees, which use inherently batch-learning algorithms).
  • Can model more arbitrary functions (nonlinear interactions, etc.) and therefore might be more accurate, provided there is enough training data. But they can be prone to over-fitting as well.
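
As a rough illustration of the incremental-update point, the sketch below trains a single logistic unit (the simplest possible "network", with no hidden layer) one example at a time with stochastic gradient descent. The class name, learning rate and toy stream are assumptions made for the example, not anything from a particular library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class OnlineLogisticUnit:
    """One logistic unit updated per example (hypothetical helper class)."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return sigmoid(z)

    def update(self, x, y):
        """One SGD step on the log-loss for a single (x, y) pair."""
        error = self.predict_proba(x) - y      # d(log-loss)/dz
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error

# Toy stream: feature 0 signals fraud (y=1), feature 1 signals legitimate (y=0).
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 200

model = OnlineLogisticUnit(n_features=2)
for x, y in stream:          # each example is seen once, then discarded
    model.update(x, y)

print(model.predict_proba([1.0, 0.0]), model.predict_proba([0.0, 1.0]))
```

Nothing here needs the full data set in memory, which is the practical difference from a batch-trained tree.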

You might want to try implementing both and running some experiments on your data to see which is better, and benchmark running times. Or, you could use something like the Weka GUI toolkit with a representative sample of your data to test drive both methods.
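
For the running-time part, a small timing harness along these lines is usually enough. The two predict functions here are stand-ins I made up (one threshold test versus a weighted sum over every feature); you would swap in your real trained models.

```python
import time

def seconds_per_prediction(predict_fn, rows, repeats=5):
    """Best-of-`repeats` average wall-clock seconds for one prediction."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for row in rows:
            predict_fn(row)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(rows))
    return best

# Stand-in models (assumptions for illustration only).
def tree_like(row):
    return 1 if row[0] > 1000.0 else 0         # reads a single feature

WEIGHTS = [0.001, -0.5, 0.2]

def net_like(row):
    z = sum(w * x for w, x in zip(WEIGHTS, row))   # reads every feature
    return 1 if z > 0 else 0

rows = [[float(i), 0.3, 1.0] for i in range(2000)]
timings = {
    name: seconds_per_prediction(fn, rows)
    for name, fn in [("tree_like", tree_like), ("net_like", net_like)]
}
for name, secs in timings.items():
    print(f"{name}: {secs * 1e6:.3f} microseconds per prediction")
```

Taking the best of several repeats reduces noise from whatever else the machine is doing; measure on hardware and batch sizes close to production.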

It may also be that using "bagging" or "boosting" algorithms with decision trees will improve accuracy while maintaining some simplicity and speed. But in short, if speed and interpretability are really important, then trees are probably the place to start. Otherwise, it depends, and you'll have some empirical exploration to do.
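
To show what bagging amounts to, here is a minimal sketch that uses depth-1 trees (decision stumps) as the base model: fit each stump on a bootstrap sample, then take a majority vote. The function names and the toy data (amount, hour) are assumptions for illustration; real bagged ensembles typically use deeper trees.

```python
import random
from collections import Counter

def fit_stump(rows, labels):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls on the right, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]

def stump_predict(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_bagged(rows, labels, n_models=25, seed=0):
    """Fit each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models

def bagged_predict(models, row):
    """Majority vote over all stumps."""
    votes = Counter(stump_predict(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Toy data: [amount, hour]; large amounts are labelled fraud (1).
rows = [[10, 3], [20, 14], [30, 9], [40, 22], [500, 4], [600, 13], [700, 8], [800, 21]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

models = fit_bagged(rows, labels)
print(bagged_predict(models, [5, 12]), bagged_predict(models, [750, 12]))
```

Each stump is still a single threshold test, so prediction stays cheap; the cost is that the ensemble as a whole is harder to read than one tree.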
