To answer just your title: yes, neural nets can give non-boolean answers. For example, neural nets have been used to predict stock market values, which is a numeric answer and thus more than just yes/no. Neural nets are also used in handwriting recognition, where the output can be any of a whole range of characters - letters, digits, and punctuation.
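To make the "one output per character" idea concrete, here's a toy sketch (pure Python, made-up scores and labels) of how a net's output layer can express a choice over many classes rather than a single yes/no, by turning raw output-node scores into probabilities:

```python
import math

def softmax(scores):
    """Turn raw output-node scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # shift by max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from an output layer, one node per character class.
labels = ["a", "b", "7", "?"]
scores = [2.0, 0.5, 1.0, -1.0]
probs = softmax(scores)
best = labels[probs.index(max(probs))]
print(best)  # the most likely character given these scores
```

The net's answer is then simply whichever class gets the highest probability, which can be any of dozens of characters rather than a boolean.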
To focus more on your example - recognising animals - I'd say it's possible. It's mostly an extension of the handwriting recognition example; you're recognising features of a shape and comparing them to "ideal" shapes to see which matches. The issues are technical, rather than theoretical. Handwriting, when run through recognition software, is usually mapped down to a set of lines and curves - nice and simple. Animal faces are harder to recognise, so you'd need image processing logic to extract features like eyes, nose, mouth, rough skull outline etc. Still, you only asked if it's possible, not how, so the answer is yes.
Your best bet is probably to take a look at things like Adaptive Resonance Theory (ART). The general principle is that the sensory input (in this case, metrics on the relative size, shape and spacing of the various facial features) is compared to a "prototype" or template which defines that class of thing. If the difference between the sensory input and the remembered template is below a certain threshold (as defined by a "vigilance parameter"), then the object being observed is assumed to be a member of the group represented by the template; if no match can be found, then the system declares it to be a previously unseen type. The nice thing about this sort of net is that once it recognises that an object is, say, a horse, it can learn more about recognising horses - for instance, telling a standing horse from a sleeping horse. And when it sees something new, it can start learning about the new thing until it can say "I don't know what this is, but I know it's the same thing as this other thing I saw previously".
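Here's a heavily simplified sketch of that match-or-create behaviour, loosely in the style of ART-1 (binary features, a vigilance threshold, fast learning by template intersection). It omits the choice function and search cycle of real ART, and the feature vectors are invented for illustration:

```python
def art_classify(inputs, vigilance=0.8):
    """Very simplified ART-1-style clustering of binary feature vectors.

    Each prototype is a remembered template; an input joins the first
    category whose template it matches closely enough (per the vigilance
    parameter), otherwise it founds a new category.
    """
    prototypes = []
    assignments = []
    for vec in inputs:
        placed = False
        for idx, proto in enumerate(prototypes):
            overlap = sum(a & b for a, b in zip(vec, proto))
            if overlap / sum(vec) >= vigilance:   # close enough to the template
                # Fast learning: refine the template toward the shared features.
                prototypes[idx] = [a & b for a, b in zip(vec, proto)]
                assignments.append(idx)
                placed = True
                break
        if not placed:                            # a previously unseen type
            prototypes.append(list(vec))
            assignments.append(len(prototypes) - 1)
    return assignments, prototypes

# Two similar "horse-like" feature vectors and one very different one.
data = [[1, 1, 1, 0, 0], [1, 1, 1, 1, 0], [0, 0, 0, 1, 1]]
labels, protos = art_classify(data, vigilance=0.7)
print(labels)  # the first two share a category; the third starts its own
```

Raising the vigilance parameter makes the categories finer-grained (standing horse vs sleeping horse); lowering it makes them coarser (just "horse").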
EDIT:
(In the interest of full disclosure: I'm still researching this myself for a project, so my knowledge is still incomplete and possibly a little off in places.)
how does this tie in with backpropagation setting weights for one output node ruining the weights for another, previously-trained node?
From what I've read so far, the ART paradigm is slightly different; it's split into two sections - one that learns the inputs, and one that learns the outputs for them. This means that when it comes across an input set that doesn't match, an uncommitted neuron is activated and adjusted to match the input, so that that neuron will trigger a match next time. The neurons in this layer are only for recognition. Once this layer finds a match, the inputs are handed to the layer beneath, which is the one that calculates the response. For your situation, this layer would likely be very simple. The system I'm looking at is learning to drive. This is actually two types of learning; one is learning to drive in a variety of situations, and the other is learning to recognise the situation. For example, you have to learn how to drive on a slippery road, but you also have to learn to feel when the road you're driving on is slippery.
This idea of learning new inputs without ruining previously-learned behaviours is known as the stability/plasticity dilemma. A net needs to be stable enough to keep learned behaviour, but plastic enough that it can be taught new things when circumstances change. This is exactly what ART nets are intended to solve.
Monte-Carlo methods may work for you. In a nutshell: define the boundaries of the results you are looking for, for example energy consumption (derived from measured system parameters) within a given range, then run a simulation in which the free variables are allowed to vary, simulating the physical system. The sets that fall within the solution space will define the acceptable input ranges.
If your search space is manageable (depends on your system constraints), then you can fit curves to these parameters.
Alternatively, you can use a minmax approach if you can define your search space in a way suitable for local minimization.
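A minimal sketch of the Monte-Carlo idea, with an invented two-parameter model standing in for the real system (here "energy" is just voltage times current, and the variable ranges and target band are arbitrary):

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

def monte_carlo_ranges(n_samples=10000):
    """Sample the free variables, keep the sets whose derived result falls
    inside the target band, and report the surviving input ranges."""
    accepted = []
    for _ in range(n_samples):
        voltage = random.uniform(0.0, 10.0)  # free variable 1
        current = random.uniform(0.0, 5.0)   # free variable 2
        energy = voltage * current           # derived quantity (stand-in model)
        if 10.0 <= energy <= 20.0:           # the target solution band
            accepted.append((voltage, current))
    vs = [v for v, _ in accepted]
    cs = [c for _, c in accepted]
    return (min(vs), max(vs)), (min(cs), max(cs))

v_range, c_range = monte_carlo_ranges()
print(v_range, c_range)  # acceptable ranges for each input
```

The reported min/max per variable is the crudest summary of the accepted region; in practice you'd keep the full accepted set (the region can be non-rectangular, as it is here) and fit curves to it as described above.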
There are many differences between these two, but in practical terms, there are three main things to consider: speed, interpretability, and accuracy.
Decision Trees
Neural Nets
You might want to try implementing both and running some experiments on your data to see which is better, and benchmark the running times. Or, you could use something like the Weka GUI toolkit with a representative sample of your data to test-drive both methods.
It may also be that using "bagging" or "boosting" algorithms with decision trees will improve accuracy while maintaining some simplicity and speed. But in short, if speed and interpretability are really important, then trees are probably where to start. Otherwise, it depends and you'll have some empirical exploration to do.
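To give a flavour of what bagging with trees looks like, here's a toy, pure-Python version using one-level decision trees ("stumps") and majority voting; in practice you'd use a library implementation, and the 1-D data and ensemble size here are invented:

```python
import random

random.seed(0)

def train_stump(points, labels):
    """Fit a one-level decision tree (a stump): pick the threshold and
    direction with the fewest mistakes on this sample."""
    best = (None, None, len(points) + 1)
    for thr in sorted(set(points)):
        for sign in (1, -1):
            preds = [sign if p >= thr else -sign for p in points]
            errs = sum(p != y for p, y in zip(preds, labels))
            if errs < best[2]:
                best = (thr, sign, errs)
    thr, sign, _ = best
    return lambda x: sign if x >= thr else -sign

def bagged_predict(stumps, x):
    """Majority vote over the ensemble."""
    return 1 if sum(s(x) for s in stumps) >= 0 else -1

# Toy 1-D data: class -1 below 5, class +1 above.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [-1, -1, -1, -1, 1, 1, 1, 1]

stumps = []
for _ in range(11):  # 11 bootstrap rounds
    idx = [random.randrange(len(xs)) for _ in xs]  # resample with replacement
    stumps.append(train_stump([xs[i] for i in idx], [ys[i] for i in idx]))

print([bagged_predict(stumps, x) for x in [2, 8]])
```

Each stump is fast to train and easy to read, and the vote smooths out the quirks of any one bootstrap sample, which is where the accuracy gain comes from.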