Only 2-in NANDs?
I see no reason why you have to waste two NANDs to invert the same signal twice, if that was part of your question.
But I'm not entirely sure about what question you are asking.
The way I approached your problem was to lay out a table:
$$\begin{align*}
A_3~~A_2~~~~~~~~~~~\\
\begin{array}{c|ccccc}
~~A_1~A_0~~ & & 00 & 01 & 10 & 11 \\ \hline \\
00 & & 1 & 1 & 1 & 1 \\
01 & & 0 & 1 & 0 & 0 \\
10 & & 1 & 0 & 1 & 1 \\
11 & & 0 & 0 & 0 & 0
\end{array}\end{align*}$$
From this, it was pretty easy to see that three of the columns were identical and could be replaced by \$\overline{A_0}\$ and that the remaining column was just \$\overline{A_1}\$. So the new table became:
$$\begin{align*}
A_3~~A_2~~~~~~~~~~~\\
\begin{array}{c|ccccc}
& & 00 & 01 & 10 & 11 \\ \hline \\
& & \overline{A_0} & \overline{A_1} & \overline{A_0} & \overline{A_0}
\end{array}\end{align*}$$
The result is the following:
(Clearly, you will have to replace the inverters with NAND gates. So there are a total of 7 of them.)
The first inverter and NAND gate on the left, accepting \$A_2\$ and \$A_3\$, provides an active LOW to indicate when \$A_3~A_2=01\$. If it is LOW, then this fact disables the NAND gate that \$A_0\$ goes into, at the bottom. But it enables the NAND gate where \$A_1\$ arrives. Either one of these are then combined (and finally inverted) to get the desired output.
Perhaps someone else might try their hand at it. But that's how I may have approached it.
You can also go an entirely algebraic approach. You say you know DeMorgan's. So you can play around with the expression you have in order to construct a series of either \$\overline{A~B}\$ or else \$\overline{A}+\overline{B}\$ identifiable terms in your expression, making adjustments as you go when you see something that isn't of that basic form.
I chose to try a different approach.
I'm not holding myself out as an expert in this, though. Perhaps one such will enter in and provide a more thorough and steel-minded, rigorous approach for you. I might learn from that, as well.
Best Answer
General approach for \$m\$ bits
The general solution for a test for division by 3 is to sum up the even-numbered bits and separately sum up the odd-numbered bits, take the difference between these sums, and then see if the difference itself is divisible by 3. (There are a variety of approaches for this operation, but the one encountered first is usually via carry-save adders.)
For a binary value with \$m\$ bits, where \$m\$ is even, the difference will require at most \$\lceil\operatorname{ln}_2\frac{m}{2}\rceil\$ bits. For a binary value with \$m\$ bits, where \$m\$ is odd, the difference will require at most \$\lceil\operatorname{ln}_2\frac{m+1}{2}\rceil\$ bits. This difference result could itself then be submitted to a much smaller tier for, once again, computing the difference between the sums of even and odd numbered bits. (And repeat.)
Specific case where \$m=4\$
At this point it is pretty easy to see that the even and odd sums can be computed using a simple half-adder, each. The resulting table is:
$$ \begin{smallmatrix}\begin{array}{r|cccc} &\overline{C_\text{odd}}\:\overline{S_\text{odd}}&\overline{C_\text{odd}}\:S_\text{odd}&C_\text{odd}\:\overline{S_\text{odd}}\\ \hline \overline{C_\text{even}}\:\overline{S_\text{even}}&Y&N&N\\ \overline{C_\text{even}}\:S_\text{even}&N&Y&N\\ C_\text{even}\:\overline{S_\text{even}}&N&N&Y \end{array}\end{smallmatrix} $$
In this case, there is no need to worry about "divisibility by 3" of the difference. Instead, it's sufficient to compare the two sums for "equal," as shown in the above table.
This should be very easy to implement:
simulate this circuit – Schematic created using CircuitLab
The half-adders are easily recognized above. In addition, their associated outputs are directly compared using a pair of XORs. The results of these two comparisons are then considered using a NOR for the final output.