It kind of depends on how much data you will be moving between these two external subnets. If you allow the HP to route directly between those subnets, you can have as many 1GB streams between them as you have ports configured for them. With "router-on-a-stick" (I've always called it vlan-on-a-stick, but same concept), you would be limited to just 1GB in total throughput between the vlans (leaving out the possibility of doing an lacp trunk between the SonicWALL and the HP).
In doing this method, the third vlan would be considered a "transit network", and would make it easier down the road as your network grows to implement a dynamic routing protocol, or to add more routers into the network, if the need ever arises.
The HP switch would be acting as your layer 3 core, and you would have an IP address in each of the 3 vlans. The SonicWALL would need only an access port to the transit network, and it's own IP on that network.
From there, a default route statement in the HP pointing to the SonicWALL's transit net ip address, and two static routes in the SonicWALL (one for each of your 'external' subnets) pointing back at the HP's transit net IP.
The easy button is to simply run a vlan trunk to the SonicWALL, and put an address on each of the vlans you want to route for. I've done it this way in the past, and if you don't plan on heavy traffic, it's perfectly viable, and pretty easy to configure.
If you could post some of your route statements in your attempts at setting up the transit net, I'm sure someone could help you get that straightened out.
"Nesting" (overlapping networks) requires proxy-arp and therefore SHOULD be avoided at all costs. No enterprise router will allow such a broken configuration -- each interface/subnet must be completely independent, which means out in the real world, where real IP addresses are routed, this method of "conservation" cannot be used. (aka: nonsense) [*]
It SHOULD not be attempted by anyone not thoroughly versed in networking. (i.e. if you haven't been designing, setting up, and maintaining large, complex networks for a decade or more, you shouldn't even be talking about this type of damage.)
(Full disclosure)
I'm doing this exact thing in an OpenStack development environment right now. 192.168.xx.0/24 has a /29 behind one of the machines in the larger /24. That machine has to have a number of specific, non-default setting changed to pretend to be hosts within the /29 slice. (aka proxy-arp) Yes, I can add a route for the /29 on the router, but the machines in the /24 still won't be able to talk to the /29 because their larger netmask means they're link-local; I'd have to add that /29 route to all the machines in the /24 for them to work.
All-0 and All-1
Those concepts haven't had any tangible meaning in modern networking for decades. Nothing you're likely to run into on the internet makes any assumptions about network size -- everything is classless now. Yes, there used to be issues using an all-0 (or 1) subnet -- say 199.72.0.0/24 (the first subnet from 199.72.0.0/16) (true story) -- because some random system on the internet (AIX) applied class logic to the range. Nothing does that today. So, with 199.72.0.0/16, the address range is 0.0 to 255.255 -- with the those too addresses being the /16's network and broadcast addresses. Those are always the /16's network and broadcast, even if a /24 were nested with it somewhere.
The active netmask ALWAYS defines the network and broadcast. Yes, that means a nested construct has multiple broadcast addresses, but due to different netmasks, nodes within different zones (sub-network, parent-network, ...) listen to different addresses. At layer-2 (ethernet), all hosts in the same domain (eg. vlan) see the same broadcasts but the host will filter out, at layer-3, the "foreign" broadcasts, unless they're sent to the "all nodes" broadcast address of 255.255.255.255.
[*] ISPs wanting to conserve space like this do it via bridging, but that has it's own problems.
[*] I warned my idiot ("we know more than you") coworkers not to use 199.72.0.0/24, but they did it anyway -- putting the webdev desktops in 0.0/25. A day later came the "What. Did. I. Tell. You." after complaints from every single person about random places on the internet they simply couldn't get. That was in 1997.
Best Answer
There's no solution which satisfies all requirements, your answer of making 8 /27's is probably the most logical one.
You can easily verify this by trying to divide 256 by 7, you can't do this without a remainder. Also, all subnets need to have a size which is a power of 2, and 7 * 32 (a /27) = 252, 8 * 16 (a /26) = 512, so there's no way to do this without unused space.