Your problem is likely a logic level mismatch. Since your RAM IC is powered with 3.3V, its output HIGH level on the SO pin can't be larger than 3.3V. Your microcontroller, however, seems to run at 5V (judging by the schematic). I haven't consulted the AVR datasheets, but most CMOS chips are only guaranteed to recognize a HIGH level when the voltage on their inputs is above 0.7Vcc. That threshold is 0.7*5V = 3.5V, so the microcontroller has every right not to recognize 3.3V as HIGH. This may well explain why you're reading all zeros from your RAM.
Note that the maximum input HIGH level your RAM tolerates is Vcc+0.3V (table 1.1), which in your case equals 3.6V. That's much smaller than the 5V your MCU outputs, so you might have cooked your RAM already. The difference is certainly enough to bias the protection diodes into conduction. Some 3.3V ICs have 5V tolerant inputs; this one does not.
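The two checks above come down to a few lines of arithmetic. The sketch below is only an illustration: the 0.7*Vcc factor is the rule of thumb used in this answer rather than a figure taken from the AVR datasheet, and the function and variable names are made up for the example.

```python
# Rule-of-thumb logic-level check. The names and the 0.7 factor are
# assumptions for illustration, not datasheet values for a specific part.

def min_input_high(vcc, factor=0.7):
    """Lowest input voltage a typical CMOS input is guaranteed to read as HIGH."""
    return factor * vcc

def max_input_voltage(vcc, margin=0.3):
    """Highest input voltage allowed by a 'Vcc + 0.3 V' style limit."""
    return vcc + margin

MCU_VCC = 5.0  # microcontroller supply, volts
RAM_VCC = 3.3  # RAM supply, volts

# RAM -> MCU: the RAM cannot drive SO above its own supply.
ram_to_mcu_ok = RAM_VCC >= min_input_high(MCU_VCC)

# MCU -> RAM: the MCU can drive its outputs up to its own supply.
mcu_to_ram_ok = MCU_VCC <= max_input_voltage(RAM_VCC)

print(f"MCU needs at least {min_input_high(MCU_VCC):.2f} V for HIGH, "
      f"RAM delivers at most {RAM_VCC:.2f} V: ok={ram_to_mcu_ok}")
print(f"RAM tolerates at most {max_input_voltage(RAM_VCC):.2f} V, "
      f"MCU drives up to {MCU_VCC:.2f} V: ok={mcu_to_ram_ok}")
```

Both checks fail with these supplies, which is the two-sided mismatch described above.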
Your options are:
Run your MCU at 3.3V. This is often best if your MCU supports that voltage (I think it does) as it avoids the need for level translation completely. Unfortunately this hardly seems possible within the confines of your platform.
Increase Vcc of your RAM (datasheet says it can tolerate up to 3.6V). However, this is going to be unreliable since the threshold is only marginally exceeded, so any voltage variations (and even internal variations between individual chips) may well lead to problems. This also doesn't solve the input level mismatch, so half of the problem remains.
Do logic level translation in both directions. Many solutions to this problem exist; search the web (including this website) to get the picture. In short, it's easier to lower the voltage from 5V to 3.3V (a divider is enough, perhaps with a diode) than to do the inverse (you'll need a FET or an IC as a buffer). This appnote shows some circuits, as does this page. See also this product from Sparkfun for a similar idea. HEF4050BT is a buffer IC with absolute input thresholds (HIGH is 3V) and can tolerate up to 15V on its inputs. I'm sure more possibilities exist, but I personally believe it's best to power your microcontrollers with 3.3V and save the trouble. Many useful ICs aren't 5V compatible (or tolerant) these days.
As far as I can tell, you're doing everything right.
If this were sitting on my workbench, here's the next thing I would do:
On production units, I would make the resistor on MISO a pull-up, rather than a pull-down.
Page 37 of the datasheet only guarantees 100 uA on "MISO out high", but over 10 times that current on "MISO out low".
The 1.6 mA on "MISO out low" is enough to (dimly) light a high-efficiency LED with an appropriate pull-up resistor to +3.3 V.
I find adding LEDs to every questionable signal helps me find problems faster.
The 100 uA on "MISO out high" means you shouldn't expect the flash chip to pull MISO high against a pull-down of 33 kOhm or less (3.3 V / 100 uA = 33 kOhm).
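The resistor limits implied by those two currents are plain Ohm's law. Here is a rough sketch of the arithmetic; it treats the output as swinging all the way to the rail, and the LED forward voltage is an assumed typical value for a red LED, not something from the flash datasheet. The function names are made up for the example.

```python
# Ohm's-law estimates for the MISO bias resistor. The current limits are the
# ones quoted above; LED_VF is an ASSUMPTION (typical red LED).

VCC = 3.3               # flash supply, volts
I_SOURCE_MAX = 100e-6   # guaranteed "MISO out high" current, amps
I_SINK_MAX = 1.6e-3     # guaranteed "MISO out low" current, amps
LED_VF = 1.8            # ASSUMPTION: LED forward voltage, volts

def min_pull_down(vcc, i_source):
    """Smallest pull-down to GND that stays within the source-current limit."""
    return vcc / i_source

def min_led_pull_up(vcc, led_vf, i_sink):
    """Smallest resistor in series with an LED to Vcc within the sink-current limit."""
    return (vcc - led_vf) / i_sink

print(f"pull-down should be above {min_pull_down(VCC, I_SOURCE_MAX) / 1e3:.1f} kOhm")
print(f"LED pull-up resistor should be above {min_led_pull_up(VCC, LED_VF, I_SINK_MAX):.1f} Ohm")
```

With a different LED colour the forward voltage changes, so the second figure should be recomputed for the actual part.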
On my test jig (but not on production units), I would temporarily change the resistor on MISO so that it weakly pulls MISO to around 1.5 V -- that helps distinguish between high (near 3.3 V), low (near 0 V), and tristate (near 1.5 V).
I would re-run the test and make sure the only thing connected to MISO is the o'scope (or logic analyzer) probe and that bias resistor -- not even the PIC connected -- to rule out the possibility that the PIC is somehow accidentally driving MISO to GND.
I would make a custom test program on the PIC that does nothing but select the flash chip, issue a READ IDENTIFICATION, read 20 bytes, deselect the flash chip, and then repeat forever. (It looks like maybe you've already done this.)
It's theoretically possible that a PIC chip could be damaged just enough that it gives signals barely strong enough for the logic analyzer to distinguish "0" from "1", but not quite strong enough for the flash chip to distinguish them.
So I might check the voltages by (a) tweaking the custom test program so it runs the CLK at 1 Hz, so I can check each line on the flash chip with a voltmeter, or (b) running the test program at a more typical speed -- 500 kHz or 10 MHz should work fine -- and checking each pin with an actual o'scope (not just a logic analyzer).
It's pretty easy to damage a flash chip so that it looks fine under visual inspection but will never work again (always tri-state, or always outputs 0).
Perhaps swap the flash chip with an "identical" M25P16 chip on something like the JeeLink and see if the problem follows the flash chip or stays with the PIC chip.
Perhaps re-build the entire circuit with fresh wires, a fresh PIC chip, and a fresh flash chip, and swap the chips around to see if the problem follows the PIC chip, follows the flash chip, or follows the prototype wires.
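While the READ IDENTIFICATION loop described above is running, the bytes captured by the logic analyzer can be sorted into a few cases on the PC. The following is a minimal host-side sketch, not firmware for the PIC. The expected ID bytes (0x20, 0x20, 0x15) are my assumption for an M25P16 and should be checked against the datasheet; the function name is hypothetical.

```python
# Host-side triage of bytes captured after a READ IDENTIFICATION command.
# EXPECTED_ID is an ASSUMPTION to verify against the flash datasheet.

EXPECTED_ID = bytes([0x20, 0x20, 0x15])  # assumed manufacturer, type, capacity

def classify_rdid(response, expected=EXPECTED_ID):
    """Return a short diagnosis string for a captured RDID response."""
    data = bytes(response)
    if not data:
        return "no data captured"
    if data[:len(expected)] == expected:
        return "ID matches: flash is answering"
    if all(b == 0x00 for b in data):
        return "all 0x00: MISO stuck low, or tri-stated with a pull-down"
    if all(b == 0xFF for b in data):
        return "all 0xFF: MISO stuck high, or tri-stated with a pull-up"
    return "unexpected pattern: check wiring, SPI mode and signal levels"

print(classify_rdid([0x00] * 20))
print(classify_rdid([0x20, 0x20, 0x15] + [0x00] * 17))
```

The all-zeros and all-ones cases are ambiguous on their own, which is why biasing MISO to mid-rail on the test jig is useful: a tri-stated pin then shows up on the o'scope as neither.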
Best Answer
Assuming you don't plan to use the chip in anything but standard SPI (not Dual- or QSPI), then all the signal pins are unidirectional!
That means that in the Arduino->Flash direction, a voltage divider that divides 5V down to 3.3V is sufficient (usually placed close to the receiving end). This applies to the clock and to the master-out, slave-in (MOSI) data pin (called "Data Input" in the flash datasheet).
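As a rough check of such a divider, the output is Vin * R2 / (R1 + R2), with R1 on the 5V side and R2 to ground. The resistor values below (1.8 kOhm and 3.3 kOhm) are just one assumed pair of standard values, not a recommendation from the flash datasheet, and the function name is made up for the example.

```python
# Unloaded resistive divider: R1 from the 5 V signal to the flash pin,
# R2 from the flash pin to ground. Resistor values are ASSUMED examples.

def divider_out(v_in, r_top, r_bottom):
    """Output voltage of an unloaded two-resistor divider."""
    return v_in * r_bottom / (r_top + r_bottom)

V_IN = 5.0    # Arduino output HIGH level, volts
R1 = 1800.0   # top resistor, ohms (assumed)
R2 = 3300.0   # bottom resistor, ohms (assumed)

v_out = divider_out(V_IN, R1, R2)
print(f"divider output: {v_out:.2f} V")
```

Lower resistor values give faster edges at the cost of more current drawn from the Arduino pin, so the pair should be chosen with the SPI clock rate in mind.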
In the opposite direction (MISO / Data Output), there's little you can do to ensure reliable transmission other than adding a buffer. A buffer doesn't have to be large: a dual-NPN package with two resistors would totally do, and that would take single-digit square millimeters in SMD.
You might simply want to invest 33ct into something like a 74LVC1T45. That thing is maybe 1.7×1mm² in size even in its largest variant; I doubt that will be a limiting factor for your circuitry.