I've worked on a project to implement the Kalman filter on an embedded system that was similar in hardware to the iNemo unit from STMicroelectronics.
Even if you can find one of these IMUs (Inertial Measurement Units), there is about a 90% chance that you will have to implement the algorithm yourself, unless you are lucky enough to find someone who already has the code. The problem is that this filter requires a lot of computation: in our best experiment (using fixed-point variables and trying to optimize the code) we were able to run it 45 times per second on an STM32 at 72 MHz.
So maybe there is one, but as far as I know it requires a good microcontroller or maybe an FPGA/ASIC.
You are not going to be able to clearly distinguish potholes from other short peak events, apart from telling a rising bump in the road from a hole (the initial direction will be opposite), but you can certainly capture them quite easily.
Determine an initial direction (e.g. negative/positive X, Y or Z, depending on how your device is mounted), a threshold level, and a maximum time the reading should stay over this level (determined by the width of the pothole). Then measure the peak height and width and see whether they fit your pothole characteristic.
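As a rough illustration of that threshold-and-width check, here is a minimal Python sketch. The function name, its parameters and the sample values are all made up for the example, and it assumes the readings for one axis have already been zeroed.

```python
def find_pothole_events(samples, sample_rate_hz, threshold,
                        max_width_s, initial_sign=-1):
    """Return (start_index, width_s, peak) for each short excursion.

    initial_sign is -1 if a pothole first drives the axis negative,
    +1 if it first drives it positive (depends on the mounting).
    """
    events = []
    start = None
    peak = 0.0
    for i, value in enumerate(samples):
        signed = value * initial_sign  # expected direction becomes positive
        if signed > threshold:
            if start is None:
                start = i              # excursion begins
                peak = signed
            else:
                peak = max(peak, signed)
        elif start is not None:
            width_s = (i - start) / sample_rate_hz
            if width_s <= max_width_s:  # too-wide excursions are rejected
                events.append((start, width_s, peak))
            start = None
    # An excursion still open at the end of the data is ignored.
    return events


# Hypothetical readings at 1 kHz: one short negative spike, one positive bump.
readings = [0.0, 0.0, -3.0, -4.0, -3.0, 0.0, 2.5, 0.0]
events = find_pothole_events(readings, 1000.0, threshold=2.0, max_width_s=0.01)
```

The positive bump is skipped because its initial direction is opposite to the one configured for a hole.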
The device already contains an internal 1 kHz LPF, so you could add an HPF of, say, 50-200 Hz for the potholes, since they will have a fast rise time. I'm not an expert on car vibration frequencies, but you will probably get some noise from vibration however you filter. That's not an issue as long as the pothole event is large in comparison with the noise. It looks like the data is okay as it is; I would just sample a bit faster to prevent aliasing (e.g. >2 kHz) or add an LPF in addition to the existing internal one, as described in the datasheet. Since you are trying to capture fast rise time events, I'd go with the former (faster sampling, possibly with an HPF).
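If you do the high-pass filtering in software, a first-order filter is probably enough to start with. The sketch below is the standard discrete RC high-pass; the function name and the 2 kHz / 100 Hz figures are only example choices within the ranges mentioned above, not values from the datasheet.

```python
import math


def high_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order (RC-style) discrete high-pass filter."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = []
    prev_x = samples[0] if samples else 0.0  # start from the first reading
    prev_y = 0.0
    for x in samples:
        # Passes fast changes in x, lets a constant level decay to zero.
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


# Example: a step sampled at 2 kHz, filtered with a 100 Hz cutoff.
step = [0.0] * 5 + [1.0] * 5
filtered = high_pass(step, 2000.0, 100.0)
```

The step edge comes through as a spike that then decays, which is the behaviour you want for fast rise time events.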
To compensate for a change in inclination, you can have a running average value which can be used to zero the axis out (one for each axis). Also, note that a HPF will ignore the DC level, so (as long as it doesn't go off the end of the scale) a slow gradient will make no difference.
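One simple way to keep such a running average is an exponential moving average per axis, subtracted from each new reading. This is only a sketch: the function name and the `smoothing` value are assumptions to be tuned for your sample rate, not something taken from the datasheet.

```python
def remove_baseline(samples, smoothing=0.01):
    """Subtract a slowly tracking running average from one axis."""
    if not samples:
        return []
    baseline = samples[0]  # seed with the first reading
    out = []
    for x in samples:
        # Small smoothing: baseline follows inclination, not short peaks.
        baseline += smoothing * (x - baseline)
        out.append(x - baseline)
    return out


# Example: a constant 1 g offset with one short spike on top of it.
axis = [1.0] * 50 + [3.0] + [1.0] * 50
zeroed = remove_baseline(axis)
```

Run one instance per axis. The constant offset is removed, while the short spike passes through almost unchanged.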
According to the datasheet (bottom of page 7 in the link above), the formula for the external capacitance is:
\$ C2 = C3 = C4 = \dfrac{4.97 \times 10^{-6}}{f_{BW}} \$
so your calculation of:
\$ \dfrac{4.97 \times 10^{-6}}{10Hz} = 497nF \$ is correct.
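If you want to try other bandwidths, the same arithmetic is easy to script. This is just the formula quoted above wrapped in a function; the function name is made up for the example.

```python
def filter_capacitance_farads(bandwidth_hz):
    """External capacitance C2 = C3 = C4 for a given bandwidth in Hz."""
    return 4.97e-6 / bandwidth_hz


c = filter_capacitance_farads(10.0)
print(f"{c * 1e9:.0f} nF")  # prints "497 nF"
```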
All of those details depend heavily on the firmware that sets up the chip, the low-level software (Android OS) that massages and delivers the sensor data, and finally the application-level software that massages and delivers the data further. At the firmware level, it would be useful to get an SPI/I2C sniffer, like the Bus Pirate or Open Bench Logic Sniffer, although it will be very difficult to attach probes. I wouldn't be surprised if applications were able to access this level as well. In other words, the initial or default state may change.
The Android OS is largely open-source. Check it out and see if you can find the relevant blocks of code and libraries. It would at least set the initial state, and likely define an API to change it.
The wrong way to go about this is physical tests while taking measurements with an on-board application. There are at least 3 layers of algorithms fiddling with the sensor data.
You cannot determine drift from the datasheet. It is dependent on temperature, age, and batch/wafer.