...would frequently wind up when the
car didn't respond fast enough.
Sounds like your loop gains are out of whack (technically speaking).
PID loops are bound by the frequency response of the system they are driving.
I would think the throttle response is far faster than the GPS velocity updates...making high-frequency hunting a given.
Start with a small Kp and all other loop gains at zero. Work Kp up until you get a response that is slow and smooth...so you are following an offset of the curve of the set-point. Only when you have something "over-damped" can you start bringing the other gains up and closing the error.
If the GPS is really slow, some interpolation of the velocity data between updates will allow the loop rate to be increased. A rate limit loop around the output would also help.
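As a rough sketch of both ideas (function names and numbers are mine, not from any particular autopilot code): linear interpolation between GPS fixes lets the loop run faster than the GPS update rate, and a rate limit on the output keeps the throttle from slewing violently.

```python
# Sketch only: interpolate slow GPS velocity updates so the PID loop can
# run at a higher rate, plus a simple per-cycle rate limit on the output.
# All names and values here are illustrative assumptions.

def gps_interp(v_prev, t_prev, v_curr, t_curr, t_now):
    """Linearly interpolate velocity between two GPS fixes."""
    if t_curr == t_prev:
        return v_curr
    frac = (t_now - t_prev) / (t_curr - t_prev)
    return v_prev + frac * (v_curr - v_prev)

def rate_limit(prev_out, new_out, max_step):
    """Limit how much the controller output may change per cycle."""
    delta = new_out - prev_out
    if delta > max_step:
        return prev_out + max_step
    if delta < -max_step:
        return prev_out - max_step
    return new_out
```

The rate limiter goes around the loop output, so even an aggressive PID correction reaches the throttle as a smooth ramp.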
A PID loop is a model of a system...the system you have...not the system you want :)
I know exactly where you're coming from. The mistake you're making is assuming that the integral term drops to zero in steady state. This is not the case, and, indeed, is highly dependent on implementation details.
First off, understand that the integral term in a mathematical PID is the integral from the start of time (or, well, of the system), not "the error over the last few cycles". Your implementation of PID or PI should not cause the older contributions to the I term to drop in relative weight. Let me explain. When writing the I term's code, the first instinct is to assume that the term will diverge, exceeding the variable's range and overflowing, and people attempt to fix this with moving averages, decaying the weight of older values, and all sorts of strange gimmicks. This should not happen in a properly implemented PI or PID system. Instead, you should simply calculate I as I = I + Ki*Error.
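A minimal sketch of that in Python (class and variable names are just illustrative): the integral state persists across cycles and only ever accumulates Ki*Error, with no decay or averaging.

```python
# Plain PI update: the integral term simply accumulates Ki*error every
# cycle, exactly as described, with no moving averages or decay tricks.

class PI:
    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.i = 0.0  # integral state, persists across update calls

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.i = self.i + self.ki * error   # I = I + Ki*Error
        return self.kp * error + self.i
```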
The baseline level required to maintain the system, which you mention in your question, must be provided by the I term. Since you do not know how much this is a priori, you must allow the controller to discover this value for itself. That, in fact, is the job of the I term. The Ki value should be small enough for the controller to converge before it overflows. Some thought about how this works on paper will help. Try to visualize the process, not specific boundary conditions. One thing that you should keep in mind is that the I term is not constructed from the absolute value of the error. It includes both positive and negative values of error.
Further, imagine the condition where the controller is just reaching the steady state. You will realize that I is not necessarily zero at this point. Indeed, I is actually the baseline control force you mention in the question. If the state actually remains stable, and if error from here on in is continuously zero (or zero averaged over time), the value of I will remain as it is.
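You can see this on a toy example. Here is a hedged sketch (the plant, gains, and constants are all made up for illustration): a first-order plant with drag needs a constant input of c*setpoint to hold its state, and once the PI loop settles, the integral term is holding exactly that baseline while the error sits at zero.

```python
# Toy demonstration that I converges to the baseline, not to zero.
# Plant: v' = u - c*v, so holding v = setpoint needs u = c*setpoint.
# All gains and constants are invented for this sketch.

def simulate(steps=20000, dt=0.01, setpoint=5.0, c=0.8, kp=2.0, ki=0.5):
    v = 0.0   # plant state (e.g. velocity)
    i = 0.0   # integral term
    for _ in range(steps):
        error = setpoint - v
        i += ki * error * dt        # I accumulates, never decays
        u = kp * error + i          # PI control output
        v += dt * (u - c * v)       # Euler step of the plant
    return v, i

v, i = simulate()
# at steady state: v is at the set point, and i holds the baseline c*setpoint
```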
Now, when it comes to real implementations, the problem you will face is that even with a small Ki, by the time your system reaches the set point, I may well have saturated. The system will then have to err in the opposite direction for a long time to rid itself of the I term it accumulated while it was still approaching the set point. In fact, I've noticed that PI and PID work best for a single set point, and degrade when you have to keep changing that point by a large amount. A big contributor to this is the fact that I has high inertia. Tuning Ki can keep the controller functional, but when the system itself responds to stimulus slowly (say you're heating a block of metal), tuning is often difficult. Instead, what can help is to activate I only when the system is within a certain threshold of the set point. When you change the set point by more than this threshold, clear I and disable it (use only P/PD control) until you get close to the new set point. By doing this, you add another tunable parameter (the threshold), but it makes setting both Ki and the threshold easier than setting Ki by itself to be optimal for both situations.
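A sketch of that gating idea (names, gains, and the exact clearing rule are my assumptions; here I clear I whenever the error leaves the threshold zone, which also covers a large set-point change):

```python
# "Integral zone" sketch: only accumulate I when the measurement is
# within a threshold of the set point; otherwise clear I and run P-only.
# Class name, gains, and the clearing rule are illustrative assumptions.

class GatedPI:
    def __init__(self, kp, ki, i_zone):
        self.kp = kp
        self.ki = ki
        self.i_zone = i_zone  # the extra tunable threshold
        self.i = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        if abs(error) <= self.i_zone:
            self.i += self.ki * error   # integrate only near the target
        else:
            self.i = 0.0                # far from target: clear, P-only
        return self.kp * error + self.i
```

This way I never saturates during the long approach to a new set point, and only starts discovering the baseline once the system is close enough for that baseline to be meaningful.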
Best Answer
There are three common solutions to this: