There are two sides to this problem: the algorithmic and the engineering.
Algorithmic:
How do you determine the amount of each type of coins to give back as a change, such that the total amount of coins is minimized?
In this case, there is very simple algorithm: start by giving back the highest value coins. Once the remaining amount of change gets below the value of the highest value coins, start giving back the second highest. Keep this process until you gave back the whole amount of change required.
Engineering:
In the most general case of a vending machine, you would implement in HW the above algorithm. However, due to the simplifications provided in the problem's description you can do the following:
- Calculate yourself the required change for each item
- Implement a simple state machine which returns the amount of coins which you've just calculated based on the type of item bought.
The input to your state machine will be the type of the item.
There will be three outputs: one for each type of coin you can give back for change.
The purpose of the state machine is to assert the outputs some predefined number of times based on the input.
You do not necessarily have to latch the input - once you got the input, your state machine will transition to one of the two states corresponding to each type of item. From now on, you no longer need the input signal.
One possible implementation:
This implementation is one of the possible implementations, but not the only one.
You got just three coins types which you can use for the change. The amount of each coin type you need to give back to the buyer is determined by which kind of product did he buy.
You could use three different counters (one for each type of coins). Once you get the input from the buyer you load the initial amount of coins to each counter. While not all the counters are zeroed, on each clock cycle you assert the output corresponding to the one of non-zero counters and reduce the count of this particular counter by 1.
This is not the best approach in terms of gate count, but it will work.
First of all, throw out this concept of 'instructions'. They do not exist in Verilog. Nothing is executed. Verilog, VHDL, SystemVerilog, etc. are what are called hardware description languages. They are not executed. They are not interpreted. They define hardware components (logic gates, flip flops, registers, etc.) and their interconnections. (Not entirely accurate I suppose; but the only verilog that you can put on an FPGA - synthesizable verilog - will not be executed or interpreted. Testbenches are a different animal.)
Clocks are used to drive flip flops and registers. Data can be shifted into flip flops and registers on the edges of the clock. So inside of an always @(posedge clk) block, all of the statements will be 'executed' simultaneously and the results will be latched into the registers on the clock edge, according to the rules of how the HDL statements are interpreted. Be very careful where you are using = and <=, though. The meaning of these two assignment operations is very different inside of an always block. The basic idea is that all of the = operations are dealt with first in order of appearance. This happens at the propagation speed of the gates. Then all of the <= are dealt with at the same time, storing the argument into a register. The only thing the clock affects in this case is precisely when the registers are updated. If you are running a simulation, it won't matter how many operations need to occur between registers, but on an FPGA the clock will have to be slow enough to ensure that any changes have been able to propagate through the logic.
Faster clocks can be generated using a device called a phased lock loop (PLL). PLLs are not synthesizeable in verilog, but generally there is a way to instantiate a dedicated PLL component on the FPGA you are using. Actually, I take that back, you can certainly make a digital PLL in verilog, but you can only use it to generate signals lower than the clock frequency. A PLL contains a voltage controlled oscillator, one or more frequency dividers, a phase comparator, and some control circuitry. The output of the VCO is divided down and phase compared with the input frequency. The VCO control voltage is adjusted until the divided down VCO output precisely matches the frequency and phase of the reference signal. If you set the divider to 5 and use 50 MHz for the reference frequency, the PLL will generate a 250 MHz signal that is precisely phase locked to the 50 MHz reference. There are several reasons for doing this. Using a PLL allows generation of multiple clocks so different logic can be run at different speeds e.g. for specific peripheral interfaces or for slow, complex combinatorial logic. It also can allow the device to control its own clock frequency to save power.
Blocking statements inside of always blocks will generate combinatorial logic. Again, this logic will generally always be 'executed' regarless of the clock because it defines actual logic gates. It can be beneficial to use a few temporary variables, but care must be taken to ensure that there isn't so much extra logic that the timing requirements are not met.
Best Answer
The two types of FSM are Mealy machines and Moore machines.
In a Mealy machine, the output is a function of the inputs and the current state. In a Moore machine, the output is purely a function of the current state.
simulate this circuit – Schematic created using CircuitLab
The types of FSMs you're trying to describe in your question are a little unclear. It sounds as though you are referring to two ways of structuring a FSM in a hardware description language. These don't correspond to different types of FSMs; they're simply differences in coding style. Neither one is inherently better.