One reason we clock flip-flops is so that there isn't any chaos when the outputs of flip-flops are fed through some logic functions and back to their own inputs.
If a flip-flop's output is used to calculate its input, it behooves us to have orderly behavior: to prevent the flip-flop's state from changing until the output (and hence the input) is stable.
This clocking allows us to build computers, which are state machines: they have a current state, and calculate their next state based on the current state and some inputs.
For example, suppose we want to build a machine which "computes" an incrementing 4-bit count from 0000 to 1111, and then wraps around to 0000 and keeps going. We can do this by using a 4-bit register (which is a bank of four D flip-flops). The output of the register is put through a combinational logic function which adds 1 (a four-bit adder) to produce the incremented value. This value is then simply fed back to the register. Now, whenever the clock edge arrives, the register will accept the new value, which is one plus its previous value. We have orderly, predictable behavior which steps through the binary numbers without any glitches.
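The counter described above can be sketched as a small behavioral model. This is only an illustration, not a hardware description: the names `Register4`, `add_one`, and `run_counter` are made up for this example, and the register is assumed to start at 0000.

```python
class Register4:
    """A 4-bit register: a bank of four D flip-flops sharing one clock."""

    def __init__(self):
        self.q = 0  # current state (assumed to power up as 0000)

    def clock_edge(self, d):
        # The stored state changes only here, when the clock edge arrives.
        self.q = d & 0b1111


def add_one(value):
    # Combinational logic: a 4-bit adder with one operand fixed at 1.
    # Masking to 4 bits makes 1111 + 1 wrap around to 0000.
    return (value + 1) & 0b1111


def run_counter(edges):
    """Return the register state seen before each of `edges` clock edges."""
    reg = Register4()
    seen = []
    for _ in range(edges):
        seen.append(format(reg.q, "04b"))
        # The adder output is fed back to the register input and is
        # accepted only on the clock edge.
        reg.clock_edge(add_one(reg.q))
    return seen


states = run_counter(18)
print(states)
```

Between clock edges the adder output may take any amount of time to settle (up to the clock period) without disturbing the stored count, which is the point of clocking the register.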
Clocking behaviors are useful in other situations too. Sometimes a circuit has many inputs, which do not stabilize at the same time. If the output is instantaneously produced from the inputs, then it will be chaotic until the inputs stabilize. If we do not want the other circuits which depend on the output to see the chaos, we make the circuit clocked. We allow a generous amount of time for the inputs to settle and then we indicate to the circuit to accept the values.
Clocking is also inherently part of the semantics of some kinds of flip flops.
A D flip-flop cannot be defined without a clock input. Without a clock input, it will either ignore its D input (useless!), or simply copy the input at all times (not a flip-flop!). An RS flip-flop doesn't have a clock, but it uses two inputs to control the state, which allows the inputs to be "self clocking": i.e. to be the inputs as well as the triggers for the state change. All flip-flops need some combination of inputs which programs their state, and some combination of inputs which lets them maintain their state. If all combinations of inputs trigger programming, or if all combinations of inputs are ignored (state is maintained), that is not useful. Now what is a clock? A clock is a special, dedicated input which distinguishes whether the other inputs are ignored, or whether they program the device. It is useful to have this as a separate input, rather than for it to be encoded among multiple inputs.
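To make the distinction concrete, here is a minimal behavioral sketch of the two devices. The class names are invented for this example, the D flip-flop is assumed to be rising-edge triggered, and treating S = R = 1 as an error is a modeling choice rather than something a real latch does.

```python
class DFlipFlop:
    """D flip-flop: the clock decides whether D is ignored or stored."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def update(self, d, clk):
        # Assumption: rising-edge triggered. D programs the state only
        # on a 0 -> 1 clock transition; otherwise D is ignored.
        if clk == 1 and self._prev_clk == 0:
            self.q = d
        self._prev_clk = clk
        return self.q


class SRLatch:
    """RS latch: the inputs themselves trigger the state change."""

    def __init__(self):
        self.q = 0

    def update(self, s, r):
        if s and r:
            # Modeling choice: reject the combination instead of
            # imitating what real hardware does with it.
            raise ValueError("S = R = 1 is not a valid input combination")
        if s:
            self.q = 1      # "program" to 1
        elif r:
            self.q = 0      # "program" to 0
        return self.q       # S = R = 0: maintain state


dff = DFlipFlop()
dff_trace = [dff.update(d, clk)
             for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]]

latch = SRLatch()
latch_trace = [latch.update(s, r)
               for s, r in [(0, 0), (1, 0), (0, 0), (0, 1), (0, 0)]]

print(dff_trace, latch_trace)
```

In the flip-flop, the "maintain" and "program" cases are selected by the dedicated clock input; in the latch, they are encoded in the S/R combination itself.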
You have unfortunately run into a relatively subtle problem with this particular flip-flop. (BTW, this is not metastability; that's a different problem.) It's designed to operate at high speed over a wide range of supply voltages, and one of the compromises made in its design is that it has a rather strict requirement on the clock input transition speed.
If you look at section 9 of the datasheet, the input transition rate is given as 10 ns/V maximum. This means that you need to make the clock rise or fall by 5 volts in no more than 50 ns in order for the chip to operate correctly. With an RC time constant of 10 ms, you are between five and six orders of magnitude too slow.
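A quick check of the arithmetic, using only the figures quoted above (10 ns/V, a 5 V swing, and a 10 ms RC time constant). Using the 10% to 90% rise time of an RC edge, ln(9)·RC, as the measure of edge duration is an assumption made for this sketch; the datasheet may define transition time differently.

```python
import math

max_transition_ns_per_v = 10.0   # datasheet limit quoted above
swing_v = 5.0                    # logic swing quoted above
rc_s = 10e-3                     # RC time constant quoted above

# Longest edge the chip tolerates: 10 ns/V * 5 V = 50 ns.
max_edge_time_s = max_transition_ns_per_v * swing_v * 1e-9

# Assumption: characterize the RC edge by its 10%-90% rise time,
# which for a single-pole RC is ln(9) * RC (about 2.2 * RC).
rise_10_90_s = math.log(9) * rc_s

ratio_tau = rc_s / max_edge_time_s
ratio_rise = rise_10_90_s / max_edge_time_s

print(f"allowed edge time : {max_edge_time_s * 1e9:.0f} ns")
print(f"RC / allowed      : {ratio_tau:.3g}  (10^{math.log10(ratio_tau):.1f})")
print(f"rise / allowed    : {ratio_rise:.3g}  (10^{math.log10(ratio_rise):.1f})")
```

Either way of measuring the edge gives a ratio in the hundreds of thousands, so no small tweak to the RC values will bring the clock edge within the limit.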
Best Answer
Best guess: the positive-edge trend is a byproduct of designs trying to use as little area and as few parts as possible before the 1970s: a cost-saving measure for production that increases the number of chips per wafer. Modern positive- and negative-edge DFFs often have equal total area, so the positive-edge trend is now legacy practice.
The area saving came from "Classical" D flip-flop designs. The master/slave components of a modern D flip-flop can use two 5-transistor latches; patents WO1984003806 A1 and US4484087 A were both filed on Mar 23, 1984. A patent for an 8-transistor D-latch, US3641511 A, was filed on Feb 6, 1970. For the sake of simplicity, designs based on SR/SnRn latches will be referred to as "Classical", and designs using the D-latch/S-cell patents mentioned above as "Modern".
In an IC design, a NAND gate uses less area than a NOR gate because of the characteristic properties of NMOS and PMOS transistors. From there, the area-saving trend cascades: D-latches built from SnRn latches are smaller than those built from SR latches. The Classical D flip-flop designs are based on these logic gates. In the several designs I searched through, Classical positive-edge designs were always smaller than Classical negative-edge designs. Migration to the Modern designs happened as chip cost became favorable: area savings vs. royalty fee.
Digging a little deeper to demonstrate area differences:
Classical positive-edge D flip-flop: schematic based on Wikipedia's "Classical positive-edge-triggered D flip-flop" description and diagram, using five NAND2 and one NAND3. This uses a total of thirteen NMOS and thirteen PMOS.
The best Classical negative-edge D flip-flop I could find uses two D-latches and two inverters. Schematic referenced from http://students.cs.byu.edu/~cs124ta/labs/L02-FSM/HowToUseMasterSlave.html. This uses a total of eighteen NMOS and eighteen PMOS. Placing an inverter on the classical positive-edge design above would give a lower transistor count than this design. In either case, the classical negative-edge design is bigger than the positive-edge design.
A Modern D flip-flop design can look like the following, based on the five-transistor D-latch description in patents WO1984003806 A1 and US4484087 A. This uses a total of five NMOS and five PMOS: big area savings compared to Classical. Reversing the master/slave order would create a negative-edge flip-flop of equal size.
I am only demonstrating the smallest possible designs. Designs can vary based on design requirements, allowed standard cell libraries, reset/preset features, or other reasons.
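The Classical transistor counts quoted above can be reproduced with a short tally, assuming static CMOS gates (an n-input NAND uses n NMOS and n PMOS, an inverter uses one of each). The assumption that each Classical D-latch is built from four NAND2 gates is mine, chosen because it matches the eighteen/eighteen total; the Modern figure is simply the number quoted above, not derived.

```python
def nand(n_inputs):
    # Static CMOS n-input NAND: n series NMOS + n parallel PMOS.
    return (n_inputs, n_inputs)


INVERTER = (1, 1)  # static CMOS inverter: 1 NMOS + 1 PMOS


def total(gates):
    """Sum (NMOS, PMOS) counts over a list of gates."""
    return (sum(g[0] for g in gates), sum(g[1] for g in gates))


# Classical positive-edge: five NAND2 and one NAND3.
classical_posedge = total([nand(2)] * 5 + [nand(3)])

# Classical negative-edge: two D-latches and two inverters.
# Assumption: each D-latch is four NAND2 gates.
d_latch = [nand(2)] * 4
classical_negedge = total(d_latch * 2 + [INVERTER] * 2)

# Classical positive-edge plus one inverter (presumably on the clock).
posedge_plus_inverter = total([nand(2)] * 5 + [nand(3)] + [INVERTER])

# Modern design: count quoted in the text, not derived here.
modern = (5, 5)

for name, (n, p) in [
    ("classical posedge", classical_posedge),
    ("classical negedge", classical_negedge),
    ("posedge + inverter", posedge_plus_inverter),
    ("modern", modern),
]:
    print(f"{name:20s} NMOS={n:2d} PMOS={p:2d} total={n + p}")
```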