I think it depends on the machine, its design aims and the limitations of the technology.
Most traditional accumulator machines have very few registers, and many have only a single ‘general-purpose’ register (by today's standards). On the PDP-8, for example, there are a few registers, but the only one you can access directly is the AC. The vast majority of instructions (not that ‘vast’ applies to a design with eight basic instructions) operate on the AC. The only other thing to operate on is memory, so most instructions have a single operand field (a 7-bit address, plus a couple of addressing-mode bits). The data transfer direction is implicit in the instruction (think load/store).
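To make the "single field" point concrete, here is a rough Python sketch that splits a 12-bit PDP-8 word into its fields. The layout it uses (3-bit opcode, indirect bit, page bit, 7-bit offset) and the mnemonics are the standard PDP-8 ones as I remember them; the names `MemRef` and `decode` are made up for this illustration.

```python
from typing import NamedTuple

# The eight PDP-8 opcodes, indexed by the top three bits of the word.
MNEMONICS = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR"]


class MemRef(NamedTuple):
    mnemonic: str
    indirect: bool       # address the operand through a pointer in memory
    current_page: bool   # offset is relative to the current page, else page 0
    offset: int          # the 7-bit address field


def decode(word: int) -> MemRef:
    """Split a 12-bit word into opcode, mode bits and 7-bit address.

    Note there is no register field anywhere: the AC is implied.
    For IOT and OPR the low nine bits mean something else entirely;
    this sketch still splits them the same way for simplicity.
    """
    if not 0 <= word <= 0o7777:
        raise ValueError("PDP-8 words are 12 bits wide")
    opcode = (word >> 9) & 0o7
    return MemRef(
        mnemonic=MNEMONICS[opcode],
        indirect=bool((word >> 8) & 1),
        current_page=bool((word >> 7) & 1),
        offset=word & 0o177,
    )


print(decode(0o1234))  # TAD, direct, current page, offset 0o34
```

With two or more addressable registers, some of those twelve bits would have to be spent on saying which register is meant.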
On the other hand, the machine languages of many accumulator machines were a side effect of the system architecture rather than the other way round (which is how we do it today, software being king). The PDP-8 was an accumulator machine because, well, 12 bits' worth of flip-flops was pretty expensive equipment in the 60s and the PDP-8 was a cheap computer.
As for the speed factor: in-processor registers were almost always faster than memory, same as now. But their cost was much higher than it is now, instruction widths were smaller, sequencers and state machines were much simpler, and so there weren't many registers. As long as you have only one of them, no instruction needs a field to say which register it means, so you can keep your instruction set very simple. And vice versa.
Also, with respect to early computers using ‘accumulators’ instead of ‘registers’: ‘accumulator’ came from its use in tabulating machines to denote registers for running totals. Presumably operators would have been more comfortable with the term. The PDP-8 schematics refer to every single flip-flop in the system as a ‘register’, but there's only one ‘accumulator’ and it's marked as a ‘major’ register.
You neglected to say what part number you are using, which makes other people do unnecessary googling. Is it the 40104/40194?
Anyway, it is not true that you send a pulse to S1 and S0. They are level inputs: the values 1 1 should be presented and held steady, and they take effect on the rising edge of the clock.
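A toy Python model of that behaviour, assuming the part really is a 40194-style register where S1 S0 = 1 1 selects parallel load. The class name `Register4` is made up, and only "load" and "hold" are modelled; the shift modes are left out.

```python
class Register4:
    """Toy 4-bit register: mode inputs are sampled on the rising clock edge."""

    def __init__(self):
        self.q = [0, 0, 0, 0]
        self._prev_clk = 0

    def step(self, clk, s1, s0, d):
        """Apply one set of input levels and return the outputs."""
        rising = self._prev_clk == 0 and clk == 1
        self._prev_clk = clk
        # Nothing happens unless the clock rises. S1/S0 only say what
        # that edge will do; (1, 1) is treated as parallel load here.
        if rising and (s1, s0) == (1, 1):
            self.q = list(d)
        return list(self.q)


reg = Register4()
data = [1, 0, 1, 1]
print(reg.step(0, 1, 1, data))  # [0, 0, 0, 0]  S1=S0=1 but no edge yet
print(reg.step(1, 1, 1, data))  # [1, 0, 1, 1]  rising edge: data loaded
print(reg.step(1, 0, 0, [0, 0, 0, 0]))  # [1, 0, 1, 1]  no edge, no change
```

Pulsing S1 and S0 while the clock sits still does nothing in this model, which is the point.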
If you try to set the inputs to a sequential logic device at exactly the same time as the clock edge, you will get erratic behavior. In general, inputs must already be settled by the time the clock arrives.
For instance, take a look at the timing diagram on page 8 of this datasheet: http://www.datasheetcatalog.org/datasheet/philips/HEF40194BD.pdf
There is a minimum "setup time" before the clock edge and a minimum "hold time" after it. The inputs must stay stable for that whole window.
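As a rough sketch of what that window means, here is a small Python check. The function name and the nanosecond figures are placeholders I made up for illustration; they are not taken from the datasheet, so look up the real values for your part and supply voltage.

```python
def violates_setup_hold(input_change_ns, clock_edge_ns, t_setup_ns, t_hold_ns):
    """Return True if an input changed inside the forbidden window.

    The window runs from t_setup_ns before the rising clock edge
    to t_hold_ns after it.
    """
    window_start = clock_edge_ns - t_setup_ns
    window_end = clock_edge_ns + t_hold_ns
    return window_start < input_change_ns < window_end


# Placeholder timing values, NOT datasheet figures.
T_SETUP = 50
T_HOLD = 10
CLOCK_EDGE = 200

print(violates_setup_hold(100, CLOCK_EDGE, T_SETUP, T_HOLD))  # False: settled early
print(violates_setup_hold(200, CLOCK_EDGE, T_SETUP, T_HOLD))  # True: same instant as the edge
print(violates_setup_hold(250, CLOCK_EDGE, T_SETUP, T_HOLD))  # False: well after hold time
```

Changing S1 and S0 at the same instant as the clock edge lands squarely inside the window, which is why the behaviour becomes erratic.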
Best Answer
To copy data from the accumulator to the register, you just apply a positive clock edge to the register.
To copy data from the register to the accumulator, you just apply a positive clock edge to the accumulator.
You only want to clock the device receiving the data.
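A minimal Python sketch of the idea, assuming each device's outputs are already wired to the other's data inputs. The `Register` class and the `copy` helper are made-up names for illustration.

```python
class Register:
    """Edge-triggered register: it captures its inputs only when clocked."""

    def __init__(self, value=0):
        self.value = value

    def clock(self, d):
        """Model one positive clock edge with d present on the inputs."""
        self.value = d


def copy(src, dst):
    """Copy src into dst by clocking the receiver only."""
    dst.clock(src.value)


accumulator = Register(0b1010)
register = Register(0b0000)

copy(accumulator, register)  # accumulator -> register: clock the register
print(bin(register.value))   # 0b1010

register.value = 0b0110      # pretend the register now holds something else
copy(register, accumulator)  # register -> accumulator: clock the accumulator
print(bin(accumulator.value))  # 0b110
```

The source is never clocked, so it keeps its contents in both directions.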