Since your I2C works at 100 kHz but not at 400 kHz, it is worth looking at the factors that affect timing.
1: Check that your slave board supports 400 kHz.
2: Resistor values are too big.
When the clock frequency is increased from 100 kHz to 400 kHz, the clock period drops from 10 us to 2.5 us.
This means that the rising edges of your data and clock signals have significantly less time to settle. The time taken is calculated as follows:
t = RC
where R is the total pull-up resistance and C is the total bus capacitance. The capacitance on the bus is usually constant and a property of each device. It sounds like you have these figures. Add them up.
The resistor values are the next variable. Since you have three in parallel, you need to combine them using 1/Rt = 1/R1 + 1/R2 + 1/R3
and so on. You only need one pull-up resistor on each bus line, so having three in parallel lowers the total resistance.
You can now calculate t using the above formula. If it is more than 300 ns (just over 10% of your clock period at 400 kHz), then the rise time is out of I2C spec; see table 5, page 32 of the spec.
If you'd like to calculate the correct resistor value, you can rearrange the above formula to get R = t/C
and work from there, where t is 300 ns or less.
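The arithmetic above can be sketched in a few lines of Python. The resistor and capacitance values here are hypothetical placeholders, not figures from the question, and the function names are made up for illustration. The sketch uses the simple t = RC estimate from this answer; the 30% to 70% rise time of an RC edge works out to roughly 0.85 x RC, so t = RC errs slightly on the safe side.

```python
def parallel_resistance(resistors_ohm):
    """Combine pull-ups in parallel: 1/Rt = 1/R1 + 1/R2 + ..."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


def rise_time_s(r_ohm, c_farad):
    """Estimate the rise time with the t = R * C formula from the text."""
    return r_ohm * c_farad


def max_pullup_ohm(t_max_s, c_farad):
    """Rearranged form R = t / C: largest pull-up that still meets t_max."""
    return t_max_s / c_farad


T_MAX = 300e-9  # 300 ns limit quoted above for 400 kHz

# Hypothetical example values (assumptions, not from the question)
pullups = [10e3, 10e3, 10e3]   # three 10 kOhm pull-ups in parallel
bus_capacitance = 200e-12      # 200 pF summed over all devices

r_total = parallel_resistance(pullups)
t = rise_time_s(r_total, bus_capacitance)
r_max = max_pullup_ohm(T_MAX, bus_capacitance)

print(f"R total = {r_total:.0f} ohm")
print(f"t       = {t * 1e9:.0f} ns (limit {T_MAX * 1e9:.0f} ns)")
print(f"R max   = {r_max:.0f} ohm")
```

With these placeholder numbers the combined pull-up is about 3.3 kOhm and the estimated rise time is about 667 ns, which is over the 300 ns limit; the total pull-up would need to be 1.5 kOhm or less.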
I'm not sure I understand your question, but the dashed lines in your diagram mean "more of the same for some time". For example, the gap in SCL between 2 and 7 is dashed to show that bits 3, 4, 5 and 6 repeat the same pattern as bits 1 and 2.
As for the double lines: the upper line is for a 1 and the lower is for a 0, and both are shown because either is valid. For example, when SCL transitions at bit 7 in the diagram, SDA could be either a 1 or a 0 depending on what you are sending, but it must be one or the other. Where you see the lines crossing, the signal may be undefined; its state is not determined. This allows time for the I/O lines to change.
Note that SDA is the data, which is clocked into the part using SCL. SDA must be at a stable 0 or 1 for some time before the clock transition; this is called the setup time. It should also remain in that state for some time after the SCL transition; this is called the hold time.
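As a rough illustration of setup and hold time, the sketch below checks one clock edge against both limits. The function name, the timestamps and the two limits are hypothetical values chosen for the example; the real minimums come from the datasheet of the part being clocked.

```python
def check_setup_hold(sda_stable_from, sda_stable_until, scl_edge,
                     t_setup, t_hold):
    """Return (setup_ok, hold_ok) for a single SCL edge.

    All arguments are integer times in nanoseconds. SDA is assumed to be
    stable over the interval [sda_stable_from, sda_stable_until].
    """
    setup_ok = (scl_edge - sda_stable_from) >= t_setup
    hold_ok = (sda_stable_until - scl_edge) >= t_hold
    return setup_ok, hold_ok


# Hypothetical limits and timestamps, in nanoseconds
T_SETUP = 100
T_HOLD = 50

# SDA settles at 1000 ns and stays put until 1900 ns
good = check_setup_hold(1000, 1900, scl_edge=1200,
                        t_setup=T_SETUP, t_hold=T_HOLD)
# Same data window, but the clock edge arrives only 50 ns after SDA settles
early = check_setup_hold(1000, 1900, scl_edge=1050,
                         t_setup=T_SETUP, t_hold=T_HOLD)

print(good)
print(early)
```

The first edge meets both limits; the second arrives too soon after SDA settles, so it violates the setup time while still meeting the hold time.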
Best Answer
The clock line can be bi-directional for two reasons.
Typically, on slave devices that are implemented entirely in hardware, the clock line is input-only. Such devices have no need to stretch the clock.
On the other hand, microcontroller-based slaves typically do need to stretch the clock, to give them time to interpret the incoming bytes and prepare an appropriate response.
Unfortunately, there are some masters out there that do not fully implement I2C and take no notice of clock stretching or, worse, try to drive the line hard high. Such masters will not work correctly with a slave that needs to stretch the clock.
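To make the difference concrete, here is a small simulation of how a master that respects clock stretching behaves: after releasing SCL it reads the line back and waits until it has actually gone high. This is only a sketch; `OpenDrainLine`, `master_release_scl` and the tick-based timing are hypothetical names invented for the illustration, not any real driver API.

```python
class OpenDrainLine:
    """Wired-AND line: reads high only when no device is pulling it low."""

    def __init__(self):
        self._pulling_low = set()

    def pull_low(self, device):
        self._pulling_low.add(device)

    def release(self, device):
        self._pulling_low.discard(device)

    def is_high(self):
        return not self._pulling_low


def master_release_scl(scl, tick, max_ticks=1000):
    """Release SCL, then wait until the line actually reads high.

    `tick` is a callable that advances the simulation by one step.
    Returns the number of ticks the clock was stretched for.
    """
    scl.release("master")
    waited = 0
    while not scl.is_high():
        if waited >= max_ticks:
            raise TimeoutError("SCL held low for too long")
        tick()
        waited += 1
    return waited


scl = OpenDrainLine()
scl.pull_low("master")  # master drives the low phase of the clock
scl.pull_low("slave")   # slave holds SCL low while preparing its response

ticks_needed = [3]      # hypothetical: the slave needs 3 ticks


def tick():
    """Advance the simulated slave; it lets go of SCL once it is ready."""
    ticks_needed[0] -= 1
    if ticks_needed[0] == 0:
        scl.release("slave")


stretched = master_release_scl(scl, tick)
print(f"clock stretched for {stretched} ticks")
```

A master that skips the read-back loop simply assumes SCL went high the moment it let go, which is the failure mode described above.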