How is encoding handled correctly during copy-paste between programs?


Suppose

  • a program A opens a text file A using encoding A to decode the file, and
  • a program B opens a text file B using encoding B.

When we copy some text from file B in program B to file A in program A (mouse selection, Ctrl+C, then Ctrl+V), I have heard that the GUI layer of the OS (e.g. the X Window System on Linux, and I guess something similar on Windows) handles the transfer between the programs.

For example, program A can be any program which accepts pasted text, such as a text editor (e.g. Emacs, gedit), and program B can be any program from which text can be copied, such as a text viewer (e.g. a web browser such as Firefox or Chrome) or a text editor.

Question:

Note that encoding A and encoding B can be different. What should happen under the hood of Ctrl+C and Ctrl+V so that the pasted text in file A in program A can be consistent with the original text in file A?

  • When hitting ctrl+c in file B and program B, is the binary content of the copied text in the "clipboard" of GUI of the OS the same as the binary content of the original text in file B? I.e. is the encoding for the copied text in the "clipboard" still encoding B? What program should determine the encoding of the copied text in the "clipboard"?

  • When hitting ctrl+v in file A and program A, is the binary content of the pasted text in file A the same as the binary content of the original text in file B? I.e. in the new file A, the original text should be decoded with encoding A, and the pasted text should be decoded with encoding B? What program should determine the encoding of the pasted text in file A?

Best Answer

The simplest solution is to use a standard encoding for the clipboard. For example, on Windows one standard clipboard format is "Unicode" text, which means UTF-16, the encoding recommended for Windows applications. The copying program places the text on the clipboard in that encoding rather than as the raw bytes of its file, and any program that accepts clipboard input has to be able to interpret that encoding. This is all documented on MSDN.

  • Unicode (Windows)
  • Standard Clipboard Formats
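To make the idea concrete, here is a minimal sketch in Python of the round trip through a standard clipboard encoding. It only simulates the clipboard with a byte string and does not call any real OS clipboard API. The function names, the sample strings, and the choice of Latin-1 for file B and UTF-8 for file A are all assumptions made for illustration.

```python
# Simulated clipboard round trip. No real OS clipboard is touched.
# Assumption: UTF-16 (little-endian) stands in for the standard clipboard encoding.
CLIPBOARD_ENCODING = "utf-16-le"


def copy_from_program_b(file_b_bytes: bytes, encoding_b: str) -> bytes:
    """Program B decodes its own file, then re-encodes the text for the clipboard."""
    text = file_b_bytes.decode(encoding_b)
    return text.encode(CLIPBOARD_ENCODING)


def paste_into_program_a(clipboard_bytes: bytes, file_a_bytes: bytes, encoding_a: str) -> bytes:
    """Program A decodes the clipboard, appends the text, and saves with its own encoding."""
    pasted_text = clipboard_bytes.decode(CLIPBOARD_ENCODING)
    existing_text = file_a_bytes.decode(encoding_a)
    return (existing_text + pasted_text).encode(encoding_a)


# Hypothetical files: file B is stored as Latin-1, file A as UTF-8.
file_b = "café".encode("latin-1")
file_a = "naïve ".encode("utf-8")

clipboard = copy_from_program_b(file_b, "latin-1")
new_file_a = paste_into_program_a(clipboard, file_a, "utf-8")

print(file_b)                       # bytes as stored in file B
print(clipboard)                    # bytes on the simulated clipboard
print(new_file_a.decode("utf-8"))   # text of file A after the paste
```

In this sketch the bytes differ at every stage (file B, clipboard, file A), while the text stays the same, and the whole of file A ends up in encoding A. If encoding A cannot represent a pasted character, the final `encode` call raises `UnicodeEncodeError`, so a real program A would have to decide how to handle that case.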
