PHP Compilation – Can PHP Code be Compiled to Hide Source Code?

compilation, PHP

A pretty critical issue came up, which is more legal than technical, but I hope to find a lower-cost technical solution. There are laws in some countries (I'll leave out the name) that require you to keep data on a server in that country. It is impossible to completely separate the back-end software from the database. The government of this country can access the server and all the data on it at will and pass it to a state-owned or state-sponsored competitor.

I am not concerned about the user data, because it is kept separate from the data of users in other countries, but I am very worried about the PHP code that is wide open on the server.

Is there a way I can compile or obfuscate the PHP code somehow so that it can still run on the server but cannot be viewed/edited/modified like compiled software?

Or is the only technical option to switch to a compiled programming language like Java?

Best Answer

Is there a way I can compile or obfuscate the PHP code somehow so that it can still run on the server but cannot be viewed/edited/modified like compiled software?

No.

In order for the code to run on the server, the CPU has to understand it. CPUs are much, much stupider than humans. If the CPU can understand it, then a human can, too. If you make it so a human cannot understand the code, then a CPU cannot either, and you can no longer run it.
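To make this concrete, here is a minimal sketch of the common "encode it, then decode and eval it at run time" style of obfuscation. It is written in Python purely as a stand-in (nothing in it is specific to PHP), and `SECRET_SOURCE`, `obfuscate`, `run_obfuscated` and `recover_source` are made-up names for illustration. Whatever steps the loader performs to turn the blob back into runnable code, anyone with access to the server can perform as well:

```python
import base64
import zlib

# Hypothetical proprietary code we would like to hide.
SECRET_SOURCE = "def secret_discount(total):\n    return total * 85 // 100\n"

def obfuscate(source: str) -> bytes:
    """Compress and base64-encode the source so it is unreadable at a glance."""
    return base64.b64encode(zlib.compress(source.encode("utf-8")))

def run_obfuscated(blob: bytes) -> dict:
    """What the server must do: decode the blob, then execute the result."""
    namespace = {}
    exec(zlib.decompress(base64.b64decode(blob)).decode("utf-8"), namespace)
    return namespace

def recover_source(blob: bytes) -> str:
    """What a curious admin can do: the same decoding steps, minus the exec."""
    return zlib.decompress(base64.b64decode(blob)).decode("utf-8")

blob = obfuscate(SECRET_SOURCE)
namespace = run_obfuscated(blob)

print(blob)                                    # looks like gibberish
print(namespace["secret_discount"](200))       # yet the code still runs
print(recover_source(blob) == SECRET_SOURCE)   # and the source comes back intact
```

Real obfuscators add more layers, but the loader always has to ship with the code, so the last layer can always be peeled off the same way.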

By the way: all currently existing PHP implementations (Zend Engine, HHVM, P8, Quercus, Phalanger, Peachpie, HippyVM, Tagua VM, JPHP, …) will actually compile your PHP code.

Or is the only technical option to switch to a compiled programming language like Java?

No.

It makes no difference. The code has to be on the server in a form that the server can understand. But then, a human can understand it, too.
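As a rough illustration of why the compiled form does not help, the sketch below compiles a function and then lists what the compiler produced. Python and its standard `dis` module are used only as a stand-in; `license_check` and its key are invented for the example. Java class files can be listed in a similar way with `javap -c`.

```python
import dis
import io

def license_check(key: str) -> bool:
    # Pretend this is proprietary logic that ships only in compiled form.
    return key == "TOP-SECRET-2024"

# The code object is the compiled form that the interpreter actually executes.
code = license_check.__code__

# Constants survive compilation untouched.
print(code.co_consts)

# The instruction stream can be listed by anyone who can read the compiled file.
listing = io.StringIO()
dis.dis(code, file=listing)
print(listing.getvalue())
```

The listing is less pleasant to read than source code, but the comparison against the literal key is plainly visible in it.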

The only possible solution to make sure that someone cannot get access to your code is … to not give them access to your code. It really is that simple.

The easiest way to achieve this is to keep the code on your premises and only offer access to it via a tightly controlled service. That's how Google keeps its code secret, for example.

A much, much, much harder and more expensive way is to have full end-to-end encryption of your code. Essentially, you deliver the server as a tamper-proof, impenetrable black box that only allows access through a tightly controlled service. The trick here is to ensure that the black box actually stays "black". This requires that you have full control over the Operating System, all libraries, the CPU, the motherboard, the RAM, all busses and I/O chips in the system, and so on.

Remember the Xbox? Microsoft transmitted the master encryption key from the encryption module to the CPU over an unencrypted bus. They thought this would be okay because no hardware existed that could snoop such a high-speed bus. Well, it turns out that if you have a $10,000,000 oscilloscope and a lot of time on your hands, you can actually snoop that bus. As it so happens, MIT has invented such an oscilloscope, and MIT students have a lot of time on their hands, are trained in cryptanalysis, and like running Linux on everything. So they created a Linux installer for the Xbox, which in turn was reverse-engineered by crackers to create installers for pirated games.

By the way: there is no such thing as a "compiled programming language". Compilation and interpretation are traits of the compiler or interpreter (duh!), not the language. They live on completely different levels of abstraction. If English were a typed language, the term "compiled programming language" would be a type error. Every language can be implemented with a compiler and every language can be implemented with an interpreter. For example, all currently existing mainstream PHP, JavaScript, Python, and Ruby implementations have compilers. Conversely, there are interpreters for C, C++ and Haskell.
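A toy example may help here. The sketch below defines a tiny made-up expression language (a number, or a tuple of operator, left operand and right operand) and implements it twice: once with an interpreter and once with a compiler that targets an equally made-up stack machine. All names (`interpret`, `compile_expr`, `run`) are hypothetical; the point is only that the language stays the same while the implementation strategy changes.

```python
# A tiny expression language: a number, or a tuple (op, left, right).
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def interpret(expr):
    """Interpreter: walk the tree and compute the value directly."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](interpret(left), interpret(right))

def compile_expr(expr):
    """Compiler: translate the tree into instructions for a stack machine."""
    if isinstance(expr, (int, float)):
        return [("push", expr)]
    op, left, right = expr
    return compile_expr(left) + compile_expr(right) + [(op, None)]

def run(program):
    """The 'hardware' for the compiled form: a minimal stack machine."""
    stack = []
    for instr, arg in program:
        if instr == "push":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[instr](left, right))
    return stack.pop()

expr = ("add", 1, ("mul", 2, 3))   # 1 + 2 * 3
program = compile_expr(expr)

print(interpret(expr))   # evaluated by the interpreter
print(program)           # the compiled instruction list
print(run(program))      # the same result, via the compiled form
```

Note that the compiled instruction list is just as easy to read back as the original tree, which is the same point made above about compiled code on a server.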
