Tagua VM is an experimental PHP Virtual Machine that guarantees safety and quality by removing large classes of vulnerabilities thanks to the Rust language and the LLVM Compiler Infrastructure.
PHP is an extremely popular programming language. In 2015, PHP was used by more than 80% of all websites. However, almost 500 known severe vulnerabilities have been recorded, almost 50 of which have a high CVSS score. Any popular language is exposed to this, but it is dangerous nonetheless.
The goal of this project is to provide a PHP VM that guarantees safety and quality by removing large classes of vulnerabilities. This will be achieved by using appropriate tools like Rust and LLVM. Rust is a remarkable language that brings strong guarantees about memory safety. It is also a fast language that competes well with C, and it can talk to C very easily. LLVM is a well-known compiler infrastructure that brings modernity, state-of-the-art algorithms, performance, developer toolchains…
This project will solve three problems at once:
PHP is a popular programming language. Today, it powers a large part of the software we use daily on the Internet. To list a few: Wikipedia, Facebook, Baidu, Yahoo, or Pinterest, but also software you can install for your own purposes, such as WordPress (blog and website), Drupal (CMS), Joomla (CMS), Magento (commerce), Shopify (commerce), Moodle (learning), phpBB (forum)… In 2015, PHP was used as the server-side programming language on more than 80% of all websites.
Due to its unique position on the Internet, a single vulnerability could have a huge impact.
PHP VMs have recorded almost 500 known vulnerabilities, almost 50 of which have a CVSS score greater than or equal to 9 out of 10. Many of them, including the most dangerous, are memory corruptions [7, 8, 9] or errors in parsers [10, 11, 12, 13, 14]. These vulnerabilities can lead, for instance, to remote code execution or denial of service, two attack vectors that can have a significant impact on a whole infrastructure.
This situation holds for any programming language (like Python or Java). Nevertheless, the criticality of a vulnerability is strongly linked to the popularity of the language. In the case of PHP, a single vulnerability can be dangerous in many ways. However, this is not the fault of the language itself: all the listed vulnerabilities are due to the VMs.
Currently, PHP has two major virtual machines (VMs): Zend Engine and HHVM. Zend Engine is the original VM; it is mainly written in C and counts hundreds of contributors. HHVM is mainly written in C++ and also counts hundreds of contributors. Being more recent, HHVM takes a more state-of-the-art approach and offers features like Just-In-Time compilation.
However, both VMs are written in unsafe languages, where segmentation faults, data races, memory corruptions, etc. are very frequent and constitute severe errors or vulnerabilities, as presented in the previous section.
Tagua VM has a different approach.
The classes of vulnerabilities and bugs mentioned earlier are almost entirely eliminated by the Rust language. This is part of its guarantees. It does not remove the need for complete test suites and security audits, though.
Due to the popularity of PHP, it is extremely important to have a safe VM to run these applications.
Since the early days of computer science, numerous bugs and vulnerabilities have been found in operating systems (like Linux or BSD), in libraries (like glibc), and in major programs (like Bash, X.org or PHP VMs), simply due to the lack of memory and type safety. Rust enforces safety statically, hence removing most memory vulnerabilities, like segmentation faults or data races.
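As a minimal sketch of what these guarantees look like in practice, consider the two functions below. The names `read_byte` and `consume` are hypothetical and chosen for illustration only; they are not part of Tagua VM. The first shows a bounds-checked read, where an out-of-range index yields `None` instead of reading arbitrary memory. The second shows ownership: once a value has been moved, the compiler rejects any further use of it.

```rust
/// Reads the byte at `index`, or returns `None` when `index` is out of bounds.
/// (Hypothetical helper, for illustration only.)
/// An unchecked read of the same kind in C would be undefined behaviour.
pub fn read_byte(buffer: &[u8], index: usize) -> Option<u8> {
    // `get` performs the bounds check and never touches memory outside `buffer`.
    buffer.get(index).copied()
}

/// Takes ownership of `source` and returns its length in bytes.
/// (Hypothetical helper, for illustration only.)
/// `source` is freed when this function returns.
pub fn consume(source: String) -> usize {
    source.len()
}

// After a call such as `consume(code)`, the caller no longer owns `code`.
// Writing `code.len()` afterwards is rejected at compile time with
// "borrow of moved value", so a use-after-free cannot be expressed here.
```

With these definitions, `read_byte(b"<?php", 5)` returns `None` because the buffer only has indices 0 to 4, rather than reading past the end of the buffer.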
The quality of a project can be defined in various ways. Here is what we mean when speaking about quality.
The term “performance” must be defined. By “performance” we mean speed and memory efficiency. While speed is not the top priority, memory is: it is part of safety. When safety is ensured and quality is high enough to detect most regressions, we can safely patch the VM to get better performance if and when necessary.
We commit to using state-of-the-art algorithms and data structures to ensure excellent performance.
cargo available in the path. Cargo is the Rust package manager.
To build a release version:
$ cargo build --release
$ ./target/release/tvm --help
To build a development version:
$ cargo build
$ ./target/debug/tvm --help
If installing Rust and LLVM on your machine is too much, Docker might be an alternative: It provides everything needed to build, test and run Tagua VM.
First, build the Docker image:
$ docker build -t tagua-vm-dev .
Now, it is possible to run a container from this image:
$ docker run --rm -it -v $(pwd):/source tagua-vm-dev
If this command succeeds, you are inside a fresh container. To see if everything is fine, you can start the test suite:
$ cargo test
Do whatever you want. Just respect the license and the other contributors. Your favorite tool is going to be:
$ cargo test
to run all the test suites (unit test suites, integration test suites and documentation test suites).
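To make the three kinds of test suites concrete, here is a small sketch of a library source file, under the assumption of a hypothetical function `starts_with_open_tag` that does not come from Tagua VM. The indented snippet in the doc comment is a documentation test, and the `tests` module holds unit tests. Integration tests would conventionally live in separate files under a `tests/` directory next to `src/`.

```rust
/// Returns `true` when `source` starts with the PHP opening tag `<?php`.
/// (Hypothetical function, for illustration only.)
///
/// The indented snippet below is a documentation test: `cargo test`
/// compiles and runs it. It avoids crate paths so that it compiles
/// whatever the crate is named.
///
///     assert!("<?php echo 1;".starts_with("<?php"));
pub fn starts_with_open_tag(source: &str) -> bool {
    source.starts_with("<?php")
}

// Unit tests live next to the code and are only compiled by `cargo test`.
#[cfg(test)]
mod tests {
    use super::starts_with_open_tag;

    #[test]
    fn detects_open_tag() {
        assert!(starts_with_open_tag("<?php echo 1;"));
    }

    #[test]
    fn rejects_plain_text() {
        assert!(!starts_with_open_tag("hello"));
    }
}
```

Running `cargo test` on a crate containing this file would execute both unit tests and the documentation test in one go.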
To get an overview of what needs to be done, what is in progress, and what has recently been done, a kanban board is available.
The documentation is automatically uploaded online at the following address: https://tagua-vm.github.io/tagua-vm.
To generate it locally, please run the following command:
$ cargo doc --open
The project has a channel on Freenode. Alternatively, there is a mirrored room on Gitter.
Tagua VM is designed as a set of libraries that can work outside of the project. So far, the following libraries are living outside of the project:
libtagua_parser, a safe, fast and memory-efficient PHP parser (lexical and syntactic analysers).
Tagua VM is under the New BSD License (BSD-3-Clause):
New BSD License
Copyright © 2016-2016, Ivan Enderlin. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the Hoa nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS AND CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.