Soft Typing And Analyses For PHP Programs

Date: 2007-03-15

Time: 12.00

Room: BBL room 471

Speaker: Patrick Camphuijsen

Title: Soft typing and analyses for PHP programs (thesis defense)

Abstract

Websites are often programmed in scripting languages such as PHP, which are dynamically typed. Learning the basics of PHP is not very difficult, and powerful applications can be developed with very little trouble. However, knowing all of the pitfalls of such a language, and all of the issues involved in developing secure and correctly working web applications, is very hard.

Attention to these issues typically goes to security, and several tools exist to tackle security problems. However, most of these tools do not deal with the correctness of code. Correctness problems can then only be found manually, by thoroughly testing the program, which can take a lot of time and effort.

We have investigated several such correctness issues: for example, checking that no undefined variables are used, or that all functions are called with parameters of the correct types. These problems are all related to type inference, and solving them requires type inference before the program is executed. PHP's dynamic type system, however, determines types only at run time.

Other issues are also a matter of code correctness, such as checking that a database connection is actually opened before it is used. HTML validation is a correctness check as well. However, common HTML validators only look at the output; when a validator finds a mistake, it does not tell you where in your code the mistake originates. Lastly, we looked at the notion of coding practices. These can be used to dictate a programming style that makes your code more readable and manageable. Here you can think of questions such as "do I use many large code blocks" or "do I use deprecated identifiers". Many variations on this are possible.
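As a rough illustration of the undefined-variable check mentioned above, the following Python sketch scans a toy straight-line program and reports every variable that is read before any statement has assigned it. The statement representation and the name `find_undefined_uses` are assumptions made for this example; they are not taken from the thesis, and a real analysis for PHP would also have to handle branches, loops, includes, and global scope.

```python
from typing import List, Set, Tuple

# Toy straight-line program: each statement is (assigned_variable, variables_read).
# This representation is an assumption for illustration; it is not PHP syntax.
Statement = Tuple[str, List[str]]


def find_undefined_uses(program: List[Statement]) -> List[Tuple[int, str]]:
    """Return (statement index, variable) for every read of a variable
    that no earlier statement has assigned."""
    defined: Set[str] = set()
    problems: List[Tuple[int, str]] = []
    for index, (target, reads) in enumerate(program):
        # Reads are checked before the assignment of this statement takes effect.
        for name in reads:
            if name not in defined:
                problems.append((index, name))
        defined.add(target)
    return problems


# Hypothetical program; the comments show the PHP statement each entry stands for.
program: List[Statement] = [
    ("total", []),                   # $total = 0;
    ("price", []),                   # $price = 10;
    ("total", ["total", "price"]),   # $total = $total + $price;
    ("result", ["total", "tax"]),    # $result = $total * $tax;  ($tax is never assigned)
]

undefined = find_undefined_uses(program)
```

Here `undefined` contains a single entry pointing at the read of `tax` in the last statement, which is the kind of report a programmer would otherwise only get by running that line.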

To deal with problems like undefined variables and function types, type inference at compile time is required. But a static type system would not work: PHP does not require variables to be declared, and a static type system does not allow the type of a variable to change. Imposing one would also destroy the flexibility of PHP and make many useful programs untypable. Instead, a soft type system is needed, which has the advantages of a static type system (i.e. type inference at compile time), yet still allows the flexibility of the PHP language.
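The general idea of soft typing can be sketched in a few lines of Python. Instead of rejecting a variable whose type changes, the sketch records the set of types the variable may hold and then classifies each use: accepted when every possible type fits, a warning when only some fit (a run-time check is still needed), and an error when none fit. This is a minimal, flow-insensitive illustration under assumed names (`infer_type_sets`, `check_use`); it is not the type system specified in the thesis.

```python
from typing import Dict, FrozenSet, List, Tuple

TypeSet = FrozenSet[str]


def infer_type_sets(assignments: List[Tuple[str, str]]) -> Dict[str, TypeSet]:
    """Give each variable the union of the types of all values assigned to it.
    A static type system would reject the second, differently typed assignment;
    here the variable simply gets a larger type set."""
    result: Dict[str, TypeSet] = {}
    for name, type_name in assignments:
        result[name] = result.get(name, frozenset()) | {type_name}
    return result


def check_use(possible: TypeSet, accepted: TypeSet) -> str:
    """Classify a use of a value with type set `possible` where `accepted` is allowed."""
    if not possible & accepted:
        return "error"      # no possible type fits: certainly wrong
    if possible <= accepted:
        return "ok"         # every possible type fits: safe without a run-time check
    return "warning"        # some types fit: may fail at run time


# Hypothetical assignments; comments show the PHP statement each entry stands for.
types = infer_type_sets([
    ("id", "int"),        # $id = 42;
    ("id", "string"),     # $id = "42";   type change, allowed
    ("count", "int"),     # $count = 3;
])

numeric = frozenset({"int", "float"})
count_result = check_use(types["count"], numeric)        # "ok"
id_result = check_use(types["id"], numeric)              # "warning"
array_result = check_use(types["id"], frozenset({"array"}))  # "error"
```

The three outcomes reflect the trade-off described above: programs that change a variable's type remain typable, while uses that can never succeed are still reported before the program runs.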

In my thesis I have described several analyses that tackle these kinds of correctness problems. Among these analyses I have also specified a soft typing system for PHP, which is the main subject of this talk. In this talk I will explain the theory behind soft typing, and illustrate how it can be implemented for PHP. If time allows, I will also discuss a few of the other analyses that I have specified.
