Away from trust-based application security: the programming language
--
Last time we started talking about application security, and we gave the usual reminder about the most common offender: the inability to control, limit and audit project dependencies. But then we went further down the rabbit hole and talked about the insufficient transparency of the operating system around remote connections and network traffic. Then we touched briefly on the numerous tracking extras applications ship with, which fundamentally take control away from any audit effort: the injected tracking or advertisement code is nobody’s concern, but marketing wants it because it drives some concept of revenue.
Finally, we voiced the fundamental conclusion to all these issues: security today, application security to be more precise, is based on trust. I trust the developer to give me a well-behaved framework, library or application. The dependencies are too many to track and properly audit, the applications communicate with too many online APIs to allow any kind of control, and the operating system often doesn’t help make all this chatter easy to understand either.
Moreover, malicious code may hide in the most obscure library I reference in my project. It may also not be fundamentally malicious, but coerced into malicious behavior through bugs I fail to notice as a developer. So we are left with trust-based security: I hope and trust that I don’t run malicious code in my project, I hope and trust that the application I build cannot be coerced into becoming a malicious mess that exfiltrates private data from users.
We hinted last time at possible solutions, and we categorized them a bit, showing that in order to have certainty about application security, we need a lot of work in all areas of software development: the programming language, the application runtime that hosts the program we build, and the operating system that enables the runtime. Today we will talk about the paradigms needed to enforce application security at the programming language level. So let’s strap in and imagine a secure-by-default language!
The compiler-runtime contract
Essentially, we need to limit the programming language so that it cannot express operations that break out of the system’s normal operation. You would think this should be easy, but it’s very tricky to define what a program does at any given moment. As I write my function that sends a trivial JSON package to an online API, I have no idea that a malicious user could force the system into unpredictable behavior by sending carefully crafted content I have no clue about. How can we mitigate this?
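To make the problem concrete before we look at mitigations, here is a minimal TypeScript sketch (the endpoint and names are hypothetical) of how innocently the vulnerable code usually reads:

async function submitFeedback(comment: string): Promise<void> {
  // Nothing here constrains `comment`: it may carry scripts, control
  // characters or payloads crafted to confuse the receiving service,
  // and this function will forward them all verbatim.
  await fetch("https://api.example.com/feedback", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ comment }),
  });
}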
We already have a lot of work done in this area, mostly related to memory misallocation and misuse. The .NET runtime, for example, employs many checks when accessing memory to make sure we don’t escape the imposed variable bounds. That’s why we get an exception when we try to access an index beyond an array’s size. Compilers also help here, Rust’s for example, which figures out the whole memory allocation business during compilation. While .NET makes sure memory is handled properly at runtime using garbage collection, Rust resolves memory management at compile time.
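To picture the kind of check such a runtime performs implicitly, here is a small TypeScript sketch of a guarded array access; this only illustrates the principle, not how .NET actually implements it:

function safeGet(values: number[], index: number): number {
  // Refuse any access outside the variable's bounds, the way a managed
  // runtime does before every array read.
  if (index < 0 || index >= values.length) {
    throw new RangeError(`Index ${index} is out of bounds for length ${values.length}`);
  }
  return values[index];
}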
That’s all great, but unfortunately memory is not the only place where misuse happens. When asking the operating system for a resource, the compiler needs to know what the resource should look like and what it will be used for. This can be done by annotating functions that make such requests:
@loads File(csvtext) -> file
@format File(csv)
@format-csv File(id: number, name: string, address: string)
@format-id(min: 0, max: 100)
@format-name(type: alphanumeric, length: 100)
@format-address(type: address)
function readFile(file) {
// open the file and read it
}
What happens in the above imagined code is that, as we build our regular file-reading function, we also instruct the compiler on how it should behave. We are essentially creating a contract, an agreement between the compiler and the runtime, describing how the runtime should behave when this function executes. We describe the input: that it’s a file, and that it should be in comma-separated-values format, or CSV. We also describe each field in the file, giving it a type and some constraints, the same way we would describe an API endpoint request.
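To see what the compiler could derive from such a contract, here is a TypeScript sketch of the validation a runtime might generate from the annotations above; every name and rule is illustrative, mirroring the imagined @format-* constraints:

type CsvRecord = { id: number; name: string; address: string };

// The checks a runtime could derive from the @format-* annotations.
function validateRecord(raw: CsvRecord): CsvRecord {
  if (!Number.isInteger(raw.id) || raw.id < 0 || raw.id > 100) {
    throw new Error("id violates @format-id(min: 0, max: 100)");
  }
  if (!/^[A-Za-z0-9 ]{1,100}$/.test(raw.name)) {
    throw new Error("name violates @format-name(type: alphanumeric, length: 100)");
  }
  // A real 'address' type would carry its own rules; a placeholder check here.
  if (raw.address.trim().length === 0) {
    throw new Error("address violates @format-address(type: address)");
  }
  return raw;
}

With such generated checks in place, the runtime could refuse the file before a single byte of it reaches our readFile body.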
Enforced strong typing
All of this enables the compiler to understand what kind of file it deals with, allowing it to constrain the runtime and stop it from reading unacceptable files. Programming languages currently handle this through strongly typed variables. The problem is that strongly typed variables are often poorly used. We mark a counter as an int, when in reality it should have its own counter type. We mark an address as a string, when in reality it should have an address type. But the problem runs deeper still.
Even though programming languages allow us to create our counter and address types, and even though some developers take the time to build them and provide proper validation, not all addresses are created equal, and not all developers want to put in the time to make them fully secure. To enforce security, the compiler would need to require these custom types to be built. It would need to require function descriptions like the one in the earlier example to be written for every function. When the compiler encounters a variable, it should immediately request a full description of it: constraints, usage and lifetime, much like Rust does.
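Mainstream languages can already approximate part of this. Here is a minimal TypeScript sketch, assuming a deliberately simple validation rule, of an address type that can only be created through its validating constructor:

// A branded type: a plain string cannot be passed where an Address is expected.
type Address = string & { readonly __brand: "Address" };

function toAddress(input: string): Address {
  // Hypothetical constraints: non-empty, bounded length, no markup characters.
  if (input.length === 0 || input.length > 200 || /[<>]/.test(input)) {
    throw new Error("not a valid address");
  }
  return input as Address;
}

The point of the thought experiment is that a secure-by-default language would demand this construction for every domain value, instead of leaving it to the developer’s goodwill.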
After the compiler understands what each variable does, it can restrict the runtime to perform only the types of activities the variable allows. There is a great difference between a string variable and an address variable. When it’s an address, a name or a phone, you immediately know how and where it should be used. But the fact that the developer knows is not enough. Just like for memory management, the compiler and the application runtime need to know as well, so they can prepare the allowed operations and deny the input that cannot be handled.
Another easy example is encountered all the time on the web, where strings are not sanitized and allow arbitrary content, such as scripts, to be passed along and rendered on the page. If the JavaScript runtime knew more about that string, that it’s supposed to be an address or an email, it would know to deny other inputs and steer clear of malicious behavior.
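Sketched in TypeScript, such a type-aware denial could look like this; the validation rule is intentionally simplified:

// If the runtime knows the string must be an email, everything else,
// scripts included, is refused before it ever reaches the page.
type Email = string & { readonly __brand: "Email" };

function toEmail(input: string): Email {
  // "<script>alert(1)</script>" can never satisfy this shape.
  if (!/^[^\s@<>]+@[^\s@<>]+\.[^\s@<>]+$/.test(input)) {
    throw new Error("input is not an email; rendering denied");
  }
  return input as Email;
}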
Annotating properties dealing with user input
Again, all of this can be implemented with better type systems that enforce proper descriptions instead of vague and rushed compromises. But since developers cannot be bothered to write proper types, enforcement has to come from the compiler. And the type description needs to go further still: the compiler needs to know explicitly which properties are linked to user interaction:
@user-input(address)
function readAddress(address: Address) {
// do something with the address
}
Letting the compiler know that a certain variable comes from the user, deals with user interaction or is produced by the user means the world for application security. Once instructed, the compiler can secure the variable, sanitize it, and give it properties that are not necessary in system-to-system interaction. It can make it mutable to allow editing, while keeping every other variable immutable. It can encrypt it in place if needed, or mark it as vulnerable.
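One way to picture what the compiler could generate for an @user-input variable, sketched in TypeScript with hypothetical names:

// A wrapper a compiler could emit for @user-input variables: the value is
// sanitized on every write, stays editable as user-facing values must be,
// and carries an explicit flag marking its user origin.
class UserInput {
  readonly fromUser = true;
  private value: string;

  constructor(raw: string, private readonly sanitize: (s: string) => string) {
    this.value = sanitize(raw);
  }

  set(raw: string): void {
    // Every edit passes through sanitization; there is no unchecked path in.
    this.value = this.sanitize(raw);
  }

  read(): string {
    return this.value;
  }
}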
By marking a variable as user-facing, the application could also enforce a specific UI for it. For example, if a variable has the address type, the UI could show a special, fully secured screen for inputting the needed values. Of course, this goes against the holy grail of current software development, custom input controls, but those are the worst offenders for application security we have at the moment.
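In code, the application would then stop drawing such forms itself and simply ask the runtime for its standard screen; a purely hypothetical API, sketched in TypeScript:

type Address = string & { readonly __brand: "Address" };

// Imagined runtime call: show the trusted, standardized address screen
// and return an already-validated value.
declare function requestStandardInput(kind: "address"): Promise<Address>;

async function collectShippingAddress(): Promise<Address> {
  return requestStandardInput("address");
}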
The user-application contract
The trouble with custom inputs is that they let the developer establish the interaction contract between the users and the application. The whole responsibility of interacting correctly with the application rests on the shoulders of the developer. It’s way too much to handle. No matter how nice the designer makes everything look, the developer has to anticipate so many ways in which the user may mess around with the custom input that it will always be a cat-and-mouse game.
The trouble is that the user, the designer and the developer all see different things on the screen when they look at the same UI. For example, when asked for a phone number, the user forgets that the country code is required, the developer simply assumes the user will input it, and the designer forgets entirely that a country code exists and that the phone library needs it to work. If instead the runtime provided its own fixed, standardized, secured and carefully tested phone input, all of this would go away. There would be no room for misunderstanding: every application in the world would have the same input for phone numbers.
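A minimal TypeScript sketch, assuming a simplified E.164 shape, of what that single standardized phone input could enforce for everyone:

type Phone = string & { readonly __brand: "Phone" };

function toPhone(input: string): Phone {
  const normalized = input.replace(/[\s\-().]/g, "");
  // The leading '+' and country code are not optional, so the
  // misunderstanding described above simply cannot happen.
  if (!/^\+[1-9]\d{7,14}$/.test(normalized)) {
    throw new Error("phone number must include a country code, e.g. +1...");
  }
  return normalized as Phone;
}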
Of course, by saying all this I trigger every UI designer. Everybody wants their UI to be unique, to draw attention. And if all interfaces are the same and we cannot change them, how are we supposed to innovate? This, of course, is not a new problem, and it has been thought about and solved before. I don’t know if you remember Windows Presentation Foundation, or WPF. It had a suite of UI components that were headless, or style-less. They had behavior associated with them and a certain default look, but as a developer you could style them infinitely, making them look exactly the way you wanted.
Unfortunately the solution did not stick. HTML is able to do something similar, kind of, but it’s also way too permissive. You can hide an input element entirely, create a different interaction surface out of a div, and then assign the values entered there to the hidden input element. This is extremely dangerous and creates so many security issues that I cannot even begin to list them. Even scarier is that this paradigm is commonly used to “style” elements with certain fixed UI traits that cannot be changed, such as the checkbox and the dropdown. Work has progressed recently to allow styling of these elements, but in the past, hiding them and faking them with other elements was the norm.
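To make the danger concrete, here is a TypeScript sketch of that pattern, with hypothetical element ids:

// The common anti-pattern: the real input is hidden, a styled div takes
// over the interaction, and script copies whatever the div holds into it.
const hiddenInput = document.getElementById("real-input") as HTMLInputElement;
hiddenInput.style.display = "none";

const fakeInput = document.getElementById("styled-div") as HTMLDivElement;
fakeInput.contentEditable = "true";
fakeInput.addEventListener("input", () => {
  // Nothing validates this value; any content placed in the div is
  // forwarded verbatim to the form field that will be submitted.
  hiddenInput.value = fakeInput.textContent ?? "";
});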
The contract users think they sign
The custom elements problem runs a bit deeper, and here we come back full circle to the main problem. When users see a calendar, they assume they are being asked to input a date. When they see a textbox labeled name, they assume they need to input a name. But all of this breaks down when a malicious developer tricks the user into inputting a name for one stated purpose, then runs away with the name to perform malicious tasks, hidden away from the user.
How do you protect the user from a malicious developer? Currently, you don’t, because we live in the “trust”-based security model. That’s why the programming language and the application runtime need to step in, take away this developer-can-do-anything power, and allow the user to feel safe when entering their name, address or password on a form.
Adding and enforcing proper types allows the compiler to gather better data on what the user is being asked for, and it can compile all these findings into a short text description: what the user needs to input, which endpoints the application will access, and what each endpoint will be used for. This short description can be shown to the user when installing the application, and they can then decide whether what they read is acceptable to them. Just like the permission lists shown when installing phone apps, input descriptions could give users better insight into an application’s behavior.
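Such a generated description could take many shapes; one illustrative possibility, expressed as a TypeScript constant with made-up values:

// A hypothetical interaction manifest the compiler could emit and the
// installer could display, next to the usual permission list.
const interactionManifest = {
  userInputs: [
    { name: "address", type: "address", usedFor: "shipping label printing" },
    { name: "phone", type: "phone", usedFor: "delivery notifications" },
  ],
  endpoints: [
    { url: "https://api.example.com/orders", usedFor: "order submission" },
  ],
} as const;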
Of course, not all users will read these, just like not all users care about the rights requested by their installed apps. But at least there would be a point where some would stop, read and agree to the actual interaction contract, not the imagined one they make up in their mind. And yes, such descriptions would also allow application runtimes to stop malicious users from forcing the application into a bad state, coercing it into becoming the next best key-logger ever produced.
Thank you for participating in this imagination exercise. Some of the traits described here are already implemented in modern programming languages; others remain to be tackled by smart developers concerned with application security. It’s not an easy subject, and it requires involvement from all parties: compilers, runtimes, developers and users. I hope you had a good time reading. See you in the next article!