Secure Programming

1. Foreword

This guide attempts to teach a different approach to creating software. It uses very simple examples to show how many kinds of programming problems can be turned into security attacks on a program, a computer, or an entire system.

Please note that this document is only an introduction to writing better and somewhat more secure code; it does not attempt to be a complete guide. It is a brief look at how we should view our code and programs, and at how to avoid many common problems.

Please remember that this document is about education for better coding, and NOT about teaching readers how to hack or crack programs.

2. General Info

When developing a program, it is likely that it will interact with users in one way or another, even if that only means that our program will read files in the system and use that data.

At schools and universities, people who start writing programs learn how to receive input, while their teachers usually tell them to "assume that the data you receive is valid". That is where the problems begin.

From the moment a program receives input, it cannot trust any input that it does not control.

Reading from a file is reading untrusted input, and so is reading user input or accepting data (packets) from a network, for example.

2.1. Why can't I trust an input ?

In order to understand why input is so dangerous, we first need to understand what input is:

Input can take the form of keystrokes, mouse movements or mouse button clicks. Input can also be content read or received in many other forms, such as a data stream or even the results of system functions. Anything that our program uses but does not produce itself is usually considered input.

The type of input does not matter, because the user can give us wrong input, either intentionally or by mistake. You cannot control or trust this input, because you cannot guess what it is going to be.

The input can be empty (NULL) "data", an out-of-range number, more characters than we expect, or even an attempt to change the address of the variable that receives the input. We simply cannot know what the user is going to do or provide.

Any "unsafe" handling of user input can let the user retrieve vital information that he or she must not or could not otherwise obtain, modify data that he or she could not change in any other way, or even break the program itself by supplying bad input.

2.2. What type of problems can we expect ?

Almost every type of bug has a matching type of security attack. Instead of writing at length about every single type of attack, this guide gives a very short list of common attacks and security risks.

The most common types of attack are:

2.2.1. Buffer Overflow

A buffer overflow happens when the data we are given is larger than the memory that was allocated for it:

var
  iNums : array [0..9] of integer;
  ....
  FillChar (iNums[-1], 100, #0);
  ....
  for i := -10 to 10 do
     readln (iNums[i]);
  ....

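In this example the static array iNums can hold only 10 numbers (indexes 0 to 9), while the loop reads 21 numbers into it (indexes -10 to 10). The FillChar call is wrong as well: it starts one element before the array and writes 100 bytes, which is more than the array occupies.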

Please note that while the compiler might warn or report an error and stop compiling in this simple case, it won't in more complex forms.
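
One safety net worth knowing about is run-time range checking. The following is only a minimal sketch, assuming Free Pascal in objfpc mode with the SysUtils unit, where the {$R+} directive should make an out-of-range index raise an ERangeError exception instead of silently overwriting memory. The program name and the way the bad index is produced are made up for the illustration:

```pascal
program RangeCheckDemo;
{$mode objfpc}
{$R+}  { turn on run-time range checking }
uses SysUtils;

var
  iNums : array [0..9] of Integer;
  i     : Integer;

begin
  { ParamCount is only known at run time, so the compiler cannot
    reject the bad index at compile time (hypothetical setup) }
  i := ParamCount + 10;
  try
    iNums[i] := 42;   { index 10 or above is outside 0..9 }
    WriteLn ('Stored a value at index ', i);
  except
    on E : ERangeError do
      WriteLn ('Range check stopped index ', i);
  end;
end.
```

Range checking costs some run time, and it does not check the byte count passed to raw memory routines such as FillChar, so it does not replace validating the input ourselves.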

If a user tries to run arbitrary code through one of these reads, he or she may well succeed, because we have gone outside the limits of the buffer that the system gave us. This type of problem is a buffer overflow.

Most buffer overflows let an attacker write outside the buffer and change, for example, the return address of a function. Instead of continuing with our code when the function returns, the program will then execute code supplied by the user, which could for example open a shell with the privileges of the user running the program.

2.2.2. DoS Attack

Denial of Service is not only a network problem of repeated pings; it can exist in many other forms:

procedure Recurse;
begin
  while (True) do
    begin
      Recurse;
    end;
end; 

This procedure will run until it has used up the memory available to it (in this case the stack), and it can cause the program to crash or even the system to stop responding. Although some operating systems, like Linux, will try to give you the ability to stop the program, doing so can take a lot of time.

Please note that while this is only a static example, we did create a DoS attack on every system that runs this code.

Another well-known cause of DoS is failing to free system resources: not only allocated memory, but also open sockets, open file descriptors and much more. Yet another is receiving the right data in an invalid or unexpected order.

For example, a memory allocation DoS:

 ...
begin
  while (True) do
    begin
      Getmem (OurPtr, 10);
      OurPtr := Something;
    end;
end. 

The above example shows a memory allocation (GetMem is the equivalent of C's malloc function) that keeps allocating memory without freeing the memory that is no longer used, while losing the address of the previously allocated block. This problem is known as a memory leak. Please note that most operating systems (Linux, Unix/Mac OS X and more) get the memory back as soon as the program finishes its execution, but some may not: older versions of Microsoft Windows, for example, could fail to return all of the leaked resources until a reboot.
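
The leak above can be avoided by pairing every GetMem with a FreeMem. The following is a minimal sketch, assuming Free Pascal in objfpc mode; the program name, the loop count and the buffer size are only illustrative. The try/finally block makes sure the buffer is released even if the code that uses it raises an exception:

```pascal
program NoLeakDemo;
{$mode objfpc}

const
  BufSize = 10;   { same size as in the leaking example }

var
  OurPtr : PByte;
  i      : Integer;

begin
  for i := 1 to 1000 do
   begin
     GetMem (OurPtr, BufSize);
     try
       { use the buffer; here we only clear it }
       FillChar (OurPtr^, BufSize, 0);
     finally
       { always give the memory back before the next round }
       FreeMem (OurPtr);
     end;
   end;
  WriteLn ('Finished without keeping any buffer allocated');
end.
```

The same pattern applies to other resources, such as files and sockets: acquire, use inside try, release inside finally.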

2.2.3. Injections

When the user gives us input, we need to work with it. However, when we use it "as-is", without filtering and escaping the data for our needs, the user can insert, for example, SQL fragments that cause our program to delete records or tables, to send the user restricted data (such as the database or table structure, a list of fields, the database user name and password, or the content of a directory or file), or even to execute a program on the computer.

Please note that while this example deals with SQL code, injections can also target programming code passed to "eval", commands passed to pipes, and the list goes on and on...

A SQL injection example:

User Input:
  Please enter your name: a' OR '1'='1
// Inside the code:
  ...
  write ('Please enter your name: ');
  readln (sName);
  // #39 is the single quote (') character
  Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=' + #39 + sName + #39);
  ...

The injected SQL adds a new condition to the "WHERE" clause that is always true, so the query matches every row. This can expose data that the user should never see, or cause other problems that we are not always able to detect.

2.2.4. Myth and Assumptions

Many security issues exist because developers ignore important warnings and information given by the compiler. Another reason is that they believe their program does not contain any problems or bugs that someone could take advantage of.

Here are some examples for this type of problem:

Myths:

  • Security by Obscurity - When no one knows about a problem, no one can take advantage of it.
  • Secure programming language - Many developers think that some programming languages, such as Perl and other scripting languages, are secure by default against buffer overflows and other vulnerabilities, and thus never take even simple measures against such potential problems.
  • A hashed password is secure - A file that holds a hashed password is not secure. A hash is one-way: it can only be generated from the original data, and the original data cannot be retrieved from it. But an attacker who obtains the hash may be able to authenticate as a valid user by using the hash itself, without ever knowing the real password.
  • Nothing can break my program.
  • Bugs can be fixed and solved on the fly.

Assumptions:

  • The QA team will find and fix all of my bugs.
  • The user will not harm my program and its data.
  • My program will be used only for its original use.
  • All exceptions can remain unhandled.
  • Compiled code cannot be read by humans.
  • Obfuscation of the machine code symbols is a type of protection.

3. Explanation

Now that we know some of the problems we face while developing programs, we should learn how to fix them. All of the problems listed above come down to two causes: assumptions and careless programming. In order to learn how to fix them, we first need to learn to think differently from the way we are used to.

3.1. Overflow

To fix overflows of data, whether in buffers or in other types of input, we first need to identify the type of data we are working with.

3.1.1. Buffer Overflow

Let us return to our small example:

var
  iNums : array [0..9] of integer;
  ....
  FillChar (iNums[-1], 100, #0);
  ....
  for i := -10 to 10 do
     readln (iNums[i]);
  ....

Here the range of the array was overrun by our values, without even a small test to check whether the index is valid.

In Pascal we can always find out the bounds of an array, including dynamic and open arrays, with the Low and High functions, and for a static array we can get its size in bytes with SizeOf. So all we need to do is to check whether an index is too low or too high for our buffer, and keep it within the range that the buffer can handle.

So the example should be changed into:

var
  iNums : array [0..9] of integer;
  ....
  // clear the whole array: SizeOf gives its size in bytes
  FillChar (iNums[Low (iNums)], SizeOf (iNums), #0);
  ....
  for i := Low (iNums) to High (iNums) do
     readln (iNums[i]);
  ....

But wait ! something is not right yet !

readln will accept an unlimited number of characters, and no one promises us that the input will be an integer, or even within the range of numbers we can handle.

3.1.2. Number Overflow

Because a string in Pascal is a pure array (hrmm hrmm.. not really, at least not AnsiString in FPC and Delphi, but let's pretend it is for a second, OK ?), readln knows the limits of the string and will not try to overflow the range we gave that type. The problem is that numbers are not the same as strings, and readln is not built to guess the range we need.

A computer has limits of many kinds regarding memory and numbers. It gives only a "small" amount of memory to each number (floating point or integer), and many times we do not need a large range of numbers at all (a Boolean, for example, needs only two values, although it usually occupies a whole byte).

In the above example an out-of-range number may cause a range check error or leave us with the wrong number (carry flag issues... I'm not going to explain them in this document), and we also have a DoS effect, because our program will halt at that point.

So what can we do from that point ?

First of all, we may wish to read the input into a string variable whose length is that of the largest number +1 (for the minus sign) and convert it ourselves, or we can create our own readln procedure/function that specializes in receiving Integer types.

For the first option we can do the following (copied from the FPC documentation):

Program Example74;

{ Program to demonstrate the Val function. }
Var I, Code : Integer;

begin
  Val (ParamStr (1),I,Code);
  If Code <> 0 then
    Writeln ('Error at position ',code,' : ',Paramstr(1)[Code])
  else
    Writeln ('Value : ',I);
end.
     

Here we can see how to convert a string into an integer with very easy error handling. The function StrToInt will also do the trick, but then we need to catch an exception in case of an error.
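
A conversion that succeeds still tells us nothing about whether the number is one we can handle. The following is a minimal sketch, assuming Free Pascal in objfpc mode and the TryStrToInt function from the SysUtils unit, which reports failure through its result instead of raising an exception. The function name ParseInRange and the limits 0..150 are hypothetical, chosen only for the illustration:

```pascal
program SafeIntDemo;
{$mode objfpc}{$H+}
uses SysUtils;

const
  MinValue = 0;     { hypothetical limits chosen for the illustration }
  MaxValue = 150;

{ Returns True only if S is a whole number inside MinValue..MaxValue }
function ParseInRange (const S : string; out Value : LongInt) : Boolean;
begin
  Result := TryStrToInt (Trim (S), Value) and
            (Value >= MinValue) and (Value <= MaxValue);
end;

var
  N : LongInt;

begin
  if ParseInRange ('42', N) then
    WriteLn ('Accepted: ', N);
  if not ParseInRange ('99999999999', N) then
    WriteLn ('Rejected: too large for the type');
  if not ParseInRange ('12abc', N) then
    WriteLn ('Rejected: not a number');
  if not ParseInRange ('200', N) then
    WriteLn ('Rejected: outside the allowed range');
end.
```

Checking the type and the range in one place means the rest of the program can rely on the value without repeating the tests.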

Here is a small example of a readln-like procedure for integer numbers.

program MyReadln;
uses CRT;

procedure MyIntReadLn (var Param : Integer; ParamLength : Integer);
var
  Line  : string;
  ch    : char;
  Error : Integer;

begin
  Line  := '';

  repeat
    ch := readkey;
    if (Length (Line) <> ParamLength) then
     begin
      if (ch in ['0'..'9']) then
       begin
         Line := Line + ch;
         write (ch);
       end
      else
      if (ch = '-') and (Length (Line) = 0) then
       begin
         Line := '-';
         write (ch);
       end;
      end;

    if (ch = #8) and (Length(Line) <> 0) then // backspace 
     begin
      Line := copy (Line, 1, Length (Line) -1);
      gotoxy (WhereX -1, WhereY);
      write (' ');
      gotoxy (WhereX -1, WhereY);
     end;
  until (ch = #13);

  val (Line, Param, Error);

  if (Error <> 0) then
    Param := 0;

 writeln;
end;

var
 Num : Integer;

begin
  write ('Number: ');
  MyIntReadLn (Num, 2);
  writeln ('The number is: ', Num);
end.
     

Please note that you can make it even better and more efficient if you wish. This is only a very small and basic example that demonstrates how it should be done.

3.1.3. What are the security risks in Overflows ?

A memory overflow can allow users to supply arbitrary machine code for the CPU to run. This code will execute anything that the user wishes to execute, and we will lose control over the system at that point.

3.2. Denial of Service

Denial of Service (DoS) is one of the hardest types of attacks to prevent. The reasons are:

  • A denial of service can happen even without any bug actually being exploited, like the DDoS "attacks" (Distributed DoS -> the Slashdot effect for example, when many people request the same web page at the same time).
  • Every system resource has the potential to cause a DoS. The DoS will exist because we opened one socket too many, read a file too big to fit in memory, or just opened too many files at the same time. Another DoS that we already talked about is failing to free memory when we "do not need" it anymore.
  • Removal of files, like a kernel module, can cause a big mess and a potential DoS as well.
  • Missing or wrong configuration can cause a denial of service just as well.
  • Too many permissions, or the lack of them.
  • Almost any type of exploit can result in a Denial of Service.

So as we can see, a Denial of Service can be almost anything that stops us from doing our work as we wish. The causes range from exploiting buggy code, through bad configuration, to simply hogging system resources.

In the earlier example of a denial of service:

procedure Recurse;
begin
  while (True) do
    begin
      Recurse;
    end;
end; 

I also created a stack overflow (another type of buffer overflow), which made the program need more and more memory in order to continue executing the code.
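
When recursion is driven by data we do not control, one common defence is to cap its depth. The following is a minimal sketch, assuming Free Pascal in objfpc mode; the function Descend, its parameters and the limit of 1000 levels are hypothetical and stand in for real work such as walking nested input:

```pascal
program DepthLimitDemo;
{$mode objfpc}

const
  MaxDepth = 1000;   { hypothetical limit chosen for the illustration }

{ Recurses until Target is reached, but gives up once MaxDepth is passed }
function Descend (Depth, Target : LongInt) : Boolean;
begin
  if Depth > MaxDepth then
   begin
     Result := False;   { refuse to go any deeper }
     Exit;
   end;

  if Depth = Target then
    Result := True
  else
    Result := Descend (Depth + 1, Target);
end;

begin
  if Descend (0, 10) then
    WriteLn ('Reached the target within the limit');
  if not Descend (0, 1000000) then
    WriteLn ('Stopped: the request was too deep');
end.
```

The caller gets a failure it can report, instead of the program exhausting its stack.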

Any system resource that is available to the program can be abused by not returning it to the system when the program "does not need it anymore". By holding system resources such as memory or sockets, we take away other programs' ability to perform some of their needed actions. That is why most programs will stop their execution and report an error, while others will hang and keep waiting for the system resources forever.

Please note that some abuse of system resources exists because of a bug in the source code. For example, a program waits for a 150k buffer to fill while the data actually sent is only 2 bytes. While the program is still waiting on that 150k buffer, a new request arrives and another 150k buffer is allocated, and so on, until the system is unable to answer any more requests (this is a known type of security attack, BTW).

A good workaround for this bug is to limit how many partially filled buffers can be allocated at one time, and to free a buffer completely if it is still not full after a "timeout". But doing that can also cause a Denial of Service, because the communication will stop anyway at some point.

3.3. Injection

There are many ways to inject code into our programs. As we saw in the earlier example:

User Input:
  Please enter your name: a' OR '1'='1
// Inside the code:
  ...
  write ('Please enter your name: ');
  readln (sName);
  // #39 is the single quote (') character
  Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=' + #39 + sName + #39);
  ... 

The injection happens when we do not filter our input (sanitize is the more professional term), do not escape dangerous characters, and do not check that we have received exactly the type of input we are looking for.

For example, we could check whether sName contains spaces and, if so, reject it without checking the rest of the variable. The reason is very simple: the name should be only one word, and for us a word is made of letters, maybe also the tick sign (') and maybe the underscore (_), and that is all. If it contains a number, it is not a word anymore (unless we wish to use "hacker language", or allow the use of numbers).

There are many ways to check whether the input has a valid structure. A less effective one, but one that is widely used, is the following:

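// TCharset is assumed to be declared as: type TCharset = set of Char;
function ValidVar (const S : AnsiString; AllowChars : TCharset) : Boolean;
var
 i : Integer;
begin
 i      := 0;
 Result := True;

 // stop at the first invalid char; never read past the end of S
 While (Result) and (i < Length (S)) do
  begin
     inc (i);
     Result := S [i] in AllowChars;
  end;
end;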

The function returns True if every character in S belongs to AllowChars, that is, if the content has a valid structure. Please note that this function is only a proof of concept and may need more work before it can be fully used.

Another good way to do the same is to use a regular expression (regex), as in the following (this is a proof of concept only, written in Perl; FPC does not have a fully supported regular expression engine that allows string modification, but there is 3rd party code that does offer such support):

$sName =~ s/[^a-z0-9\_\']//gi; 

The regex removes every invalid character from the string and leaves us with a string that is either valid or empty.
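
The same filtering can be done in plain Pascal without a regex engine. The following is a minimal sketch, assuming Free Pascal in objfpc mode; the function name KeepOnly and the TCharset type are hypothetical names introduced for the illustration. It keeps the characters allowed by the Perl expression above (letters, digits, underscore and tick) and drops everything else:

```pascal
program KeepOnlyDemo;
{$mode objfpc}{$H+}

type
  TCharset = set of Char;   { hypothetical helper type }

{ Returns a copy of S that contains only the characters in Allowed }
function KeepOnly (const S : string; const Allowed : TCharset) : string;
var
  i : Integer;
begin
  Result := '';
  for i := 1 to Length (S) do
    if S[i] in Allowed then
      Result := Result + S[i];
end;

begin
  { #39 is the tick (') character; the input is: a' OR 1=1 }
  WriteLn (KeepOnly ('a' + #39 + ' OR 1=1',
                     ['a'..'z', 'A'..'Z', '0'..'9', '_', #39]));
  { prints: a'OR11 }
end.
```

Note that silently stripping characters changes what the user typed; rejecting the whole input is often the safer choice.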

Now that we know our input is valid, we need to look at how the variable's content is going to be used. If the content is going into a database or a CGI script (or anything else that has its own syntax), we must escape it.

There are many ways to escape this type of content. Let's assume for now that the content is going into a database query. First of all we must make sure that the escaped content will not exceed the length limits of our database fields. If it does, we can cause data loss, a denial of service, and even buffer overflow problems (please note that a respectable database will usually truncate the data, sometimes not in a good location, but never count on it to do so).

After we have made sure that we stay within our limits, we can continue. To escape the content we can use several approaches. A less debugging-friendly way, but a sure way to get correct escaping, is to use the parameters technique (there are usually named parameters and indexed parameters):

Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=?');
Query1.Parameters.Add (sName);
if (Query1.Execute) then
 ... 

The above technique uses an indexed parameter, which lets the database engine escape the parameter so that we can use the content without any problems from illegal characters. The downside is that we cannot easily debug the outcome of the query. That is, we cannot see how the content of sName is embedded in the SQL statement, and therefore cannot easily see whether the parameters given to the query, and their order, are correct.

Usually the only escaping we need for a string that goes into a database is to escape the tick (') char (although some databases may have problems with more chars than ticks). So all we should do is represent ticks in a way that will not affect the database engine, like a backslash before the tick (\'), doubling every tick into two ticks (''), or maybe even using another char that replaces the ticks in the query and is replaced back when we show the content to the user.
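
Doubling the ticks can be sketched as follows. This is only an illustration, assuming Free Pascal in objfpc mode and the StringReplace function from the SysUtils unit; the function name EscapeTicks is hypothetical, and whether doubling is the right escape depends on the database in use:

```pascal
program EscapeTicksDemo;
{$mode objfpc}{$H+}
uses SysUtils;

{ Returns S with every tick (') doubled into two ticks ('') }
function EscapeTicks (const S : string) : string;
begin
  Result := StringReplace (S, #39, #39 + #39, [rfReplaceAll]);
end;

begin
  { the input is: O'Brien }
  WriteLn (EscapeTicks ('O' + #39 + 'Brien'));
  { prints: O''Brien }
end.
```

When the database layer offers parameters, they remain the safer option, because the engine then handles every special character and not only the tick.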

3.4. Myth and Assumption

One of the biggest problems with myths and assumptions is that we start to lose the ability to write efficient code. We all need to remember that there is not even one program that does not have bugs. But that is also an assumption :) although this assumption has never been broken so far :(

4. Beyond The Document

While in this document I gave short (yeah, I know it's an understatement ;)) examples and information on how to create better code, there are many issues that I did not touch on or raise. Among them are user privileges for executing programs, system rootkits, race conditions and many more problems that our code needs to take into consideration.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NoDerivs 2.5 License.