Chapter 1 CGI
<html>
<head>
<title>An Average Website</title>
</head>
<body bgcolor="#003399" text="#ffcc33">
<h1>An Average Website</h1>
<p>This is an average website.</p>
</body>
</html>
The above HTML code is static.
Static vs Dynamic Pages
…
If the user reloads a static page, they see exactly the same
content every time.
Its content was written directly by an author; when the user goes
to the site, that code is downloaded into the browser and interpreted.
Expires: Date
    The date the information becomes invalid. The browser uses it to
    decide when a page needs to be refreshed. A valid date string
    has the format Thu, 01 Jan 1998 12:00:00 GMT.
Location: URL
    The URL that should be returned instead of the URL requested.
    You can use this field to redirect a request to another URL.
Content-length: N
    The length, in bytes, of the data being returned. The browser uses
    this value to report the estimated download time for a file.
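The header fields above can be emitted by a CGI program as plain text before its output. The following is a minimal sketch, not a definitive implementation: the function names cgiHeaders and cgiRedirect and their parameters are illustrative assumptions, and the Content-type field is the one the chapter's own program prints.

```cpp
#include <cstddef>
#include <string>

// Builds the header block a CGI program prints before its output.
// Function name and parameters are illustrative, not part of any library.
std::string cgiHeaders(const std::string& contentType,
                       std::size_t contentLength,
                       const std::string& expires = "")
{
    std::string h = "Content-type: " + contentType + "\r\n";
    h += "Content-length: " + std::to_string(contentLength) + "\r\n";
    if (!expires.empty())
        h += "Expires: " + expires + "\r\n";
    h += "\r\n";                 // a blank line ends the headers
    return h;
}

// Builds a redirect response: a Location header followed by the blank line.
std::string cgiRedirect(const std::string& url)
{
    return "Location: " + url + "\r\n\r\n";
}
```

A program would write the returned string to standard output first, for example cout << cgiHeaders("text/html", page.size()) << page; where page is a hypothetical string holding the HTML.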
The GET method shows the input data to the user in the address bar
of the browser, as a string like:
www.check.com/cgi-bin/test.cgi?name=Tola&sex=male&age=25
The GET method is acceptable for small amounts of data.
It is also the default method when a CGI program is run via a link.
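To make the shape of that query string concrete, here is a hedged sketch of how form fields could be turned into one. The names urlEncode and buildQueryString are hypothetical, and the set of characters left unencoded follows the usual form-encoding convention (letters, digits, and - _ . *).

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Encodes one name or value: safe characters pass through, a space
// becomes '+', and every other byte becomes %XX (two hex digits).
std::string urlEncode(const std::string& text)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        bool safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                    (c >= '0' && c <= '9') ||
                    c == '-' || c == '_' || c == '.' || c == '*';
        if (safe) {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += digits[c >> 4];      // high four bits
            out += digits[c & 0x0F];    // low four bits
        }
    }
    return out;
}

// Joins name=value pairs with '&', encoding each part separately.
std::string buildQueryString(
    const std::vector<std::pair<std::string, std::string>>& fields)
{
    std::string query;
    for (std::size_t i = 0; i < fields.size(); i++) {
        if (i > 0)
            query += '&';
        query += urlEncode(fields[i].first) + '=' + urlEncode(fields[i].second);
    }
    return query;
}
```

Encoding each name and value separately is what allows a value to contain & or = without being mistaken for a separator.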
GET vs POST…
With GET, there is a practical limit on how large a URL can be.
The HTTP standard does not fix a maximum URL length, but
older clients and proxies may not handle more than about 255 characters.
Longer URLs usually work, but servers
are not obliged to accept them.
In POST, the query string is encoded in the HTTP
request body.
It is not part of the URL.
As a result, it is not limited by the URL length.
Unlike GET, POST allows arbitrarily long form data
to be communicated.
POST arguments usually do not appear in server logs.
A POST request carrying the same form data looks like this:
POST / HTTP/1.1
Host: localhost:1888
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.12)
Gecko/20051010 Firefox/1.0.7 (Ubuntu package 1.0.7)
Accept: text/xml,application/xml,application/xhtml+xml,text/html;
q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://springer/~s133ar/cform1.html
Content-Type: application/x-www-form-urlencoded
Content-Length: 25

name=Tola&sex=male&age=25
Your CGI program should inspect the
REQUEST_METHOD environment variable to determine whether
the form was submitted with the GET or the POST method.
Then it can take the appropriate action to retrieve the form data.
The CGI program can read the request method, POST or GET,
by calling getenv() with the environment variable name REQUEST_METHOD.
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cctype>
using namespace std;

// Global state shared by the functions below. The buffer sizes are an
// assumption; choose sizes that fit the largest form you expect.
char *method = NULL;    // request method, "GET" or "POST"
char *query = NULL;     // form data, first raw and then decoded
char temp[4096];        // decoded copy of the query
char str[4096];         // one name=value pair
char sepr[4096];        // the value part of one pair
int prevo;              // index of the previous '&' in the query
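// Reads the raw form data into query: from QUERY_STRING for GET,
// or from standard input for POST.
void getQuery()
{
    static char unknown[] = "unknown";
    static char empty[] = "";
    query = empty;
    method = getenv("REQUEST_METHOD");
    if(method == NULL)                      // not started by a Web server
        return;
    for(size_t i = 0; i < strlen(method); i++)
        method[i] = toupper((unsigned char)method[i]);
    if(strcmp(method, "GET") == 0)
    {
        char *qs = getenv("QUERY_STRING");
        if(qs != NULL)
            query = qs;
    }
    else if(strcmp(method, "POST") == 0)
    {
        char *cl = getenv("CONTENT_LENGTH");
        int len = (cl != NULL) ? atoi(cl) : 0;
        if(len < 0)
            len = 0;
        query = new char[len + 1];          // one extra byte for the terminator
        size_t got = fread(query, 1, len, stdin);
        query[got] = '\0';
    }
    else
        query = unknown;
}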
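// Decodes the URL-encoded query: '+' becomes a space and %XX becomes the
// character with hexadecimal code XX. The result is built in the global
// char array temp, and query is then pointed at it.
// Note: because decoding happens before the string is split into fields,
// an encoded '&' or '=' inside a value is later treated as a separator.
void changeSpecialCharacters()
{
    int t = 0;
    const char digits[] = "0123456789ABCDEF";
    int n = strlen(query);
    for(int i = 0; i < n && t + 1 < (int)sizeof(temp); i++)
    {
        if(query[i] == '+')
            temp[t++] = ' ';
        else if(query[i] == '%' && i + 2 < n &&
                isxdigit((unsigned char)query[i + 1]) &&
                isxdigit((unsigned char)query[i + 2]))
        {
            // look up each hex digit; toupper() lets %2f and %2F both work
            const char *hi = strchr(digits, toupper((unsigned char)query[i + 1]));
            const char *lo = strchr(digits, toupper((unsigned char)query[i + 2]));
            temp[t++] = (char)(16 * (hi - digits) + (lo - digits));
            i += 2;
        }
        else
            temp[t++] = query[i];           // ordinary character, or a stray '%'
    }
    temp[t] = '\0';
    query = temp;
    cout<<"\n<br>Decoded URL: "<<query;
}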
char* separate(char field[]);       // defined below

// Returns the value of the form field called name, or an empty string if
// the field is not present. The result points into a global buffer, so it
// is overwritten by the next call.
char* GET(const char *name)
{
    static char empty[] = "";
    int n = strlen(query), len = strlen(name);
    prevo = -1;                                 // index of the previous '&'
    for(int i = 0; i <= n; i++)
    {
        if(i == n || query[i] == '&')           // end of one name=value pair
        {
            int t = 0;
            for(int j = prevo + 1; j < i && t + 1 < (int)sizeof(str); j++)
                str[t++] = query[j];
            str[t] = '\0';
            // require the whole name followed by '=', not just a prefix
            if(strncmp(str, name, len) == 0 && str[len] == '=')
                return separate(str);
            prevo = i;
        }
    }
    return empty;
}
// Copies the part of a name=value pair after the first '=' into sepr.
char* separate(char field[])
{
    int u = 0;
    char *ret;
    strcpy(sepr, "");
    for(int t = 0; t < strlen(field); t++)
    {
        if(field[t] == '=')
        {
            for(int i = (t + 1); i < strlen(field); i++)
                sepr[u++] = field[i];
            sepr[u] = '\0';
            break;
        }
    }
    ret = sepr;
    return ret;
}
int main()
{
    cout<<"Content-type: text/html\r\n\r\n";
    getQuery();
    // check the method before touching the query
    if (method == NULL)
    {
        cout<<"<p>No posting method identified.</p>";
        return 0;
    }
    changeSpecialCharacters();
    cout<<"\n<br> First name: "<<GET("first_name");
    cout<<"\n<br> Last name: "<<GET("last_name");
    cout<<"\n<br> Password: "<<GET("password");
    return 0;
}
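The same parsing steps can also be written with std::string and std::map instead of global character buffers. The following is an alternative sketch under that assumption, not the program above: the names hexValue, urlDecode and parseQuery are illustrative. It splits the query into fields first and decodes each part afterwards, so an encoded & or = stays inside its value.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Value of one hexadecimal digit, or -1 if c is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes '+' to a space and %XX to one byte. A '%' that is not followed
// by two hex digits is copied unchanged.
std::string urlDecode(const std::string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); i++) {
        if (text[i] == '+') {
            out += ' ';
        } else if (text[i] == '%' && i + 2 < text.size() &&
                   hexValue(text[i + 1]) >= 0 && hexValue(text[i + 2]) >= 0) {
            out += static_cast<char>(16 * hexValue(text[i + 1]) + hexValue(text[i + 2]));
            i += 2;
        } else {
            out += text[i];
        }
    }
    return out;
}

// Splits "a=1&b=2" on '&', then each pair on its first '=', and decodes
// the name and the value separately.
std::map<std::string, std::string> parseQuery(const std::string& query)
{
    std::map<std::string, std::string> fields;
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t end = query.find('&', start);
        if (end == std::string::npos)
            end = query.size();
        std::string pair = query.substr(start, end - start);
        if (!pair.empty()) {
            std::size_t eq = pair.find('=');
            std::string name = urlDecode(pair.substr(0, eq));
            std::string value =
                (eq == std::string::npos) ? "" : urlDecode(pair.substr(eq + 1));
            fields[name] = value;
        }
        start = end + 1;
    }
    return fields;
}
```

With this version, a field is looked up as parseQuery(query)["first_name"], and a missing field yields an empty string.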
HTML form for above CGI:
<html>
<head>
<title>CGI Test</title>
<script language="JavaScript">
function validate() {
if(inp.first_name.value=="") {
alert("First name is empty");
return false;
}
if(inp.last_name.value=="") {
alert("Last name is empty");
return false;
}
if(inp.password.value=="") {
alert("Password is empty");
return false;
}
return true;
}
</script>
</head>
<body>
<form name="inp" method="get" action="/cgi-bin/Test.cgi" onSubmit="return validate();">
<span class="style1"><strong>Registration Form</strong></span> <br /> <br />
First name: <input type="text" name="first_name" /> <br /> <br />
Last name: <input type="text" name="last_name" /> <br /> <br />
Password: <input type="password" name="password" /> <br /> <br />
<input type="submit" value="Submit">
</form>
</body>
</html>
Test the CGI
To test the above CGI, first compile the C++ file.
This creates an executable file.
Rename the executable file to Test.cgi and put it in
the Web server's CGI directory.
Now you can open the form, fill it in, and then submit
it to run the CGI program.
Security
Since a CGI program is executable, running one is basically the
equivalent of letting the world run a program on your
system.
This is not safe: it creates a security risk.
Therefore, some security precautions need to
be taken when using CGI programs.
The first precaution is that CGI programs need to reside in
a special directory, so that the Web server knows to
execute the program rather than just send its contents to the
browser.
This directory is usually under the direct control of the
webmaster, which prevents the average user from creating CGI
programs.
The other precaution concerns forms: it is
critical to check form data before using it.
A malicious user can embed shell metacharacters
(characters that have special meaning to the shell)
in the form data.
If that data reaches a shell unchecked, it can do serious damage to your system.
For example, here is a form that asks for a user name:
<form action="/cgi-bin/finger.cgi" method="POST">
<input type="text" name="user" size="40">
<input type="submit" value="Get Information">
</form>
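Suppose the CGI program behind this form passes the input
straight to the shell:
#include <cstdlib>
#include <string>
using namespace std;
int main()
{
    // UNSAFE: form data is pasted into a shell command unchecked.
    // GET() is the helper defined in the program earlier in this chapter.
    string command = string("mkdir ") + GET("user");
    system(command.c_str());
    return 0;
}
What would happen if the user enters "John; rm *"
(or "John & del *" on Windows)?
The shell treats ";" (or "&") as a command separator, so it runs
mkdir John and then the delete command.
Passing such input to the command line could cause
catastrophic damage.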
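One common defence is to accept only input that matches a strict whitelist before it is used anywhere near a shell. The sketch below is one possible rule, stated as an assumption and not as the only correct policy: the function name isSafeUserName, the 32-character limit, and the allowed character set are all illustrative choices.

```cpp
#include <string>

// Accepts only non-empty names of at most 32 characters made of ASCII
// letters, digits and '_'. Everything else is rejected, including spaces,
// ';', '&', '*', quotes, and a leading '-' that a command could read as
// an option.
bool isSafeUserName(const std::string& name)
{
    if (name.empty() || name.size() > 32)
        return false;
    for (unsigned char c : name) {
        bool ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                  (c >= '0' && c <= '9') || c == '_';
        if (!ok)
            return false;
    }
    return true;
}
```

The CGI program would call isSafeUserName() on the form value and refuse the request when it returns false, instead of trying to strip or escape the dangerous characters.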
The false security of HTML forms: hidden input, limited options, and
the POST method
One way to pass constant data from a form, or to carry data across several sequential
inputs from the same user, is to use the <input type="hidden"> tag.
You should be aware that anyone can see this information using "View
Source". So, don't hide your secrets there.
Related to this is the issue of limiting user choices to the options in a SELECT
box.
This stops random data from being entered through the form, but unfortunately it is quite
easy to construct a URL that contains a query string with whatever the bad
guy wants.
For example, say you have a select box that limits the user to the values "male" or
"female":
http://biolinx.bios.niu.edu/cgi-bin/z012345/your_program.cgi?sex=male
A modestly clever user could change this to:
http://biolinx.bios.niu.edu/cgi-bin/z012345/your_program.cgi?sex=monday
Your carefully chosen options would be subverted.
Switching the form to POST does not prevent this: most CGI scripts are designed
to work with both POST and GET, and a POST request can be forged as well.
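Because the form cannot enforce its own options, the CGI program has to repeat the check on the server side. A minimal sketch, assuming a hypothetical helper named isAllowedOption and an exact, case-sensitive comparison:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true only if value is exactly one of the allowed options.
bool isAllowedOption(const std::string& value,
                     const std::vector<std::string>& allowed)
{
    return std::find(allowed.begin(), allowed.end(), value) != allowed.end();
}
```

For the select box above, the program would test the received sex value against the list {"male", "female"} and reject the request for anything else, such as "monday".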
Scripts that read or write files
A script that writes a file can be a problem.
In the simplest case, the contents of that file are completely trashed
by a malicious user.
A more dangerous case is that a bad guy might write a file that