07 Cgi
CGI Programming
Common Gateway Interface
• CGI is a standard mechanism that defines:
➢How URLs are associated with programs that a web server can run.
➢A protocol (of sorts) for passing the request to the external program.
➢How the external program sends the response to the client.
CGI URLs
• A web server provides some mapping between URLs and CGI programs.
The exact mapping is not standardized (the web server admin sets
it up).
• Typically:
➢requests that start with /CGI-BIN/ , /cgi-bin/ or /cgi/ , etc. refer to CGI
programs (not to static documents).
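A minimal sketch of such a mapping, assuming the server simply compares the start of the request path against a list of prefixes. The function name is_cgi_path and the hard-coded prefix list are hypothetical; a real server takes this from its configuration.

```c
#include <string.h>

/* Hypothetical prefix list; a real server reads this from its configuration. */
static const char *const cgi_prefixes[] = { "/cgi-bin/", "/CGI-BIN/", "/cgi/" };

/* Return 1 if the request path falls under one of the CGI prefixes, else 0. */
int is_cgi_path(const char *path)
{
    size_t count = sizeof cgi_prefixes / sizeof cgi_prefixes[0];

    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(cgi_prefixes[i]);
        if (strncmp(path, cgi_prefixes[i], n) == 0)
            return 1;
    }
    return 0;
}
```

With this list, /cgi-bin/finger is treated as a CGI program and /index.html as a static document.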
Request → CGI program
• The web server sets some environment variables with information
about the request.
• The web server fork()s and the child process exec()s the CGI
program.
QUERY_STRING
CONTENT_LENGTH
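The server-side steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular server does it: run_cgi is a hypothetical name, only two variables are exported, and connecting the child's standard output to the client connection is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Export request details as environment variables, fork(), and exec()
 * the CGI program in the child. Returns the child's exit status, or -1
 * if one of the server-side steps fails.
 */
int run_cgi(const char *program, const char *method, const char *query)
{
    /* Set in the parent for brevity; the child inherits them across fork(). */
    if (setenv("REQUEST_METHOD", method, 1) != 0 ||
        setenv("QUERY_STRING", query, 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execl(program, program, (char *)NULL);
        _exit(127);               /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The exit code 127 for a failed exec follows the shell convention; it is a choice made for this sketch.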
Request Method: GET
• GET requests can include a query string as part of the URL:
/cgi-bin/finger?hollingd
• The web server treats everything before the ‘?’ delimiter as the
resource name
• Everything after the ‘?’ is a string that is passed to the CGI program.
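The split at the '?' delimiter might look like the sketch below. The function name split_uri and its buffer-based interface are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a request URI at the first '?'. Copies the resource name into
 * `resource` (truncated to size-1 bytes, always NUL-terminated when
 * size > 0) and returns a pointer to the query string inside `uri`,
 * or NULL if the URI has no '?'.
 */
const char *split_uri(const char *uri, char *resource, size_t size)
{
    const char *mark = strchr(uri, '?');
    size_t len = mark ? (size_t)(mark - uri) : strlen(uri);

    if (size > 0) {
        if (len >= size)
            len = size - 1;       /* truncate rather than overflow */
        memcpy(resource, uri, len);
        resource[len] = '\0';
    }
    return mark ? mark + 1 : NULL;
}
```

For /cgi-bin/finger?hollingd this yields the resource name /cgi-bin/finger and the query string hollingd.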
Simple GET queries - ISINDEX
• You can put an <ISINDEX> tag inside an HTML document.
• The browser will create a text box that allows the user to enter a
single string.
If you enter the string “blahblah”, the browser will send a request to
the http server at foo.com that looks like this:
char *method;
method = getenv("REQUEST_METHOD");
if (method==NULL) … /* error! */
Getting the GET
• If the request method is GET:
if (strcasecmp(method,"get")==0)
• The next step is to get the query string from the environment variable
QUERY_STRING
char *query;
query = getenv("QUERY_STRING");
Send back HTTP Response and Headers:
• The CGI program can send back an HTTP status line:
• and headers:
printf("Content-type: text/html\r\n");
printf("\r\n");
Important!
• A CGI program doesn’t have to send a status line (the HTTP server will
do this for you if you don’t).
• A CGI program must always send back at least one header line
indicating the data type of the content (usually text/html).
• The web server will typically throw in a few header lines of its own
(Date, Server, Connection).
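As a small illustration of the minimum a CGI program has to emit, the sketch below builds the header block in a buffer: one Content-type line followed by the empty line that ends the headers. The helper name format_cgi_headers is hypothetical.

```c
#include <stdio.h>

/*
 * Write the minimal CGI header block for `content_type` into buf:
 * one Content-type line, then the empty line that ends the headers.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int format_cgi_headers(char *buf, size_t size, const char *content_type)
{
    int n = snprintf(buf, size, "Content-type: %s\r\n\r\n", content_type);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

A CGI program would write the resulting bytes to standard output before any of the document body.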
Simple GET handler
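#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *method, *query;

    method = getenv("REQUEST_METHOD");
    if (method == NULL)
        return 1;     /* error: not started by a web server */
    query = getenv("QUERY_STRING");
    if (query == NULL)
        query = "";   /* no query string was supplied */
    printf("Content-type: text/html\r\n\r\n");
    printf("<H1>Your query was %s</H1>\n", query);
    return 0;
}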
URL-encoding
• Browsers use an encoding when sending query strings that include
special characters.
• Most nonalphanumeric characters are encoded as a ‘%’ followed by 2 ASCII
encoded hex digits.
• ‘=’ (which is hex 3D) becomes “%3D”
• ‘&’ becomes “%26”
More URL encoding
• The space character ‘ ’ is replaced by ‘+’.
Example:
“foo=6 + 7” becomes “foo%3D6+%2B+7”
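On the receiving side, a CGI program has to reverse this encoding. The following is a minimal sketch of a decoder, assuming the query is held in a writable buffer; the name url_decode is hypothetical, and passing malformed '%' sequences through unchanged is a choice made here rather than a requirement.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int hex_value(char c)
{
    if (isdigit((unsigned char)c))
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/*
 * Decode a URL-encoded string in place: '+' becomes a space and "%XX"
 * becomes the byte with hex value XX. Malformed '%' sequences are copied
 * through unchanged. Returns the decoded length in bytes.
 */
size_t url_decode(char *s)
{
    char *in = s, *out = s;

    while (*in) {
        if (*in == '+') {
            *out++ = ' ';
            in++;
        } else if (*in == '%' && isxdigit((unsigned char)in[1])
                              && isxdigit((unsigned char)in[2])) {
            *out++ = (char)(hex_value(in[1]) * 16 + hex_value(in[2]));
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Decoding in place is safe because the decoded text is never longer than the encoded text. The length is returned because a decoded "%00" would otherwise cut the string short for callers that rely on the terminating NUL.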
Security!!!
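• It is a very bad idea to build a command line containing user input!
• For example, if a program runs grep on the user's input against
/usr/dict/words and the user submits “; rm -r *;”, the shell sees:
grep ; rm -r *; /usr/dict/words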
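One common defence is to accept only input that matches a strict allow-list before it gets anywhere near a command. The sketch below assumes letters and digits are enough for the application; the function name is_safe_word is hypothetical. Avoiding the shell altogether (calling an exec function with separate arguments) is another option that is not shown here.

```c
#include <ctype.h>

/*
 * Return 1 if s is non-empty and consists only of letters and digits,
 * so it carries no shell metacharacters such as ';', '*', or spaces.
 * Return 0 otherwise.
 */
int is_safe_word(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s; s++) {
        if (!isalnum((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

Input such as hollingd passes this check, while “; rm -r *;” is rejected before any command is built.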
Beyond ISINDEX - Forms
• Many Web services require more than a simple ISINDEX.
• If a form has name and occupation fields and the user types in
“Dave H.” as the name and “none” for occupation, the query would
look like this:
“name=Dave+H%2E&occupation=none”
HTML Forms
• Each form includes a METHOD that determines what HTTP method is
used to submit the request.
• The CGI must decode the query and separate the individual fields.
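Separating and decoding the fields could be sketched as below. The function get_field and its helper are hypothetical names, and the sketch assumes field names contain no encoded characters, so only the value is decoded.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode `len` bytes of URL-encoded text from src into dst (size >= 1). */
static void decode_into(char *dst, size_t size, const char *src, size_t len)
{
    size_t o = 0;

    for (size_t i = 0; i < len && o + 1 < size; i++) {
        if (src[i] == '+') {
            dst[o++] = ' ';
        } else if (src[i] == '%' && i + 2 < len
                   && isxdigit((unsigned char)src[i + 1])
                   && isxdigit((unsigned char)src[i + 2])) {
            char hex[3] = { src[i + 1], src[i + 2], '\0' };
            dst[o++] = (char)strtol(hex, NULL, 16);
            i += 2;               /* skip the two hex digits */
        } else {
            dst[o++] = src[i];
        }
    }
    dst[o] = '\0';
}

/*
 * Find field `name` in a query of the form "a=1&b=2" and copy its decoded
 * value into `value`. Returns 1 if the field was found, 0 otherwise.
 */
int get_field(const char *query, const char *name, char *value, size_t size)
{
    size_t name_len = strlen(name);

    if (size == 0)
        return 0;
    while (*query) {
        size_t pair_len = strcspn(query, "&");    /* one name=value pair */

        if (pair_len > name_len && query[name_len] == '='
            && strncmp(query, name, name_len) == 0) {
            decode_into(value, size, query + name_len + 1,
                        pair_len - name_len - 1);
            return 1;
        }
        query += pair_len;
        if (*query == '&')
            query++;
    }
    return 0;
}
```

Applied to the query “name=Dave+H%2E&occupation=none”, looking up name gives “Dave H.” and looking up occupation gives “none”.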
HTTP Method: POST
• The HTTP POST method delivers data from the browser as the content
of the request.
if (read(0,buff,len)<0)
… /* handle error */
pray_for(!hacker);
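A somewhat more defensive way to read the POST data is sketched below: check the announced length before trusting it, then keep reading until that many bytes have arrived. The name read_post_body and the MAX_POST_BYTES cap are assumptions for illustration, and retrying after an interrupted read is not handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_POST_BYTES 65536L   /* hypothetical cap; pick one that fits the form */

/*
 * Read a POST body of the length announced in CONTENT_LENGTH from fd
 * (0 for standard input in a CGI program). Returns a NUL-terminated
 * buffer that the caller must free(), or NULL on any error.
 */
char *read_post_body(int fd, const char *content_length)
{
    char *end;
    long len;

    if (content_length == NULL)
        return NULL;
    len = strtol(content_length, &end, 10);
    if (end == content_length || *end != '\0' || len < 0 || len > MAX_POST_BYTES)
        return NULL;

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;

    long got = 0;
    while (got < len) {          /* read() may return fewer bytes than asked */
        ssize_t n = read(fd, buf + got, (size_t)(len - got));
        if (n <= 0) {            /* error or early end of input */
            free(buf);
            return NULL;
        }
        got += n;
    }
    buf[len] = '\0';
    return buf;
}
```

In a CGI program the call would be read_post_body(0, getenv("CONTENT_LENGTH")), and the returned buffer holds the URL-encoded form data.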
CGI Method summary
• GET:
• REQUEST_METHOD is “GET”
• QUERY_STRING is the query
• POST:
• REQUEST_METHOD is “POST”
• CONTENT_LENGTH is the size of the query (in bytes)
• query can be read from STDIN
Form CGI Example
• Student enters first name, last name and social security number and
presses a submit button.
• CGI program looks up grades for the student and returns a list of
grades.
There’s More to Come
• Keeping track of state information.
• Cookies.
• Using HTML templates
• Using JavaScript to perform form validation and other fancy stuff.
• Image Mapping
• Authentication
• Encryption