Difference between revisions of "Rewrite engine"
Latest revision as of 21:43, 15 April 2015
A rewrite engine is a piece of web server software used to modify URLs, for a variety of purposes. Some benefits derived from a rewrite engine are:
- Making website URLs more user friendly
- Making website URLs more search-engine friendly
- Preventing undesired "inline linking"
- Not exposing the (web address related) inner workings of a website to users
Many of these only apply to HTTP servers whose default behaviour is to map URLs to filesystem entities (i.e. files and directories); certain environments, such as many HTTP application server platforms, make this irrelevant.
The Apache HTTP server has a rewrite engine called mod_rewrite (see below), which has been described as "the Swiss Army knife of URL manipulation".
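As a sketch of the inline-linking case listed above, the following mod_rewrite fragment refuses image requests that were referred by another site. It assumes the rules live in a .htaccess file at the document root and uses example.com as a placeholder for the site's own host name; the list of image extensions is likewise only illustrative.

```apache
RewriteEngine On

# Let requests with an empty Referer through (direct visits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred by our own pages through (example.com is a placeholder).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Anything else asking for an image gets 403 Forbidden; "-" means no substitution.
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]
```

The Referer header is supplied by the client and can be omitted or forged, so this discourages casual inline linking rather than preventing it outright.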
mod_rewrite
RewriteRule FLAGS

Flag | Description
---|---
R[=code] | Redirect to new URL, with optional status code (see below).
F | Forbidden (sends 403 header)
G | Gone (sends 410 header; the resource no longer exists)
P | Proxy
L | Last rule (stop processing further rules)
N | Next (i.e., restart the rules from the top)
C | Chain (tie this rule to the next one)
T=mime-type | Set MIME type
NS | Skip if internal sub-request
NC | Case insensitive
QSA | Append query string
NE | Do not escape output
PT | Pass through
S=x | Skip next x rules
E=var:value | Set environment variable "var" to "value".
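
Flags go in square brackets after the substitution and are separated by commas. The fragment below is a minimal sketch of a few of them, assuming a .htaccess (per-directory) context; the script name search.php, the feeds/ path, and the variable name IS_API are hypothetical.

```apache
RewriteEngine On

# QSA + L: append the caller's query string to the new one, then stop.
# /search/apache?page=2 -> /search.php?q=apache&page=2
RewriteRule ^search/([^/]+)/?$ search.php?q=$1 [QSA,L]

# NC + T: match the extension in any letter case and set the MIME type.
# "-" means the URL itself is left unchanged.
RewriteRule ^feeds/.*\.rss$ - [NC,T=application/rss+xml]

# E: set the (hypothetical) environment variable IS_API to 1 for
# requests under api/, again without changing the URL.
RewriteRule ^api/ - [E=IS_API:1]
```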

RewriteCond FLAGS

Flag | Description
---|---
NC | Case insensitive
OR | Combines this condition with the next one using OR, so the rule applies if any one of a series of conditions is true.
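
Without a flag, consecutive RewriteCond lines are ANDed together; OR changes that for the pair it joins. A short sketch of both cases, assuming a .htaccess context and placeholder host names and paths:

```apache
RewriteEngine On

# Default (AND): both conditions must hold before the rule applies.
# Here: POST requests that send no User-Agent header are refused.
RewriteCond %{REQUEST_METHOD} ^POST$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^comments/ - [F]

# OR: either host name (in any letter case, thanks to NC) triggers
# the permanent redirect to the main site.
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^legacy\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```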

Regular Expression Syntax

Pattern | Description
---|---
^ | Start of string
$ | End of string
. | Any single character
(a|b) | a or b
(...) | Group section
[abc] | Any one character in the set (a or b or c)
[^abc] | Any one character not in the set (not a or b or c)
a? | Zero or one of a
a* | Zero or more of a
a+ | One or more of a
a{3} | Exactly 3 of a
a{3,} | 3 or more of a
a{3,6} | Between 3 and 6 of a
!(pattern) | "Not" prefix. Apply rule when URL does not match pattern.
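
The patterns above combine as in the following sketch. It assumes a .htaccess context; the host name www.example.com and the scripts archive.php and media.php are hypothetical. The "!" prefix is shown on a RewriteCond pattern, where it behaves the same way as on a RewriteRule pattern.

```apache
RewriteEngine On

# "!" negates the pattern: the condition is true when the host is NOT
# www.example.com, so every other host name is redirected to it.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# ^ and $ anchor the whole path; [0-9]{4} is exactly four digits and
# [0-9]{1,2} is one or two digits.
# /archive/2015/4 -> /archive.php?year=2015&month=4
RewriteRule ^archive/([0-9]{4})/([0-9]{1,2})/?$ archive.php?year=$1&month=$2 [L]

# (a|b) alternation followed by an optional "s":
# matches image/..., images/..., photo/... and photos/...
RewriteRule ^(image|photo)s?/([a-z0-9-]+)$ media.php?kind=$1&slug=$2 [L]
```

A negated pattern has no match to capture from, so $1 or %1 cannot refer to groups inside it.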

Redirection Header Codes

Code | Description
---|---
301 | Moved permanently
302 | Moved temporarily
403 | Forbidden
404 | Not found
410 | Gone
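
These codes are selected through the R, F and G flags. The sketch below assumes a .htaccess context, and every path in it is a made-up example.

```apache
RewriteEngine On

# 301: the move is permanent, so clients may remember the new address.
RewriteRule ^about-us\.html$ /about/ [R=301,L]

# 302: temporary move; a plain [R] sends 302 as well.
RewriteRule ^sale/?$ /summer-sale/ [R=302,L]

# 410 through [G] and 403 through [F]; "-" leaves the URL unchanged.
RewriteRule ^discontinued/ - [G]
RewriteRule ^internal/ - [F]

# The Apache 2.4 documentation also allows non-3xx codes on R; the
# substitution is then dropped and no redirect is sent.
RewriteRule ^missing-example$ - [R=404,L]
```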
Server Variables
Format
- %{NAME_OF_VAR}
HTTP Headers
- HTTP_USER_AGENT
- HTTP_REFERER
- HTTP_COOKIE
- HTTP_FORWARDED
- HTTP_HOST
- HTTP_PROXY_CONNECTION
- HTTP_ACCEPT
Request
- REMOTE_ADDR
- REMOTE_HOST
- REMOTE_USER
- REMOTE_IDENT
- REQUEST_METHOD
- SCRIPT_FILENAME
- PATH_INFO
- QUERY_STRING
- AUTH_TYPE
Server
- DOCUMENT_ROOT
- SERVER_ADMIN
- SERVER_NAME
- SERVER_ADDR
- SERVER_PORT
- SERVER_PROTOCOL
- SERVER_SOFTWARE
Time
- TIME_YEAR
- TIME_MON
- TIME_DAY
- TIME_HOUR
- TIME_MIN
- TIME_SEC
- TIME_WDAY
- TIME
Special
- API_VERSION
- THE_REQUEST
- REQUEST_URI
- REQUEST_FILENAME
- IS_SUBREQ
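Server variables are normally tested in a RewriteCond using the %{NAME_OF_VAR} format. A minimal sketch, assuming a .htaccess file at the document root; index.php and maintenance.html are hypothetical files, and the -f and -d file tests are standard RewriteCond tests that the tables above do not list.

```apache
RewriteEngine On

# REQUEST_METHOD: refuse TRACE and TRACK requests.
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)$
RewriteRule .* - [F]

# TIME_HOUR is a two-digit hour (00-23): serve a maintenance page
# between 02:00 and 02:59. The REQUEST_URI test stops the rule from
# matching its own result.
RewriteCond %{TIME_HOUR} ^02$
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule .* /maintenance.html [L]

# REQUEST_FILENAME: hand the request to a front controller only when
# it does not correspond to an existing file (-f) or directory (-d).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ index.php?path=$1 [QSA,L]
```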
Directives
- RewriteEngine
- RewriteOptions
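- RewriteLog (Apache 2.2 and earlier; Apache 2.4 uses LogLevel instead)
- RewriteLogLevel (Apache 2.2 and earlier; Apache 2.4 uses LogLevel instead)
- RewriteLock (Apache 2.2 and earlier; Apache 2.4 uses Mutex instead)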
- RewriteMap
- RewriteBase
- RewriteCond
- RewriteRule
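The most commonly used of these directives fit together as in the sketch below. It assumes a .htaccess file inside a directory that is served under /blog/; that prefix and the script tag.php are hypothetical.

```apache
# RewriteEngine: rules in this file are ignored unless the engine is on.
RewriteEngine On

# RewriteBase: the URL prefix under which this directory is reached,
# used when a relative substitution is turned back into a URL.
RewriteBase /blog/

# RewriteCond: guards the single RewriteRule that follows it.
# Here the rule only applies when the request has no query string.
RewriteCond %{QUERY_STRING} ^$

# RewriteRule: pattern, substitution, flags.
# /blog/tag/apache -> /blog/tag.php?name=apache
RewriteRule ^tag/([a-z0-9-]+)/?$ tag.php?name=$1 [L]
```

In per-directory context the pattern is matched against the path with the directory prefix already removed, which is why it starts with tag/ rather than /blog/tag/.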
Example rules
    # Site has permanently moved to a new domain
    # domain.com to domain2.com
    # (dots are escaped so that they match a literal ".")
    RewriteCond %{HTTP_HOST} ^(www\.)?domain\.com$ [NC]
    RewriteRule ^(.*)$ http://www.domain2.com/$1 [R=301,L]

    # Page has moved temporarily
    # domain.com/page.htm to domain.com/new_page.htm
    RewriteRule ^page\.htm$ /new_page.htm [R,NC,L]

    # Nice looking URLs (no query string)
    # domain.com/category-name-1/ to domain.com/categories.php?name=category-name-1
    RewriteRule ^([A-Za-z0-9-]+)/?$ categories.php?name=$1 [L]

    # Nice looking URLs (no query string) with pagination
    # domain.com/articles/title/5/ to domain.com/article.php?name=title&page=5
    RewriteRule ^articles/([A-Za-z0-9-]+)/([0-9]+)/?$ article.php?name=$1&page=$2 [L]

    # Block referrer spam
    RewriteCond %{HTTP_REFERER} (weight) [NC,OR]
    RewriteCond %{HTTP_REFERER} (drugs) [NC]
    RewriteRule .* - [F]
User friendly / Search engine friendly URLs
People use website URLs in all kinds of ways. We send them to other people by email, put them on online discussion boards, or even write them on scraps of paper. This often applies not just to website home pages, but to specific content within a website. Typically website developers want to encourage this, as it means increased traffic to their sites. As such, a well designed website should allow users to enter at any URL (not just the homepage), and the URLs throughout the site should be easy to use.
A URL is easier to use if it is short but descriptive. The URL should have some text describing the content (not just numbers), but should not be too long.
Search engines will also find it easier to index pages which follow these rules. Content which is easier to index is more likely to be included in search results.
Website URLs are often long and meaningless to humans. This is because many websites have dynamic content, meaning that the HTML returned to the browser is generated on the fly rather than stored as a static HTML file. The URL is used not only to reference an HTML document at a fixed address, but also to pass pieces of data to software running on the web server, which then generates the HTML page dynamically. Typically this software takes the form of scripts written in a web scripting language such as Perl or PHP.
Using a URL rewrite engine, the website software can be presented with URLs in one form, while the actual requests (and the URLs seen by the user) are in another form. Rewrite engines therefore allow URLs to be tidied up and made more user friendly by configuring rewrite rules, rather than by modifying the website software.
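The two forms can be sketched with a pair of rules: one maps the friendly URL onto the script internally, the other redirects visitors who still use the script URL. This assumes a .htaccess file at the document root and a hypothetical script article.php that takes a numeric id parameter; %1 refers to the group captured by the RewriteCond.

```apache
RewriteEngine On

# Old form typed or linked by a visitor: /article.php?id=42
# THE_REQUEST holds the original request line ("GET /article.php?id=42
# HTTP/1.1"), so this condition does not match after the internal
# rewrite below. The trailing "?" drops the old query string.
RewriteCond %{THE_REQUEST} \s/article\.php\?id=([0-9]+)\s
RewriteRule ^article\.php$ /article/%1? [R=301,L]

# Friendly form seen by the user: /article/42
# Internally the script still receives id=42.
RewriteRule ^article/([0-9]+)/?$ article.php?id=$1 [L]
```

Testing THE_REQUEST rather than the current URL is what keeps the two rules from redirecting in a loop.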
Example VirtualHost domain redirect
    <Directory />
        Options FollowSymLinks
        AllowOverride All
    </Directory>

    <VirtualHost *:80>

        ServerAdmin admin@example.com
        ServerName example.com
        ServerAlias www.example.com

        # Index file and Document Root (where the public files are located)
        DirectoryIndex index.html index.php
        DocumentRoot /var/www/html/example.com

        # Rewrite rules
        # Example: http://xtof.ch/skills redirects to http://wiki.christophchamp.com/index.php/Technical_and_Specialized_Skills
        RewriteEngine On
        RewriteRule ^/skills$ http://wiki.christophchamp.com/index.php/Technical_and_Specialized_Skills [R=301,L]

        # Custom log file locations
        LogLevel warn
        ErrorLog /var/log/httpd/example.com-error.log
        CustomLog /var/log/httpd/example.com-access.log combined

    </VirtualHost>
See also
- .htaccess
- Robots Exclusion Standard (aka "robots.txt")
External links
- Apache Module mod_rewrite (http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html)
- Apache's mod_rewrite (http://httpd.apache.org/docs/mod/mod_rewrite.html): a detailed discussion of mod_rewrite's many features.
- Rewrite URLs with mod_rewrite (http://www.yourhtmlsource.com/sitemanagement/urlrewriting.html): a tutorial for redirecting URLs.
- Apache mod rewrite tutorials & regular expression lessons: mod_rewrite instruction with regular expression syntax lessons, a how-to tutorial for beginner to advanced programmers, and a code library of pre-written rewrite rules.
- Mod_Rewrite URLs for Search Engines
- Free Mod_Rewrite Tutorials for php scripts: a forum that helps users solve their mod_rewrite problems and assists in developing mod_rewrite rules for popular PHP scripts such as phpBB, vBulletin, Coppermine Gallery, and Simple Machines Forum.
- Unlimited mod_rewrite Parameters (http://lotsofphp.com/tutorials/rewritten-urls-unlimited-parameters.html): Ruben K's tutorial on how to have unlimited mod_rewrite parameters without having to add more rewrite lines
- Repository of mod_rewrite/.htaccess snippets, examples and tricks (http://www.myhtaccess.com)
- Htaccess Rewrites – Rewrite Tricks and Tips (http://www.askapache.com/htaccess/modrewrite-tips-tricks.html), by AskApache