Directory Traversal

A website is stored within a file system on a server. Some of the server's file system is therefore exposed to the outside world and can be accessed by an end-user's web browser. The part of the file system (or directory structure) that is visible to the outside world is limited to a specific root folder and its contents.

Any folders higher up the hierarchy (i.e. above the root folder) are, in theory, unreachable by the world at large. Only authorized users who are logged in on the web server itself can access such folders.

For example, the web server might have a directory structure similar to the one below. This is an imaginary setup for illustration only, not a layout you would find exactly as shown on an actual hosting account:

. username
. . public_html
. . . images
. . . downloads
. . private
. . . documents
. . . passwords

In the above example, the public_html folder is the root folder (often referred to as WEBROOT) for the website. Anything underneath that folder in the hierarchy can be accessed by a web browser (like the images and downloads folders). All of the other folders are not accessible to the world at large because they are not located under the public_html folder.

In a directory traversal attack, though, a poorly written script can allow an attacker to access those other folders and read their contents using nothing more than a web browser. This is because a server-side scripting language, such as PHP (the "mother tongue" of Joomla), runs on the server as though it were a logged-in user. The script can reach every folder and file that the account it runs under is allowed to read, not just those underneath the root. If a script reads (or outputs the contents of) files on the server as part of its legitimate processing, it must be written in such a way that the files that are used cannot be specified arbitrarily by the end user.

Taking the above directory structure as an example, suppose there is a script on the server that reads the contents of a text file in the public_html folder and outputs it to the screen. If the end user is able to specify the name of the text file to be displayed, the script needs to make sure that the name they entered still refers to a file within the public_html folder. If they enter a file name like '..\private\passwords\passwordlist.txt', the two dots at the start tell the script to move up one level in the directory structure, effectively breaking out of the website's root folder. From there the attacker can specify any file path they like, whether it is within the website's root (WEBROOT) or not.
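The effect of the two dots can be seen in a small sketch. It is written in Python rather than PHP purely for illustration, and the WEBROOT value and the naive_path function are hypothetical names invented for this example. It uses forward slashes, as on a Unix-like server, and only computes the path that a naive script would end up opening. It does not read any file.

```python
import posixpath

# Hypothetical location of the website's root folder (an assumption for this sketch).
WEBROOT = "/home/username/public_html"


def naive_path(user_supplied_name):
    """Build a path the way a poorly written script might:
    append the user's input to WEBROOT without checking the result."""
    joined = posixpath.join(WEBROOT, user_supplied_name)
    # normpath collapses the ".." segments, showing where the path really points.
    return posixpath.normpath(joined)


# A legitimate request stays inside the web root.
print(naive_path("readme.txt"))

# A traversal attempt climbs out of public_html and into the private folder.
print(naive_path("../private/passwords/passwordlist.txt"))
```

The second result no longer starts with the web root, which is exactly the situation the script has to detect and refuse.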

Therefore, where user input is used as the basis for the names of files that are to be read (or, more importantly, output) by a dynamic web page, the script must include a validation routine. That routine must ensure that the value entered by the user is legitimate and that the resulting path stays inside the intended folder, so that the directory structure cannot be traversed.
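One possible shape for such a validation routine is sketched below, again in Python for illustration only. The function name resolve_inside_root and the WEBROOT value are hypothetical, and this is a minimal sketch under those assumptions rather than a complete defence. The idea is to resolve the requested name to an absolute path first and then check that the result is still located under the root folder.

```python
import os


def resolve_inside_root(root, user_supplied_name):
    """Return the absolute path for user_supplied_name if it lies inside root.
    Raise ValueError if the resolved path would escape root."""
    root_real = os.path.realpath(root)
    # Treat backslashes as separators too, so Windows-style input such as
    # '..\private' is handled the same way on every platform (a design choice
    # of this sketch; it also alters names that contain a literal backslash).
    name = user_supplied_name.replace("\\", "/")
    # realpath resolves ".." segments and symbolic links before the check.
    candidate = os.path.realpath(os.path.join(root_real, name))
    # commonpath compares whole path components, so a sibling folder such as
    # "public_html_old" is not mistaken for something inside "public_html".
    if os.path.commonpath([root_real, candidate]) != root_real:
        raise ValueError("requested path is outside the web root")
    return candidate


if __name__ == "__main__":
    WEBROOT = "/home/username/public_html"  # hypothetical web root
    requests = [
        "downloads/manual.txt",
        "..\\private\\passwords\\passwordlist.txt",
        "/etc/passwd",
    ]
    for requested in requests:
        try:
            print("allowed: ", resolve_inside_root(WEBROOT, requested))
        except ValueError as error:
            print("rejected:", requested, "-", error)
```

Checking the resolved path, rather than searching the raw input for "..", means that encoded or unusual spellings of the same traversal are caught by one test. In PHP, the built-in realpath() function can serve a similar purpose as the resolving step.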