Difference between revisions of "Apache"

From CSE330 Wiki
Revision as of 00:56, 10 August 2012

This page describes how to set up a web server on a Linux machine. If you are unfamiliar with using Linux from the command line, you should read the Linux guide first.

SSH

When connecting to your machine over the internet or an intranet, you will most likely be using SSH (secure shell). SSH access requires that the sshd daemon is running on your machine.

By default, SSH is preinstalled on your EC2 instance. If you are not using an EC2 instance, simply install it from yum or aptitude.

SSH Keys

Normally, you can SSH into your machine in one of two ways: you can use traditional username/password authentication, or you can use a public/private key pair. A public/private key pair is generally considered to be more secure, but it requires that you always have access to your private key file when you want to log into your remote machine. By default, EC2 instances allow only public/private key pair authentication. You can enable password-based authentication by changing the PasswordAuthentication option in /etc/ssh/sshd_config to yes:

PasswordAuthentication yes

SSH Server Configuration

The configuration files for SSH are in /etc/ssh. You can modify them to change SSH permissions, among other things. For example, it is always a good idea to disable root access over SSH. To do so, edit /etc/ssh/sshd_config and set

PermitRootLogin no

For more detail on editing files on the command line, see the Linux guide.

Note that you must restart the sshd daemon for this to take effect. Should that fail, rebooting your server should do the trick.
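The sshd_config edits described above can also be scripted. What follows is a minimal sketch rather than an official tool: set_sshd_option is a hypothetical helper name, and the demo runs against a scratch file created with mktemp instead of the real /etc/ssh/sshd_config.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSSH): set "Key value" in an
# sshd_config-style file. Existing lines for the key, commented out or
# not, are rewritten; if there are none, a new line is appended.
set_sshd_option() {
    file=$1; key=$2; value=$3
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        sed -E "s/^[#[:space:]]*${key}[[:space:]].*/${key} ${value}/" "$file" > "$file.new" &&
            mv "$file.new" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Demo on a scratch file so nothing on the system is touched.
cfg=$(mktemp)
printf '%s\n' '#PermitRootLogin yes' 'PasswordAuthentication no' > "$cfg"
set_sshd_option "$cfg" PermitRootLogin no
set_sshd_option "$cfg" PasswordAuthentication yes
cat "$cfg"
```

On a real server you would run the helper as root against /etc/ssh/sshd_config, check the result with sshd -t, and then restart the daemon.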

Warning: Disabling root access over SSH for your EC2 instance should only be done after setting up an additional user account and adding that account to the sudoers list.

SSH Client Configuration

Unix-Based Systems (including Mac OS X)

Mac OS X is based on BSD, a flavor of Unix. As such, Mac OS X comes pre-built with all the tools you need to use SSH! Simply fire up Terminal and enter the command

ssh username@hostname

To use SSH with a key pair, use the command

ssh -i /path/to/key.pem username@hostname
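One practical detail when using a key pair: ssh refuses to use a private key file that other users can read. The sketch below tightens the permissions first. The key here is an empty stand-in file created with mktemp, and the final ssh line is left commented out because username@hostname is a placeholder.

```shell
#!/bin/sh
key=$(mktemp)            # stand-in for your real /path/to/key.pem
chmod 400 "$key"         # owner read-only; no access for group or others
ls -l "$key"             # the permission column should begin with -r--------

# With a real key and host, you would then connect with:
# ssh -i /path/to/key.pem username@hostname
```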

Non-Unix-Based Systems (including Microsoft Windows)

Unfortunately, using SSH with Windows is more complicated. It is necessary to install an SSH client to support the connections. A widely used SSH client for Windows is PuTTY. You can download PuTTY from http://www.chiark.greenend.org.uk/~sgtatham/putty/

PuTTY is fairly simple and straightforward, with one caveat: Amazon's .pem key files are not compatible with PuTTY's key format. To convert a .pem key to a PuTTY .ppk private key file, use the puttygen.exe utility, available from the same page as PuTTY. Select Import key under the Conversions menu, load the Amazon .pem key file, and press the Save private key button. Be sure to save the file in the directory where PuTTY looks for its keys.

Copy and paste works similarly to the X Window System in Unix. You use the left mouse button to select text in the PuTTY window. The act of selection automatically copies the text to the clipboard: there is no need to press Ctrl-Ins or Ctrl-C or anything else. In fact, pressing Ctrl-C will send a Ctrl-C character to the other end of your connection (just like it does the rest of the time), which may have unpleasant effects. The only thing you need to do, to copy text to the clipboard, is to select it.

To paste the clipboard contents into a PuTTY window, by default you click the right mouse button. If you have a three-button mouse and are used to X applications, you can configure pasting to be done by the middle button instead, but this is not the default because most Windows users don't have a middle button at all.

Also, here is a good PuTTY tutorial that you might find useful to get started: http://kb.mediatemple.net/questions/1595/Using+SSH+in+Putty+%28Windows%29

SSHFS

SSHFS is a filesystem client which allows secure mounting of remote file systems. While there are other ways to mount remote file systems, SSHFS has the advantage of being able to mount a file system located on any host that has an SSH daemon running without any host side installation or configuration. This means that you can easily access and edit your files using all of your local applications including IDEs.

As you may have inferred from the name, the underlying implementation uses the SSH File Transfer Protocol in combination with FUSE, a package now included in the kernel that allows unprivileged users to easily create their own file systems in userspace (see the Wikipedia entry on FUSE for more information).

To mount a share using password-based authentication, the command is

sshfs user@domain:/path/to/remote/directory /path/to/local/mountpoint

For example, to mount the directory /home/joe/myfiles (that is, myfiles in the user joe's home directory) from a machine at www.schmoesfiles.org using SSHFS, you would enter the command

sshfs joe@www.schmoesfiles.org:myfiles /path/to/local/mountpoint

Note that if you are using public key authentication, the command to mount the remote share is slightly different:

sshfs -o IdentityFile=/path/to/private/key user@domain:/path/to/remote/directory /path/to/local/mountpoint

To unmount the filesystem you can use the following command

fusermount -u /path/to/local/mountpoint

SFTP

Any machine running an SSH server typically also supports SFTP, the SSH File Transfer Protocol (often called Secure File Transfer Protocol). (Compare to FTP, or File Transfer Protocol.)

You can use SFTP from the command line, or you can use any GUI file transfer client. Most FTP clients also support SFTP. One popular FTP client is FileZilla.

SUDO Users

For security reasons, you should never SSH into your server as the root user. Instead, you should use a normal user to whom you give sudo privileges. (For more detail on sudo, see the Linux guide.)

When you create an Amazon EC2 instance, the user you set up initially is already given SUDO privileges. If you want to give more users SUDO privileges, use the command visudo, which opens up the SUDO configuration file in the system's default text editor. (Never edit the file /etc/sudoers directly!) SUDO users are specified using lines similar to

alice   ALL=(ALL) ALL

In this case, the user alice can run any command with SUDO privileges on the computer. For more detail on SUDO configuration, see http://www.linuxhelp.net/guides/sudo/
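Before installing a sudoers rule, it can be checked for syntax errors. The sketch below is an illustration under assumptions: it writes the alice rule to a scratch file (not /etc/sudoers) and, if visudo is installed, asks it to check that file using the -c (check only) and -f (alternate file) options.

```shell
#!/bin/sh
rule=$(mktemp)           # scratch file; the real rules live in /etc/sudoers
printf '%s\n' 'alice   ALL=(ALL) ALL' > "$rule"

if command -v visudo >/dev/null 2>&1; then
    # Check the scratch file only; nothing is installed.
    visudo -c -f "$rule" || echo "visudo reported a problem with $rule" >&2
else
    echo "visudo not found; skipping the syntax check" >&2
fi
cat "$rule"
```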

Apache

Apache is the industry standard web server for Linux distributions. It is highly configurable and has a wide range of modules ready for different needs.

Installing Apache

In yum, Apache is distributed under the package name httpd (for hypertext transfer protocol daemon). In aptitude, it is distributed under the name apache2. Use the package manager associated with your distribution to install Apache. (For more information on how to use yum and aptitude, see the Linux guide.)

When Apache is installed through aptitude, the HTTP daemon will automatically be added as a startup item. When it is installed through yum, you may need to enable it at startup yourself (e.g. chkconfig httpd on).

In RHEL, the main Apache configuration file is /etc/httpd/conf/httpd.conf. Debian takes a more modular approach, having separate directories for each type of configuration, all located in /etc/apache2/. For more detail on Debian's approach, see http://www.control-escape.com/web/configuring-apache2-debian.html
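Which install command applies depends on the package manager present. Here is a small sketch, assuming apt-get as the command-line front end on Debian-based systems (aptitude accepts the same package name). apache_install_command is a hypothetical helper that only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command that matches the
# package manager found on this machine. Nothing is installed.
apache_install_command() {
    if command -v yum >/dev/null 2>&1; then
        echo "sudo yum install httpd"
    elif command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install apache2"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

apache_install_command || true
```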

Apache Directives

You define your settings for Apache using directives. Some of the directives you will likely encounter include:

  • DocumentRoot: The path to the directory where the top-level web files are stored.
  • IfModule: The enclosed block of directives is applied only if the specified module is loaded.
  • User: The user that Apache runs as.
  • Group: The group that Apache runs as.
  • AccessFileName: The name of the per-directory access file (.htaccess by default), which can specify user names/passwords and other restrictions on files and directories.
  • ErrorLog: Where errors are written.
  • Include: Include other configuration files.
  • LogFormat: Defines the format of log messages.
  • ErrorDocument: The page to display for a given HTTP error (500, 404, 403, etc.).
  • Alias: Map a URL path to some other location on your filesystem. Requires that the alias module (mod_alias) be loaded.

.htaccess Files

You can also specify some Apache configurations without delving into the master configuration file. To do this, put a file named .htaccess in any directory that Apache is serving. All directives in it will be interpreted as if they were in a Directory directive in the master configuration file.

Note: For .htaccess to be read, the directory containing it must not be covered by an AllowOverride None directive in the master configuration file. (In Debian, AllowOverride None is set by default!)
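As an illustration, the sketch below writes a small .htaccess file. The directory is a scratch stand-in for a real served directory, and the two directives are only examples. They take effect only if AllowOverride in the master configuration permits them (Options and FileInfo in this case).

```shell
#!/bin/sh
docroot=$(mktemp -d)     # stand-in for a directory Apache is serving

cat > "$docroot/.htaccess" <<'EOF'
# Do not generate directory listings for this directory
Options -Indexes
# Serve a custom page for "not found" errors (the path is hypothetical)
ErrorDocument 404 /errors/notfound.html
EOF

cat "$docroot/.htaccess"
```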

Directory Directive

Use the Directory directive to assign other directives to a specific directory. For example:

<Directory /var/www/>
	Options Indexes FollowSymLinks 
	AllowOverride None
	Order allow,deny
	allow from all
	RedirectMatch ^/$ /apache2-default/
</Directory>

This sets options for the /var/www directory.

  1. The Options directive says that:
    1. If no index page is present in a directory, display a directory index page instead
    2. Apache will follow symbolic links in the directory
  2. AllowOverride None says that .htaccess files cannot alter the Apache options in this directory and all sub-directories
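  3. Order allow,deny and Allow from all specify that anybody is allowed to access this directory via HTTP.
  4. RedirectMatch ^/$ /apache2-default/ redirects requests for the site root to /apache2-default/.

Note that in this example, /var/www is the root directory of the web server.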

Apache Logs

Apache records all access attempts and errors associated with your server in log files. It is useful to check your access logs to ensure that things are running smoothly and that, for example, you aren't experiencing any denial-of-service-like attacks on your server.

In RHEL, the Apache logs are located in /var/log/httpd. In Debian, the Apache logs are located in /var/log/apache2.
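One quick check for unusual traffic is to count requests per client address. The sketch below assumes the common and combined log formats, where the first field of each line is the client address. top_clients is a hypothetical helper name, and the sample log lines are made up for the demo.

```shell
#!/bin/sh
# Hypothetical helper: list the busiest client addresses in an access log.
# $1 = log file, $2 = how many addresses to show (default 10)
top_clients() {
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$1" |
        sort -rn | head -n "${2:-10}"
}

# Made-up sample data in place of a real access log.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [10/Aug/2012:00:56:01 -0500] "GET / HTTP/1.1" 200 512 "-" "curl/7.26.0"
203.0.113.5 - - [10/Aug/2012:00:56:02 -0500] "GET /a HTTP/1.1" 404 209 "-" "curl/7.26.0"
198.51.100.7 - - [10/Aug/2012:00:56:03 -0500] "GET / HTTP/1.1" 200 512 "-" "curl/7.26.0"
EOF

top_clients "$log" 5
```

On a real server you would pass the access log from the directory named above, which usually requires root privileges to read.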

Virtual Hosts

Virtual Hosts are used to serve multiple web sites from a single Apache server on the same machine. Virtual hosts can answer connections on different ports and/or different hostnames, serving a completely different web site for each. For example:

<VirtualHost cse330.dyndns.org>
	ServerAdmin webmaster@localhost
	ServerName cse330.dyndns.org
	DocumentRoot /home/www/cse330/
	# Log paths below are for RHEL; in Debian, use /var/log/apache2/ instead
	ErrorLog /var/log/httpd/error_log
	LogLevel warn
	CustomLog /var/log/httpd/access_log combined
	ServerSignature On
</VirtualHost>

With this configuration, any request that uses the host name cse330.dyndns.org is served with /home/www/cse330 as the document root. Make sure that the DocumentRoot directory exists and is readable by the httpd process. In RHEL, Apache runs as the apache user. In Debian, it runs as the www-data user.

It is good practice to put each site's configuration file in /etc/httpd/sites-available in RHEL or /etc/apache2/sites-available in Debian. To activate a site, create a symlink to its configuration file in a sibling directory called sites-enabled. In Debian, these directories are already set up for you, and Debian's Apache even provides the a2ensite and a2dissite commands to create or remove the symlinks! In RHEL, you have to create the directories, include sites-enabled from the main configuration file, and manage the symlinks by hand.
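Managing the symlinks by hand can be sketched as follows. This is an illustration under assumptions: conf_root is a scratch directory standing in for /etc/httpd, enable_site and disable_site are hypothetical helper names, and cse330.conf is an empty placeholder file. On a real RHEL machine, the main configuration file would also need a line such as Include sites-enabled/*.conf for the enabled sites to be read.

```shell
#!/bin/sh
conf_root=$(mktemp -d)   # stand-in for /etc/httpd
mkdir -p "$conf_root/sites-available" "$conf_root/sites-enabled"
: > "$conf_root/sites-available/cse330.conf"   # empty placeholder config

# Hypothetical helpers: $1 = configuration root, $2 = site file name
enable_site() {
    # Relative link, so it still resolves if the whole tree is moved.
    ln -sf "../sites-available/$2" "$1/sites-enabled/$2"
}
disable_site() {
    rm -f "$1/sites-enabled/$2"
}

enable_site "$conf_root" cse330.conf
ls -l "$conf_root/sites-enabled"
```

Removing the link with disable_site leaves the file in sites-available untouched, so a site can be switched off and on again without losing its configuration.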

Restarting Apache

Whenever you make changes to the Apache configuration files, you will need to restart Apache for the changes to take effect. There are several different ways to restart Apache; they all do the same thing, so choose your favorite:

# RHEL service name shown; in Debian the service is named apache2
$ /etc/init.d/httpd restart
$ /sbin/service httpd restart
$ service httpd restart    # if /sbin is in your PATH
$ /usr/sbin/apachectl restart
$ apachectl restart    # if /usr/sbin is in your PATH

Note: restart performs a hard restart of Apache. To perform a soft restart, use graceful instead (e.g. apachectl graceful). To only reload the configuration files but not restart the server, use reload (e.g. /etc/init.d/httpd reload).
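A syntax error in the configuration can prevent Apache from coming back up after a restart, so it is worth testing the configuration first. The sketch below is one possible wrapper, not a standard tool: safe_reload is a hypothetical name, and the APACHECTL variable is an assumption that lets you point the function at /usr/sbin/apachectl (or, in Debian, apache2ctl) when it is not in your PATH. It uses the configtest and graceful subcommands of apachectl.

```shell
#!/bin/sh
# Path or name of the control script; override it if apachectl is elsewhere.
APACHECTL=${APACHECTL:-apachectl}

# Hypothetical helper: do a graceful restart only if the configuration parses.
safe_reload() {
    if "$APACHECTL" configtest; then
        "$APACHECTL" graceful
    else
        echo "configuration has errors; Apache was not restarted" >&2
        return 1
    fi
}

# Usage (as root):
#   safe_reload
```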