Introduction to Unix

A hands-on workshop covering the basics of the Unix/Linux command line interface

Introduction

If you are attending a workshop called "Introduction to Unix" then you can skip ahead to Topic 1, as this introduction will be covered in the introductory presentation (slides).

Why use unix

Powerful: Unix computers are typically very powerful in comparison to your desktop/laptop computers. Additionally, they don't typically use a Graphical User Interface, which frees up resources for actual computing.
Big data: Unix programs are designed to handle large data sets
Flexible: Small programs that can be arranged in many ways to solve your problems
Automation: Scripting allows you to do many tasks in one step and repeat steps many times
Pipelines: Unix programs are designed to be 'chained' together to form long multi-step pipelines
Science software: lots of scientific software is designed to run in a Unix environment
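
As a small taste of the "Pipelines" point, the sketch below chains four standard programs to find the most common line in some text. The animal names are made-up sample data, and the variable names are only for illustration. The '$' prompt is left out so the lines can be pasted as they are.

```shell
# Six animal sightings, one per line (made-up sample data).
sightings=$(printf 'cat\ndog\ncat\nbird\ncat\ndog\n')

# sort groups identical lines together, uniq -c counts each group,
# sort -rn orders the counts from largest to smallest,
# and head -n 1 keeps only the top line.
most_common=$(printf '%s\n' "$sightings" | sort | uniq -c | sort -rn | head -n 1)

# Show the winning line: a count followed by the animal name.
echo "$most_common"
```

Each program does one small job; the pipe character (|) passes the output of one program to the next. Pipes are covered properly in Topic 5.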

User interface

The Unix interface is text-based and command-driven; it is often known as a Command Line Interface (CLI). This means that you control it by issuing (i.e. typing) commands at the command prompt. Consequently, the mouse is generally not used to control programs in the Unix environment.

Unix Screen

Command prompt

The command prompt is the first thing you see when you connect to a Unix computer. Its purpose is to receive the next command from you, the user.

Command prompt

There are several parts that make up the command prompt:

Time: the time (when the last command finished)
Username: the username that you are logged in as
Hostname: the name of the computer that you are connected to
Current working directory: the current position within the file system in which you are working. More on this later
Prompt: this is simply a sign to the user that the computer is ready to accept the next command

From this point forward in this document, the command prompt will be simply represented as a '$' rather than the full command prompt as shown above. When copying and pasting commands you should NOT copy the '$' sign.

Command line

Below is an example command with various flags and options.

Command line

There are a number of parts which may be included in a command; each is separated by one or more 'white-space' characters (i.e. space, tab):

Command: this is the name of the program (command) that you want to run
Flag: these turn on (or off) specific features in the program. They consist of a dash (-) followed by a single character.
Long flag: same as flag except they are generally two dashes (--) followed by a word (or two)
Option: set the value of a configurable option. They are a flag (or long flag) followed by a value
Anonymous options: these are one or more values given without a flag; the program tells them apart by their position, so they must be specified in the required order
Quoted value: if you need to specify a space (or tab) in an option then you will need to use double (") or single (') quotes on each side of the value.
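
As a rough illustration of these parts, the sketch below runs a few standard commands on a small sample file. The file names (fruit.txt, copy.txt, "my fruit.txt") and their contents are made up for this example, and the long flag --reverse assumes a version of sort that accepts long flags (GNU sort does). The '$' prompt is left out so the lines can be pasted as they are.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"
printf 'banana\napple\ncherry\n' > fruit.txt

# Command only: no flags or options.
date

# Flag: -r tells sort to reverse the order.
sort -r fruit.txt

# Long flag: --reverse means the same as -r (assumes GNU sort).
sort --reverse fruit.txt

# Option: -n takes a value (2), so head prints only the first 2 lines.
head -n 2 fruit.txt

# Anonymous options: cp expects the source first, then the destination.
cp fruit.txt copy.txt

# Quoted value: the quotes keep "my fruit.txt" together as one file name.
cp fruit.txt "my fruit.txt"

# List the directory to see the three files.
ls
```

Without the quotes in the last cp command, the shell would split the value at the space and cp would receive three anonymous options instead of two.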

File system

The file system of a Unix computer can be thought of as an upside-down tree. The topmost directory has a special name, 'root'; it contains all files and directories that are on the computer system. It is represented by a single slash (/). The figure below shows an example file system with directories (black outline boxes) and files (grey dashed boxes).

File system

At the top level we have one file (settings) and one directory (home). Inside the home directory we have two directories (user1 and user2) and so on.

Absolute file names

Absolute file names are a single unique name for each file and directory within the computer. They start with the slash (/) character and list every parent directory, in order, down to the file or directory itself.

File system

Absolute file name: /settings

File system

Absolute file name: /home/user1/file01.txt

File system

Absolute file name: /home/user2

Note: the final slash is not needed (but generally doesn't hurt if it is present).

Current working directory

The current working directory is the location within the file system that you are currently working in. When you first log in to a Unix computer, the current working directory is set to your home directory: a place that is unique to you and that generally nobody else has access to.

Current working directory

Remember from earlier that the current working directory is shown in the command prompt.
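
If you would like to check the current working directory yourself, the minimal sketch below uses two standard commands, pwd and cd, which are covered in more detail in later topics. It assumes your account has a home directory set, which is normally the case. The '$' prompt is left out so the lines can be pasted as they are.

```shell
# Print the absolute file name of the current working directory.
pwd

# Change the current working directory to the root of the file system.
cd /
pwd

# cd with no arguments returns to your home directory.
cd
pwd
```

The directory shown in your command prompt should change each time cd is run.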

Relative file names

Relative file names are a shorter way of writing file names. Unlike absolute file names, relative file names do NOT begin with a slash; they are interpreted starting from the current working directory.

File system

If your current working directory is set to /home you can leave this part off the beginning of the file name.

Relative file name: user1/muscle.fq

(Note: the absence of the leading slash)

Special file names:

There are a few further short cuts for typing relative file names:

~ (Tilde): is a short cut to your home directory
. (dot): is a short cut for the current directory
.. (2x dot): means the parent (or one directory up) from current directory
... (3x dot): does not mean anything (a gotcha for new users). If you want to go 2 directories up then chain two double dots, e.g. ../..

Note: the special file names above can be used within absolute and relative file names, and can be used multiple times.

File system

Now, if the current working directory is changed to /home/user2 the relative path to muscle.fq is different.

Relative file name: ../user1/muscle.fq
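
The two relative file names above can be tried out with the sketch below. It rebuilds part of the example file system inside a temporary directory, which stands in for root (/) here; that stand-in, the variable name root and the file contents are assumptions made for this example. The '$' prompt is left out so the lines can be pasted as they are.

```shell
# Build a small copy of the example tree; $root stands in for /.
root=$(mktemp -d)
mkdir -p "$root/home/user1" "$root/home/user2"
echo "sample reads" > "$root/home/user1/muscle.fq"

# From home, the relative file name starts at user1.
cd "$root/home"
cat user1/muscle.fq

# From home/user2, go up one directory (..) and then down into user1.
cd "$root/home/user2"
cat ../user1/muscle.fq

# A single dot means "here", so this names the same file again.
cat ./../user1/muscle.fq

# Chain two double dots to move two directories up.
cd ../..
pwd
```

All three cat commands print the same file; only the starting point, and therefore the relative file name, differs.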

How to use this workshop

The workshop is broken up into a number of Topics each focusing on a particular aspect of Unix. You should take a short break between each to refresh and relax before tackling the next.

Topics may start with some background followed by a number of exercises. Each exercise begins with a question, then sometimes a hint (or two) and finishes with the suggested answer.

Question

An example question looks like:

What is the Answer to Life?

Hint

Depending on how much of a challenge you like, you may choose to use hints. Even if you work out the answer without hints, it's a good idea to read the hints afterwards because they contain extra information that is good to know.

Note: hints may be staged; that is, there may be a More section within a hint that reveals further hints.

Hint <- click here to reveal hint

What is the answer to everything?

As featured in "The Hitchhiker's Guide to the Galaxy"

More <- and here to show more

It is probably a two digit number

Answer

Once you have worked out the answer to the question expand the Answer section to check if you got it correct.

Answer <- click here to reveal answer

Answer: 42

Ref: Number 42 (Wikipedia)

Usage Style

This workshop attempts to cater for two usage styles:

Problem solver: for those who like a challenge and learn best by trying to solve the problems by themselves (hints optional):
- Attempt to answer the question by yourself.
- Use hints when you get stuck.
- Once solved, reveal the answer and read through our suggested solution.
- It's a good idea to read the hints and answer description as they often contain extra useful information.
By example: for those who learn by following examples (expand all sections):
- Expand the Answer section at the start of each question and follow along with the commands that are shown and check you get the same (or similar) answers.
- It's a good idea to read the hints and answer description as they often contain extra useful information.

Topic 1: Remote log in

In this topic we will learn how to connect to a Unix computer via a method called SSH and run a few basic commands.

Connecting to a Unix computer

To begin this workshop you will need to connect to an HPC (high performance computer). Today we will use the LIMS-HPC. The computer called lims-hpc-m.latrobe.edu.au (m is for master, which is another name for head node) is the one that coordinates all the HPC's tasks.

Server details:

host: lims-hpc-m.latrobe.edu.au
port: 6022
username: trainingXX
password: (provided at workshop)

Mac OS X / Linux

Both Mac OS X and Linux come with a version of ssh (called OpenSSH) that can be used from the command line. To use OpenSSH you must first start a terminal program on your computer. On OS X the standard terminal is called Terminal, and it is installed by default. On Linux there are many popular terminal programs, including xterm, gnome-terminal and konsole (if you aren't sure, xterm is a good default). When you've started the terminal you should see a command prompt. To log in to LIMS-HPC, type this command at the prompt and press return (replacing trainingXX with your LIMS-HPC username):

$ ssh -p 6022 trainingXX@lims-hpc-m.latrobe.edu.au

The same procedure works for any other machine where you have an account. The -p PORT option is only needed when the remote computer listens on a port other than the default (22), as LIMS-HPC does; substitute PORT with the port number.
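
To see how the server details listed earlier map onto the ssh command, the sketch below assembles the command from its parts and prints it instead of running it, so it is safe to try anywhere. The variable names are only for illustration, and trainingXX remains a placeholder for your own username.

```shell
# Server details from this workshop; trainingXX is a placeholder username.
host="lims-hpc-m.latrobe.edu.au"
port=6022
user="trainingXX"

# Assemble the command: the -p option carries the port, then username@host.
ssh_command="ssh -p $port $user@$host"

# Print the command rather than connecting.
echo "$ssh_command"
```

For a machine that uses the default port, the "-p $port" part can simply be left out.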

You may be presented with a message along the lines of:

The authenticity of host 'lims-hpc-m.latrobe.edu.au (131.172.24.10)' can't be established.
...
Are you sure you want to continue connecting (yes/no)?

Although you should never ignore a warning, this particular one is nothing to be concerned about; type yes and then press enter. If all goes well you will be asked to enter your password. Assuming you type the correct username and password the system should then display a welcome message, and then present you with a Unix prompt. If you get this far then you are ready to start entering Unix commands and thus begin using the remote computer.

Windows

On Microsoft Windows (Vista, 7, 8, 10) we recommend that you use the PuTTY ssh client. PuTTY (putty.exe) can be downloaded from this web page:

http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

Documentation for using PuTTY is here:

http://www.chiark.greenend.org.uk/~sgtatham/putty/docs.html

When you start PuTTY you should see a window which looks something like this:

Putty Connection Dialog

To connect to LIMS-HPC you should enter lims-hpc-m.latrobe.edu.au into the box entitled "Host Name (or IP address)" and 6022 in the port, then click on the Open button. All of the settings should remain the same as they were when PuTTY started (which should be the same as they are in the picture above).

In some circumstances you will be presented with a window entitled PuTTY Security Alert. It will say something along the lines of "The server's host key is not cached in the registry". This is nothing to worry about, and you should agree to continue (by clicking on Yes). You usually see this message the first time you try to connect to a particular remote computer.

If all goes well, a terminal window will open, showing a prompt with the text "login as:". An example terminal window is shown below. You should type your LIMS-HPC username and press enter. After entering your username you will be prompted for your password. Assuming you type the correct username and password the system should then display a welcome message, and then present you with a Unix prompt. If you get this far then you are ready to start entering Unix commands and thus begin using the remote computer.

Putty login screen

Gotcha: for security reasons ssh will NOT display any characters when you enter your password. This can be confusing because it appears as if your typing is not recognised by the computer. Don’t be alarmed; type your password in and press return at the end.

LIMS-HPC is a high performance computer for La Trobe users. Logging in connects your local computer (e.g. laptop) to LIMS-HPC, and allows you to type commands at the Unix prompt which are run on the HPC and have the results displayed on your local screen.

You will be allocated a training account on LIMS-HPC for the duration of the workshop. Your username and password will be supplied at the start of the workshop.

Log out of LIMS-HPC, and log back in again (to make sure you can repeat the process).

All the remaining parts assume that you are logged into LIMS-HPC over ssh.

Exercises

1.1) When you've logged into the Unix server, run the following commands and see what they do:

who
whoami
date
cal
hostname
/home/group/common/training/Intro_to_Unix/hi
