.. _qs-command-line: UNIX/Linux Command Line Interface ================================= Most of the time, a user interacts with an HPC cluster using a Command Line Interface (CLI), also known as a terminal. On a terminal, all your interaction is controlled by a program called *shell*. You can tell that you are in *shell* when you see what is called a *prompt* message, a set of characters that indicate that the *shell* is ready to receive instructions to operate. To issue a command to the shell you type a command, occasionally followed by a few arguments. When you finish typing the full command you type the ``ENTER`` key, the command is executed and if the command is programmed to produce screen output it will appear on your terminal as it executes. Once the command is terminated you get a new *prompt* indicating that the shell is ready for new commands. The CLI interface is a powerful way to interact with a remote computer for several reasons: 1. It takes a few resources on the remote machine allowing the machine to serve tens, hundreds, and in some cases even thousands of concurrent users. 2. The *shell* is far more than a command reader and executor, it is actually a complete programming language. You can actually create complex sets of instructions by doing what is called *Shell Programming*; the ability to do shell programming is what we use in this document to differentiate a basic user from an advanced user.. 3. Despite the learning curve being steeper, the CLI gives you far more control on the machine, so as you become more comfortable using it you will be able to do things that are too cumbersome on a Graphical User Interface (GUI). Using a CLI is probably the biggest obstacle that beginners have to overcome before start taking advantage of an HPC cluster. Here we will offer a short straight-to-point introduction to the bare minimum of managing files and folders on the *shell*. Files and Folders ----------------- Operating Systems from UNIX legacy, Linux and MacOS being the most prominent examples, use the idea of *folders* and *files* to organize the data on their storage device. Differently from Windows, on UNIX there is no such idea of Drives ``C:`` or ``D:`` or letters for CD Drives or USB Keys. In UNIX every piece of data on any storage device is logically located in some place of a *filesystem tree*. Think about the *filesystem tree* where the lowest level is called *the root folder* and is indicated by ``/``. From ``/`` you will see branches like ``/bin``, ``/lib`` and many others, those are folders. Inside each folder, there are potentially more folders and files. The tree structure as a metaphor for storing data is very powerful and in the case of UNIX systems, that data structure is deeply exploited, even hardware devices such as sound cards, CD drives, and hard drives receive a file-like entry on the rooted tree. When you insert a USB drive, Modern Linux distributions and MacOS will *mount* it automatically, meaning that it will receive a location on the tree, in the case of Linux, the mounting point is usually somewhere inside */media/*, in MacOS the mounting point is */Volumes*. In the particular case of your interaction with the HPC cluster, there are two important folders that you should be aware of. They are so important that they receive special variables to tell you what they are. The ``$HOME`` and ``$SCRATCH``, each user gets assigned a unique location for both folders. On them, the user has the right to create, modify and delete files and folders as he/she wants. In Linux, the names of files and folders are basically arbitrary. Most Linux filesystems are case sensitive, meaning that changing the file in lowercase and uppercase refers to different files. Using spaces for files and folders is allowed but discouraged as it forces you to enclose the filename in quotes `` " and "``to declare them or escape the spaces like, ``This\ file\ has\ spaces``. The *echo* and *cat* commands ----------------------------- Your first command will show you what those locations are. Execute:: $> echo $HOME /users/ $> echo $SCRATCH /scratch/ The first command to learn is ``echo``. The command above uses ``echo`` to show the contents of two shell variables ``$HOME`` and ``$SCRATCH``. Shell variables are ways to store information in such a way that the shell can use it when needed. Each user on the cluster receives appropriate values for those variables. Let us explore a bit more the usage of ``echo``. Enter this command line and execute ``ENTER``:: $> echo "I am learning UNIX Commands" I am learning UNIX Commands The shell is actually able to do basic arithmetical operations, and execute this command:: $> echo $((23+45*2)) 113 Notice that as customary in mathematics products take precedence over addition. That is called the PEMDAS order of operations, ie "Parentheses, Exponents, Multiplication and Division, and Addition and Subtraction". Check your understanding of the PEMDAS rule with this command:: $> echo $(((1+2**3*(4+5)-7)/2+9)) 42 Notice that the exponential operation is expressed with the ``**`` operator. The usage of ``echo`` is important, otherwise, if you execute the command without ``echo`` the shell will do the operation and will try to execute a command called ``42`` that does not exist on the system. Try by yourself:: $> $ $(((1+2**3*(4+5)-7)/2+9)) -bash: 42: command not found As you have seen before, when you execute a command on the terminal in most cases you see the output printed on the screen. The next thing to learn is how to redirect the output of a command into a file. This will be very important later to submit jobs and control where and how the output is produced. Execute the following command:: $> echo "I am learning UNIX Commands" > report.log With the character ``>`` redirects the output from ``echo`` into a file called *report.log*. No output is printed on the screen. If the file does not exist it will be created. If the file exists previously, the file is erased and only the new contents are stored. To check that the file actually contains the line produced by echo, execute:: $> cat report.log I am learning UNIX Commands The cat (concatenate) command displays the contents of one or several files. In the case of multiple files the files are printed in the order they are described in the command line, concatenating the output as the name of the command implies. You can even use a nice trick to write a small text on a file. Execute the following command, followed by the text that you want to write, sat the end execute ``Ctrl-D`` (``^D``), the *Control Key* followed by the ``D`` key. I am annotating below the location where ``^D`` should be executed:: $> cat > report.log I am learning UNIX Commands^D $> cat report.log I am learning UNIX Commands In fact, there are hundreds of commands, most of them with a variety of options that change the behavior of the original command. You can feel bewildered at first by a large number of existing commands, but in fact most of the time you will be using very few of them. Learning those will speed up your learning curve. Another very simple command that is very useful in HPC is ``date``. Without any arguments, it prints the current date to the screen. Example:: $> date Mon Nov 5 12:05:58 EST 2018 Folder commands --------------- As we mentioned before, UNIX organizes data in storage devices as a tree. The commands ``pwd``, ``cd`` and ``mkdir`` will allow you to know where you are, move your location on the tree and create new folders. Later we will see how to move folders from one location on the tree to another. The first command is ``pwd``. Just execute the command on the terminal:: $> $ pwd /users/ It is very important at all times to know where in the tree you are. Doing research usually involves dealing with a significant amount of data, exploring several parameters or physical conditions. Properly organizing all the data in meaningful folders is very important to research endeavors. When you log into a cluster, by default you are located on your ``$HOME`` folder. That is why most likely the command ``pwd`` will return that location in a first instance. The next command is ``cd``. This command is used to *change directory*. The directory is another name for *folder*. The term *directory* is also widely used. At least in UNIX the terms *directory* and *folder* are interchangeable. Other desktop operating systems like Windows and MacOS have the concept of *smart folders* or *virtual folders*, where the *folder* that you see on screen has no correlation with a directory in the filesystem. In those cases the distinction is relevant. There is another important folder defined in our clusters, its called the scratch folder and each user has their own. The location of the folder is stored in the variable ``$SCRATCH``. Note that this is internal convention and is not observed in other HPC clusters. Use the next command to go to that folder:: $> cd $SCRATCH $> pwd /scratch/ Notice that the location is different now, if you are using this account for the first time you will not have files on this folder. It is time to learn another command to list the contents of a *folder*, execute:: $> ls $> Assuming that you are using your HPC account for the first time, you will not have anything on your ``$SCRATCH`` folder. This is a good opportunity to start creating one folder there and change your location inside, execute:: $> mkdir test_folder $> cd test_folder We have use two new commands here, ``mkdir``allows you to create folders in places where you are authorized to do so. For example your ``$HOME`` and ``$SCRATCH`` folders. Try this command:: $> mkdir /test_folder mkdir: cannot create directory `/test_folder': Permission denied There is an important difference between ``test_folder`` and ``/test_folder``. The former is a location in your current working directory (CWD), the later is a location starting on the root directory ``/``. A normal user has no rights to create folders on that directory so ``mkdir`` will fail and an error message will be shown on your screen. The name of the folder is ``test_folder``, notice the underscore between *test* and *folder*. In UNIX, there is no restriction having files or directories with spaces but using them becomes a nuisance on the command line. If you want to create the folder with spaces from the command line, here are the options:: $> mkdir "test folder with spaces" $> mkdir another\ test\ folder\ with\ spaces In any case, you have to type extra characters to prevent the command line application of considering those spaces as separators for several arguments in your command. Try executing the following:: $> mkdir another folder with spaces $> ls another folder with spaces folder spaces test_folder test folder with spaces with Maybe is not clear what is happening here. There is an option for ``ls`` that present the contents of a directory:: $>ls -l total 0 drwxr-xr-x 2 myname mygroup 512 Nov 2 15:44 another drwxr-xr-x 2 myname mygroup 512 Nov 2 15:45 another folder with spaces drwxr-xr-x 2 myname mygroup 512 Nov 2 15:44 folder drwxr-xr-x 2 myname mygroup 512 Nov 2 15:44 spaces drwxr-xr-x 2 myname mygroup 512 Nov 2 15:45 test_folder drwxr-xr-x 2 myname mygroup 512 Nov 2 15:45 test folder with spaces drwxr-xr-x 2 myname mygroup 512 Nov 2 15:44 with It should be clear, now what happens when the spaces are not contained in quotes ``"test folder with spaces"`` or escaped as ``another\ folder\ with\ spaces``. This is the perfect opportunity to learn how to delete empty folders. Execute:: $> rmdir another $> rmdir folder spaces with You can delete one or several folders, but all those folders must be empty. If those folders contain files or more folders, the command will fail and an error message will be displayed. After deleting those folders created by mistake, let's check the contents of the current directory. The command ``ls -1`` will list the contents of a file one per line, something very convenient for future scripting:: $> ls -1 another folder with spaces test_folder test folder with spaces Commands for copy and move -------------------------- The next two commands are ``cp`` and ``mv``. They are used to copy or move files or folders from one location to another. In the simplest case, those two commands take two arguments, the first argument is the source and the last one the destination. In the case of more than two arguments, the destination must be a directory. The effect will be to copy or move all the source items into the folder indicated as the destination. Before doing a few examples with ``cp`` and ``mv`` let's use a very handy command to create files. The command ``touch`` is used to update the access and modification times of a file or folder to the current time. In case there is not such a file, the command will create a new empty file. We will use that feature to create some empty files for the purpose of demonstrating how to use ``cp`` and ``mv``. Lets create a few files and directories:: $> mkdir even odd $> touch f01 f02 f03 f05 f07 f11 Now, lets copy some of those existing files to complete all the numbers up to ``f11``:: $> cp f03 f04 $> cp f05 f06 $> cp f07 f08 $> cp f07 f09 $> cp f07 f10 This is good opportunity to present the ``*`` *wildcard*, use it to replace an arbitrary sequence of characters. For instance, execute this command to list all the files created above:: $> ls f* f01 f02 f03 f04 f05 f06 f07 f08 f09 f10 f11 The *wildcard* is able to replace zero or more arbitrary characters, see for example:: $> ls f*1 f01 f11 There is another way of representing files or directories that follow a pattern, execute this command:: $> ls f0[3,5,7] f03 f05 f07 The files selected are those whose last character is on the list ``[3,5,7]``. Similarly, a range of characters can be represented. See:: $> ls f0[3-7] f03 f04 f05 f06 f07 We will use those special character to move files based on its parity. Execute:: $> mv f[0,1][1,3,5,7,9] odd $> mv f[0,1][0,2,4,6,8] even The command above is equivalent to execute the explicit listing of sources:: $> mv f01 f03 f05 f07 f09 f11 odd $> mv f02 f04 f06 f08 f10 even Delete files and Folders ------------------------ As we mentioned above, empty folders can be deleted with the command ``rmdir`` but that only works if there are no subfolders or files inside the folder that you want to delete. See for example what happens if you try to delete the folder called ``odd``:: $> rmdir odd rmdir: failed to remove `odd': Directory not empty If you want to delete odd, you can do it in two ways. The command ``rm`` allows you to delete one or more files entered as arguments. Let's delete all the files inside odd, followed by the deletion of the folder ``odd`` itself:: $> rm odd/* $> rmdir odd Another option is to delete a folder recursively, this is a powerful but also dangerous option. Even if deleting a file is not actually filling with zeros the location of the data, on HPC systems the recovery of data is practice unfeasible. Let's delete the folder even recursively:: $> rm -r even Summary of Basic Commands ------------------------- The purpose of this brief tutorial is to familiarize you with the most common commands used in UNIX environments. We have shown 10 commands that you will probably use very often in your interactions. These 10 basic commands and one editor from the next section is all that you need to be ready for submitting jobs on the cluster. The next table summarizes those commands. .. table:: WVU's High-Performance Computer (HPC) Clusters :widths: 10 20 30 +---------+-------------------------------------------------------+-----------------------------------+ |Command | Description | Examples | +=========+=======================================================+===================================+ |``echo`` | | Display a given message on the screen | | ``$> echo "This is a message"`` | | | | | +---------+-------------------------------------------------------+-----------------------------------+ |``cat`` | | Display the contents of a file on screen | | ``$> cat my_file`` | | | | Concatenate files | | +---------+-------------------------------------------------------+-----------------------------------+ |``date`` | | Shows the current date on screen | | ``$> date`` | | | | | Wed Nov 7 10:40:05 EST 2018 | | | | | +---------+-------------------------------------------------------+-----------------------------------+ |``pwd`` | | Return the path to the current working directory | | ``$> pwd`` | | | | | /users/username | | | | | +---------+-------------------------------------------------------+-----------------------------------+ |``cd`` | | Change directory | | ``$> cd sub_folder`` | +---------+-------------------------------------------------------+-----------------------------------+ |``mkdir``| | Create directory | | ``$> mkdir new_folder`` | +---------+-------------------------------------------------------+-----------------------------------+ |``touch``| | Change the access and modification time of a file | | ``$> touch new_file`` | | | | Create empty files | | | | | | +---------+-------------------------------------------------------+-----------------------------------+ |``cp`` | | Copy a file in another location. | | ``$> cp old_file new_file`` | | | | Copy several files into a destination directory | | | | | | +---------+-------------------------------------------------------+-----------------------------------+ |``mv`` | | Move a file in another location. | | ``$> mv old_name new_name`` | | | | Move several files into a destination directory | | | | | | +---------+-------------------------------------------------------+-----------------------------------+ |``rm`` | | Remove one or more files from the file system tree | | ``$> rm trash_file`` | | | | | ``$> rm -r full_folder`` | | | | | +---------+-------------------------------------------------------+-----------------------------------+