Introduction to Astronomical Computing

At the College of Charleston

(last updated 21 August 2012 by J.E. Neff)

Exercise #1 – Getting Started

The astronomical computing labs at CofC consist primarily of the computers in 145 SCIC, the observatory, and astronomy faculty offices and labs. In most cases, these are UNIX machines. You will need to develop basic unix skills and to master a programming language (IDL) and some astronomical packages (e.g. IRAF). This exercise introduces you to the computing environment and teaches you how to get started in unix. Later exercises will introduce you to the windowing environments, IDL, IRAF, etc.

I. POLICY and HARDWARE OVERVIEW

Before doing anything else, <a href="computer145.pdf">READ THE POLICY DOCUMENT</a> and ask if you have any questions or comments.

All of the machines share their disk space via NFS (covered later). The solaris machines use COOLSTAR as a software server (i.e. all of the astronomical software is installed on that machine; the other machines run that copy as if it were locally installed). We do not yet have the linux or os-x software served in this way, but we will in the future.

You have all been assigned accounts on the computer cluster in 145 SCIC. Currently each machine has its own separate login. In the future, we hope to make a common login so that you can log in anywhere and use any of the network resources regardless of which machine you are physically working on.

Once you have read the policy, (1) log onto one of the machines and use the System Preferences to change your password. (2) Do not change your "picture"; these identify the different research groups. (3) Bring up a terminal window. Log into each of the other machines (e.g. ssh spitzer) by typing in your username and initial password. Once you are logged in, change your password by typing "passwd", then entering your initial password followed by your new password twice.

II. POLICIES

Because these machines all share resources over the network, it is critical that you NEVER POWER OFF any of them! It is recommended that you power off the monitor when you leave, but never power off the machines. If there's a power outage, or if you accidentally power one down, contact Professor Neff (843-953-5325; neffj@cofc.edu).

Please keep room 145 clean. Do not eat or drink around the keyboards. Do not put fingerprints on the monitor screens. Do not leave CDs in the drive bays or flash drives in the USB slots.

Except for transferring your personal files using a portable flash drive, do not attach anything to the USB or firewire ports of any of the machines without Dr. Neff's permission. You can inadvertently create a network disaster or create a serious security hole.

Accounts on all machines are assigned by Professor Neff. He also serves as the "system administrator" for all of these machines, so he is the primary person to contact if you have problems. When you receive your account, you agree to abide by all the policies set forth here and in the future. They will be kept to a minimum, but they are not arbitrary or capricious. They are to ensure network security and to ensure access to multiple users.

However, nothing is formally supported; we all share our expertise. This is primarily an in-house, departmental facility. DO NOT contact the CofC HELPDESK with questions. Please bring yourself up to speed as quickly as possible. Use the manuals on the shelves, internet documentation, and on-line help to become proficient yourself. Later in this lesson, we'll discuss some of the important documents.

III. USER ACCOUNTS

UNIX is a true multi-user, multi-tasking operating system. It's older than Apple or Microsoft, but it's also the wave of the future, at least in scientific computing! The "personal computing" paradigm just doesn't fit the kind of work that many of us do with computers. Any number of users can be "logged in" and using the system resources simultaneously. You can either log in at the system console or through a remote network connection. Depending on how the machines are configured, you might not be able to access all of the power of the windowing system unless you are logged in at the console.

If you have not already logged in, do so now. If you were assigned a temporary password, the first thing we need to do is change that to something secure. There are many levels of security in unix, but they can be compromised by poorly chosen user passwords. So pick a good one, but one that you can remember. Passwords are case sensitive! Most of the special characters are allowed, but it is a VERY BAD IDEA to put a "space" in your password. Ideally, choose a password of 8 characters (some systems ignore anything after the first 8 characters anyway) that is not a word in any dictionary and that has a mixture of numbers, letters (some capitalized and some not), and special symbols. Do not use a simple substitution for a common word (e.g. replacing "oh's" with "zeros" or "el's" with "ones"); that's as bad as choosing a word from a dictionary. DO NOT use your email password. That password gets transmitted "in the clear" all the time, and anyone can steal it. We will adopt measures to make sure that if your unix password needs to be transmitted over a network line, it is encrypted first. It is ok to use the same password on every machine, but it is not ok to keep using your initial password. I will be checking for accounts whose passwords have not been changed and locking them.

When you logged in, you almost certainly got a windowing system (KDE, CDE, or Aqua in OS-X). This lesson will not cover the windowing systems. They are mostly self-explanatory, and they all have built-in help and tutorials. You will also need to spend time "playing with" the windowing system before you become comfortable. You'll find that many tasks can be performed either in the windowing system or by issuing unix commands. You'll develop a feel for which is most effective as you practice both. OS-X works mostly through the FINDER application. KDE, CDE, and Microsoft Windows have a similar application, but it might not start up by default.

How do you "talk" to the unix machine? Most of these windowing systems hide that from you. You need to start up a "terminal" window. How you do this depends on which machine. In OS-X, there is a "terminal" application in the Applications/Utilities folder. You should use the finder to find the icon and pull it onto your dock; you're going to use it a lot. KDE and CDE have a menu of options if you left click on the button in the lower left or right click on any of the background parts of the screen. Look for something called "terminal", "console", or "xterm". You can have any number of terminal windows open.

Once you have a terminal window, type passwd and hit the Enter key to change your password (from here on you should assume you hit the Enter or Return key after each command to send it to the unix level). The system will respond with a prompt asking you to enter your current password. This is a security measure to keep anyone from using an open keyboard to gain access to your account. After you've typed your current password, you will be prompted to enter your new password twice. If you can't make it work, make sure you don't have the Caps Lock key on, and make sure you type your new password the same both times.

IV. The UNIX FILESYSTEM

You can think of the unix operating system as a single system of "files". The others are, too, but using windows has hidden most of that from beginning users. You'll need to master the file system right away. You can think of it as a tree or as a system of folders and subfolders (and subsub and subsubsub ...). Folders (traditionally called directories) can contain either simple files or subdirectories. EVERYTHING is somewhere in the file system – hardware, software, user files, networked devices; everything.

Where are you in the file system? Type pwd into your terminal window and carefully note the system response. What you see is called your "home" directory. Whenever you log in, this is where you start out. pwd means "print working directory". It's a typical unix command: short, obscure, and lower case. If you are on a Mac, it probably responded with something like /Users/yourname; on the Linux machines it probably said /home/yourname; on the Solaris machines it will be /export/home/yourname. yourname is the username you were assigned and used to log in. That's your identity on the system. Anything from this level of the file system on down, you control. Anything "above" this in the file hierarchy belongs to the system or to other users.

Before learning how to move around, let's see what you have already. The command to view the files in your directory is ls. Type that and tell me what you see. Possibly nothing, because you haven't created anything yet. Linux and OS-X create all sorts of junk when your account is created, so you'll probably see some file names. ls by itself doesn't tell you anything except the names of the files in the current working directory (i.e. the one pwd told you about). Like most unix commands, ls can be used in a whole bunch of ways. Here are a few simple extensions of the ls command to try:

· ls -F: if you have any files, this will append a * to the end of the name if they are executable, and it will append a / to the end of the name if the file is actually a subdirectory

· ls -a: some files are "hidden"; their names start with a period (.), and in order to see them you need the -a option

· ls -l: this shows you a full listing of all the files and their properties

· ls -alF: this combines all three options above
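To see all four variations side by side, here is a minimal sketch you can paste into a terminal. It works in a throwaway directory, and the file names (notes.txt, .hidden_prefs, data, hello.sh) are made up for the illustration. It is written in Bourne-shell syntax, so if you are in tcsh, type sh first; it also borrows chmod, a command this exercise has not introduced, just to make one file executable.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
scratch=$(mktemp -d)
cd "$scratch"

touch notes.txt .hidden_prefs     # one ordinary file and one hidden file
mkdir data                        # one subdirectory
printf '#!/bin/sh\necho hello\n' > hello.sh
chmod +x hello.sh                 # mark the little script as executable

ls          # names only; .hidden_prefs is not listed
ls -F       # data gets a trailing / and hello.sh gets a trailing *
ls -a       # now . and .. and .hidden_prefs show up as well
ls -l       # one line per file: access, owner, group, size, date, name
ls -alF     # all three options at once
```

When you are finished, you can remove the whole scratch directory; nothing outside it was changed.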

Let's look at the ls -alF results. The first column on the left has a - if this is a regular file, a d if it is a directory, and one of several possible symbols if it's a special file. If you have a directory in there somewhere (let's say it's called "directoryname"), type ls directoryname to see what's inside it. The next 9 columns are very important. There are 3 kinds of file access, and there are 3 levels of file "ownership".

The first level is called "user". In this example, the user is you (or anyone logged in with your password, so don't let anyone do that!). The first 3 columns list whether you have read (r), write (w), or execute (x) access to the file. The only things that should be executable are programs and directories. Execute access on a directory means that you can "move" into that directory and get at the files in there; read access lets you list what's in there, and write access lets you create and delete files in there. That is why there's no need for a 4th "delete" attribute: deleting a file is controlled by write access to the directory that contains it. The default is rwx for directories and rw- for normal files.

The second level is called "group". Groups are assigned by the system administrator. You have a unique username, but you may be a member of many different groups. Student accounts are all in the "students" group, for example. Different research groups might have their own group name (e.g. "grb"). The next 3 columns show what anyone who is in the group can do to your files. The default is r-x for directories and r-- for normal files.

The last three columns describe what anyone who is logged into the system ("other"), or anyone who has access to the file system (even if they are not logged in to this computer) can do with your file.

Later we'll learn how you can control these attributes. Obviously you should never create any files that give write access to "other" (e.g. ---, r--, r-x are all ok, but rwx and rw- are not ok). But unix is a collaborative system, and there are many good reasons we might want to share files. So by default, your "home" directory is rwxr-xr-x. Therefore any files in your home directory can be viewed, but not changed or deleted, by anyone on the system. You will learn how to create directories that can only be shared with members of the group or with nobody (except the notorious "superuser").

After the access control (and a small number, the link count, which you can ignore for now), you'll see a column telling you who the owner of the file is. That's you now, but we could have issued this command anywhere on the system. The next column tells you the group ownership of the file. Then there's a column telling you how big the file is. This number can take several forms. Generally for normal files, it is the size in bytes. Big numbers mean big files.

The next column gives the date and time that the file was last changed. Finally, the last column gives you the full file name.
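If you want to practice reading the access columns on files whose settings you already know, the following sketch may help. It sets the access explicitly with chmod, which this exercise has not covered yet (treat it as a preview, not as the method we will use later), and the names shared and notes.txt are invented for the example. It uses Bourne-shell syntax, so type sh first if you are in tcsh.

```shell
# Throwaway directory, so your real files are left alone.
scratch=$(mktemp -d)
cd "$scratch"

mkdir shared
echo "some text" > notes.txt

chmod 755 shared       # user rwx, group r-x, other r-x
chmod 644 notes.txt    # user rw-, group r--, other r--

# -d lists the directory itself instead of its contents.
ls -ld shared notes.txt

# Keep only the first field: the file type plus the 9 access columns.
dir_mode=$(ls -ld shared | awk '{print $1}')
file_mode=$(ls -l notes.txt | awk '{print $1}')
echo "$dir_mode"       # begins with drwxr-xr-x
echo "$file_mode"      # begins with -rw-r--r--
```

Some systems tack an extra character (such as . or + or @) onto the end of the access field; it can be ignored for this exercise.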

A few words about file naming conventions. Unix is case sensitive. File1.txt is not the same as file1.txt. Extensions are useful to give a sense of what's in the file, and they are used by some applications (like the finder on a Mac) to associate the file with an application. But they are not special in any way to unix. Any of the following are valid filenames: file1.doc file1_aworddoc file1 file1.doc.mydoc.quack.quack. It is ALWAYS a bad idea to include spaces in filenames. They are the plague of Microsoft and Apple. The problem is that the words of a unix command are separated by spaces, so unix often gets confused if you are doing something with a file and it encounters a space in the name. So don't use them in filenames. If your filename starts with a period (e.g. the file .cshrc), it is a special "hidden" file. That's really the only thing that is special about it: you can't see it unless you add the -a option to your ls command. Hidden files are often used to set up preferences for programs and applications, and they are hidden for your convenience only. Your home directory will fill up with them as you progress. Another character you must never use in a unix file name is the / character. Can you see why? Type pwd again... / plays a special role in the unix file system.
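Here is a small sketch of why spaces cause trouble. The helper count_args is made up for this illustration (it is not a standard command); it just reports how many separate words the shell handed to it. The syntax is Bourne shell, so type sh first if you are in tcsh.

```shell
# Throwaway directory for the experiment.
scratch=$(mktemp -d)
cd "$scratch"

# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

count_args my data.txt       # prints 2: the space splits the name in two
count_args "my data.txt"     # prints 1: quotes hold the name together
count_args my_data.txt       # prints 1: no space, no quoting needed

touch "my data.txt"          # creates ONE file with a space in its name
touch my data.txt            # creates TWO files, called my and data.txt
ls                           # three files are now listed
```

Every command you ever use on "my data.txt" will need those quotes, which is why an underscore is the safer habit.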

For example, your home directory might have been /home/yourname. That means there's one level above your home directory, and it's called /home. There's even a level above /home. What do you think it's called? Let's find out. Type cd and then pwd to make sure you are starting in your home directory. Now type cd .. then pwd. cd ("change directory") with no arguments returns you to your home directory from anywhere on the system. The .. argument to cd means "move up one level". Yeah, I know, I didn't invent it though. You should now be in /home. Do an ls -l to see the home directories of the other users on this machine.

One of them will be neffj. Type cd neffj to move into my home directory. Since this argument didn't start with anything funny like a .. or a /, cd looked for a directory called neffj in the current directory. Hopefully there's nothing embarrassing in there. Maybe you don't have access. But if you made it there, let's learn how to peek into one of the files. The file called .cshrc is special; you'll use it to define a bunch of defaults and aliases. To see what's in mine, type cat .cshrc (you could also type more .cshrc to see the same thing one screenful at a time). Let's go home now. Type cd again then pwd to make sure.

Another way to get to my directory would be to use its full pathname. Type cd /home/neffj (or /Users/neffj or /export/home/neffj depending on which system you are on) then ls and you should see the same thing you saw a minute ago. In other words, you move around using cd, and you can either give an "absolute path" starting with a / and including everything above that directory, or you can give a "relative path" (from here up, from here down, etc).

The absolute top level directory is called the "root directory". Change directories until you are there and type ls -l. Everything will be owned by funny names like "root" and "admin", which means you can't change it. But you have read access to a lot of it, and you have execute access to many of the directories, so you can look around and see what's where. Practice for 5 or 10 minutes snooping around the system with cd, ls, and cat (or more). When you are done, return to your home directory.
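If you'd like to rehearse absolute and relative paths somewhere harmless first, this sketch builds a tiny pretend file tree in a scratch directory. The layout (home/yourname and home/neffj) and the one-line .cshrc are stand-ins invented for the example, not the real directories on our machines. It uses Bourne-shell syntax, so type sh first if you are in tcsh.

```shell
# Build a small pretend tree: <top>/home/yourname and <top>/home/neffj
top=$(mktemp -d)
mkdir -p "$top/home/yourname" "$top/home/neffj"
echo "# sample settings" > "$top/home/neffj/.cshrc"

cd "$top/home/yourname"    # absolute path: starts with a /
pwd

cd ..                      # relative path: up one level, into home
pwd

cd neffj                   # relative path: down into neffj from here
cat .cshrc                 # peek at the file

cd "$top/home/yourname"    # back to the pretend home directory
cat ../neffj/.cshrc        # same file, reached with a relative path

cd /                       # the root directory, the top of everything
pwd                        # prints /
```

The two cat commands show the same file; only the way of naming it differs.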

V. SETTING UP YOUR ENVIRONMENT

Whenever you start up a new terminal (or anything that opens a new window), you are creating another unix "shell". There are many types. For our purposes, the shell consists of the commands we type and special things that are "known" to the system wherever we are in the file system and whatever we are doing. Unix is powerful. You can create a shell that uses any commands you want and keeps track of all this extra information any way you can imagine. Kind of strange, then, that we still use cryptic commands for everything, but that's how it goes. On our system, every user account is established to enter the "c-shell" by default. Actually, it's not the basic unix c-shell, but an extended one called tcsh. Anything you read about c-shells also works in tcsh. The main extra feature is that arrow keys allow you to recall, edit, and reissue commands. For example, type the following series of commands: cd, pwd, ls, up arrow, add a -a to the end of the ls line, hit return/enter, hit the up arrow 4 times and add a .. to the end of the cd command, return/enter, then hit the up arrow twice and hit return/enter. Later on, the commands will get long, so this is a useful feature.

Whenever you start a new shell, for example by opening a terminal window, unix looks for a file called .cshrc in your home directory. This special hidden "rc" file sets up some shell-wide definitions to make your life easier. Among other things, it creates "environment variables" and defines "aliases". Look at your .cshrc file. It looks different from mine. Rather than building one up from scratch, just grab a copy of mine. You have read and execute access to my home directory, and since you have read access to that file, you can also copy it. Before copying my .cshrc to your home directory, first make a backup copy of yours: mv .cshrc .cshrc.ORI then ls -a to see what happened. mv is short for "move", or rename. Be careful with this command, because you can erase stuff accidentally. By default, mv won't let you move a new file on top of an existing file, erasing it in the process. But we can change that default behavior. Now type cp ../neffj/.cshrc . (yes, that's a period at the end; the command copies the file called .cshrc that's located up one level from your home directory, then down into the neffj directory, to the current working directory; . means "here" and .. means "up one level"). You could also type cp /home/neffj/.cshrc .cshrc; same thing.
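The backup-then-copy sequence can be tried safely on stand-in files before you do it for real. In this sketch the two pretend home directories and the one-line .cshrc files are invented for the example. It uses Bourne-shell syntax, so type sh first if you are in tcsh.

```shell
# Pretend home directories with a pretend .cshrc in each.
top=$(mktemp -d)
mkdir -p "$top/home/yourname" "$top/home/neffj"
echo "# original settings" > "$top/home/yourname/.cshrc"
echo "# neffj settings"    > "$top/home/neffj/.cshrc"

cd "$top/home/yourname"

mv .cshrc .cshrc.ORI       # rename: your old file survives under a new name
cp ../neffj/.cshrc .       # copy to "here", keeping the name .cshrc

ls -a                      # lists both .cshrc and .cshrc.ORI
cat .cshrc                 # the copied contents
cat .cshrc.ORI             # your original contents, safely backed up
```

If the new file ever misbehaves, mv .cshrc.ORI .cshrc puts the original back.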

We want to do more than look at it; we want to make some changes (and you'll keep modifying this until your shell is comfortable for you). To do that, we need an editor. The universal editor on all unix systems is called vi, but it really sucks. Learn it if you want, but on all of our systems, I installed one called pico. There are many other choices, but this should work for you for now. To start it up, type pico .cshrc (make sure you are in your home directory, not mine). You will see a page of the file, and you can use the arrow keys to move around, then the backspace or delete key to make changes. When you are done, you save your work by typing Ctrl-X, then answering y to save the results, and then hitting return/enter to save it to the same file name (or answer n if you screwed up; if you didn't make any changes it will just exit). All of the control characters are summarized cryptically at the bottom of the screen, but this is all you need for now.

In the .cshrc file, you'll see some important things: umask, cdpath, prompt, PATH, LD_LIBRARY_PATH, MANPATH, INFO_PATH, a bunch of aliases, and some setenv commands. We'll learn what they all mean in fairly short order. But for now, let's just make 3 minor changes for practice. First, comment out the alias defining ls. Do this by putting a # at the beginning of the line. Now add a line below it that says alias lsa ls -aF. Finally, in the PATH definition, delete the . entry (the single period). Save the file.

Now type ls and what happens? It gives you an ls -aF. That's because we changed the .cshrc file but did not tell the shell about it. To use the new .cshrc file, you need to type source .cshrc and then, if you type ls, you should get the raw ls listing. When you type lsa you'll get the ls -aF listing. It also changed your path so that you cannot accidentally execute programs in your current working directory, which you shouldn't be trying to do anyway. You can define your own aliases either in your .cshrc file or at a command prompt (but if you do the latter, it won't be remembered next time you start a new shell). For example, type ll then type unalias ll then type ll then type alias ll "ls -alF" then type ll (those are all "el's" not "one's").

The PATH is an important environment variable. It tells unix where to look for a program if you just type its name. To see how it works, type printenv and you should get a listing of a bunch of stuff, including PATH. You know of a program called pico. It must actually be an executable file somewhere in the PATH. Where? Type which pico, then cd to the directory that pico is in, do a full directory listing to see its attributes, and verify that it is an executable file that you have read and execute access for. Now type which piceau. It will look in the same path and should tell you that it didn't find a file called "piceau" anywhere in there.
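The same search can be sketched in a few lines. Two substitutions are made here and should be read as assumptions: the example looks for ls instead of pico, because ls exists on every unix machine, and it uses command -v, which is the Bourne-shell cousin of which. Type sh first if you are in tcsh.

```shell
# PATH is a colon-separated list of directories; show one per line.
printenv PATH | tr ':' '\n'

# Ask the shell where it finds the ls program.
ls_location=$(command -v ls)
echo "$ls_location"

# Full listing of that one file: look for x in the access columns.
ls -l "$ls_location"

# A made-up program name should not turn up anywhere in PATH.
if command -v piceau > /dev/null; then
    echo "piceau was found (that would be a surprise)"
else
    echo "piceau: not found in PATH"
fi
```

The directories are searched in the order listed, so the first match wins.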

VI. LEARNING MORE

We've only scratched the surface. The first thing you need to do is play around with the commands you know. Use pico to create a text file in your home directory. Practice copying it, renaming it, editing it, etc. Make yourself a subdirectory to work in. To make a subdirectory called TEMP, use the command mkdir TEMP. To move a file from your home directory into the subdirectory, type mv filename TEMP. To go the other way, cd into the subdirectory and type mv filename .. (the .. means "up one level"). I can't guarantee that the default printers are set up right for you yet, but you can try printing a file with the command lp filename. If it looks funny, try the print alias from your .cshrc: print filename. If it still doesn't work, we'll cover it in a later lesson.
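Here is one possible practice session, done in a scratch directory so that a slip can't hurt anything. The names filename.copy and renamed are invented for the example, and echo stands in for pico so the steps can be pasted in one go. It uses Bourne-shell syntax, so type sh first if you are in tcsh.

```shell
# Scratch directory to practice in.
scratch=$(mktemp -d)
cd "$scratch"

echo "practice text" > filename    # stand-in for a file made with pico

mkdir TEMP               # make the subdirectory
mv filename TEMP         # the file is now TEMP/filename
ls TEMP

cd TEMP
mv filename ..           # move it back up one level
cd ..
ls

cp filename filename.copy    # copying: now there are two files
mv filename.copy renamed     # renaming: mv with a new name
ls
```

At the end you should see filename, renamed, and the empty TEMP directory.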

All unix commands have online help, called the "man pages". Look at the man pages for the commands you know. For example, man ls

There are a couple of unix books on the shelf. The oldest ones are best at this point, because they focus just on the raw, simple unix commands. There are a host of on-line tutorials. A Google search should reveal several for you to work through. Share the good ones with other students.

Talk to other students. Share your tips and tricks. The only way to learn is to just dive in.

As you've already seen, there are multiple ways to accomplish the same task. The windowing system gives you even more control (e.g. finder does all the cd'ing and cp'ing stuff graphically). You need to learn both ways. Finder sometimes takes a lot longer to accomplish a simple task, it is sometimes dangerous (e.g. when moving a lot of files at once; they can disappear), and typing is often quicker than clicking. Before you know it, you'll be stringing unix commands together into higher level "scripts".