Introduction to Astronomical Computing
At the College of Charleston
(last updated 21 August 2012 by J.E. Neff)
Exercise #2 -- Getting Started with IDL
This is
a short exercise to get you started using IDL. This is the only programming language you will ever need to
know, unless you plan to work with instrumentation or do heavy-duty modelling. It is the most common language used by
astronomers. It was written by
scientists for scientists, so the syntax is fairly easy to master, and it
provides a platform to apply your scientific creativity to make the computer do
what you want. If you have any
other programming experience, you will quickly become a convert to IDL. If you don't, some of the early
learning curve might seem obscure, but it will be time well spent. You've picked up a little bit of unix (operating system) and iraf (data analysis package)
experience. In those, you mostly
"operate" the computer, performing actions that were anticipated by
the developers. In a programming
language, you are in charge. To be
fair, there is a lot more to learn about unix and
iraf, including the ability to "script" with them. Don't stop learning those, but learn to
enjoy IDL.
I. The Nature of IDL; Starting up IDL; Getting Help
IDL is
several things at once. First and
foremost, it is a high-level programming language. It is also an INTERACTIVE operating environment (kind of
like iraf and unix). Finally, it
is a collection of an almost unlimited library of useful tasks and programs,
mostly written by other scientists (also kind of like iraf, except that these
programs are readable and changeable).
You can be issuing commands interactively, compiling and running
programs, and calling library routines all at the same time.
It is
always best to run IDL from an xterm.
Fire up an xterm window and type
IDL. If everything is set up
right, you should get no errors and a COFC_IDL> prompt. If you have errors, let me know now so
I can fix them. There are other
ways to run IDL, but you can easily learn those later. The first IDL command you should type
is HELP. This command tells you what is in IDL's "active"
memory. There is a memory level
(it should say $MAIN$), variables (you won't have any yet), compiled procedures
(probably ASTROLIB and $MAIN$), and compiled functions (you shouldn't have any
of these yet either). Type HELP regularly as we go along. You'll
quickly develop a feel for what's going on. One reason IDL is so flexible is that it operates primarily
in RAM. When you start it up,
it allocates a fixed amount of space.
$MAIN$ is the top-level in the memory tree, kind of like / is the top
level in the unix file system. Most of the time, you will be
interacting on the $MAIN$ level, but sometimes you get stuck in a
subroutine. If that happens, IDL
will stop and allow you to fix things and move on. In that case, when you type HELP, it might appear that IDL
has lost everything you thought it knew.
Have no fear, it's still there. HELP just shows you the active part of
the memory. This might sound like
a bug right now, but you'll soon see it to be a valuable feature.
If you
type a ?, you
will get a separate window that provides access to all of the on-line
documentation. There's more
documentation on the shelves and on the web. You can leave this help window up all
the time if you wish.
II. Variable Types; Special Symbols
We
already talked about this in class a bit, so you should have no trouble
understanding the need for different variable types. Integer variables
store numbers from -32768 to +32767 using 2 bytes (i.e. 16 bits). Long-word
integers use 4 bytes, so they range from -2,147,483,648 to +2,147,483,647. Floating point numbers are stored with 4 bytes each,
reserving some space for the sign, the exponent (which therefore has an upper
limit), and the sign of the exponent. The remaining bits hold the significant digits, about 7 of them.
Double precision numbers are
64-bit floating point. You usually
don't have to worry about double precision and longword, but you have to be
aware of them. Byte variables are stored in a single
8-bit byte, so they are actually numbers ranging from 0 to 255 (IDL nearly
always starts at 0). The entire
ASCII character set can be represented this way; nearly all image display
values are also converted to BYTE form before they are displayed (you can think
of it as a gray scale with 0 = black and 255 = white, or vice versa). Remember that IDL works in the
RAM, so the more space your variables take up, the less free memory is
available to crunch numbers. This
is not going to be an issue until you get involved with something more heavy
duty than we are doing here. IDL
also has the power to treat a STRING
as a variable. These are generally
text. For example, you could type silly='Hello there' and then print,silly. Go ahead and try it. Now type a=2 then type b=2L then
type c=2.0 then type d=2D. Type HELP and see
if it all makes sense.
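Here is a minimal sketch of the same experiment, in case you want to check your reading of the HELP output. It only uses the variable names from the paragraph above plus the built-in SIZE function with its /TNAME keyword, which returns the type name as a string.

```idl
; The same value, 2, stored as four different data types.
a = 2        ; 16-bit integer
b = 2L       ; 32-bit longword integer
c = 2.0      ; 32-bit floating point
d = 2D       ; 64-bit double precision
help, a, b, c, d
; SIZE(x, /TNAME) gives the type name without the rest of the HELP output.
print, size(a, /tname), ' ', size(b, /tname), ' ', $
       size(c, /tname), ' ', size(d, /tname)
```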
One
powerful aspect of IDL compared to other languages is that you do NOT have to
explicitly tell it what kind of variable you want to store your value in. It will choose
the easiest one by default. Unlike
most other languages, it can also operate on variables of different types. For example, let's operate on your
numbers: e=a*b f=a*c g=a*d
(hit the return key between each command) then print,a,b,c,d,e,f,g and type HELP . Can you follow
what's going on? What will happen
if you type print,a*silly
? How about print,1/a,1/c ? Try them out and see if it makes
sense. IDL will automatically
convert the variable types before operating on them in order to maintain the
highest precision. Not always,
though. If you give it all
integers, it will not convert them to floating point before operating on
them. But if one of your numbers
is floating and the rest are integer, it will convert all of them to floating
before the operation. Note that it
doesn't change your variables in its memory permanently, just long enough to
get the job done (i.e. type HELP and you'll see that a is still an integer).
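The integer trap described above is easiest to see with division. This is a small sketch using the same a and c as before; nothing here is new except the 1./a line, which shows that making either operand floating point is enough.

```idl
a = 2
c = 2.0
print, 1/a      ; integer / integer: the result is truncated to 0
print, 1/c      ; integer / float: the result is 0.500000
print, 1./a     ; a floating point numerator also gives 0.500000
help, a         ; a itself is still an INT
```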
IDL
also provides commands to convert a variable of one data type to another. FIX turns a floating point or string
(under certain conditions) into an integer by truncating, not rounding [ type print,fix('2'),fix(1.49), fix(1.51) ]. FLOAT turns an integer or string into a floating point. LONG converts to a
longword integer, and DOUBLE converts to double precision.
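A short sketch of the conversion functions side by side. ROUND is included only for comparison with FIX; it is a built-in function that this handout does not otherwise use. The numbers are arbitrary.

```idl
print, fix(1.49), fix(1.51)        ; FIX truncates, so both give 1
print, round(1.49), round(1.51)    ; ROUND gives 1 and 2
print, float(3), float('2.5')      ; integer and string to floating point
print, long(70000.9)               ; 70000, which would not fit in a 16-bit integer
print, double(1)/3                 ; the division is done in double precision
help, fix('2'), long(2), float(2), double(2)
```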
There
are very few rules regarding variable names, but there are special symbols that
should never be used. Any variable
that begins with a ! is a
"system variable". These
are variables whose value is known to IDL no matter what level or subroutine
you are in. Otherwise, when
control passes to a subroutine, all the "main" level variables are
temporarily forgotten until the subroutine is finished. You have a few system variables set up
already. print,!path
!path tells IDL where to find a
procedure (the idl name for program).
If you type a word that doesn't correspond with one of its built in
commands, it will look through this path in order until it comes to a file with
that name and a .pro extension.
Then it will compile and immediately execute that program. Part of your search path is procedures
that came with IDL, and part of it is stuff that I loaded that is relevant to
astronomical data analysis. When
you get a chance, start snooping around in your search path to find, read, and
try to interpret some real IDL programs.
If you
want to issue a unix command from inside IDL, you can
start it with a $ (e.g. $pwd). $ has a different meaning inside a program (it tells idl to
keep reading on the next line rather than executing the command when it gets to
the carriage return). Don't put $
in a variable name. Also, NEVER
EVER use spaces in a variable name or a procedure name.
Comment
lines in IDL programs start with a semicolon, so never use one of those in your
variable and procedure names.
Don't use a . in the
name either. IDL has another
(actually several more) variable type called a structure. These are incredibly powerful, but they
have .'s in them. For example, print,!d.name
and it should tell you what kind of device IDL is communicating with
(probably "X" for now).
Some special commands begin with a "dot" (e.g. .run), so just
don't use it in variable names (your procedure names will of course be called something.pro, but to IDL you will just
call them something; for example,
we'll be using a command juldate, which is actually a program in your path
called juldate.pro; go ahead and locate it and print it out). Dashes (-) are
not a good idea, because they might get interpreted as a minus sign! If you
need some kind of delimiter, use an underscore (_). When you read the introductory IDL manuals, you will come
across several more special symbols, but these are the main culprits.
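To see the special symbols in context, here are a few lines as they might appear inside a program. The variable names x_values and y_values are made up for illustration.

```idl
; A line that begins with a semicolon is a comment.
x_values = findgen(10)        ; underscores are safe in variable names
; A $ at the end of a line continues the command on the next line.
y_values = 3.0*x_values^2 $
         - 14.0*x_values  $
         + 14.5
print, y_values
; The dot picks one field out of a structure, here a system variable.
print, !d.name
```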
IDL is
not case sensitive, but filenames and unix commands
are. print,a
is the same as PRINT,A or print,A
III. Vectors and Arrays
Quite
possibly the single-most powerful aspect of IDL, from the perspective of a
scientific programmer, is the way it handles arrays of numbers. Simply put, it treats an array as a
single variable. If you want to
divide one image by another, you just type result=image1/image2 and IDL figures
out the rest. This may not sound
so impressive to you if you haven't struggled through writing loops in other
languages to operate on each number in the array one at a time and then put the
resulting numbers in the right spot in the resulting array (after declaring all
the variable names and data types).
All of the data types that apply to single variables also apply to arrays.
A 1-D array (i.e. a column of numbers) is called a
"vector". Mostly in
astronomy we deal with 1- and 2-dimensional arrays. A spectrum, for example, is a vector, and an image is a 2-d
array (which we'll just call an "array" from here on, because
everything you'll learn also applies to arrays of any dimensions, but we seldom
use them).
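A tiny sketch of the image division mentioned above, using two made-up 4 x 3 arrays in place of real images so that you can check every number by eye.

```idl
image1 = findgen(4, 3) + 1.0     ; 4 x 3 array holding 1.0 through 12.0
image2 = fltarr(4, 3) + 2.0      ; 4 x 3 array filled with 2.0
result = image1/image2           ; element-by-element division, no loop
help, result
print, result
```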
You can
also string multiple operations together in a single command line; you don't
have to define intermediate variables or read and write disk files. Of course, you can if you want to; IDL
has the power. But if you don't
need to, it uses less memory and runs faster to operate on your variables
inside a command line. You'll see
some examples below.
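As a quick illustration, here is the same calculation done both ways. The variable names phase, wave, peak, and peak2 are hypothetical.

```idl
x = findgen(100)/100.
; Three steps with intermediate variables...
phase = x*2*!pi
wave  = sin(phase)
peak  = max(wave)
; ...and the same thing strung together in one command.
peak2 = max(sin(x*2*!pi))
print, peak, peak2
```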
If a
vector called charlie is 25 elements long, the first element is charlie(0) and the last is charlie(24). In 2-dimensional arrays, you can think
of the index values as columns and rows, in that order: the first index counts across (x) and the second counts up (y).
For example, in a 1024 x 1024 image that you called fred, fred(512,736) is the value of the pixel in column 512
and row 736 (counting from 0).
Each row and column of the 2-dimensional array can be treated as a
vector by giving a wildcard to the other dimension (e.g. fred(512,*)
is the entire column 512 of fred, and fred(*,736) is the entire row 736).
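Subscripts are easier to follow on an array small enough to print. This sketch uses a 4 x 3 stand-in for fred. Note that IDL's own documentation calls the first subscript the column and the second the row, which is the convention the comments follow.

```idl
charlie = indgen(25)*10
print, charlie(0), charlie(24)   ; first and last elements: 0 and 240
fred = indgen(4, 3)              ; 4 columns by 3 rows, holding 0 through 11
print, fred
print, fred(1, 2)                ; column 1, row 2: the value 9
print, fred(*, 2)                ; all of row 2: 8 9 10 11
print, fred(1, *)                ; all of column 1: 1 5 9
```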
IV. Running IDL Interactively
You
seldom jump into IDL and start writing programs. Generally, you sit down at the "interpreter" level
and issue a command and look at the result. When it starts doing what you want it to do, you copy these
commands into a text editor somewhere and start piecing together a longer
program (so that you don't have to do so much typing the next time). The ability to interpret commands one
at a time is a rare feature among programming languages, so you should learn to
love it. I'm going to give you a
bunch of random but educational examples.
Type them in, see if you can figure out what's going on, ask questions,
follow your interest into more complex examples. When you get tired of this, we'll learn a little
programming.
First,
type HELP then type EXIT
to get out of IDL. This is the
only way to completely clear all the junk out of its memory. Then start IDL back up and type HELP.
intvec=intarr(100)
fltvec=fltarr(100)
help
print,intvec
intvec=indgen(100)/100
print,intvec
fltvec=findgen(100)/100
plot,intvec
plot,sin(fltvec*!pi)
silly=sin(fltvec*!pi)
plot,silly
plot,20*sin((fltvec-.2)*15)
oplot,[0,100],[-.2,-.2]
intvec=[23,97,111,45,-19]
print,intvec
intvec=intarr(10)
read,intvec
... type in 10 random integers
print,intvec
help
for i=0,9 do read,intvec
... type in 10 random numbers
for i=0,9 do print,i,'my random number is ',intvec(i)
a=findgen(100)/100.
b=fltarr(100,100)
for i=0,99 do b(i,*)=sin(a*2*!pi)
surface,b
contour,b
tv,b
tvscl,b
tvscl,rebin(b,500,500)
print,max(b),min(b),mean(b),median(b)
c=bytscl(b)
print,max(c),min(c),mean(c),median(c)
plot,b(50,*)
plot,b(*,60)
plot,b(*,60),yrange=[-1,0],psym=2
Now try
to make up some commands on your own, based on what you've learned so far. Some will work, and some will
bomb. Think of something simple
you'd like to calculate or plot, and make it work. Show me what you did.
V. Simple IDL Programming
There
are multiple ways to "drive" IDL, and that applies to writing
programs, too. You can write a
one-time-only program interactively.
Try this example....
.run
moron=fltarr(50)
for i=0,49 do moron(i)=3*i^2-14*i+14.5
plot,moron
end
The end command is necessary. When
IDL encounters it, the program will be compiled and then run line-by-line. If there's an error, it will stop at
that line. You can either type RETURN to go back to the level that ran
the program and try again, or you can fix your error and type .CON to continue execution at this
point. Because you can get many
levels deep in subroutines, you sometimes have to type RETURN multiple times to
get back to the $MAIN$ level. You
can save yourself work by typing RETALL,
which sends you back in one big jump.
If you have a procedure already written (a text file that ends with
"end" and has a .pro extension and is in your !path),
you can either type .run filename (without the .pro) or just type the
filename. They do slightly
different things. .run always recompiles the program. If you've made changes to it since it
was last compiled (e.g. if you fixed an error), this will compile and execute
the new version. If you want to
compile it the first time, all you have to do is type the name without the
.run.
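To make the difference concrete, suppose you saved the following lines in a file called parabola.pro somewhere in your !path. The file name and its contents are made up for this illustration.

```idl
; Contents of a hypothetical file named parabola.pro
pro parabola
  x = findgen(50)
  y = 3*x^2 - 14*x + 14.5
  print, 'Minimum and maximum: ', min(y), max(y)
end
```

Typing parabola at the prompt compiles the file the first time and simply runs the compiled copy after that. Typing .run parabola recompiles the file every time, which is what you want after editing it; you then type parabola to run the new version.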
It is a
good habit to always start any procedure with a line that reads pro
procedurename (where procedure name is what you call the program and
procedurename.pro is the name of the file). If variables are passed to this program (i.e. it is a subroutine)
you must have this line. If it is
totally self-contained, or if it just uses the variables already known to IDL,
you have to comment out this line by starting it with a semi-colon (but leave
it in there commented out as a title for your program). After the pro line, it is customary to
include a comment section starting with a line that has just a
;+ on it and ending with a line that has only ;- on it. All the lines between start with a ;. They are
just comments, but if you type doc_library,'procedurename',
those comments will print out on the screen. Many procedures are written flexibly, so that if you call
them wrong or with the wrong number of parameters, they print out instructions
on what to do and die gracefully.
As you'll soon see, there are many ways to send parameters and variables
between programs. Some parameters are required, some are
optional. Some are global
(the system variables) and some are only known to the subroutine. Some are changed in the subroutine then
passed back to the calling routine.
I know this sounds obscure now, but we'll look at a few examples and
you'll soon get the hang of it.
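Here is a minimal sketch of a procedure that follows these conventions. SCALE_IT is a made-up name, not a library routine. It has one required input, one output that is passed back to the caller, one optional keyword, a documentation header, and a parameter check built on the N_PARAMS function.

```idl
pro scale_it, data, result, factor=factor
;+
; NAME:     SCALE_IT  (hypothetical example, not a library routine)
; PURPOSE:  Multiply an array by a constant.
; USAGE:    scale_it, data, result [, factor=factor]
;   data    required input array
;   result  output, passed back to the calling level
;   factor  optional keyword, defaults to 2.0
;-
  ; Wrong number of parameters: print instructions and quit gracefully.
  if n_params() ne 2 then begin
    print, 'Usage: scale_it, data, result [, factor=factor]'
    return
  endif
  ; Use a local copy so the caller's keyword variable is never changed.
  if n_elements(factor) eq 0 then f = 2.0 else f = factor
  result = data*f
end
```

Once this is saved as scale_it.pro in your !path, scale_it,findgen(5),doubled creates a new variable called doubled at the calling level, and scale_it,findgen(5),tripled,factor=3 overrides the default. Typing scale_it with no parameters just prints the usage line.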
If
you haven't already, print out the JULDATE program. What is the Julian date right
now? See if you can figure out how
the program works, and calculate a number. What was the Julian date when you were born? How many days have
you lived? Do all this in
IDL.
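If you want an independent check on your answers, IDL has a built-in function called JULDAY that takes the month, day, and year in that order. This sketch does not use JULDATE at all, so you still need to read that program to see how its result differs. The two dates are arbitrary examples.

```idl
jd_then = julday(1, 1, 2000)     ; arguments are month, day, year
jd_now  = julday(8, 21, 2012)
print, jd_then, jd_now
print, 'Days elapsed: ', jd_now - jd_then
```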