A MATLAB Primer

THE BASICS
MATLAB is a programming language and a computing environment that uses matrices as one of its basic data types. It is a commercial product developed and distributed by MathWorks. Because it is a high-level language for numerical analysis, it allows numerical code to be written very compactly. For example, suppose you have defined two matrices (more on how to do that presently) that you call A and B and you want to multiply them together to form a new matrix C. This is done with the code

C=A*B;
(note that statements in MATLAB generally end with a semicolon, which suppresses display of the result). In addition to multiplication, most standard matrix operations are coded in the natural way for anyone trained in basic matrix algebra. Thus the following can be used:

A+B
A-B
A' for the transpose of A

inv(A) for the inverse of A

det(A) for determinant of A

diag(A) for a vector equal to the diagonal elements of A
With the exception of transposition, all of these must be used with appropriately sized matrices, e.g., square matrices for inv and det and conformable matrices for arithmetic operations.
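
As a quick check of these operations, the following lines can be typed at the command line. This is only a sketch; the matrices A and B are made-up values chosen so that the results are easy to verify by hand.
A=[2 1;1 3];          % a 2x2 matrix
B=[1 0;4 2];          % another 2x2 matrix, conformable with A
C=A*B;                % matrix product, equal to [6 2;13 6]
S=A+B;                % element-by-element sum
Bt=B';                % transpose of B, equal to [1 4;0 2]
d=det(A);             % determinant, 2*3-1*1=5
Ainv=inv(A);          % inverse; Ainv*A is the identity up to rounding error
dg=diag(A);           % column vector of the diagonal elements, [2;3]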

In addition, standard mathematical operators and functions are defined that operate on each element of a matrix. For example, suppose A is defined as the 1x2 matrix
[2 3]
then
A.^2 (.^ is the element-by-element exponentiation operator) yields
[4 9]
(not
A*A, which is not defined for non-square matrices). Functions that operate on each element include

exp, log (the natural logarithm), sqrt, cos, sin, tan, acos, asin, atan, and abs.
In addition to these standard mathematical functions there are a number of less standard but useful functions, such as the cumulative distribution function for the normal,
normcdf (in the Statistics Toolbox). The constant pi is also available.
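
A few of these element-by-element functions can be tried as follows. The sketch uses only base MATLAB functions; note that the natural logarithm is called log and the inverse tangent is called atan.
A=[2 3];
sq=A.^2;              % each element squared, [4 9]
rt=sqrt(sq);          % square root of each element, back to [2 3]
back=exp(log(A));     % log is the natural logarithm, so this recovers A up to rounding
c=cos(pi);            % -1
ang=atan(1);          % pi/4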

MATLAB has a large number of built-in functions, far more than can be discussed here. As you explore the capabilities of MATLAB, a useful tool is MATLAB’s help documentation. Try typing helpwin at the command prompt; this will open a graphical interface window that will let you explore the various types of functions available. You can also type help or helpwin followed by a specific command or function name at the command prompt to get help on a specific topic. Be aware that MATLAB can only find a function if it is either a built-in function or is in a file that is located in a directory specified by the MATLAB path. If you get a function or variable not found message, you should check the MATLAB path (using path to see if the function’s directory is included) or use the command addpath to add a directory to the MATLAB path. Also be aware that files with the same name can cause problems. If the MATLAB path has two directories with files called TIPTOP.m, and you try to use the function TIPTOP, you may not get the function you want. You can determine which is being used with the which command, e.g., which tiptop, and the full path to the file where the function is contained will be displayed.

A few other built-in functions and operators are extremely useful, especially the colon operator:

index=start:increment:end;
creates a row vector of evenly spaced values. For example,

i=1:1:10;
creates the vector [1 2 3 4 5 6 7 8 9 10]. It is important to keep track of the dimensions of matrices; the size function does this. For example, if A is 3x2,

size(A,1)
returns a 3 and

size(A,2)
returns a 2. The second argument of the size function is the dimension: the first dimension of a matrix is the rows, the second is the columns. If the dimension is left out a 1x2 vector is returned:

size(A)
returns [3 2].
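
The colon operator and size can be explored with a few more cases. This is a small sketch with arbitrary values; it shows that the increment may be omitted or negative, and that size can return both dimensions at once.
i=1:1:10;
j=1:10;               % the increment defaults to 1, so j equals i
k=10:-2:1;            % a negative increment counts down: [10 8 6 4 2]
e=5:1;                % empty, because 5 already exceeds 1
A=zeros(3,2);
[nr,nc]=size(A);      % nr is 3 (rows), nc is 2 (columns)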

There are a number of ways to create matrices. One is by enumeration

X=[1 5;2 1];
which defines X to be the 2x2 matrix
1 5
2 1
The ; indicates the end of a row (actually it is a concatenation operator that allows you to stack matrices; more on that below). Other ways to create matrices include

X=ones(m,n);
and

X=zeros(m,n);
which create mxn matrices with each element equal to 1 or 0, respectively. MATLAB also has several random number generators with a similar syntax.

X=rand(m,n);
creates an mxn matrix of independent random draws from a uniform distribution (actually they are pseudo-random).

X=randn(m,n);
draws from the standard normal distribution.
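
The concatenation role of ; mentioned above can be seen by stacking whole matrices rather than single numbers. In this sketch a space (or comma) joins matrices side by side and a ; stacks them vertically; the sizes are arbitrary choices.
X=[1 5;2 1];
Y=ones(1,2);
V=[X;Y];              % vertical stack: 3x2 matrix [1 5;2 1;1 1]
H=[X zeros(2,1)];     % horizontal join: 2x3 matrix [1 5 0;2 1 0]
U=rand(4,3);          % uniform draws, all strictly between 0 and 1
N=randn(4,3);         % standard normal draws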

Individual elements of a matrix whose size has already been defined can be accessed using (); for example, if you have defined the 3x2 matrix B, you can set element 1,2 equal to cos(2.5) with the statement

B(1,2)=cos(2.5);
If you then want to set element 2,1 to the same value, use

B(2,1)=B(1,2);
A whole column or row of a matrix can be referenced as well in the following way

B(:,1);
refers to column 1 of the matrix B and

B(3,:);
refers to its third row. The : is an operator that selects all of the elements in the row or column. An equivalent expression is

B(3,1:end);
where end indicates the last column in the matrix.

You can also pick and choose the elements you want, e.g.,

C=B([1 3],2);
results in a new 2x1 matrix equal to
B(1,2)
B(3,2)
Also the construction

B(1:3,2);
is used to refer to rows 1 through 3 and column 2 of the matrix B. The ability to access parts of a matrix is very useful but also can cause problems. One of the most common programming errors is attempting to access elements of a matrix that don’t exist; this will cause an error message.

While on the subject of indexing elements of a matrix, you should know that MATLAB actually has two different ways of indexing. One is to use the row and column indices, as above; the other is to use the location in the vectorized matrix. When you vectorize a matrix you stack its columns on top of each other. So a 3x2 matrix becomes a 6x1 vector composed of a stack of two 3x1 vectors. Element 1,2 of the matrix is element 4 of the vectorized matrix. If you want to create a vectorized matrix the command

X(:)
will do the trick.
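
The two ways of indexing can be compared directly. In this sketch the entries of B are made-up values that encode their own row and column, so it is easy to see which element is picked; sub2ind is the built-in function that converts row and column indices to a position in the vectorized matrix.
B=[11 12;21 22;31 32];    % 3x2 matrix
a=B(1,2);                 % row and column indexing, 12
b=B(4);                   % same element by its position in the vectorized matrix
v=B(:);                   % 6x1 vector [11;21;31;12;22;32]
k=sub2ind(size(B),1,2);   % position of element 1,2, which is 4
C=B([1 3],2);             % rows 1 and 3 of column 2, [12;32]
r=B(3,1:end);             % third row, [31 32]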

MATLAB has a powerful set of graphics routines that enable you to visualize your data and models. For starters, it will suffice to note the routines plot, mesh and contour. For plotting in two dimensions, use plot(x,y). Passing a string as a third argument gives you control over the color of the plot and the type of line or symbol used. mesh(x,y,z) provides plots of a 3-D surface, whereas contour(x,y,z) projects a 3-D surface onto two dimensions. It is easy to add titles, labels and text to the plots using title, xlabel, ylabel and text. Subscripts, superscripts and Greek letters can be obtained using TeX commands (e.g., x_t, x^2 and \alpha\mu). To gain mastery over graphics takes some time; the documentation Using MATLAB Graphics available with MATLAB is as good a place as any to learn more.
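
A first plotting session might look like the following sketch. The functions being plotted and the grid are arbitrary choices; linspace and meshgrid, which are not discussed above, are built-in helpers used here to build the grids, and figure opens a new plot window.
x=linspace(0,2*pi,100);             % 100 evenly spaced points
y=sin(x);
plot(x,y,'r--');                    % red dashed line
title('A sine curve');
xlabel('x');
ylabel('y_t = sin(x)');             % _t is shown as a subscript
[X,Y]=meshgrid(-2:0.1:2,-2:0.1:2);  % grid of (x,y) pairs
Z=X.*exp(-X.^2-Y.^2);               % surface height at each grid point
figure; mesh(X,Y,Z);                % 3-D surface plot
figure; contour(X,Y,Z);             % contour lines of the same surface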

You may have noticed that statements sometimes end with ; (semi-colon) and sometimes they don’t. MATLAB is an interactive environment, meaning it interacts with you as it runs jobs. It communicates things to you via your display terminal. Any time MATLAB executes an assignment statement, meaning that it assigns new values to variables, it will display the variable on the screen UNLESS the assignment statement ends with a semi-colon. It will also tell you the name of the variable, so the command
x=2+4
will display
x =
6
on your screen, whereas the command
x=2+4;
displays nothing. If you ask MATLAB to make some computation but do not assign the result to a variable, MATLAB will assign it to an implicit variable called ans (short for "answer"). Thus the command
2+4
will display
ans =
6

CONDITIONAL STATEMENTS AND LOOPING
As with any programming language, MATLAB can evaluate boolean expressions such as
A>B, A>=B, A<B, A<=B and A~=B (the last one is not equal; ~ is MATLAB’s negation operator). Also ~(A>B), ~(A<B), etc., can be used. These need to be used with a bit of care when A and B are not scalars, however. A>B creates a matrix of zeros and ones equal in size to A and B. Checking whether any element of A is bigger than the corresponding element of B is the same as checking whether any of the elements of the matrix A>B are non-zero. MATLAB provides the functions any and all to evaluate matrices resulting from boolean expressions. As with many MATLAB functions, any and all operate on columns and return a row vector with the same number of columns as the original matrix. This is true for the sum and prod functions as well. The following are equivalent expressions

any(A>B)
and

sum(A>B)>0
The following are also equivalent:

all(A>B)
and

prod(A>B)>0
When A and B have more than one row, all of these expressions are row vectors:

size(all(A>B))
is equal to

[1 max(size(A,2),size(B,2))]
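
The behavior of any and all is easiest to see on a small case. In this sketch A and B are made-up 2x2 matrices; the last two lines use (:) to collapse a matrix into a single column so that any and all return one overall answer.
A=[1 5;7 2];
B=[3 3;3 3];
G=A>B;                    % [0 1;1 0]
anyCol=any(G);            % works column by column: [1 1]
allCol=all(G);            % works column by column: [0 0]
same1=isequal(anyCol,sum(G)>0);     % true
same2=isequal(allCol,prod(G)>0);    % true
anyAtAll=any(G(:));       % true: some element of A exceeds its counterpart in B
allOfThem=all(G(:));      % false: not every element does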

Boolean expressions are mainly used to handle conditional execution of code using one of the following:

if expression
...

end

if expression
...

else
...

end
and

while expression
...

end
The first two of these are single conditionals, for example

if X>0, A=1/X; else A=0; end
You should also be aware of the switch command (type help switch).
The last is for looping. Usually you use while for looping when you don’t know how many times the loop is to be executed and use a for loop when you know how many times it will be executed. To loop through a procedure n times, for example, one could use the following code (assuming X(1) has already been defined):

for i=1:n, X(i+1)=3*X(i)+1; end
A common use of while for our purposes will be to iterate until some convergence criterion is met, such as
P=0.7;            % target probability; must lie between normcdf(0) and normcdf(1)
X=0.5;
DX=0.5;
while DX>1E-7
  DX=DX/2;
  if normcdf(X)>P, X=X-DX; else X=X+DX; end
  disp(X)
end
(can you figure out what this code does?). One thing in this code fragment that has not yet been explained is disp(X). This will write the matrix X to the screen.
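
The switch command mentioned above selects one of several blocks of code according to the value of a variable. A minimal sketch follows; the variable name method and the case labels are made up for illustration.
method='quadratic';
x=3;
switch method
  case 'linear'
    y=2*x;
  case {'quadratic','square'}     % a cell array lists several matching values
    y=x^2;
  otherwise
    error('Unknown method');
end
disp(y)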

SCRIPTS AND FUNCTIONS
When you work in MATLAB you are working in an interactive environment that stores the variables you have defined and allows you to manipulate them throughout a session. You do have the ability to save groups of commands in files that can be executed many times. Actually MATLAB has two kinds of command files, called M-files. The first is a script M-file. If you save a bunch of commands in a script file called MYFILE.m and then type the word MYFILE at the MATLAB command line, the commands in that file will be executed just as if you had run them each from the MATLAB command prompt (assuming MATLAB can find where you saved the file). A good way to work with MATLAB is to use it interactively, and then edit your session and save the edited commands to a script file. You can save the session either by cutting and pasting or by turning on the diary feature (use the on-line help to see how this works by typing help diary).

The second type of M-file is the function file. One of the most important aspects of MATLAB is the ability to write your own functions, which can then be used and reused just like intrinsic MATLAB functions. A function file is a file with an m extension (e.g., MYFUNC.m) that begins with the word function. For example:
function Z=DiagReplace(X,v)
% DiagReplace Replace the diagonal elements of a matrix X with a vector v
% SYNTAX:
% Z=DiagReplace(X,v);
n=size(X,1);
Z=X;
ind=(1:n:n*n) + (0:n-1);
Z(ind)=v;

You can see how this function works by typing the following code at the MATLAB command line:
m=3; x=randn(m,m); v=rand(m,1); x,v,xv=DiagReplace(x,v)
Any variables that are defined by the function that are not returned by the function are lost after the function has finished executing (n and ind in the example). Here is another example:
function x = rndint(k,m,n)
% RNDINT Returns an mxn matrix of random integers between 1 and k (inclusive).
% SYNTAX:
%   x= rndint(k,m,n);
% Can be used for sampling with replacement.
x=ceil(k*rand(m,n));

Documentation of functions (and scripts) is very important. In M-files a % denotes that the rest of the line is a comment. Comments should be used liberally to help you and others who might read your code understand what the code is intending to do. The top lines of code in a function file are especially important. It is here where you should describe what the function does, what its syntax is and what each of the input and output variables are. These top lines become an online help feature for your function. For example, typing help rndint at the MATLAB command line would display the four commented lines on your screen.

A note of caution on naming files is in order. It is very easy to get unexpected results if you give the same name to different functions, or if you give a name that is already used by MATLAB. Prior to saving a function that you write, it is useful to use the which command to see if the name is already in use.

MATLAB is very flexible about the number of arguments that are passed to and from a function. This is especially useful if a function has a set of predefined default values that usually provide good results. For example, suppose you write a function that iterates until a convergence criterion is met or a maximum number of iterations has been reached. One way to write such a function is to make the convergence criterion and the maximum number of iterations optional arguments. The following function uses Newton’s method to find the value of x such that x ln(x)=a, where a is a parameter.
function x=SolveIt(a,tol,MaxIters)
% SolveIt Finds x such that x*log(x)=a using Newton's method
if nargin<3 || isempty(MaxIters), MaxIters=100; end
if nargin<2 || isempty(tol), tol=sqrt(eps); end
x=a;                          % starting value
for i=1:MaxIters
  lx=log(x);
  fx=x.*lx-a;                 % residual of x*log(x)=a
  x=x-fx./(lx+1);             % Newton step; lx+1 is the derivative of x*log(x)
  disp([x fx])
  if abs(fx)<tol, break; end
end
In this example, the command nargin means "number of input arguments" and the command isempty checks to see if a variable is passed but is empty (an empty variable is created by setting it to []). An analogous function for the number of output arguments is nargout; many times it is useful to put a statement like
if nargout<2, return; end
into your function so that the function does not have to do computations that are not requested.
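
The default-filling pattern and the iteration used in SolveIt can also be tried directly at the command line, without a function file. In this sketch an empty matrix stands in for an argument the caller left out, and the value a=2 is an arbitrary choice.
a=2;
tol=[];                       % plays the role of an omitted or empty argument
MaxIters=[];
if isempty(MaxIters), MaxIters=100; end
if isempty(tol), tol=sqrt(eps); end
x=a;                          % starting value
for i=1:MaxIters
  fx=x*log(x)-a;              % residual of x*log(x)=a
  if abs(fx)<tol, break; end
  x=x-fx/(log(x)+1);          % Newton step
end
disp([x fx])                  % the residual fx should be essentially zero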

It is possible that you want nothing or more than one thing returned from a procedure. For example
function [m,v]=MeanVar(X)
% MeanVar Computes the mean and variance of a data matrix
% SYNTAX
%     [m,v]=MeanVar(X);
n=size(X,1);
m=mean(X);
if nargout>1
temp=X-m(ones(n,1),:);
v=sum(temp.*temp)/(n-1);
end
To use this procedure call it with [mu,sig]=MeanVar(X). Notice that it only computes the variance if more than one output is desired. Thus, the statement mu=MeanVar(X) is correct and returns the mean without computing the variance.
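
One way to convince yourself that the computation in MeanVar is right is to repeat its steps on random data and compare with the built-in var function. This sketch does that at the command line; the data size is arbitrary. The expression m(ones(n,1),:) copies the row vector m into n identical rows so that it can be subtracted from X.
X=randn(20,3);                % 20 observations on 3 variables
n=size(X,1);
m=mean(X);                    % 1x3 row of column means
temp=X-m(ones(n,1),:);        % subtract each column's mean
v=sum(temp.*temp)/(n-1);      % 1x3 row of column variances
err=max(abs(v-var(X)));       % var also divides by n-1
disp(err)                     % should be at rounding-error level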

In the following example, the function can accept one or two arguments and checks how many outputs are requested. The function computes the covariance of two or more variables. It can handle both a bivariate case when passed two data vectors and a multivariate case when passed a single data matrix (treating columns as variables and rows as observations). Furthermore it returns not only the covariance but, if requested, the correlation matrix as well.
function [CovMat,CorrMat]=COVARIANCE(X,Y)
% COVARIANCE Computes covariances and correlations
n=size(X,1);
if nargin==2
X=[X Y]; % Concatenate X and Y
end
m=mean(X); % Compute the means
X=X-m(ones(n,1),:); % Subtract the means
CovMat=X'*X./n; % Compute the covariance
if nargout==2 % Compute the correlation, if requested
s=sqrt(diag(CovMat));
CorrMat=CovMat./(s*s');
end
This code executes in different ways depending on the number of input and output arguments used. If two matrices are passed in, they are concatenated before the covariance is computed, thereby allowing the frequently used bivariate case to be handled. The function also checks whether the caller has requested one or two outputs and only computes the correlation if 2 are requested. Although it would not be a mistake to just go ahead and compute the correlation, there is no point if it is not going to be used. Unless additional output arguments must be computed anyway, it is good practice to compute them only as needed. Some examples of calling this function are
c=COVARIANCE(randn(10,3));
[c1,c2]=COVARIANCE((1:10)',(2:2:20)');
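
The arithmetic inside COVARIANCE can be checked against the built-in functions cov and corrcoef. The sketch below repeats the steps at the command line on random data. Note that COVARIANCE divides by n rather than n-1, so the matching built-in call is cov(X,1); repmat is used here as an equivalent way of copying the row of means.
X=randn(10,3);
n=size(X,1);
Xc=X-repmat(mean(X),n,1);     % same result as X-m(ones(n,1),:)
CovMat=Xc'*Xc./n;             % divides by n, as COVARIANCE does
s=sqrt(diag(CovMat));         % standard deviations
CorrMat=CovMat./(s*s');       % correlation matrix
errCov=max(max(abs(CovMat-cov(X,1))));      % cov(X,1) also normalizes by n
errCorr=max(max(abs(CorrMat-corrcoef(X))));
disp([errCov errCorr])        % both should be at rounding-error level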

Good documentation is very important but it is also useful to include some error checking in your functions. This makes it much easier to track down the nature of problems when they arise. For example, if some arguments are required and/or their values must meet some specific criteria (they must be in a specified range or be integers) these things are easily checked. For example, consider the function DiagReplace listed above. This is intended for a square matrix (nxn) X and an n-vector v. Both inputs are needed and they must be conformable. The following code puts in error checks.
function Z=DiagReplace(X,v)
% DiagReplace Replace the diagonal elements of a matrix X with a vector v
% SYNTAX:
% Z=DiagReplace(X,v);
if nargin<2, error('2 inputs are required'); end
n=size(X,1);
if size(X,2)~=n, error('X is not square'); end
if prod(size(v))~=n, error('X and v are not conformable'); end
Z=X;
ind=(1:n:n*n) + (0:n-1);
Z(ind)=v;

The command error in a function file prints out a specified error message and returns the user to the MATLAB command line.

An important feature of MATLAB is the ability to pass a function to another function. For example, suppose that you want to find the value that maximizes a particular function, say f(x)=x*exp(-0.5x^2). It would be useful not to have to write the optimization code every time you need to solve a maximization problem. Instead, it would be better to have a solver that handles optimization problems for arbitrary functions and to pass the specific function of interest to the solver. For example, suppose we save the following code as a MATLAB function file called MYFUNC.m
function fx=myfunc(x)
fx=x.*exp(-0.5*x.^2);
Furthermore suppose we have another function called MAXIMIZE.m which has the following calling syntax
function x=MAXIMIZE(f,x0)
The two arguments are the name of the function to be maximized and a starting value where the function will begin its search (this is the way many optimization routines work). One could then call MAXIMIZE using
x=MAXIMIZE('myfunc',0)
and, if the MAXIMIZE function knows what it’s doing, it will return the value 1. Notice that the word myfunc is enclosed in single quotes. It is the name of the function, as a string variable, that is passed in. The function MAXIMIZE can evaluate MYFUNC using the feval command. For example, the code
fx=feval(f,x)
is used to evaluate the function. It is important to understand that the first argument to feval is a string variable (you may also want to find out about the command eval, but this is only a primer, not a manual).
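
Besides a string holding the function name, feval also accepts a function handle, created with @. The sketch below uses an anonymous function so that no separate file is needed. Because MAXIMIZE is only a hypothetical routine, the built-in minimizer fminsearch stands in for it here, applied to the negative of the function; the starting value 0.5 is an arbitrary choice.
f=@(x) x.*exp(-0.5*x.^2);           % anonymous function, same formula as myfunc
fx=feval(f,1);                      % evaluates f at 1, equal to exp(-0.5)
fx2=f(1);                           % a handle can also be called directly
xmax=fminsearch(@(x) -f(x),0.5);    % minimizing -f maximizes f
disp(xmax)                          % should be close to 1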

It is often the case that functions have auxiliary parameters. For example, suppose we changed MYFUNC to
function fx=myfunc(x,mu,sig)
fx=x.*exp(-0.5*((x-mu)./sig).^2);
Now there are two auxiliary parameters that are needed and MAXIMIZE needs to be altered to handle this situation. MAXIMIZE cannot know how many auxiliary parameters are needed, however, so MATLAB provides a special way to handle just this situation. Have the calling sequence be
function x=MAXIMIZE(f,x0,varargin)
and, to evaluate the function, use
fx=feval(f,x,varargin{:})
The command varargin (variable number of input arguments) is a special way that MATLAB has designed to handle variable numbers of input arguments. Although it can be used in a variety of ways, the simplest is shown here, where it simply passes all of the input arguments after the second on to feval. Don’t worry too much if this is confusing at first. Until you start writing code to perform general functions like MAXIMIZE you will probably not need to use this feature in your own code, but it is handy to have an idea of what it’s for when you are trying to read other people’s code.
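
Inside a function, varargin is a cell array holding the extra arguments, and varargin{:} expands it back into a comma-separated list. The sketch below imitates this at the command line: the cell array extra plays the role that varargin would play inside MAXIMIZE, and the values of x, mu and sig are arbitrary.
f=@(x,mu,sig) x.*exp(-0.5*((x-mu)./sig).^2);   % same formula as the revised myfunc
extra={1,2};                  % stands in for varargin: mu=1, sig=2
fx1=feval(f,3,extra{:});      % expands to feval(f,3,1,2)
fx2=f(3,1,2);                 % the same call written out in full
count=@(varargin) numel(varargin);   % counts however many arguments it is given
nargs=count(1,'a',[]);        % 3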

DEBUGGING
Bugs in your code are inevitable. Learning how to debug code is very important and will save you lots of time and aggravation. Debugging proceeds in three steps. The first ensures that your code is syntactically correct. When you attempt to execute some code, MATLAB first scans the code and reports an error the first time it finds a syntax error. These errors, known as compile errors, are generally quite easy to find and correct (once you know what the right syntax is). The second step involves finding errors that are generated as the code is executed, known as run-time errors. MATLAB has a built-in editor/debugger and it is the key to efficient debugging of run-time errors. If your code fails due to run-time errors, MATLAB reports the error and provides a trace of what was being done at the point where the error occurred. Often you will find that an error has occurred in a function that you didn’t write but was called by a function that was called by a function that was called by a function (etc.) that you did write. A safe first assumption is that the problem lies in your code and you need to check what your code was doing that led to the eventual error.

The first thing to do with run-time errors is to make sure that you are using the right syntax in calling whatever function you are calling. This means making sure you understand what that syntax is. Most errors of this type occur because you pass the wrong number of arguments, the arguments you pass are not of the proper dimension or the arguments you pass have inappropriate values. If the source of the problem is not obvious, it is often useful to use the debugger. To do this, click on File and then either Open or New from within MATLAB. Once in the editor, click on Debug, then on Stop if error. Now run your procedure again. When MATLAB encounters an error, it now enters a debugging mode that allows you to examine the values of the variables in the various functions that were executing at the time the error occurs. These can be accessed by selecting a function in the stack on the editor's toolbar. Then placing your cursor over the name of a variable in the code will allow you to see what that variable contains. You can also return to the MATLAB command line and type commands. These are executed using the variables in the currently selected workspace (the one selected in the Stack). Generally a little investigation will reveal the source of the problem (as in all things, it becomes easier with practice).

There is a third step in debugging. Just because your code runs without generating an error message, it is not necessarily correct. You should check the code to make sure it is doing what you expect. One way to do this is to test it on a problem with a known solution or a solution that can be computed by an alternative method. After you have convinced yourself that it is doing what you want it to, check your documentation and try to think up how it might cause errors with other problems, put in error checks as appropriate and then check it one more time. Then check it one more time.

A few last words of advice on writing code and debugging.
(1) Break your problem down into small chunks and debug each chunk separately. This usually means write lots of small function files (and document them).
(2) Try to make functions work regardless of the size of the parameters. For example, if you need to evaluate a polynomial function, write a function that accepts a vector of values and a coefficient vector. If you need such a function once it is likely you will need it again. Also if you change your problem by using a fifth order polynomial rather than a fourth order, you will not need to rewrite your evaluation function.
(3) Try to avoid hard-coding parameter values and dimensions into your code. Suppose you have a problem that involves an interest rate of 7%. Don’t put a lot of 0.07s into your code. Later on you will want to see what happens when the interest rate is 6% and you should be able to make this change in a single line with a nice comment attached to it, e.g.,
rho=0.07;               % the interest rate
(4) Avoid loops if possible. Loops are slow in MATLAB. It is often possible to do the same thing that a loop does with a vectorized command. Learn the available commands and use them.
(5) RTFM – internet lingo meaning Read The (F-word of choice) Manual.
(6) When you just can’t figure it out, check the MATLAB technical support site (MathWorks), the MATLAB discussion group (comp.soft-sys.matlab) and DejaNews for postings about your problem and, if that turns up nothing, post a question to the discussion group. Don’t overdo it, however; people who abuse these groups are quickly spotted and will have their questions ignored. Also don’t ask the group to solve your homework problems; you will get far more out of attempting them yourself than you’ll get out of having someone else tell you the answer. You are likely to be found out anyway and it is a form of cheating.
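
Points (2), (3) and (4) above can be illustrated with polynomial evaluation. In this sketch the coefficient vector c and the evaluation points x are made-up values; the built-in function polyval evaluates a polynomial of any order at a whole vector of points in one call, and the loop is shown only for comparison.
c=[1 -3 2];                   % coefficients of x^2-3x+2, highest power first
x=0:0.5:3;                    % points at which to evaluate
y=polyval(c,x);               % vectorized: [2 0.75 0 -0.25 0 0.75 2]
yloop=zeros(size(x));         % the same computation with a loop
for k=1:numel(x)
  yloop(k)=c(1)*x(k)^2+c(2)*x(k)+c(3);   % order hard-coded, so it breaks if c changes length
end
err=max(abs(y-yloop));        % zero up to rounding error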

OTHER DATA TYPES
So far we have only used variables that are scalars, vectors or matrices. MATLAB also recognizes multidimensional arrays. Element by element arithmetic works as usual on these arrays (including addition and subtraction, as well as boolean arithmetic). Matrix arithmetic is not clearly defined for multidimensional arrays and MATLAB has not attempted to define a standard. If you try to multiply two multidimensional arrays, you will generate an error message. Working with multi-dimensional arrays can get a bit tricky but is often the best way to handle certain kinds of problems. An alternative to multi-dimensional arrays is what MATLAB calls a cell array. A multidimensional array contains numbers for elements. A cell array is an array (possibly a multi-dimensional one) that has other data structures as elements. For example, you can define a 2x1 cell array that contains a 3x1 matrix in its first cell (i.e., as element (1,1)) and a 4x4 matrix in its second cell. Cell arrays are defined using curly brackets rather than square ones, e.g.,
x={[1;2],[1 2;3 4]};
creates a 1x2 cell array holding a 2x1 vector and a 2x2 matrix.
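
The contents of a cell are retrieved with curly brackets, while round brackets return a smaller cell array. The following sketch uses the cell array x defined above to show the difference.
x={[1;2],[1 2;3 4]};
sz=size(x);                   % [1 2]: one row of two cells
a=x{1};                       % contents of the first cell, the vector [1;2]
b=x{2}(2,1);                  % element 2,1 of the matrix in the second cell, 3
c=x(1);                       % round brackets give a 1x1 cell array, not a matrix
tf=iscell(c);                 % true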

Other data types available in MATLAB include string variables, structure variables and objects. A string variable is self-explanatory. Structure variables are variables that have named fields that can be referenced. For example, a structure variable, X, could have the fields DATE and PRICE. One could then refer to the data contained in these fields using X.DATE and X.PRICE. If the structure variable is itself an array, one could refer to fields of an element in the structure using X(1).DATE and X(1).PRICE.
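
A structure like the one just described can be built simply by assigning to its fields. In this sketch the dates and prices are invented values used only for illustration.
X.DATE='1998-12-28';          % assigning to a field creates the structure
X.PRICE=101.5;
X(2).DATE='1998-12-29';       % assigning to element 2 makes X a 1x2 structure array
X(2).PRICE=102.25;
p=[X.PRICE];                  % collects the field across elements: [101.5 102.25]
names=fieldnames(X);          % cell array of field names: {'DATE';'PRICE'}
d=X(1).DATE;                  % '1998-12-28'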

Object type variables are like structures but have methods attached to them. The fields of an object cannot be directly accessed but must be accessed using the methods associated with the object. Structures and objects are advanced topics that are not needed to get started using MATLAB. They are quite useful if you are trying to design user friendly functions for other users. It is also useful to understand objects when working with MATLAB's graphical capabilities, although, again, you can get pretty nice plots without delving into how objects work.

AN EXTENDED EXAMPLE
The following example is a bit more elaborate and is actually useful code. It constructs estimates of the mean and standard error of a statistic using a resampling (bootstrapping) method. It then demonstrates the use of the procedure with an illustration from a journal article. The code uses the technique of passing a function to another function. This allows the bootstrapping program to be reused for different statistics without having to alter any of its code. This was discussed in general terms in the section above on functions. Recall from that discussion that it is the name of a function file that is passed and then the command feval is used to evaluate the function. In this case the name of the function is passed to BOOT as its second argument, stat.

Here we have defined a function and saved it to a file named BOOT.m

% BOOT Computes bootstrap estimates
% Bootstrapped mean and standard error of a user specified statistic.
% SYNTAX
% [mean,stderr]=boot(data,stat,rep);
% Inputs:
%     DATA - a matrix containing the data used in computing
%        the statistic
%     STAT - a procedure with a single input (a data matrix) that
%        computes the desired statistic (can return a vector)
%    REP - the number of bootstrap replications performed
% Outputs:
%    MEAN - the simulated mean value of the statistic
%    STDERR - the bootstrap standard error of the statistic
function [mean,stderr]=boot(data,stat,rep);
ndata=size(data,1);
cum=0;
cum2=0;
i=0;
for i=1:rep
ind=rndint(ndata,ndata,1);
s=feval(stat,data(ind,:));
cum=cum+s;
cum2=cum2+s.^2;
end
mean=cum./rep;
stderr=sqrt((cum2-(cum.^2)/rep)/(rep-1));

Now define a function to compute the correlation coefficient for two variables stored in the columns of an nx2 matrix X. Save this file as CorrStat.m.

function rho=CorrStat(x);
% CorrStat Correlation coefficient
n=size(x,1);
m=mean(x);
x=x-m(ones(n,1),:);
V=x'*x;
rho=V(2,1)/sqrt(V(1,1)*V(2,2));

The following is a script file that can be executed by typing its file name (without extension) at the MATLAB command line. Notice how it passes the name CorrStat to the function BOOT.

% Example from
% Efron, B. and R. Tibshirani.
% "Bootstrap Methods for Standard Errors, Confidence Intervals
% and Other Measures of Statistical Accuracy."
% Statistical Science, 1(1986): 54-77.
% Law School Data: bootstrap estimates of the correlation
%     between LSAT and GPA scores.
% The sample correlation coefficient is 0.776.
% Normal distribution theory yields a standard error
%     for this statistic of 0.115.
% The authors reported a bootstrap estimate of 0.127.
lawdata=[
576 3.39
635 3.30
558 2.81
578 3.03
666 3.44
580 3.07
555 3.00
661 3.43
651 3.36
605 3.13
653 3.12
575 2.74
545 2.76
572 2.88
594 2.96];
[m,s]=boot(lawdata,'CorrStat',10);
disp('Bootstrap estimate of the standard error:')
disp(s);
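
For comparison, the normal-theory standard error quoted in the comments can be reproduced with a line of arithmetic. The sketch below assumes the usual large-sample formula (1-r^2)/sqrt(n-3) for the standard error of a correlation coefficient; that formula is an assumption made here, not something stated in the script above. The values r=0.776 and n=15 are the sample correlation and the number of rows in lawdata.
r=0.776;                      % sample correlation quoted in the comments
n=15;                         % number of observations in lawdata
se=(1-r^2)/sqrt(n-3);         % assumed normal-theory formula
disp(se)                      % comes out close to the quoted 0.115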

Copyright, 1998, Paul L. Fackler, North Carolina State University.
Last Modified: December 28, 1998