

3 Programming

Q3.1: Can I read a text file with a certain format?

Yes. If the file has nothing but numbers separated by whitespace, and has a constant number of columns through the entire file, you can just type load myfile.txt.

The function DLMREAD is more flexible and allows you to read files with fields delimited by any character.

The function TEXTREAD is more flexible still and allows you to skip lines at the beginning, ignore certain comment lines, read text as well as numbers, and more. Type help textread for more info.

If none of these suit your needs, you can always use the low-level file I/O functions FOPEN, FREAD, FSCANF, FGETL, FSEEK and FCLOSE to read the data however you would like.
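As a rough sketch of the two ends of that spectrum, the following writes a small comma-delimited file and reads it back, first with DLMREAD and then line by line with FGETL and SSCANF. The file contents and the variable names (fname, M, N, tline) are made up for illustration.

```matlab
% Write a small comma-delimited file to read back (hypothetical contents).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1,2,3\n4,5,6\n');
fclose(fid);

% High-level: DLMREAD with an explicit delimiter.
M = dlmread(fname, ',');

% Low-level: read one line at a time with FGETL and parse it with SSCANF.
fid = fopen(fname, 'r');
N = [];
tline = fgetl(fid);
while ischar(tline)                  % FGETL returns -1 (not a string) at end of file
  N = [N; sscanf(tline, '%f,')'];    % one row of numbers per line
  tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

The low-level loop is more typing, but it is the place to add your own handling for headers, comments, or mixed text and numbers.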

Q3.2: Why are global variables bad?

us writes in msgid:<90mer5$1u8$1@nnrp1.deja.com> (lightly edited):

I'll always fondly remember the outburst of the notorious and seasoned user Lars Gregersen to a recent poster (that most likely started this question) and will give it on to my students:

The short answer is: Don't use global variables.

The longer answer is: Don't use global variables unless you absolutely have to.

Now,

1) Using globals is perfectly ok (that's why they are here in the first place), just like for's and while's and other intelligible 2nd generation computer language constructs (that MATLAB is built on).

2) Using globals is a blasphemy from a programmer's point of view because it shows that she/he didn't think ahead but was rather sloppy... and just kept adding on those never-ending "important set up" parameters that she/he needed to use by all the axes - or whatever - of a one single project ...

3) Using globals is a problem in terms of book-keeping if(f) you end up having zillions of them hovering around in your workspace, and start having problems because one <my_par> from func1 is mixed up with <my_par> from func2 ... and <whos global> shows you 20 pages worth of variables ...

4) Using globals won't work in certain MATLAB contexts, such as GUI callbacks.

[Respectfully edited]

Hence, eventually collecting your globals into a struct (i.e., cleaning up your toolbox) and passing it on as a parameter of your functions is the most economic way to program in MATLAB ... and it shows your intelligence, to boot.
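A minimal sketch of that advice, assuming two made-up set-up parameters (gain and offset) and a hypothetical function handle named scale standing in for one of your own functions:

```matlab
% Collect the "important set up" parameters in one struct instead of globals.
params.gain   = 2;
params.offset = 1;

% Any function that needs them takes the struct as an ordinary argument.
scale = @(x, p) p.gain * x + p.offset;

y = scale([1 2 3], params);
```

Adding a new parameter later means adding a field to params, not another global declaration in every function.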

Q3.3: What about this logical array business?

From the Getting Started book:

The logical vectors created from logical and relational operations can be used to reference subarrays. Suppose X is an ordinary matrix and L is a matrix of the same size that is the result of some logical operation. Then X(L) specifies the elements of X where the elements of L are nonzero.

To remove the logical flag from an array, don't add 0 as suggested by help logical in MATLAB 5.x. This actually adds zero to each element of the matrix. Instead, use a + in front, as in +x. This is a solution given in help logical in MATLAB 6.
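A small sketch of both points, using a made-up matrix X:

```matlab
X = [1 5; 8 2];

% A relational operation produces a logical array of the same size as X.
L = X > 3;

% Logical indexing picks out the elements of X where L is true,
% returned as a column in column-major order.
big = X(L);

% Unary plus returns an ordinary double array of zeros and ones.
d = +L;
```

Here islogical(L) is true and islogical(d) is false, even though both arrays hold the same zeros and ones.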

Q3.4: Huge memory waste using array of structs?

The following example was posted to the newsgroup:

I've discovered to my horror that structs take up an obscene amount of overhead (I'm running version 5.3.1.29215a (R11.1) on a Dec ALPHA). I have a set of 10,242 observations, each consisting of 3+13=16 fields, which have 3*27 + 1*13 = 94 values. So the total size in bytes should be 10,242 * 94 * 8 bytes/double = 7,701,984.

I have this stored in a 1 x 10242 data structure, and when I issue the whos command, it tells me that the data now takes up 27,367,136 bytes!

Cris Luengo answers:

My guess would be that a structure contains MATLAB arrays. Each array has some overhead, like data type, array sizes, etc. In your second implementation (index using data.latitude(observation)), there are 10,242 times fewer arrays allocated. Note that in your data, for each observation, you have 13 arrays with one value. I don't know how large the matrix header exactly is, but it is a waste putting only a single value in it!

I think Cris has hit it exactly. Every MATLAB matrix has an overhead of ~100 bytes, even matrices with a single element. In this example, there are 16 fields * 10,242 elements = 163,872 matrices. Each one of these matrices adds an additional 100 bytes, for 16.4 Mbytes in pure overhead. This still comes up a little short of the amount reported, but it is fairly close.

It is much more efficient, both for storage and computation, to use a struct of arrays rather than an array of structs.
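One way to see the difference on your own system is to build both layouts and compare what whos reports. This is only a sketch: the field name latitude is borrowed from the post above, n is arbitrary, and the exact byte counts depend on the MATLAB version.

```matlab
n = 1000;

% Array of structs: n elements, each holding its own 1-by-1 matrix.
aos = struct('latitude', num2cell(zeros(1, n)));

% Struct of arrays: one element holding a single 1-by-n matrix.
soa.latitude = zeros(1, n);

a = whos('aos');
s = whos('soa');
overhead = a.bytes - s.bytes;   % extra bytes paid for the per-matrix headers
```

With the struct of arrays, data.latitude(observation) replaces data(observation).latitude in the rest of the code.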

Q3.5: Why is my MEX file crashing?

Memory errors are one likely reason. Greg Wolodkin suggests the debug memory manager:

The platform-independent way to use the debug memory manager is to set the environment variable MATLAB_MEM_MGR to contain the string "debug".

On Windows:

  C:\> set MATLAB_MEM_MGR=debug
  C:\> matlab

On Unix with csh or tcsh:

  % setenv MATLAB_MEM_MGR debug
  % matlab

On Unix with sh or bash:

  $ MATLAB_MEM_MGR=debug matlab

The debug memory manager cannot catch your code the instant it writes out of bounds (tools like Purify can do this but the performance hit they induce is quite painful). What it will catch is that in general, when you write outside of one memory block you end up writing into another, corrupting it or (in the case of the debug memory manager) hopefully corrupting only a guard band. When you later free the memory, we can tell you that you walked off the end of the block and corrupted the guard band.

Q3.6: How can I create variables A1, A2,...,A10 in a loop?

Don't do this. You will find that MATLAB arrays (either numeric or cell) will let you do the same thing in a much faster, much more readable way. For example, if A1 through A10 contain scalars, use:

A = zeros(1,10);        % Not necessary, just much faster
for i=1:10
  A(i) = i^2;           % replace i^2 with your own expression
end

Now refer to A(i) whenever you mean Ai. In case each Ai contains a vector or matrix, each with a different size, you want to use cell arrays, which are intended exactly for this:

for i=1:10
  A{i} = 1:i;
end

Note that each A{i} contains a different size matrix. And be careful to use the curly braces for the subscript!

Now, if you still really want to create variables with dynamically generated names, you need to use eval. With eval, you use MATLAB commands to generate the string that will perform the operation you intend. For example, eval('A=10') has the same effect as A=10, and eval(['A' 'B' '=10']) has the same effect as AB=10, only the eval method executes much more slowly. So in a loop, you could use:

for i=1:10
  eval(sprintf('A%d = 1:i;', i));
end

Notice how much more obfuscated this is. Repeat: don't do this unless you have a very good reason (for example, someone gives you a MAT-file with 2000 variables with names like A1428).
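If you are handed such a MAT-file, one possible way out that avoids eval is to load the file into a struct and copy the numbered variables into a cell array. This is a sketch that assumes a MATLAB version with dynamic field names; the file built here is only a three-variable stand-in for the real one.

```matlab
% Build a small MAT-file containing numbered variables (stand-in for the real file).
fname = [tempname '.mat'];
A1 = 1; A2 = 1:2; A3 = 1:3;
save(fname, 'A1', 'A2', 'A3');
clear A1 A2 A3

% Load into a struct: the variables become fields S.A1, S.A2, S.A3.
S = load(fname);

% Copy them into a cell array using dynamic field names.
A = cell(1, 3);
for i = 1:3
  A{i} = S.(sprintf('A%d', i));
end
delete(fname);
```

From here on the data lives in A{i}, and no variable names have to be generated as strings.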

Q3.7: Do boolean operators short-circuit?

In many programming languages, boolean operators like AND and OR will stop evaluating as soon as the result is known. For instance,

1 | error('Short-circuit')

would never get to the error part, since the 1 is always true.

MATLAB 6.5 and later include the short-circuiting logical operators || and &&. Use these for scalar condition tests in IF and WHILE statements and the like, and use the old | and & for element-by-element logical operations. You can find details at http://www.mathworks.com/access/helpdesk/help/base/relnotes/matlab/matlab134.shtml#53717

In older versions of MATLAB, the boolean operators | and & are only short-circuit evaluated inside the conditions of IF and WHILE statements. In all other contexts, all parts of the conditional are evaluated. For an interesting discussion on this topic, see

msgid:<twhf7jkimx.fsf@frenchslinux.dhcp>
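A short sketch of the distinction in MATLAB 6.5 or later, with made-up variable names:

```matlab
x = [];

% && stops as soon as the result is known: because ~isempty(x) is false,
% x(1) is never evaluated and no indexing error occurs.
ok = ~isempty(x) && x(1) > 0;

% & works element by element on whole arrays and always evaluates both sides.
mask = [1 0 1] & [1 1 0];
```

The guard-then-test pattern in the first statement is the usual reason to prefer && in conditions.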

Q3.8: How do I fix "Out of Memory" problems?

A frequent variant of this question is: "I have 512M of RAM, and 2G of swap space. Why can't I create this 200M matrix?"

Simple answers first: Remember that double precision floats take up 8 bytes. So a million-element vector takes up 8 Mbytes. Be sure you're estimating properly.

Many operations need to create duplicate matrices. For example, B=inv(A.') must create a temporary variable the same size as A to hold the transpose, and B is, again, the same size as A.
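To check an estimate, you can ask whos for the size of a variable you have already created. A minimal sketch, with an arbitrary vector v:

```matlab
n = 1e6;
v = zeros(n, 1);            % one million doubles

info = whos('v');
mbytes = info.bytes / 1e6;  % 8 bytes per element

% The same arithmetic before allocating: a 5000-by-5000 double matrix.
planned = 5000 * 5000 * 8;  % bytes
```

The planned matrix works out to 200 million bytes, and an expression such as inv(A.') on it needs room for the temporary and the result as well as the original.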

If you're sure your matrices are reasonably sized, then read all of TMW Tech Note 1106, a great reference: http://www.mathworks.com/support/tech-notes/1100/1106.shtml

Q3.9: How do I dynamically generate a filename for SAVE?

You're probably trying

fname = 'foobar';
save fname variable;

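This saves to a file literally named fname.mat, because in command form every argument is taken as a string. To use the value stored in fname, you need the "functional" form of save: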

fname = 'foobar';
save(fname, 'variable');

In fact, it is true in general that the following two lines are equivalent:

command str1 str2 str3
command('str1', 'str2', 'str3')

This allows one to replace any or all of the parameters with dynamically generated strings. This is also useful in commands like PRINT, LOAD, CLEAR, etc.
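For instance, a loop can build a different filename on each pass. This sketch writes to the temporary directory and uses made-up names (foobar01.mat and so on) purely for illustration:

```matlab
variable = magic(3);
fnames = cell(1, 3);

for k = 1:3
  % Generate foobar01.mat, foobar02.mat, ... in the temporary directory.
  fnames{k} = fullfile(tempdir, sprintf('foobar%02d.mat', k));
  save(fnames{k}, 'variable');      % functional form: fnames{k} is evaluated
end

% The functional form works the same way for LOAD and DELETE.
S = load(fnames{3});
for k = 1:3
  delete(fnames{k});
end
```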

Q3.10: What's the difference between M-files, Pcode, and MEX files?

Suggested by Joshua Stiff:

Q3.11: Can MATLAB pass by reference?

This is really two questions in one. One is "Can I modify a function's input argument?" This would save memory and simplify programming in some cases. The answer here is "NO". If you modify the input argument of a function, all you do is modify a copy of the argument local to the function. The only way to modify variables from a function is to return the result when finished, as in

bigstruct = addelement(bigstruct, 5);

The other question is: "Pass by value wastes memory and time, since copies of variables are made. How can I fix this?" Here, the answer is "Your assumption is flawed, you don't need to." MATLAB uses a scheme called "copy-on-write" to optimize this sort of thing. Basically, data is shared between variables whenever possible, and a true copy is made only when one of the variables is modified. So although MATLAB's calling convention appears to be pass-by-value, if you don't modify the input variables, the data is never copied.
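The value semantics can be seen with a small experiment. This sketch uses the standard SETFIELD function as a stand-in for a user-written function like addelement; the field names are invented.

```matlab
bigstruct.count = 0;
bigstruct.data  = zeros(1, 1000);

% The function works on its own copy; if the result is not assigned back,
% the caller's variable is unchanged.
setfield(bigstruct, 'count', 5);
before = bigstruct.count;                      % still 0

% Returning the result is the way to "modify" the variable.
bigstruct = setfield(bigstruct, 'count', 5);
after = bigstruct.count;                       % now 5

% Plain assignment behaves the same way: changing the copy leaves
% the original alone.
copy = bigstruct;
copy.count = 99;
original = bigstruct.count;                    % still 5
```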

Q3.12: How can I process a sequence of files?

If you can generate the filename using an incrementing counter, use code like this:

for k=1:20
  fname=sprintf('/path-name/m%d.dat',k);

  data=load(fname);
  % or
  data=imread(fname);
  % or
  fid=fopen(fname, 'rb');
  raw=fread(fid, inf, 'uint8');   % example arguments: whole file as bytes; adjust to your format
  fclose(fid);
end

If you want to process all the files in a directory instead, you can use dir:

d=dir('*.jpg');
for k=1:length(d)
  fname=d(k).name;
  % ...
end
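Note that the name field returned by dir does not include the directory, so when the files are not in the current directory the path has to be added back. A sketch under made-up assumptions (a temporary folder holding three small numeric files named m1.dat to m3.dat):

```matlab
% Create a temporary folder with three small numeric files (illustration only).
folder = tempname;
mkdir(folder);
for k = 1:3
  fid = fopen(fullfile(folder, sprintf('m%d.dat', k)), 'w');
  fprintf(fid, '%d\n', k);
  fclose(fid);
end

% List the files, then rebuild each full path before reading.
d = dir(fullfile(folder, '*.dat'));
total = 0;
for k = 1:length(d)
  fname = fullfile(folder, d(k).name);
  total = total + load(fname);     % each file holds a single number
end

rmdir(folder, 's');                % clean up the temporary folder
```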

