How to handle large datasets / memory-intensive computations in Stata.

Stata typically loads all data into memory (unlike SAS), and its ability to perform computations is constrained by the amount of physical memory it can access on your PC.  There are two scenarios where this might be a problem.  First, if you are loading a really huge data set, you may not be able to allocate enough memory to Stata to import the data.  Second, you may be working on a more modest data set but using a routine such as xtabond2, which demands huge amounts of memory.  The Stata website discusses the memory problem in general here.

The typical solution in either case is to raise Stata's allocation with the set memory command.  For example, "set memory 1000M" will allocate about 1GB to Stata.  However, on a computer running a 32 bit operating system, there is a software limit on how high you can set the memory.  In most cases, it is unlikely that you will be able to allocate more than about 1.4GB to Stata (i.e. set memory 1400M), regardless of how much physical RAM your computer has.  This constraint is due to the nature of a 32 bit operating system, and it applies to Windows as well as Linux.  Most desktop computers running Windows XP or Vista are running the 32 bit version, even though most desktops and laptops purchased in recent years are fully capable of running a 64 bit O/S.
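
To see where the 32 bit ceiling comes from, the arithmetic can be sketched in a few lines of shell.  A 32 bit pointer can distinguish 2^32 byte addresses, which is 4096MB in total, and the operating system keeps part of that range for itself, so the amount a single program can claim ends up well below 4GB.  The function name addr_space_mb is made up for this illustration, and the script assumes a shell with 64 bit integer arithmetic (true of bash and sh on any current 64 bit system).

```shell
#!/bin/sh
# Illustrative only: size of the address space reachable with pointers
# of a given width, in MB (1 MB = 1024 * 1024 bytes).
# addr_space_mb is a hypothetical helper name, not a Stata or OS command.
addr_space_mb() {
    bits=$1
    # 2^bits bytes divided by 2^20 bytes per MB = 2^(bits - 20) MB
    echo $(( 1 << (bits - 20) ))
}

echo "32 bit address space: $(addr_space_mb 32) MB"
echo "64 bit address space: $(addr_space_mb 64) MB"
```

The first line prints 4096 MB.  The second figure is so large that, on a 64 bit system, the practical limit becomes the RAM you have installed rather than the address space.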

The solution:
1. Add more physical RAM - 4, 8, 16 GB - whatever you can, AND,
2. Run a 64 bit version of Stata in a 64 bit operating system.

Item 1 is pure economics, but most machines can be upgraded to 4GB RAM for relatively little cost.  Item 2 is a bit more daunting.  You could replace your O/S with a 64 bit version, such as the 64 bit edition of Windows XP or Vista.  This could be expensive, could involve creating a dual boot scenario with two partitions, and may annoy your in-house IT folks!

There is a simpler solution: install Ubuntu Linux using Wubi.  Ubuntu is a free Linux O/S, and is one of the most popular and best supported versions.  You could install it on your computer in a dual boot setup, but this would involve creating a dedicated partition.  While this is not too hard, it can result in disaster in rare circumstances if you do something stupid like delete the wrong partition.  Wubi is an Ubuntu installer that installs Ubuntu from inside Windows, just like you would install any normal Windows program.  Then, when you reboot, a choice between Windows and Ubuntu pops up before the computer boots into Windows.  You select Ubuntu and you are now running Linux.  The version of Ubuntu that comes with Wubi is the 64 bit version.

Note that Ubuntu is not running on top of Windows.  When you boot into Ubuntu you are running a pure Linux setup.  The only difference is that the Ubuntu files are sitting in the Windows partition.  To remove Ubuntu, you just boot into Windows, go to Add/Remove Programs and remove Wubi, just as if you were removing any Windows software.

Finally, once you have Ubuntu up and running, you need to order your 64 bit version of Stata for Linux.  This will cost about $400 for the SE version if you are upgrading.  You can only install this in Linux, as it won't work in Windows.  Installation is simple if you follow the instructions, but you need to know a few basics of Linux before you embark on this, so it is worth figuring out how to run commands from the terminal and what a "superuser" is.  A simple intro to Ubuntu is here (well worth a read if you are a Linux newbie).  If you have questions you can often get them answered on the Ubuntu forum.
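
Before ordering the 64 bit Linux version of Stata, it is worth confirming from the terminal that the Ubuntu you booted into really is 64 bit, and seeing how much RAM it reports.  What follows is a minimal sketch, assuming a standard Ubuntu install where uname, getconf and /proc/meminfo are available.  The name is_64bit_arch is a helper invented for this example, and its list of architecture names is not exhaustive.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given machine name (as printed by
# `uname -m`) is a common 64 bit architecture. The list is not exhaustive.
is_64bit_arch() {
    case "$1" in
        x86_64|amd64|aarch64|ppc64|ppc64le) return 0 ;;
        *) return 1 ;;
    esac
}

arch=$(uname -m)
if is_64bit_arch "$arch"; then
    echo "Kernel architecture $arch: 64 bit"
else
    echo "Kernel architecture $arch: not recognised as 64 bit"
fi

# Word size of the installed userland (prints 32 or 64 on Linux).
echo "Userland word size: $(getconf LONG_BIT) bit"

# Total physical RAM as seen by Linux, if /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    grep MemTotal /proc/meminfo || echo "MemTotal not reported"
fi
```

If the architecture comes back as something like i686 and the word size as 32, you are in a 32 bit install and the 64 bit Stata will not run there.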

A couple of notes:
1. Wubi does it all - you don't need to create an Ubuntu disk or download anything - just run the Wubi installer, answer a couple of simple prompts and you are done.
2. Running 64 bit Stata in a 32 bit O/S won't work.
3. Windows 7 is rumored to ship with a 64 bit version as the default.  Either way, your next O/S must be 64 bit!
4. You must follow the install instructions for Stata very closely.  The "sudo su" command is important!
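
On the superuser point in note 4: "sudo su" opens a shell running as root (user id 0), which is what an installer generally needs when it writes to system directories.  The sketch below only shows how to check which kind of shell you are in before you start.  It does not reproduce the Stata installer's own steps, and describe_uid is a helper name made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: map a numeric user id to a label.
# On Linux the superuser (root) always has user id 0.
describe_uid() {
    if [ "$1" -eq 0 ]; then
        echo "superuser"
    else
        echo "ordinary user"
    fi
}

# `id -u` prints the numeric id of the user running this shell.
echo "This shell is running as: $(describe_uid "$(id -u)")"
```

Run it once from a normal terminal and once after "sudo su" to see the difference.  Type "exit" to drop back to your normal account once the install is finished.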

Best of luck and don't hesitate to email me if you have any comments on this.