Using Stata for Windows shell scripting
Thursday, 11 June 2009
Occasionally a need arises to move things between computers. You might, for example, receive about 20 gigabytes of data per week, use it to re-run your model so your parameter estimates stay up to date, and then need to make room for next week's batch.
For this sort of job you probably have a remote RAID system where data is compressed automatically as it arrives. But you need to get it there somehow, and there is an easy way to do so from within the very Stata do-file where your model runs. Here's one way:
// i have three sub-folders and two loose files in the folder "experiment"
// i'm going to xcopy the whole thing to the archive, then rmdir it
local data_from "D:\data\\"
local data_to "\\nairobi\archive\\"
local data_folder1 experiment
// assemble `from_here', `to_here' file paths
foreach where in from to {
    local `where'_here "`data_`where''`data_folder1'"
    di "path `where': ``where'_here'" // this is just for checking
}
// now move stuff
!xcopy "`from_here'" "`to_here'" /e /y /i
!rmdir "`from_here'" /s /q
// notes on command-line options:
// xcopy /e -- copy all sub-directories, including empty ones
// xcopy /i -- copy to a folder that does not yet exist
// xcopy /y -- overwrite existing files at destination without prompting (!)
// rmdir /s -- forced delete of non-empty directory, ask for confirmation first
// rmdir /s /q -- forced delete of non-empty directory, ask no questions (!)
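One caution: Stata does not see the exit status of a command run with !, so the rmdir runs even when the xcopy has failed. Below is a minimal sketch of a guard, not a complete safeguard. It rebuilds the same `from_here' and `to_here' macros, then uses Mata's direxists() to check that the destination folder exists before the source is deleted. The macro name `arrived' is my own, and the check only confirms that the folder is there, not that every file arrived intact.

```stata
// same set-up as above
local data_from "D:\data\\"
local data_to "\\nairobi\archive\\"
local data_folder1 experiment
foreach where in from to {
    local `where'_here "`data_`where''`data_folder1'"
}

!xcopy "`from_here'" "`to_here'" /e /y /i

// Stata cannot read xcopy's exit code, so ask Mata whether the destination exists
// (`arrived' is a hypothetical helper macro: 1 if the folder exists, 0 otherwise)
mata: st_local("arrived", strofreal(direxists(st_local("to_here"))))

if `arrived' {
    !rmdir "`from_here'" /s /q
}
else {
    di as error "destination `to_here' not found; source folder left in place"
}
```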
Some details: from a Windows client I am moving data to a Samba server named Nairobi. The Windows shell command "move" works for loose files but not for entire folder structures. Those need to be copied with "xcopy" to the destination computer and then deleted with "rmdir" from the source computer. Both of these commands have some options, as detailed above. I googled a bit, found them here, and figured I'd jot them down inside the Stata do-file for future reference.
There you have it. You can embed this into a loop, say, if you have more than one folder to move, maybe spread across more than one `data_from' file path.
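A sketch of such a loop, assuming all folders sit under the same `data_from' path. The folder names are made-up placeholders, and names containing spaces would need extra quoting in the list.

```stata
// folder names below are hypothetical placeholders
local data_from "D:\data\\"
local data_to "\\nairobi\archive\\"
local folders experiment1 experiment2 experiment3

foreach f of local folders {
    // assemble source and destination paths for this folder
    local from_here "`data_from'`f'"
    local to_here "`data_to'`f'"
    di "moving `from_here' to `to_here'" // this is just for checking

    !xcopy "`from_here'" "`to_here'" /e /y /i
    !rmdir "`from_here'" /s /q
}
```

To cover several source locations, the same loop could be nested inside a second foreach over a list of `data_from' paths.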
No. 1 — June 16th, 2009 at 9:04 pm
I use Stata do-files for Linux scripting (or Mac OS X via Terminal) in much the same way that you describe for Windows: just replace your !xcopy commands with !cp (adding -R to copy a whole folder) and change the !rmdir command to !rm (!rm -d for an empty folder, !rm -r for one with contents). Also, instead of copying the files and then deleting them, you could use the !mv command to simply 'move' the files.
One thing I've found is that as Stata and Stata users have expanded the file and directory tools, you can do most of this within Stata without 'shelling out' to the OS. The advantage of using the Stata commands rather than the OS commands is that your do-file becomes cross-platform compatible (except for the file paths, which I'll talk about later). Not everyone worries about this, but I work in a multi-platform environment, so I like to think about it for these types of set-ups. So I guess your posting gives me a reason to think this through a bit.
So, you could use the Stata commands -copy-, -erase-, -rmdir- directly from the do-file rather than the OS-specific commands after a "!".
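As a rough sketch of that substitution: both paths below are invented for illustration (in particular, the destination assumes the archive share is mounted somewhere on the local file system). Note that cp -R places the source folder inside the destination if the destination already exists.

```stata
// paths are hypothetical; adjust to your own machine and mount point
local from_here "/users/username/data/experiment"
local to_here "/Volumes/archive/experiment"

// copy the whole folder tree, then delete the source
!cp -R "`from_here'" "`to_here'"
!rm -r "`from_here'"

// or do both in one step instead:
// !mv "`from_here'" "`to_here'"
```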
The remaining issue for cross-platform compatibility of your script above would be the file paths. This makes me think that you could ask Stata to detect the OS (Windows or Mac) at the start of the do-file and then adjust accordingly. Here's an example using your code:
// FIRST, DETECT THE OS, THEN CHOOSE THE APPROPRIATE FILE PATHS
if "`c(os)'" == "MacOSX" {
    local data_from "/users/username/data/"
    local data_to "\\nairobi\archive\\" // NB: on a Mac this needs to be the path where the share is mounted, not a Windows UNC path
    local data_folder1 experiment
}
if "`c(os)'" == "Windows" {
    local data_from "D:\data\\"
    local data_to "\\nairobi\archive\\"
    local data_folder1 experiment
}
// assemble `from_here', `to_here' file paths
foreach where in from to {
    local `where'_here "`data_`where''`data_folder1'"
    /*
    ---> I think you could use Mata's pathjoin() function too, but I haven't tried it (?)
    */
    di "path `where': ``where'_here'" // this is just for checking
}
// now move stuff
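copy "`from_here'" "`to_here'" , replace // --> using the Stata command instead of the OS command; note that -copy- handles single files, not whole folders
erase "`from_here'" // --> -erase- deletes a file; use -rmdir- to remove a folder once it is empty
// Now it should work on both Mac OS X and Windows. On Linux, c(os) returns "Unix", so that would need its own branch.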
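Since Stata's -copy- and -erase- act on one file at a time, moving a folder with built-in commands means looping over its contents. Here is a minimal sketch, assuming made-up relative paths and a folder that holds only loose files. It uses the dir extended macro function to list the files and does not descend into sub-folders, which would need the same treatment one level down.

```stata
// paths are hypothetical; forward slashes work in Stata on every platform
local from_here "data/experiment"
local to_here "archive/experiment"

capture mkdir "`to_here'" // ignore the error if the folder already exists

// list the loose files in the source folder
local files : dir "`from_here'" files "*"

foreach f of local files {
    copy "`from_here'/`f'" "`to_here'/`f'", replace
    erase "`from_here'/`f'"
}

rmdir "`from_here'" // succeeds only once the folder is empty
```

If the copy stops with an error partway through, the do-file halts before the remaining files are erased, which is a small safety advantage over the shell version.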
No. 2 — June 17th, 2009 at 10:47 am
Thank you for a fine example of the use of `c(os)'. I've been experimenting with both Linux and FreeBSD with a view to switching away from the Windows desktop one of these days. The impetus for that came from the early reviews of Vista. But it's a tough call. My Windows XP Pro machine works fine: it has required by far the least tinkering. BitDefender keeps it safe and Diskeeper has eliminated hard drive fragmentation. That took care of the only two complaints I ever had about Windows. In addition, there are a few things that I must run under Windows right now: iTunes and Skype come to mind. But the Unix command line opened up a world of options and I'm drawn to it. So the compromise I cooked up after I wrote this post is this: instead of "shelling out" piecemeal with !, I made Stata write one .bat file and ran the whole thing all at once with winexec. That worked, so I decided that the next step would be to make Stata write a shell script that I could run via Cygwin somehow. I'll post it here when it's done. I do agree with your reasons to favor Stata alternatives over shell commands when they're available.
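For the curious, the .bat approach might look roughly like the sketch below. This is an illustration rather than the exact file I used: the batch file name is a placeholder, and launching it through cmd /c is an assumption about how winexec is best invoked here. One advantage is that inside the batch file the rmdir can be made conditional on xcopy's exit code.

```stata
// file name is a hypothetical placeholder; paths are assembled as in the post
local data_from "D:\data\\"
local data_to "\\nairobi\archive\\"
local data_folder1 experiment
foreach where in from to {
    local `where'_here "`data_`where''`data_folder1'"
}
local batfile "move_data.bat"

// write the two shell commands into one batch file
tempname fh
file open `fh' using "`batfile'", write text replace
file write `fh' `"xcopy "`from_here'" "`to_here'" /e /y /i"' _n
// delete the source only if xcopy reported success (exit code 0)
file write `fh' `"if not errorlevel 1 rmdir "`from_here'" /s /q"' _n
file close `fh'

// run the whole batch file at once
winexec cmd /c "`batfile'"
```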