Consider ado-files
A while ago I suggested a particular do-file architecture that seemed to work well for me at the time. The post is here.
That architecture still works fine, but an obvious improvement has suggested itself since I proposed it. I've been finding that some jobs that I had encapsulated in programs are so ubiquitous that I could just as well have saved them as ado-files. I had resisted doing that, because I tend to treat ado-files with a bit of reverence. Code, I think, is ado-file worthy if it is general enough and has proper online help.
Maybe I should ease up on that rule though. Maybe code that meets those requirements is actually package-worthy, ready to be distributed to the rest of the world, while the standards for ado-files should be more relaxed.
As long as your code does a job that's general within the scope of a given project, maybe it could go into an ado-file that is available to that project and not others. There is an easy way to do that. Typing adopath in the Stata command line will list all the places where Stata looks for the code behind any command you throw at it. Ado-files that you write go into the PERSONAL folder by default, but you can put them anywhere, as long as you tell Stata where to look for them.
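For a quick look at the search path, a few built-in commands can be run interactively. This is only a sketch of what to inspect; the directories printed will differ from one installation to the next.

```stata
* list every directory Stata searches for ado-files, in search order
adopath

* show the system directories (BASE, SITE, PLUS, PERSONAL, and so on)
sysdir list

* print only the PERSONAL directory, where your own ado-files go by default
display "`c(sysdir_personal)'"
```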
Time for an example. I am working on a project where some do-files are built automatically (this is seldom necessary, by the way). Stata has some internal system limits (type help limits to see them all) and one of them is the restriction that a program cannot have more than 3,500 lines. This is very seldom a binding restriction. In fact, it should never even be an issue. But let's say you're stuck with legacy ways of writing code and adding to existing do-files, and you might run into this limit.
In such cases it may be useful to have a routine for counting the lines in any do-file. Since that is done in the same way regardless of the content of the do-file, this job lends itself to being packaged into an ado-file. That ado-file might look like this:
// counts lines in an ascii file (.do, .txt, .csv, etc)
// one argument: file name with path as needed
// (use full path for safety; spaces are OK)
capture prog drop lineCount
program lineCount, rclass
    version 9.2
    local filename `0'
    tempname fh
    local linenum = 0
    file open `fh' using "`filename'", read
    file read `fh' line
    while r(eof)==0 {
        local linenum = `linenum' + 1
        file read `fh' line
    }
    file close `fh'
    return local count `linenum'
end
So now, in your current do-file, you can do things such as
foreach client in `clients' {
    local this_client_file "`my_file_path'`client'_data_cleaning.do"
    lineCount `this_client_file'
    display "lines in `client'_data_cleaning.do: " `r(count)'
}
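The loop above passes the path unquoted, which is what lineCount expects. As a hedged variation, the sketch below strips any double quotes from the argument, so quoted and unquoted paths both work, and checks that the file exists before opening it. The name lineCountQ and the scratch file are assumptions made for illustration; they are not part of the original routine.

```stata
* lineCountQ is a hypothetical name for this variant of lineCount
capture program drop lineCountQ
program lineCountQ, rclass
    version 9.2
    * remove any double quotes from the argument
    local filename : subinstr local 0 `"""' "", all
    * stop with an error if the file does not exist
    confirm file "`filename'"
    tempname fh
    local linenum = 0
    file open `fh' using "`filename'", read text
    file read `fh' line
    while r(eof) == 0 {
        local linenum = `linenum' + 1
        file read `fh' line
    }
    file close `fh'
    * returned as a scalar here, so it can be used directly as r(count)
    return scalar count = `linenum'
end

* sanity check: write a scratch file with three lines, then count them
tempfile demo
tempname out
file open `out' using "`demo'", write text replace
file write `out' "first line" _n "second line" _n "third line" _n
file close `out'

lineCountQ "`demo'"
display "lines counted: " r(count)
```

Running the routine on a file whose length you already know is a cheap check before trusting it inside a loop over many do-files.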
Now, under the old system, lineCount would have been a program defined in Section 3 of either the current do-file or another do-file called by this one. But if I'm going to use it all the time and it never changes, making it a stand-alone ado-file instead makes sense. There are two steps for that.
First, in the current project folder (let's say I defined it as the local `project_root') I set up a sub-folder called ado. Inside it, I set up a sub-folder called simply l, as in the first letter of lineCount. This may be overkill, but I like Stata's idea of grouping ado-files into sub-folders by first letter. It's cleaner. Then I save the program above into `project_root'/ado/l/ with the name lineCount.ado.
Second, I need to let Stata know where to look for it. That is as simple as adding
adopath + "`project_root'/ado"
at the top of my current do-file.
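To see the two steps work end to end without touching a real project, the sketch below builds a throwaway folder under Stata's temporary directory, writes a tiny ado-file into a letter sub-folder, and lets Stata find it. The folder name adodemo and the command hellodemo are hypothetical and exist only for this illustration.

```stata
* throwaway illustration: adodemo and hellodemo are hypothetical names
local project_root "`c(tmpdir)'/adodemo"
capture mkdir "`project_root'"
capture mkdir "`project_root'/ado"
capture mkdir "`project_root'/ado/h"

* write a three-line ado-file into the letter sub-folder ado/h/
tempname fh
file open `fh' using "`project_root'/ado/h/hellodemo.ado", write text replace
file write `fh' "program hellodemo, rclass" _n
file write `fh' "    return scalar answer = 42" _n
file write `fh' "end" _n
file close `fh'

* add the project's ado folder to the end of the search path
adopath + "`project_root'/ado"

* which reports the file Stata would run for this command
which hellodemo

* the first call loads the ado-file automatically
hellodemo
display r(answer)
```

If you edit an ado-file after Stata has already loaded it in the current session, typing discard makes Stata drop the loaded copy and read the file again on the next call.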
As time goes by and I find more jobs that could be handled this way, I can set aside their programs and save them as ado-files into sub-folders named after their first letters. Stata only needs you to point it in the right direction. If it doesn't find your command in the ado folder itself, it will look in the sub-folder named after the first letter of your command.
Ado-files will keep my do-files less cluttered and will make it easier to both recycle old code and debug new code.