Looking at your data: table vs. tabulate
Wednesday, 19 November 2008
Last night I got an e-mail from a reader, asking how to keep Stata from clipping long string values in two-way tables. This is his code:
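clear
set obs 10
* Two long (30-character) string values for the row variable
gen a = ""
replace a = "a123456789b123456789c123456789" if _n<5
replace a = "987654321x987654321y987654321z" if _n>=5
* Short values for the column variable: "value 0", "value 1", "value 2"
gen b = "value " + string(int(_n/4))
* Two-way table: the values of a are clipped
tab a b
* One-way table of a, for comparison
tab a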
You can see that the values of a are clipped in the two-way table. There's an easy way to fix that, but first, here's a little digression on how Stata does things.
tabulate is one of the commands that have been around since Stata was born. It is written in C and is part of the executable, unlike commands written in the interpreted ado language, such as table, whose source files are stored in the ado/base folder. When you type the first few letters of a command, Stata assumes that you want the built-in C command, so you can use abbreviations: "tab" for "tabulate", "di" for "display", etc. Stata won't presume to call "table" instead of "tabulate" when you type "tab". In this case, though, "table" is the one you want. Namely,
table a b, stubwidth(32)
Setting the stub width to something suitably large will prevent the clipping of a.
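If you want to check for yourself which of the two commands is built in and which is an ado-file, which reports where Stata finds a command. This is only a minimal sketch: it assumes a standard installation and that the variables a and b from the reader's example are still in memory, and the path and version line printed for table will differ between releases.

```stata
* tabulate is part of the executable, so which reports it as a built-in command
which tabulate
* table is an ado-file, so which prints the location of table.ado
which table
* "tab" is the abbreviation of tabulate, so these two lines do the same thing
tab a b
tabulate a b
```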
No. 1 — November 19th, 2008 at 12:27 pm
Here is a twist: it seems that you do not need stubwidth() unless you want to cut the length down.
So, the solution is to just use 'table' instead of 'tab'.
No. 2 — January 27th, 2010 at 12:47 pm
I have a similar problem: how do I prevent Stata from clipping long variable names in 'tab', 'summary', etc.?
No. 3 — January 27th, 2010 at 2:48 pm
For "tab", do as I suggest above: use table instead. For "summarize" you could use "tabstat" instead. See "help tabstat" for a description of the options. Either labelwidth(), varwidth() or longstub might work for you, depending on how you're trying to summarize the data.
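As a rough illustration of the tabstat route, here is a sketch using the auto dataset that ships with Stata. The variable name mileage_per_gal is made up for the example, and the range of values that varwidth() accepts may depend on your Stata version, so treat the 16 as an assumption and check "help tabstat".

```stata
sysuse auto, clear
* hypothetical long variable name (15 characters), for illustration only
rename mpg mileage_per_gal
* summarize abbreviates long variable names in its output
summarize mileage_per_gal price
* tabstat with one row per variable and a wider stub for the variable names
tabstat mileage_per_gal price, statistics(n mean sd min max) columns(statistics) varwidth(16)
```

With columns(statistics) the variables become rows, which is the layout where varwidth() has something to widen.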
No. 4 — January 27th, 2010 at 4:19 pm
OK, I'll try that. Thanks!