Data from Excel to Stata, take two
Wednesday, 12 November 2008
I don't like spreadsheets. I like data sets and matrices. But I get data in Excel workbooks all the time, and it's a pain in the neck. This will never go away altogether, but if my Stata-using colleagues could be persuaded to quit sending me spreadsheets, that would at least cut down on the volume. I've tinkered with all kinds of ways to make data extraction from spreadsheets into Stata easier. This is the latest:
clear
set mem 10m
set more off
set type double
local here "`c(pwd)'/"
// declare name and date of Excel file to read from
local xlsfile "revised PriceIncr20080815-TargetTempl.xls"
local filedate "20081110"
// list the worksheets of interest
local sheets ""Move these data" "And these too""
// Extract data from Excel to Stata. Leave this part alone.
local sheetno: list sizeof sheets
forvalues i=1/`sheetno' {
local k: word `i' of `sheets'
local my_path "`here'`xlsfile'"
local my_sheet "`k'"
local source "Excel Files;DBQ=`my_path'"
clear
odbc load, dsn("`source'") table("`my_sheet'$")
unab varz: _all
foreach z in `varz' {
local lower=lower("`z'")
if "`lower'"!="`z'" {
rename `z' `lower'
}
}
describe
compress
save "`here'`k'`filedate'.dta", replace
}
Notice the c(pwd) function. It returns the path to the working directory. Essentially, the do-file above translates the worksheets listed in the local `sheets' into Stata files saved in the same directory as the do-file is in. The original Excel file may be in the same directory, as in this example, or it may be somewhere else. The `filedate' local is optional. Also optional, of course, is the loop that turns variable names to lower case. I just like mine that way.
No. 1 — July 31st, 2009 at 7:10 am
Hi,
i have seen your routine (which) is great and i am trying to use it for an excel file of 1,357 KB size. It contains quite a few sheets and i think this is the reason why when i run the routine Stata gives me the following error message (after have worked successfully just on one sheet. )
[Microsoft][ODBC Excel Driver] Too many fields defined.
Do you know how to solve this problem?
Thank you for your help.
Regards
Enrico
No. 2 — July 31st, 2009 at 7:41 am
I've never seen that problem, but as it happens I'm in Washington, DC at the Stata conference. I'll ask around and get back to you. Somebody here must have run into this before.
No. 3 — August 2nd, 2009 at 4:34 pm
I never got around to asking anybody else about this. From your message, it looks like it's an ODBC driver error, not a Stata error, so I tried to replicate it.
I had no luck, but since it's an ODBC drive error, you know that the problem is either in finding the correct workbook, or in finding the correct worksheet. So here's what I would do:
Change the code below the line that reads "leave this part alone" as follows:
// first check that Stata found the workbook:
local my_path "`here'`xlsfile'"
capture confirm file "`my_path'"
if _rc==0 {
di "OK, my Excel file is in the right place."
}
else {
di "I am not looking in the right place."
}
// Next, check that I listed the correct tabs:
di "And these should be my tabs: "
forvalues i=1/`sheetno' {
local k: word `i' of `sheets'
di "Tab `i' should be called `k'. Is it?"
}
Running this code should shed some light on what's wrong. Please let me know how this goes.