<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Stata Things &#187; odbc load</title>
	<atom:link href="http://enoriver.net/index.php/tag/odbc-load/feed/" rel="self" type="application/rss+xml" />
	<link>http://enoriver.net</link>
	<description>computing for fun and profit</description>
	<lastBuildDate>Mon, 07 May 2012 13:43:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Building TOra with Oracle support on Jaunty</title>
		<link>http://enoriver.net/index.php/2009/10/29/building-tora-with-oracle-support-on-jaunty/</link>
		<comments>http://enoriver.net/index.php/2009/10/29/building-tora-with-oracle-support-on-jaunty/#comments</comments>
		<pubDate>Thu, 29 Oct 2009 16:55:00 +0000</pubDate>
		<dc:creator>Gabi Huiber</dc:creator>
				<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[odbc load]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://enoriver.net/?p=972</guid>
		<description><![CDATA[Today I had to install TOra with Oracle support. That was interesting. TOra is on the Synaptic list, but if you install it from there you only get PostgreSQL support. For anything else you need to build TOra from source. Instructions are here, courtesy of Brad Hudson. Follow them. In particular, do not skip installing [...]]]></description>
			<content:encoded><![CDATA[<p>Today I had to install <a href="http://torasql.com/">TOra</a> with Oracle support. That was interesting. TOra is on the Synaptic list, but if you install it from there you only get PostgreSQL support. For anything else you need to build TOra from source. Instructions are <a href="http://www.pythian.com/news/3869/installing-tora-with-oracle-support-on-ubuntu-904-jaunty-jackalope">here</a>, courtesy of Brad Hudson.  Follow them. In particular, do not skip installing the prerequisite packages on his list. Use his code to get them all in one step.</p>
<p>You also need to download the Oracle Instant Client. Download all three rpm files -- basiclite, devel and sqlplus -- not just the first. You do need all three. Finally, the <code>./configure [...]</code> line in <code>debian/rules</code> took some fiddling. I read <a href="http://setdosa.blogspot.com/2008/02/installing-database-client-tora-in.html">this</a> post by Arun Mallikarjunan and, emboldened, I tried a few variations of my own. This is what worked:</p>
<pre><code>
./configure --prefix=/usr --with-oracle=/usr/lib/oracle/11.2/client
--with-instantclient --without-rpath --disable-new-check --without-kde
--with-oracle-includes=/usr/include/oracle/11.2/client
--with-oracle-libraries=/usr/lib/oracle/11.2/client/lib --enable-libsuffix=
</code></pre>
<p>The above is all on one long line. Incidentally, Arun commented <a href="http://www.pythian.com/news/1419/installing-tora-with-oracle-support-on-ubuntu-804lts-hardy-heron#comment-329233">here</a> that in addition to the prerequisites Brad listed, you need a few others. Brad disagreed <a href="http://www.pythian.com/news/1419/installing-tora-with-oracle-support-on-ubuntu-804lts-hardy-heron#comment-330809">here</a>. He is right that the extra packages on Arun's list are not required. But if you install them, you also get MySQL and ODBC support enabled, so I did. One time last year I had to <code>odbc load</code> some data from a MySQL database into Stata (I found help for that <a href="http://www.ats.ucla.edu/stat/Stata/faq/odbc.htm">here</a>) and I figured the need might come up again. Finally, I cleaned up after myself with help from <a href="http://maketecheasier.com/8-ways-to-maintain-a-clean-lean-ubuntu-machine/2008/10/07">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://enoriver.net/index.php/2009/10/29/building-tora-with-oracle-support-on-jaunty/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Data from MS Access to Stata</title>
		<link>http://enoriver.net/index.php/2008/11/21/data-from-ms-access-to-stata/</link>
		<comments>http://enoriver.net/index.php/2008/11/21/data-from-ms-access-to-stata/#comments</comments>
		<pubDate>Fri, 21 Nov 2008 06:38:15 +0000</pubDate>
		<dc:creator>Gabi Huiber</dc:creator>
				<category><![CDATA[Stata]]></category>
		<category><![CDATA[global macros]]></category>
		<category><![CDATA[macros]]></category>
		<category><![CDATA[MS Access]]></category>
		<category><![CDATA[odbc load]]></category>

		<guid isPermaLink="false">http://enoriver.net/?p=305</guid>
		<description><![CDATA[In my oldest post about the odbc load command I mentioned that you can pretty easily read data from MS Access into Stata, and even execute SQL commands from within Stata, but I dismissed the whole thing a bit too lightly. A recent look on Google Analytics showed that some people came here looking for [...]]]></description>
			<content:encoded><![CDATA[<p>In my oldest post about the <code>odbc load</code> command I mentioned that you can pretty easily read data from MS Access into Stata, and even execute SQL commands from within Stata, but I dismissed the whole thing a bit too lightly. A recent look on Google Analytics showed that some people came here looking for that sort of information and, judging by how fast they left, they didn't find what they were looking for. So here's another try.</p>
<p>Some context first: one of my ongoing projects is a suite of Stata do-files for helping certain daily newspapers optimize their subscriber retention, using survival analysis. I have several clients. Some of their needs are common, so they can be met with one do-file, others are not so common, so they need custom code. I use global macros as bridges between custom and common code. You need this introduction to see what I'm doing in my example below.</p>
<p>One client is sending me a weekly snapshot of its subscriber database in the MS Access format, in a file called Submaster_YYYYMMDD.mdb. Suppose that that file resides in c:/data/my access file comes from here/. Then I must read it into Stata and save it as subsYYYYMMDD.dta in c:/data/my stata file goes here/. Now suppose that I spread this project across several do-files, so I use some internal names for these file paths in the form of global macros defined once:</p>
<p><code><br />
global datafrom "c:/data/my access file comes from here/"<br />
global datato    "c:/data/my stata file goes here/"<br />
</code></p>
<p>Also suppose that I save the date of arrival part of the file name, YYYYMMDD, as a global macro:</p>
<p><code><br />
global t "YYYYMMDD"<br />
</code></p>
<p>Finally, I don't want my lines to be too long, so I use local macros to compress things a bit whenever that looks practical. Now let's have a look at the actual example:</p>
<p><code><br />
capture confirm file "${datato}subs${t}.dta"<br />
if _rc!=0 {<br />
</code><code><br />
   local mdbpath  "${datafrom}"<br />
   local mdbname "Submaster"<br />
   local source   "MS Access Database;DBQ=`mdbpath'`mdbname'_${t}.mdb"<br />
   capture confirm file "`mdbpath'`mdbname'_${t}.mdb"<br />
</code><code><br />
   if _rc==0 {<br />
      #delimit ;<br />
      local vars "<br />
      ${${paper}index},<br />
      prod_code,<br />
      start_orig,<br />
      last_start_date,<br />
      last_stop_date,<br />
      expire_date,<br />
      rate_table,<br />
      paper_code,<br />
      rate,<br />
      credit_card_usage";<br />
</code><code><br />
      drop _all;<br />
      odbc load,<br />
      exec("SELECT `vars' FROM `table' WHERE (prod_code='1D' OR prod_code='3D')")<br />
      dsn("`source'");<br />
      save "${datato}subs${t}", replace;<br />
      #delimit cr<br />
   }<br />
}<br />
</code></p>
<p>First, I check if the Stata file subsYYYYMMDD.dta exists in the path ${datato} with "capture confirm file". If it does not, then I check if the Access file Submaster_YYYYMMDD.mdb exists in the path ${datafrom}. If yes, then I read it and save subsYYYYMMDD.dta. The <code>exec()</code> part of <code>odbc load</code> is for running SQL syntax. Just to make sure that you appreciate the coolness of this: any SQL line that you could execute in Access, no matter how complicated, can be put between the <code>exec()</code> parentheses. Mine is simple -- I wrote the SQL commands in all caps for clarity -- but yours needn't be. You can also combine Stata locals with SQL syntax like I did.</p>
<p>I switch to the semicolon delimiter when it helps readability. It lets the line with SQL syntax wrap around, and it lets me list the contents of my `vars' local one name per line. This, of course, is optional. The checks for the existence of either file are also optional. They keep my do-files from exiting with an error if I happen to run them on a day when either file does not exist -- either because the source file Submaster has not arrived yet, or because it hasn't been translated yet into Stata.</p>
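<p>The example takes three macros for granted: the global <code>${paper}</code>, the global that <code>${${paper}index}</code> resolves to, and the local <code>`table'</code>. None of them is defined in the snippet, so the SQL line will not run until they are set somewhere earlier. Since <code>`table'</code> is a local, it has to be defined in the same do-file as the <code>odbc load</code> call. A minimal sketch of what that setup might look like follows; the names abc, subscriber_id and Subscribers are made up for illustration and do not come from any real database:</p>
<p><code><br />
// hypothetical values, for illustration only<br />
global paper    "abc"<br />
global abcindex "subscriber_id"<br />
local  table    "Subscribers"<br />
</code><code><br />
// the inner macro should expand first, so ${${paper}index} becomes ${abcindex}<br />
display "${${paper}index}"<br />
</code></p>
<p>With values like these in place, the first column named in <code>`vars'</code> would be subscriber_id and the query would read from the Subscribers table.</p>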
]]></content:encoded>
			<wfw:commentRss>http://enoriver.net/index.php/2008/11/21/data-from-ms-access-to-stata/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Data from Excel to Stata, take two</title>
		<link>http://enoriver.net/index.php/2008/11/12/data-from-excel-to-stata-take-two/</link>
		<comments>http://enoriver.net/index.php/2008/11/12/data-from-excel-to-stata-take-two/#comments</comments>
		<pubDate>Wed, 12 Nov 2008 04:49:49 +0000</pubDate>
		<dc:creator>Gabi Huiber</dc:creator>
				<category><![CDATA[Stata]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[odbc load]]></category>

		<guid isPermaLink="false">http://enoriver.net/?p=290</guid>
		<description><![CDATA[I don't like spreadsheets. I like data sets and matrices. But I get data in Excel workbooks all the time, and it's a pain in the neck. This will never go away altogether, but if my Stata-using colleagues could be persuaded to quit sending me spreadsheets, that would at least cut down on the volume. [...]]]></description>
			<content:encoded><![CDATA[<p>I don't like spreadsheets. I like data sets and matrices. But I get data in Excel workbooks all the time, and it's a pain in the neck. This will never go away altogether, but if my Stata-using colleagues could be persuaded to quit sending me spreadsheets, that would at least cut down on the volume. I've tinkered with all kinds of ways to make data extraction from spreadsheets into Stata easier. This is the latest:<br />
<code><br />
clear<br />
set mem 10m<br />
set more off<br />
set type double<br />
</code><code><br />
local here "`c(pwd)'/"<br />
</code><code><br />
// declare name and date of Excel file to read from<br />
local xlsfile  "revised PriceIncr20080815-TargetTempl.xls"<br />
local filedate "20081110"<br />
</code><code><br />
// list the worksheets of interest<br />
local sheets ""Move these data" "And these too""<br />
</code><code><br />
// Extract data from Excel to Stata. Leave this part alone.<br />
local sheetno: list sizeof sheets<br />
forvalues i=1/`sheetno' {<br />
   local k: word `i' of `sheets'<br />
   local my_path     "`here'`xlsfile'"<br />
   local my_sheet    "`k'"<br />
   local source "Excel Files;DBQ=`my_path'"<br />
</code><code><br />
   clear<br />
   odbc load, dsn("`source'") table("`my_sheet'$")<br />
</code><code><br />
   unab varz: _all<br />
   foreach z in `varz' {<br />
      local lower=lower("`z'")<br />
      if "`lower'"!="`z'" {<br />
         rename `z' `lower'<br />
      }<br />
   }<br />
</code><code><br />
   describe<br />
   compress<br />
   save "`here'`k'`filedate'.dta", replace<br />
}<br />
</code><br />
Notice `c(pwd)'. It is a system value rather than a function, and it holds the path to the current working directory. Essentially, the do-file above translates the worksheets listed in the local `sheets' into Stata files saved in that directory, which is also where the do-file lives if you run it from there. The original Excel file may be in the same directory, as in this example, or it may be somewhere else. The `filedate' local is optional. Also optional, of course, is the loop that turns variable names to lower case. I just like mine that way.</p>
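<p>The loop leans on two macro extended functions, <code>list sizeof</code> and <code>word # of</code>, both of which treat a quoted phrase as a single element. Here is a minimal sketch of just that part. It reuses the sheet names from above and writes the list with compound double quotes, which is the more defensive way to nest quotes inside a macro:<br />
<code><br />
local sheets `""Move these data" "And these too""'<br />
local sheetno: list sizeof sheets<br />
display `sheetno'<br />
</code><code><br />
// pull out one element at a time; the enclosing quotes are stripped<br />
forvalues i=1/`sheetno' {<br />
   local k: word `i' of `sheets'<br />
   display "`k'"<br />
}<br />
</code><br />
This should report two elements and then print each sheet name on its own line, spaces included.</p>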
]]></content:encoded>
			<wfw:commentRss>http://enoriver.net/index.php/2008/11/12/data-from-excel-to-stata-take-two/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Getting data into Stata: odbc load</title>
		<link>http://enoriver.net/index.php/2008/10/01/getting-data-into-stata-odbc-load/</link>
		<comments>http://enoriver.net/index.php/2008/10/01/getting-data-into-stata-odbc-load/#comments</comments>
		<pubDate>Thu, 02 Oct 2008 03:58:33 +0000</pubDate>
		<dc:creator>Gabi Huiber</dc:creator>
				<category><![CDATA[Stata]]></category>
		<category><![CDATA[odbc load]]></category>

		<guid isPermaLink="false">http://enoriver.net/?p=177</guid>
		<description><![CDATA[ODBC stands for Open Database Connectivity; odbc load is a tool for getting data out of ODBC-ready applications and into Stata quickly and reliably. One such application is Microsoft Excel. If you haven't yet received data in an Excel workbook, you will. Just to get it out of the way: if you run Stata 10 [...]]]></description>
			<content:encoded><![CDATA[<p>ODBC stands for Open Database Connectivity; <code>odbc load</code> is a tool for getting data out of ODBC-ready applications and into Stata quickly and reliably. One such application is Microsoft Excel. If you haven't yet received data in an Excel workbook, you will.</p>
<p>Just to get it out of the way: if you run Stata 10 and you receive data from somebody using one of the newer, XML-style Excel versions, you don't need <code>odbc load</code> for that. Use the command <code>xmluse</code> instead. You can tell by the file name extension. Whenever Excel went XML (as of Office 2007 perhaps? I've been using OOo for a while now, so I'm not sure) it started saving workbooks with the .xlsx extension. The notes below apply to getting data from old-school -- .xls -- workbooks into Stata.</p>
<p>So, back to <code>odbc load</code>. Its syntax is a bit arcane; for that reason, sometimes more code is preferred to less. You can split this command into components with local macros, for example. Well-chosen local macros make it much easier to re-use code or organize it into modules that can be maintained by different people working simultaneously.</p>
<p>Stata may still be the preserve of the lone researcher, or the grad student working for the typical PI, which is the same thing, and for them these niceties don't matter as much. But the private sector has highly structured ways and Stata can roll with that just as well as it does with the more freewheeling style of academia. So here's how I propose that <code>odbc load</code> should be run for getting data out of an Excel workbook:</p>
<p><code><br />
// your customers can edit this<br />
local my_path     "D:/data/My Project/Source Files/"<br />
local my_workbook "workbook.xls"<br />
local my_file     "Sheet1"<br />
</code><code><br />
// without worrying about any of this<br />
local source "Excel Files;DBQ=`my_path'`my_workbook'"<br />
clear<br />
odbc load, dsn("`source'") table("`my_file'$")<br />
tempfile my_file<br />
compress<br />
describe<br />
save "`my_file'", replace<br />
</code><br />
And that's all for MS Excel. Your temporary file "`my_file'" has all the data on worksheet Sheet1. If you ever need to pass this code on to other people, they will appreciate that they only need to edit the local macros with the path, workbook and worksheet that are right for them, and everything else works unchanged. You can use <code>odbc load</code> for getting data out of MS Access or a MySQL database. You can even embed SQL code. More on that <a href="http://www.ats.ucla.edu/stat/Stata/faq/odbc.htm">here</a>.</p>
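<p>If you are not sure what the worksheet is called, or whether the Excel driver is set up at all, ODBC can tell you before you load anything. A minimal sketch, assuming the same path and workbook as above, and assuming that your ODBC setup accepts the same "Excel Files;DBQ=" source string for queries as it does for loading:</p>
<p><code><br />
// assumed to match the example above<br />
local my_path     "D:/data/My Project/Source Files/"<br />
local my_workbook "workbook.xls"<br />
local source "Excel Files;DBQ=`my_path'`my_workbook'"<br />
</code><code><br />
// list the data sources that ODBC knows about<br />
odbc list<br />
// list the tables, here worksheets, available in the workbook<br />
odbc query "`source'"<br />
</code></p>
<p>Worksheets usually show up in that listing with a trailing $, which is why the <code>table()</code> option above appends one to the sheet name.</p>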
]]></content:encoded>
			<wfw:commentRss>http://enoriver.net/index.php/2008/10/01/getting-data-into-stata-odbc-load/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

