## MacBook Pro running hot, draining battery after upgrading to SSD?

Mine did. That was an unpleasant surprise. Googling for a solution brought up untold amounts of speculation and wasted time.

What ended up working for me was resetting the System Management Controller (SMC), as documented here and especially here. You should see that Reddit comment thread, especially if you're also wondering whether you're supposed to enable TRIM.

Resetting the SMC brought down the CPU core temperatures from about 90°C to about 60°C, low enough for the fan to not kick in. My Mac is once again as quiet as it used to be.

I decided to replace the HDD in my Mac with an SSD for Christmas, but I only got as far as buying the thing and backing up the computer with Time Machine, as explained here. The backup destination was a poor man's FreeNAS server that I cobbled together from a USB stick (for the OS) and the old Fedora 14 home server, which has 2G of RAM and whose sole 500G HDD is now one big ZFS volume.

That's right, ZFS on one HDD with 2G of RAM. I'm not saying that this is a good setup. The official hardware recommendation is 8G for ZFS. But this is the kit I had lying around, and I just wanted to move on with the actual disk replacement; my D510MO board won't even support more than 4G of RAM (though I'm not sure why, since it was made to accommodate a 64-bit CPU). Anyway, I managed to make one first complete Time Machine backup and a few incremental ones before leaving for work on Monday, January 6.

I flew back on Thursday, January 9, and found a non-responsive Mac with an HDD so sick that an erase-and-install OS restore was in order. That's what you end up having to do when, after you enter your password at boot-up, you see the Apple logo for a while, then that "prohibited access" barred circle, while the gear animation keeps spinning and spinning.

I have no idea how this happened. I felt very fortunate for having made that backup. I decided that the accident was a good excuse to just proceed with the SSD installation already.

The proof of this particular pudding was going to be in restoring the old system from that Time Machine backup, over the LAN, off the grossly inadequate NAS box. I am happy to report that the restore succeeded, and my Mac is back in business, now with an SSD.

What I'm saying is this: if you don't have a Time Capsule but do have some idle hardware, FreeNAS may be a good Time Machine backup solution for you too.

One thing you will want to know about is user quotas: a 500G NAS HDD will fill up quickly if you let Time Machine have its way with it. The solution is to set some reasonable user quotas for people in your house who might use the FreeNAS box as their Time Machine backup destination. You can do that from the web GUI. The Advanced Mode of the Create ZFS Dataset menu under Storage (or, for an existing dataset, the Advanced Mode of Edit ZFS Options) lets you set quotas four different ways; for specifics, google thin and thick provisioning. This seems to be advanced sysadmin stuff.

There is also a command-line recipe for setting user quotas here. You get to the FreeNAS shell from the web GUI: look at the bottom of the vertical navigation menu on the left.

Smaller quotas will force Time Machine to keep a shorter history. It deletes old backups as it runs out of space -- so, less room, shorter history. That is not a bad thing.
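
For the command-line route, here is a minimal sketch of what the quota commands might look like from the FreeNAS shell. The dataset name `tank/timemachine`, the user `alice` and the 150G figure are placeholders of mine, not values from my box; `zfs set quota`, `zfs set userquota@...` and `zfs get` are standard ZFS commands. As a precaution the script only prints the commands unless you run it with `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical names: adjust the dataset, user and size to your own setup.
DATASET="${DATASET:-tank/timemachine}"
TM_USER="${TM_USER:-alice}"
SIZE="${SIZE:-150G}"

# Print each command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Cap how much this one user may store in the dataset.
run zfs set "userquota@${TM_USER}=${SIZE}" "$DATASET"
# Alternatively, cap the dataset as a whole (snapshots and children included).
run zfs set "quota=${SIZE}" "$DATASET"
# Read the settings back.
run zfs get quota,"userquota@${TM_USER}" "$DATASET"
```

Whether a per-user or a per-dataset quota suits you better probably depends on whether each person gets their own dataset; I have not compared the two.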

## Invisible methods

R objects come with various methods that make them useful. I tend to stumble over these by googling something I want to do, and finding some code example on StackOverflow. But today I learned (from @RLangTip) that there is a straightforward way to list them all: you simply call e.g., `methods(class='lm')`.

That's nice, but mileage varies and I don't have a good explanation for it. Take Zelig for example. It has this `sim()` function which produces a simulation object with some methods of its own. One of these is `plot.ci()`, illustrated here. Unfortunately, you won't find it with the `methods()` call:

``````
> library("Zelig", lib.loc="C:/Program Files/R/library")
ZELIG (Versions 4.2-2, built: 2013-10-22)

+----------------------------------------------------------------+
|  Please refer to http://gking.harvard.edu/zelig for full       |
|  documentation or help.zelig() for help with commands and      |
|  models support by Zelig.                                      |
|                                                                |
|  Zelig project citations:                                      |
|    Kosuke Imai, Gary King, and Olivia Lau.  (2009).            |
|    ``Zelig: Everyone's Statistical Software,''                 |
|    http://gking.harvard.edu/zelig                              |
|   and                                                          |
|    Kosuke Imai, Gary King, and Olivia Lau. (2008).             |
|    ``Toward A Common Framework for Statistical Analysis        |
|    and Development,'' Journal of Computational and             |
|    Graphical Statistics, Vol. 17, No. 4 (December)             |
|    pp. 892-913.                                                |
|                                                                |
|   To cite individual Zelig models, please use the citation     |
|   format printed with each model run and in the documentation. |
+----------------------------------------------------------------+

Attaching package: ‘Zelig’

The following object is masked from ‘package:utils’:

cite

> methods(class='sim')
[1] plot.sim*   print.sim*   repl.sim*   simulation.matrix.sim*
[5] summary.sim

Non-visible functions are asterisked
``````

See that? There's a non-visible `plot()` method listed, but no `plot.ci()` method, yet it exists and it works. I wonder why that is. Is it maybe that `plot.ci()` is some kind of child of `plot()`? If so, how do you list such children?

## How I backed up a bunch of old pictures to Amazon Glacier

This is from a home server that runs Fedora 14, to which I have ssh access from my MacBook Pro.

1. I `git clone`'d this.

2. Then, as super-user, I called

``````
wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | python
``````

as instructed here, to install the `setuptools` module.

3. Then, also as super-user, I called

``````
python setup.py install
``````

4. At this point, it was time to fill out the .glacier-cmd configuration file, as shown in the README.md.

5. Bookkeeping using Amazon SimpleDB requires setting up an Amazon SimpleDB domain (= database) first. You cannot do this through the AWS Management Console.

6. So I googled, and found official directions here.

7. Unfortunately, my Chrome wouldn't render the SimpleDB Scratchpad web app properly. That caused some unnecessary confusion. The solution was to just run Scratchpad in Safari.

8. Your computer has folders and files. Amazon Glacier has vaults and archives. One archive = one upload. This can be an individual file, but it's more practical to bundle individual files into tarballs first, so one archive = one tarball.

9. I'm in business: two large tarballs uploaded and showing up in my SimpleDB domain that keeps tabs on this particular vault, one on the way.

It looks like everything works, but I can't be sure until Amazon Glacier gets around to producing an inventory (this happens about once a day, it seems). I can then check SHA sums between what's on Glacier and what I thought I sent there. Next I will upload something small, then download it the next day.
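
The bundling and checksum bookkeeping can be sketched like this, assuming GNU `tar` and `sha256sum` as found on a Fedora box. The folder and file names are made up for illustration. One caveat, as far as I understand it: Glacier's inventory reports its own SHA-256 tree hash, which for large archives will not equal a plain `sha256sum`, so the checksum file below is mainly useful for verifying a tarball after downloading it back.

```shell
#!/bin/sh
set -eu

# Throwaway working area so the sketch is safe to run anywhere.
WORK="$(mktemp -d)"
mkdir -p "$WORK/pictures-2009"                       # made-up folder name
echo "pretend this is a photo" > "$WORK/pictures-2009/img_0001.jpg"

# One folder -> one tarball -> one Glacier archive.
TARBALL="$WORK/pictures-2009.tar.gz"
tar -czf "$TARBALL" -C "$WORK" pictures-2009

# Record the checksum next to the tarball, before uploading.
( cd "$WORK" && sha256sum pictures-2009.tar.gz > pictures-2009.tar.gz.sha256 )

# Later, after downloading the archive back: verify it against the record.
( cd "$WORK" && sha256sum -c pictures-2009.tar.gz.sha256 )
```

Keeping the `.sha256` file somewhere other than Glacier is the point: it is the record of what I thought I sent.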

Glacier is the digital equivalent of self-storage. You put stuff there that you don't really want anymore; you think you might, but you don't. It's a problem that comes with ease of acquiring such stuff in the first place. I don't think there's a big self-storage industry in Zambia, and I'm sure that storing old photos wasn't much of a problem back when you had to take them on film and you only had 32 frames in a roll.

I have no idea why we bother with digital self-storage. I guess simply deleting old pictures and a bunch of music we no longer listen to makes us feel like jerks. It's a total trap.

## I put up my first post on RPubs

Sure, it may be the 4chan of data analysis, but it's so nice to be able to do R Markdown right there in RStudio and just hit the Publish button.

Of course, this convenience has downsides. I know it's prudent to sit with your work a bit, just like thinking carefully before you go skinny-dipping, especially when you don't have the benefit of peer review.

On the other hand, it's no use to wait until nobody cares anymore. So, here goes.

## Stata 13 is coming on June 24

Yellow color scheme is out, sky-blue is in, plus expanded capabilities, as one might expect. Notable among them: `xtologit`, `xtoprobit` and long strings -- 2 billion characters long, that is. One of these days you won't need an RDBMS anymore. Wouldn't that be nice?

See more details here.

## Keeping knitr happy after upgrading to R 3.0.0

As noted here, after upgrading to R 3.0.0 you must run

``````
update.packages(checkBuilt=TRUE)
``````

This is because a bunch of packages have to be rebuilt under R 3.0.0 in order to keep working.

So I did, but that was not enough for LyX to compile my PDFs from knitr the way it did only a week ago. I also had to do this:

``````
remove.packages("tikzDevice")
``````

That is right. The package `tikzDevice` can no longer be installed directly from R-forge as a binary, as in

``````
install.packages("tikzDevice", repos="http://R-Forge.R-project.org")
``````

Also, the source files are only available as a .tar.gz archive. To install from it on a Windows machine, you must have Rtools installed first.

## A quick note on rJava

I recently had to set up a PC with kit similar to what I have on my Mac. On this PC the OS is Windows 7 64-bit but the browser is IE8 32-bit. This causes `jucheck.exe` to install (and occasionally update) 32-bit Java. This is unfortunate if you use 64-bit R, because it breaks the `rJava` package, which in turn breaks the `xlsx` package, with the practical consequence that you cannot read Excel worksheets into R.

There is a workaround. First, install Oracle's manual download of 64-bit Java. As of this writing, its Windows 7 home will be in `C:\Program Files\Java\jre7`. You should add this to the `%path%` environment variable. In addition, the `rJava` package depends on `jvm.dll`, and R might be looking for it in the wrong spot. It won't hurt, then, to add this to your `%path%` as well: `C:\Program Files\Java\jre7\bin\server`. There's more on this, as usual, on StackOverflow.

As Oracle warns, your manually-installed 64-bit Java will not be automatically updated. That is a problem when security flaws hit Java, but I find being able to read Excel files into R so useful that I'm willing to just live with this risk, though I don't have a good idea how to best manage it. I'll just keep an eye on ArsTechnica for bug news. If anybody has a better way, I'm all ears.

## An R-squared for logistic regression, packaged

This morning I checked Paul Allison's Statistical Horizons blog and found a post on $R^2$ measures for logistic regression. It introduced me to Tjur's $R^2$ by way of an example, which I repackaged below:

``````
// Reference: http://www.statisticalhorizons.com/r2logistic

// program definition
capture prog drop tjur2
program tjur2, rclass

if !inlist(e(cmd),"logit","logistic") {
di as err "Tjur's R-squared only works after logit or logistic."
exit 498 // Thank you, Nick Cox.
}
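tempvar yhat // temporary variable, dropped when the program ends
quietly predict `yhat' if e(sample)
local y `e(depvar)'
quietly ttest `yhat', by(`y')
local r2logistic = r(mu_2)-r(mu_1) // evaluate now, while ttest's r() results exist
di "Tjur's R-squared " _col(20) %4.3f `r2logistic'
return scalar r2logistic = `r2logistic' // a number in r(r2logistic), not an expression string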

end

// use case
use "http://www.uam.es/personal_pdi/economicas/rsmanga/docs/mroz.dta", clear
logistic inlf kidslt6 age educ huswage city exper
tjur2
``````

I'm not sure yet if it's worth saving this program as `ado/personal/t/tjur2.ado` for my future logistic regression diagnostic needs, but I haven't posted anything Stata-related in too long, so there you have it.

## Tidying up your R packages

Do you have the same R packages installed in two places? Would you like to remove the duplicates? You might find the script below useful:

``````
rm(list=ls(all=TRUE))

# define function to return duplicate packages and paths
tidyup <- function() {
packs <- as.data.frame(installed.packages())
paths <- levels(factor(packs$LibPath)) # factor() also covers LibPath stored as character
main <- subset(packs, LibPath==paths[2]) # base and recommended
mine <- subset(packs, LibPath==paths[3]) # stuff I installed
dups <- intersect(main$Package,mine$Package)
return(list(paths,dups))
}

# do the work:
cleanthis <- tidyup()
removethese <- cleanthis[[2]]          # here's the list of dups
fromhere <- cleanthis[[1]][3]          # I only want them on the main path
remove.packages(removethese, fromhere) # done

# check the result:
# if length(tidyup()[[2]])=0, all is well. no dup packages left.
checkthis <- as.numeric(length(tidyup()[[2]]))
``````

Why I wrote this:

A while back I chose to separate my package library over two file paths. One would be for base and recommended packages (1), the other for everything else (2). My notes on how I did that are here, and my reasons are here.

Today, I wanted to update my Zelig. I used the wizard -- `source("http://r.iq.harvard.edu/zelig.installer.R")` -- so I would get all the add-ons in one step. The wizard works under the assumption that your library is all in one place. It installed on path (2) a few packages that Zelig and its add-ons depend on, because it didn't find them there. They were present on path (1) though, so I ended up with duplicates. This is how I got rid of them.