1. Take five minutes to simplify your life with Make

published:
edited: June 22, 2016, 13:30
tags:

I use GNU Make to automate my data processing pipelines. I’ve written a tutorial [1] for novices on the basics of using Make for reproducible analysis, and I think that everyone who writes more than one script, or runs more than one shell command, to process their data can benefit from automating that process. I’m not alone.

However, the investment required to learn Make and to convert an entire project can seem daunting to many time-strapped researchers. Even if you aren’t living the dream—rebuilding a paper from raw data with a single invocation of make paper—I still think you can benefit from adding a simple Makefile to your project root.

When done right, scripting the tedious parts of your job can save you time in the long run [2]. But the time savings aren’t the only reason to do it. For me, a bigger …

2. Tutorial: Reproducible bioinformatics pipelines using GNU Make

published:
edited: March 5, 2016, 10:00
tags:

For most projects with moderate to intense data analysis you should consider using Make. Some day I’ll write a post telling you why, but for now check out this post by Zachary M. Jones [1]. If you’re already convinced, or just want to see what it’s all about, read on.

This post is a clone of a tutorial that I wrote for Titus Brown’s week-long Bioinformatics Workshop at UC Davis’s Bodega Marine Laboratory in February 2016. For now, the live tutorial lives in a GitHub repository, although I eventually want to merge all of the good parts into the Software Carpentry Make lesson (repository).

I’m posting this tutorial because I think it’s a good introduction to the analysis pipeline approach I have been slowly adopting over the last several years. This approach is even more deeply enshrined in a project template that I …

3. Compiling SciPy on RHEL6

published:
tags:

Within the past two years I’ve discovered something interesting about myself (…actually really, really boring about myself): I can be happily entertained for hours on end setting up my computational environment just right. I find that it gives me a similar type of satisfaction to cataloguing my music collection. I guess you could call it a hobby.

Usually this entails installing the usual suspects (NumPy, Pandas, IPython, matplotlib, etc.) in a Python virtual environment. When I’m particularly into it (which is always), I’ll also compile the Python distribution itself. I’ve had several opportunities to indulge this pastime, most recently in setting up my research pipeline on the Flux high-performance compute cluster at the University of Michigan.

Installing NumPy is usually no trouble at all, but for some reason (if you know, please tell me), SciPy has always given me a “BlasNotFoundError” when installing on the Red …

4. PyMake I: Another GNU Make clone

published:
edited: March 4, 2016, 10:00
tags:

(Edit 1): This is the first of two posts about my program PyMake. I’ll post the link to Part II here when I’ve written it. While I still agree with many of the views expressed in this piece, I have changed my thinking on Makefiles.

(Edit 2): I’ll publish a new post on the topic when I take the time to write it. I’ve written a tutorial on using Make for reproducible data analysis.

I am an aspiring but unskilled (not yet skilled?) computer geek. You can observe this for yourself by watching me fumble my way through vim configuration, multi-threading/processing in Python, and git merges.

Rarely do I actually feel like my products are worth sharing with the wider world. The only reason I have a GitHub account is personal convenience and absolute confidence that no one else will ever look …
