Articles tagged programming

  1. Things I'm Glad I Learned

    Skills, concepts, techniques, and models

    WARNING: This post was written with haste and therefore contains all kinds of typos, spelling errors, grammatical issues, and delusions of grandeur, wisdom, and writing ability.

    This post is intended as a living document, a gratitude journal of sorts, listing some things that I'm glad I learned. I expect many of the items on this list will be relevant to computational biology, but that may change in the future.

    The big idea is that for every item on this list I (A) am glad that someone introduced me to it, and (B) think more people should know about it. This post is my chance to "pay it backwards", as it were; maybe someone else will be grateful for something they find for the first time on this list.

    It may also double as an inspiration list for future posts.

    My goal is to write a small blurb for each item …

  2. Tutorial: Reproducible data analysis pipelines using Snakemake

    In many areas of natural and social science, as well as engineering, data analysis involves a series of transformations: filtering, aggregating, comparing to theoretical models, culminating in the visualization and communication of results. This process is rarely static, however, and components of the analysis pipeline are frequently subject to replacement and refinement, resulting in challenges for reproducing computational results. Describing data analysis as a directed network of transformations has proven useful for translating between human intuition and computer automation. In the past I've evangelized extensively for GNU Make, which takes advantage of this graph representation to enable incremental builds and parallelization.
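    The graph idea above can be sketched in a few lines of Python. This is only an illustration of how a dependency graph allows incremental rebuilds, not how GNU Make or Snakemake is actually implemented; the file names, the `stale_targets` helper, and the integer timestamps are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def stale_targets(deps, mtimes):
    """Return the targets that need rebuilding, in a valid build order.

    deps:   maps each target to the list of files it is built from.
    mtimes: maps each existing file to its modification time.
    """
    stale = []
    # static_order() yields every prerequisite before the targets built from it.
    for target in TopologicalSorter(deps).static_order():
        prereqs = deps.get(target, [])
        if not prereqs:
            continue  # a raw input file: nothing to build
        missing = target not in mtimes
        outdated = any(
            p in stale or mtimes.get(p, 0) > mtimes.get(target, 0)
            for p in prereqs
        )
        if missing or outdated:
            stale.append(target)
    return stale


# Hypothetical pipeline: raw data -> filtered data -> summary -> figure.
deps = {
    "filtered.tsv": ["raw.tsv"],
    "model.tsv": ["raw.tsv"],
    "summary.tsv": ["filtered.tsv"],
    "figure.pdf": ["summary.tsv", "model.tsv"],
}
up_to_date = {"raw.tsv": 1, "filtered.tsv": 2, "model.tsv": 2,
              "summary.tsv": 3, "figure.pdf": 4}
# Pretend filtered.tsv was regenerated after everything else was built.
touched = {**up_to_date, "filtered.tsv": 5}

print(stale_targets(deps, up_to_date))
print(stale_targets(deps, touched))
```

    With everything up to date, nothing is rebuilt. After the filtering step changes, only the summary and the figure are rebuilt; the model, which does not depend on the filtered data, is left alone.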

    Snakemake is a next-generation tool based on this concept and designed specifically for bioinformatics and other complex, computationally challenging analyses. I've started using Snakemake for my own data analysis projects, and I've found it to be a consistent improvement, enabling more complex pipelines with fewer of the "hacks" that …

  3. Teaching Python by the (Note)Book

    tl;dr: I tried out a modified Python lesson and I think it was successful at balancing learner motivation with teaching foundational (and sometimes boring) concepts.

    In many ways, teaching Python to scientists is easier than teaching it to just about any other audience. The learning objective is clear: write code to make my science more accurate, more efficient, and more impactful. The motivation is apparent: data is increasingly plentiful and increasingly complex. The learners are both engaged and prepared to put in the effort required to develop new skills.

    But, despite all of the advantages, teaching anybody to program is hard.

    In my experience, one of the most challenging trade-offs for lesson planners is between motivating the material and teaching a mental model for code execution. For example, scientists are easily motivated by simple data munging and plotting using pandas and matplotlib; these are features of the Python ecosystem that can convince …

  4. Tutorial: Reproducible bioinformatics pipelines using GNU Make

    WARNING: Because of the Markdown rendering of this blog, tab characters have been replaced with 4 spaces in code blocks. For this reason, the makefile code will not work when copied directly from the post. Instead, you must first replace all 4-space indents with a tab character.
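    If replacing the indents by hand is tedious, a small script can do it. The following is a minimal sketch, assuming the copied makefile text is held in a string and every leading group of 4 spaces stands for one tab; the `spaces_to_tabs` helper and the example rule are hypothetical, not part of the tutorial.

```python
import re


def spaces_to_tabs(text, width=4):
    """Turn each leading group of `width` spaces into one tab character."""
    # Match one or more full groups of `width` spaces at the start of a line.
    pattern = re.compile(rf"^(?: {{{width}}})+", flags=re.MULTILINE)
    return pattern.sub(lambda m: "\t" * (len(m.group(0)) // width), text)


# Hypothetical rule as it would look when copied from the rendered post.
copied = "all: out.txt\n    echo done > out.txt\n"
fixed = spaces_to_tabs(copied)
print(repr(fixed))
```

    Only indentation at the start of a line is changed; runs of spaces inside a line are left untouched.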

    For most projects with moderate to intense data analysis, you should consider using Make. Some day I'll write a post telling you why, but for now check out this post by Zachary M. Jones. If you're already convinced, or just want to see what it's all about, read on.

    This post is a clone of a tutorial that I wrote for Titus Brown's week-long Bioinformatics Workshop at UC Davis's Bodega Marine Laboratory in February 2016. For now, the live tutorial lives in a GitHub repository, although I eventually want to merge all of the good parts into the Software Carpentry Make lesson …

  5. First time teaching Python to novices

    This July I co-instructed a Software Carpentry workshop at Stanford University with Jennifer Shelton, targeted to researchers with genomic or evolutionary datasets. Jennifer taught the shell (Bash) and version control with Git, while I taught the general-purpose programming language Python. I've been aware of the organization, which teaches software development and computational methods to scientists, since attending a workshop in 2012. Since then I've served as a helper at one workshop (troubleshooting individual learners' problems and helping them catch up with the rest of the class), and gone through the "accelerated", two-day instructor training at Michigan State University. After the Stanford workshop, I took part in a new-instructor debriefing on August 4th, during which I mentioned that I had to greatly pare down the community-written lesson plan, python-novice-inflammation, to fit into the two half-day sessions we allotted it.

    Karin and Tiffany, who were running the debriefing, asked me to send …