The rise of R for statistical analysis

Note: This is an opinion piece written for NutriNews, the newsletter of the Department of Nutritional Sciences (University of Toronto). A link to the piece will be posted after the newsletter is published.

Managing time and projects (at least in a research setting)

I sometimes get distracted by other things going on in my life, which ends up making me less productive in my work (in an academic setting). I want to work out the best (or at least a better) approach to getting things done and producing results. Many of the approaches I use or try to use come from the recommendations in “Getting Things Done”. Several others come from project management or software development practices.

Building tables using the carpenter R package

In biomedical research, certain types of tables are included in almost every article. A common example is a table of basic statistics comparing the treatment and control groups, or males and females, or participants before and after an intervention, and so on. These tables are often a hassle to create and tend to need updating after slight changes in the data or in response to reviewer comments. I can’t stand doing these repetitive and simple tasks by hand, so I designed carpenter to make creating these tables easy.

Statistical analysis construction: The mason R package

Most analyses follow a pattern similar to how construction/engineering projects are developed: design -> add specifications -> construction -> (optional) add to the design and specs and continue construction -> cleaning, scrubbing, and polishing. I created the mason package to try to emulate this process and make it easier to do analyses in a consistent and ‘tidy’ format.

Standardized project generation: the prodigenr R package

Academic researchers need to write up abstracts for conferences or submit manuscripts to journals. Often, each abstract, manuscript, or presentation is created ad hoc, with little structure to its files and folders. That makes the project harder to come back to after a few months, harder to reproduce, and harder for others to look over. I developed the R package prodigenr to automate this step and to help the project adhere to reproducible analytic guidelines.

Loops and Forests: Running and presenting multiple tests of linear regression

If you do any kind of data-heavy work, you have likely had to run many variations of a regression. As the number of response and explanatory variables increases, so does the number of possible combinations. There is no way you are going to type out dozens of different regressions… You also face the challenge of presenting that much information. In this post, I’m going to go over a way to loop through each of the possible combinations. I’m also going to argue that whenever many results of the same test are shown, a table is probably the worst way to show them… and that plots, in particular a modified forest plot, are the best way to present them. For both the loop and the forest plot, I’ve created several functions that carry out this task for a generalized estimating equations analysis in my rstatsToolkit package on GitHub, with an example in the examples section of the plotForest function.
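
To give a rough feel for the looping idea, here is a minimal sketch in Python rather than R. It is not taken from rstatsToolkit and it swaps in ordinary least squares for generalized estimating equations to stay self-contained. The data, the variable names, and the `simple_ols` helper are all made up for illustration, and the interval uses a plain normal approximation (1.96 standard errors), which is crude for a sample this small.

```python
from itertools import product
from math import sqrt

# Made-up illustrative data; the variable names are hypothetical.
data = {
    "glucose": [5.1, 5.4, 5.0, 6.2, 5.8, 6.0],
    "insulin": [40.0, 52.0, 38.0, 75.0, 60.0, 66.0],
    "bmi": [22.0, 25.0, 21.0, 31.0, 27.0, 29.0],
    "age": [34.0, 45.0, 29.0, 61.0, 50.0, 55.0],
}
responses = ["glucose", "insulin"]
explanatory = ["bmi", "age"]


def simple_ols(x, y):
    """Return the slope and its standard error for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used for the standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)
    return slope, se


# Loop over every response/explanatory pair instead of typing each model out.
results = []
for y_name, x_name in product(responses, explanatory):
    slope, se = simple_ols(data[x_name], data[y_name])
    results.append({
        "response": y_name,
        "explanatory": x_name,
        "estimate": slope,
        "lower": slope - 1.96 * se,  # approximate 95% interval
        "upper": slope + 1.96 * se,
    })

# One row per model: the estimate and interval a forest plot would draw.
for row in results:
    print(f"{row['response']} ~ {row['explanatory']}: "
          f"{row['estimate']:.3f} ({row['lower']:.3f}, {row['upper']:.3f})")
```

Each row of `results` holds one estimate with its interval, which is the shape of data a forest plot needs: one point and one horizontal line per model.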

Understanding variance and linear regression

I’ve written this post because I want to better understand how variance, the foundation of most statistics, works and how it applies in the context of a simple linear regression. I hope that this post will also be useful for others learning about statistics.
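
As a small taste of how the two connect, here is a minimal sketch in Python using made-up numbers (the data and variable names are assumptions for illustration only). It relies on two standard results for simple linear regression: the slope equals the covariance of x and y divided by the variance of x, and the total sum of squares of y splits into an explained part and a residual part.

```python
# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample variance of x and sample covariance of x and y (both divide by n - 1).
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Least-squares slope and intercept for y = intercept + slope * x.
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

# Split the variation in y into the part the line explains and the part it leaves.
fitted = [intercept + slope * xi for xi in x]
total_ss = sum((yi - mean_y) ** 2 for yi in y)
explained_ss = sum((fi - mean_y) ** 2 for fi in fitted)
residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared = explained_ss / total_ss

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"total SS = {total_ss:.2f}, explained SS = {explained_ss:.2f}, "
      f"residual SS = {residual_ss:.2f}, R^2 = {r_squared:.2f}")
```

The explained and residual sums of squares add up to the total, so R² is simply the share of the variation in y that the fitted line accounts for.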

Staying sane: Rules on consistent file naming practices

If you do any significant amount of computer work (or even if you don’t), one important part of maintaining order and sanity with your files is having established rules for naming files and folders. You can develop your own or use rules developed by others. I’ve taken a bit from all over the place (including from the University of Edinburgh website), added some of my own thoughts on file naming, and decided to put the result up on my blog so others have a starting point for developing their own. Below I detail the file naming rules that I try to be strict about.

Using more advanced macro techniques in SAS: Conditionals

In a previous post, I went over a step-by-step introduction to starting your own macro. For this post, I want to go over some more advanced features of the macro facility in SAS. There are two features in particular that are very useful to know (at least in my opinion) for developing more useful and powerful personalized macros.

An introduction to creating your own macro in SAS

Ever copied and pasted a proc statement or data step? Or wished you could do some more complex task, either once or many times? Well, in SAS there is a facility that lets you create snippets of code that can be reused and that make complex tasks easier and more maintainable. This facility is called macros.
