Tuesday: Tips & Tricks 

I've been programming in R for four years now, and it seems that no matter how much I learn, there are a million tiny ways that I could do it better.  We all have our own programming styles and frequently used functions that may prove useful to others.  I often find that a casual conversation with an office mate yields new approaches to a programming quandary.  I'm speaking not of statistical insights, though those are important too, but rather of the "simple" art of data manipulation and programming implementation: those essential tricks that help to improve coding efficiency.  To that end, I'm announcing the beginning of a bi-weekly "Tuesday Tips & Tricks" posting.  These tips may include the description of a useful and perhaps obscure function, or solutions to common coding problems.  I'm selfishly hoping that if readers of this blog know of better or alternate approaches, they'll respond in the comment section.  I'm looking forward to reading your responses.
  
This week's tip: How to quickly summarize the contents of an object.
  
Answer:  summary(), str(), dput()  


The primary option, of course, is the familiar  summary()  command.  This command works well for viewing model output, but it is also a quick way to get a sense of data frames, matrices, and factors.  For example, the summary of a data frame shows the following:

 > summary(dat1) 
     Hello           test     citynames         
 Min.   :1.00   Min.   :-3   Length:2           
 1st Qu.:1.25   1st Qu.:-2   Class :character   
 Median :1.50   Median :-1   Mode  :character   
 Mean   :1.50   Mean   :-1                      
 3rd Qu.:1.75   3rd Qu.: 0                      
 Max.   :2.00   Max.   : 1                           
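
The post does not show how dat1 was created, so here is a minimal sketch that rebuilds a data frame consistent with the printed output. The construction itself, including the use of stringsAsFactors = FALSE to keep citynames as character, is an assumption inferred from that output rather than the original code.

```r
# Rebuild a data frame matching the printed summary (assumed construction;
# the original definition of dat1 is not shown in the post).
dat1 <- data.frame(
  Hello     = c(1, 2),
  test      = c(-3, 1),
  citynames = c("Cambridge", "Rochester"),
  stringsAsFactors = FALSE  # keep citynames as character, not factor
)

summary(dat1)                    # column-by-column summary of the data frame
summary(dat1$Hello)              # a single numeric vector: quartiles and mean
summary(factor(dat1$citynames))  # a factor is summarized as counts per level
```

Converting a character column to a factor, as in the last line, is one way to get counts out of summary() instead of just the length, class, and mode.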
  
summary()  is an incredibly useful function for numeric data, but it is less useful for string data.  For character vectors, it reveals only the length, class, and mode of the variable.  In this case, to get a quick look at the data, one might want to use  str().   Officially,  str()   "compactly displays the structure of an arbitrary R object", and in practice it is just as handy as that sounds.  So, using the same data frame as an example:
  
> str(dat1) 
'data.frame': 2 obs. of  3 variables: 
 $ Hello    : num  1 2 
 $ test     : num  -3 1 
 $ citynames: chr  "Cambridge" "Rochester" 
  
Here, dat1 is a data frame with 2 observations of 3 variables.  The first variable, Hello, is numeric, and its values are 1 and 2.  The character vector citynames is displayed much more usefully than it was by  summary().  While this is a small example, the function works just as well for much larger data frames and matrices, where it displays only the first few values of each variable (about ten for numeric vectors by default).
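
To see the truncation on something bigger, here is a small sketch using a made-up 100-row data frame (the name big and its columns are hypothetical, chosen only for illustration). The vec.len argument of str() controls roughly how many leading values are printed per variable.

```r
# A made-up 100-row data frame, used only to illustrate truncation.
big <- data.frame(
  id    = 1:100,
  value = seq(0.5, 50, by = 0.5),
  label = rep(c("a", "b"), 50),
  stringsAsFactors = FALSE
)

str(big)               # default display: only the leading values of each column
str(big, vec.len = 2)  # a smaller vec.len prints fewer values per column
```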
  
For smaller objects, the function  dput()  might also prove useful.   This function writes an ASCII text representation of the R object and its attributes, which can be pasted back into R to recreate the object.  So for this same example:
  
> dput(dat1) 
structure(list(Hello = c(1, 2), test = c(-3, 1), citynames = c("Cambridge",  
"Rochester")), .Names = c("Hello", "test", "citynames"), row.names = c(NA,  
-2L), class = "data.frame")
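
Because the dput() text is itself R code, it can be used to rebuild the object, which is handy when sharing a small example with someone else. The sketch below assumes the same reconstructed dat1 as in the printed output (its construction is an assumption, since the post does not show it) and round-trips it both through a string and through a temporary file with dget().

```r
# Assumed reconstruction of dat1; the original definition is not shown.
dat1 <- data.frame(
  Hello     = c(1, 2),
  test      = c(-3, 1),
  citynames = c("Cambridge", "Rochester"),
  stringsAsFactors = FALSE
)

# Capture the dput() text, then evaluate it to rebuild the object.
txt  <- paste(capture.output(dput(dat1)), collapse = "\n")
dat2 <- eval(parse(text = txt))
all.equal(dat1, dat2)

# dput() can also write to a file; dget() reads the object back in.
tmp <- tempfile(fileext = ".R")
dput(dat1, file = tmp)
dat3 <- dget(tmp)
all.equal(dat1, dat3)
```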