Metadata
Title
STU33011 Lab 1
Category
general
UUID
af0303aaeeae477982e2d5641993c6e0
Source URL
https://www.scss.tcd.ie/arthur.white/Teaching/STU33011/Lab1.html
Parent URL
https://www.scss.tcd.ie/arthur.white/Teaching/STU33011.html
Crawl Time
2026-03-16T07:05:04+00:00
Rendered Raw Markdown

STU33011 Lab 1


Lab sessions will focus on using R to perform the techniques that we have discussed in lectures. In this first session, we will cover the basics of working in R and RStudio.

R is a free software environment especially designed for statistical computing and graphics. It is open source and is regularly updated with new packages that can perform recently developed statistical techniques (see e.g. http://dirk.eddelbuettel.com/cranberries/).

If R has been installed on your PC, it can be accessed by opening the Start menu, going to All Programs, and selecting the R file from the folder of the same name. If R has not already been installed, it is available as a free download from http://www.r-project.org/. Click on the “CRAN” option, choose a “mirror” (for example, Dublin, or Bristol), and then select the version you require. Follow the usual installation steps thereafter.

RStudio is an integrated development environment (IDE) for the R programming language. It is more user-friendly than the basic R GUI (graphical user interface) and provides greater functionality. It can be downloaded for free at rstudio.com.

1 Manuals

Useful on-line manuals are available at:

2 Using the R Console

Typing a command after the > prompt and pressing the enter/return key will cause R to evaluate the command and return the results of its computations.

In the R Console, enter:

x <- sqrt(25) + 2

This simple command tells R to calculate $\sqrt{25}+2$ and to store the solution under the name x. In RStudio, the object x should be visible in the Environment tab of the top right pane. The value of x can also be inspected by entering the following:

x
## [1] 7
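
As a small illustration of how a stored object can be reused in later commands, consider the sketch below. The names y, z and w are illustrative only and are not part of the lab.

```r
# Store a value under the name x, as in the text above
x <- sqrt(25) + 2

# Stored objects can be used in later calculations
y <- x * 3         # 7 * 3
z <- (x + y) / 2   # average of x and y

# Wrapping an assignment in parentheses assigns and prints in one step
(w <- z - x)
```

Nothing is printed by a plain assignment; the value only appears when the object name is entered on its own or the assignment is wrapped in parentheses.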

Exercise

3 Help Commands

In order to access the internal help files in R, use either the ? command or the help.search function. Enter the following into R and observe the result:

?exp 
help.search("exponential")

The first of these commands brings up the “Logarithms and Exponentials” help file, and explains how to use the exp function. In general, given an R command or function x, entering ?x will bring up its help file.

The second command provides a list of help files in which the term exponential can be found in the concept or title. This can be very useful when searching for the appropriate R command for performing a given exercise.

In RStudio, help files can also be accessed by selecting the Help tab in the bottom right pane on screen and using the search box.
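
A few related ways of finding help from the console are sketched below. These are standard base R functions, but the choice of exp and matrix as examples is simply for illustration.

```r
# help() is the long form of the ? command
help("exp")

# apropos() lists objects on the search path whose names match a pattern
matches <- apropos("exp")
head(matches)

# args() gives a quick view of a function's arguments
args(matrix)
```

Similarly, ??exponential is a shorthand for help.search("exponential").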

Exercise

4 Vectors and Matrices

Probably the most commonly used command in R is the c command, which combines all the arguments it has been given into an ordered vector. For example:

s <- c(1, 2, 3, 4)
s
## [1] 1 2 3 4
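
The c command can also combine existing vectors with new values, and the elements of a vector can be picked out using square brackets. A brief sketch (the name s_long is illustrative):

```r
s <- c(1, 2, 3, 4)

# Combine an existing vector with further values
s_long <- c(s, 5, 6)

length(s_long)     # number of elements
s_long[2]          # the second element
s_long[c(1, 6)]    # the first and sixth elements

# Arithmetic is applied to each element in turn
s * 2
```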

Exercise

To construct a matrix, the matrix command is used. Check this command’s help file to make sure you understand what the following code does:

A <- matrix(c(1, 2, 2, 5), nrow = 2, ncol = 2, byrow = TRUE)
A
##      [,1] [,2]
## [1,]    1    2
## [2,]    2    5
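
Because A is symmetric, the effect of the byrow argument is not visible in the output above. The sketch below uses two illustrative matrices, B and C, to show the difference, along with a few common matrix operations.

```r
A <- matrix(c(1, 2, 2, 5), nrow = 2, ncol = 2, byrow = TRUE)

# Same values, filled row by row and column by column
B <- matrix(1:6, nrow = 2, byrow = TRUE)   # rows are 1 2 3 and 4 5 6
C <- matrix(1:6, nrow = 2)                 # columns are 1 2, 3 4 and 5 6

dim(B)       # number of rows and columns
B[1, 2]      # element in row 1, column 2 of B
C[1, 2]      # element in row 1, column 2 of C

t(B)         # transpose
A %*% A      # matrix multiplication (A * A would multiply element by element)
```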

Exercise

5 Re-using Code

One quick method of re-running previously typed code is to use the $\uparrow$ and $\downarrow$ keys in the R console. Pressing these allows the user to cycle through previously entered commands, which are saved in the command history. Previous code can then either be run again as it is, or be quickly altered as needed (e.g. to fix a typo).

Exercise

An alternative (and generally better) approach is to write your code as an R script.

To open an R script in RStudio, select File, then New File, and R Script. Alternatively, use the keyboard shortcut Ctrl+Shift+N, or click the New File icon denoted by a blank page with a green circle containing a white plus sign, which is located below File.

Whereas a line of code in the R Console can be run by pressing the Enter key, to run code directly from an R script press Ctrl+Enter or click the Run (‘Run the current line or selection’) button in the top right corner of the R script tab. This will run the line on which the cursor is currently located or any code which is currently highlighted.

R scripts can be saved by clicking the floppy disk save icon in the top left corner of the R script tab or by using the keyboard shortcut Ctrl+S.

Exercise

6 Additional Packages

The base distribution of R comes with many commonly used add-on packages. These implement additional R commands that are often very useful.

To load one of these packages in RStudio, select the Packages tab of the bottom right pane and click the box to the left of the package name.

Alternatively, just use the library function, e.g. library("cluster").

Exercise

There are many more additional packages available from the CRAN (comprehensive R archive network) website (http://cran.r-project.org/web/packages/). The easiest way to install a package is to use the install.packages function.

Another option is to follow the links on CRAN and directly download the package of choice. In RStudio, the downloaded package can be installed by clicking the Install icon at the top of the Packages tab in the bottom right pane.
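
The install.packages route mentioned above can be sketched as follows. This is a minimal sketch that assumes the cluster package is either already installed (it ships with most R installations) or can be fetched over the internet; the variable pkg and the choice of the cloud.r-project.org mirror are illustrative.

```r
pkg <- "cluster"

# Install the package only if it is not already available
# (this step needs an internet connection and a CRAN mirror)
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg, repos = "https://cloud.r-project.org")
}

# Load the package so that its functions can be used
library(pkg, character.only = TRUE)

# Check that the package now appears on the search path
"package:cluster" %in% search()
```

A package only needs to be installed once, but it must be loaded with library in every new R session in which it is used.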

Exercise

7 Creating Functions

Many of the functions we have been using were created by an R contributor, as R is open source. For example, the mclust package and its functions were written by Chris Fraley, Adrian Raftery, Brendan Murphy, Michael Fop, and Luca Scrucca. It is easy to write your own functions in R. Enter the following:

square <- function(x){ x^2 }

square(3)
## [1] 9
square(s)
## [1]  1  4  9 16
square(A)
##      [,1] [,2]
## [1,]    1    4
## [2,]    4   25

The first piece of code creates a function called square that returns the squared value of its argument. Note how this function treats the vector s and the matrix A: square acts on the individual elements of its argument. In general, a function is created by assigning the result of the function command to a name. The function command is followed by the arguments of the function within parentheses (), and then by the commands performed by the function within braces {}.
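
Functions can take more than one argument, and arguments can be given default values. The sketch below defines a hypothetical function named power (not part of the lab) that generalises square.

```r
# Raise x to the power p; p defaults to 2, so power(x) behaves like square(x)
power <- function(x, p = 2) {
  result <- x^p
  return(result)   # the value handed back to the caller
}

power(3)                # uses the default p = 2
power(3, p = 3)         # overrides the default
power(c(1, 2, 3, 4))    # acts on each element of a vector
```

If return is omitted, a function returns the value of the last expression it evaluates, which is why square above works without it.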

8 if Statements and for Loops

Although such calculations can be slow compared to lower-level languages, R can be used to write for loops and if statements. For example:

for(i in 1:10){
  if(i == 1){
    print(paste("The first number is", i))
  }
  if(i > 1){
    print(paste("The next number is", i))
  }
}
## [1] "The first number is 1"
## [1] "The next number is 2"
## [1] "The next number is 3"
## [1] "The next number is 4"
## [1] "The next number is 5"
## [1] "The next number is 6"
## [1] "The next number is 7"
## [1] "The next number is 8"
## [1] "The next number is 9"
## [1] "The next number is 10"

The syntax here is straightforward. A for loop is performed by entering the command for followed by the index for the loop and its range within parentheses (), followed in braces {} by the commands to be performed during each iteration of the index:

for(i in 1:10){
  print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10

The if statement is followed by a logically testable condition given in parentheses (), followed in braces {} by the commands to be performed if the condition is true:

if(2>0){  print("yes") }
## [1] "yes"
if(2==0){ print("yes") }

If a different command is to be performed when the initial condition is not met, an if statement can be followed by an else clause:

toy1 <- -2

if(toy1 > 0){
  print("yes")
} else{
  print("no")
}
## [1] "no"
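
Two variations on the code in this section are sketched below. The first chains several conditions using else if, wrapped in a hypothetical function named sign_label. The second reproduces the messages from the first loop without an explicit loop, using the vectorised functions ifelse and paste; this is often faster in R than looping.

```r
# Chain several conditions with else if
sign_label <- function(v) {
  if (v > 0) {
    "positive"
  } else if (v < 0) {
    "negative"
  } else {
    "zero"
  }
}

sign_label(-2)
sign_label(0)

# Vectorised alternative to the first loop of this section
i <- 1:10
prefix <- ifelse(i == 1, "The first number is", "The next number is")
msgs <- paste(prefix, i)
print(msgs[1:3])
```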

Exercise

9 Importing/Exporting Data

R can import common data formats such as text files, Excel files, SPSS files, etc.

Exercise

The file music.csv is a comma-delimited (CSV) file, which can be read into R via the following (where the dots represent the file path): music <- read.csv("C:\\...\\music.csv").

Another approach, instead of using the full file path, is to change the R working directory to the folder which contains the file to be read and then use only the file name itself.

In RStudio, select Session, then Set Working Directory. If the file to be read is in the same folder as the current R script, then select To Source File Location. Otherwise, select Choose Directory... and browse to find the correct folder.

Alternatively, you can use the setwd function to change the working directory. The getwd function allows you to check the current working directory.

Then you can enter:

music <- read.csv("music.csv")
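
The working directory approach can be sketched end to end as follows. To keep the example self-contained, it uses a temporary folder and a made-up file called toy.csv rather than the music data; all names and values here are illustrative.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# Create a small made-up CSV file in a temporary folder
demo_dir <- tempdir()
toy_data <- data.frame(a = 1:3, b = c("x", "y", "z"))
write.csv(toy_data, file.path(demo_dir, "toy.csv"), row.names = FALSE)

# Change the working directory, then read the file by name only
setwd(demo_dir)
toy <- read.csv("toy.csv")
toy

# Restore the original working directory
setwd(old_wd)
```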

Note that you can also import data directly using the URL, e.g.

music <- read.csv("https://www.scss.tcd.ie/~arwhite/Teaching/STU33011/music.csv")

Additionally, in RStudio, you can click Import Dataset in the Environment tab of the top right pane or browse through the computer’s files in the Files tab of the bottom right pane, click on the file, and choose Import Dataset....

We have loaded a data frame named music into R. In RStudio, this data frame should be visible in the Environment tab in the top right pane. Once the file has been read in, the first 10 rows of music can be checked by entering:

head(music, 10)
##                 X Artist Type     LVar      LAve  LMax   LFEner     LFreq
## 1   Dancing Queen   Abba Rock 17600756 -90.00687 29921 105.9210  59.57379
## 2      Knowing Me   Abba Rock  9543021 -75.76672 27626 102.8362  58.48031
## 3   Take a Chance   Abba Rock  9049482 -98.06292 26372 102.3249 124.59397
## 4       Mamma Mia   Abba Rock  7557437 -90.47106 28898 101.6165  48.76513
## 5     Lay All You   Abba Rock  6282286 -88.95263 27940 100.3008  74.02039
## 6   Super Trouper   Abba Rock  4665867 -69.02084 25531 100.2485  81.40140
## 7  I Have A Dream   Abba Rock  3369670 -71.68288 14699 104.5969 305.18689
## 8      The Winner   Abba Rock  1135862 -67.81905  8928 104.3492 277.66056
## 9           Money   Abba Rock  6146943 -76.28075 22962 102.2407 165.15799
## 10            SOS   Abba Rock  3482882 -74.13000 15517 104.3624 146.73700

music consists of 8 columns and 62 observations. To determine the dimensions of music run the command:

dim(music)
## [1] 62  8
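
Some other commonly used commands for inspecting a data frame are sketched below. So that the code runs on its own, it uses a small made-up data frame named toy whose column names mimic some of those in music; the values are placeholders and are not taken from the music data.

```r
# A made-up data frame with placeholder values
toy <- data.frame(
  Artist = c("A", "A", "B"),
  Type   = c("Type1", "Type1", "Type2"),
  LMax   = c(100, 200, 300)
)

names(toy)          # column names
nrow(toy)           # number of rows (observations)
ncol(toy)           # number of columns
str(toy)            # compact summary of the structure
summary(toy$LMax)   # numerical summary of one column
table(toy$Type)     # counts of each value in a column
```

The same commands can be applied to music once it has been read in, e.g. names(music) or summary(music$LMax).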

To save the music data frame, i.e. to write the data frame to a new file, the following could be used:

write.table(music, file = "C:\\...\\music2.csv", sep = ",")

Or if the working directory has been changed then you can simply enter:

write.table(music, file = "music2.csv", sep = ",")

In the above, the first argument (music) gives the name of the data frame to be saved, the second (file) argument specifies the file path to which the data are to be saved, and the final (sep) argument tells R to save the file in a comma-delimited format.
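
By default, write.table also writes the row names of the data frame as an extra first column. One possible alternative, sketched below, is the write.csv function with row.names = FALSE. The example uses a made-up data frame and a temporary file so that it can be run as is; the names toy, toy2 and out_file are illustrative.

```r
# A made-up data frame with placeholder values
toy <- data.frame(song = c("a", "b"), score = c(1.5, 2.5))

# Write to a temporary CSV file, omitting the row names
out_file <- tempfile(fileext = ".csv")
write.csv(toy, file = out_file, row.names = FALSE)

# Read the file back in to check what was saved
toy2 <- read.csv(out_file)
toy2
```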

Exercise

10 R Markdown

This document was created using an R Markdown file. R Markdown files allow us to create text documents which include blocks of R code, as well as the output and plots that they produce. We can also use LaTeX to write mathematical formulas in R Markdown files. There are many online resources for R Markdown. See, for example: https://shiny.rstudio.com/articles/rm-cheatsheet.html

To open an R Markdown document, click on the New File icon or navigate to New File via File and then select R Markdown.... Click on the Knit icon above the R Markdown file text to produce the output document.