# STU33011 Lab 1

**Source**: https://www.scss.tcd.ie/arthur.white/Teaching/STU33011/Lab1.html
**Parent**: https://www.scss.tcd.ie/arthur.white/Teaching/STU33011.html

Lab sessions will focus on using `R` to perform the
techniques that we have discussed in lectures. In this first session, we
will

- look at `R`;
- learn how it is used; and
- discover what options are available for finding additional
  help.

`R` is a free software environment especially designed for
statistical computing and graphics. It is open source and is regularly
updated with new packages that can perform recently developed
statistical techniques (see e.g. <http://dirk.eddelbuettel.com/cranberries/>).

If `R` has been installed on your PC, open it by selecting
the `Start` menu, going to `All Programs`, and selecting
`R` from the folder of the same name. If `R` has not already been
installed, it is available as a free download from <http://www.r-project.org/>. Click on the “CRAN” option,
choose a “mirror” (for example, Dublin or Bristol), select the
version appropriate for your operating system, and follow the
installer's instructions.

`RStudio` is an integrated development environment (IDE)
for the `R` programming language. It is more user-friendly
than the basic `R` GUI (graphical user interface) and
provides greater functionality. It can be downloaded for free at
`rstudio.com`.

# 1 Manuals

Useful on-line manuals are available at:

- Introduction to `R`: <http://cran.r-project.org/doc/manuals/R-intro.pdf>.
- `R` reference card: <http://cran.r-project.org/doc/contrib/Short-refcard.pdf>.
- `R` mailing list help archive: <http://tolstoy.newcastle.edu.au/R/about.html>.
- Statistics questions on stack exchange (`R` is a common
  tag): <https://stats.stackexchange.com/>
- Additional `R` packages: <http://cran.r-project.org/web/packages/> (More on this
  later).
- For more details on `R Markdown`, see <https://bookdown.org/yihui/rmarkdown/> or <http://rmarkdown.rstudio.com>.
- Cheatsheets for `RStudio`, `R Markdown`,
  Importing Data, and other topics are available at <https://www.rstudio.com/resources/cheatsheets/>.

# 2 Using the R Console

Typing a command after the `>` prompt and pressing the
`enter`/`return` key will cause `R` to
evaluate the command and return the results of its computations.

In the `R` Console, enter:

```
x <- sqrt(25) + 2
```

This simple command tells `R` to calculate \(\sqrt{25}+2\) and to store the solution
under the name `x`. In `RStudio`, the object
`x` should be visible in the `Environment` tab of
the top right pane. The value of `x` can also be inspected by
entering the following:

```
x
```

```
## [1] 7
```
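As a small illustration of how assignment behaves, the sketch below uses two hypothetical names, `a` and `b`, which are not used elsewhere in this lab. It shows that an assignment on its own prints nothing, and that wrapping it in parentheses both stores and prints the value.

```r
# `a` and `b` are hypothetical names used only for illustration.
a <- 3 * 4      # stores 12 under the name a; nothing is printed
(b <- a / 2)    # parentheses store the value and also print it
b^2             # stored values can be reused in later calculations
```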

### Exercise

- Tell `R` to save the value \(e^{2.5}\) under the variable name
  `y`.
- Using the previously saved values of `x` and
  `y`, save the value \(e^{2.5}(\sqrt{25}+2)\) under the variable
  name `z` (note that `*` denotes the multiplication
  symbol).

# 3 Help Commands

In order to access the internal help files in `R`, use
either the `?` command or the `help.search`
function. Enter the following into `R` and observe the
result:

```
?exp 
help.search("exponential")
```

The first of these commands brings up the “Logarithms and
Exponentials” help file, and explains how to use the `exp`
function. In general, given an `R` command or function
`x`, entering `?x` will bring up its help
file.

The second command provides a list of help files in which the term
`exponential` can be found in the concept or title. This can
be very useful when searching for the appropriate `R` command
for performing a given exercise.

In `RStudio`, help files can also be accessed by selecting
the `Help` tab in the bottom right pane on screen and using
the search box.
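Two further base `R` functions can complement the help commands above. The sketch below is only an illustration: `apropos` lists objects whose names match a pattern, and `args` shows the arguments of a function without opening its help file. The search pattern is an arbitrary choice.

```r
# List every visible object whose name starts with "read.csv".
apropos("^read\\.csv")

# Show the arguments of a function without opening its help file.
args(sqrt)
```

Note also that `help("exp")` is equivalent to `?exp`, and `??exponential` is shorthand for `help.search("exponential")`.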

### Exercise

- Find the `R` command for determining the logarithm of 10
  to base 10.
- Find the `R` command for determining the logarithm of 10
  to base 2.

# 4 Vectors and Matrices

Probably the most commonly used command in `R` is the
`c` command, which combines all the arguments it has been
given into an ordered vector. For example:

```
s <- c(1, 2, 3, 4)
s
```

```
## [1] 1 2 3 4
```
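Once a vector has been created, it can be inspected and used in calculations. The following is a minimal sketch using a hypothetical vector `v`, chosen so as not to overwrite any object used in this lab.

```r
# `v` is a hypothetical vector used only for illustration.
v <- c(10, 20, 30)
length(v)   # number of elements
v[2]        # second element (indexing starts at 1)
v * 2       # arithmetic is applied to each element
sum(v)      # total of all elements
```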

### Exercise

- Create a vector `t` with ordered elements 5, 6 and 7,
  respectively.
- Using the previously saved values of `s` and
  `t`, create a vector `u` with ordered elements 1,
  2, 3, 4, 5, 6 and 7, respectively.

To construct a matrix, the `matrix` command is used. Check
this command’s help file to make sure you understand what the following
code does:

```
A <- matrix(c(1, 2, 2, 5), nrow = 2, ncol = 2, byrow = TRUE)
A
```

```
##      [,1] [,2]
## [1,]    1    2
## [2,]    2    5
```
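To see what the `byrow` argument changes, the sketch below fills two matrices with the same values in different orders. The names `M_row` and `M_col` are hypothetical and used only for illustration.

```r
# `M_row` and `M_col` are hypothetical names used only for illustration.
M_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # fill row by row
M_col <- matrix(1:6, nrow = 2, ncol = 3)                 # default: fill column by column
M_row
M_col
dim(M_row)     # number of rows, then number of columns
M_row[2, 1]    # element in row 2, column 1
```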

### Exercise

- Create a matrix `B` such that: \[B = \left(\begin{array}{cc}
  1 & 2\\
  2 & 3\\
  4 & 5
  \end{array}\right)\]

# 5 Re-using Code

One quick method of re-running previously typed code is to use the
\(\uparrow\) and \(\downarrow\) keys in the `R` Console.
Pressing these allows the user to cycle through previous commands that are
saved in the command history. A previous command can then either be run
again as it is, or be quickly altered as needed (e.g. to fix a
typo).

### Exercise

- Use the \(\uparrow\) to edit the
  line that was previously used to save matrix `A` so that the
  term in the second row and second column is a 2 instead of a 5.

An alternative (and generally better) approach is to write your code
as an `R` script.

To open an `R` script in `RStudio`, select
`File`, then `New File`, and
`R Script`. Alternatively, use the keyboard shortcut
`Ctrl`+`Shift`+`N`, or click the
`New File` icon denoted by a blank page with a green circle
containing a white plus sign, which is located below
`File`.

Whereas a line of code in the `R` Console can be run by
pressing the `Enter` key, to run code directly from an
`R` script press `Ctrl`+`Enter` or
click the `Run` (‘Run the current line or selection’) button
in the top right corner of the `R` script tab. This will run
the line on which the cursor is currently located or any code which is
currently highlighted.

`R` scripts can be saved by clicking the floppy disk save
icon in the top left corner of the `R` script tab or by using
the keyboard shortcut `Ctrl`+`s`.

### Exercise

- Open and save an `R` script that tells `R` how
  to create the matrix `B` that you have previously defined.
  Run the command from the script file directly.

# 6 Additional Packages

The base distribution of `R` comes with many commonly used
add-on packages. These implement additional `R` commands that
are often very useful.

In `RStudio`, a package can be loaded by selecting the `Packages` tab of the
bottom right pane and ticking the box to the left of its name.

Alternatively, just use the `library` function, e.g.
`library("cluster")`.
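One way to confirm that a package has been loaded is to check the search path with `search()`. The sketch below uses the `tools` package purely as a stand-in, on the assumption that it is available because it ships with `R` itself; the same check should work for any other package.

```r
# `tools` is a stand-in here because it is part of every R installation.
library("tools")
"package:tools" %in% search()   # TRUE once the package is attached
```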

### Exercise

- Load the `cluster` package and go through the help file
  for its `clusplot` function.

There are many more additional packages available from the CRAN
(Comprehensive R Archive Network) website (<http://cran.r-project.org/web/packages/>). The easiest
way to install a package is to use the `install.packages`
function.
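A common pattern is to install a package only when it is not already available. The following is a minimal sketch under that assumption; `install_if_missing` is a hypothetical helper written for this example and is not part of `R`.

```r
# Hypothetical helper, not part of R: install a package only if it is missing.
install_if_missing <- function(pkg){
  if(!requireNamespace(pkg, quietly = TRUE)){
    install.packages(pkg)   # downloads from the configured CRAN mirror
    return(TRUE)            # an installation was attempted
  }
  FALSE                     # the package was already available
}

# `stats` ships with R, so nothing is downloaded here.
install_if_missing("stats")
```

Calling the helper with the name of a package that is not yet installed would trigger a download, which requires an internet connection.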

Another option is to follow the links on CRAN and directly download
the package of choice. In `RStudio`, the downloaded package
can be installed by clicking the `Install` icon at the top of
the `Packages` tab in the bottom right pane.

### Exercise

- Download the `mclust` package and install it into
  `R`. (Before `R` allows you to load this package,
  it may request you download and install additional packages. These are
  called dependencies.)
- Load the package and examine the help file for its
  `Mclust` function.

# 7 Creating Functions

Many of the functions we have been using were created by an R
contributor, as R is open source. For example, the `mclust`
package and its functions were written by Chris Fraley, Adrian Raftery,
Brendan Murphy, Michael Fop, and Luca Scrucca. It is easy to write your
own functions in R. Enter the following:

```
square <- function(x){ x^2 }

square(3)
```

```
## [1] 9
```

```
square(s)
```

```
## [1]  1  4  9 16
```

```
square(A)
```

```
##      [,1] [,2]
## [1,]    1    4
## [2,]    4   25
```

The first piece of code creates a function called `square`
that returns the squared value of its argument. Note how this function
treats the vector `s` and the matrix `A`:
`square` acts on the individual elements of
its argument. In general, functions are created by assigning a name to
the command `function` followed by the arguments of the
function within parentheses `()`, followed by the commands
performed by the function within curly brackets `{}`.
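Functions can take more than one argument, and arguments can be given default values. The sketch below illustrates this with a hypothetical function `power`, which generalises `square`; the name and the default are choices made for this example only.

```r
# `power` is a hypothetical function used only for illustration.
power <- function(x, p = 2){ x^p }

power(3)            # uses the default p = 2
power(3, p = 3)     # overrides the default
power(c(1, 2, 3))   # acts on each element, like square
```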

# 8 `if` Statements and `for` Loops

Although the speed of such calculations can be poor compared to
lower-level languages, `R` can be used to write `for` loops and
`if` statements. For example:

```
for(i in 1:10){ 
  if(i==1){
    print( paste("The first number is", i) )
    }
  if(i>1){
    print( paste("The next number is", i) )
    }
}
```

```
## [1] "The first number is 1"
## [1] "The next number is 2"
## [1] "The next number is 3"
## [1] "The next number is 4"
## [1] "The next number is 5"
## [1] "The next number is 6"
## [1] "The next number is 7"
## [1] "The next number is 8"
## [1] "The next number is 9"
## [1] "The next number is 10"
```

The syntax here is straightforward. A `for` loop is performed by
entering the command `for` followed by the index for the loop
and its range within parentheses `()`, followed in curly brackets
`{}` by the commands to be performed during each iteration of
the index:

```
for(i in 1:10){
  print(i)
}
```

```
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
```
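A loop is often used to build up a result across iterations. As a minimal sketch, the following accumulates a running total; `total` is a hypothetical name used only for illustration.

```r
# `total` is a hypothetical name used only for illustration.
total <- 0
for(i in 1:5){
  total <- total + i   # add the current index to the running total
}
total
```

The same result is given by `sum(1:5)`, which is usually the faster option in `R`.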

The `if` command is followed by a logically testable
statement given in parentheses `()`, followed in curly brackets
`{}` by the commands to be performed if the logical statement
is true:

```
if(2>0){  print("yes") }
```

```
## [1] "yes"
```

```
if(2==0){ print("yes") }
```

The second command prints nothing because the condition `2==0` is
false. If a different command is to be performed when the initial condition is
not met, an `if` statement can be followed by an
`else` clause:

```
toy1 <- -2

if(toy1 > 0){
  print("yes")
} else{
  print("no")
}
```

```
## [1] "no"
```
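When more than two cases need to be distinguished, conditions can be chained with `else if`. The following is a minimal sketch; `toy2` and `label` are hypothetical names used only for illustration.

```r
# `toy2` and `label` are hypothetical names used only for illustration.
toy2 <- 0

if(toy2 > 0){
  label <- "positive"
} else if(toy2 < 0){
  label <- "negative"
} else{
  label <- "zero"
}
label
```

When code is run line by line at the top level, `else` needs to sit on the same line as the closing `}` of the preceding block, as in the examples above.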

### Exercise

- Write a function that takes as its argument a vector and that
  returns the sum of the elements if the vector is of length 10, returns
  the square of each element if its length is less than 10, or provides
  the list of differences between adjacent elements if its length is
  greater than 10 (hint: the `length` function will be useful
  here. You could also make use of the `square` function you
  have already defined, if you wanted.)

# 9 Importing/Exporting Data

`R` can import common data formats such as text files,
Excel files, SPSS files, etc.

### Exercise

- Go to <https://www.scss.tcd.ie/~arwhite/Teaching/STU33011.html>
  and download the data file “music.csv”, saving it to a suitable
  directory.

This file is a comma-delimited (CSV) text file, which can be read
into `R` via the following (where the dots represent the file
path): `music <- read.csv("C:\\...\\music.csv")`.

Another approach, instead of using the full file path, is to change
the `R` working directory to the folder which contains the
file to be imported and then only use the file name itself.

In `RStudio`, select `Session`, then
`Set Working Directory`. If the file to be imported is in
the same folder as the current `R` script, then select
`To Source File Location`. Otherwise, select
`Choose Directory...` and browse to find the correct
folder.

Alternatively, you can use the `setwd` function to change
the working directory. The `getwd` function allows you to
check the current working directory.

Then you can enter:

```
music <- read.csv("music.csv")
```
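The `getwd` and `setwd` functions mentioned above can be sketched as follows. The temporary directory returned by `tempdir()` is used here only as a stand-in for the folder containing your data, and `old_dir` is a hypothetical name.

```r
old_dir <- getwd()     # remember the current working directory
setwd(tempdir())       # switch to R's temporary directory (a stand-in folder)
getwd()                # confirm the change
setwd(old_dir)         # switch back to the original directory
```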

Note that you can also import data directly using the url, e.g.

```
music <- read.csv("https://www.scss.tcd.ie/~arwhite/Teaching/STU33011/music.csv")
```

Additionally, in `RStudio`, you can click
`Import Dataset` in the `Environment` tab of the
top right pane or browse through the computer’s files in the
`Files` tab of the bottom right pane, click on the file, and
choose `Import Dataset...`.

We have loaded a data frame named `music` into
`R`. In `RStudio`, this data frame should be
visible in the `Environment` tab in the top right pane. Once
the file has been read in, the first 10 rows of `music` can be checked by
entering:

```
head(music, 10)
```

```
##                 X Artist Type     LVar      LAve  LMax   LFEner     LFreq
## 1   Dancing Queen   Abba Rock 17600756 -90.00687 29921 105.9210  59.57379
## 2      Knowing Me   Abba Rock  9543021 -75.76672 27626 102.8362  58.48031
## 3   Take a Chance   Abba Rock  9049482 -98.06292 26372 102.3249 124.59397
## 4       Mamma Mia   Abba Rock  7557437 -90.47106 28898 101.6165  48.76513
## 5     Lay All You   Abba Rock  6282286 -88.95263 27940 100.3008  74.02039
## 6   Super Trouper   Abba Rock  4665867 -69.02084 25531 100.2485  81.40140
## 7  I Have A Dream   Abba Rock  3369670 -71.68288 14699 104.5969 305.18689
## 8      The Winner   Abba Rock  1135862 -67.81905  8928 104.3492 277.66056
## 9           Money   Abba Rock  6146943 -76.28075 22962 102.2407 165.15799
## 10            SOS   Abba Rock  3482882 -74.13000 15517 104.3624 146.73700
```

`music` consists of 8 columns and 62 observations. To
determine the dimensions of `music` run the command:

```
dim(music)
```

```
## [1] 62  8
```
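A few other base `R` commands are useful for inspecting a data frame. So that it runs without the downloaded file, the sketch below builds a small hypothetical data frame, `toy_df`, from three columns of the first three rows printed above; the same commands should apply to `music`.

```r
# `toy_df` is a hypothetical data frame copied from the first three rows shown above.
toy_df <- data.frame(
  Artist = c("Abba", "Abba", "Abba"),
  Type   = c("Rock", "Rock", "Rock"),
  LMax   = c(29921, 27626, 26372)
)
names(toy_df)      # column names
nrow(toy_df)       # number of rows
toy_df$LMax        # one column, extracted as a vector
toy_df[2, ]        # the second row
max(toy_df$LMax)   # largest value in the column
```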

To save the `music` data frame, i.e. to write the data
frame to a new file, the following could be used:

```
write.table(music, file = "C:\\...\\music2.csv", sep = ",")
```

Or if the working directory has been changed then you can simply
enter:

```
write.table(music, file = "music2.csv", sep = ",")
```

In the above, the `music` argument provides the name of
the data frame to be saved, the second (`file`) argument
specifies the file path to which you wish the data to be saved, and the
final (`sep`) argument tells `R` to save the file
in a comma delimited format.
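To check that a saved file can be read back, the following minimal sketch writes a small hypothetical data frame to a temporary file and re-imports it. The names `toy_df`, `out_file` and `toy_back` are assumptions for this example. The `row.names = FALSE` argument is an addition not used in the commands above: by default, `write.table` also writes the row names as an extra first column.

```r
# `toy_df`, `out_file` and `toy_back` are hypothetical names used only for illustration.
toy_df <- data.frame(Artist = c("Abba", "Abba"), LMax = c(29921, 27626))
out_file <- tempfile(fileext = ".csv")   # a temporary file path

# Write the data frame as comma-delimited text, without row names.
write.table(toy_df, file = out_file, sep = ",", row.names = FALSE)

# Read the file back in and inspect it.
toy_back <- read.csv(out_file)
toy_back
```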

### Exercise

- Save an `R` script that tells `R` how to
  import the `music` dataset. Check your script by running the
  command from the script file directly.

# 10 R Markdown

This document was created using an `R Markdown` file.
`R Markdown` files allow us to create text documents which
include blocks of `R` code, as well as the output and plots
that they produce. We can also use LaTeX to write mathematical formulas
in `R Markdown` files. There are many online resources
for `R Markdown`; see, for example, <https://shiny.rstudio.com/articles/rm-cheatsheet.html>.

To open an `R Markdown` document, click on the
`New File` icon or navigate to `New File` via
`File` and then select `R Markdown...`. Click on
the `Knit` icon above the `R Markdown` file text
to produce the output document.