R is a very useful tool for data analysis and has many good libraries to help you visualise your data. In this session I will go over several ways you can do this with a selection of packages.

For demonstration purposes we will be using the iris dataset, which is available within R by default:

data(iris)
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    :50  
##  versicolor:50  
##  virginica :50  
##                 
##                 
## 

Base R visualisations

Base R includes plotting functions such as plot() and hist(), so you can produce charts without installing any extra packages.
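As a minimal sketch of a base R chart, the block below draws a scatter plot and a histogram from the iris data. The colour choices, axis labels and titles are illustrative assumptions rather than part of the original session.

```r
# iris ships with R; data(iris) just makes the load explicit
data(iris)

# Map each species to a colour (the colours are arbitrary choices)
species_cols <- c(
  setosa     = "steelblue",
  versicolor = "darkorange",
  virginica  = "forestgreen"
)
point_cols <- species_cols[as.character(iris$Species)]

# Scatter plot of sepal length against petal length
plot(
  x    = iris$Sepal.Length,
  y    = iris$Petal.Length,
  col  = point_cols,
  pch  = 19,
  xlab = "Sepal length",
  ylab = "Petal length",
  main = "Iris: sepal length vs petal length"
)
legend("topleft", legend = names(species_cols), col = species_cols, pch = 19)

# Histogram of a single numeric column
hist(iris$Sepal.Length, xlab = "Sepal length", main = "Distribution of sepal length")
```

Base R plots are built by calling one function to draw the chart and further functions, such as legend(), to add to it.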

ggplot2 package

For more detailed plots, you can use the ggplot2 package. A ggplot chart has three main components:

  • data: This is the dataframe you wish to visualise.

  • aes: This represents the various aesthetics of the chart. For now we will just state what is to go on the x and y axes.

  • geom: This specifies the type of graph you wish to produce. For now we will use geom_point for a scatter chart.

library(ggplot2)
ggplot(
  data = iris, 
  aes(
    x=Sepal.Length,
    y=Petal.Length
    )
  )+
  geom_point()

We can now add the color argument to the aesthetics to colour the points by Species:

ggplot(
  data = iris, 
  aes(
    x=Sepal.Length,
    y=Petal.Length,
    color = Species
    )
  )+
  geom_point()
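The colour mapping does not have to live in the main ggplot() call. As a sketch of an alternative, the block below places it inside geom_point() instead, so that it applies to that layer only. For a chart with a single layer the result should look the same as the chart above.

```r
library(ggplot2)

# x and y are shared by every layer; colour is mapped in the point layer only
p <- ggplot(
  data = iris,
  aes(
    x = Sepal.Length,
    y = Petal.Length
  )
) +
  geom_point(aes(color = Species))

p
```

Mapping an aesthetic at the layer level is mainly useful once a chart has several geoms and only some of them should be coloured by group.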

You can also facet the data across separate plots using the facet_grid() function. You can do this either by row or by column.

ggplot(
  data = iris, 
  aes(
    x=Sepal.Length,
    y=Petal.Length,
    color = Species
    )
  )+
  geom_point()+
  facet_grid(rows = vars(Species))

ggplot(
  data = iris, 
  aes(
    x=Sepal.Length,
    y=Petal.Length,
    color = Species
    )
  )+
  geom_point()+
  facet_grid(cols = vars(Species))

Formatting ggplot charts

ggplot2 can use many preset ColorBrewer colour palettes for your charts. You can access these using scale_color_brewer().

ggplot(
  data = iris, 
  aes(
    x=Sepal.Length,
    y=Petal.Length,
    color = Species
    )
  )+
  geom_point()+
  facet_grid(cols = vars(Species))+
  scale_color_brewer(palette="Dark2")
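To see which names can be passed to the palette argument, one option is to look them up in the RColorBrewer package, which supplies these palettes. This sketch assumes RColorBrewer is installed (it normally is, as a dependency of ggplot2's scales package).

```r
library(RColorBrewer)

# brewer.pal.info is a data frame with one row per palette;
# the row names are the palette names
palette_info <- brewer.pal.info

# Qualitative palettes suit unordered categories such as Species
qualitative <- rownames(palette_info[palette_info$category == "qual", ])
print(qualitative)

# Show every palette as a colour strip in the plot window
display.brewer.all()
```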

You can also change the overall look of a chart with a preset theme, for example from the ggthemes package:

library(ggthemes)
ggplot(
  data = iris, 
  aes(
    x=Sepal.Length,
    y=Petal.Length,
    color = Species
    )
  )+
  geom_point()+
  facet_grid(cols = vars(Species))+
  theme_tufte()
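If you would rather not install an extra package, ggplot2 also ships with a few themes of its own, such as theme_minimal(), theme_bw() and theme_classic(). A minimal sketch using theme_minimal() on the same chart:

```r
library(ggplot2)

p <- ggplot(
  data = iris,
  aes(
    x = Sepal.Length,
    y = Petal.Length,
    color = Species
  )
) +
  geom_point() +
  facet_grid(cols = vars(Species)) +
  theme_minimal()  # built into ggplot2, no ggthemes needed

p
```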

plotly Package

For interactive charts, you can use the plotly package. This is a widely used package in both Python and R and can be used to create interactive charts that you can embed in websites.

The syntax is slightly different to that of ggplot. Firstly, when calling on columns from a dataframe, the column name needs to be preceded with ‘~’.

library(plotly)
plot_ly(
  data = iris,
  x = ~Sepal.Length, 
  y = ~Petal.Length,
  type = 'scatter',
  mode = "markers"
  ) 

As with ggplot, you can colour and customise your chart, and there is no need for the aes argument.

plot_ly(
  data = iris,
  x = ~Sepal.Length, 
  y = ~Petal.Length,
  type = 'scatter',
  mode = "markers",
  color = ~Species)

You can go further and customise the shapes and symbols used in your visualisations too:

plot_ly(
  data = iris, 
  x = ~Sepal.Length, 
  y = ~Petal.Length, 
  type = 'scatter',
  mode = 'markers', 
  symbol = ~Species, 
  symbols = c('circle','x','o'),
  color = I('black'), 
  marker = list(size = 10))

When adding more attributes to an object, instead of using the plus symbol, you use the "pipe" operator %>%. It comes from the magrittr package and is also made available by dplyr and plotly. You can type it quickly using Ctrl + Shift + M within RStudio.
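As a sketch of how the pipe is used with plotly, the block below starts from plot_ly(), adds a marker layer with add_markers(), and then sets a title and axis labels with layout(). The title and label text are illustrative assumptions.

```r
library(plotly)

# Each step receives the chart from the previous step via %>%
p <- plot_ly(
  data = iris,
  x = ~Sepal.Length,
  y = ~Petal.Length
) %>%
  add_markers(color = ~Species) %>%
  layout(
    title = "Iris: sepal length vs petal length",
    xaxis = list(title = "Sepal length"),
    yaxis = list(title = "Petal length")
  )

p
```

This plays the same role as chaining layers with + in ggplot2: each piped function takes the chart so far and returns a modified version.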

trelliscope Package