Hello folks, today I’ll share with you my experience using R over the past few weeks. The problem was that I had to analyze several time series gathered from a sensor’s output. I needed to load the data and see how it behaved A.S.A.P.! I talked with my advisor and his recommendation was: start with R. You can follow his advice too and download R.
What is R?
So first things first: according to R’s official site, this tool is a complete environment that includes a programming language, a high-quality graph generator, and a statistical and machine learning back end. One more attribute of R is its extensibility: it can link to programs written in C, C++ or even Fortran. I read a little more and it turns out R has amazing capabilities for sharing data with data platforms and database management systems like Hadoop, PostgreSQL or MySQL. R also has a fast-growing community that offers a ton of packages (collections of functions) for any kind of data analysis you want to do.
Who uses R?
The answer is pretty simple: people who want to build and replicate statistical models, create state-of-the-art graphs and analyze data sets in a matrix-like fashion. Sometimes you don’t have time to create a complete tool for every problem you are solving in a general-purpose programming language like C, Java or Python. Instead you could look for an integrated environment, as I did with R.
A lot of content has been published about R and how to solve things with it. There is a cool tutorial for getting started with R and using machine learning techniques. The following instructions are a bunch of things that I would have liked to see together in one post when I first started using R.
1. How to create a variable.
You can use either the “=” or the “<-” assignment operator to put a value in a variable; no type declaration is needed. For example:
a = 3
print(a)
a <- "Hello"
print(a)
The first call prints 3 and the second prints “Hello” (R shows each value after a [1] index prefix). As you can see, there is no explicit declaration or casting operation in the code to manage the different data types.
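To check which type R inferred after each assignment, the built-in class function can be used. This is only a minimal sketch; the names type_before and type_after are hypothetical and exist just for the illustration.

```r
# Assign a number, then inspect the type R inferred
a <- 3
type_before <- class(a)
print(type_before)   # "numeric"

# Reassign the same name to a string; no declaration or cast is needed
a <- "Hello"
type_after <- class(a)
print(type_after)    # "character"
```

The same name holds a number first and a string afterwards, which is what the missing type declarations make possible.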
2. How to create an array.
Use the function c to create a vector; the function’s arguments are the vector’s elements. The function arranges the content as an array, so you can treat the elements as a single object and apply a scalar operation to every element of the array in one statement. For example:
vector = c(2, 4, 6, 8)
vector1 = vector + 1
vector2 = vector - 2
vector3 = vector * 3
# ^ is the usual power operator in R; ** is accepted as an alias
vector4 = vector ^ 2
vector5 = vector / 2
print(vector)
print(vector1)
print(vector2)
print(vector3)
print(vector4)
print(vector5)
The code above prints the following vectors, each one after a [1] index prefix:
- 2 4 6 8
- 3 5 7 9
- 0 2 4 6
- 6 12 18 24
- 4 16 36 64
- 1 2 3 4
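The same element-wise behavior could also be written as an explicit loop. The sketch below compares both forms to show what the single statement does; the names tripled_vectorized and tripled_loop are hypothetical, chosen only for this illustration.

```r
vector <- c(2, 4, 6, 8)

# Vectorized form: one statement multiplies every element
tripled_vectorized <- vector * 3

# Equivalent explicit loop, shown only for comparison
tripled_loop <- numeric(length(vector))
for (i in seq_along(vector)) {
  tripled_loop[i] <- vector[i] * 3
}

print(tripled_vectorized)
print(identical(tripled_vectorized, tripled_loop))  # TRUE
```

Both forms give the same result, but the vectorized one is shorter and is the style most R code uses.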
3. How to read values from a file into a matrix and add new columns to the matrix
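The next set of instructions is no joke: it really is that easy to load data. Note that read.table returns a data frame, R’s table structure, which this post loosely calls a matrix. When the file has no header row, R names the columns V1, V2 and so on. After loading the data, create a new column called newColumn to hold the result of the first column plus 2.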
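# sep = "," assumes a comma-separated file with no header row;
# by default read.table splits fields on whitespace
data = read.table("yourCSVFile.csv", sep = ",")
# Adding a new column to the matrix
data$newColumn = data$V1 + 2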
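To try this without a data file at hand, the sketch below first writes a tiny comma-separated file to a temporary location and then reads it back. The file contents are made up for illustration, and sep = "," assumes the file is comma-separated with no header row.

```r
# Create a small sample file (made-up values, illustration only)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("1,10", "2,20", "3,30"), csv_path)

# Without a header row, read.table names the columns V1, V2, ...
data <- read.table(csv_path, sep = ",")

# Add a new column holding the first column plus 2
data$newColumn <- data$V1 + 2
print(data)

# Remove the temporary file
unlink(csv_path)
```

If your own file does have a header row, passing header = TRUE to read.table keeps the column names from the file instead of V1, V2 and so on.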
4. How to know what you are dealing with, and a basic data plot
The function to inspect the matrix content is head; it outputs the column names and the first six rows of the matrix. For a basic data plot, use the function plot with the data, plot type and line color as arguments, respectively. If no line color is given, the default line color is black. To overplot another graph, use the function lines with the same kind of arguments. For example:
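# sep = "," assumes a comma-separated file with no header row
data = read.table("yourCSVFile.csv", sep = ",")
# Print the matrix content
head(data)
# Adding a new column to the matrix
data$newColumn = data$V1 + 2
# Plot the first column as a black line
plot(data$V1, type="l")
# Overplot the new column as a red line
lines(data$newColumn, type="l", col="red")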
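One thing to watch with this pattern: plot sets the y-axis range from the first series only, so a shifted second series can be partly cut off. A possible workaround is to pass ylim covering both series. The sketch uses made-up data and draws to a temporary PDF file so that it runs without a display; the names plot_path and y_range are hypothetical.

```r
# Made-up data standing in for the file's first column
data <- data.frame(V1 = c(1, 3, 2, 5, 4))
data$newColumn <- data$V1 + 2

# Show the column names and the first rows
print(head(data))

# Draw to a temporary PDF file instead of a window
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)

# ylim spans both series so the red line is not clipped
y_range <- range(data$V1, data$newColumn)
plot(data$V1, type = "l", ylim = y_range)
lines(data$newColumn, type = "l", col = "red")

# Close the device so the file is written to disk
invisible(dev.off())
```

In an interactive session the pdf and dev.off calls can be dropped and the plot appears in a window instead.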
Well, that’s all for today, folks! I hope you enjoyed the post. Please leave any questions or comments below. See you in the next post!