Hi everyone, in my previous post we got serious and reviewed the kinds of skills a good programmer should have in their arsenal of knowledge and abilities. In this post we'll focus on continuous learning, but through play. 😃
With so many learning resources available today, such as:
- MOOCs (Massive Open Online Courses)
- Degrees and Nanodegrees (I'll talk about these last two in a future post).
one in particular stands out: games. They help you reinforce concepts in programming, artificial intelligence (AI), or tools, they try to make the process fun, and they encourage you to learn by "getting your hands dirty."
I'm going to share three games that I have used, and am still using, on three different platforms and tools: VIM, R, and Ruby.
Hello everybody, this is my first post here.
One of my first assignments was to make a word cloud using R. At the beginning I had no idea what R was, so I had to research it. Many pages explain how to do this, but I tried to bring them together into something more focused and different.
The first thing you need is, obviously, R. You can download it from its official page (R Project); it's lightweight and easy to configure.
The second step (and maybe the most important) is to get your data. At this point you have to decide where your data will come from: a known database, a web page, or even a file. In our case, we'll use a link to a New York Times article to generate the word cloud.
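Whatever the data source, a word cloud is driven by word frequencies. As a rough illustration (in Python rather than R, and not the code from the original post), the counting step might look like the sketch below. The stop-word list, the function name `word_frequencies`, and the sample text are all hypothetical stand-ins; in practice the text would come from the downloaded article.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

def word_frequencies(text, top_n=10):
    """Return the top_n most common non-stop words with their counts."""
    # Lowercase the text and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Stand-in text; a real run would use the article body instead.
sample = "The cat and the hat. The cat sat in the hat, and the cat slept."
top_words = word_frequencies(sample, top_n=3)
print(top_words)
```

The resulting (word, count) pairs are what a word-cloud renderer would scale font sizes by.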
Time Series and PAA
Time series processing has become an active area in recent years. Although storage capacity is rarely a problem today, we still have to deal with large datasets in areas such as medicine, business, and education, among others.
To reduce the computational cost and the dimensionality of time series, PAA (Piecewise Aggregate Approximation) is a widely used technique.
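Given a time series $C = c_1, \dots, c_n$ of length n, it can be represented in a w-dimensional space by a vector $\bar{C} = \bar{c}_1, \dots, \bar{c}_w$, where the i-th element is given by:

$$\bar{c}_i = \frac{w}{n} \sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w}\,i} c_j$$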
Simply stated, to reduce the time series from n dimensions to w dimensions, the data is divided into w equal-sized "frames". The mean value of the data falling within each frame is calculated, and the vector of these means becomes the data-reduced representation.
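The frame-averaging step described above can be sketched in a few lines. This is a minimal Python illustration, not the code from the original post; the function name `paa` and the sample series are made up, and the sketch assumes n is an exact multiple of w so that every frame has the same length.

```python
def paa(series, w):
    """Reduce `series` (length n) to w frame means.

    Assumption for this sketch: n is a positive multiple of w.
    """
    n = len(series)
    if w <= 0 or n % w != 0:
        raise ValueError("this sketch requires n to be a positive multiple of w")
    frame = n // w  # number of points per frame
    # Mean of each consecutive, non-overlapping frame.
    return [sum(series[i * frame:(i + 1) * frame]) / frame for i in range(w)]

c = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0, 7.0, 9.0]  # n = 8
reduced = paa(c, 4)                              # w = 4, two points per frame
print(reduced)  # [2.0, 3.0, 11.0, 8.0]
```

When n is not divisible by w, implementations typically weight the points that straddle a frame boundary; that case is left out here to keep the sketch short.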
Decision trees (DT) are predictive models that classify an item or a set of items into different classes using linear boundaries. Their distinctive feature is that each split tests a single variable, so in a two-dimensional plot only lines parallel to the x or y axis may be drawn.
In this post I'm showing you how to train, validate, and use a DT to classify objects in R, using the "cars" dataset that ships with every R installation.
The cars set gives the speed of 50 cars and the distances taken to stop. Initially this set looks as follows.
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
In order to classify objects, we'll assume that roughly the first half of the dataset consists of "TRUE" objects, while the second half is labeled as "FALSE" objects.
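To make the axis-parallel idea concrete, here is a small Python sketch (not the R code from the original post) of a depth-one tree, often called a decision stump. It uses only the six rows shown above, applies the same labeling assumption (first half TRUE, second half FALSE), and searches for the single split of the form "feature <= threshold means TRUE" with the fewest errors. The function name `best_stump` is hypothetical, and a full decision tree would apply such splits recursively and use an impurity measure rather than a raw error count.

```python
# First six rows of the cars data as shown in the post: (speed, dist).
rows = [(4, 2), (4, 10), (7, 4), (7, 22), (8, 16), (9, 10)]
# Labeling assumption from the post: first half TRUE, second half FALSE.
labels = [i < len(rows) // 2 for i in range(len(rows))]

def best_stump(rows, labels):
    """Find the split `feature <= threshold -> TRUE` with the fewest errors.

    Only this one polarity is tried, to keep the sketch short.
    Returns (errors, feature_index, threshold); ties keep the first split found.
    """
    best = None
    for feature in range(len(rows[0])):
        # Candidate thresholds are the distinct observed values of the feature.
        for threshold in sorted({r[feature] for r in rows}):
            predictions = [r[feature] <= threshold for r in rows]
            errors = sum(p != y for p, y in zip(predictions, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    return best

errors, feature, threshold = best_stump(rows, labels)
name = ("speed", "dist")[feature]
print(f"predict TRUE when {name} <= {threshold} ({errors} of {len(rows)} misclassified)")
```

Each candidate split compares one variable against a constant, which is exactly why the resulting boundary is a line parallel to one of the axes.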
Classification is a commonplace problem nowadays. I have lately been working on classification problems for my job. I used Matlab, Python, and some lower-level languages for classification, but the resulting approaches were tedious and rather hard to work with.
R has a dedicated library for classification with neural networks (NN), called "neuralnet", which can be installed directly using: install.packages("neuralnet").
To work through an example in this post, we need a data set. The Iris data set will be enough; it is perhaps the best-known database in the pattern recognition literature. We can download it from HERE.
The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. Based on the combination of these four features...
Hello folks, today I'll share with you my experience using R over the past few weeks. The problem was that I had to analyze different time series obtained from sensor output. I needed to load the data and see its behavior ASAP! I talked with my advisor and his recommendation was: start with R. You can follow his advice too and download R.
What is R?
So, first things first: according to R's official site, this tool is a complete environment that includes a programming language, a high-quality graphics generator, and a statistical and machine learning back end. Another attribute of R is its extensibility: it can link to programs written in C, C++, or even Fortran. I read a little more, and it turns out R has amazing capabilities for sharing data with systems such as Hadoop, PostgreSQL, or MySQL. R also has a fast-growing community with a ton of functions (or packages) for any kind of data analysis you want to do.