Showing posts with label LondonR. Show all posts
Test Driven Analysis
I have mused over Test Driven Analysis on this blog before, but it was Richard Pugh's talk on SAS to R Migration at LondonR last week that brought the topic back to mind and clarified a few things.
Rich's presentation focused on the challenge of how to ensure that the new system (R) would provide the same answers as the legacy system (SAS).
This is when it clicked with me: My brain is just another system as well. Suppose you have an idea for an analysis in your head. Taking that idea and transforming it into code is basically the same as migrating code from one system to another. Or isn't it?
Rich showed us how he does it: Start with the old code, write unit tests in the legacy system to confirm your understanding, re-write the unit tests in the new system and then start building the new analysis code in the new system.
Once he had achieved that, he said, he would go backwards and forwards between the different pieces until he had enough confidence that the new system does what it is supposed to do.
Test Driven Analysis is just that as well.
I start with an idea in my head, think about reasonable checks and following that I (should) write down unit tests and only then start writing the analysis code. Finally I go backwards and forwards until I have gained enough evidence and confidence to present my output and be able to defend it.
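In R, that first step can be as simple as a unit test written before the analysis code exists. Here is a minimal sketch using the testthat package; the function name and the figures are purely illustrative, not from any real migration:

```r
library(testthat)  # assumes the testthat package is installed

## Hypothetical check, written before any analysis code exists:
## the new R code should reproduce a figure we already trust,
## e.g. a total from the legacy system (values are illustrative).
mean_claim <- function(df) mean(df$amount)

test_that("new code reproduces the known legacy figure", {
  legacy <- data.frame(amount = c(100, 200, 300))
  expect_equal(mean_claim(legacy), 200)
})
```

The point is not the trivial function, but that the expected answer was written down first.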
7 Apr 2015, 08:00. Labels: LondonR, R, SAS, Test Driven Analysis
Interactive pivot tables with R
I love interactive pivot tables. That is the number one reason why I keep using spreadsheet software. The ability to look at data quickly in lots of different ways, without a single line of code helps me to get an understanding of the data really fast.
Perhaps I can do the same now in R as well. At yesterday's LondonR meeting Enzo Martoglio briefly presented his rpivotTable package. Enzo builds on Nicolas Kruchten's PivotTable.js JavaScript library, which provides drag'n'drop functionality, and wraps it with htmlwidgets into R. The result is an interactive pivot table, rendered in either your default browser or the viewer pane of RStudio, with one line of code:
## Install packages
library(devtools)
install_github("ramnathv/htmlwidgets")
install_github("smartinsightsfromdata/rpivotTable")
## Load rpivotTable
library(rpivotTable)
data(mtcars)
## One line to create pivot table
rpivotTable(mtcars, rows="gear", col="cyl", aggregatorName="Average",
            vals="mpg", rendererName="Treemap")
The following animated Gif from Nicolas' project page gives an idea of the interactive functionality of PivotTable.js.
Example of PivotTable.js. Source: Nicolas Kruchten
Session Info
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.2 (Yosemite)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils
[5] datasets methods base
other attached packages:
[1] rpivotTable_0.1.3.4
loaded via a namespace (and not attached):
[1] digest_0.6.8 htmltools_0.2.6
[3] htmlwidgets_0.3.2 RJSONIO_1.3-0
[5] tools_3.1.3 yaml_2.1.13
31 Mar 2015, 07:59. Labels: LondonR, News, PivotTable, R
Using panel.groups in lattice
Last Tuesday I attended the LondonR user group meeting, where Rich and Andy from Mango debated which package is better for multivariate graphics in R: lattice or ggplot2.
As part of their talk they had a little competition in visualising London Underground performance data, see their slides. Both made heavy use of the respective panelling / faceting capabilities. Additionally Rich used the
panel.groups argument of xyplot to finely control the content of each panel. Brilliant! I had never used this argument before. So, here is a silly example with the iris data set to remind myself of panel.groups in the future.
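Something along these lines (a sketch, not the code from the slides): panel.groups defines what is drawn for each group within a panel, here the points of each species plus its own regression line:

```r
library(lattice)  # ships with R

## Sketch of panel.groups with the iris data: within each panel,
## draw the points of each group plus its own least-squares line.
xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris,
       groups = Species,
       panel = panel.superpose,
       panel.groups = function(x, y, ...) {
         panel.xyplot(x, y, ...)   # scatter points for this group
         panel.lmline(x, y, ...)   # regression line for this group
       })
```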
17 Sept 2013, 07:02. Labels: lattice, LondonR, panel.groups, R
There is definitely R in July
The useR!2013 conference in Albacete, Spain, will commence next Wednesday, 10 July, and on the day before Diego and I will give a googleVis tutorial.
The following Monday, 15 July, the first R in Insurance event will take place at Cass Business School and I am absolutely delighted with the programme and the fact that we are sold out.
On Tuesday, 16 July, the LondonR user group meets in the City, awaiting presentations by Andrie de Vries (Revolution Analytics), Rich Pugh (Mango Solutions) and Hadley Wickham (RStudio).
Finally on Friday, 19 July, the next Cologne R user group meeting is scheduled with two talks: Predicting the Euro/Dollar exchange rates with Twitter (Dietmar Janetzko) and Networks in R using igraph (Afshin Sadeghi).
2 Jul 2013, 06:32. Labels: googleVis, Koelner R User, LondonR, News, R, R in Insurance
Test Driven Analysis?
At the last LondonR meeting Francine Bennett from Mastodon C shared some of her experience and findings from an analysis of a large prescriptions data set of the UK's national health service (NHS). However, it was her last slide, which I found the most thought provoking. It asked for the definition of the following term:
Test-driven analysis?
Francine explained that test driven development (TDD) is a concept often used in software development for quality assurance, and she wondered if a similar approach could also be used for data analysis. Unfortunately the audience couldn't provide her with an answer, but many expressed that they face similar challenges. So do I.
Indeed, how do I go about test driven analysis? How do I know that I haven't made a mistake, when I start an analysis of a new data set? Well, I don't. But I try to mitigate risks. Similar to TDD, I consider which outputs I should expect from my analysis. Those outputs form the test scenarios of my analysis. Basically I try to write down everything I know, before I start working with the data, e.g.
- any other data sets or reports I can use for cross referencing,
- any back-of-the-envelope analysis I can carry out to provide ballpark answers,
- any relativities and ratios which should hold true,
- any known boundaries and thresholds,
- test scenarios for my code with small well known data, for which I know the outcome,
- names of experts, who could sense check and peer review my output.
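Some of these notes can go straight into code. A sketch of what such pre-analysis checks might look like in R; the column names and the cost-per-item ballpark are made up for illustration, not taken from the NHS data set:

```r
## Sketch of pre-analysis sanity checks; column names and the
## cost-per-item ballpark are illustrative, not from the NHS data.
check_prescriptions <- function(df) {
  stopifnot(
    all(c("practice", "items", "cost") %in% names(df)),  # expected columns
    all(df$items > 0),              # prescription counts are positive
    all(df$cost >= 0),              # spend cannot be negative
    mean(df$cost / df$items) < 100  # ballpark: average cost per item
  )
  invisible(TRUE)
}

## Small, well known test data for which I know the outcome
toy <- data.frame(practice = c("A", "B"), items = c(10, 5), cost = c(50, 20))
check_prescriptions(toy)
```

Running the real data through the same function later either passes quietly or flags exactly which expectation was violated.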
16 Apr 2013, 07:52. Labels: Analysis, LondonR, R, Soapbox, Test Driven Analysis
Dynamical systems in R with simecol
This evening I will talk about Dynamical systems in R with simecol at the LondonR meeting.
Thanks to the work by Thomas Petzoldt, Karsten Rinke, Karline Soetaert and R. Woodrow Setzer it is really straightforward to model and analyse dynamical systems in R with their deSolve and simecol packages.
I will give a brief overview of the functionality using a predator-prey model as an example.
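To give a flavour, here is a minimal Lotka-Volterra predator-prey model as a simecol odeModel; the parameter values are illustrative, not necessarily those from the talk:

```r
library(simecol)  # assumes the simecol package is installed

## Minimal Lotka-Volterra predator-prey model as a simecol odeModel;
## parameter values are illustrative.
lv <- new("odeModel",
  main = function(time, init, parms) {
    with(as.list(c(init, parms)), {
      dprey <- k1 * prey - k2 * prey * pred  # prey growth minus predation
      dpred <- k2 * prey * pred - k3 * pred  # predator growth minus death
      list(c(dprey, dpred))
    })
  },
  parms  = c(k1 = 0.2, k2 = 0.2, k3 = 0.2),
  times  = seq(0, 100, by = 0.5),
  init   = c(prey = 0.5, pred = 1),
  solver = "lsoda"
)
lv <- sim(lv)
plot(lv)
```

The nice thing about the odeModel class is that equations, parameters, initial values and solver travel together in one object.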
This is of course a repeat of my presentation given at the Köln R user group meeting in March.
For a further example of a dynamical system with simecol see my post about the Hodgkin-Huxley model, which describes the action potential of a giant squid axon.
I shouldn't forget to mention the other talks tonight as well:
- Writing R for Dummies - Andrie De Vries
- News from data.table 1.6, 1.7 and 1.8 - Matthew Dowle
- Converting S Plus Applications into R - Andy Nicholls (postponed to 18 September 2012)
19 Jun 2012, 06:23. Labels: Dynamical Systems, LondonR, Lotka-Volterra, predator-prey, Presentations, R, simecol, Tutorials
LondonR, 6 December 2011
The London R user group met again last Wednesday at the Shooting Star pub. And it was busy. More than 80 people had turned up. Was it the free beer and food, sponsored by Mango, which attracted the folks, or the speakers? Or the venue? James Long, who organises the Chicago R user group meetings and who gave the first talk that night, noted that to his knowledge only the London and Chicago R users would meet in a pub.
However, it was the speakers and their talks that attracted me:
- James Long: Easy Parallel Stochastic Simulations using Amazon's EC2 & Segue
- Chibisi Chima-Okereke: Actuarial Pricing Using General Linear Models In R
- Richard Saldanha: Practical Optimisation
10 Dec 2011, 08:54. Labels: LondonR, Presentations, R
LondonR, 7 September 2011
On 7 September 2011 I attended the London R user group meeting. It was a very good turn out with about 50 attendees at the Shooting Star, a pub close to Liverpool Street Station. The session started at 18:00 with four presentations, followed by drinks sponsored by Mango Solutions. The slides of the presentations are available on londonr.org.
The first presentation was given by Lisa Wainer from UCL Department of Security and Crime Science about crime data analysis using R. Lisa presented a project with Merseyside Police, for which she had built software in R with the gWidgets package, called the Hot Products Early Warning System. It is used to help understand and characterise the acquisitive crime problem in Merseyside on an ongoing basis, detecting emerging trends in hot products.
Chris Wood gave an insightful talk about his research on sediment biogeochemical modelling in the North Sea. His model uses a set of differential equations with over 20 parameters. Chris is able to analyse and fit his model to data he gathered on an expedition in the North Sea using R, the deSolve package and access to the super-computer at the University of Southampton. How cool is this?
Jean-Robert Avettand-Fenoel talked about the Rook package and how R and Rook have helped him to roll out new applications to his colleagues faster than using Excel, VBA and C++ or RExcel. Rook allows you to build web apps with R. The package is maintained by Jeffrey Horner, who also brought us the brew package. brew allows us, in combination with rApache, to mix HTML and R code in the same file. This is quite similar to the approach taken by Sweave for LaTeX and R. However, Rook provides a way to run R web applications on your desktop with the new internal R web server named Rhttpd.
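For flavour, here is a minimal Rook app along those lines; the app itself is made up for illustration:

```r
library(Rook)  # assumes the Rook package is installed

## Minimal Rook app served by R's internal Rhttpd web server.
## The app itself is made up for illustration.
s <- Rhttpd$new()
s$add(name = "hello",
      app = function(env) {
        req <- Request$new(env)   # wraps the incoming request
        res <- Response$new()     # builds the outgoing response
        res$write("<h1>Hello from R via Rook</h1>")
        res$finish()
      })
## s$start(); s$browse("hello")   # run interactively to view the app
```

The `app` function receives the request environment and returns a standard response, which is what makes the same app portable between Rhttpd and rApache.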
The final presentation was given by myself, talking about the googleVis package and the recent developments in version 0.2.9.
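As a reminder, the classic googleVis one-liner uses the Fruits demo data set that ships with the package:

```r
library(googleVis)  # assumes the googleVis package is installed

## Classic googleVis example with the Fruits demo data set
## that ships with the package.
M <- gvisMotionChart(Fruits, idvar = "Fruit", timevar = "Year")
## plot(M)  # run interactively to open the chart in the browser
```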