Creating One Unified Calendar of all Data Science Events in the Netherlands

Over-engineering with renv and GitHub Actions

I enjoy learning new things about machine learning, and I enjoy meeting like-minded people too. That is why I go to meetups and conferences. But not everyone I meet becomes a member of every group, so I keep sending my coworkers new events that I hear about here in the Netherlands. And it is easy to overlook a new event that comes in over email. I, individually, cannot scale. So in this post I will walk you through an over-engineered solution to make myself unnecessary. [Read More]

Don't Panic! A Scientific Approach to Debugging Production Failure

Your production system just broke down. What should you do now? Can you imagine your Shiny application, Flask app, or API service breaking down? As a beginning programmer or operations (or DevOps) person, it can be overwhelming to deal with the logs, messages, metrics and other possibly relevant information coming at you at such a point. And when something fails, you want to get it back to a working state as fast as possible. [Read More]

WTF is Kubernetes and Should I Care as an R User?

Fearless to production

I’m going to give you a high-level overview of Kubernetes and how you can make your R work shine in Kubernetes. Are you an R user in a company that uses Kubernetes? Building R applications (models that do predictions, Shiny applications, APIs)? Curious about this whole Kubernetes thing that your coworkers are talking about? Somewhat afraid? Then I have the post for you! Many R users come from an academic background, statistics and social sciences. [Read More]

Should I Move to a Database?

Long ago at a real-life meetup (remember those?), I received a t-shirt which said: “biggeR than R”. I think it was by Microsoft, who develop a special version of R with automatic parallel work. Anyways, I was thinking about bigness (is that a word? it is now!) of your data. Is your data becoming too big? Your dataset becomes so big and unwieldy that operations take a long time. [Read More]

Distributing data science products

Where or what is production? What does it mean when someone says to bring some data science product ‘in production’? What does it mean for data science products to be in production? Is your product already in production? Is it a magical place? I think two questions are of importance: Does my ‘thing’ provide value? Is my work repeatable? If the answer to these questions is yes, then your ‘thing’ is in production. [Read More]

UseR2021: Integrating R into Production

A view on UseR 2021

This year’s useR was completely online, and I watched many of the talks. I believe the videos will be public in the future, but there were some talks that I wanted to highlight. I think that the biggest problem with machine learning (or even data) projects is the integration with existing systems. Many machine learning products are batch or real-time predictions. For those predictions to provide value you will need: [Read More]

Walkthrough UbiOps and Tidymodels

From Python cookbook to R {recipes}

In this walkthrough I modified a tutorial from the UbiOps cookbook ‘Python Scikit learn and UbiOps’, but I replaced everything Python with R. So instead of scikit-learn I’m using {tidymodels}, and where Python uses a requirements.txt, I will use {renv}. So in a way I’m going from Python cookbook to {recipes} in R! Components of the pipeline: The original cookbook (and my rewrite too) has three components: [Read More]

Reasons to Use Tidymodels

I was listening to episode 135 of ‘Not So Standard Deviations’, ‘Moderate Confidence’. The hosts, Hilary and Roger, talked about when to use tidymodels packages and when not to. Here are my 2 cents for when I think it makes sense to use these packages and when not. When not: you are always using GLM models (they are very flexible!). It makes no sense to me to go for the extra {parsnip} layer if you are always using the same models. [Read More]

Tidymodels on UbiOps

I’ve been working with UbiOps lately, a platform that runs your data science models as a service. They have recently started supporting R next to Python! So let’s see if we can deploy a tidymodels model to UbiOps! I am not going to tell you a lot about UbiOps; that is for another post. I presume you know what it is, you know what tidymodels means for R, and you want to combine these things. [Read More]

Deploy to Shinyapps.io from GitHub Actions

Last week I spent a few hours figuring out how to auto-deploy a Shiny app to two apps on shinyapps.io from GitHub. You can see the result on this GitHub repository. This GitHub repository is connected to two Shiny apps on shinyapps.io. Here is what I envisioned: every new commit to the main branch will be published to the main app. We could then lock down the main branch so that no one can directly commit to main. [Read More]