Little useless-useful R functions – How to make R-squared useless

Uselessness is such a long useless word!

In statistics, R-squared is a measure of the proportion of variance in the dependent variable that can be explained by the independent variable. For a linear regression with an intercept, it ranges from 0 to 1 and is usually interpreted as the percentage of variation in the response that the regression model explains.

So an R-squared of 0.59 is usually read as a sign of how well the data fit the model (hence goodness of fit): the model explains about 59% of the variation in our dependent variable.

Given this logic, we prefer our regression models to have a high R-squared. Yes? Right! So here is the useless test: what happens when we add random noise to a function?

# some toy/random data
set.seed(2908)
x <- 1:30
y <- 2 + 0.5*x + rnorm(30,0,4)
mod <- lm(y~x)
summary(mod)$r.squared

R-squared is also the sum of squared fitted-value deviations (the model sum of squares, mss) over the total sum of squares (tss).
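This ratio can be checked by hand. The snippet below is a minimal sketch that rebuilds the same toy model and computes the sums of squares directly. The names `r2_manual` and `rss` are illustrative and not part of the original script.

```r
set.seed(2908)

# Same toy data and model as in the first example
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30, 0, 4)
mod <- lm(y ~ x)

f   <- fitted(mod)              # fitted (predicted) values
mss <- sum((f - mean(f))^2)     # sum of squared fitted-value deviations
tss <- sum((y - mean(y))^2)     # total sum of squares
rss <- sum(residuals(mod)^2)    # residual sum of squares

r2_manual <- mss / tss
r2_manual
1 - rss / tss                   # equivalent form for a model with an intercept
summary(mod)$r.squared          # value reported by lm()
```

All three printed values should agree up to floating-point error, since for a least-squares fit with an intercept the total sum of squares splits into the model part and the residual part.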

We want to check this useless assumption. R-squared does not necessarily measure goodness of fit: it can be arbitrarily low even when the model is completely correct. By making sigma2 (the noise variance) large enough, we drive R-squared crazy and towards 0, even when every assumption of the simple linear regression model is correct in every particular.
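One way to see why, sketched for the usual simple linear regression setup y = β0 + β1·x + ε with Var(ε) = σ². The symbols β1 (the true slope) and s_x² (the variance of x) are notation introduced here, not taken from the script, and the expression is a large-sample approximation rather than an exact finite-sample result:

```latex
R^2 \;\approx\; \frac{\beta_1^2 \, s_x^2}{\beta_1^2 \, s_x^2 + \sigma^2}
\qquad\Longrightarrow\qquad
R^2 \to 0 \quad \text{as} \quad \sigma^2 \to \infty
```

With the slope and the x values held fixed, only σ² in the denominator grows, so the ratio shrinks towards 0 although the linear model is exactly right.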

The simple function can be written as:

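useless_r2_with_sigma <- function(sig){
  x <- seq(1,10,length.out = 100)           # fixed predictor values
  y <- 2 + 1.2*x + rnorm(100,0,sd = sig)    # true linear model plus noise with sd = sig
  summary(lm(y ~ x))$r.squared              # R-squared of the fitted model
}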

Then plot the results to check whether it holds water:

assumption_sigma <- seq(0.5,20,length.out = 100)
results <- sapply(assumption_sigma, useless_r2_with_sigma)
plot(results ~ assumption_sigma, type="b")

Check the results and see how the growing sigmas “pull” the R-squared values down.
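To make the “pull” easier to judge, the simulated values can be set against a rough theoretical curve. This is a minimal sketch that assumes the large-sample approximation R-squared ≈ β² · Var(x) / (β² · Var(x) + σ²). The helper names `simulated_r2` and `approx_r2` are hypothetical and not part of the original script, and the seed is borrowed from the first example only to make the run repeatable.

```r
set.seed(2908)

x    <- seq(1, 10, length.out = 100)   # same predictor as in the function above
beta <- 1.2                            # true slope used in the simulation

# Simulated R-squared for one noise level (same recipe as useless_r2_with_sigma)
simulated_r2 <- function(sig) {
  y <- 2 + beta * x + rnorm(length(x), mean = 0, sd = sig)
  summary(lm(y ~ x))$r.squared
}

# Assumed large-sample approximation of R-squared for a given noise sd
approx_r2 <- function(sig) {
  beta^2 * var(x) / (beta^2 * var(x) + sig^2)
}

sigmas    <- seq(0.5, 20, length.out = 100)
simulated <- sapply(sigmas, simulated_r2)
expected  <- approx_r2(sigmas)

# Simulated points with the approximate curve drawn on top
plot(simulated ~ sigmas, type = "b", xlab = "sigma", ylab = "R-squared")
lines(sigmas, expected, col = "red", lwd = 2)
```

The simulated points should scatter around the red curve: close to 1 for small sigma and close to 0 for large sigma, although the model that generated the data never changes.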

As always, the complete useless script is available on GitHub in the Useless_R_function repository (filename: R-squared.R). Check the repository for future updates.

Happy R-coding and stay healthy!

