Little useless-useful R functions – Create Pandas DataFrame from R data.frame

Fusion and stuff 🙂 Since data science and data engineering are becoming a melting pot of languages, here is another useless function (though some of you might find it useful) that generates the Python code for a pandas DataFrame from an R data.frame, including the data: schema + data.

Let the fusion begin. I will construct the pandas DataFrame from a dictionary, so in R we will work towards a Python dictionary.
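To picture the target, here is a minimal sketch of the dictionary shape: one key per column, one list of values per key. The variable names `columns` and `rows` are made up for illustration, and the values are just the first three iris rows for two columns.

```python
# Hypothetical sample: two columns, three rows (first three iris observations).
columns = ["Sepal.Length", "Species"]
rows = [(5.1, "setosa"), (4.9, "setosa"), (4.7, "setosa")]

# Column-oriented dictionary: one key per column, one list of values per key.
d = {name: [row[i] for row in rows] for i, name in enumerate(columns)}
print(d)
# {'Sepal.Length': [5.1, 4.9, 4.7], 'Species': ['setosa', 'setosa', 'setosa']}

# With pandas installed, the final step would be: df = pd.DataFrame(data=d)
```

This is the layout the R function has to write out as text, column by column.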

The iris dataset is a good example for this transition: small but very useful.

Suppose you have a data.frame in R, in this case the iris dataset (complete, or just the first 15 rows for the sake of brevity of this useless function):

iris <- data.frame(iris)
#iris <- data.frame(iris[1:15,]) 

Once we have a data.frame, we want to generate a Python dictionary that holds the schema and data for direct creation (or import) in your favourite Python environment. The function RtoPy does the needed transformation:

RtoPy <- function(df_input, filename_path) {

    # column names and number of rows
    Nn <- names(df_input)
    nr <- nrow(df_input)

    #Python is indentation sensitive - the header is built without leading spaces
    py_df <- "import pandas as pd\nd = {"

    for (x in seq_along(Nn)) {
      var <- Nn[x]
      col <- df_input[[x]]
      #Column Names
      py_df <- paste0(py_df, "'", var, "':[")

      #Check for data type once per column: factors and strings get quotes
      #Note: logical and NA values are not translated to Python (True/False, None)
      quoted <- is.factor(col) || is.character(col)

      #Data Rows
      for (i in seq_len(nr)) {
        val <- as.character(col[i])
        if (quoted) {
          val <- paste0("'", val, "'")
        }
        #comma after every value except the last one in a column
        sep <- if (i == nr) "" else ","
        py_df <- paste0(py_df, val, sep)
      }

      #close the list; the last column also closes the dictionary
      if (x == length(Nn)) {
        py_df <- paste0(py_df, "]}\ndf=pd.DataFrame(data=d)")
      } else {
        py_df <- paste0(py_df, "],\n")
      }
    }

    ## Store to file
    cat(py_df, file = filename_path)
}

The input parameters are:
– the data.frame in R that you want to have scripted in Python
– the filename in which to store the schema and data

# Get the data from R data.frame to Python Pandas script
iris <- data.frame(iris)
RtoPy(iris, "/users/tomazkastrun/desktop/iris_py.py")

And the Python code for creating this data.frame in pandas is (shown here for the first 15 rows only):

import pandas as pd
d = {'Sepal.Length':[5.1,4.9,4.7,4.6,5,5.4,4.6,5,4.4,4.9,5.4,4.8,4.8,4.3,5.8],
'Sepal.Width':[3.5,3,3.2,3.1,3.6,3.9,3.4,3.4,2.9,3.1,3.7,3.4,3,3,4],
'Petal.Length':[1.4,1.4,1.3,1.5,1.4,1.7,1.4,1.5,1.4,1.5,1.5,1.6,1.4,1.1,1.2],
'Petal.Width':[0.2,0.2,0.2,0.2,0.2,0.4,0.3,0.2,0.2,0.1,0.2,0.2,0.1,0.1,0.2],
'Species':['setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa','setosa']}
df=pd.DataFrame(data=d)

Since Python is indentation sensitive, storing the schema and data to a file turned out to be the safest way. And don’t ask why not use CSV to do the transformation from one language to another. 🙂
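If you want to sanity-check the generated file before running it, one option (not part of the original function, just a sketch) is to parse it with Python's standard `ast` module. The `script` string below is a shortened stand-in for the file contents; in practice you would read the text from the path you passed to RtoPy.

```python
import ast

# Stand-in for the generated file (shortened to three rows for illustration).
script = """import pandas as pd
d = {'Sepal.Length':[5.1,4.9,4.7],
'Species':['setosa','setosa','setosa']}
df=pd.DataFrame(data=d)"""

# ast.parse raises SyntaxError (IndentationError is a subclass) on malformed
# code, without executing anything and without needing pandas installed.
tree = ast.parse(script)

# Find the top-level assignment to `d` and evaluate its literal value safely.
d = None
for node in tree.body:
    if isinstance(node, ast.Assign) and getattr(node.targets[0], "id", None) == "d":
        d = ast.literal_eval(node.value)

# Every column should hold the same number of values.
lengths = {name: len(values) for name, values in d.items()}
print(lengths)
# {'Sepal.Length': 3, 'Species': 3}
```

Unequal lengths here would mean the script was generated incorrectly, and pandas would reject such a dictionary anyway.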

As always, the complete code is available in the GitHub repository, and the function itself here.

Happy R-coding !!

Comments on “Little useless-useful R functions – Create Pandas DataFrame from R data.frame”

  1. jimmykglenn says:

    Avid R user, not much Python, why not dump it to a cab?
