{"id":381742,"date":"2024-01-27T18:00:00","date_gmt":"2024-01-28T00:00:00","guid":{"rendered":"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/"},"modified":"2024-01-27T18:00:00","modified_gmt":"2024-01-28T00:00:00","slug":"exploring-non-linear-effects-visual-cate-analysis-of-continuous-confounders-binary-exposures-and-continuous-outcomes","status":"publish","type":"post","link":"https:\/\/www.r-bloggers.com\/2024\/01\/exploring-non-linear-effects-visual-cate-analysis-of-continuous-confounders-binary-exposures-and-continuous-outcomes\/","title":{"rendered":"Exploring Non-linear Effects: Visual CATE Analysis of Continuous Confounders, Binary Exposures, and Continuous Outcomes"},"content":{"rendered":"<!-- \r\n<div style=\"min-height: 30px;\">\r\n[social4i size=\"small\" align=\"align-left\"]\r\n<\/div>\r\n-->\r\n\r\n<div style=\"border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;\">\r\n[This article was first published on  <strong><a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/\"> r on Everyday Is A School Day<\/a><\/strong>, and kindly contributed to <a href=\"https:\/\/www.r-bloggers.com\/\" rel=\"nofollow\">R-bloggers<\/a>].  (You can report issue about the content on this page <a href=\"https:\/\/www.r-bloggers.com\/contact-us\/\">here<\/a>)\r\n<hr>Want to share your content on R-bloggers?<a href=\"https:\/\/www.r-bloggers.com\/add-your-blog\/\" rel=\"nofollow\"> click here<\/a> if you have a blog, or <a href=\"http:\/\/r-posts.com\/\" rel=\"nofollow\"> here<\/a> if you don't.\r\n<\/div>\n<blockquote>\n<p>It was enjoyable to visualize the non-linear relationship with interaction and observe the corresponding changes in CATE. If one understands the underlying equation, it\u2019s possible to easily obtain the ATE using calculus. Lastly, adopting Richard McElreath\u2019s Owl framework as a documented procedure ensures quality assurance! 
\ud83d\ude4c<\/p>\n<\/blockquote>\n<p><img src=\"https:\/\/i2.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/feature.jpg?w=578&#038;ssl=1\" alt=\"\" data-recalc-dims=\"1\"><\/p>\n\n\n\n\n<h2 id=\"question-of-the-day\">Question of the Day\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#question-of-the-day\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<p>Does the CATE change if there is an interaction between our confounder and exposure, in the presence of a non-linear relationship between confounder and outcome? It sounds like it should, doesn\u2019t it? Let\u2019s test the theory out.<\/p>\n<p>One of my goals this year is to finish the \n<a href=\"https:\/\/github.com\/rmcelreath\/stat_rethinking_2023?tab=readme-ov-file\" rel=\"nofollow\" target=\"_blank\">Statistical Rethinking<\/a> videos by Richard McElreath. 
We\u2019ll use his scientific framework of establishing a DAG, Golem, and Owl to work through this question, without Bayesian methods.<\/p>\n<p>If you\u2019re only interested in the non-linear effect exploration, please skip to \n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#viz\" rel=\"nofollow\" target=\"_blank\">Visualization<\/a> or follow the <code>&lt;- TL;DR<\/code> markers in the objectives.<\/p>\n\n\n\n\n<h3 id=\"objectives\">Objectives:\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#objectives\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h3>\n<ul>\n<li>\n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#truth\" rel=\"nofollow\" target=\"_blank\">Truth<\/a> &lt;- TL;DR<\/li>\n<li>\n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#dag\" rel=\"nofollow\" target=\"_blank\">DAG<\/a><\/li>\n<li>\n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#golem\" rel=\"nofollow\" target=\"_blank\">Golem<\/a><\/li>\n<li>\n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#owl\" rel=\"nofollow\" target=\"_blank\">Owl<\/a><\/li>\n<li>\n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#viz\" rel=\"nofollow\" target=\"_blank\">Visualization<\/a> &lt;- TL;DR<\/li>\n<li>\n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#lessons\" rel=\"nofollow\" target=\"_blank\">Lessons learnt<\/a><\/li>\n<\/ul>\n\n\n\n\n<h2 id=\"truth\">Truth \u2705\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#truth\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" 
aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<pre>library(tidyverse)\nlibrary(mgcv)\nlibrary(ggpubr)\n\nset.seed(1)\nn &lt;- 10000\nx &lt;- rnorm(n)\nt &lt;- rbinom(n, 1, plogis(0.5*x))\nz &lt;- rnorm(n)\ny &lt;- x^2 + 2*x*t + 5*t + 0.5*z + rnorm(n) \ndf &lt;- tibble(x=x,y=y,t=t,z=z)\n<\/pre><p>Let\u2019s take a look at the above. To test our theory, we should construct a world where we know the truth: the above relationship of <code>y<\/code>, <code>x<\/code>, <code>z<\/code> and <code>t<\/code>. Here we will treat <code>y<\/code> as a continuous outcome, <code>x<\/code> as our continuous confounder, <code>t<\/code> as our binary exposure, and <code>z<\/code> as a variable that affects <code>y<\/code> but has no relationship to <code>x<\/code> or <code>t<\/code>. We\u2019re interested in finding the conditional average treatment effect (CATE) as <code>x<\/code> changes.<\/p>\n<p>The truth here lies in the equation <code>y &lt;- x^2 + 2*x*t + 5*t + 0.5*z + rnorm(n)<\/code>. We\u2019ve constructed the outcome so that we know the functional relationship of <code>y<\/code> with respect to <code>x<\/code>, <code>t<\/code>, and <code>z<\/code>. 
We also know that <code>x<\/code> influences <code>t<\/code> as well.<\/p>\n\n\n\n\n<h2 id=\"dag\">DAG\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#dag\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<blockquote>\n<p>Transparent scientific assumptions to justify scientific effort, expose it to useful critique, and connect theories to golems<\/p>\n<\/blockquote>\n<p align=\"center\">\n  <img loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/dag.png?w=50%25&#038;ssl=1\" alt=\"image\" height=\"auto\" data-recalc-dims=\"1\">\n  <\/p>\n<p>Let\u2019s assume that we know the structural relationships of all the nodes as depicted above. The interesting thing about DAG is you don\u2019t actually need to know the functional relationships to create one. 
A DAG is helpful for communicating the causal model and guiding the creation of the <code>golem<\/code>, aka the statistical model \/ estimator.<\/p>\n\n\n\n\n<h2 id=\"golem\">Golem\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#golem\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<blockquote>\n<p>Brainless, powerful statistical models<\/p>\n<\/blockquote>\n<p align=\"center\">\n  <img loading=\"lazy\" src=\"https:\/\/i1.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/golem.jpg?w=50%25&#038;ssl=1\" alt=\"image\" height=\"auto\" data-recalc-dims=\"1\">\n  <\/p>\n<p>Now, in order for us to know what statistical models to use, we\u2019d have to know the underlying functional relationships between the nodes. Are the relationships linear or non-linear? Are there confounders that need adjustment, or colliders that we must be careful not to adjust for?<\/p>\n<p>Since we don\u2019t really know the true functional relationships between the nodes, we will consider both linear (linear regression) and non-linear approaches (generalized additive model). 
Also given the DAG above, we need to adjust for <code>x<\/code>, and nothing else, to estimate the CATE, which is E(y|t=1,X=x) - E(y|t=0,X=x); averaging the CATE over the distribution of <code>x<\/code> gives the ATE.<\/p>\n\n\n\n\n<h2 id=\"owl\">Owl\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#owl\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<blockquote>\n<p>Documented procedures, quality assurance<\/p>\n<\/blockquote>\n<p align=\"center\">\n  <img loading=\"lazy\" src=\"https:\/\/i1.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/owl.jpg?w=50%25&#038;ssl=1\" alt=\"image\" height=\"auto\" data-recalc-dims=\"1\">\n  <\/p>\n<p>The point of the <code>Owl<\/code> is to bring everything together in a procedural format, project after project, producing transparent documentation of the thought process from the causal model through to the analysis. The previous <code>DAG<\/code> and <code>golem<\/code> fit right in here as steps <code>1<\/code> and <code>2<\/code>. 
Below are the steps.<\/p>\n<p>Steps to draw the Owl:<\/p>\n<ol>\n<li>Theoretical estimand -> DAG<\/li>\n<li>Scientific (causal) model(s) -> Golem<\/li>\n<li>Use (1) & (2) to build statistical model(s)<\/li>\n<li>Simulate from (2) to validate (3) yields (1)<\/li>\n<li>Analyze real data<\/li>\n<\/ol>\n<p>Since we have gone through <code>1<\/code> and <code>2<\/code>, let\u2019s put some work into <code>3<\/code> and <code>4<\/code> before we dive into <code>5<\/code> which is contained in <code>df<\/code> as simulated earlier on when we constructed the <code>truth<\/code>.<\/p>\n\n\n\n\n<h4 id=\"golem-1-assuming-linear-relationships\">Golem 1: Assuming Linear Relationships\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#golem-1-assuming-linear-relationships\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<pre>set.seed(1)\nn &lt;- 10000\nx &lt;- rnorm(n) #confounder\nt &lt;- rbinom(n, 1, plogis(0.2*x)) #exposure binary\nz &lt;- rnorm(n) \ny &lt;-  0.1*x + 3*t + 0.4*z + rnorm(n) #outcome\n\ndf_sim &lt;- tibble(x=x,y=y,t=t,z=z)\n\nsim_model &lt;- lm(y ~ x + t, df_sim)\nsummary(sim_model)\n\r\n## \n## Call:\n## lm(formula = y ~ x + t, data = df_sim)\n## \n## Residuals:\n##     Min      1Q  Median      3Q     Max \n## -3.6850 -0.7383  0.0025  0.7330  3.7861 \n## \n## Coefficients:\n##             Estimate Std. Error t value Pr(&gt;|t|)    \n## (Intercept) 0.002103   0.015329   0.137    0.891    \n## x           0.101445   0.010754   9.433   &lt;2e-16 ***\n## t           3.015260   0.021774 138.483   &lt;2e-16 ***\n## ---\n## Signif. 
codes:  0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1\n## \n## Residual standard error: 1.082 on 9997 degrees of freedom\n## Multiple R-squared:  0.6644,\tAdjusted R-squared:  0.6643 \n## F-statistic:  9895 on 2 and 9997 DF,  p-value: &lt; 2.2e-16\n<\/pre>\n\n\n\n<h4 id=\"golem-2-assuming-non-linear-relationships\">Golem 2: Assuming Non-linear Relationships\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#golem-2-assuming-non-linear-relationships\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<pre>set.seed(1)\nn &lt;- 10000\nx &lt;- rnorm(n) #confounder\nt &lt;- rbinom(n, 1, plogis(0.2*x)) #exposure binary\nz &lt;- rnorm(n) \ny &lt;- x^2 + 3*t + 0.4*z + rnorm(n) #outcome\n\ndf_sim &lt;- tibble(x=x,y=y,t=t,z=z)\n\ndf_sim |&gt;\n  ggplot(aes(x=x,y=y,color=as.factor(t))) +\n  geom_point()\n<\/pre><img src=\"https:\/\/i0.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-3-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<p>Alright, do you think linear regression and GAM would produce a different CATE with this simulated dataset?<\/p>\n<pre>model_lr &lt;- lm(y ~ x + t, df_sim)\nmodel_gam &lt;- gam(y ~ s(x, k = 10) + x + t, data = df_sim)\n\ncate_x_lr &lt;-  predict(model_lr,newdata=tibble(x=x,t=1)) - predict(model_lr,newdata=tibble(x=x,t=0))\ncate_x_gam &lt;- predict(model_gam,newdata=tibble(x=x,t=1)) - predict(model_gam,newdata=tibble(x=x,t=0))\n\ntibble(x=x, cate_x_lr=cate_x_lr,cate_x_gam=cate_x_gam) |&gt;\n  pivot_longer(cols = starts_with(&quot;cate&quot;), names_to = &quot;model&quot;, values_to = &quot;cate&quot;) |&gt;\n  ggplot(aes(x=x,y=cate,color=model)) +\n  geom_point()\n<\/pre><img src=\"https:\/\/i0.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-4-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<pre>print(paste0(&quot;ate_lr: &quot;,cate_x_lr |&gt; mean()))\n\r\n## [1] &quot;ate_lr: 
2.98384150376772&quot;\n\r\nprint(paste0(&quot;ate_gam: &quot;,cate_x_gam |&gt; mean()))\n\r\n## [1] &quot;ate_gam: 3.01499684018495&quot;\n<\/pre><p>The ATE differs very little between the two models when there is no interaction, and both estimates are close to the true value of 3. What if there is an interaction? Let\u2019s simulate.<\/p>\n\n\n\n\n<h4 id=\"golem-3-assuming-non-linear-relationships-with-interactions\">Golem 3: Assuming Non-linear Relationships with Interactions\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#golem-3-assuming-non-linear-relationships-with-interactions\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<pre>set.seed(1)\nn &lt;- 10000\nx &lt;- rnorm(n) #confounder\nt &lt;- rbinom(n, 1, plogis(0.2*x)) #exposure binary\nz &lt;- rnorm(n) \ny &lt;- x^2 + 4*x*t + 3*t + 0.4*z + rnorm(n) #outcome\n\ndf_sim &lt;- tibble(x=x,y=y,t=t,z=z)\n\ndf_sim |&gt;\n  ggplot(aes(x=x,y=y,color=as.factor(t))) +\n  geom_point()\n<\/pre><img src=\"https:\/\/i0.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-5-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<p>Wow, OK that looks a bit more complicated. 
Let\u2019s take a look at the CATE.<\/p>\n<pre>model_lr &lt;- lm(y ~ x*t, df_sim)\nmodel_gam &lt;- gam(y ~ s(x, k = 10) + x + t + x:t, data = df_sim)\n\ncate_x_lr &lt;-  predict(model_lr,newdata=tibble(x=x,t=1)) - predict(model_lr,newdata=tibble(x=x,t=0))\ncate_x_gam &lt;- predict(model_gam,newdata=tibble(x=x,t=1)) - predict(model_gam,newdata=tibble(x=x,t=0))\n\ntibble(x=x, cate_x_lr=cate_x_lr,cate_x_gam=cate_x_gam) |&gt;\n  pivot_longer(cols = starts_with(&quot;cate&quot;), names_to = &quot;model&quot;, values_to = &quot;cate&quot;) |&gt;\n  ggplot(aes(x=x,y=cate,color=model)) +\n  geom_point()\n<\/pre><img src=\"https:\/\/i2.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-6-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<pre>print(paste0(&quot;ate_lr: &quot;,cate_x_lr |&gt; mean()))\n\r\n## [1] &quot;ate_lr: 2.95641106287701&quot;\n\r\nprint(paste0(&quot;ate_gam: &quot;,cate_x_gam |&gt; mean()))\n\r\n## [1] &quot;ate_gam: 2.98877579715275&quot;\n<\/pre><p>Alright! As you can see, the two models differ in CATE but not so much in ATE.<\/p>\n<p>Now that we have entertained the idea of linear, non-linear, and non-linear with interaction relationships, let\u2019s go ahead and take a look at <code>df<\/code>, which is going to be our real data. Note that in real life, we won\u2019t know the actual formula <code>y &lt;- x^2 + 2*x*t + 5*t + 0.5*z + rnorm(n)<\/code>; we will only know the measurements (<code>df<\/code>), not the relationships between the nodes, until we use <code>DAG<\/code>, <code>golem<\/code> and <code>owl<\/code> to estimate the function.<\/p>\n\n\n\n\n<h4 id=\"owl-step-5-analyze-real-data\">Owl Step 5: Analyze real data\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#owl-step-5-analyze-real-data\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<p>Remember our real data resides in <code>df<\/code>. 
Let\u2019s take a look at the inter-nodal relationships through exploratory data analysis.<\/p>\n<pre>library(GGally)\n\ndf |&gt;\n  mutate(t = as.factor(t)) |&gt;\n  ggpairs()\n<\/pre><img src=\"https:\/\/i2.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-7-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<p><strong>Functional Relationships<\/strong>: <br>\nIt appears that:<\/p>\n<ul>\n<li><code>y<\/code> and <code>x<\/code>: non-linear, is there an interaction?<\/li>\n<li><code>y<\/code> and <code>t<\/code>: linear<\/li>\n<li><code>y<\/code> and <code>z<\/code>: non-linear, not really sure what this looks like \ud83e\udd23<\/li>\n<li><code>x<\/code> and <code>t<\/code>: linear vs no relationship? Hard to see the difference<\/li>\n<li><code>x<\/code> and <code>z<\/code>: no relationship?<\/li>\n<li><code>t<\/code> and <code>z<\/code>: no relationship, looks random<\/li>\n<\/ul>\n\n\n\n\n<h4 id=\"lets-inspect-x-and-t\">Let\u2019s inspect <code>x<\/code> and <code>t<\/code>\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#lets-inspect-x-and-t\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<pre>t.test(x ~ t, df)\n\r\n## \n## \tWelch Two Sample t-test\n## \n## data:  x by t\n## t = -25.616, df = 9996.9, p-value &lt; 2.2e-16\n## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0\n## 95 percent confidence interval:\n##  -0.5408035 -0.4639183\n## sample estimates:\n## mean in group 0 mean in group 1 \n##      -0.2557583       0.2466026\n<\/pre><p>OK, there is a relationship there. Given the distributions, we\u2019ll consider it linear.<\/p>\n\n\n\n\n<h4 id=\"lets-inspect-for-interaction-y-x-and-t\">Let\u2019s inspect for interaction between <code>y<\/code>, <code>x<\/code>, and <code>t<\/code>\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#lets-inspect-for-interaction-y-x-and-t\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<pre>df |&gt;\n  
ggplot(aes(x=x,y=y,color=as.factor(t))) +\n  geom_point()\n<\/pre><img src=\"https:\/\/i2.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-9-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<p>Alright, some interaction there towards the tail ends below <code>0<\/code>. We shall use <code>Golem 3<\/code> and compare <code>linear regression<\/code> and <code>gam<\/code> models.<\/p>\n\n\n\n\n<h2 id=\"viz\">Visualization\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#viz\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<pre># linear regression w interaction\nmodel &lt;- lm(y ~ x*t, df)\n\nplot_linear &lt;- df |&gt;\n  add_column(pred=predict(model, newdata=tibble(x=x,t=t))) |&gt;\n  ggplot(aes(x=x,y=y,color=as.factor(t))) +\n  geom_point() +\n  geom_point(aes(x=x,y=pred), color = &quot;red&quot;) +\n  ggtitle(&quot;Linear Regression With Interaction&quot;) +\n  theme(legend.position = &quot;none&quot;)\n\n# calculate cate for lr\ncate &lt;- predict(model,newdata=tibble(x=x,z=z,t=1)) - predict(model,newdata=tibble(x=x,z=z,t=0))\n\n# gam w interaction\nmodel2 &lt;- gam(y ~ s(x, k = 10) + x + t + x:t, data = df)\n\nplot_nonlinear &lt;- df |&gt;\n  add_column(pred=predict(model2, newdata=tibble(x=x,t=t))) |&gt;\n  ggplot(aes(x=x,y=y,color=as.factor(t))) +\n  geom_point() +\n  geom_point(aes(x=x,y=pred), color = &quot;red&quot;) +\n  ggtitle(&quot;GAM With Interaction&quot;) +\n  theme(legend.position = &quot;none&quot;)\n\n# calculate cate for gam\ncate2 &lt;- 
predict(model2,newdata=tibble(x=x,t=1)) - predict(model2,newdata=tibble(x=x,t=0))\n\n# visualize all model cates to assess differences\ncate_all &lt;- tibble(x=x, cate=cate,cate2=cate2) |&gt;\n  mutate(cate3 = 2*x+5) |&gt;\n  pivot_longer(cols = starts_with(&quot;cate&quot;), names_to = &quot;model&quot;, values_to = &quot;cate&quot;) |&gt;\n  ggplot(aes(x=x,y=cate,color=model)) +\n  geom_point() +\n  ggtitle(&quot;Visualizing all models&#39; CATE&quot;)\n\nggarrange(plot_linear, plot_nonlinear)\n<\/pre><img src=\"https:\/\/i2.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-10-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<p><code>t==1<\/code> is depicted in <code>turquoise<\/code>, whereas <code>t==0<\/code> is in <code>red<\/code>.<\/p>\n<p>Wow, this comparison really helped me visualize why we need to find the right estimator depending on the functional relationship of outcome, exposure and confounder(s). On the left, we have built a linear regression model; as you can see, it basically fits one straight line on <code>t==1<\/code> and another on <code>t==0<\/code>. The difference between the two lines at a given value of <code>x<\/code> is the CATE.<\/p>\n<p>The same goes for the graph on the right. This time, we fit a <code>GAM<\/code> model with splines to those points for <code>t==1<\/code> and <code>t==0<\/code>. I reckon the CATE will be different from the linear model\u2019s. Let\u2019s visualize it!<\/p>\n\n\n\n\n<h4 id=\"visualizing-cate-of-all-models\">Visualizing CATE of All Models\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#visualizing-cate-of-all-models\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<pre>cate_all \n<\/pre><img src=\"https:\/\/i0.wp.com\/www.kenkoonwong.com\/blog\/cate-gam\/index_files\/figure-html\/unnamed-chunk-11-1.png?w=450&#038;ssl=1\" data-recalc-dims=\"1\" \/>\n<p>Wow, the only point where the CATE is the same between the linear regression and GAM models is when <code>x==0<\/code>. 
Everywhere else, the CATEs differ. In the legend, <code>cate<\/code> is linear regression and <code>cate2<\/code> is GAM.<\/p>\n<p>Did you notice that the <code>cate2<\/code> color is a bit off? We actually sneaked in the true CATE (<code>cate3<\/code>, computed as <code>2*x+5<\/code>) to see how well GAM recovers it. It\u2019s almost a perfect fit!<\/p>\n\n\n\n\n<h4 id=\"how-does-one-estimate-cate-if-we-know-the-true-formula\">How Does One Estimate CATE If We Know The True Formula?\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#how-does-one-estimate-cate-if-we-know-the-true-formula\" rel=\"nofollow\" target=\"_blank\"><\/a>\n<\/h4>\n<p>Given this formula: <br>\n<code>\\(y =  x^2 + 2xt + 5t + 0.5z + \\epsilon\\)<\/code><\/p>\n<p>We take the partial derivative of <code>y<\/code> with respect to <code>t<\/code> to get the CATE: <br>\n<code>\\(\\frac{\\partial\\text{y}}{\\partial\\text{t}} = 2x + 5\\)<\/code><\/p>\n<p>Here, we see that the CATE changes linearly with <code>x<\/code> and equals <code>5<\/code> when <code>x<\/code> is <code>0<\/code>. Since <code>x<\/code> has mean <code>0<\/code> in our simulation, averaging the CATE over <code>x<\/code> gives an ATE of <code>5<\/code>. This matches really well with our <code>GAM<\/code> model CATE! \ud83d\ude4c<\/p>\n<p>There is still one question that I don\u2019t quite know the answer to; perhaps someone can educate me on this. Some say the partial derivative is a marginal effect and not the ATE. 
\ud83e\udd37\u200d\u2642\ufe0f<\/p>\n\n\n\n\n<h2 id=\"lessons\">Lessons learnt\n  <a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/#lessons\" rel=\"nofollow\" target=\"_blank\"><svg class=\"anchor-symbol\" aria-hidden=\"true\" height=\"26\" width=\"26\" viewBox=\"0 0 22 22\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\n      <path d=\"M0 0h24v24H0z\" fill=\"currentColor\"><\/path>\n      <path d=\"M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z\"><\/path>\n    <\/svg><\/a>\n<\/h2>\n<ul>\n<li>The <code>GAM<\/code> model is flexible thanks to its smoothing functions; even Richard McElreath recommended using GAM over polynomial regression<\/li>\n<li>If one knows the underlying functional relationship through an equation, the CATE is essentially the derivative of the outcome with respect to the exposure<\/li>\n<li>The partial derivative symbol in LaTeX is <code>\\partial<\/code><\/li>\n<li>It\u2019s nice to use the owl framework as a procedure from DAG -> golem -> simulation -> analysis.<\/li>\n<\/ul>\n<br>\n<br>\n<p>If you like this article:<\/p>\n<ul>\n<li>please feel free to send me a \n<a href=\"https:\/\/www.kenkoonwong.com\/blog\/\" rel=\"nofollow\" target=\"_blank\">comment or visit my other blogs<\/a><\/li>\n<li>please feel free to follow me on \n<a href=\"https:\/\/twitter.com\/kenkoonwong\/\" rel=\"nofollow\" target=\"_blank\">twitter<\/a>, \n<a href=\"https:\/\/github.com\/kenkoonwong\/\" rel=\"nofollow\" target=\"_blank\">GitHub<\/a> or \n<a href=\"https:\/\/med-mastodon.com\/@kenkoonwong\" rel=\"nofollow\" target=\"_blank\">Mastodon<\/a><\/li>\n<li>if you would like to collaborate, please feel free to \n<a href=\"https:\/\/www.kenkoonwong.com\/contact\/\" rel=\"nofollow\" target=\"_blank\">contact me<\/a><\/li>\n<\/ul>\n\n<div style=\"border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 
13px;\">\r\n<div style=\"text-align: center;\">To <strong>leave a comment<\/strong> for the author, please follow the link and comment on their blog: <strong><a href=\"https:\/\/www.kenkoonwong.com\/blog\/cate-gam\/\"> r on Everyday Is A School Day<\/a><\/strong>.<\/div>\r\n<hr \/>\r\n<a href=\"https:\/\/www.r-bloggers.com\/\" rel=\"nofollow\">R-bloggers.com<\/a> offers <strong><a href=\"https:\/\/feedburner.google.com\/fb\/a\/mailverify?uri=RBloggers\" rel=\"nofollow\">daily e-mail updates<\/a><\/strong> about <a title=\"The R Project for Statistical Computing\" href=\"https:\/\/www.r-project.org\/\" rel=\"nofollow\">R<\/a> news and tutorials about <a title=\"R tutorials\" href=\"https:\/\/www.r-bloggers.com\/how-to-learn-r-2\/\" rel=\"nofollow\">learning R<\/a> and many other topics. <a title=\"Data science jobs\" href=\"https:\/\/www.r-users.com\/\" rel=\"nofollow\">Click here if you're looking to post or find an R\/data-science job<\/a>.\r\n<\/div>","protected":false},"excerpt":{"rendered":"<div style = \"width:60%; display: inline-block; float:left; \">\nIt was enjoyable to visualize the non-linear relationship with interaction and observe the corresponding changes in CATE. If one understands the underlying equation, it\u2019s possible to easily obtain the ATE using calculus. 
Lastly, adopting Richard&#8230;<\/div>\n<div style = \"width: 40%; display: inline-block; float:right;\"><\/div>\n<div style=\"clear: both;\"><\/div>\n","protected":false},"author":2816,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"aioseo_notices":[],"jetpack-related-posts":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/381742"}],"collection":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/users\/2816"}],"replies":[{"embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/comments?post=381742"}],"version-history":[{"count":1,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/381742\/revisions"}],"predecessor-version":[{"id":381743,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/381742\/revisions\/381743"}],"wp:attachment":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/media?parent=381742"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/categories?post=381742"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/tags?post=381742"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}