{"id":391142,"date":"2025-03-10T15:00:00","date_gmt":"2025-03-10T21:00:00","guid":{"rendered":"https:\/\/mfatihtuzen.netlify.app\/posts\/2025-03-11_underrated_functions\/"},"modified":"2025-03-10T15:00:00","modified_gmt":"2025-03-10T21:00:00","slug":"underrated-gems-in-r-must-know-functions-youre-probably-missing-out-on","status":"publish","type":"post","link":"https:\/\/www.r-bloggers.com\/2025\/03\/underrated-gems-in-r-must-know-functions-youre-probably-missing-out-on\/","title":{"rendered":"Underrated Gems in R: Must-Know Functions You\u2019re Probably Missing Out On"},"content":{"rendered":"<div style=\"border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;\">\r\n[This article was first published on <strong><a href=\"https:\/\/mfatihtuzen.netlify.app\/posts\/2025-03-11_underrated_functions\/\">A Statistician&#039;s R Notebook<\/a><\/strong>, and kindly contributed to <a href=\"https:\/\/www.r-bloggers.com\/\" rel=\"nofollow\">R-bloggers<\/a>.]\r\n<\/div>\n<p>R is packed with powerhouse tools\u2014think dplyr for data wrangling, ggplot2 for stunning visuals, or tidyr for tidying up messes. But beyond the headliners, there\u2019s a lineup of lesser-known functions that deserve a spot in your toolkit. These hidden gems can streamline your code, solve tricky problems, and even make you wonder how you managed without them. 
In this post, we\u2019ll uncover four underrated R functions: <strong><code>Reduce, vapply, do.call<\/code><\/strong> and <strong><code>janitor::clean_names<\/code><\/strong>. With practical examples ranging from beginner-friendly to advanced, plus outputs to show you what\u2019s possible, this guide will have you itching to try them out in your next project. Let\u2019s dive in and see what these under-the-radar stars can do!<\/p>\n<section id=\"reduce-collapse-with-control\" class=\"level2\">\n<h2 class=\"anchored\" data-anchor-id=\"reduce-collapse-with-control\">1. Reduce: Collapse with Control<\/h2>\n<section id=\"what-it-does-and-its-arguments\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"what-it-does-and-its-arguments\">What It Does and Its Arguments<\/h3>\n<p>Reduce is a base R function that iteratively applies a two-argument function to a list or vector, shrinking it down to a single result. It\u2019s like a secret weapon for avoiding loops while keeping things elegant.<\/p>\n<p><strong>Key Arguments:<\/strong><\/p>\n<ul>\n<li><p><code>f:<\/code> The function to apply (e.g., +, *, or a custom one).<\/p><\/li>\n<li><p><code>x:<\/code> The list or vector to reduce.<\/p><\/li>\n<li><p><code>init<\/code> (optional): A starting value (defaults to the first element of x if omitted).<\/p><\/li>\n<li><p><code>accumulate<\/code> (optional): If TRUE, returns all intermediate results (defaults to FALSE).<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"use-cases\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"use-cases\">Use Cases<\/h3>\n<ul>\n<li><p>Summing or multiplying without explicit iteration.<\/p><\/li>\n<li><p>Combining data structures step-by-step.<\/p><\/li>\n<li><p>Simplifying recursive tasks.<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"examples\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"examples\">Examples<\/h3>\n<section id=\"simple-quick-sum\" class=\"level4\">\n<h4 class=\"anchored\" 
data-anchor-id=\"simple-quick-sum\">Simple: Quick Sum<\/h4>\n<div class=\"cell\">\n<pre>numbers &lt;- 1:5\ntotal &lt;- Reduce(`+`, numbers)\nprint(total)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] 15<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> Reduce adds 1 + 2 = 3, then 3 + 3 = 6, 6 + 4 = 10, and 10 + 5 = 15. For plain addition sum() is the simpler choice; the example just makes the pairwise folding visible.<\/p>\n<\/section>\n<section id=\"intermediate-string-building\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"intermediate-string-building\">Intermediate: String Building<\/h4>\n<div class=\"cell\">\n<pre>words &lt;- c(&quot;R&quot;, &quot;is&quot;, &quot;awesome&quot;)\nsentence &lt;- Reduce(paste, words, init = &quot;&quot;)\nprint(sentence)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] &quot; R is awesome&quot;<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> Starting with an empty string (<code>init = &quot;&quot;<\/code>), Reduce pastes the words together with spaces. Because paste also puts a space between the empty starting value and &quot;R&quot;, the result has a leading space. Skip init, and Reduce starts from &quot;R&quot;, giving &quot;R is awesome&quot; with no leading space.<\/p>\n<\/section>\n<section id=\"advanced-merging-data-frames\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"advanced-merging-data-frames\">Advanced: Merging Data Frames<\/h4>\n<div class=\"cell\">\n<pre>df1 &lt;- data.frame(a = 1:2, b = c(&quot;x&quot;, &quot;y&quot;))\ndf2 &lt;- data.frame(a = 3:4, b = c(&quot;z&quot;, &quot;w&quot;))\ndf3 &lt;- data.frame(a = 5:6, b = c(&quot;p&quot;, &quot;q&quot;))\ncombined &lt;- Reduce(rbind, list(df1, df2, df3))\nprint(combined)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>  a b\n1 1 x\n2 2 y\n3 3 z\n4 4 w\n5 5 p\n6 6 q<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> Reduce stacks three data frames row-wise, pairing them up one by one. 
It\u2019s a loop-free way to row-bind any number of data frames.<\/p>\n<div class=\"callout callout-style-default callout-note callout-titled\">\n<div class=\"callout-header d-flex align-content-center\">\n<div class=\"callout-icon-container\">\n<i class=\"callout-icon\"><\/i>\n<\/div>\n<div class=\"callout-title-container flex-fill\">\nA Quick Note on purrr::reduce()\n<\/div>\n<\/div>\n<div class=\"callout-body-container callout-body\">\n<p>If you\u2019re a fan of the tidyverse, check out purrr::reduce(). It\u2019s a modern take on base R\u2019s Reduce, offering a consistent syntax with other purrr functions (like .x and .y for arguments) and handy shortcuts like ~ .x + .y for inline functions. Like Reduce, it works left to right by default; pass <code>.dir = &quot;backward&quot;<\/code> to go right to left (the older reduce_right() is deprecated). Worth a look if you want a more polished, tidyverse-friendly alternative!<\/p>\n<p>Here\u2019s an intermediate-level example of using the <code>reduce()<\/code> function from the <code>purrr<\/code> package for joining multiple dataframes:<\/p>\n<div class=\"cell\">\n<pre>library(purrr)\nlibrary(dplyr)\n\n# Create three sample dataframes representing different aspects of customer data\ncustomers &lt;- data.frame(\n  customer_id = 1:5,\n  name = c(&quot;Alice&quot;, &quot;Bob&quot;, &quot;Charlie&quot;, &quot;Diana&quot;, &quot;Edward&quot;),\n  age = c(32, 45, 28, 36, 52)\n)\n\norders &lt;- data.frame(\n  order_id = 101:108,\n  customer_id = c(1, 2, 2, 3, 3, 3, 4, 5),\n  order_date = as.Date(c(&quot;2023-01-15&quot;, &quot;2023-01-20&quot;, &quot;2023-02-10&quot;, \n                        &quot;2023-01-05&quot;, &quot;2023-02-15&quot;, &quot;2023-03-20&quot;,\n                        &quot;2023-02-25&quot;, &quot;2023-03-10&quot;)),\n  amount = c(120.50, 85.75, 200.00, 45.99, 75.25, 150.00, 95.50, 210.25)\n)\n\nfeedback &lt;- data.frame(\n  feedback_id = 201:206,\n  customer_id = c(1, 2, 3, 3, 4, 5),\n  rating = c(4, 5, 3, 4, 5, 4),\n  feedback_date = as.Date(c(&quot;2023-01-20&quot;, 
&quot;2023-01-25&quot;, &quot;2023-01-10&quot;,\n                          &quot;2023-02-20&quot;, &quot;2023-03-01&quot;, &quot;2023-03-15&quot;))\n)\n\n# List of dataframes to join with the joining column\ndataframes_to_join &lt;- list(\n  list(df = customers, by = &quot;customer_id&quot;),\n  list(df = orders, by = &quot;customer_id&quot;),\n  list(df = feedback, by = &quot;customer_id&quot;)\n)\n\n# Using reduce to join all dataframes\n# Start with customers dataframe and progressively join the others\njoined_data &lt;- reduce(\n  dataframes_to_join[-1],  # Exclude first dataframe as it's our starting point\n  function(acc, x) {\n    left_join(acc, x$df, by = x$by)\n  },\n  .init = dataframes_to_join[[1]]$df  # Start with customers dataframe\n)\n\n# View the result\nprint(joined_data)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>   customer_id    name age order_id order_date amount feedback_id rating\n1            1   Alice  32      101 2023-01-15 120.50         201      4\n2            2     Bob  45      102 2023-01-20  85.75         202      5\n3            2     Bob  45      103 2023-02-10 200.00         202      5\n4            3 Charlie  28      104 2023-01-05  45.99         203      3\n5            3 Charlie  28      104 2023-01-05  45.99         204      4\n6            3 Charlie  28      105 2023-02-15  75.25         203      3\n7            3 Charlie  28      105 2023-02-15  75.25         204      4\n8            3 Charlie  28      106 2023-03-20 150.00         203      3\n9            3 Charlie  28      106 2023-03-20 150.00         204      4\n10           4   Diana  36      107 2023-02-25  95.50         205      5\n11           5  Edward  52      108 2023-03-10 210.25         206      4\n   feedback_date\n1     2023-01-20\n2     2023-01-25\n3     2023-01-25\n4     2023-01-10\n5     2023-02-20\n6     2023-01-10\n7     2023-02-20\n8     2023-01-10\n9     2023-02-20\n10    2023-03-01\n11    2023-03-15<\/pre>\n<\/div>\n<\/div>\n<p>This 
example demonstrates how to use <code>reduce()<\/code> to join multiple dataframes in a sequential, elegant way. This pattern is particularly useful when dealing with complex data integration tasks where you need to combine multiple data sources with a common identifier.<\/p>\n<\/div>\n<\/div>\n<\/section>\n<\/section>\n<\/section>\n<section id=\"vapply-iteration-with-assurance\" class=\"level2\">\n<h2 class=\"anchored\" data-anchor-id=\"vapply-iteration-with-assurance\">2. vapply: Iteration with Assurance<\/h2>\n<section id=\"what-it-does-and-its-arguments-1\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"what-it-does-and-its-arguments-1\">What It Does and Its Arguments<\/h3>\n<p>vapply is another base R gem, similar to lapply but with a twist: it forces you to specify the output type and length upfront. This makes it safer and more predictable, especially for critical tasks.<\/p>\n<p><strong>Key Arguments:<\/strong><\/p>\n<ul>\n<li><p><code>X<\/code>: The list or vector to process.<\/p><\/li>\n<li><p><code>FUN<\/code>: The function to apply to each element.<\/p><\/li>\n<li><p><code>FUN.VALUE<\/code>: A template for the output (e.g., numeric(1) for a single number).<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"use-cases-1\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"use-cases-1\">Use Cases<\/h3>\n<ul>\n<li><p>Guaranteeing consistent output types.<\/p><\/li>\n<li><p>Extracting specific stats from lists.<\/p><\/li>\n<li><p>Writing reliable code for packages or production.<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"examples-1\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"examples-1\">Examples<\/h3>\n<section id=\"simple-doubling-up\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"simple-doubling-up\">Simple: Doubling Up<\/h4>\n<div class=\"cell\">\n<pre>values &lt;- 1:3\ndoubled &lt;- vapply(values, function(x) x * 2, numeric(1))\nprint(doubled)<\/pre>\n<div class=\"cell-output 
cell-output-stdout\">\n<pre>[1] 2 4 6<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> Each value doubles, and numeric(1) ensures a numeric vector\u2014simple and rock-solid.<\/p>\n<\/section>\n<section id=\"intermediate-word-lengths\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"intermediate-word-lengths\">Intermediate: Word Lengths<\/h4>\n<div class=\"cell\">\n<pre>terms &lt;- c(&quot;data&quot;, &quot;science&quot;, &quot;R&quot;)\nlengths &lt;- vapply(terms, nchar, numeric(1))\nprint(lengths)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>   data science       R \n      4       7       1 <\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> vapply counts characters per word, delivering a numeric vector every time\u2014no surprises like sapply might throw.<\/p>\n<\/section>\n<section id=\"advanced-stats-snapshot\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"advanced-stats-snapshot\">Advanced: Stats Snapshot<\/h4>\n<div class=\"cell\">\n<pre>samples &lt;- list(c(1, 2, 3), c(4, 5), c(6, 7, 8))\nstats &lt;- vapply(samples, function(x) c(mean = mean(x), sd = sd(x)), numeric(2))\nprint(stats)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>     [,1]      [,2] [,3]\nmean    2 4.5000000    7\nsd      1 0.7071068    1<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> For each sample, vapply computes mean and standard deviation, returning a matrix (2 rows, 3 columns). It\u2019s a tidy, type-safe summary.<\/p>\n<\/section>\n<\/section>\n<\/section>\n<section id=\"do.call-dynamic-function-magic\" class=\"level2\">\n<h2 class=\"anchored\" data-anchor-id=\"do.call-dynamic-function-magic\">3. 
do.call: Dynamic Function Magic<\/h2>\n<section id=\"what-it-does-and-its-arguments-2\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"what-it-does-and-its-arguments-2\">What It Does and Its Arguments<\/h3>\n<p>do.call in base R lets you call a function with a list of arguments, making it a go-to for flexible, on-the-fly operations. It\u2019s like having a universal remote for your functions.<\/p>\n<p><strong>Key Arguments:<\/strong><\/p>\n<ul>\n<li><p><code>what<\/code>: The function to call (e.g., rbind, paste).<\/p><\/li>\n<li><p><code>args<\/code>: A list of arguments to pass.<\/p><\/li>\n<li><p><code>quote<\/code> (optional): Rarely used, defaults to FALSE.<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"use-cases-2\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"use-cases-2\">Use Cases<\/h3>\n<ul>\n<li><p>Combining variable inputs.<\/p><\/li>\n<li><p>Running functions dynamically.<\/p><\/li>\n<li><p>Simplifying calls with list-based data.<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"examples-2\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"examples-2\">Examples<\/h3>\n<section id=\"simple-vector-mashup\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"simple-vector-mashup\">Simple: Vector Mashup<\/h4>\n<div class=\"cell\">\n<pre>chunks &lt;- list(1:3, 4:6)\nall &lt;- do.call(c, chunks)\nprint(all)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] 1 2 3 4 5 6<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> do.call feeds the list to c(), stitching the vectors together effortlessly.<\/p>\n<\/section>\n<section id=\"intermediate-custom-join\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"intermediate-custom-join\">Intermediate: Custom Join<\/h4>\n<div class=\"cell\">\n<pre>bits &lt;- list(&quot;Code&quot;, &quot;Runs&quot;, &quot;Fast&quot;)\njoined &lt;- do.call(paste, c(bits, list(sep = &quot;|&quot;)))\nprint(joined)<\/pre>\n<div class=\"cell-output 
cell-output-stdout\">\n<pre>[1] &quot;Code|Runs|Fast&quot;<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> do.call combines the list with a sep argument, creating a piped string in one smooth move.<\/p>\n<\/section>\n<section id=\"advanced-flexible-binding\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"advanced-flexible-binding\">Advanced: Flexible Binding<\/h4>\n<div class=\"cell\">\n<pre>df_list &lt;- list(data.frame(x = 1:2), data.frame(x = 3:4))\ndirection &lt;- &quot;vertical&quot;\nbound &lt;- do.call(if (direction == &quot;vertical&quot;) rbind else cbind, df_list)\nprint(bound)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>  x\n1 1\n2 2\n3 3\n4 4<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> With direction = \u201cvertical\u201d, do.call uses rbind to stack rows. Change it to \u201chorizontal\u201d, and cbind takes over\u2014dynamic and smart.<\/p>\n<\/section>\n<\/section>\n<\/section>\n<section id=\"janitorclean_names-tame-your-column-chaos\" class=\"level2\">\n<h2 class=\"anchored\" data-anchor-id=\"janitorclean_names-tame-your-column-chaos\">4. janitor::clean_names: Tame Your Column Chaos<\/h2>\n<section id=\"what-it-does-and-its-arguments-3\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"what-it-does-and-its-arguments-3\">What It Does and Its Arguments<\/h3>\n<p>From the janitor package, clean_names() transforms messy column names into consistent, code-friendly formats (e.g., lowercase with underscores). 
It\u2019s a time-saver you\u2019ll wish you\u2019d known sooner.<\/p>\n<p><strong>Key Arguments:<\/strong><\/p>\n<ul>\n<li><p><code>dat<\/code>: The data frame to clean.<\/p><\/li>\n<li><p><code>case<\/code>: The style for names (e.g., \u201csnake\u201d, \u201csmall_camel\u201d, defaults to \u201csnake\u201d).<\/p><\/li>\n<li><p><code>replace<\/code>: A named vector for custom replacements (optional).<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"use-cases-3\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"use-cases-3\">Use Cases<\/h3>\n<ul>\n<li><p>Standardizing imported data with ugly headers.<\/p><\/li>\n<li><p>Prepping data frames for analysis or plotting.<\/p><\/li>\n<li><p>Avoiding frustration with inconsistent naming.<\/p><\/li>\n<\/ul>\n<\/section>\n<section id=\"examples-3\" class=\"level3\">\n<h3 class=\"anchored\" data-anchor-id=\"examples-3\">Examples<\/h3>\n<section id=\"simple-basic-cleanup\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"simple-basic-cleanup\">Simple: Basic Cleanup<\/h4>\n<div class=\"cell\">\n<pre>library(janitor)\n\n# Create a dataframe with messy column names\ndf &lt;- data.frame(\n  `First Name` = c(&quot;John&quot;, &quot;Mary&quot;, &quot;David&quot;),\n  `Last.Name` = c(&quot;Smith&quot;, &quot;Johnson&quot;, &quot;Williams&quot;),\n  `Email-Address` = c(&quot;john@example.com&quot;, &quot;mary@example.com&quot;, &quot;david@example.com&quot;),\n  `Annual Income ($)` = c(65000, 78000, 52000),\n  check.names = FALSE\n)\n\n# View original column names\nnames(df)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] &quot;First Name&quot;        &quot;Last.Name&quot;         &quot;Email-Address&quot;    \n[4] &quot;Annual Income ($)&quot;<\/pre>\n<\/div>\n<pre># Clean the names\nclean_df &lt;- clean_names(df)\n\n# View cleaned column names\nnames(clean_df)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] &quot;first_name&quot;    &quot;last_name&quot;     
&quot;email_address&quot; &quot;annual_income&quot;<\/pre>\n<\/div>\n<\/div>\n<p>What <code>clean_names()<\/code> specifically does:<\/p>\n<ul>\n<li><p>Converts all names to lowercase<\/p><\/li>\n<li><p>Replaces spaces with underscores<\/p><\/li>\n<li><p>Replaces separators such as periods and hyphens with underscores, and drops leftover symbols such as <code>($)<\/code><\/p><\/li>\n<li><p>Creates names that are valid R variable names and follow standard naming conventions<\/p><\/li>\n<\/ul>\n<p>This standardization makes your data more consistent, easier to work with, and helps prevent errors when manipulating or joining datasets.<\/p>\n<\/section>\n<section id=\"intermediate-custom-style\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"intermediate-custom-style\">Intermediate: Custom Style<\/h4>\n<div class=\"cell\">\n<pre>library(dplyr)\nlibrary(purrr)\n\n# Create multiple dataframes with inconsistent naming\ndf1 &lt;- data.frame(\n  `Customer ID` = 1:3,\n  `First Name` = c(&quot;John&quot;, &quot;Mary&quot;, &quot;David&quot;),\n  `LAST NAME` = c(&quot;Smith&quot;, &quot;Johnson&quot;, &quot;Williams&quot;),\n  check.names = FALSE\n)\n\ndf2 &lt;- data.frame(\n  `customer.id` = 4:6,\n  `firstName` = c(&quot;Michael&quot;, &quot;Linda&quot;, &quot;James&quot;),\n  `lastName` = c(&quot;Brown&quot;, &quot;Davis&quot;, &quot;Miller&quot;),\n  check.names = FALSE\n)\n\ndf3 &lt;- data.frame(\n  `cust_id` = 7:9,\n  `first-name` = c(&quot;Robert&quot;, &quot;Jennifer&quot;, &quot;Thomas&quot;),\n  `last-name` = c(&quot;Wilson&quot;, &quot;Martinez&quot;, &quot;Anderson&quot;),\n  check.names = FALSE\n)\n\n# List of dataframes\ndfs &lt;- list(df1, df2, df3)\n\n# Clean names of all dataframes\nclean_dfs &lt;- map(dfs, clean_names)\n\n# Print column names for each cleaned dataframe\nmap(clean_dfs, names)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[[1]]\n[1] &quot;customer_id&quot; &quot;first_name&quot;  &quot;last_name&quot;  \n\n[[2]]\n[1] &quot;customer_id&quot; &quot;first_name&quot;  
&quot;last_name&quot;  \n\n[[3]]\n[1] &quot;cust_id&quot;    &quot;first_name&quot; &quot;last_name&quot; <\/pre>\n<\/div>\n<pre># Bind the dataframes (now possible because of standardized column names)\ncombined_df &lt;- bind_rows(clean_dfs)\nprint(combined_df)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>  customer_id first_name last_name cust_id\n1           1       John     Smith      NA\n2           2       Mary   Johnson      NA\n3           3      David  Williams      NA\n4           4    Michael     Brown      NA\n5           5      Linda     Davis      NA\n6           6      James    Miller      NA\n7          NA     Robert    Wilson       7\n8          NA   Jennifer  Martinez       8\n9          NA     Thomas  Anderson       9<\/pre>\n<\/div>\n<\/div>\n<p>This code demonstrates a more advanced use case of the <code>clean_names()<\/code> function when working with multiple data frames that have inconsistent naming conventions. Note that because of the different column names for customer ID, we have missing values in the combined dataframe. 
This example demonstrates why standardized naming is important.<\/p>\n<\/section>\n<section id=\"advanced-targeted-fixes\" class=\"level4\">\n<h4 class=\"anchored\" data-anchor-id=\"advanced-targeted-fixes\">Advanced: Targeted Fixes<\/h4>\n<div class=\"cell\">\n<pre>df &lt;- data.frame(&quot;ID#&quot; = 1:2, &quot;Sales_%&quot; = c(10, 20), &quot;Q1 Revenue&quot; = c(100, 200))\ncleaned &lt;- clean_names(df, replace = c(&quot;#&quot; = &quot;_num&quot;, &quot;%&quot; = &quot;_pct&quot;))\nprint(names(cleaned))<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] &quot;id&quot;         &quot;sales&quot;      &quot;q1_revenue&quot;<\/pre>\n<\/div>\n<\/div>\n<p><em><strong>Explanation<\/strong>:<\/em> The <code>replace<\/code> argument is meant to swap # for _num and % for _pct before clean_names handles the rest. In this output it has no visible effect: <code>data.frame()<\/code> was called without <code>check.names = FALSE<\/code>, so R had already turned # and % into dots before <code>clean_names()<\/code> ran, leaving nothing for <code>replace<\/code> to match. Add <code>check.names = FALSE<\/code>, as in the earlier examples, to let the custom replacements apply.<\/p>\n<div class=\"cell\">\n<pre>library(readxl)\n\n\n# Create a temporary Excel file with problematic column names\ntemp_file &lt;- tempfile(fileext = &quot;.xlsx&quot;)\ndf &lt;- data.frame(\n  `ID#` = 1:5,\n  `%_Completed` = c(85, 92, 78, 100, 65),\n  `Result (Pass\/Fail)` = c(&quot;Pass&quot;, &quot;Pass&quot;, &quot;Fail&quot;, &quot;Pass&quot;, &quot;Fail&quot;),\n  `\u03bcg\/mL` = c(0.5, 0.8, 0.3, 1.2, 0.4),\n  `p-value` = c(0.03, 0.01, 0.08, 0.002, 0.06),\n  check.names = FALSE\n)\n\n# Save as Excel (simulating real-world data source)\nif (require(writexl)) {\n  write_xlsx(df, temp_file)\n} else {\n  # Fall back to CSV if writexl not available\n  write.csv(df, sub(&quot;\\\\.xlsx$&quot;, &quot;.csv&quot;, temp_file), row.names = FALSE)\n  temp_file &lt;- sub(&quot;\\\\.xlsx$&quot;, &quot;.csv&quot;, temp_file)\n}\n\n# Read the file back\nif (temp_file == sub(&quot;\\\\.xlsx$&quot;, &quot;.csv&quot;, temp_file)) {\n  imported_df &lt;- read.csv(temp_file, check.names = FALSE)\n} else {\n  imported_df &lt;- read_excel(temp_file)\n}\n\n# View original column names\nprint(names(imported_df))<\/pre>\n<div class=\"cell-output 
cell-output-stdout\">\n<pre>[1] &quot;ID#&quot;                &quot;%_Completed&quot;        &quot;Result (Pass\/Fail)&quot;\n[4] &quot;\u03bcg\/mL&quot;              &quot;p-value&quot;           <\/pre>\n<\/div>\n<pre># Create custom replacements\ncustom_replacements &lt;- c(\n  &quot;\u03bcg&quot; = &quot;ug&quot;,  # Replace Greek letter\n  &quot;%&quot; = &quot;percent&quot;,  # Replace percent symbol\n  &quot;#&quot; = &quot;num&quot;   # Replace hash\n)\n\n# Clean with custom replacements\nclean_df &lt;- imported_df %&gt;%\n  clean_names() %&gt;%\n  rename_with(~ stringr::str_replace_all(., &quot;p_value&quot;, &quot;probability&quot;))\n\n# View cleaned column names\nprint(names(clean_df))<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre>[1] &quot;id_number&quot;         &quot;percent_completed&quot; &quot;result_pass_fail&quot; \n[4] &quot;mg_m_l&quot;            &quot;probability&quot;      <\/pre>\n<\/div>\n<pre># Print the cleaned dataframe\nprint(clean_df)<\/pre>\n<div class=\"cell-output cell-output-stdout\">\n<pre># A tibble: 5 \u00d7 5\n  id_number percent_completed result_pass_fail mg_m_l probability\n      &lt;dbl&gt;             &lt;dbl&gt; &lt;chr&gt;             &lt;dbl&gt;       &lt;dbl&gt;\n1         1                85 Pass                0.5       0.03 \n2         2                92 Pass                0.8       0.01 \n3         3                78 Fail                0.3       0.08 \n4         4               100 Pass                1.2       0.002\n5         5                65 Fail                0.4       0.06 <\/pre>\n<\/div>\n<\/div>\n<p>The final output shows the transformation from problematic column names to standardized ones:<\/p>\n<p>From:<\/p>\n<ul>\n<li><p><code>ID#<\/code><\/p><\/li>\n<li><p><code>%_Completed<\/code><\/p><\/li>\n<li><p><code>Result 
(Pass\/Fail)<\/code><\/p><\/li>\n<li><p><code>\u03bcg\/mL<\/code><\/p><\/li>\n<li><p><code>p-value<\/code><\/p><\/li>\n<\/ul>\n<p>To:<\/p>\n<ul>\n<li><p><code>id_number<\/code><\/p><\/li>\n<li><p><code>percent_completed<\/code><\/p><\/li>\n<li><p><code>result_pass_fail<\/code><\/p><\/li>\n<li><p><code>mg_m_l<\/code><\/p><\/li>\n<li><p><code>probability<\/code><\/p><\/li>\n<\/ul>\n<p>This example demonstrates how <code>clean_names()<\/code> can be part of a more sophisticated data preparation workflow, especially when working with real-world data sources that contain problematic characters and naming conventions. Note that <code>custom_replacements<\/code> is defined but never passed to <code>clean_names()<\/code>, so these names come from janitor\u2019s defaults: # becomes number, % becomes percent, and \u03bc is transliterated to m.<\/p>\n<\/section>\n<\/section>\n<\/section>\n<section id=\"conclusion-why-these-functions-deserve-your-attention\" class=\"level2\">\n<h2 class=\"anchored\" data-anchor-id=\"conclusion-why-these-functions-deserve-your-attention\">Conclusion: Why These Functions Deserve Your Attention<\/h2>\n<p>R\u2019s ecosystem is vast, but it\u2019s easy to stick to the familiar and miss out on tools like Reduce, vapply, do.call and clean_names. These functions might not top the popularity charts, yet they pack a punch, whether it\u2019s collapsing data without loops, ensuring type safety, adapting on the fly, or fixing messy names. The examples here show just a taste of what they can do, from quick fixes to complex tasks. Curious to see how they fit into your workflow? Fire up R, play with them, and discover how these underdogs can become your new go-tos. What other hidden R treasures have you found? Drop them in the comments; I\u2019d love to hear!<\/p>\n<\/section>\n<section id=\"references\" class=\"level2\">\n<h2 class=\"anchored\" data-anchor-id=\"references\">References<\/h2>\n<ul>\n<li><p>R Core Team (2025). <em>R: A Language and Environment for Statistical Computing<\/em>. R Foundation for Statistical Computing, Vienna, Austria. 
Available at: <a href=\"https:\/\/www.r-project.org\/\" class=\"uri\" rel=\"nofollow\" target=\"_blank\">https:\/\/www.R-project.org\/<\/a><\/p><\/li>\n<li><p>Firke, Sam (2023). <em>janitor: Simple Tools for Examining and Cleaning Dirty Data<\/em>. CRAN. Available at: <a href=\"https:\/\/cran.r-project.org\/package=janitor\" class=\"uri\" rel=\"nofollow\" target=\"_blank\">https:\/\/CRAN.R-project.org\/package=janitor<\/a><\/p><\/li>\n<li><p>R Documentation for Reduce, vapply, do.call, clean_names.<\/p><\/li>\n<\/ul>\n\n\n<\/section>\n","protected":false},
"excerpt":{"rendered":"\n<p>R is packed with powerhouse tools\u2014think dplyr for data wrangling, ggplot2 for stunning visuals, or tidyr for tidying up messes. But beyond the headliners, there\u2019s a lineup of lesser-known functions that deserve a spot in your toolkit. These hidd&#8230;<\/p>\n","protected":false},"author":2941,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"aioseo_notices":[],"jetpack-related-posts":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/391142"}],"collection":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/users\/2941"}],"replies":[{"embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/comments?post=391142"}],"version-history":[{"count":1,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/391142\/revisions"}],"predecessor-version":[{"id":391143,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/391142\/revisions\/391143"}],"wp:attachment":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/media?parent=391142"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/categories?post=391142"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/
wp\/v2\/tags?post=391142"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}