{"id":387755,"date":"2024-10-01T16:24:49","date_gmt":"2024-10-01T22:24:49","guid":{"rendered":"http:\/\/xianblog.wordpress.com\/?p=57441"},"modified":"2024-10-01T16:24:49","modified_gmt":"2024-10-01T22:24:49","slug":"hands-on-differential-privacy-book-review","status":"publish","type":"post","link":"https:\/\/www.r-bloggers.com\/2024\/10\/hands-on-differential-privacy-book-review\/","title":{"rendered":"Hands-On Differential Privacy [book review]"},"content":{"rendered":"<div style=\"border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;\">\r\n[This article was first published on  <strong><a href=\"https:\/\/xianblog.wordpress.com\/2024\/10\/02\/hands-on-differential-privacy-book-review\/\"> R \u2013 Xi&#039;an&#039;s Og<\/a><\/strong>, and kindly contributed to <a href=\"https:\/\/www.r-bloggers.com\/\" rel=\"nofollow\">R-bloggers<\/a>].\r\n<\/div>\n<p style=\"text-align: justify\"><a href=\"https:\/\/amzn.to\/3ThqsXu\" rel=\"nofollow\" target=\"_blank\"><img loading=\"lazy\" class=\"alignleft \" src=\"https:\/\/learning.oreilly.com\/library\/cover\/9781492097730\/250w\/\" width=\"210\" height=\"276\" \/><\/a><a href=\"https:\/\/amzn.to\/3ThqsXu\" rel=\"nofollow\" target=\"_blank\">Hands-On Differential Privacy<\/a> was published just a few months ago (from September\u00a0 2024!) 
by (the US publisher) <a href=\"https:\/\/amzn.to\/3TvHp0d\" rel=\"nofollow\" target=\"_blank\">O\u2019Reilly<\/a>, famous for its programming and technical books with animal covers! A <em>slate pencil sea urchin<\/em> in the present <a href=\"https:\/\/amzn.to\/3zagccR\" rel=\"nofollow\" target=\"_blank\">case<\/a>. The <a href=\"https:\/\/amzn.to\/3zagccR\" rel=\"nofollow\" target=\"_blank\">book<\/a> is indeed classic <a href=\"https:\/\/amzn.to\/3TvHp0d\" rel=\"nofollow\" target=\"_blank\">O\u2019Reilly<\/a>, with lots of notes, little theory (or maths!) and few symbols, a loose structuring of the chapters (no section numbers), highly detailed examples, and of course plenty of OpenDP code inserts. For instance, in the present <a href=\"https:\/\/amzn.to\/3zagccR\" rel=\"nofollow\" target=\"_blank\">case<\/a>, the case study on the privatization of a sample average x\u0304 takes about ten pages. Terrible equation rendering btw (what\u2019s wrong with L<sup>A<\/sup>T<sub>E<\/sub>X?!). Overall, I am quickly lost in most of the chapters due to the lack of a driving narrative, facing instead a catalogue of possible scenarios and procedures, appearing one after the other as in a fashion show.<\/p>\n<p style=\"text-align: justify\"><a href=\"https:\/\/amzn.to\/3ThqsXu\" rel=\"nofollow\" target=\"_blank\">Hands-On Differential Privacy<\/a> is written by Ethan Cowan, Michael Shoemate, and Mayana Pereira. 
I came across the book during the <a href=\"https:\/\/opendp.org\/event\/2024-opendp-community-meeting\" rel=\"nofollow\" target=\"_blank\">OpenDP workshop<\/a> at Harvard [that took place right after my return from the <a href=\"https:\/\/xianblog.wordpress.com\/2024\/09\/07\/a-journal-of-the-pacific-and-northwest-t%C6%BFenight\/\" rel=\"nofollow\" target=\"_blank\">Pacific Northwest<\/a>] and it is definitely linked with <a href=\"https:\/\/docs.opendp.org\/en\/stable\/index.html\" rel=\"nofollow\" target=\"_blank\">OpenDP<\/a>, all authors having been involved at one stage or another in the <a href=\"https:\/\/sites.harvard.edu\/opendp\/\" rel=\"nofollow\" target=\"_blank\">OpenDP Team<\/a>. The style of the <a href=\"https:\/\/amzn.to\/3zagccR\" rel=\"nofollow\" target=\"_blank\">book<\/a> is once again in tune with the O\u2019Reilly manuals, which sort of clashes with my preferences. For instance, the introduction of differential privacy (Chapter 2) is quite extensive. Chapter 3 proceeds to teach about private data transform(ation)s and stability (a rewording of Lipschitz continuity), with code illustrations, often repeating the earlier derivation (see, e.g., p203), while Chapter 4 is its equivalent for private mechanisms. (The diagrams in Figures 3-1 and 4-1 differ only in which functions are highlighted\/bolded in a privatized data processing pipeline.) Returning to differential privacy with a privacy loss parameter and to Laplace and exponential mechanisms, Chapter 5 proposes several notions of privacy, all closed under post-processing. This includes Wasserman and Zhou\u2019s (2010) interpretation of privacy as hypothesis testing, except it is not exploited further than connecting type I and type II errors with the (\u03b5,\u03b4) parameters. Chapter 6 concludes Part I about concepts with a series of (fearless) combinators that preserve stability and privacy. 
With an increasing proportion of coding excerpts, which I did not find particularly helpful [imho].<\/p>\n<p style=\"text-align: justify\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Heterocentrotus_mamillatus\" rel=\"nofollow\" target=\"_blank\"><img loading=\"lazy\" class=\"aligncenter size-full\" src=\"https:\/\/i0.wp.com\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/1\/12\/Red_pencil_urchin_-_Papah%C4%81naumoku%C4%81kea.jpg\/270px-Red_pencil_urchin_-_Papah%C4%81naumoku%C4%81kea.jpg?resize=270%2C180&#038;ssl=1\" width=\"270\" height=\"180\" data-recalc-dims=\"1\" \/><\/a>Nothing about statistical loss of information or efficiency, bias, etc., until Chapter 8 (p199) and even then so little. Part II is about practice, opening with Chapter 7 on setting a privacy unit (e.g., a person-month) before ensuring its privacy is protected. And discussing unbounded contributions (not unbounded data!). Chapter 8 very thinly covers statistical modelling, remaining agnostic about the choice of statistical procedures (Bayes being solely and na\u00efvely mentioned for classification, furthermore with data-based evaluation of the class \u201cprior\u201d probabilities, p211). At this stage, procedures are often only defined through snippets of code, like the private Theil-Sen estimator (pp204-205). 
The continuous case boils down to a Normality assumption, with its \u201cpmf\u201d being defined (p212) as<\/p>\n<p style=\"text-align: center\"><img src=\"https:\/\/s0.wp.com\/latex.php?latex=%5Ctext%7BPr%7D%28x%3D%5Cmu%29%3D%5Cfrac%7B1%7D%7B%5Csqrt%7B2%5Cpi%5Csigma%7D%7De%5E%7B-%28x-%5Cmu%29%2F2%5Csigma%5E2%7D&#038;bg=000000&#038;%23038;fg=B0B0B0&#038;%23038;s=0&#038;%23038;c=20201002\" srcset_temp=\"https:\/\/s0.wp.com\/latex.php?latex=%5Ctext%7BPr%7D%28x%3D%5Cmu%29%3D%5Cfrac%7B1%7D%7B%5Csqrt%7B2%5Cpi%5Csigma%7D%7De%5E%7B-%28x-%5Cmu%29%2F2%5Csigma%5E2%7D&#038;bg=000000&#038;fg=B0B0B0&#038;s=0&#038;c=20201002 1x, https:\/\/s0.wp.com\/latex.php?latex=%5Ctext%7BPr%7D%28x%3D%5Cmu%29%3D%5Cfrac%7B1%7D%7B%5Csqrt%7B2%5Cpi%5Csigma%7D%7De%5E%7B-%28x-%5Cmu%29%2F2%5Csigma%5E2%7D&#038;bg=000000&#038;fg=B0B0B0&#038;s=0&#038;c=20201002&#038;zoom=4.5 4x\" alt=\"\\text{Pr}(x=\\mu)=\\frac{1}{\\sqrt{2\\pi\\sigma}}e^{-(x-\\mu)\/2\\sigma^2}\" class=\"latex\" \/><\/p>\n<p style=\"text-align: justify\">which contains at least three errors! (A Normal variate has a density rather than a pmf, with Pr(x=\u03bc)=0; the normalising constant should be 1\/\u221a(2\u03c0\u03c3<sup>2<\/sup>); and the exponent should be \u2212(x\u2212\u03bc)<sup>2<\/sup>\/2\u03c3<sup>2<\/sup>.) Chapter 9 is the equivalent of Chapter 8 for machine learning, mostly centred on private gradient descent. And a PyTorch section (pp232-235). 
Completed by a light Chapter 10 on synthetic data, which does not seem to broach the issue of high-dimensional covariates, providing instead a list of GAN synthesizers.<\/p>\n<p style=\"text-align: justify\">Part III (Deploying differential privacy) is even more about practice, with Chapter 11 on privacy attacks, Chapter 12 on calibrating a privacy mechanism (co-written with Jayshree Sarathy), and good practice (like codebooks and data annotations), with the appearance of contextual integrity, which I discovered, if not perfectly understood, last year at the <a href=\"https:\/\/xianblog.wordpress.com\/2023\/08\/05\/contextual-integrity-for-differential-privacy-4-23w5106\/\" rel=\"nofollow\" target=\"_blank\">BIRS workshop<\/a> in <a href=\"https:\/\/xianblog.wordpress.com\/2023\/09\/01\/a-journal-of-the-year-of-fires\/\" rel=\"nofollow\" target=\"_blank\">Kelowna<\/a>. And Chapter 13 on planning a privacy project, with an 11-step checklist whose steps are mostly quite vague [imho] but do include strategies to make the data owners confident their privacy is safe.<\/p>\n<p style=\"text-align: justify\"><em>[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my <a href=\"http:\/\/chance.amstat.org\/category\/columns\/book-reviews\/\" rel=\"nofollow\" target=\"_blank\">Books Review section<\/a> in <a href=\"http:\/\/chance.amstat.org\/\" rel=\"nofollow\" target=\"_blank\">CHANCE<\/a>]<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<div style = \"width:60%; display: inline-block; float:left; \"> Hands-On Differential Privacy was published just a few months ago (from September\u00a0 2024!) by (the US publisher) O\u2019Reilly, famous for its programming and technical books with animal covers! A slate pencil sea urchin in the present case. 
The book is indeed classical O\u2019Reilly\u2019s, with lots of notes, little &#8230;<\/div>\n<div style = \"width: 40%; display: inline-block; float:right;\"><\/div>\n<div style=\"clear: both;\"><\/div>\n","protected":false},"author":56,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"aioseo_notices":[],"jetpack-related-posts":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/387755"}],"collection":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/users\/56"}],"replies":[{"embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/comments?post=387755"}],"version-history":[{"count":1,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/387755\/revisions"}],"predecessor-version":[{"id":387756,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/posts\/387755\/revisions\/387756"}],"wp:attachment":[{"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/media?parent=387755"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/categories?post=387755"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.r-bloggers.com\/wp-json\/wp\/v2\/tags?post=387755"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}