{"id":69904,"date":"2023-02-06T09:00:46","date_gmt":"2023-02-06T17:00:46","guid":{"rendered":"https:\/\/github.blog\/?p=69904"},"modified":"2023-02-06T10:26:02","modified_gmt":"2023-02-06T18:26:02","slug":"the-technology-behind-githubs-new-code-search","status":"publish","type":"post","link":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/","title":{"rendered":"The technology behind GitHub\u2019s new code search"},"content":{"rendered":"<p>From launching our <a href=\"https:\/\/github.blog\/2021-12-08-improving-github-code-search\/\">technology preview<\/a> of the new and improved code search experience a year ago, to the <a href=\"https:\/\/github.blog\/2022-11-15-a-better-way-to-search-navigate-and-understand-code-on-github\/\">public beta<\/a> we released at GitHub Universe last November, there\u2019s been a flurry of innovation and dramatic changes to some of the core GitHub product experiences around how we, as developers, find, read, and navigate code.<\/p>\n<p>One question we hear about the new code search experience is, \u201cHow does it work?\u201d And to complement <a href=\"https:\/\/www.youtube.com\/watch?v=QCs76SC1ZZ0\">a talk I gave at GitHub Universe<\/a>, this post gives a high-level answer to that question and provides a small window into the system architecture and technical underpinnings of the product.<\/p>\n<p>So, how <em>does<\/em> it work? The short answer is that we built our own search engine from scratch, in Rust, specifically for the domain of code search. We call this search engine Blackbird, but before I explain how it works, I think it helps to understand our motivation a little bit. At first glance, building a search engine from scratch seems like a questionable decision. Why would you do that? Aren\u2019t there plenty of existing, open source solutions out there already? 
Why build something new?<\/p>\n<p>To be fair, we\u2019ve tried and have been trying, for almost the entire history of GitHub, to use existing solutions for this problem. You can read a bit more about our journey in Pavel Avgustinov\u2019s post, <a href=\"https:\/\/github.blog\/2021-12-15-a-brief-history-of-code-search-at-github\/\">A brief history of code search at GitHub<\/a>, but one thing sticks out: we haven\u2019t had a lot of luck using general text search products to power <strong><em>code<\/em><\/strong> search. The user experience is poor, indexing is slow, and it\u2019s expensive to host. There are some newer, code-specific open source projects out there, but they definitely don\u2019t work at GitHub\u2019s scale. So, knowing all that, we were motivated to create our own solution by three things:<\/p>\n<ol>\n<li>We\u2019ve got a vision for an entirely new user experience that\u2019s about being able to ask questions of code and get answers through iteratively searching, browsing, navigating, and reading code.<\/li>\n<li>We understand that <strong><em>code<\/em><\/strong> search is uniquely different from general text search. Code is already designed to be understood by machines and we should be able to take advantage of that structure and relevance. Searching code also has unique requirements: we want to search for punctuation (for example, a period or open parenthesis); we don\u2019t want stemming; we don\u2019t want stop words to be stripped from queries; and, we want to search with regular expressions.<\/li>\n<li>GitHub\u2019s scale is truly a unique challenge. When we first deployed Elasticsearch, it took months to index all of the code on GitHub (about 8 million repositories at the time). Today, that number is north of 200 million, and that code isn\u2019t static: it\u2019s constantly changing and that\u2019s quite challenging for search engines to handle. 
For the beta, you can currently search almost 45 million repositories, representing 115 TB of code and 15.5 billion documents. <\/li>\n<\/ol>\n<p>At the end of the day, nothing off the shelf met our needs, so we built something from scratch.<\/p>\n<h2 id=\"just-use-grep\"><a class=\"heading-link\" href=\"#just-use-grep\">Just use grep?<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h2>\n<p>First though, let\u2019s explore the brute force approach to the problem. We get this question a lot: \u201cWhy don\u2019t you just use grep?\u201d To answer that, let\u2019s do a little napkin math using <a href=\"https:\/\/github.com\/BurntSushi\/ripgrep\">ripgrep<\/a> on that 115 TB of content. On a machine with an eight core Intel CPU, ripgrep <a href=\"https:\/\/github.com\/BurntSushi\/ripgrep#quick-examples-comparing-tools\">can run an exhaustive regular expression query<\/a> on a 13 GB file cached in memory in 2.769 seconds, or about 0.6 GB\/sec\/core.<\/p>\n<p>We can see pretty quickly that this really isn\u2019t going to work for the larger amount of data we have. Code search runs on 64 core, 32 machine clusters. Even if we managed to put 115 TB of code in memory and assume we can perfectly parallelize the work, we\u2019re going to saturate 2,048 CPU cores for 96 seconds to serve a single query! Only that one query can run. Everybody else has to get in line. This comes out to a whopping 0.01 queries per second (QPS) and good luck doubling your QPS\u2014that\u2019s going to be a fun conversation with leadership about your infrastructure bill.<\/p>\n<p>There\u2019s just no cost-effective way to scale this approach to all of GitHub\u2019s code and all of GitHub\u2019s users. 
Even if we threw a ton of money at the problem, it still wouldn\u2019t meet our user experience goals.<\/p>\n<p>You can see where this is going: we need to build an index.<\/p>\n<h2 id=\"a-search-index-primer\"><a class=\"heading-link\" href=\"#a-search-index-primer\">A search index primer<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h2>\n<p>We can only make queries fast if we pre-compute a bunch of information in the form of indices, which you can think of as maps from keys to sorted lists of document IDs (called \u201cposting lists\u201d) where that key appears. As an example, here\u2019s a small index for programming languages. We scan each document to detect what programming language it\u2019s written in, assign a document ID, and then create an inverted index where language is the key and the value is a posting list of document IDs.<\/p>\n<h3 id=\"forward-index\"><a class=\"heading-link\" href=\"#forward-index\">Forward index<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h3>\n<div data-target=\"content-table-wrap.container\" class=\"content-table-wrap\"><content-table-wrap><table>\n<tr>\n<td><strong>Doc ID<\/strong>\n   <\/td>\n<td><strong>Content<\/strong>\n   <\/td>\n<\/tr>\n<tr>\n<td>1\n   <\/td>\n<td>def lim<br \/>\n    puts &#8220;mit&#8221;<br \/>\n    end\n   <\/td>\n<\/tr>\n<tr>\n<td>2\n   <\/td>\n<td>fn limits() {\n   <\/td>\n<\/tr>\n<tr>\n<td>3\n   <\/td>\n<td>function mits() {\n   <\/td>\n<\/tr>\n<\/table><\/content-table-wrap><\/div>\n<h3 id=\"inverted-index\"><a class=\"heading-link\" href=\"#inverted-index\">Inverted index<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h3>\n<div data-target=\"content-table-wrap.container\" class=\"content-table-wrap\"><content-table-wrap><table>\n<tr>\n<td><strong>Language<\/strong>\n   <\/td>\n<td><strong>Doc IDs (postings)<\/strong>\n   <\/td>\n<\/tr>\n<tr>\n<td>JavaScript\n  
 <\/td>\n<td>3, 8, 12, &#8230;\n   <\/td>\n<\/tr>\n<tr>\n<td>Ruby\n   <\/td>\n<td>1, 10, 13, &#8230;\n   <\/td>\n<\/tr>\n<tr>\n<td>Rust\n   <\/td>\n<td>2, 5, 11, &#8230;\n   <\/td>\n<\/tr>\n<\/table><\/content-table-wrap><\/div>\n<p>For code search, we need a special type of inverted index called an ngram index, which is useful for looking up substrings of content. An <a href=\"https:\/\/en.wikipedia.org\/wiki\/N-gram\">ngram<\/a> is a sequence of characters of length <em>n<\/em>. For example, if we choose n=3 (trigrams), the ngrams that make up the content &#8220;limits&#8221; are <code>lim<\/code>, <code>imi<\/code>, <code>mit<\/code>, <code>its<\/code>. With our documents above, the index for those trigrams would look like this:<\/p>\n<div data-target=\"content-table-wrap.container\" class=\"content-table-wrap\"><content-table-wrap><table>\n<tr>\n<td><strong>ngram<\/strong>\n   <\/td>\n<td><strong>Doc IDs (postings)<\/strong>\n   <\/td>\n<\/tr>\n<tr>\n<td>lim\n   <\/td>\n<td>1, 2, &#8230;\n   <\/td>\n<\/tr>\n<tr>\n<td>imi\n   <\/td>\n<td>2, &#8230;\n   <\/td>\n<\/tr>\n<tr>\n<td>mit\n   <\/td>\n<td>1, 2, 3, &#8230;\n   <\/td>\n<\/tr>\n<tr>\n<td>its\n   <\/td>\n<td>2, 3, &#8230;\n   <\/td>\n<\/tr>\n<\/table><\/content-table-wrap><\/div>\n<p>To perform a search, we intersect the results of multiple lookups to give us the list of documents where the string appears. With a trigram index you need four lookups: <code>lim<\/code>, <code>imi<\/code>, <code>mit<\/code>, and <code>its<\/code> in order to fulfill the query for <code>limits<\/code>.<\/p>\n<p>Unlike a hashmap though, these indices are too big to fit in memory, so instead, we build iterators for each index we need to access. These lazily return sorted document IDs (the IDs are assigned based on the ranking of each document) and we intersect and union the iterators (as demanded by the specific query) and only read far enough to fetch the requested number of results. 
That way we never have to keep entire posting lists in memory.<\/p>\n<h2 id=\"indexing-45-million-repositories\"><a class=\"heading-link\" href=\"#indexing-45-million-repositories\">Indexing 45 million repositories<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h2>\n<p>The next problem we have is how to build this index in a reasonable amount of time (remember, this took months in our first iteration). As is often the case, the trick here is to identify some insight into the specific data we\u2019re working with to guide our approach. In our case it\u2019s two things: Git\u2019s use of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Content-addressable_storage\">content addressable hashing<\/a> and the fact that there\u2019s actually quite a lot of duplicate content on GitHub. Those two insights led us to the following decisions:<\/p>\n<ol>\n<li><strong>Shard by Git blob object ID<\/strong>, which gives us a nice way of evenly distributing documents between the shards while avoiding any duplication. There won\u2019t be any hot servers due to special repositories and we can easily scale the number of shards as necessary.<\/li>\n<li><strong>Model the index as a tree<\/strong> and use delta encoding to reduce the amount of crawling and to optimize the metadata in our index. For us, metadata are things like the list of locations where a document appears (which path, branch, and repository) and information about those objects (repository name, owner, visibility, etc.). This data can be quite large for popular content.<\/li>\n<\/ol>\n<p>We also designed the system so that query results are consistent on a commit-level basis. If you search a repository while your teammate is pushing code, your results shouldn\u2019t include documents from the new commit until it has been fully processed by the system. 
In fact, while you\u2019re getting back results from a repository-scoped query, someone else could be paging through global results and looking at a different, prior, but still <strong><em>consistent<\/em><\/strong> state of the index. This is tricky to do with other search engines. Blackbird provides this level of query consistency as a core part of its design.<\/p>\n<h2 id=\"lets-build-an-index\"><a class=\"heading-link\" href=\"#lets-build-an-index\">Let\u2019s build an index<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h2>\n<p>Armed with those insights, let\u2019s turn our attention to building an index with Blackbird. This diagram represents a high level overview of the ingest and indexing side of the system.<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" loading=\"lazy\" src=\"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/Canvas_1-1.png?w=1024&#038;resize=1024%2C692\" alt=\"a high level overview of the ingest and indexing side of the system\" width=\"1024\" height=\"692\" class=\"aligncenter size-large wp-image-69988 width-fit\" srcset=\"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/Canvas_1-1.png?w=1376 1376w, https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/Canvas_1-1.png?w=300 300w, https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/Canvas_1-1.png?w=768 768w, https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/Canvas_1-1.png?w=1024 1024w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><\/p>\n<p>Kafka provides events that tell us to go index something. There are a bunch of crawlers that interact with Git and a service for extracting symbols from code, and then we use Kafka, again, to allow each shard to consume documents for indexing at its own pace.<\/p>\n<p>Though the system generally just responds to events like <code>git push<\/code> to crawl changed content, we have some work to do to ingest all the repositories for the first time. 
One key property of the system is that we optimize the order in which we do this initial ingest to make the most of our delta encoding. We do this with a novel probabilistic data structure representing repository similarity and by driving ingest order from a level order traversal of a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Minimum_spanning_tree\">minimum spanning tree<\/a> of a graph of repository similarity<sup id=\"fnref-69904-1\"><a href=\"#fn-69904-1\" class=\"jetpack-footnote\" title=\"Read footnote.\">1<\/a><\/sup>.<\/p>\n<p>Using our optimized ingest order, each repository is then crawled by diffing it against its parent in the delta tree we\u2019ve constructed. This means we only need to crawl the blobs unique to that repository (not the entire repository). Crawling involves fetching blob content from Git, analyzing it to extract symbols, and creating documents that will be the input to indexing.<\/p>\n<p>These documents are then published to another Kafka topic. This is where we partition<sup id=\"fnref-69904-2\"><a href=\"#fn-69904-2\" class=\"jetpack-footnote\" title=\"Read footnote.\">2<\/a><\/sup> the data between shards. Each shard consumes a single Kafka partition in the topic. Indexing is decoupled from crawling through the use of Kafka and the ordering of the messages in Kafka is how we gain query consistency.<\/p>\n<p>The indexer shards then take these documents and build their indices: tokenizing to construct ngram indices<sup id=\"fnref-69904-bignote\"><a href=\"#fn-69904-bignote\" class=\"jetpack-footnote\" title=\"Read footnote.\">3<\/a><\/sup> (for content, symbols, and paths) and other useful indices (languages, owners, repositories, etc) before serializing and flushing to disk when enough work has accumulated.<\/p>\n<p>Finally, the shards run compaction to collapse up smaller indices into larger ones that are more efficient to query and easier to move around (for example, to a read replica or for backups). 
Compaction also <a href=\"https:\/\/en.wikipedia.org\/wiki\/K-way_merge_algorithm\">k-merges<\/a> the posting lists by score so relevant documents have lower IDs and will be returned first by the lazy iterators. During the initial ingest, we delay compaction and do one big run at the end, but then as the index keeps up with incremental changes, we run compaction on a shorter interval as this is where we handle things like document deletions.<\/p>\n<h2 id=\"life-of-a-query\"><a class=\"heading-link\" href=\"#life-of-a-query\">Life of a query<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h2>\n<p>Now that we have an index, it\u2019s interesting to trace a query through the system. The query we\u2019re going to follow is a regular expression qualified to the <a href=\"https:\/\/github.com\/rails\">Rails organization<\/a> looking for code written in the Ruby programming language: <code>\/arguments?\/ org:rails lang:Ruby<\/code>. The high level architecture of the query path looks a little bit like this:<\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" loading=\"lazy\" src=\"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search.png?w=1024&#038;resize=1024%2C508\" alt=\"Architecture diagram of a query path.\" width=\"1024\" height=\"508\" class=\"aligncenter size-large wp-image-69989 width-fit\" srcset=\"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search.png?w=1376 1376w, https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search.png?w=300 300w, https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search.png?w=768 768w, https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search.png?w=1024 1024w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><\/p>\n<p>In between GitHub.com and the shards is a service that coordinates taking user queries and fanning out requests to each host in the search cluster. 
We use Redis to manage quotas and cache some access control data.<\/p>\n<p>The front end accepts the user query and passes it along to the Blackbird query service where we parse the query into an abstract syntax tree and then rewrite it, resolving things like languages to their canonical <a href=\"https:\/\/github.com\/github\/linguist\">Linguist<\/a> language ID and tagging on extra clauses for permissions and scopes. In this case, you can see how rewriting ensures that I\u2019ll get results from public repositories or any private repositories that I have access to.<\/p>\n<pre><code>And(\n    Owner(\"rails\"),\n    LanguageID(326),\n    Regex(\"arguments?\"),\n    Or(\n        RepoIDs(...),\n        PublicRepo(),\n    ),\n)\n<\/code><\/pre>\n<p>Next, we fan out and send <em>n<\/em> concurrent requests: one to each shard in the search cluster. Due to our sharding strategy, a query request must be sent to each shard in the cluster.<\/p>\n<p>On each individual shard, we then do some further conversion of the query in order to look up information in the indices. Here, you can see that the regex gets translated into a series of substring queries on the ngram indices.<\/p>\n<pre><code>and(\n  owners_iter(\"rails\"),\n  languages_iter(326),\n  or(\n    and(\n      content_grams_iter(\"arg\"),\n      content_grams_iter(\"rgu\"),\n      content_grams_iter(\"gum\"),\n      or(\n        and(\n         content_grams_iter(\"ume\"),\n         content_grams_iter(\"ment\")\n        ),\n        content_grams_iter(\"uments\"),\n      )\n    ),\n    or(paths_grams_iter\u2026),\n    or(symbols_grams_iter\u2026)\n  ), \n  \u2026\n)\n<\/code><\/pre>\n<p>If you want to learn more about a method to turn regular expressions into substring queries, see Russ Cox\u2019s article on <a href=\"https:\/\/swtch.com\/~rsc\/regexp\/regexp4.html\">Regular Expression Matching with a Trigram Index<\/a>. 
We use a different algorithm and dynamic gram sizes instead of trigrams (see below<sup id=\"fnref2:69904-bignote\"><a href=\"#fn-69904-bignote\" class=\"jetpack-footnote\" title=\"Read footnote.\">3<\/a><\/sup>). In this case, the engine uses the following grams: <code>arg<\/code>, <code>rgu<\/code>, <code>gum<\/code>, and then either <code>ume<\/code> and <code>ment<\/code>, or the 6-gram <code>uments<\/code>.<\/p>\n<p>The iterators from each clause are run: <em>and<\/em> means intersect, <em>or<\/em> means union. The result is a list of documents. We still have to double-check each document (to validate matches and detect ranges for them) before scoring, sorting, and returning the requested number of results.<\/p>\n<p>Back in the query service, we aggregate the results from all shards, re-sort by score, filter (to double-check permissions), and return the top 100. The GitHub.com front end then still has to do syntax highlighting, term highlighting, pagination, and finally we can render the results to the page.<\/p>\n<p>Our p99 response times from individual shards are on the order of 100 ms, but total response times are a bit longer due to aggregating responses, checking permissions, and things like syntax highlighting. A query ties up a single CPU core on the index server for that 100 ms, so our 64 core hosts have an upper bound of something like 640 queries per second. Compared to the grep approach (0.01 QPS), that\u2019s screaming fast with plenty of room for simultaneous user queries and future growth.<\/p>\n<h2 id=\"in-summary\"><a class=\"heading-link\" href=\"#in-summary\">In summary<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h2>\n<p>Now that we\u2019ve seen the full system, let\u2019s revisit the scale of the problem. Our ingest pipeline can publish around 120,000 documents per second, so working through those 15.5 billion documents should take about 36 hours. 
But delta indexing reduces the number of documents we have to crawl by over 50%, which allows us to re-index the entire corpus in about 18 hours.<\/p>\n<p>There are some big wins on the size of the index as well. Remember that we started with 115 TB of content that we want to search. Content deduplication and delta indexing bring that down to around 28 TB of <strong>unique<\/strong> content. And the index itself clocks in at just 25 TB, which includes not only all the indices (including the ngrams), but also a compressed copy of all unique content. This means our total index size including the content is roughly a quarter the size of the original data!<\/p>\n<p>If you haven\u2019t signed up already, we\u2019d love for you to <a href=\"https:\/\/github.com\/features\/code-search\">join our beta<\/a> and try out the new code search experience. Let us know what you think! We\u2019re actively adding more repositories and fixing up the rough edges based on feedback from people just like you.<\/p>\n\n<h3 id=\"notes\"><a class=\"heading-link\" href=\"#notes\">Notes<span class=\"heading-hash pl-2 text-italic text-bold\" aria-hidden=\"true\"><\/span><\/a><\/h3>\n<div class=\"footnotes\">\n<hr \/>\n<ol>\n<li id=\"fn-69904-1\">\nTo determine the optimal ingest order, we need a way to tell how similar one repository is to another (similar in terms of their content), so we invented a new probabilistic data structure to do this in the same class of data structures as <a href=\"https:\/\/en.wikipedia.org\/wiki\/MinHash\">MinHash<\/a> and <a href=\"https:\/\/en.wikipedia.org\/wiki\/HyperLogLog\">HyperLogLog<\/a>. This data structure, which we call a geometric filter, allows computing set similarity and the symmetric difference between sets with logarithmic space. In this case, the sets we\u2019re comparing are the contents of each repository as represented by (path, blob_sha) tuples. 
Armed with that knowledge, we can construct a graph where the vertices are repositories and edges are weighted with this similarity metric. Calculating a minimum spanning tree of this graph (with similarity as cost) and then doing a level order traversal of the tree gives us an ingest order where we can make best use of delta encoding. Really though, this graph is enormous (millions of nodes, trillions of edges), so our MST algorithm computes an approximation that only takes a few minutes to calculate and provides 90% of the delta compression benefits we\u2019re going for.&#160;<a href=\"#fnref-69904-1\" title=\"Return to main content.\">&#8617;<\/a>\n<\/li>\n<li id=\"fn-69904-2\">\nThe index is sharded by Git blob SHA. Sharding means spreading the indexed data out across multiple servers, which we need to do in order to easily scale horizontally for reads (where we are concerned about QPS), for storage (where disk space is the primary concern), and for indexing time (which is constrained by CPU and memory on the individual hosts).&#160;<a href=\"#fnref-69904-2\" title=\"Return to main content.\">&#8617;<\/a>\n<\/li>\n<li id=\"fn-69904-bignote\">\nThe ngram indices we use are especially interesting. While trigrams are a known sweet spot in the design space (as <a href=\"https:\/\/swtch.com\/~rsc\/regexp\/regexp4.html\">Russ Cox and others<\/a> have noted: bigrams aren\u2019t selective enough and quadgrams take up too much space), they cause some problems at our scale.<\/p>\n<p>For common grams like <code>for<\/code>, trigrams aren\u2019t selective enough. We get way too many false positives and that means slow queries. An example of a false positive is something like finding a document that has each individual trigram, but not next to each other. You can\u2019t tell until you fetch the content for that document and double-check, at which point you\u2019ve done a lot of work that has to be discarded. 
We tried a number of strategies to fix this like adding follow masks, which use bitmasks for the character following the trigram (basically halfway to quad grams), but they saturate too quickly to be useful.<\/p>\n<p>We call the solution &#8220;sparse grams,\u201d and it works like this. Assume you have some function that given a bigram gives a weight.  As an example, consider the string <code>chester<\/code>. We give each bigram a weight: 9 for &#8220;ch&#8221;, 6 for &#8220;he&#8221;, 3 for &#8220;es&#8221;, and so on.<\/p>\n<p class=\"jetpack-slideshow-noscript robots-nocontent\">This slideshow requires JavaScript.<\/p><div id=\"gallery-69904-1-slideshow\" class=\"jetpack-slideshow-window jetpack-slideshow jetpack-slideshow-black\" data-trans=\"fade\" data-autostart=\"1\" data-gallery=\"[{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel1.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69913&quot;,&quot;title&quot;:&quot;carousel1&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel2.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69914&quot;,&quot;title&quot;:&quot;carousel2&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel3.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69915&quot;,&quot;title&quot;:&quot;carousel3&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel4.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69916&quot;,&quot;title&quot;:&quot;carousel4&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;
},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel5.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69917&quot;,&quot;title&quot;:&quot;carousel5&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel6.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69918&quot;,&quot;title&quot;:&quot;carousel6&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel7.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69919&quot;,&quot;title&quot;:&quot;carousel7&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel8.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69920&quot;,&quot;title&quot;:&quot;carousel8&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel9.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69921&quot;,&quot;title&quot;:&quot;carousel9&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel10.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69922&quot;,&quot;title&quot;:&quot;carousel10&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;},{&quot;src&quot;:&quot;https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/carousel11.png?fit=600%2C350&quot;,&quot;id&quot;:&quot;69923&quot;,&quot;title&quot;:&quot;Sparse grams 
example using the string \\u0026#8220;chester\\u0026#8221;&quot;,&quot;alt&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;itemprop&quot;:&quot;image&quot;}]\" itemscope itemtype=\"https:\/\/schema.org\/ImageGallery\"><\/div>\n<p>Using those weights, we tokenize by selecting intervals where the inner weights are strictly smaller than the weights at the borders. The inclusive characters of that interval make up the ngram and we apply this algorithm recursively until its natural end at trigrams. At query time, we use the exact same algorithm, but keep only the covering ngrams, as the others are redundant.&#160;<a href=\"#fnref-69904-bignote\" title=\"Return to main content.\">&#8617;<\/a> <a href=\"#fnref2:69904-bignote\" title=\"Return to main content.\">&#8617;<\/a>\n<\/li>\n<\/ol>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A look at what went into building the world&#8217;s largest public code search index.<\/p>\n","protected":false},"author":1406,"featured_media":69974,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_gh_post_show_toc":"no","_gh_post_is_no_robots":"no","_gh_post_is_featured":"no","_gh_post_is_excluded":"no","_gh_post_is_unlisted":"no","_gh_post_related_link_1":"","_gh_post_related_link_2":"","_gh_post_related_link_3":"","_gh_post_sq_img":"","_gh_post_sq_img_id":"","_gh_post_cta_title":"","_gh_post_cta_text":"","_gh_post_cta_link":"","_gh_post_cta_button":"Click Here to Learn 
More","_gh_post_recirc_hide":"no","_gh_post_recirc_col_1":"gh-auto-select","_gh_post_recirc_col_2":"65301","_gh_post_recirc_col_3":"65308","_gh_post_recirc_col_4":"65316","_featured_video":"","_gh_post_additional_query_params":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"{title}\n\n{excerpt}\n\n{url}","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"_wpas_customize_per_network":false,"jetpack_post_was_ever_published":false,"_links_to":"","_links_to_target":""},"categories":[3307,72],"tags":[2311,2913],"coauthors":[2308],"class_list":["post-69904","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-architecture-optimization","category-engineering","tag-code-search","tag-core-productivity"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.7 (Yoast SEO v27.7) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>The technology behind GitHub\u2019s new code search - The GitHub Blog<\/title>\n<meta name=\"description\" content=\"A look at what went into building the world&#039;s largest public code search index.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The technology behind GitHub\u2019s new code search\" 
\/>\n<meta property=\"og:description\" content=\"A look at what went into building the world&#039;s largest public code search index.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/\" \/>\n<meta property=\"og:site_name\" content=\"The GitHub Blog\" \/>\n<meta property=\"article:published_time\" content=\"2023-02-06T17:00:46+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-02-06T18:26:02+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736\" \/>\n\t<meta property=\"og:image:width\" content=\"1600\" \/>\n\t<meta property=\"og:image:height\" content=\"736\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Timothy Clem\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Timothy Clem\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"14 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/\"},\"author\":{\"name\":\"Timothy Clem\",\"@id\":\"https:\\\/\\\/github.blog\\\/#\\\/schema\\\/person\\\/bd567b024e367d2487333a1379510f64\"},\"headline\":\"The technology behind GitHub\u2019s new code search\",\"datePublished\":\"2023-02-06T17:00:46+00:00\",\"dateModified\":\"2023-02-06T18:26:02+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/\"},\"wordCount\":2973,\"image\":{\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/code-search-header.png?fit=1600%2C736\",\"keywords\":[\"code search\",\"Core productivity\"],\"articleSection\":[\"Architecture &amp; optimization\",\"Engineering\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/\",\"url\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/\",\"name\":\"The technology behind GitHub\u2019s new code search - The GitHub 
Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/github.blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/code-search-header.png?fit=1600%2C736\",\"datePublished\":\"2023-02-06T17:00:46+00:00\",\"dateModified\":\"2023-02-06T18:26:02+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/github.blog\\\/#\\\/schema\\\/person\\\/bd567b024e367d2487333a1379510f64\"},\"description\":\"A look at what went into building the world's largest public code search index.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#primaryimage\",\"url\":\"https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/code-search-header.png?fit=1600%2C736\",\"contentUrl\":\"https:\\\/\\\/github.blog\\\/wp-content\\\/uploads\\\/2023\\\/02\\\/code-search-header.png?fit=1600%2C736\",\"width\":1600,\"height\":736},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/the-technology-behind-githubs-new-code-search\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/github.blog\\\/\"},{\"@type\":
\"ListItem\",\"position\":2,\"name\":\"Engineering\",\"item\":\"https:\\\/\\\/github.blog\\\/engineering\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Architecture &amp; optimization\",\"item\":\"https:\\\/\\\/github.blog\\\/engineering\\\/architecture-optimization\\\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"The technology behind GitHub\u2019s new code search\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/github.blog\\\/#website\",\"url\":\"https:\\\/\\\/github.blog\\\/\",\"name\":\"The GitHub Blog\",\"description\":\"Updates, ideas, and inspiration from GitHub to help developers build and design software.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/github.blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/github.blog\\\/#\\\/schema\\\/person\\\/bd567b024e367d2487333a1379510f64\",\"name\":\"Timothy Clem\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/1d162393947c12eedb63fa12c3f66c8fb190f4fe8e2d5893f7faf1df9ffd6598?s=96&d=mm&r=g6270a9c7f33b7be5c3657ed7a8f0df65\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/1d162393947c12eedb63fa12c3f66c8fb190f4fe8e2d5893f7faf1df9ffd6598?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/1d162393947c12eedb63fa12c3f66c8fb190f4fe8e2d5893f7faf1df9ffd6598?s=96&d=mm&r=g\",\"caption\":\"Timothy Clem\"},\"sameAs\":[\"http:\\\/\\\/adaptivepatchwork.com\"],\"url\":\"https:\\\/\\\/github.blog\\\/author\\\/tclem\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"The technology behind GitHub\u2019s new code search - The GitHub Blog","description":"A look at what went into building the world's largest public code search index.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/","og_locale":"en_US","og_type":"article","og_title":"The technology behind GitHub\u2019s new code search","og_description":"A look at what went into building the world's largest public code search index.","og_url":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/","og_site_name":"The GitHub Blog","article_published_time":"2023-02-06T17:00:46+00:00","article_modified_time":"2023-02-06T18:26:02+00:00","og_image":[{"width":1600,"height":736,"url":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","type":"image\/png"}],"author":"Timothy Clem","twitter_card":"summary_large_image","twitter_image":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","twitter_misc":{"Written by":"Timothy Clem","Est. 
reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#article","isPartOf":{"@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/"},"author":{"name":"Timothy Clem","@id":"https:\/\/github.blog\/#\/schema\/person\/bd567b024e367d2487333a1379510f64"},"headline":"The technology behind GitHub\u2019s new code search","datePublished":"2023-02-06T17:00:46+00:00","dateModified":"2023-02-06T18:26:02+00:00","mainEntityOfPage":{"@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/"},"wordCount":2973,"image":{"@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#primaryimage"},"thumbnailUrl":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","keywords":["code search","Core productivity"],"articleSection":["Architecture &amp; optimization","Engineering"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/","url":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/","name":"The technology behind GitHub\u2019s new code search - The GitHub 
Blog","isPartOf":{"@id":"https:\/\/github.blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#primaryimage"},"image":{"@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#primaryimage"},"thumbnailUrl":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","datePublished":"2023-02-06T17:00:46+00:00","dateModified":"2023-02-06T18:26:02+00:00","author":{"@id":"https:\/\/github.blog\/#\/schema\/person\/bd567b024e367d2487333a1379510f64"},"description":"A look at what went into building the world's largest public code search index.","breadcrumb":{"@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#primaryimage","url":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","contentUrl":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","width":1600,"height":736},{"@type":"BreadcrumbList","@id":"https:\/\/github.blog\/engineering\/architecture-optimization\/the-technology-behind-githubs-new-code-search\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/github.blog\/"},{"@type":"ListItem","position":2,"name":"Engineering","item":"https:\/\/github.blog\/engineering\/"},{"@type":"ListItem","position":3,"name":"Architecture &amp; 
optimization","item":"https:\/\/github.blog\/engineering\/architecture-optimization\/"},{"@type":"ListItem","position":4,"name":"The technology behind GitHub\u2019s new code search"}]},{"@type":"WebSite","@id":"https:\/\/github.blog\/#website","url":"https:\/\/github.blog\/","name":"The GitHub Blog","description":"Updates, ideas, and inspiration from GitHub to help developers build and design software.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/github.blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/github.blog\/#\/schema\/person\/bd567b024e367d2487333a1379510f64","name":"Timothy Clem","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1d162393947c12eedb63fa12c3f66c8fb190f4fe8e2d5893f7faf1df9ffd6598?s=96&d=mm&r=g6270a9c7f33b7be5c3657ed7a8f0df65","url":"https:\/\/secure.gravatar.com\/avatar\/1d162393947c12eedb63fa12c3f66c8fb190f4fe8e2d5893f7faf1df9ffd6598?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1d162393947c12eedb63fa12c3f66c8fb190f4fe8e2d5893f7faf1df9ffd6598?s=96&d=mm&r=g","caption":"Timothy 
Clem"},"sameAs":["http:\/\/adaptivepatchwork.com"],"url":"https:\/\/github.blog\/author\/tclem\/"}]}},"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/github.blog\/wp-content\/uploads\/2023\/02\/code-search-header.png?fit=1600%2C736","jetpack_shortlink":"https:\/\/wp.me\/pamS32-ibu","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/posts\/69904","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/users\/1406"}],"replies":[{"embeddable":true,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/comments?post=69904"}],"version-history":[{"count":38,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/posts\/69904\/revisions"}],"predecessor-version":[{"id":70017,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/posts\/69904\/revisions\/70017"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/media\/69974"}],"wp:attachment":[{"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/media?parent=69904"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/categories?post=69904"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/tags?post=69904"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/github.blog\/wp-json\/wp\/v2\/coauthors?post=69904"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}