Modernizing LinkedIn’s Static Application Security Testing Capabilities to protect our members
Static Application Security Testing (SAST) is a technique that analyzes an application's source code to find security vulnerabilities. At LinkedIn, we use it throughout our internal software development life cycle to identify security issues early in the development process. It allows us to scale our security efforts across millions of lines of code in tens of thousands of repositories.
Striking a balance between developer velocity and security is a significant undertaking. Doing this at LinkedIn’s scale comes with its own unique set of challenges, so we recently embarked on a mission to modernize our SAST capabilities. In this blog, we’ll share the design decisions and challenges from our journey of developing and deploying our updated SAST pipeline. We’ll also share our strategy behind guardrail enforcement, which prevents developers from committing insecure code and keeps LinkedIn’s websites and infrastructure secure for our members and customers.
Establishing design principles
At the beginning of our journey, we agreed upon some design principles to guide decisions and design choices. These were:
- Developer-first security: We believe in balancing security with developer experience and velocity. We strive to make security as invisible as possible so that it works behind the scenes without interrupting developers' workflows.
- Extensibility & self-service: We design our system to allow other teams to write security rules or integrate with the pipeline in a simplified manner.
- Resilience & redundancy: When dealing with tens of thousands of repositories, issues in the scanning pipeline are bound to come up from time to time. We need to make sure our solution is resilient with graceful fallbacks where no single point of failure would cause disruptions for our developers.
- Observability & metrics: At the scale of LinkedIn's repositories, it is important that we have visibility and collect relevant data at every point of our scanning pipeline. This allows us to take a data-driven approach to detecting and addressing issues such as failing scans, long scan times, or excessive noise.
Challenges with our legacy approach
Prior to this initiative, our old SAST pipeline was a collection of disjointed systems, each running its own scans. Many were bespoke scanners that evolved over the years, resulting in a fragmented landscape that lacked consistency. This made maintenance, troubleshooting, and introducing new detection rules difficult. Coverage wasn’t always consistent either, as different tools tapped into different parts of the pipeline.
With the migration of LinkedIn’s codebase to GitHub, our team decided to build everything natively on the GitHub Actions platform. In particular, we decided to use the CodeQL and Semgrep scanners, since our internal evaluation found that the two complement each other well. Furthermore, between the two scanners we were able to largely migrate our existing custom detection tools by rewriting them as CodeQL or Semgrep rules.
Deviation from paved path approach
While CodeQL and Semgrep provide a paved path for setting up integration via pre-defined GitHub Actions or even via a one-click UI, we found that these default setups did not meet our requirements because:
- LinkedIn uses a complex and custom build process. This means the default build process in CodeQL doesn’t work out of the box and requires custom setup during the database creation process.
- Paved path does not provide sufficient data to meet our observability & metrics goals.
- It was hard to add customizations such as dynamic rule filtering, SARIF file enrichment, and custom remediation messaging specific to LinkedIn’s environment.
As such, we deviated from the paved path and wrote our own GitHub Actions workflow. Figure 1 below shows a high-level view of what our GitHub Actions workflow looks like for CodeQL.
As shown above, the pipeline involves multiple stages and components:
- Rule processing: We dynamically combine CodeQL’s default ruleset with LinkedIn's custom-written rules. This allows us to selectively enable or disable certain rules on a repo-by-repo basis based on certain criteria (e.g., filtering out rules that are not applicable in LinkedIn's environment, or that are problematic for a particular repo).
- CodeQL database creation: In this stage, we customize the database creation process so that it would work for LinkedIn’s internal build process.
- SARIF enrichment: We enrich SARIF alerts with additional metadata and remediation information specific to LinkedIn’s environment. In line with our “Extensibility” design principle above, we implemented a templating system that lets other teams write their own rules while customizing how their alerts look.
- Metrics instrumentation: We instrument each stage of our pipeline and collect extensive metrics, including breakdowns of run duration, queue time, resource allocation, successes, and errors.
- Fail safe mechanism: In the event of an unrecoverable error at any stage of the pipeline, we fail gracefully to prevent any disruption to developers (this is covered in more detail below).
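The rule processing stage can be sketched as follows. This is an illustrative Python sketch, not LinkedIn's actual implementation: the rule IDs, the `REPO_OVERRIDES` config shape, and the function names are all hypothetical, standing in for whatever central config store drives the real per-repo filtering.

```python
# Hypothetical sketch of per-repo rule filtering. Rule IDs and the
# override config shape are illustrative assumptions.
DEFAULT_RULES = {"js/sql-injection", "js/xss", "py/command-injection"}
CUSTOM_RULES = {"li/internal-auth-bypass", "li/unsafe-deserialization"}

# Per-repo overrides, e.g. loaded from a central config store.
REPO_OVERRIDES = {
    "legacy-service": {"disabled": {"js/xss"}},  # rule too noisy for this repo
}

def effective_rules(repo: str) -> set[str]:
    """Combine default and custom rules, minus per-repo exclusions."""
    rules = DEFAULT_RULES | CUSTOM_RULES
    disabled = REPO_OVERRIDES.get(repo, {}).get("disabled", set())
    return rules - disabled
```

A repo with no overrides simply receives the full combined ruleset, so the common case needs no configuration at all.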
Our implementation for Semgrep follows a similar structure and many of the stages were written either as GitHub Actions Reusable Workflows or Composite Actions to optimize for code reuse.
Now we’ll discuss how we got our GitHub Actions Workflow files to run across tens of thousands of repos whenever a Pull Request is created and on a weekly schedule.
This is where we unexpectedly stumbled upon a hurdle. While GitHub has a feature called required workflows to centrally enable a workflow across an organization's repositories, it does not support a scheduling mechanism. We evaluated quite a few different options, and none of them met our requirements. We came to the conclusion that we had to push our workflow file into each and every repository. However, managing our workflow at that scale is a non-trivial task. Here’s our strategic approach:
Using a stub workflow
If we committed the core of our SAST workflow file into each repo, every time we made a change we would need to push that change across all our repos. This would make updates cumbersome and inefficient. Instead of pushing the main SAST workflow file, we decided to push a stub file into each repo. This stub is a tiny workflow whose sole purpose is to call our main SAST workflow, which is hosted centrally. This allows the stub file to remain relatively static; any changes can be made directly in the central workflow and are propagated immediately to all repositories without having to update the workflow in each and every one. A sample of the stub file is as follows:
name: Security Scan
on:
  pull_request:
    types: [opened, synchronize]
  schedule:
    - cron: '34 3 * * 4' # randomized to spread out the load
env:
  VERSION: '1.0.0'
  LAST_UPDATED: '2025-10-22'
permissions:
  actions: read
  contents: read
  security-events: write
  pull-requests: read
jobs:
  sast-scan:
    uses: static-analysis-actions/.github/workflows/sast-scan.yaml@production
Drift Management System (DMS)
We built a DMS that runs every day and checks every repository for the following:
- Is our SAST stub workflow file missing?
- Does the content of the stub workflow file differ from the latest copy?
If either of the above is true, DMS commits and pushes the latest copy of the workflow file into the repo. This ensures that every repository has the latest copy of our SAST stub workflow and that its content has not drifted, driving adoption and ensuring consistency across the organization.
Tap into repository creation flow
While DMS helps to ensure that existing repositories contain our latest workflow file, newly created repositories would still be missing our workflow file until the next DMS scheduled run. To tackle that, we also tap into our repository creation flow so that our workflow file is added at the point of creation. This makes sure any newly created repository is immediately protected by our SAST scanning.
Enforcement: Blocking mode
With our SAST pipeline deployed, our design principle around observability and metrics means that we collect an extensive amount of data and carry out data-driven optimizations to tackle hotspots in our pipeline. One area we identified was the potential for vulnerable code to make its way into the system in certain scenarios:
- Pull requests were sometimes merged without waiting for the SAST scan to complete. This means potentially vulnerable code could have been merged.
- In some cases, SAST alerts raised within pull requests were not fully addressed before merge, creating a similar risk.
It’s important that our scanning detects potential issues and fixes them as early as possible in the software development life cycle. Thus, to handle such cases, we implemented what we term “blocking mode.” This is where we set a restriction such that prior to merging, all pull requests must have the SAST scan completed and must be free of security alerts that exceed our risk threshold.
This was implemented via GitHub's repository rulesets feature. For each pull request, merging is only allowed if:
- A SARIF has been submitted for the Pull Request.
- Within the SARIF there are no vulnerability alerts that exceed our security threshold.
The combination of the two checks above provides us the assurance that all newly committed code has been scanned and is free of dangerous vulnerabilities.
Kill switches and fail safes
The implementation of “blocking mode” comes with a lot of considerations, including that our scanning is now in the hot path for developers. If there were any errors in completing the scanning (and there will be errors when dealing with tens of thousands of repositories), developers would essentially be blocked, causing disruption.
To mitigate this we implemented the following:
Multi-level kill switch
As part of our solution, we implemented kill switches at different points of our pipeline. This gives us the flexibility to quickly toggle scanning on or off based on different criteria:
- Organization by organization
- Repository by repository
- Language by language (e.g., disable Python scanning)
- Tool by tool (e.g., disable only CodeQL)
- Rule by rule (the ability to quickly disable problematic rules)
This has proven useful during incidents such as GitHub outages, and when we encountered problematic repositories whose configurations deviated from the norm.
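A multi-level kill switch like this can be sketched as a layered lookup. The config shape and names below are illustrative assumptions, not LinkedIn's actual schema; in practice the switch state would live in a central store the workflow reads at runtime.

```python
# Hypothetical kill-switch state; in practice this would be read from a
# central config store so toggles take effect without code changes.
KILL_SWITCHES = {
    "orgs": set(),
    "repos": {"flaky-monorepo"},
    "languages": {"python"},       # e.g., disable Python scanning
    "tools": set(),                # e.g., could hold "codeql"
    "rules": {"js/noisy-rule"},    # quickly silence a problematic rule
}

def scan_enabled(org: str, repo: str, language: str, tool: str) -> bool:
    """A scan runs only if no kill switch matches at any level."""
    return not (
        org in KILL_SWITCHES["orgs"]
        or repo in KILL_SWITCHES["repos"]
        or language in KILL_SWITCHES["languages"]
        or tool in KILL_SWITCHES["tools"]
    )

def rule_enabled(rule_id: str) -> bool:
    """Rule-level switch, checked during rule processing."""
    return rule_id not in KILL_SWITCHES["rules"]
```

Checking every level with a single predicate keeps the decision cheap and makes the precedence obvious: any matching switch at any level disables the scan.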
Fail safe mechanism
Should there be an error during a pull request (PR) scan, the developer would be unable to merge the PR, because “blocking mode” would be stuck waiting for a SARIF file. We implemented a self-recovery mechanism that:
- Uploads a blank SARIF file to unblock the developer and allow the PR to be merged
- Captures the failure events for post-analysis
- Triggers an alert to our on-call if the number of failures exceeds a threshold within a time frame
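A blank SARIF only needs the minimal structure required by the SARIF 2.1.0 schema with an empty results list. The sketch below is illustrative; the tool name is a placeholder, and the actual upload to GitHub's code scanning endpoint is out of scope here.

```python
import json

# Minimal, empty SARIF 2.1.0 document: it satisfies the "SARIF submitted"
# half of the merge gate while reporting zero alerts. The tool name is a
# placeholder; uploading it (e.g., via GitHub's code scanning API) is
# left out of this sketch.
def blank_sarif(tool_name: str = "sast-failsafe") -> str:
    doc = {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name, "rules": []}},
            "results": [],  # no alerts: unblocks the merge gate
        }],
    }
    return json.dumps(doc)
```

Because the failure event is also captured for post-analysis, the blank upload trades a one-off coverage gap (closed by the next scheduled scan) for keeping developers unblocked.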
Automated end-to-end testing for workflows
Due to the complexity and number of moving parts in our workflow, we implemented an automated end-to-end (E2E) testing suite. All workflow changes, as well as changes to ruleset repos, go through our E2E suite, where the new changes are used to scan a set of pre-configured repos. It checks whether:
- Pre-staged vulnerabilities are being caught as expected
- Scanning for various languages and configurations completes successfully
- SARIF files are enriched properly
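The first of those checks, verifying that pre-staged vulnerabilities are caught, can be sketched as a set comparison between expected and actual findings. Everything below is hypothetical: the rule ID, file path, and function names are illustrative fixtures, not LinkedIn's test data.

```python
# Illustrative E2E assertion: every pre-staged vulnerability in the test
# repo must appear in the SARIF produced by the candidate workflow.
# Rule IDs and file paths here are made-up fixtures.
EXPECTED_FINDINGS = {("py/command-injection", "app/handler.py")}

def findings(sarif: dict) -> set[tuple[str, str]]:
    """Extract (ruleId, file) pairs from a SARIF document."""
    out = set()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for loc in result.get("locations", []):
                uri = loc["physicalLocation"]["artifactLocation"]["uri"]
                out.add((result["ruleId"], uri))
    return out

def missing_findings(sarif: dict) -> set[tuple[str, str]]:
    """Pre-staged vulnerabilities the scan failed to catch."""
    return EXPECTED_FINDINGS - findings(sarif)
```

An empty `missing_findings` result means the candidate workflow still detects everything it is supposed to; anything left over fails the E2E run before the change ships.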
Conclusion
Modernizing our SAST capabilities at LinkedIn’s scale was a journey that went beyond simply adopting new tools. It required a fundamental rethinking of how we could provide security at scale without hindering the velocity of our developers. By adhering to our core principles of being developer-first, resilient, and observable, we moved from a fragmented ecosystem to a unified, robust pipeline natively integrated into the developer workflow on GitHub.
This initiative has been a significant step in our "shift left" strategy, empowering our engineers with fast, reliable, and actionable security feedback directly in their pull requests. This maintains the security of LinkedIn's code and infrastructure, thereby protecting our members and customers.
It is worth noting that SAST is only one of the mechanisms that keep our code base secure. We also leverage GitHub’s Dependabot and secret scanning to keep our dependencies secure and prevent credential exposure.
We hope that by sharing our challenges and design decisions, we can provide a blueprint for other organizations facing similar challenges in scaling their application security programs.