In this article we explain our journey towards Continuous Security Audits to detect and remediate potential Security Issues within our OpenSource offerings at Dgraph Labs Inc. As part of this initiative, we have integrated a selection of toolsets that facilitate Continuous Security Audits, leveraging our new CI/CD Setup built upon the robust foundation of GitHub Actions.
Our new setup provides Improved Visibility and Faster Security Issue Resolution for both our organization and our customers. Within a short timeframe (~3 months), we have addressed more than 2,000 security issues, significantly strengthening our SOC2 compliance efforts.
Background
Before we begin, we would like to give you an overview of our Security Landscape and explain our goals for handling security issues at Dgraph Labs Inc.
Our Security Landscape
In this blog post, our primary focus is the security of our OpenSource codebase. Our OpenSource ecosystem can be divided into three distinct areas, as depicted in the image below. Broadly speaking, our security audits cover the following layers: Our Code, Our Binary Artifacts, and Our Docker Images.
Each layer in the above pyramid can bring different kinds of security concerns for us and our customers. Because our product is OpenSource, we need to be especially vigilant about security in every layer. Any vulnerability in our product can potentially be exploited, and such exploits would affect not only our OpenSource users but also our Cloud DBaaS users.
Our standard release process follows a fixed sequence: it begins with a Tag Checkpoint on the underlying Code, followed by building our Binary Artifacts, and lastly building our Docker Images. The pyramid therefore serves not only as a visual representation of our security layers but also as a direct reflection of our release methodology. Safeguarding these layers is of paramount importance to us at Dgraph Labs Inc.
Our Security Audit Focus
In the first layer, i.e. Our Code, our primary objective is to audit our dependency package pinnings to proactively identify any known vulnerabilities (a.k.a. CVEs). We also place significant emphasis on static analysis (Linters) of our source code to detect potential security issues such as buffer overflows, memory leaks, and other undefined behaviors.
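As a rough illustration of what a dependency audit does, the sketch below checks a set of pinned packages against a small advisory list. The package names, versions, and CVE identifiers are placeholders invented for this example, and the `Advisory` type and `audit_pins` helper are hypothetical. A real scanner queries a maintained vulnerability database and handles far richer version schemes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """A hypothetical advisory: `package` is vulnerable below `fixed_in`."""
    cve_id: str
    package: str
    fixed_in: str


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as 'v1.4.2' into (1, 4, 2)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def audit_pins(pins: dict[str, str], advisories: list[Advisory]) -> list[Advisory]:
    """Return the advisories that apply to the pinned versions."""
    hits = []
    for advisory in advisories:
        pinned = pins.get(advisory.package)
        if pinned is not None and parse_version(pinned) < parse_version(advisory.fixed_in):
            hits.append(advisory)
    return hits


# Placeholder data for illustration only; not real packages or CVEs.
PINS = {"example.com/libfoo": "v1.2.0", "example.com/libbar": "v2.5.1"}
ADVISORIES = [
    Advisory("CVE-0000-0001", "example.com/libfoo", "v1.2.3"),
    Advisory("CVE-0000-0002", "example.com/libbar", "v2.5.0"),
]

if __name__ == "__main__":
    for hit in audit_pins(PINS, ADVISORIES):
        print(f"{hit.cve_id} affects {hit.package} pinned at {PINS[hit.package]}")
```

Only the first placeholder advisory applies here, because the second package is already pinned past its fixed version.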
In the second layer, i.e. Our Binary Artifacts, our focus shifts to building these artifacts within a secure environment to mitigate the risk of unauthorized modifications. This entails generating and validating appropriate SHA checksums to ensure the integrity and authenticity of the released artifacts.
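To make the checksum step concrete, here is a minimal sketch of generating and validating a digest for a release artifact. It assumes SHA-256 (the text above only says "SHA"), and the file name and helper functions are hypothetical rather than part of our release tooling.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the published one."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "example-artifact.bin"  # hypothetical file name
        artifact.write_bytes(b"release payload")
        published = sha256_of_file(artifact)  # stands in for a published checksum
        print("untouched artifact verifies:", verify_artifact(artifact, published))
        artifact.write_bytes(b"tampered payload")
        print("modified artifact verifies:", verify_artifact(artifact, published))
```

A checksum on its own only shows that the file matches the published digest. Authenticity also depends on the digest itself being distributed through a trusted channel.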
In the third layer, i.e. Our Docker Images, our attention is primarily on the underlying Linux environment. We ensure that the Linux packages within this environment are kept up to date to guard against known vulnerabilities (a.k.a. CVEs). The packaging of these images also requires a secure environment to prevent tampering or compromise.
What are CVEs?
CVE (Common Vulnerabilities and Exposures) serves as a universal identifier for pinpointing and discussing specific vulnerabilities and security concerns in software and hardware systems. This standardized framework offers a centralized reference for tracking vulnerabilities across diverse platforms and organizations. Maintained by the MITRE Corporation in collaboration with the global cybersecurity community, the CVE system assigns a unique identifier (e.g., CVE-2021-12345) to each entry. Each entry includes a description of the vulnerability, an impact assessment, and pertinent references or resources. By establishing a common language for identifying and referring to specific vulnerabilities, CVEs streamline communication and coordination among security researchers, software vendors, and users. This lets organizations monitor and address vulnerabilities within their systems and take appropriate remedial action.
To put it simply, envision CVE as an extensive database housing recognized security issues associated with software products. Most security toolsets rely on this database to examine potential issues within the foundational components of your software.
What are Linters?
Linters play a crucial role in the toolkit of developers striving for exceptional code quality, enhanced productivity, and reduced occurrences of bugs and errors. These automated software tools diligently scrutinize code, unearthing potential issues, errors, and bugs that might otherwise elude manual code reviews. Linters excel at detecting subtle nuances and intricacies that are often challenging to identify through human inspection alone. By leveraging linters, developers can fortify their development process, ensuring overall efficiency and a higher standard of code. To delve deeper into the world of linters and their benefits, you can explore further information here, here, here and here on this subject.
Continuous Security Audits
To establish a smooth workflow for Continuous Security Audits, we evaluated several toolsets that integrate well with our GitHub ecosystem. Our objective was Continuous Audits that cover not only functional, integration, and performance aspects but also the security of each code change, including pull requests. Real-time insights played a pivotal role in our pursuit of this goal.
The image below illustrates our setup, which comprises two distinct components: the Code & Build Phase and the Post Release Phase. Each phase serves a specific purpose in our security auditing process.
During the Code & Build Phase, we leverage Aqua Trivy Scans to scan both our Code and Docker layers. These scans run on every code change (pull request), and the results are logged to the GitHub Security tab. This enables easy visualization, triage, and prompt resolution of any identified issues. To further enhance security at the Code layer, we use Linters to analyze the source and detect potential security vulnerabilities.
In the subsequent Post Release Phase, we rely on Snyk Scans to examine our released Docker Artifacts for any security issues. Although we follow a similar approach to fix any identified issues, this process is performed outside of the GitHub ecosystem.
Fixing Security Issues
Gaining a clear understanding of which security issues to address, determining the most effective remediation approaches, and meticulously tracking the code changes that resolve them are all crucial aspects of our security operations. These components hold significant importance in our overall security framework, and our well-designed architecture plays a pivotal role in enabling us to handle them with precision and efficiency.
Identifying WHAT to fix
Once these scans complete, the results are relayed to the GitHub Security tab within our repositories. This allows our Security team to readily access and evaluate the scope and urgency of each identified issue. The image below illustrates the security posture derived from these scans. To date, we have resolved more than 2,000 security issues, which attests to our commitment to robust security practices.
The visualization provided is derived primarily from our dedicated Security Pipeline. This pipeline plays a crucial role in enabling Continuous Security Audits by executing scans against any code changes made, including pull requests, as well as on our main branch according to a predefined schedule. Its primary objective is to meticulously examine and cross-reference potential known vulnerabilities (CVEs) with the comprehensive CVE database, subsequently generating informative reports for our evaluation. These results play a vital role in assisting us in assessing the severity of each security issue, categorizing them based on criticality levels such as Critical, High, Medium, and Low. Armed with this valuable information, we are able to strategically plan and prioritize the necessary fixes, ensuring the continual enhancement of our security posture.
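The prioritization described above can be sketched in a few lines. The severity levels come from the scan reports, while the `Finding` type, the helper names, and the sample data are hypothetical placeholders and do not reflect the actual report schema.

```python
from collections import Counter
from dataclasses import dataclass

# Lower rank means fix sooner; the four levels match the scan reports.
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


@dataclass(frozen=True)
class Finding:
    cve_id: str
    package: str
    severity: str


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings by severity, then by CVE id for a stable listing."""
    unknown_rank = len(SEVERITY_RANK)  # unrecognized severities sort last
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK.get(f.severity.upper(), unknown_rank), f.cve_id),
    )


def count_by_severity(findings: list[Finding]) -> Counter:
    """Summarize how many findings fall into each severity level."""
    return Counter(f.severity.upper() for f in findings)


# Placeholder findings for illustration only; not real CVEs.
FINDINGS = [
    Finding("CVE-0000-0003", "libbaz", "LOW"),
    Finding("CVE-0000-0001", "libfoo", "HIGH"),
    Finding("CVE-0000-0002", "libbar", "CRITICAL"),
]

if __name__ == "__main__":
    for finding in prioritize(FINDINGS):
        print(f"{finding.severity:<8} {finding.cve_id} ({finding.package})")
    print(dict(count_by_severity(FINDINGS)))
```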
Understanding HOW to fix
A notable aspect of the CVE results displayed in our visualization tab is that they not only provide insight into the underlying issues but also offer guidance on the specific package pinning adjustments required to resolve them. The image below shows an example of how we plan our fixes.
We have also enabled Dependabot to help us with auto-remediation (we believe this is the future).
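One way to turn such guidance into a plan is to pick, for each package, the highest "fixed" version among its findings, so that a single pin bump covers all of them. The sketch below assumes plain dotted versions, and the `package` and `fixed_version` field names are hypothetical rather than the actual scanner output. It is not a description of how Dependabot works internally.

```python
from collections import defaultdict


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as 'v1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def plan_pin_updates(findings: list[dict[str, str]]) -> dict[str, str]:
    """Map each package to the highest fixed version among its findings."""
    fixes: dict[str, list[str]] = defaultdict(list)
    for finding in findings:
        fixes[finding["package"]].append(finding["fixed_version"])
    # Compare numerically so that v1.10.0 ranks above v1.2.3.
    return {
        package: max(versions, key=parse_version)
        for package, versions in sorted(fixes.items())
    }


# Placeholder findings for illustration only; not real packages.
FINDINGS = [
    {"package": "example.com/libfoo", "fixed_version": "v1.2.3"},
    {"package": "example.com/libfoo", "fixed_version": "v1.10.0"},
    {"package": "example.com/libbar", "fixed_version": "v2.5.0"},
]

if __name__ == "__main__":
    for package, target in plan_pin_updates(FINDINGS).items():
        print(f"bump {package} to {target}")
```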
Tracking what Code fixed the Security Issue
As avid proponents of the GitHub ecosystem, we have wholeheartedly embraced its capabilities. By consistently running these scans, we anticipate a notable decrease in the reported issues over time, as we actively address and resolve them. Equally important to us is the ability to accurately track when each issue was successfully resolved, and GitHub seamlessly provides us with that essential functionality. The visual representation below illustrates our comprehensive tracking system.
Results
With this Continuous Security Audit setup, we have fixed more than 2,000 security issues and addressed more than 1,000 CVEs across a range of OpenSource and ClosedSource projects at Dgraph Labs Inc within a few months’ timeframe. To provide a glimpse of these results, the image below shows a snapshot from our flagship Dgraph project.
With steadfast dedication, we have systematically addressed and resolved these issues across every release within the past year since the integration of Continuous Security Audits. Our commitment to ongoing improvement is evident in the image below, which vividly portrays the comprehensive array of fixes implemented across our releases.
We have benefited considerably from certain notable features, such as the ability to keep these Security Issues private until they are addressed. The robust GitHub Security ecosystem has played a pivotal role in this process, helping us achieve both Improved Visibility and Faster Security Issue Resolution.
Conclusion
In conclusion, our current setup has proven to be highly effective for our needs. However, it is worth acknowledging that the realm of defensive security toolsets is rapidly evolving. From our perspective, while this approach remains predominantly reactive, the advent of the AI era has brought about significant transformations in the landscape of security practices and protocols. As a result, we are continuously exploring innovative security co-pilot toolsets and engaging in projects such as OpenRewrite, as we recognize the need to adapt to emerging trends. Ultimately, our overarching objective centers around the pursuit of auto-remediation toolsets, which represents a pivotal milestone in our security endeavors.
Acknowledgements
Like any project, this was a team effort. We would like to thank all our internal contributors who have helped make this a reality. Thanks Dilip (co-author), Joshua, Kevin & Aman.