A Brief History of Open Source from the Netflix Cloud Security Team

Netflix Technology Blog
Published in Netflix TechBlog
Aug 21, 2017

by Jason Chan

This summer marks three years of releasing open source software for the Netflix Cloud Security team. It’s been a busy three years — our most recent release marks 15 open source projects — so we figured a roundup and recap would be useful.

Penetration testing tools, vulnerabilities, and offensive security techniques have dominated security conferences and security-related open source for some time. However, in recent years, more individuals and organizations have been publishing “blue team” and defensive security tools and talks. We’re thrilled that the security industry has become more supportive of sharing these tools and techniques, and we’re more than happy to participate through the release of open source.

Our security-related OSS tends to be reflective of the unique Netflix culture. Many of the tools we’ve released are aimed at facilitating security in high-velocity and distributed software development organizations. Automation is a big part of our approach, and we seek to keep our members, employees, data, and systems safe and secure while enabling innovation. For our team, scale, speed, and integration with the culture are the keys to enabling the business to move fast.

Without further ado, here’s a look back at the OSS we’ve released.

  • Security Monkey was our first OSS release, way back in June 2014. Security Monkey is a tool for monitoring the security of cloud environments (originally and most significantly, AWS — Amazon Web Services), including analyzing and responding to misconfigurations, vulnerabilities, and other security issues. We’ve gotten lots of great contributions over the years, and we’ve talked about it at a few conferences, including AppSecUSA. In March of 2017, engineers from Google added Google Cloud Platform support to Security Monkey.
  • Scumblr, Sketchy, and Workflowable were announced and released together in August 2014. Together, they serve as an intelligence gathering and workflow platform — initially for various Internet resources (e.g. credential dumps on paste sites, relevant posts from social media), though the system has evolved to become the primary automation platform for our AppSec team. Scumblr is a web application that allows you to configure various web searches and collect and act upon the results. Sketchy is a task-based API for taking screenshots and scraping text from websites, and Workflowable is a Ruby Gem that adds flexible workflow functionality to Ruby on Rails applications.
  • FIDO (May 2015), or Fully Integrated Defense Operation (not a part of or service of the FIDO Alliance), is a tool for automated security incident response. It started as an experiment many years ago to see how tying into the API for our help desk system might speed response to malware incidents, and it eventually evolved into our system for orchestrating security response within our corporate environment. Slides for a talk at the Open Source Digital Forensics conference provide more context and details. At this point, FIDO is deprecated at Netflix and the OSS code is no longer maintained, though it remains available.
  • Sleepy Puppy (August 2015) is a tool to manage cross-site scripting (XSS) payloads and propagation over time and helps application security teams and testers track and evaluate the impact of XSS issues (historically one of the most widespread types of web application vulnerability). Our original blog post outlines the design and use of Sleepy Puppy, and we also released an extension for Burp Proxy, a popular tool for web application security testing.
  • Lemur, a system to streamline and automate the management and monitoring of SSL/TLS certificates, was released in September 2015. Managing PKI and SSL certificates has been a historically difficult problem, and we envisioned and built Lemur after scrambling to manage certificate revocation and reissuance after Heartbleed. We covered Lemur at AppSecUSA in the context of enterprise-wide TLS management and at AWS re:Invent as an example of how we approach security automation.
  • BLESS (May 2016), or Bastion’s Lambda Ephemeral SSH Service, is an SSH Certificate Authority (CA) that runs as an AWS Lambda function and is used to sign SSH public keys. Using an SSH CA provides a flexible array of authorization options, especially in large-scale and fast-moving environments like Netflix. We’ve covered BLESS at OSCON and QConNY, and our friends at Lyft have made additional contributions and spoke about their use of BLESS at one of our OSS meetups last year.
  • HubCommander (February 2017) is a Slack bot framework that we use for ChatOps-based management of GitHub organizations. It lets us provide simple, Slack-based self-service for various admin-level GitHub actions while maintaining access control and an audit log. And, while GitHub maintenance was its original intent, with the most recent release, it’s now a more general-purpose bot framework.
  • Stethoscope (February 2017) is a system that collects information about various end user-related security topics (e.g. device security), and provides those end-users clear and actionable advice for improving security. We use Stethoscope at Netflix to align with our unique culture and give our employees the freedom and context to securely manage their own devices. Our initial blog post provides more background and rationale, and we presented Stethoscope at ShmooCon earlier this year.
  • BetterTLS (April 2017) is a test suite for HTTPS clients implementing verification of the Name Constraints certificate extension. We’ve used it to identify and help correct implementation issues with TLS offerings from various vendors, and we have the bettertls.com companion site to assist with testing. Our initial blog post provides more background on Name Constraints and the test suite.
  • Repokid and Aardvark (June 2017) are tools that simplify and streamline the process of implementing least privilege for AWS IAM (Identity and Access Management) roles. These tools operate by actively watching the AWS services that a given IAM role uses and cutting back permissions by removing access to unused services. We spoke about related earlier work at an OSS meetup and AWS re:Invent last year, and we’ll be doing a deeper dive at this year’s re:Invent as well.
  • Repulsive Grizzly and Cloudy Kraken are tools that we released this July in our Skunkworks project, signifying that we are making the code public but are not planning regular updates or long term maintenance. These tools help us simulate application DDoS attacks in our environment, with Repulsive Grizzly simplifying test coordination and execution and Cloudy Kraken acting as an AWS orchestration framework for scaling up testing. We did a talk at this year’s DEFCON on the tools and application DDoS in general, and Wired provided a nice article covering the talk, approach, and tools.
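
To make the least-privilege idea behind Repokid and Aardvark concrete, here is a minimal Python sketch of the core step: given the actions a role is allowed and the set of services it has actually been seen using, keep only the actions for services in use. This is an illustration under stated assumptions, not code from either project. The function name `prune_unused_services`, the flat list of `service:Action` strings, and the sample data are all hypothetical, and real IAM policies (wildcards, conditions, resource scoping) need considerably more care.

```python
def prune_unused_services(allowed_actions, used_services):
    """Split a role's actions into those to keep and those to remove.

    allowed_actions: list of "service:Action" strings (hypothetical flat form).
    used_services:   set of lowercase service prefixes observed in use.
    Actions without a recognized service prefix (e.g. "*") are treated as
    unused here; a real tool would handle wildcards explicitly.
    """
    kept, removed = [], []
    for action in allowed_actions:
        # The service is the part before the first colon, e.g. "s3".
        service = action.split(":", 1)[0].lower()
        if service in used_services:
            kept.append(action)
        else:
            removed.append(action)
    return kept, removed


# Hypothetical example data: the role may call S3, SQS, and EC2,
# but usage data shows only S3 and SQS were ever accessed.
role_actions = [
    "s3:GetObject",
    "s3:PutObject",
    "sqs:SendMessage",
    "ec2:DescribeInstances",
]
observed_services = {"s3", "sqs"}

kept, removed = prune_unused_services(role_actions, observed_services)
print("keep:  ", kept)
print("remove:", removed)
```

In this sketch the EC2 action is the only one flagged for removal, since no EC2 usage was observed for the role.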

We’ve enjoyed contributing to the OSS security community and have learned a lot from the feedback and collaboration. It’s always instructive to see how software evolves over its lifecycle and to see how others extend it in novel and creative ways. And going forward, we’ll look to make more use of our Skunkworks project to share projects that are experimental or that we don’t necessarily envision supporting long term. We have a few projects we’re considering open sourcing in the near future — if you’re interested, keep an eye on this space, our GitHub site, and @NetflixOSS on Twitter, and check out our YouTube channel for more talks from our team.
