Zeeshan Javeed edited this page Jan 17, 2024 · 1 revision

Welcome to the ZeeCodeQL wiki! GitHub offers a range of features that help you maintain the quality and security of your codebase. Some of them, such as the dependency graph and Dependabot alerts, are available on all plans. Certain advanced security features, such as code scanning with CodeQL on private repositories, require a GitHub Advanced Security (GHAS) license. This tutorial is a step-by-step walkthrough of how to set up and enable Advanced Security using CodeQL in GitHub.

GitHub offers two ways to set up CodeQL code scanning. On public repositories, the default setup can be enabled with a single click. The advanced setup requires a working knowledge of GitHub Actions and access to runners that execute the customized CodeQL workflow.

This guide covers both configurations step by step. It uses the well-known juice-shop public repository (MIT License), modified to include Java and Visual Studio code.

Task 1: Create a Fork

image
  • Choose your account as the owner and provide a different repository name if juice-shop is not available.
image

Task 2: Enable Advanced Security

  • Once you have forked the repository, you have complete control over your copy. Click Settings and then Code security and analysis. This page lists all security-related settings.
  • Make sure that all of the settings are enabled. I also recommend enabling Dependabot, the automated dependency update service built into GitHub.
image
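Dependabot alerts and security updates are switched on in the settings page above, while scheduled version updates are driven by a configuration file. The following is a minimal sketch of a .github/dependabot.yml, assuming the fork keeps its npm manifest at the repository root and that the added Java code has a Maven pom.xml there as well. The directories and the weekly interval are illustrative assumptions, not taken from the repository.

```yaml
# .github/dependabot.yml (illustrative sketch; adjust directories to the actual layout)
version: 2
updates:
  # JavaScript / TypeScript dependencies (package.json assumed at the root)
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
  # Java dependencies (pom.xml location is an assumption)
  - package-ecosystem: "maven"
    directory: "/"
    schedule:
      interval: "weekly"
  # Keeps the actions referenced in workflow files up to date
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```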

Task 3: CodeQL Default Setup

  • To enable CodeQL Default Setup, go to Code scanning, click Set up and select Default Setup. GitHub automatically detects the languages in your repository and proposes a default configuration.
image
  • In our case it detected JavaScript / TypeScript, Python and Java / Kotlin. Click Enable CodeQL to confirm the default setup. Because Java is a compiled language, CodeQL automatically tries to find a build file in the repository.
image
  • Wait a few minutes, then navigate to the Security tab and click Code scanning. The alerts are listed there along with their severity levels.
image
  • Click Tool status to check whether the tool configuration reports any warnings or errors. To disable the default setup, turn it off in Settings and then remove the tool configuration files under Tool status.
image

Task 4: CodeQL Advanced Setup

  • Code scanning with Advanced Setup requires that GitHub Actions is enabled for the repository.
image
  • If your repository does not have access to GitHub-hosted runners (a paid plan may be needed for private repositories), you have to create self-hosted runners. For this demo we recommend creating three self-hosted runners.
image
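The choice of runner shows up in the workflow as the runs-on key of the job. The fragment below is a sketch of the two options. The label list assumes the self-hosted runners were registered on 64-bit Linux machines and kept their default labels; use the labels your own runners actually carry.

```yaml
# Fragment of a workflow file: only the runner selection is shown
jobs:
  analyze:
    # Option 1: GitHub-hosted runner
    # runs-on: ubuntu-latest

    # Option 2: self-hosted runner, matched by labels
    # (linux and x64 are assumptions about how the runners were registered)
    runs-on: [self-hosted, linux, x64]
```

A job is only picked up by a runner that has every label in the list, so a plain `runs-on: self-hosted` matches any of the registered runners.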

Initial Setup

  • In Code security and analysis, under Code scanning, select Set up and click Advanced (this option is available once Default Setup has been removed). Alternatively, select Switch to advanced from the existing default setup. GitHub generates a starter codeql.yml workflow based on the languages it detected in the repository. Replace the content of the file with the following.

name: "CodeQL"

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  
jobs:
  analyze:
    name: Analyze
    runs-on: self-hosted

    strategy:
      fail-fast: false
  
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    # setup nodejs
    - name: Setup Node for Python or JS Only
      uses: actions/setup-node@v4
      with:
        node-version: 18
    
    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v3
      with:
        languages: javascript-typescript
       

    # Perform analysis and upload result back to GitHub
    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v3
      with:
        category: "javascript-typescript"
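The workflow above does not declare any token permissions. If the repository restricts the default GITHUB_TOKEN to read-only access, uploading the results can fail, so it may be worth granting the permissions explicitly. The following job-level fragment is a sketch of what that could look like; it mirrors the permissions used in GitHub's starter CodeQL workflow.

```yaml
# Fragment: add under the existing "analyze" job
jobs:
  analyze:
    name: Analyze
    runs-on: self-hosted
    permissions:
      security-events: write   # needed to upload the analysis results
      contents: read           # needed to check out the code in private repositories
      actions: read            # only needed in private repositories
```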

Ignore Directories or Certain Files

# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: javascript-typescript
    config: |
        paths-ignore:
          - /data/static/codefixes/*.ts
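As the list of ignored paths grows, the inline config can be moved into a separate file and referenced with the config-file input. The sketch below assumes the file is stored at .github/codeql/codeql-config.yml; the name value is a made-up label and the security-extended query suite is an optional addition.

```yaml
# .github/codeql/codeql-config.yml (location and name are assumptions)
name: "juice-shop CodeQL config"
paths-ignore:
  - /data/static/codefixes/*.ts
queries:
  - uses: security-extended
```

The init step then points at the file instead of carrying the configuration inline:

```yaml
- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: javascript-typescript
    config-file: ./.github/codeql/codeql-config.yml
```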

Matrix Language

A matrix strategy lets you use variables in a single job definition to automatically create multiple job runs based on the combinations of those variables. For example, you can use a matrix strategy to test your code against multiple versions of a language or on multiple operating systems. Because our code contains JavaScript, Python and Java, we can run the same analysis job once per language with the help of a matrix strategy.

  name: "Code Scanning using CodeQL"
  
  on:
    push:
      branches: [ "main" ]
    pull_request:
      branches: [ "main" ]
    
  jobs:
    analyze:
      name: Analyze
      runs-on: self-hosted
  
      strategy:
        fail-fast: false
        matrix:
          language: [ 'javascript', 'python' ]
    
      steps:
      - name: Code Language 
        run: echo ${{ matrix.language }}
      
      - name: Checkout repository
        uses: actions/checkout@v4
  
      # setup nodejs
      - name: Setup Node for Python or JS Only
        uses: actions/setup-node@v4
        with:
          node-version: 18
      
      # Initializes the CodeQL tools for scanning.
      # ignore some files
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          config: |
              paths-ignore:
                - /data/static/codefixes/*.ts
         
  
      # Perform analysis and upload result back to GitHub
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{matrix.language}}"
  • You will see that a single workflow run now contains two jobs, one per language.
image
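Each matrix entry becomes its own job and needs a free runner. With fewer self-hosted runners than matrix entries the remaining jobs simply wait in the queue. If you prefer to cap the concurrency explicitly, the strategy block accepts max-parallel, as in this sketch (the value 1 is only an example):

```yaml
# Fragment: strategy block of the "analyze" job
strategy:
  fail-fast: false
  # Run at most one matrix job at a time (example value)
  max-parallel: 1
  matrix:
    language: [ 'javascript', 'python' ]
```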

Compiled Language and Conditional Flow

Enterprise solutions today are often written in a mix of compiled and interpreted languages. CodeQL supports compiled languages, and we can customise the workflow to call the build system as part of the tool chain. In the following example, we use Java as the compiled language and Maven to build the Java project.

  name: "Code Scanning using CodeQL"
  on:
    push:
      branches: [ "main" ]
    pull_request:
      branches: [ "main" ]
    
  jobs:
    analyze:
      name: CodeQL Code Scanning
      runs-on: self-hosted
  
      strategy:
        fail-fast: false
        matrix:
          language: [ 'javascript', 'python', 'java' ]
            # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby', 'swift' ]
            # Use only 'java' to analyze code written in Java, Kotlin or both
            # Use only 'javascript' to analyze code written in JavaScript, TypeScript or both
            # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support
      
      
      steps:
        - name: Code Language 
          run: echo ${{ matrix.language }}
          
        - name: Checkout Source
          uses: actions/checkout@v4
      
        - if: ${{ matrix.language == 'java' }}
          name: Set up JDK for Java Code
          uses: actions/setup-java@v3
          with:
            java-version: 11
            java-package: jdk
            distribution: microsoft
            # Using the integrated cache functionality to speed up builds
            cache: maven
        
        - if: ${{ matrix.language == 'python' || matrix.language == 'javascript' }}
          name: Initialize CodeQL Scanning for Interpreted Languages
          uses: github/codeql-action/init@v3
          with:
            languages: ${{ matrix.language }}
            config: |
              paths-ignore:
                - /data/static/codefixes/*.ts

        - if: ${{ matrix.language == 'python' || matrix.language == 'javascript' }}
          name: Setup Node for Python or JS Only
          uses: actions/setup-node@v4
          with:
            node-version: 18

        - if: ${{ matrix.language == 'python' || matrix.language == 'javascript' }}
          name: Autobuild for Python or JS Only
          uses: github/codeql-action/autobuild@v3

        - if: ${{ matrix.language == 'java' }}
          name: Initialize CodeQL Scanning for Java
          uses: github/codeql-action/init@v3
          with:
            languages: ${{ matrix.language }}
            #config-file: ./.github/codeql/codeql-config.yml
            queries: security-extended
            packs: +codeql/java-queries:experimental/Security/CWE/CWE-020/Log4jJndiInjection.ql

        # A custom build step gives greater control over the build process than autobuild
        - if: ${{ matrix.language == 'java' }}
          name: Custom Build Step for Compiled Language
          run: mvn compile -B
          shell: bash

        - name: Perform CodeQL Analysis
          uses: github/codeql-action/analyze@v3
          with:
            category: "/language:${{matrix.language}}"
  • Commit the file and check the results under Actions.
image
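The per-language if conditions above can get hard to follow as more languages are added. An alternative, sketched here, carries the build decision in the matrix itself through the build-mode input of the init action. This assumes a recent release of github/codeql-action v3; the javascript-typescript and java-kotlin identifiers are the combined language names, and the Maven command is the same one used in the workflow above.

```yaml
# Fragment: strategy and steps of the "analyze" job
strategy:
  fail-fast: false
  matrix:
    include:
      - language: javascript-typescript
        build-mode: none      # interpreted, nothing to build
      - language: python
        build-mode: none
      - language: java-kotlin
        build-mode: manual    # build is done by an explicit step below

steps:
  - name: Checkout Source
    uses: actions/checkout@v4

  - name: Initialize CodeQL
    uses: github/codeql-action/init@v3
    with:
      languages: ${{ matrix.language }}
      build-mode: ${{ matrix.build-mode }}

  # Runs only for matrix entries that asked for a manual build
  - if: matrix.build-mode == 'manual'
    name: Build with Maven
    run: mvn compile -B
    shell: bash

  - name: Perform CodeQL Analysis
    uses: github/codeql-action/analyze@v3
    with:
      category: "/language:${{ matrix.language }}"
```

The JDK setup step from the workflow above would still be needed before the Maven build; it is left out here to keep the sketch short.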

Scanning on Demand and on Schedule

  • As a best practice, exclude branches that do not need to be scanned on every push, such as the branches created by Dependabot. I also recommend running the scan on a regular schedule and making it available on demand to catch vulnerabilities early. Modify the trigger section of codeql.yml as follows:

    name: "Code Scanning using CodeQL"
    
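    on:
      push:
        branches:
          - '**'
          - '!dependabot/**'
      # allows the workflow to be started manually (on demand)
      workflow_dispatch:
      pull_request:
        branches:
          - main
      schedule:
        # runs every 30 minutes; choose a less frequent schedule for regular use
        - cron: "*/30 * * * *"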
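A scan every 30 minutes is handy for a demo but keeps the runners busy. For day-to-day use a weekly scan is a common choice. The sketch below shows how the five cron fields map to such a schedule; the day and time are arbitrary examples.

```yaml
# Fragment: schedule trigger only
on:
  schedule:
    # fields: minute hour day-of-month month day-of-week (evaluated in UTC)
    - cron: "30 1 * * 1"   # every Monday at 01:30 UTC
```

Scheduled runs use the workflow file on the default branch, so the change has to be merged there before it takes effect.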
    

Task 5: Ruleset

  • A GitHub ruleset is a named list of rules that applies to a repository. You can create rulesets to control how people interact with selected branches and tags in a repository, for example who can push commits to a certain branch or who can delete or rename a tag.
  • For this task, create a ruleset, give it a meaningful name, enable it and apply it to the default branch. You can also add the admin role to the bypass list.
image

Branch Protection Ruleset

  • Enable Require a pull request before merging and set Required approvals to 1. Also enable Require status checks to pass and click Add checks. If your workflow has completed successfully at least once, you can start typing 'Java' and the list will show the matching code scanning checks from the workflow file. Add the code scanning checks for Java, JavaScript and Python, then click Create to save the ruleset.
image
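The checks offered in the Add checks list are named after the jobs of the workflow, so a readable job name makes them easier to find. One way to get one predictable check per language is to include the matrix value in the job name, as in this sketch. The name text itself is a hypothetical choice, not the one used in the repository.

```yaml
# Fragment: job header of the CodeQL workflow
jobs:
  analyze:
    # Produces one check per matrix entry, for example "Code Scanning for java"
    name: Code Scanning for ${{ matrix.language }}
    runs-on: self-hosted
    strategy:
      fail-fast: false
      matrix:
        language: [ 'javascript', 'python', 'java' ]
```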

Required Workflow and Enforce Security Checks

  • Check out the code on your local machine or use GitHub Codespaces to start modifying the code in a branch right away. We have already created a ready-to-use task that introduces a Java vulnerability into the codebase (see screenshot). Commit and push the code.
  • Alternatively, you can apply the patch from the patches.tgz file under patches/log4j-vulnerability:

tar -xzvf ./patches/log4j-vulnerability/patches.tgz

image
  • Navigate back to your repository and create a pull request. The workflow starts automatically on the pull request, and the ruleset from the previous section requires its checks to pass before the code can be merged.
image

As you can see, we intentionally injected the Log4j vulnerability and CodeQL scanning identified it. Click on the alert to see further details about the vulnerability and how to fix it. If you believe it is a false positive, or you still want to merge the code with the vulnerability in place, you can dismiss the alert.

During this exercise, we have learned

  • How to enable GitHub Advanced Security
  • How to enable the default setup
  • How to remove CodeQL configurations
  • How to configure the CodeQL advanced setup
  • How to enforce a required workflow using GitHub rulesets
  • Some best practices around Actions, pull requests, etc.