Posted by:

David Greenwood, Chief of Signal

In this post I will take a look at creating basic YARA-L for Google Chronicle (and show a manual conversion of a Sigma rule to YARA-L format).

Note: this post is written for YARA-L 2.0. The concepts discussed may not be correct for other versions.

In the first three parts of this tutorial I described the general structure of Sigma rules.

For the next three posts I will take you through some other detection languages and how Sigma rules can be manually converted to them, so that the topic of automatic conversion (later in this series) will make much more sense.

First up is Google’s threat detection language, YARA-L, built for Chronicle.

YARA-L was named after, and inspired by, YARA, the malware analysis language (invented by VirusTotal, now a Google company). The L stands for “logs”. Google describes it as a threat detection language, and not a data query language.

In my own opinion, YARA-L is the best suited to detection of all the languages found natively in a SIEM product. Though with power comes complexity.

Rule construction

In YARA-L 2.0, you must specify the parts of your rule in the following order:

  1. meta: Stores arbitrary key-value pairs of rule details.
  2. events: Conditions to filter events and the relationship between events.
  3. match (optional): Values to return when matches are found.
  4. outcome (optional): Additional information extracted from each detection.
  5. condition: Condition to check events and the variables used to find matches.
  6. options (optional): Options to turn on or off while executing this rule.
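
To make the ordering concrete, here is a minimal sketch of a rule that uses all six sections. The rule name, the meta values, and the use of count_distinct over metadata.id are my own illustrative assumptions, not something taken from the Chronicle documentation or a production rule.

```yaral
rule section_order_sketch {
  // 1. meta: free-form key-value details about the rule
  meta:
    author = "example author"
    description = "Hypothetical rule that only illustrates section order"

  // 2. events: filters on events, plus placeholder assignments
  events:
    $e.metadata.event_type = "USER_LOGIN"
    $e.principal.user.userid = $userid

  // 3. match (optional): group matching events by value and time window
  match:
    $userid over 10m

  // 4. outcome (optional): extra context stored with each detection
  outcome:
    $login_count = count_distinct($e.metadata.id)

  // 5. condition: the logic that must hold for a detection
  condition:
    $e

  // 6. options (optional): execution options for the rule
  options:
    allow_zero_values = false
}
```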

Breaking each of these down…

1. meta

The meta part of the rule stores arbitrary field-value pairs of rule details, such as who wrote it, what it detects on, version control, etc.

The documentation places no restriction on the field names that can be used in a field-value pair.

The docs simply state that each key must be an unquoted string, and each value must be a quoted string, like so;

<key> = "<value>"

Here is an example of the meta section of a YARA-L rule, with common fields I see being used by authors;

meta:
  author = "Signals Corps"
  date = "08/09/2020"
  description = "A demo description"
  category = "proxy"
  mitre = "T1056, Collection"
  yara_version = "YL2.0"
  rule_version = "1.0"
  reference = "https://www.signalscorps.com/blog"
  created = "2021-03-09"

The fields and values set in the meta section will be exposed in the Chronicle UI when the rule is triggered.

2. events

The events section of the YARA-L rule sets the conditions to filter, modify and define the events you want to detect on.

Here is a simple example,

events:
  $workspace.metadata.event_type = "USER_LOGIN"

To understand this statement, you need to know a little about Google’s Unified Data Model (UDM) schema first, especially the difference between Events and Entities.

Google's Unified Data Model fields

In my example I am using the Event Data Model. I select the top level metadata field, then its event_type sub-field, and compare that to one of the available event type values (USER_LOGIN). You will see the USER_LOGIN event type listed in that table.

Therefore this events statement only considers user login events correctly mapped and stored in Chronicle.

You will also notice I start the statement with $workspace. This is an event variable. The name is just a string I chose to represent the product Google Workspace. It could be anything.

The following statement would return the same events;

events:
  $e.metadata.event_type = "USER_LOGIN"

Being able to name event variables is particularly useful in the condition section of a rule, as I will show you later.

If I wanted to filter on only Workspace events in Chronicle I would need to use the product_name sub-field of metadata like so;

events:
  $workspace.metadata.product_name = "Google Workspace"
  $workspace.metadata.event_type = "USER_LOGIN"

It is also possible to use a range of operators in the events section. For example;

events:
  $workspace.metadata.event_type != "USER_LOGIN"

Now I am considering all events except those where workspace.metadata.event_type=USER_LOGIN.

Logical Operators are also available to use like so;

events:
  $workspace.metadata.event_type = "USER_LOGIN" 
  and $workspace.role.type != "SERVICE_ACCOUNT"

In the above example I use the and operator to capture login events, but not from service accounts.

Making this slightly more complex, I can store field values from matching events in placeholder variables, for use in later sections of the rule, like so;

events:
  $workspace.metadata.event_type = "USER_LOGIN"
  $workspace.principal.user.userid = $userid

In this statement I am taking the value returned from the workspace.principal.user.userid field in matching events and storing it as the variable userid (which I can now reuse in later parts of the rule).

If you are not sure why I am declaring variables in statements, all will become clear in the condition section of the rule.

Taking it up one more level of complexity, it is also possible to use functions (string, regular expression, date, math, and net) in the events section to create function expressions.

I will let you dive into functions yourself. However, I will say this: I see a lot of people using regular expressions in rules by default, but beware, a regular expression is not always the most efficient way to search and filter when another function or operator can do the job.
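
As a rough illustration of that point, the events fragment below sketches three ways of filtering on the same kind of event. The hostnames and the IP range are made up for the example, and I am assuming the re.regex and net.ip_in_range_cidr functions behave as described in the YARA-L 2.0 documentation.

```yaral
events:
  $e.metadata.event_type = "NETWORK_CONNECTION"

  // Plain operator: no function needed when the full value is known.
  $e.principal.hostname = "workstation-01"

  // net function: a CIDR membership test, instead of a regex over the IP string.
  net.ip_in_range_cidr($e.target.ip, "192.0.2.0/24")

  // Regular expression function: best kept for values that are genuinely pattern shaped.
  re.regex($e.target.hostname, `.*\.example\.com`)
```

The first two statements express their intent directly, whereas the regular expression has to be evaluated against every candidate hostname.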

3. match (optional)

The match section is useful because it allows you to group events by field and time.

For example;

events:
  $workspace.metadata.event_type = "USER_LOGIN"
  $workspace.principal.user.userid = $userid
match:
  $userid over 10m

This example statement returns $userid (defined in the events section) when the rule finds a match. The time window specified is 10 minutes. Events that are more than 10 minutes apart are not correlated. Events outside the time range are ignored – the rule does not consider them to be a detection.

For example; let's say I have userid=david in certain log lines. $userid over 10m only considers login events for david over a 10 minute period. If I had another user showing in the logs, e.g. userid=john, the rule would also group events over a 10 minute period for login attempts by john.

This is particularly useful when dealing with activity that can generate lots of events, like logins. Instead of triggering on each login, the detection is triggered based on logins over the span of 10 minutes (which might include 1 or more logins).

Here is another example using multiple variables in the match section;

events:
  $gcp.metadata.vendor_name = "Google Cloud Platform"
  $gcp.metadata.product_event_type = "google.api.apikeys.v1.ApiKeys.CreateApiKey"
  $gcp.security_result.action = "ALLOW"
  $gcp.target.resource.name = $resourceName
  $gcp.target.cloud.project.name = $projectName

match:
  $resourceName, $projectName over 5m

4. outcome (optional)

In the outcome section, you can define up to 10 outcome variables, with arbitrary names.

These outcomes will be stored in the detections generated by the rule. Each detection may have different values for the outcomes.

Essentially outcomes can help analysts better deal with triggered alerts when they are working them.

In the simplest outcome implementation, you can use a placeholder variable directly as an outcome value;

outcome:
  $email_size_b = $email_sent_bytes

Here the email_sent_bytes value returned by the matching event will become the outcome value for the field email_size_b.

When multiple events are returned (with a match statement), I can use aggregate functions to populate the outcome variable.

match:
  $hostname over 5m

outcome:
  $max_email_size_b = max($email_sent_bytes)

In this statement the largest email size (max) across the events in the 5 minute window is used to populate the outcome variable.
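
The outcome snippets above do not show where $hostname and $email_sent_bytes come from. Here is a sketch of a complete rule that defines them. The rule name and the choice of UDM fields (EMAIL_TRANSACTION events, principal.hostname, network.sent_bytes) are my assumptions for illustration only.

```yaral
rule large_email_per_host_sketch {
  meta:
    description = "Hypothetical rule showing where outcome placeholders come from"

  events:
    $mail.metadata.event_type = "EMAIL_TRANSACTION"
    // Placeholders assigned here are reused in match and outcome.
    $mail.principal.hostname = $hostname
    $mail.network.sent_bytes = $email_sent_bytes

  match:
    $hostname over 5m

  outcome:
    // Largest value seen across the grouped events for this hostname.
    $max_email_size_b = max($email_sent_bytes)

  condition:
    $mail
}
```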

5. condition

In the condition section, you can specify the detection condition using events and variables defined in the events section.

This section ultimately defines the logic of the detection.

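events:
  $login.metadata.event_type = "USER_LOGIN"
  $login.principal.user.userid = $userid
  $escalation.metadata.event_type = "PROCESS_PRIVILEGE_ESCALATION"
  $escalation.principal.user.userid = $userid
  $deletion.metadata.event_type = "FILE_DELETION"
  $deletion.principal.user.userid = $userid
match:
  $userid over 10m
condition:
  $login and $escalation and $deletion

In this example statement, a match will only be found if all three event variables are satisfied for the same user. That is, a distinct userid performs USER_LOGIN and PROCESS_PRIVILEGE_ESCALATION and FILE_DELETION events within 10 minutes. Each event type needs its own event variable because a single event cannot have three different event types, and the shared $userid placeholder is what ties the three events to the same user.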

The event variable names are arbitrary, so this rule could also be written as;

events:
  $e1.metadata.event_type = "USER_LOGIN"
  $e1.principal.user.userid = $userid
  $e2.metadata.event_type = "PROCESS_PRIVILEGE_ESCALATION"
  $e2.principal.user.userid = $userid
  $e3.metadata.event_type = "FILE_DELETION"
  $e3.principal.user.userid = $userid
match:
  $userid over 10m
condition:
  $e1 and $e2 and $e3

The # character is a special character in the condition section. If it is used before any event or placeholder variable name, it represents the number of distinct events or values that satisfy all the events section conditions (put another way, the count of events). To illustrate this;

events:
  $e1.metadata.event_type = "USER_LOGIN"
  $e1.principal.user.userid = $userid
  $e2.metadata.event_type = "PROCESS_PRIVILEGE_ESCALATION"
  $e2.principal.user.userid = $userid
  $e3.metadata.event_type = "FILE_DELETION"
  $e3.principal.user.userid = $userid
match:
  $userid over 10m
condition:
  $e1 and $e2 and #e3 > 10

The condition statement here is looking for the same three event types as in the previous example, but in this instance the event e3 (FILE_DELETION) must be seen more than 10 times (#e3 > 10).
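
The count operator is also useful on its own, with a single event variable. As a minimal sketch (the rule name and the threshold are hypothetical), this rule would trigger when one user generates more than 10 login events inside a 10 minute window;

```yaral
rule repeated_logins_sketch {
  meta:
    description = "Hypothetical rule: more than 10 logins by one user in 10 minutes"

  events:
    $login.metadata.event_type = "USER_LOGIN"
    $login.principal.user.userid = $userid

  match:
    $userid over 10m

  condition:
    // #login is the count of distinct $login events in the window.
    #login > 10
}
```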

6. options (optional)

In the options section, you can specify the options for the rule. The syntax for the options section is similar to that of the meta section, but a key must be one of the predefined option names, and the value is not restricted to the string type.

Currently, the only available option is allow_zero_values. If set to true, matches generated by the rule can have zero values as match variable values.
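
For illustration, an options section that turns this on would look something like the fragment below. Note the value is an unquoted boolean rather than a quoted string.

```yaral
options:
  allow_zero_values = true
```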

Sigma Rule to YARA-L Rule

Now you know a little more about YARA-L Rules, let's try to manually create one from an existing Sigma Rule.

Sigma Rule

Here is a public Sigma Rule detecting MFA being disabled in Google Workspace;

title: Google Workspace MFA Disabled
id: 780601d1-6376-4f2a-884e-b8d45599f78c
description: Detects when multi-factor authentication (MFA) is disabled.
author: Austin Songer
status: experimental
date: 2021/08/26
modified: 2021/12/02
references:
    - https://cloud.google.com/logging/docs/audit/gsuite-audit-logging#3
    - https://developers.google.com/admin-sdk/reports/v1/appendix/activity/admin-security-settings#ENFORCE_STRONG_AUTHENTICATION
    - https://developers.google.com/admin-sdk/reports/v1/appendix/activity/admin-security-settings?hl=en#ALLOW_STRONG_AUTHENTICATION
logsource:
  product: google_workspace
  service: google_workspace.admin
detection:
    selection_base:
        eventService: admin.googleapis.com
        eventName: 
            - ENFORCE_STRONG_AUTHENTICATION
            - ALLOW_STRONG_AUTHENTICATION
    selection_eventValue:
        new_value: 'false'
    condition: all of selection*
level: medium
tags:
    - attack.impact
falsepositives:
 - MFA may be disabled and performed by a system administrator.

YARA-L Rule

First I can map all of the metadata like so;

meta:
  // source https://github.com/SigmaHQ/sigma/blob/master/rules/cloud/gworkspace/gworkspace_mfa_disabled.yml
  sigma_title = "Google Workspace MFA Disabled"
  sigma_id = "780601d1-6376-4f2a-884e-b8d45599f78c"
  sigma_description = "Detects when multi-factor authentication (MFA) is disabled."
  sigma_author = "Austin Songer"
  sigma_status = "experimental"
  sigma_date = "2021/08/26"
  sigma_modified = "2021/12/02"
  sigma_references = "['https://cloud.google.com/logging/docs/audit/gsuite-audit-logging#3','https://developers.google.com/admin-sdk/reports/v1/appendix/activity/admin-security-settings#ENFORCE_STRONG_AUTHENTICATION','https://developers.google.com/admin-sdk/reports/v1/appendix/activity/admin-security-settings?hl=en#ALLOW_STRONG_AUTHENTICATION']"
  sigma_level = "medium"
  sigma_tags = "['attack.impact']"
  sigma_falsepositives = "MFA may be disabled and performed by a system administrator."

I have mapped almost all the fields in the Sigma rule with the prefix sigma_ in the meta section, as there are no corresponding fields to capture them in YARA-L.

You will also notice I have included a comment (declared using //) to tie the rule back to the source repository.

Now I can write the events section.

events:
  $workspace.metadata.vendor_name = "Google Workspace"
  (
    $workspace.metadata.product_event_type = /enforce_strong_authentication/ nocase or
    $workspace.metadata.product_event_type = /allow_strong_authentication/ nocase
  )
  $workspace.about.labels.key = "new_value"
  $workspace.about.labels.value = "false"

condition:
  $workspace

You will notice there is not a one-to-one mapping between Sigma and YARA-L.

From the logsource I have converted product: google_workspace to workspace.metadata.vendor_name.

The eventName values are covered by product_event_type. A Sigma list matches when any one of its values matches, so the two regular expressions are joined with or (separate lines in the events section are otherwise combined with and, which could never be true for two different values of the same field). I use the nocase modifier to ignore the case found in the log when performing the search.

In the second Search Identifier the Sigma Rule states;

selection_eventValue:
        new_value: 'false'

I have converted this to a key / value pair in the YARA-L rule like so;

  $workspace.about.labels.key = "new_value"
  $workspace.about.labels.value = "false"

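In the Sigma Rule both Search Identifiers must be true (all of selection*), so in the YARA-L rule every statement in the events section must hold for a $workspace event, and the condition simply requires that such an event exists.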

Which gives us a final rule;

rule GoogleWorkspaceMFADisabled {
  meta:
    // source https://github.com/SigmaHQ/sigma/blob/master/rules/cloud/gworkspace/gworkspace_mfa_disabled.yml
    sigma_title = "Google Workspace MFA Disabled"
    sigma_id = "780601d1-6376-4f2a-884e-b8d45599f78c"
    sigma_description = "Detects when multi-factor authentication (MFA) is disabled."
    sigma_author = "Austin Songer"
    sigma_status = "experimental"
    sigma_date = "2021/08/26"
    sigma_modified = "2021/12/02"
    sigma_references = "['https://cloud.google.com/logging/docs/audit/gsuite-audit-logging#3','https://developers.google.com/admin-sdk/reports/v1/appendix/activity/admin-security-settings#ENFORCE_STRONG_AUTHENTICATION','https://developers.google.com/admin-sdk/reports/v1/appendix/activity/admin-security-settings?hl=en#ALLOW_STRONG_AUTHENTICATION']"
    sigma_level = "medium"
    sigma_tags = "['attack.impact']"
    sigma_falsepositives = "MFA may be disabled and performed by a system administrator."

  events:
    $workspace.metadata.vendor_name = "Google Workspace"
    (
      $workspace.metadata.product_event_type = /enforce_strong_authentication/ nocase or
      $workspace.metadata.product_event_type = /allow_strong_authentication/ nocase
    )
    $workspace.about.labels.key = "new_value"
    $workspace.about.labels.value = "false"

  condition:
    $workspace
}

One final note: in the Chronicle Detection Rules YARA-L repo, rules are saved with the extension .yaral. So I would save the above rule as GoogleWorkspaceMFADisabled.yaral.

Next up: Microsoft Kusto

In the next part of this tutorial I will look at the Kusto Query Language from Microsoft.

