Oracle alertlog Monitoring with Elasticsearch and Kibana

Right now I’m working on an Elasticsearch-based monitor for Oracle databases. There are many guides on how to set up ELK (Elasticsearch, Logstash and Kibana), so I’ll skip that part.

In my opinion, log analytics is key to running a solid IT infrastructure, especially in terms of security and availability! Oracle has its own log analytics products in the cloud, but most customers I support don’t want/need such a big environment: “want” in terms of pricing, “big” in terms of data volume. The smallest package you can get is 200GB upload per hour, which is simply too big for most (or even all) midsize companies in Germany (@ORACLE: please provide smaller packages!).

OK, back to ELK & Oracle.

Requirements:

  • Elasticsearch running on machine (let’s say) A
  • Logstash running on any machine (in my case machine A)
  • Filebeat running on the Oracle Database machine

Here is the configuration you need to properly send alertlog data to Elasticsearch. There are a few guides out there, but they do not work properly (sorry to say that!) because they rely on the Logstash multiline codec, which mixes up incoming messages from multiple sources when they arrive at the same time (see https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html). That matters to me because I want to use a single Logstash instance for many databases, so the multiline grouping has to happen on the Filebeat side instead.

Here is my Filebeat configuration. Filebeat simply reads the alertlog files and sends all new lines to Logstash. When you run it for the first time, everything already in the logfile is considered “new”, so you may want to clean up the log first if it has grown to 100GB.

filebeat.prospectors:
## 12cR2 and newer
- input_type: log
  paths:
    - /path/to/alert.log
  multiline.pattern: '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}'
  multiline.negate: true
  multiline.match: after

## 12cR1 and older
- input_type: log
  paths:
    - /path/to/alert.log
  multiline.pattern: '^[[:alpha:]]{3} [[:alpha:]]{3} [[:digit:]]{2} [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2} [[:digit:]]{4}'
  multiline.negate: true
  multiline.match: after

output.logstash:
  hosts: ["elk01:5045"]

This configuration simply sends all alertlog records to Logstash. The “multiline.pattern” describes the date pattern at the start of each alertlog record (one pattern for the old format and one for the new format introduced with 12.2.0.1); it serves as the separator between log records.
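For reference, the record headers of both formats look roughly like this (made-up example records; only the header line has to match the pattern, everything until the next header is glued to the same event):

Old format (12cR1 and older):

  Tue May 21 10:15:32 2019
  Thread 1 advanced to log sequence 123 (LGWR switch)

New format (12cR2 and newer):

  2019-05-21T10:15:32.123456+02:00
  Thread 1 advanced to log sequence 123 (LGWR switch)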

All the magic happens in the Logstash configuration, which determines how the information is stored in the Elasticsearch index.

input {
  beats {
    port => "5045"
  }
}
filter {
  ## Extract ORACLE_SID from source path
  grok {
    match => ["source","%{GREEDYDATA}/%{GREEDYDATA:db_unique_name}/trace/alert_%{GREEDYDATA:oracle_sid}.log"]
  }

  # Search for ORA- and create field if it matches
  if [message] =~ /ORA-/ {
    grok {
      match => [ "message","(?<ORA->ORA-[0-9]*)" ]
    }
  }

  ## Oracle 12cR1 (and lower) date format (+Multitenant support)
  ## (the optional (?<pdb>...) group matches a container prefix like "PDB1(3):")
  if [message] =~ /^[[:alpha:]]{3} [[:alpha:]]{3} [[:digit:]]{2} [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2} [[:digit:]]{4}/ {
    grok {
      match => ["message","^%{DAY:day} %{MONTH:month} %{MONTHDAY:monthday} %{TIME:time} %{YEAR:year}\n((?<pdb>[[:alnum:]]*\([[:digit:]]*\)):)?%{GREEDYDATA:message}$"]
      overwrite => ["message"]
    }
    mutate {
      add_field => {
        "TIMESTAMP_ISO" => "%{year} %{month} %{monthday} %{time}"
      }
    }
    ## Replace @timestamp with the log timestamp and remove the helper fields
    date {
      timezone => "MET"
      match => ["TIMESTAMP_ISO" , "yyyy MMM dd HH:mm:ss"]
      target => "@timestamp"
      remove_field => [ "TIMESTAMP", "TIMESTAMP_ISO", "day", "monthday", "month", "year", "time" ]
    }
  }

  ## Oracle 12cR2 (and newer) date format (+Multitenant support)
  if [message] =~ /^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}/ {
    grok {
      patterns_dir => ["/etc/logstash/logstash-patterns-core/patterns"]
      match => ["message","^%{TIMESTAMP_ISO8601:TIMESTAMP}\n((?<pdb>[[:alnum:]]*\([[:digit:]]*\)):)?%{GREEDYDATA:message}$"]
      overwrite => ["message"]
    }

    ## Replace @timestamp with the log timestamp and remove the helper field
    date {
      timezone => "UTC"
      match => ["TIMESTAMP" , "ISO8601"]
      target => "@timestamp"
      remove_field => [ "TIMESTAMP" ]
    }
  }
}
output {
  #stdout { codec => rubydebug }
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
}

OK, this is a lot of stuff! Let’s break the filter section down a bit.

Since Oracle has this “diag” directory and a fixed directory structure, we can extract the ORACLE_SID from the path:

  ## Extract ORACLE_SID from source path
  grok {
    match => ["source","%{GREEDYDATA}/%{GREEDYDATA:db_unique_name}/trace/alert_%{GREEDYDATA:oracle_sid}.log"]
  }
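For example, a typical diag path (the ORACLE_BASE part below is made up) is split like this:

  source         : /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/alert_ORCL.log
  db_unique_name : ORCL   (the directory right before “trace”)
  oracle_sid     : ORCL   (taken from the file name)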

Afterwards we search for the pattern “ORA-” to extract possible ORA error messages. The field in Elasticsearch is called “ORA-”:

  # Search for ORA- and create field if it matches
  if [message] =~ /ORA-/ {
    grok {
      match => [ "message","(?ORA-[0-9]*)" ]
    }
  }
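So a record containing an error, like this hypothetical one, gets an extra “ORA-” field with the first error number of the record (further ORA- numbers in the same record stay in the message text only):

  message : Errors in file /path/to/orcl_ora_1234.trc: ORA-01555: snapshot too old
  ORA-    : ORA-01555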

The next step is to extract the date/time of the message and overwrite the @timestamp field. We could simply skip this step, but I want the @timestamp field (the default timestamp field in a Filebeat index) to represent the time when Oracle created the log message rather than the time Logstash received it.

This block does the magic for the old date format:

  ## Oracle 12cR1 (and lower) date format (+Multitenant support)
  ## (the optional (?<pdb>...) group matches a container prefix like "PDB1(3):")
  if [message] =~ /^[[:alpha:]]{3} [[:alpha:]]{3} [[:digit:]]{2} [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2} [[:digit:]]{4}/ {
    grok {
      match => ["message","^%{DAY:day} %{MONTH:month} %{MONTHDAY:monthday} %{TIME:time} %{YEAR:year}\n((?<pdb>[[:alnum:]]*\([[:digit:]]*\)):)?%{GREEDYDATA:message}$"]
      overwrite => ["message"]
    }
    mutate {
      add_field => {
        "TIMESTAMP_ISO" => "%{year} %{month} %{monthday} %{time}"
      }
    }
    ## Replace @timestamp with the log timestamp and remove the helper fields
    date {
      timezone => "MET"
      match => ["TIMESTAMP_ISO" , "yyyy MMM dd HH:mm:ss"]
      target => "@timestamp"
      remove_field => [ "TIMESTAMP", "TIMESTAMP_ISO", "day", "monthday", "month", "year", "time" ]
    }
  }
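Put together, a made-up old-style record like

  Wed Mar 01 11:54:59 2017
  Thread 1 advanced to log sequence 42 (LGWR switch)

becomes an event with @timestamp 2017-03-01T10:54:59.000Z (11:54:59 MET converted to UTC) and the message reduced to “Thread 1 advanced to log sequence 42 (LGWR switch)”; the helper fields (day, month, year, …) are removed again by the date filter.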

This block does the magic for the new date format:

  ## Oracle 12cR2 (and newer) date format (+Multitenant support)
  if [message] =~ /^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2}/ {
    grok {
      patterns_dir => ["/etc/logstash/logstash-patterns-core/patterns"]
      match => ["message","^%{TIMESTAMP_ISO8601:TIMESTAMP}\n((?<pdb>[[:alnum:]]*\([[:digit:]]*\)):)?%{GREEDYDATA:message}$"]
      overwrite => ["message"]
    }
    ## Replace @timestamp with the log timestamp and remove the helper field
    date {
      timezone => "UTC"
      match => ["TIMESTAMP" , "ISO8601"]
      target => "@timestamp"
      remove_field => [ "TIMESTAMP" ]
    }
  }
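Here the @timestamp comes straight from the ISO8601 header. A made-up record from a pluggable database such as

  2019-05-21T10:15:32.123456+02:00
  PDB1(3):Resize operation completed for file# 12

keeps only the text after the container prefix in the message field, while “PDB1(3)” lands in the pdb field captured by the pattern above.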

The new format is much easier to handle! The reason is that there are already ready-made patterns that describe the new date format. You can download these patterns here: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
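One way (assuming git is available on the Logstash host) to put them exactly where the patterns_dir setting above expects them:

git clone https://github.com/logstash-plugins/logstash-patterns-core.git /etc/logstash/logstash-patterns-core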

Now you can open kibana and do all the fancy stuff with the data! You can find any problems, even those which do not exist at all 🙂

Cheers!
