High Level Architecture

ClairCore is designed to be embedded into another application or service wrapper.
Two separate packages implement ClairCore's functionality: libindex and libvuln.
The main goal of these libraries is to receive a Manifest and compute a VulnerabilityReport.
The intermediate structure IndexReport is used to express all discovered artifacts within a Manifest.

The following diagram depicts the high level architecture.
[Diagram: high level architecture]

libindex is primarily implemented by the Indexer.
libvuln is primarily implemented by the Matcher.

Vulnerability Matching

This is a high level overview of the interfaces that must be implemented for end-to-end vulnerability matching.
The end-to-end process is described as well.

Scanners

Several Scanner interface types exist for extracting contents from container layers.

PackageScanner

claircore/internal/indexer/PackageScanner

type PackageScanner interface {
	VersionedScanner
	// Scan performs a package scan on the given layer and returns all
	// the found packages
	Scan(*claircore.Layer) ([]*claircore.Package, error)
}

type VersionedScanner interface {
	// unique name of the distribution scanner.
	Name() string
	// version of this scanner. this information will be persisted with the scan.
	Version() string
	// the kind of scanner. currently only package is implemented
	Kind() string
}

A PackageScanner implementation should parse a discovered package database within the provided layer and return an array of claircore.Package structures reflecting the parsed packages.

DistributionScanner

claircore/internal/indexer/DistributionScanner

type DistributionScanner interface {
	VersionedScanner
	Scan(context.Context, *claircore.Layer) ([]*claircore.Distribution, error)
}

A DistributionScanner implementation should discover the Distribution information of a layer.
Distribution is typically the base operating system the layer demonstrates features of.

RepositoryScanner (currently not in use)

claircore/internal/indexer/RepositoryScanner

type RepositoryScanner interface {
	VersionedScanner
	Scan(context.Context, *claircore.Layer) ([]*claircore.Repository, error)
}

A RepositoryScanner implementation should discover any package repositories present in the layer.
This is currently not implemented; however, future plans are to match packages with their owning repository.

Updater

claircore/libvuln/driver/Updater

type Updater interface {
	Name() string
	Fetcher
	Parser
}

An Updater implementation is responsible for fetching a security advisory database and parsing the contents.
An Updater is an aggregate interface consisting of:
claircore/libvuln/driver/Fetcher
claircore/libvuln/driver/Parser

type Fingerprint string

type Fetcher interface {
	Fetch(context.Context, Fingerprint) (io.ReadCloser, Fingerprint, error)
}

A Fetcher implementation is responsible for returning an io.ReadCloser where the contents of a security database can be read from.
A Fingerprint is provided so the implementation can determine if the security database needs to be fetched.
For example, the Fingerprint may be a SHA-256 hash of the contents.
See source for more details.

type Parser interface {
	Parse(ctx context.Context, contents io.ReadCloser) ([]*claircore.Vulnerability, error)
}

The reason we split fetching and parsing is to easily support offline modes of operation.
A parser can be provided any io.ReadCloser allowing for simple scripts to be implemented for on demand parsing and indexing of CVE data.
In order to run your updater on an interval and as part of the claircore runtime you must implement both methods.

Matcher

claircore/libvuln/driver/Matcher

type Matcher interface {
	Name() string
	Filter(record *claircore.IndexRecord) bool
	Query() []MatchConstraint
	Vulnerable(record *claircore.IndexRecord, vuln *claircore.Vulnerability) bool
}

A Matcher implementation is responsible for telling ClairCore which packages to query via the Filter method, how to query the security advisory database via the Query method and whether the discovered Vulnerability from the security advisory database affects the provided package via the Vulnerable method.
See implementations for further details.

A Matcher implementation should exist next to an Updater implementation so the two can share knowledge of how vulnerabilities are indexed and queried.
A Matcher informs ClairCore how to query the security advisory database by returning a list of MatchConstraint values.
A MatchConstraint constrains a query to the security advisory database by the provided values.
Multiple MatchConstraint values are ANDed together.

type MatchConstraint int

const (
	_ MatchConstraint = iota
	DistributionDID
	DistributionName
	DistributionVersion
	DistributionVersionCodeName
	DistributionVersionID
	DistributionArch
	DistributionCPE
	DistributionPrettyName
)

As an example, the Ubuntu Updater parses and indexes vulnerabilities and populates the Vulnerability.Distribution.DID, Vulnerability.Distribution.Name, and Vulnerability.Distribution.Version fields.
The Ubuntu Matcher is aware of this and constrains its queries by returning the DistributionDID, DistributionName, and DistributionVersion constraints when its Query method is called.
ClairCore will query the security advisory database with these constraints returning only applicable vulnerabilities.

An end to end success

A successful scan looks like this:

  1. Updaters have run, either in the background on an interval or through an offline load in which their Parse methods were called, and have placed CVE data into the security advisory database.
  2. A Manifest is provided to libindex. libindex fetches all the layers, runs all scanner types on each layer, persists all artifacts found in each layer, and computes an IndexReport.
  3. An IndexReport is provided to libvuln.
  4. libvuln creates a stream of IndexRecord structs from the IndexReport and concurrently streams these structs to each configured Matcher.
  5. libvuln computes a VulnerabilityReport aggregating all vulnerabilities discovered by all configured Matcher implementations.
  6. Sometime later the security advisory database is updated and a new request to libvuln will present updated vulnerability data.

Indexer

claircore/internal/indexer

The Indexer package is responsible for retrieving Manifest layers, parsing the contents of each layer, and computing an IndexReport.
The Indexer is implemented as an FSM to correctly report its current state to the client.

States

The following diagram expresses the possible states of the Indexer.
[Diagram: indexer controller state diagram]

Data Model

The Indexer data model focuses on content addressable hashes as primary keys, the deduplication of package/distribution/repository information, and the recording of scan artifacts.
Scan artifacts are unique artifacts found within a layer which point to a deduplicated general package/distribution/repository record.

The following diagram outlines the current Indexer data model.
[Diagram: indexer data model diagram]

Matcher Architecture

claircore/internal/matcher
claircore/libvuln/driver

The Matcher architecture is based on a data flow application.
The Matcher functionality is implemented via a Controller in the matcher package and interface definitions within the libvuln.driver package.
When libvuln is provided an IndexReport the Matcher package will transform the IndexReport into a stream of IndexRecord structs.
The Matcher package then instantiates each implemented driver.Matcher interface and utilizes these implementations to determine if particular IndexRecord structs are vulnerable.

[Diagram: Matcher architecture]

Content-Addressability

ClairCore treats both image hashes and layer hashes as content addressable.
Manifests MUST provide a content addressable hash uniquely identifying the image as a whole.
Layers MUST provide a content addressable hash uniquely identifying the layer's contents.

Reducing work

ClairCore will use content addressable hashes to understand what work it needs to perform.
If ClairCore comes across an image or layer hash which has been scanned by all configured scanners it will retrieve the existing results and not perform work.
If ClairCore is started with a new set of package scanners and encounters a previously seen image or layer hash it will rescan the image or layer, but only with the scanners that have not yet scanned it.

Severity Mapping

ClairCore will normalize a security database's severity string to a set of defined values.
Clients may use the NormalizedSeverity field on a claircore.Vulnerability to react to vulnerability severities without needing to know each security database's severity strings.
All strings used in the mapping tables are identical to the strings found within the relevant security database.

ClairCore Severity Strings

The following are severity strings ClairCore will normalize others to.
Clients are guaranteed that one of these strings will be associated with a claircore.Vulnerability.

Unknown
Negligible
Low
Medium
High
Critical
Defcon1

Alpine Mapping

Alpine SecDB database does not provide severity information.
All vulnerability severities will be Unknown.

| Alpine Severity | Clair Severity |
| --------------- | -------------- |
| *               | Unknown        |

AWS Mapping

AWS UpdateInfo database provides severity information.

| AWS Severity | Clair Severity |
| ------------ | -------------- |
| low          | Low            |
| medium       | Medium         |
| important    | High           |
| critical     | Critical       |

Debian Mapping

Debian Oval database does not provide severity information.
All vulnerability severities will be Unknown.

| Debian Severity | Clair Severity |
| --------------- | -------------- |
| *               | Unknown        |

Oracle Mapping

Oracle Oval database provides severity information.

| Oracle Severity | Clair Severity |
| --------------- | -------------- |
| N/A             | Unknown        |
| LOW             | Low            |
| MODERATE        | Medium         |
| IMPORTANT       | High           |
| CRITICAL        | Critical       |

RHEL Mapping

RHEL Oval database provides severity information.

| RHEL Severity | Clair Severity |
| ------------- | -------------- |
| None          | Unknown        |
| Low           | Low            |
| Moderate      | Medium         |
| Important     | High           |
| Critical      | Critical       |

SUSE Mapping

SUSE Oval database provides severity information.

| SUSE Severity | Clair Severity |
| ------------- | -------------- |
| None          | Unknown        |
| Low           | Low            |
| Moderate      | Medium         |
| Important     | High           |
| Critical      | Critical       |

Ubuntu Mapping

Ubuntu Oval database provides severity information.

| Ubuntu Severity | Clair Severity |
| --------------- | -------------- |
| Untriaged       | Unknown        |
| Negligible      | Negligible     |
| Low             | Low            |
| Medium          | Medium         |
| High            | High           |
| Critical        | Critical       |

Contributors

This is some documentation helpful for the actual nuts-n-bolts of adding code to and testing code in the project.

Logging

All the logging in claircore is done with zerolog via context.Context values. Loggers are extracted from the Contexts via the zerolog.Ctx function, then a child logger is created via the With method and associated with a new Context via the (*zerolog.Logger).WithContext method.

This allows for claircore's logging to be used consistently throughout all the packages without having unintended prints to stderr.

How to Log

Getting a logger

In a function, first obtain a logger:

	log := zerolog.Ctx(ctx).With().

then add key-value pairs of any relevant context:

		Str("component", "Example.Logger").

then create the new logger:

		Logger()

then add the logger back to the Context so that child functions will have the annotations on the logger they extract from the Context.

	ctx = log.WithContext(ctx)

The log object shouldn't be stored in a struct and should stay function local.

Logging style

Constant Messages

Zerolog emits lines when the Msg or Msgf methods are called. Project style is to not use Msgf. Any variable data should be set as key-value pairs on the Event object.

For example, don't do this:

	log.Info().Msgf("done at: %v", time.Now())

Do this instead:

	log.Info().
		Time("time", time.Now()).
		Msg("done")

Grammar

When noting changes of state during a chunk of work, make sure that the log messages scan as visually similar. Usually, this means formatting messages into "${process} ${event}". For example:

frob start
frob initialized
frob ready
frob success
frob done

Is much easier to scan than:

starting to frob
initialized frobber
ready for frobbing
did frob
done with frobbing

Don't log and return

When handling an error, code should only log it if it does not propagate it. The code that ultimately handles the error is responsible for deciding what to do with it. Logging and returning ends up with the same message repeated multiple times in the logs.

Levels

Claircore attempts to have consistent leveled logging. The rules for figuring out what level to use are:

  • Panic

    There's some occurrence that means the process won't work correctly.

  • Fatal

    Unused, because it prevents defers from running.

  • Error

    Something unexpected occurred and the process can continue, but a human needs to be notified. An error will be returned for this request.

  • Warn

    Something unexpected occurred and the process can continue. An error will be returned for this request.

  • Info

    Some information that may be useful to an operator. Examples include a timer-based process starting and ending, a user request starting and ending, or a summary of work done.

  • Debug

    Some information that may be useful to a developer. Examples include entering and exiting functions, stepping through a function, or specific file paths used while work is being done.

Local Development

A local development environment is implemented via docker-compose.

Usage

Several make targets are defined for working with the local development environment.

local-dev-up - runs a db, libvulnhttp and libindexhttp
local-dev-logs - tails all aggregated container logs
local-dev-down - tears down the local development environment
claircore-db-up - creates just the claircore database; useful for running integration tests without test servers
claircore-db-restart - destroys and recreates a fresh database. localhost:5434
libindexhttp-restart - builds and runs libindexhttp with any new changes. localhost:8080
libvulnhttp-restart - builds and runs libvulnhttp with any new changes. localhost:8081

Tests

Several make targets are defined for working with tests.

integration - runs the integration test suite. Requires the claircore database to be up; run `make claircore-db-up` before this target
unit - runs the unit test suite
bench - runs the benchmarks
integration-v - runs the integration test suite with verbose output
unit-v - runs the unit test suite with verbose output

Cctool

Cctool is a small utility to make poking at claircore components easier.

Build

Build via the standard incantation:

go build ./cmd/cctool

Usage

Cctool is driven by subcommands. Use the "h" flag before a subcommand to see common flags and a list of subcommands. Use the "h" flag after a subcommand to see flags specific to that subcommand.

Report

The report subcommand reads in docker-like image references on the command line or stdin and outputs a column-oriented summary, suitable for passing to a tool like awk.

Report expects to talk to the development HTTP servers.

Manifest

The manifest subcommand reads in docker-like image references on the command line or stdin and outputs newline-separated JSON manifests, suitable for passing to libindex.