High Level Architecture

ClairCore is designed to be embedded into another application or service wrapper.
Two separate packages implement ClairCore's functionality: libindex and libvuln.
The main goal of these libraries is to receive a Manifest and compute a VulnerabilityReport.
The intermediate structure IndexReport is used to express all discovered artifacts within a Manifest.

The following diagram depicts the high level architecture.
high level architecture diagram

libindex is primarily implemented by the Indexer.
libvuln is primarily implemented by the Matcher.

Vulnerability Matching

This is a high level overview of the interfaces that must be implemented for end-to-end vulnerability matching.
The end-to-end process is described as well.

Scanners

Several Scanner interface types exist for extracting contents from container layers.

PackageScanner

claircore/internal/indexer/PackageScanner

type PackageScanner interface {
	VersionedScanner
	// Scan performs a package scan on the given layer and returns all
	// the found packages
	Scan(*claircore.Layer) ([]*claircore.Package, error)
}

type VersionedScanner interface {
	// unique name of the distribution scanner.
	Name() string
	// version of this scanner. this information will be persisted with the scan.
	Version() string
	// the kind of scanner. currently only package is implemented
	Kind() string
}

A PackageScanner implementation should parse a discovered package database within the provided layer and return an array of claircore.Package structures reflecting the parsed packages.
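
As an illustration only, a PackageScanner might look like the following minimal sketch. The Layer and Package types here are simplified stand-ins for claircore.Layer and claircore.Package, and the Files map and the statusScanner type are hypothetical names introduced for the example; a real implementation would read the package database from the layer's actual contents.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Layer and Package are simplified stand-ins for claircore.Layer and
// claircore.Package. The Files map (path -> contents) is an assumption
// made so the example is self-contained.
type Layer struct {
	Hash  string
	Files map[string]string
}

type Package struct {
	Name    string
	Version string
}

// statusScanner is a hypothetical PackageScanner that parses a
// dpkg-style status file.
type statusScanner struct{}

func (statusScanner) Name() string    { return "example-status" }
func (statusScanner) Version() string { return "v0.0.1" }
func (statusScanner) Kind() string    { return "package" }

// Scan returns one Package per "Package:" stanza found in the database.
func (statusScanner) Scan(l *Layer) ([]*Package, error) {
	db, ok := l.Files["var/lib/dpkg/status"]
	if !ok {
		return nil, nil // no package database in this layer
	}
	var pkgs []*Package
	var cur *Package
	sc := bufio.NewScanner(strings.NewReader(db))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "Package: "):
			cur = &Package{Name: strings.TrimPrefix(line, "Package: ")}
			pkgs = append(pkgs, cur)
		case strings.HasPrefix(line, "Version: ") && cur != nil:
			cur.Version = strings.TrimPrefix(line, "Version: ")
		}
	}
	return pkgs, sc.Err()
}

func main() {
	l := &Layer{Hash: "sha256:example", Files: map[string]string{
		"var/lib/dpkg/status": "Package: bash\nVersion: 5.0-4\n\nPackage: curl\nVersion: 7.64.0-4\n",
	}}
	pkgs, err := statusScanner{}.Scan(l)
	if err != nil {
		panic(err)
	}
	for _, p := range pkgs {
		fmt.Println(p.Name, p.Version)
	}
}
```

Returning a nil slice when no database is found lets the caller treat "nothing to scan" as a normal result rather than an error.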

DistributionScanner

claircore/internal/indexer/DistributionScanner

type DistributionScanner interface {
	VersionedScanner
	Scan(context.Context, *claircore.Layer) ([]*claircore.Distribution, error)
}

A DistributionScanner implementation should discover the Distribution information of the layer.
The Distribution is typically the base operating system whose features the layer exhibits.

RepositoryScanner (currently not in use)

claircore/internal/indexer/RepositoryScanner

type RepositoryScanner interface {
	VersionedScanner
	Scan(context.Context, *claircore.Layer) ([]*claircore.Repository, error)
}

A RepositoryScanner implementation should discover any package repositories present in the layer.
This is not currently implemented; however, future plans are to match packages with their owning repository.

Updater

claircore/libvuln/driver/Updater

type Updater interface {
	Name() string
	Fetcher
	Parser
}

An Updater implementation is responsible for fetching a security advisory database and parsing the contents.
An Updater is an aggregate interface consisting of:
claircore/libvuln/driver/Fetcher
claircore/libvuln/driver/Parser

type Fingerprint string

type Fetcher interface {
	Fetch(context.Context, Fingerprint) (io.ReadCloser, Fingerprint, error)
}

A Fetcher implementation is responsible for returning an io.ReadCloser where the contents of a security database can be read from.
A Fingerprint is provided so the implementation can determine if the security database needs to be fetched.
For example, the Fingerprint may be a SHA-256 hash of the contents.
See the source for more details.

type Parser interface {
	Parse(ctx context.Context, contents io.ReadCloser) ([]*claircore.Vulnerability, error)
}

The reason we split fetching and parsing is to easily support offline modes of operation.
A parser can be provided any io.ReadCloser allowing for simple scripts to be implemented for on demand parsing and indexing of CVE data.
In order to run your updater on an interval and as part of the claircore runtime you must implement both methods.

Matcher

claircore/libvuln/driver/Matcher

type Matcher interface {
	Name() string
	Filter(record *claircore.IndexRecord) bool
	Query() []MatchConstraint
	Vulnerable(record *claircore.IndexRecord, vuln *claircore.Vulnerability) bool
}

A Matcher implementation is responsible for telling ClairCore which packages to query via the Filter method, how to query the security advisory database via the Query method and whether the discovered Vulnerability from the security advisory database affects the provided package via the Vulnerable method.
See implementations for further details.

A Matcher implementation should exist next to an Updater implementation so that the two can share this information.
A Matcher informs ClairCore how to query the security advisory database by returning a list of MatchConstraint values.
A MatchConstraint constrains a query to the security advisory database by the provided values.
Multiple MatchConstraint values will be 'AND'd together.

type MatchConstraint int

const (
	_ MatchConstraint = iota
	DistributionDID
	DistributionName
	DistributionVersion
	DistributionVersionCodeName
	DistributionVersionID
	DistributionArch
	DistributionCPE
	DistributionPrettyName
)

As an example the Ubuntu Updater parses and indexes vulnerabilities and populates the Vulnerability.Distribution.DID, Vulnerability.Distribution.Name, and Vulnerability.Distribution.Version fields.
The Ubuntu Matcher is aware of this and constrains its queries by returning the DistributionDID, DistributionName, and DistributionVersion constraints when its Query method is called.
ClairCore will query the security advisory database with these constraints returning only applicable vulnerabilities.
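
A Matcher along these lines could be sketched as below. The IndexRecord, Package, Distribution, and Vulnerability types are reduced stand-ins for the claircore types, exampleMatcher is a hypothetical name, and versionLess is a deliberately naive dotted-integer comparison; real distribution version schemes need a dedicated comparison routine.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type MatchConstraint int

const (
	_ MatchConstraint = iota
	DistributionDID
	DistributionName
	DistributionVersion
)

// Reduced stand-ins for the claircore types.
type Package struct{ Name, Version string }
type Distribution struct{ DID, Name, Version string }
type IndexRecord struct {
	Package      *Package
	Distribution *Distribution
}
type Vulnerability struct {
	Name           string
	Package        *Package
	FixedInVersion string
}

// exampleMatcher is a hypothetical Matcher for one distribution.
type exampleMatcher struct{}

func (exampleMatcher) Name() string { return "example-ubuntu" }

// Filter selects only the records this matcher knows how to handle.
func (exampleMatcher) Filter(r *IndexRecord) bool {
	return r.Distribution != nil && r.Distribution.DID == "ubuntu"
}

// Query lists the fields used to constrain the database lookup.
func (exampleMatcher) Query() []MatchConstraint {
	return []MatchConstraint{DistributionDID, DistributionName, DistributionVersion}
}

// Vulnerable reports whether the installed version predates the fix.
func (exampleMatcher) Vulnerable(r *IndexRecord, v *Vulnerability) bool {
	if v.FixedInVersion == "" {
		return true // no fix recorded: treat every version as affected
	}
	return versionLess(r.Package.Version, v.FixedInVersion)
}

// versionLess compares dot-separated integer versions (a simplification).
func versionLess(a, b string) bool {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) && i < len(bs); i++ {
		x, _ := strconv.Atoi(as[i])
		y, _ := strconv.Atoi(bs[i])
		if x != y {
			return x < y
		}
	}
	return len(as) < len(bs)
}

func main() {
	m := exampleMatcher{}
	rec := &IndexRecord{
		Package:      &Package{Name: "bash", Version: "5.0.3"},
		Distribution: &Distribution{DID: "ubuntu", Name: "Ubuntu", Version: "18.04"},
	}
	vuln := &Vulnerability{Name: "CVE-0000-0001", Package: &Package{Name: "bash"}, FixedInVersion: "5.0.10"}
	if m.Filter(rec) {
		fmt.Println("constraints:", m.Query())
		fmt.Println("vulnerable:", m.Vulnerable(rec, vuln))
	}
}
```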

An end to end success

A successful scan looks like this:

Updaters have run either in the background on an interval, or have had their Parse methods called and an offline load has placed CVE data into the security advisory database.
A Manifest is provided to libindex. libindex fetches all the layers, runs all scanner types on each layer, persists all artifacts found in each layer, and computes an IndexReport.
An IndexReport is provided to libvuln.
libvuln creates a stream of IndexRecord structs from the IndexReport and concurrently streams these structs to each configured Matcher.
libvuln computes a VulnerabilityReport aggregating all vulnerabilities discovered by all configured Matcher implementations.
Sometime later the security advisory database is updated and a new request to libvuln will present updated vulnerability data.

Indexer

claircore/internal/indexer

The Indexer package is responsible for retrieving Manifest layers, parsing the contents of each layer, and computing an IndexReport.
The Indexer is implemented as a finite state machine (FSM) so that it can correctly report its current state to the client.

States

The following diagram expresses the possible states of the Indexer
indexer controller state diagram

Data Model

The Indexer data model focuses on content addressable hashes as primary keys, the deduplication of package/distribution/repository information, and the recording of scan artifacts.
Scan artifacts are unique artifacts found within a layer which point to a deduplicated general package/distribution/repository record.

The following diagram outlines the current Indexer data model.
indexer data model diagram

Matcher Architecture

claircore/internal/matcher
claircore/libvuln/driver

The Matcher architecture is based on a data flow application.
The Matcher functionality is implemented via a Controller in the matcher package and interface definitions within the libvuln.driver package.
When libvuln is provided an IndexReport the Matcher package will transform the IndexReport into a stream of IndexRecord structs.
The Matcher package then instantiates each implemented driver.Matcher interface and utilizes these implementations to determine if particular IndexRecord structs are vulnerable.
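
The data flow described above can be approximated with the following sketch, in which every configured matcher runs in its own goroutine over the same records and the findings are merged. It is not the Controller's actual implementation: the reduced matcher interface, the pkgMatcher type, and the use of a slice in place of a stream are all simplifications assumed for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// IndexRecord is a reduced stand-in for claircore.IndexRecord.
type IndexRecord struct {
	Package string
	Version string
}

// matcher is a reduced, hypothetical form of driver.Matcher.
type matcher interface {
	Name() string
	Filter(*IndexRecord) bool
	Vulnerable(*IndexRecord) bool
}

// pkgMatcher flags one package at one known-bad version.
type pkgMatcher struct{ name, pkg, badVersion string }

func (m pkgMatcher) Name() string                   { return m.name }
func (m pkgMatcher) Filter(r *IndexRecord) bool     { return r.Package == m.pkg }
func (m pkgMatcher) Vulnerable(r *IndexRecord) bool { return r.Version == m.badVersion }

// match runs each matcher concurrently and aggregates the findings.
func match(records []*IndexRecord, ms []matcher) []string {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		findings []string
	)
	for _, m := range ms {
		wg.Add(1)
		go func(m matcher) {
			defer wg.Done()
			for _, r := range records {
				if m.Filter(r) && m.Vulnerable(r) {
					mu.Lock() // findings is shared between goroutines
					findings = append(findings, m.Name()+": "+r.Package+" "+r.Version)
					mu.Unlock()
				}
			}
		}(m)
	}
	wg.Wait()
	sort.Strings(findings) // goroutine order is not deterministic
	return findings
}

func main() {
	records := []*IndexRecord{
		{Package: "bash", Version: "5.0"},
		{Package: "curl", Version: "7.64"},
	}
	ms := []matcher{
		pkgMatcher{name: "a", pkg: "bash", badVersion: "5.0"},
		pkgMatcher{name: "b", pkg: "curl", badVersion: "7.0"},
	}
	for _, f := range match(records, ms) {
		fmt.Println(f)
	}
}
```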

matcher architecture diagram

Content-Addressability

ClairCore treats both image hashes and layer hashes as content addressable.
Manifests MUST provide a content addressable hash uniquely identifying the image as a whole.
Layers MUST provide a content addressable hash uniquely identifying the layer's contents.

Reducing work

ClairCore will use content addressable hashes to understand what work it needs to perform.
If ClairCore comes across an image or layer hash which has already been scanned by all configured scanners, it will retrieve the existing results and perform no work.
If ClairCore is started with a new set of package scanners and encounters a previously seen image or layer hash, it will rescan the image or layer.
In that case ClairCore will only run the missing scanners on the incoming image or layers.
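
The bookkeeping behind this can be sketched as a lookup keyed by layer hash. The scanStore type and its methods are hypothetical and keep everything in memory; they only illustrate how a content addressable hash plus a scanner name is enough to decide what work remains.

```go
package main

import "fmt"

// scanStore records which scanners have already processed which layer,
// keyed by the layer's content addressable hash. Hypothetical type.
type scanStore map[string]map[string]bool

// markScanned notes that the named scanner has processed the layer.
func (s scanStore) markScanned(layerHash, scanner string) {
	if s[layerHash] == nil {
		s[layerHash] = map[string]bool{}
	}
	s[layerHash][scanner] = true
}

// missing returns the configured scanners that have not yet run on the layer.
func (s scanStore) missing(layerHash string, configured []string) []string {
	var out []string
	for _, name := range configured {
		if !s[layerHash][name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	store := scanStore{}
	store.markScanned("sha256:aaa", "dpkg")

	// Same layer, same scanners: nothing left to do.
	fmt.Println(store.missing("sha256:aaa", []string{"dpkg"}))
	// A new scanner was configured: only that scanner must run.
	fmt.Println(store.missing("sha256:aaa", []string{"dpkg", "rpm"}))
	// An unseen layer: every configured scanner must run.
	fmt.Println(store.missing("sha256:bbb", []string{"dpkg", "rpm"}))
}
```

A fuller version would likely key on the scanner's version as well as its name, since VersionedScanner exposes both.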

Local Development

A local development environment is implemented via docker-compose.

Usage

Several make targets are defined for working with the local development environment.

local-dev-up - runs a db, libvulnhttp and libindexhttp
local-dev-logs - tails all aggregated container logs
local-dev-down - tears down the local development environment
claircore-db-up - creates just the claircore database useful for running integration tests without test servers
claircore-db-restart - destroys and recreates a fresh database. localhost:5434
libindexhttp-restart - builds and runs libindexhttp with any new changes. localhost:8080
libvulnhttp-restart - builds and runs libvulnhttp with any new changes. localhost:8081

Tests

Several make targets are defined for working with tests.

integration - runs the integration test suite. Requires the claircore database to be up; run `make claircore-db-up` before this target.
unit - runs the unit test suite.
bench - runs the benchmarks.
integration-v - runs the integration test suite with verbose output.
unit-v - runs the unit test suite with verbose output.