High Level Architecture
ClairCore is designed to be embedded into another application or service wrapper.
Two separate packages implement ClairCore's functionality: libindex and libvuln.
The main goal of these libraries is to receive a Manifest and compute a VulnerabilityReport.
The intermediate structure IndexReport is used to express all discovered artifacts within a Manifest.
The following diagram depicts the high level architecture.
libindex is primarily implemented by the Indexer.
libvuln is primarily implemented by the Matcher.
Vulnerability Matching
This is a high level overview of the interfaces that must be implemented for end-to-end vulnerability matching.
The end-to-end process is described as well.
Scanners
Several Scanner interface types exist for extracting contents from container layers.
PackageScanner
claircore/internal/indexer/PackageScanner
type PackageScanner interface {
	VersionedScanner
	// Scan performs a package scan on the given layer and returns all
	// the found packages
	Scan(*claircore.Layer) ([]*claircore.Package, error)
}

type VersionedScanner interface {
	// unique name of the distribution scanner.
	Name() string
	// version of this scanner. this information will be persisted with the scan.
	Version() string
	// the kind of scanner. currently only package is implemented
	Kind() string
}
A PackageScanner implementation should parse a discovered package database within the provided layer and return an array of claircore.Package structures reflecting the parsed packages.
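As a rough illustration of that contract, the sketch below implements a scanner against simplified stand-in types. The `Layer` and `Package` structs, the `dbPath` location, and the "name version" database format are all invented for this example and are not claircore's real definitions.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// Layer and Package are simplified stand-ins for claircore.Layer and
// claircore.Package; the real types carry more fields.
type Layer struct {
	Files map[string][]byte // hypothetical: path -> file contents
}

type Package struct {
	Name    string
	Version string
}

// dbPath is an invented database location used only in this sketch.
const dbPath = "var/lib/example/packages"

// exampleScanner parses a hypothetical package database in which every
// line has the form "name version".
type exampleScanner struct{}

func (exampleScanner) Name() string    { return "example" }
func (exampleScanner) Version() string { return "v0.0.1" }
func (exampleScanner) Kind() string    { return "package" }

// Scan returns one Package per database line found in the layer.
func (exampleScanner) Scan(l *Layer) ([]*Package, error) {
	db, ok := l.Files[dbPath]
	if !ok {
		// No database in this layer: nothing to report, not an error.
		return nil, nil
	}
	var pkgs []*Package
	s := bufio.NewScanner(bytes.NewReader(db))
	for s.Scan() {
		fields := strings.Fields(s.Text())
		if len(fields) != 2 {
			return nil, fmt.Errorf("malformed database line %q", s.Text())
		}
		pkgs = append(pkgs, &Package{Name: fields[0], Version: fields[1]})
	}
	return pkgs, s.Err()
}

func main() {
	l := &Layer{Files: map[string][]byte{
		dbPath: []byte("bash 5.0-4\nopenssl 1.1.1d-0\n"),
	}}
	pkgs, err := exampleScanner{}.Scan(l)
	if err != nil {
		panic(err)
	}
	for _, p := range pkgs {
		fmt.Println(p.Name, p.Version)
	}
}
```

Returning no packages and no error for a layer without a database is a choice made for this sketch: a layer lacking the database is expected, not exceptional.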
DistributionScanner
claircore/internal/indexer/DistributionScanner
type DistributionScanner interface {
	VersionedScanner
	Scan(context.Context, *claircore.Layer) ([]*claircore.Distribution, error)
}
A DistributionScanner implementation should discover the Distribution information of a layer.
The Distribution is typically the base operating system whose features the layer exhibits.
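One common source of distribution information is an os-release file. The sketch below parses such a file into a reduced stand-in for claircore.Distribution. Which os-release key feeds which field is an assumption made for this example, and the `parseOSRelease` helper is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Distribution is a reduced stand-in for claircore.Distribution.
type Distribution struct {
	DID        string
	Name       string
	Version    string
	VersionID  string
	PrettyName string
}

// parseOSRelease reads KEY=value lines in os-release format. The mapping
// of keys to fields is an assumption of this sketch.
func parseOSRelease(contents string) *Distribution {
	d := &Distribution{}
	s := bufio.NewScanner(strings.NewReader(contents))
	for s.Scan() {
		line := strings.TrimSpace(s.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		kv := strings.SplitN(line, "=", 2)
		if len(kv) != 2 {
			continue // not a KEY=value line
		}
		val := strings.Trim(kv[1], `"'`) // values may be quoted
		switch kv[0] {
		case "ID":
			d.DID = val
		case "NAME":
			d.Name = val
		case "VERSION":
			d.Version = val
		case "VERSION_ID":
			d.VersionID = val
		case "PRETTY_NAME":
			d.PrettyName = val
		}
	}
	return d
}

func main() {
	// Sample contents written for this example.
	sample := `NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
VERSION_ID="18.04"
PRETTY_NAME="Ubuntu 18.04.3 LTS"
`
	fmt.Printf("%+v\n", *parseOSRelease(sample))
}
```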
RepositoryScanner (currently not in use)
claircore/internal/indexer/RepositoryScanner
type RepositoryScanner interface {
	VersionedScanner
	Scan(context.Context, *claircore.Layer) ([]*claircore.Repository, error)
}
A RepositoryScanner implementation should discover any package repositories present in the layer.
This is currently not implemented; however, future plans are to match packages with their owning repository.
Updater
claircore/libvuln/driver/Updater
type Updater interface {
	Name() string
	Fetcher
	Parser
}
An Updater implementation is responsible for fetching a security advisory database and parsing the contents.
An Updater is an aggregate interface consisting of:
claircore/libvuln/driver/Fetcher
claircore/libvuln/driver/Parser
type Fingerprint string
type Fetcher interface {
	Fetch(context.Context, Fingerprint) (io.ReadCloser, Fingerprint, error)
}
A Fetcher implementation is responsible for returning an io.ReadCloser from which the contents of a security database can be read.
A Fingerprint is provided so the implementation can determine whether the security database needs to be fetched.
For example, the Fingerprint may be a SHA-256 hash of the contents.
See the source for more details.
type Parser interface {
	Parse(ctx context.Context, contents io.ReadCloser) ([]*claircore.Vulnerability, error)
}
Fetching and parsing are split to easily support offline modes of operation.
A Parser can be provided any io.ReadCloser, allowing simple scripts to be implemented for on-demand parsing and indexing of CVE data.
To run your updater on an interval as part of the claircore runtime, you must implement both methods.
Matcher
claircore/libvuln/driver/Matcher
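type Matcher interface {
	Name() string
	// Filter reports whether the record should be queried for vulnerabilities.
	Filter(record *claircore.IndexRecord) bool
	// Query returns the constraints used to query the security advisory database.
	Query() []MatchConstraint
	// Vulnerable reports whether the vulnerability affects the record's package.
	Vulnerable(record *claircore.IndexRecord, vuln *claircore.Vulnerability) bool
}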
A Matcher implementation is responsible for telling ClairCore which packages to query via the Filter method, how to query the security advisory database via the Query method, and whether a discovered Vulnerability from the security advisory database affects the provided package via the Vulnerable method.
See implementations for further details.
A Matcher implementation should live next to its corresponding Updater implementation so the two can share knowledge of how vulnerabilities are indexed and queried.
A Matcher informs ClairCore how to query the security advisory database by returning a list of MatchConstraint values.
A MatchConstraint constrains a query to the security advisory database by the provided values.
Multiple MatchConstraint values will be 'AND'd together.
type MatchConstraint int
const (
	_ MatchConstraint = iota
	DistributionDID
	DistributionName
	DistributionVersion
	DistributionVersionCodeName
	DistributionVersionID
	DistributionArch
	DistributionCPE
	DistributionPrettyName
)
As an example, the Ubuntu Updater parses and indexes vulnerabilities and populates the Vulnerability.Distribution.DID, Vulnerability.Distribution.Name, and Vulnerability.Distribution.Version fields.
The Ubuntu Matcher is aware of this and constrains its queries by returning the DistributionDID, DistributionName, and DistributionVersion constraints when its Query method is called.
ClairCore will query the security advisory database with these constraints, returning only applicable vulnerabilities.
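The effect of AND-ing constraints can be sketched with an in-memory analogue of that query. This is not how ClairCore queries its database: the `query` and `field` helpers, the reduced `Distribution` and `Vulnerability` types, and the sample data are assumptions made for illustration, and only the first three constraints are modelled.

```go
package main

import "fmt"

type MatchConstraint int

const (
	_ MatchConstraint = iota
	DistributionDID
	DistributionName
	DistributionVersion
)

// Distribution and Vulnerability are reduced stand-ins for the
// claircore types of the same names.
type Distribution struct {
	DID, Name, Version string
}

type Vulnerability struct {
	Name string
	Dist Distribution
}

// field returns the Distribution field that a constraint refers to.
func field(d Distribution, c MatchConstraint) string {
	switch c {
	case DistributionDID:
		return d.DID
	case DistributionName:
		return d.Name
	case DistributionVersion:
		return d.Version
	}
	return ""
}

// query keeps only the vulnerabilities for which every constraint
// matches the record's distribution, i.e. the constraints are AND-ed.
func query(db []Vulnerability, record Distribution, cs []MatchConstraint) []Vulnerability {
	var out []Vulnerability
next:
	for _, v := range db {
		for _, c := range cs {
			if field(v.Dist, c) != field(record, c) {
				continue next
			}
		}
		out = append(out, v)
	}
	return out
}

func main() {
	bionic := Distribution{DID: "ubuntu", Name: "Ubuntu", Version: "18.04"}
	focal := Distribution{DID: "ubuntu", Name: "Ubuntu", Version: "20.04"}
	db := []Vulnerability{{Name: "vuln-a", Dist: bionic}, {Name: "vuln-b", Dist: focal}}

	all := []MatchConstraint{DistributionDID, DistributionName, DistributionVersion}
	for _, v := range query(db, bionic, all) {
		fmt.Println(v.Name)
	}
}
```

With all three constraints only the record's own release matches. Dropping DistributionVersion would widen the result to every Ubuntu release, which is why a Matcher should return constraints for exactly the fields its Updater populates.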
An end to end success
A successful scan looks like this:
- Updaters have either run in the background on an interval, or have had their Parse methods called and an offline load has placed CVE data into the security advisory database.
- A Manifest is provided to libindex. libindex fetches all the layers, runs all scanner types on each layer, persists all artifacts found in each layer, and computes an IndexReport.
- An IndexReport is provided to libvuln.
- libvuln creates a stream of IndexRecord structs from the IndexReport and concurrently streams these structs to each configured Matcher.
- libvuln computes a VulnerabilityReport aggregating all vulnerabilities discovered by all configured Matcher implementations.
- Sometime later the security advisory database is updated and a new request to libvuln will present updated vulnerability data.
Indexer
claircore/internal/indexer
The Indexer package is responsible for retrieving Manifest layers, parsing the contents of each layer, and computing an IndexReport.
The Indexer is implemented as an FSM to correctly report its current state to the client.
States
The following diagram expresses the possible states of the Indexer.
Data Model
The Indexer data model focuses on content addressable hashes as primary keys, the deduplication of package/distribution/repository information, and the recording of scan artifacts.
Scan artifacts are unique artifacts found within a layer which point to a deduplicated general package/distribution/repository record.
The following diagram outlines the current Indexer data model.
Matcher Architecture
claircore/internal/matcher
claircore/libvuln/driver
The Matcher architecture is based on a data flow application.
The Matcher functionality is implemented via a Controller in the matcher package and interface definitions within the libvuln.driver package.
When libvuln is provided an IndexReport, the Matcher package will transform the IndexReport into a stream of IndexRecord structs.
The Matcher package then instantiates each implemented driver.Matcher interface and uses these implementations to determine whether particular IndexRecord structs are vulnerable.
Content-Addressability
ClairCore treats both image hashes and layer hashes as content addressable.
Manifests MUST provide a content addressable hash uniquely identifying the image as a whole.
Layers MUST provide a content addressable hash uniquely identifying the layer's contents.
Reducing work
ClairCore will use content addressable hashes to understand what work it needs to perform.
If ClairCore comes across an image or layer hash which has been scanned by all configured scanners, it will retrieve the existing results and not perform work.
If ClairCore is started with a new set of package scanners and encounters a previously seen image or layer hash it will rescan the image or layer.
ClairCore will only perform a scan with the missing scanner on the incoming image or layers.
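The bookkeeping behind this can be sketched as a lookup keyed by content addressable hash. The `results` map, the `scanner` struct, and the `missing` helper are hypothetical names for this example and say nothing about how ClairCore actually persists scan state.

```go
package main

import "fmt"

// scanner identifies a scanner by the values a VersionedScanner reports.
type scanner struct {
	Name, Version, Kind string
}

// results records, per content addressable hash, which scanners have
// already produced results. In-memory stand-in for persisted state.
type results map[string]map[scanner]bool

// missing returns the configured scanners that have not yet scanned the
// given hash. An empty result means no work is required.
func missing(seen results, hash string, configured []scanner) []scanner {
	var todo []scanner
	for _, sc := range configured {
		if !seen[hash][sc] {
			todo = append(todo, sc)
		}
	}
	return todo
}

func main() {
	dpkg := scanner{"dpkg", "v1", "package"}
	rpm := scanner{"rpm", "v1", "package"}
	seen := results{"sha256:aaaa": {dpkg: true}}

	// Previously seen layer, newly configured rpm scanner: only rpm runs.
	fmt.Println(missing(seen, "sha256:aaaa", []scanner{dpkg, rpm}))
	// Unseen layer: every configured scanner runs.
	fmt.Println(missing(seen, "sha256:bbbb", []scanner{dpkg, rpm}))
}
```

Including the scanner version in the key means that, in this sketch, bumping a scanner's version also triggers a rescan with that scanner.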
Severity Mapping
ClairCore will normalize a security database's severity string to a set of defined values.
Clients may use the NormalizedSeverity field on a claircore.Vulnerability to react to vulnerability severities without needing to know each security database's severity strings.
All strings used in the mapping tables are identical to the strings found within the relevant security database.
ClairCore Severity Strings
The following are severity strings ClairCore will normalize others to.
Clients are guaranteed that one of these strings will be associated with a claircore.Vulnerability.
Unknown
Negligible
Low
Medium
High
Critical
Defcon1
Alpine Mapping
Alpine SecDB database does not provide severity information.
All vulnerability severities will be Unknown.
Alpine Severity | Clair Severity |
---|---|
* | Unknown |
AWS Mapping
AWS UpdateInfo database provides severity information.
AWS Severity | Clair Severity |
---|---|
low | Low |
medium | Medium |
important | High |
critical | Critical |
Debian Mapping
Debian Oval database does not provide severity information.
All vulnerability severities will be Unknown.
Debian Severity | Clair Severity |
---|---|
* | Unknown |
Oracle Mapping
Oracle Oval database provides severity information.
Oracle Severity | Clair Severity |
---|---|
N/A | Unknown |
LOW | Low |
MODERATE | Medium |
IMPORTANT | High |
CRITICAL | Critical |
RHEL Mapping
RHEL Oval database provides severity information.
RHEL Severity | Clair Severity |
---|---|
None | Unknown |
Low | Low |
Moderate | Medium |
Important | High |
Critical | Critical |
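
A mapping table like the one above translates naturally into a lookup. The sketch below encodes the RHEL table. The `Severity` type, the `normalizeRHEL` function, and the choice to map unrecognized strings to Unknown are assumptions of this example, not claircore's actual implementation.

```go
package main

import "fmt"

// Severity mirrors the ClairCore severity strings listed earlier.
type Severity int

const (
	Unknown Severity = iota
	Negligible
	Low
	Medium
	High
	Critical
	Defcon1
)

var severityNames = [...]string{
	"Unknown", "Negligible", "Low", "Medium", "High", "Critical", "Defcon1",
}

func (s Severity) String() string {
	if s < 0 || int(s) >= len(severityNames) {
		return "Unknown"
	}
	return severityNames[s]
}

// rhelSeverity encodes the RHEL mapping table. Keys are the exact
// strings found in the security database.
var rhelSeverity = map[string]Severity{
	"None":      Unknown,
	"Low":       Low,
	"Moderate":  Medium,
	"Important": High,
	"Critical":  Critical,
}

// normalizeRHEL maps a RHEL severity string to a normalized Severity.
// Falling back to Unknown for unlisted strings is an assumption.
func normalizeRHEL(s string) Severity {
	if sev, ok := rhelSeverity[s]; ok {
		return sev
	}
	return Unknown
}

func main() {
	for _, s := range []string{"Moderate", "Important", "something-else"} {
		fmt.Println(s, "->", normalizeRHEL(s))
	}
}
```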
SUSE Mapping
SUSE Oval database provides severity information.
SUSE Severity | Clair Severity |
---|---|
None | Unknown |
Low | Low |
Moderate | Medium |
Important | High |
Critical | Critical |
Ubuntu Mapping
Ubuntu Oval database provides severity information.
Ubuntu Severity | Clair Severity |
---|---|
Untriaged | Unknown |
Negligible | Negligible |
Low | Low |
Medium | Medium |
High | High |
Critical | Critical |
Contributors
This is some documentation helpful for the actual nuts-n-bolts of adding code to and testing code in the project.
Logging
All the logging in claircore is done with zerolog via context.Context values. Loggers are extracted from the Contexts via the zerolog.Ctx function, then a child logger is created via the With method and associated with a new Context via the (*zerolog.Logger).WithContext method.
This allows for claircore's logging to be used consistently throughout all the packages without having unintended prints to stderr.
How to Log
Getting a logger
In a function, first obtain a logger:
log := zerolog.Ctx(ctx).With().
then add key-value pairs of any relevant context:
Str("component", "Example.Logger").
then create the new logger:
Logger()
then add the logger back to the Context so that child functions will have the annotations on the logger they extract from the Context.
ctx = log.WithContext(ctx)
The log object shouldn't be stored in a struct and should stay function local.
Logging style
Constant Messages
Zerolog emits lines when the Msg or Msgf methods are called. Project style is to not use Msgf. Any variable data should be set as key-value pairs on the Event object.
For example, don't do this:
log.Info().Msgf("done at: %v", time.Now())
Do this instead:
log.Info().
	Time("time", time.Now()).
	Msg("done")
Grammar
When noting the changes during a chunk of work, make sure that the log messages scan as visually similar. Usually, this means formatting messages into "${process} ${event}". For example:
frob start
frob initialized
frob ready
frob success
frob done
Is much easier to scan than:
starting to frob
initialized frobber
ready for frobbing
did frob
done with frobing
Don't log and return
When handling an error, code should only log it if it does not propagate it. The code that ultimately handles the error is responsible for deciding what to do with it. Logging and returning ends up with the same message repeated multiple times in the logs.
Levels
Claircore attempts to have consistent leveled logging. The rules for figuring out what level to use are:
- Panic: There's some occurrence that means the process won't work correctly.
- Fatal: Unused, because it prevents defers from running.
- Error: Something unexpected occurred and the process can continue, but a human needs to be notified. An error will be returned for this request.
- Warn: Something unexpected occurred and the process can continue. An error will be returned for this request.
- Info: Some information that may be useful to an operator. Examples include a timer-based process starting and ending, a user request starting and ending, or a summary of work done.
- Debug: Some information that may be useful to a developer. Examples include entering and exiting functions, stepping through a function, or specific file paths used while work is being done.
Local Development
A local development environment is implemented via docker-compose.
Usage
Several make targets are defined for working with the local development environment.
local-dev-up - runs a db, libvulnhttp and libindexhttp
local-dev-logs - tails all aggregated container logs
local-dev-down - tears down the local development environment
claircore-db-up - creates just the claircore database, useful for running integration tests without test servers
claircore-db-restart - destroys and recreates a fresh database. localhost:5434
libindexhttp-restart - builds and runs libindexhttp with any new changes. localhost:8080
libvulnhttp-restart - builds and runs libvulnhttp with any new changes. localhost:8081
Tests
Several make targets are defined for working with tests.
integration - runs the integration test suite. requires the claircore-db to be up. run `make claircore-db-up` before this target
unit - runs the unit test suite.
bench - runs the benchmarks
integration-v - runs the integration test suite with verbose output
unit-v - runs the unit test suite with verbose output
Cctool
Cctool is a small utility to make poking at claircore components easier.
Build
Build via the standard incantation:
go build ./cmd/cctool
Usage
Cctool is driven by subcommands. Use the "h" flag before a subcommand to see common flags and a list of subcommands. Use the "h" flag after a subcommand to see flags specific to that subcommand.
Report
The report subcommand reads in docker-like image references on the command line or stdin and outputs a column-oriented summary, suitable for passing to a tool like awk.
Report expects to talk to the development HTTP servers.
Manifest
The manifest subcommand reads in docker-like image references on the command line or stdin and outputs newline-separated JSON manifests, suitable for passing to libindex.