Integrating PVS Studio and Coverity with make

Today I will talk about integration of PVS Studio and Coverity into a make-based build system.


As a part of my work on Open Harmony, I'm looking into static analysis for C and
C++ projects.

I had prior experience with PVS Studio and Coverity. I've used them in my
personal projects as well as in my work on snap-confine, a privileged part of
snapd responsible for creation of the execution environment for snap application
processes, where security is extremely important.

Let us briefly look at how static analysis tools integrate with the build
system. In general, static analysis tools do not run the code. Instead they
read the code and deduce useful properties from it. Since C includes a
pre-processor, which handles #include and the various #if directives, the ideal
input to a static analyzer is pre-processed code. This way the static analysis
tool can be fed self-contained input that no longer relies on the system headers
or third party libraries that are necessary to understand definitions in the
code. This also makes pre-processed input convenient for SaaS-like services,
where the analysis tool is not running locally with access to the local
compiler.

In general all the static analysis tools I've tried behave this way. The main
difference, from an integrator's point of view, is whether the pre-processing
step is done implicitly or explicitly. The cheapest way to integrate with a tool
is the implicit one, namely by following and observing an existing build process.
The analyzer support tool triggers an otherwise standard build process,
observes how the compiler is executed, extracts the relevant -I, -iquote and -D
flags, finds the translation units and eventually pre-processes the code
internally.
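
To make the observation step more concrete, here is a minimal sketch of the
filtering such a tool might perform on one captured compiler command line. The
function name extract_cpp_flags is hypothetical, and real observers handle many
more options. This sketch only covers the three flags mentioned above, in both
their joined and separate spellings.

```shell
#!/bin/sh
# Hypothetical helper: print the pre-processor flags found in a compiler
# command line, one flag per output line.
extract_cpp_flags() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -I|-D|-iquote)
                # Flag and value are separate words, as in "-I include".
                if [ "$#" -ge 2 ]; then
                    printf '%s %s\n' "$1" "$2"
                    shift
                fi
                ;;
            -I?*|-D?*|-iquote?*)
                # Flag and value are joined, as in "-Iinclude".
                printf '%s\n' "$1"
                ;;
        esac
        shift
    done
}

extract_cpp_flags cc -O2 -Iinclude -DNDEBUG -iquote src -c foo.c -o foo.o
```

For the sample command line this prints -Iinclude, -DNDEBUG and -iquote src,
each on its own line, and drops everything that does not affect
pre-processing.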

Some tools offer such a "build system observer", others do not. GCC is a special
case: it doesn't need one, as the analyzer is an integral part of the compiler
itself.

Let's consider two examples, explicit pre-processing with PVS Studio and
implicit pre-processing with Coverity. Examples below use make(1) syntax.

First we need a way to pre-process an arbitrary file. Interestingly this works
for both C and C++, as the pre-processor does not care about the actual source
language. Well, maybe except for predefined macros such as __cplusplus.

%.i: %
    $(CPP) $(CPPFLAGS) $< -E -o $@

For those who don't read make, this will invoke the pre-processor, which by
convention is named by the $(CPP) variable, pass all the options defined by
another variable, $(CPPFLAGS), then the input file $<, then the -E option,
which asks the compiler to stop after pre-processing, followed by -o $@ to
write the result to the output file $@, which is the target named on the left
hand side of the rule header, %.i. The rule header shows how to make a file
with the .i extension out of any other file. Here % behaves like * in globs.

Now we can ask the static analysis tool to do its job. Let's write another rule:

%.PVS-Studio.log: %.i
    pvs-studio --cfg .pvs-studio.cfg --i-file $< --source-file $* --output-file $@

Some bits are omitted for clarity. The real rule has additional dependencies on
the PVS-Studio license file and some directories. Interestingly we need to
provide both the pre-processed file $< and the original source file $*. Here $<
is the first dependency and $* is the stem, that is the text that was matched
against %.
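
If the automatic variables feel abstract, the following shell sketch mimics the
matching make performs for a single target of this rule. The function name
explain_rule is hypothetical and exists only for illustration. Make does all of
this internally.

```shell
#!/bin/sh
# Mimic how make matches "%.PVS-Studio.log: %.i" against one target.
explain_rule() {
    target=$1
    stem=${target%.PVS-Studio.log}   # text matched by %, available as $*
    first_dep=$stem.i                # stem substituted into "%.i", available as $<
    printf 'target=%s stem=%s first_dep=%s\n' "$target" "$stem" "$first_dep"
}

explain_rule foo.c.PVS-Studio.log
```

This prints target=foo.c.PVS-Studio.log stem=foo.c first_dep=foo.c.i, which
shows why the stem doubles as the name of the original source file.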

The resulting *.PVS-Studio.log files are opaque. They represent arbitrary
knowledge extracted by the tool from an individual translation unit. This mode
of operation is beneficial for parallelism, as those tasks can be executed
concurrently. As with compilation, in the end we need to link the results
together to get the result of our analysis.

pvs-report: $(wildcard *.PVS-Studio.log)
    plog-converter --settings .pvs-studio.cfg \
        $(PLOG_CONVERTER_FLAGS) \
        --srcRoot . \
        --projectName foo \
        --projectVersion 1.0 \
        --renderTypes fullhtml \
        --output $@ \
        $^

Here we use another tool, plog-converter, to merge the results of the analysis.
There are some additional options that influence the format and contents of the
generated report but those are self-explanatory. One interesting observation is
that this command does not fail the way make check would, by exiting with an
error code, when the analysis finds problems. If you intend to block on static
analysis results there are some additional steps you need to take. For PVS
Studio I've created the appropriate rules inside zmk, so that the logs can be
processed and displayed in the same style that a compiler would otherwise
produce, which makes the output useful to editors like vim.
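
One possible shape for such a blocking step is sketched below in shell. It is
not copied from zmk. It assumes the logs were already converted into a
compiler-style text report with one finding per line, in the form
path:line:column: severity: message. The function name check_report and the
exact line format are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical gate for CI: fail when a compiler-style report has findings.
# Assumed line format: "path:line:column: severity: message".
check_report() {
    report=$1
    if [ ! -r "$report" ]; then
        echo "cannot read report: $report" >&2
        return 2
    fi
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    count=$(grep -c -E '^[^:]+:[0-9]+:[0-9]+: (error|warning|note): ' "$report" || true)
    if [ "$count" -gt 0 ]; then
        # Show the findings where editors and CI logs can pick them up.
        cat "$report" >&2
        echo "static analysis reported $count finding(s)" >&2
        return 1
    fi
    return 0
}
```

Called from a make recipe right after the report is generated, a non-zero exit
status from check_report would fail the target, which is the behaviour the
report generator itself does not provide.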

That's it for PVS Studio, now let's look at Coverity. Coverity offers a sizable
(715MB) archive with all kinds of tooling. From our point of view we need just
one tool, the cov-build wrapper. The wrapper invokes an arbitrary build command,
in our case make, and stores the analysis inside a directory. Coverity requires
that directory to be called cov-int so we will follow along for simplicity.

Here is our make rule:

cov-int: $(MAKEFILE_LIST) $(wildcard *.c *.h) # everything!
    cov-build --dir $@ $(MAKE) -B

The rule is simple and could be improved. Ideally, to avoid the -B argument
(aka --always-make), we would perform an out-of-tree build in a temporary
directory. What we are after is a condition where make invokes the compiler,
even for the files we may have built in our tree before, so that cov-build
gets to observe the relevant arguments, as was described before. The more
problematic part is the dependency list, which technically should contain
everything that the build may need as input. Here I simplified the real
dependency set. For practical CI systems that's sufficient. For purists it
requires some care to properly describe the recursive dependencies of the
implicit target (commonly called all).

The result is a directory we need to tar and send to Coverity for analysis. That
part is not interesting and details are available inside the zmk library.
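
The packaging half of that step is small enough to sketch here. The function
name pack_cov_int and the archive name are assumptions. The upload itself is
left out, as it depends on your Coverity project and credentials.

```shell
#!/bin/sh
# Hypothetical helper: archive the cov-int directory found in build_root.
# The archive keeps "cov-int" as its top-level directory name.
pack_cov_int() {
    build_root=$1   # directory that contains cov-int
    archive=$2      # output file, best given as an absolute path
    if [ ! -d "$build_root/cov-int" ]; then
        echo "no cov-int directory in $build_root" >&2
        return 1
    fi
    tar -czf "$archive" -C "$build_root" cov-int
}
```

A call such as pack_cov_int "$PWD" "$PWD/project.tgz" would then produce the
file to submit.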

Coverity processing is both asynchronous and subject to a quota. I would not
recommend using it to block builds in CI, unless you have a commercial local
instance that never rejects your uploads. Coverity has a REST API for accessing
analysis results, so with some extra integration that API can be queried and
appropriate blocking rules can be used to "break the build". Personally I did
not attempt this, mainly due to the quota.

As a last note, Coverity puts additional requirements on valid submissions. At least 80%
of compilation units must be "ready for analysis". This can be checked with an ad-hoc rule
that uses some shell to look at the analysis log file. The following shell snippet is not
quoted for correct use inside Make, see zmk for the quoted original:

test "$(tail cov-int/build-log.txt -n 3 | \
        awk -e '/[[:digit:]] C\/C\+\+ compilation units \([[:digit:]]+%) are ready for analysis/ { gsub(/[()%]/, "", $6); print $6; }')" \
     -gt 80
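
The -e option used above is not part of POSIX awk. Where only a basic awk is
available, roughly the same check can be sketched with sed. The function name
ready_percentage is hypothetical, and the log line format is taken from the
regular expression above rather than from Coverity documentation.

```shell
#!/bin/sh
# Print the percentage of compilation units that are ready for analysis,
# taken from the last matching line near the end of a cov-build log.
# Prints nothing when no such line is found.
ready_percentage() {
    log=$1
    tail -n 3 "$log" |
        sed -n 's/.*[0-9] C\/C++ compilation units (\([0-9][0-9]*\)%) are ready for analysis.*/\1/p' |
        tail -n 1
}
```

The comparison then reads much like the original one:
test "$(ready_percentage cov-int/build-log.txt)" -gt 80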

Next time we will look at the perceived value of the various static analysis
tools I've tried.

Footnote: ZMK can be found at https://github.com/zyga/zmk/



More from Zygmunt Krynicki