A lot has improved over the last few years in terms of availability of C/C++ source code security scanners. Many scanners are now available for free for open-source projects, not only improving the security of commons code, but also allowing developers to get some hands-on experience and learn how they operate. In this part I’m discussing Synopsys Coverity, clang-analyzer and AddressSanitizer.
Since the late 90s I’ve been maintaining the pam_tacplus project, which I gradually also started using as a test subject for various source code security scanners available for the C language. The project is an ideal candidate: very old, multi-platform code that writes and reads network packets, with an enormous potential for remote code execution. Since then a lot has changed in terms of coding standards and available tooling, but the 90s were also the time when the theory behind the first buffer overflow attacks was published and gained popularity that lasts even today. Since I knew the research, I wrote the code with a paranoid approach to buffer management, format strings and all the typical C problems, and it has gone all these years with no serious security issues.
The first scanners I’ve tried were early pattern-based
grep-like tools that could tell you that you’re using
and nothing much beyond it. These were rather disappointing. I returned to the idea probably a decade later,
when Coverity (later acquired by Synopsys) was made available for free for open-source projects. The
pam_tacplus project was onboarded into Coverity and a few minor issues (not security related) were fixed.
The tool still stands out in terms of how useful it is.
Over the last couple of years more SAST vendors started to offer new tools to the FOSS community, and CI/CD integration became much easier, up to the moment when GitHub acquired Semmle and integrated it into their Actions build pipelines under the name CodeQL. Coverity is also easily integrated, and SonarCloud is another rather sophisticated C/C++ scanner that I have tried.
Coverity for C is a wrapper-based scanner that works in two stages:
- first, you build your project through a Coverity-provided wrapper around the C compiler, which compiles the program while saving tons of AST information
- second, the data is uploaded for analysis to scan.coverity.com, where you can browse the reports.
My project is integrated with Coverity using GitHub Actions in the .github/workflows/coverity.yml file, so that every push or merge request triggers a rescan. Coverity does an absolutely fantastic job when it comes to both SAST and presentation of results. The scanner tracks data through all conditional branches, and indeed finds edge cases which can result in unexpected behaviour.
The following example shows a bug in gnulib, which is integrated with my project, that is triggered when an empty list is passed to an iterator function. I first saw that bug triggered in my app and detected by ASAN (more on that below), and it was also found by Coverity. Note the clear indication of the execution branches (in green) that lead to the bug.
Another bug found by Coverity was a memory leak (also found by ASAN):
Summary: Coverity, in my opinion, offers the best C/C++ security scanning solution right now, both in terms of the quality of its data and execution flow analysis and in terms of presentation to the developer.
clang-analyzer is a SAST scanner bundled with the Clang compiler. It’s run as
scan-build (on some systems it comes with Clang, on some it’s a separate package). In the case of
my project, it’s run in the following manner:
    scan-build --use-cc=clang ./configure
    scan-build --use-cc=clang make clean all
Don’t set your expectations too high though, as in terms of security alone the list of checks
is not too impressive as of today:
      security.FloatLoopCounter  Warn on using a floating point value as a loop counter (CERT: FLP30-C, FLP30-CPP)
      security.insecureAPI.DeprecatedOrUnsafeBufferHandling  Warn on uses of unsecure or deprecated buffer manipulating functions
    + security.insecureAPI.UncheckedReturn  Warn on uses of functions whose return values must be always checked
      security.insecureAPI.bcmp  Warn on uses of the 'bcmp' function
      security.insecureAPI.bcopy  Warn on uses of the 'bcopy' function
      security.insecureAPI.bzero  Warn on uses of the 'bzero' function
      security.insecureAPI.decodeValueOfObjCType  Warn on uses of the '-decodeValueOfObjCType:at:' method
    + security.insecureAPI.getpw  Warn on uses of the 'getpw' function
    + security.insecureAPI.gets  Warn on uses of the 'gets' function
    + security.insecureAPI.mkstemp  Warn when 'mkstemp' is passed fewer than 6 X's in the format string
    + security.insecureAPI.mktemp  Warn on uses of the 'mktemp' function
      security.insecureAPI.rand  Warn on uses of the 'rand', 'random', and related functions
      security.insecureAPI.strcpy  Warn on uses of the 'strcpy' and 'strcat' functions
    + security.insecureAPI.vfork  Warn on uses of the 'vfork' function
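To make one of these checks more concrete, here is a minimal sketch of the kind of fix that a security.insecureAPI.strcpy warning usually leads to. The helper name copy_username and its signature are hypothetical and are not taken from pam_tacplus; the point is only to show a bounded copy that reports truncation instead of overflowing.

```c
#include <stdio.h>

/* Hypothetical helper, not part of pam_tacplus.
 * An unbounded strcpy(dst, src) in its place is what the
 * security.insecureAPI.strcpy checker warns about. This version never
 * writes more than dst_size bytes and tells the caller about truncation. */
int copy_username(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* snprintf always NUL-terminates when dst_size is greater than 0 */
    int n = snprintf(dst, dst_size, "%s", src);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;              /* output error, or the input was truncated */

    return 0;
}
```

A caller would check the return value and reject the input on -1, rather than continue with a silently shortened string.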
Summary: clang-analyzer in terms of security is very basic and limited to flagging
some bad coding practices. But it’s there, it’s free and it’s a work in progress, so why not.
scan-build in some distributions comes along with
clang, and on some it’s
a separate package — check the files in .builds/
to find out which is the case where.
One of the most powerful inventions for C is the AddressSanitizer (ASAN),
available in both clang and gcc. ASAN is a
code instrumentation tool that detects suspicious memory accesses and control flows at run time, and is probably
most similar to the IAST family of scanners in how it operates. ASAN is unsuitable for production use due
to its performance penalty, but every C/C++ programmer should use it during functional
testing runs. The Clang version of ASAN is more feature-rich than the GCC version.
In my project ASAN comes in the .builds/ manifest files for the SourceHut build engine, which allows me to test builds on various flavours of Linux and FreeBSD. An example from .builds/oldlts.yml for Ubuntu 20.04:
    export LDFLAGS=-shared-libasan
    env CC=clang ./configure --enable-asan
    make clean all
The --enable-asan flag triggers a number of compiler sanitizers defined in
configure.ac, each of which
is tested for support in the local compiler. The reason for such a cautious approach is that their
availability varies between compiler versions. The key sanitizer is
but there’s many more
The build produces regular binaries, which are then run as part of the functional testing suite, just as they would be in regular testing. All memory operations are, however, carefully watched by the sanitizers, and if anything suspicious happens the program is terminated with a detailed stack dump.
In my build manifests you can see the following environment
variables, which are only required because I’m testing a
    export ASAN_OPTIONS=abort_on_error=1:fast_unwind_on_malloc=0:detect_leaks=1
    export LD_PRELOAD=$(clang -print-file-name=libclang_rt.asan-x86_64.so)
    ...the whole functional testing suite runs here...
Now, if the tested version has any bugs, and if they are triggered by your test suite, ASAN will abort the execution and dump something like:
    [build@build ~/pam-tacplus]$ tacc
    =================================================================
    ==7329==ERROR: AddressSanitizer: odr-violation (0x0000010e65c0):
      size=4 'tac_encryption' tacc.c:77:5
      size=4 'tac_encryption' libtac/lib/header.c:38:5
    These globals were registered at these points:
      :
        #0 0x106ea0d in __asan_register_globals /usr/src/contrib/llvm-project/compiler-rt/lib/asan/asan_globals.cpp:360:3
        #1 0x10dfdab in asan.module_ctor (/usr/local/bin/tacc+0xbedab)
        #2 0x8010ecc5a (/libexec/ld-elf.so.1+0x8c5a)
        #3 0x8010ea488 (/libexec/ld-elf.so.1+0x6488)
      :
        #0 0x106ea0d in __asan_register_globals /usr/src/contrib/llvm-project/compiler-rt/lib/asan/asan_globals.cpp:360:3
        #1 0x80114119b in asan.module_ctor (/usr/local/lib/libtac.so.3+0x1b19b)
        #2 0x8010ecc5a (/libexec/ld-elf.so.1+0x8c5a)
        #3 0x8010ea488 (/libexec/ld-elf.so.1+0x6488)
    ==7329==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
    SUMMARY: AddressSanitizer: odr-violation: global 'tac_encryption' at tacc.c:77:5
    ==7329==ABORTING
This particular alert means I’ve defined two global variables of the same name
in two distinct places in the code, an actual bug that I introduced during a rewrite.
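One common way to resolve this class of error is to keep a single definition and let every other file see only a declaration. The sketch below squeezes that layout into one listing; the split into a "header part" and a "library part", and the enable_encryption helper, are assumptions made for illustration and do not describe how pam_tacplus actually fixed it.

```c
/* Header part (hypothetical layout): a declaration only.
 * 'extern' says the storage lives in some other translation unit. */
extern int tac_encryption;

/* Library part: the one and only definition of the variable. */
int tac_encryption = 0;

/* Any other file includes the header and uses the variable
 * without defining a second copy of it. */
void enable_encryption(void)
{
    tac_encryption = 1;
}
```

With this layout the client binary and the shared library refer to the same object, so there is nothing left to register twice.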
Another interesting case was a bug in
gnulib where a list iterator crashed when
an empty list was passed to it.
This was nicely captured by ASAN undefined behaviour sanitizer (
-fsanitize=undefined, aka UBSAN),
as well as Coverity (see above):
    libtac/lib/acct_s.c:152:9: runtime error: null pointer passed as argument 2, which is declared to never be null
    /usr/include/string.h:43:28: note: nonnull attribute specified here
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior libtac/lib/acct_s.c:152:9 in
    gl_array_list.c:452:29: runtime error: applying zero offset to null pointer
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior gl_array_list.c:452:29 in
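The first report is typical of code that hands a NULL pointer to a string.h function whose arguments are declared nonnull, which easily happens when an empty list has no backing array. Below is a minimal sketch of the usual guard; copy_elements is a hypothetical helper and is neither the gnulib code nor the actual fix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies 'count' ints from src to dst and returns
 * the number of elements copied.
 * The pointer arguments of memcpy are declared nonnull in the system
 * headers, so UBSAN reports a NULL argument even when the length is 0.
 * An empty list often has a NULL backing array, so that case is handled
 * before memcpy is ever reached. */
size_t copy_elements(int *dst, const int *src, size_t count)
{
    if (count == 0 || dst == NULL || src == NULL)
        return 0;               /* nothing to copy, memcpy is not called */

    memcpy(dst, src, count * sizeof *src);
    return count;
}
```

The early return costs one comparison and makes the empty-list case explicit, which is also what a reader of the code needs to see.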
And then the ASAN memory leak detector (
-fsanitize=leak) does a pretty good job at detecting memory leaks:
    ==14185==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 24 byte(s) in 1 object(s) allocated from:
        #0 0x7f766db2dd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
        #1 0x7f766d8382b0 in xcalloc libtac/lib/xalloc.c:31
        #2 0x7f766d83004e in _tac_add_attrib_pair libtac/lib/attrib.c:74
        #3 0x7f766d8335d3 in tac_author_read_timeout libtac/lib/author_r.c:263
        #4 0x565049269f66 in main /home/build/pam-tacplus/tacc.c:327
        #5 0x7f766c72bbf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
        #6 0x56504926a5b9 in _start (/home/build/pam-tacplus/.libs/tacc+0x55b9)
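A report like this usually means that something was allocated while building a list and no matching release function was called on every exit path. The sketch below shows the pairing in its simplest form; struct attr, attr_add and attr_free are hypothetical names for illustration only and do not reproduce the libtac attribute API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical attribute list, not the libtac data structure. */
struct attr {
    char *value;
    struct attr *next;
};

/* Prepends a copy of 'value' to *head.
 * Returns 0 on success, -1 on failure (the list is left unchanged). */
int attr_add(struct attr **head, const char *value)
{
    if (head == NULL || value == NULL)
        return -1;

    struct attr *node = calloc(1, sizeof *node);
    if (node == NULL)
        return -1;

    size_t len = strlen(value) + 1;
    node->value = malloc(len);
    if (node->value == NULL) {
        free(node);             /* do not leak the node on partial failure */
        return -1;
    }
    memcpy(node->value, value, len);

    node->next = *head;
    *head = node;
    return 0;
}

/* Releases every node and its value; returns the number of nodes freed.
 * Callers run this on every exit path that owns the list. */
size_t attr_free(struct attr *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct attr *next = head->next;
        free(head->value);
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

Run under the leak detector, a test that builds a list and forgets attr_free would produce a report of the same shape as the one above, pointing at the calloc call inside attr_add.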
Summary: AddressSanitizer is a very powerful tool for C/C++ programs that is comparable with IAST solutions for Java in terms of precision and efficiency in finding bugs. Just as with IAST, the efficiency of AddressSanitizer is a function of the coverage of functional testing. In other words, if you don’t run your program compiled with ASAN, it will never kick in. If your functional test suite doesn’t cover particular modules or execution paths, ASAN will never have a chance to test them. Fortunately, ASAN can be used with fuzzers such as afl, which helps achieve very broad test coverage, especially across edge cases.
This cycle will be continued to cover more tools, specifically Semmle (aka CodeQL, aka LGTM) and SonarCloud.
If you’re impatient, just head to the pam_tacplus
repo and see it in the code (mostly
.github/workflows). Note this is a living project,
so I’m trying out different approaches and the code does change frequently.