A GCC-based Compliance Checker For Single-Translation-Unit, Identifier-Related MISRA-C Rules
ICPP Workshops ’20, August 17–20, 2020, Edmonton, AB, Canada Guan-Ren Wang and Peng-Sheng Chen

• Experiment. The modified compiler is evaluated in benchmark tests. The results show that the modified GCC can correctly detect compliance with MISRA-C’s identifier-related, single-translation-unit-labeled rules.

The remainder of this paper is organized as follows. Section 2 describes the MISRA-C rules that are the subject of this study. Section 3 describes in detail the implementation of the GCC modification and evaluation results. Finally, concluding remarks are presented in Section 4.

Table 1: GCC (7.3.0) options for checking MISRA-C rules.

Rule    GCC option
2.6     -Wunused-label
2.7     -Wunused-parameter
3.1     -Wcomment, -Wcomments
4.2     -Wtrigraphs
5.3     -Wshadow
8.1     -Wimplicit
8.4     -Wmissing-declarations
16.4    -Wswitch-default
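As a rough illustration of how some of the options in Table 1 behave, the following sketch contains three constructs that GCC reports when the matching flags are enabled. The file name, the function classify(), and the variable scale are hypothetical and are not taken from the paper's benchmark tests.

```c
/* Hypothetical file table1_demo.c. One way to see the diagnostics:
 *   gcc -c -Wshadow -Wswitch-default -Wmissing-declarations table1_demo.c
 */
#include <assert.h>

int scale = 2;                 /* file-scope identifier */

/* No earlier declaration of classify() is visible, so
 * -Wmissing-declarations reports this definition (rule 8.4). */
int classify(int value)
{
    int scale = 10;            /* -Wshadow: hides the file-scope 'scale' (rule 5.3) */

    assert(value >= 0);        /* precondition assumed for this sketch */

    switch (value) {           /* -Wswitch-default: no default label (rule 16.4) */
    case 0:
        return 0;
    case 1:
        return scale;          /* refers to the inner 'scale', i.e. 10 */
    }
    return -1;
}
```

None of these three flags is enabled by default, so the checks have to be requested explicitly on the command line.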
2 MISRA-C RULES
This section describes the MISRA-C rules relevant to this paper. The identifier-related, single-translation-unit-labeled rules are rules 5.2, 5.3, 5.4, and 5.5. Rule 5.2 says “identifiers declared in the same scope and name space shall be distinct”. The C90 standard treats the first 31 characters of an identifier as significant; the C99 standard extends this to 63. Rule 5.4 is similar to rule 5.2, but it applies to macro identifiers. A long identifier might be harmful to readability and maintainability. Rule 5.5 states “identifiers shall be distinct from macro names”. Figure 1 shows an example in which the declaration of a variable mean is the same as the macro name mean, in violation of rule 5.5. Rule 5.3 is “An identifier declared in an inner scope shall not hide an identifier declared in an outer scope”. The example in Figure 2 has the declarations of variables a and b in the inner scope hiding the variables declared in the outer scope, breaking this rule. Rules 5.3 and 5.5 describe potentially confusing cases, so following them can improve a program’s readability and reduce the risk of misunderstanding.

#define mean(a,b) ((a+b)/2) /* compliant */
int result; /* compliant */

be in prototype form with named parameters”. The function func in Figure 3 does not have a complete prototype form. If there is no parameter, void should be used. The function func1 in Figure 3 has a missing parameter name. Providing the parameter name in the declaration can improve readability and aid further analysis. Mismatches between a declaration and the corresponding definition indicate possible programmer errors.

int func();               /* Non-compliant */
char func1(int);          /* Non-compliant */
void func2(int a, int b); /* Compliant */

char func1(int c)
{
    ...
}

Figure 3: Example of rule 8.2.
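To make the significant-character limit behind rule 5.2 concrete, the sketch below compares two names using only their leading characters. The function names_clash() and the two macros are hypothetical helpers written for this illustration; they are not part of GCC or of the checker described in this paper, and the limits are the ones quoted in the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define C90_SIGNIFICANT_CHARS 31u   /* limit quoted in the text for C90 */
#define C99_SIGNIFICANT_CHARS 63u   /* limit quoted in the text for C99 */

/* Returns 1 if the two names cannot be told apart when only the first
 * `significant` characters count, and 0 otherwise. */
int names_clash(const char *a, const char *b, size_t significant)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, significant) == 0;
}
```

For instance, engine_temperature_sensor_value_front and engine_temperature_sensor_value_rear share their first 31 characters, so names_clash() reports a clash under the C90 limit but not under the C99 limit.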
[Figure 4 diagram; recoverable labels: C++ Parser, Java Parser, Front-end, GENERIC, Middle-end, GIMPLE, optimizations, Back-end, RTL, code generation, target code]
Figure 4: GCC compiler infrastructure.

Figure 6: Message regarding redefined macro.

Tested program:
void test(int) { }

Output message:
10.c: In function ‘test’:
10.c:1:1: error: parameter name omitted
 void test(int) { }
 ^~~~

Figure 7: Tested program and output message for breaking rule 8.2.
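The defect reported in Figure 7 is the one rule 8.2 targets, and the declarations in Figure 3 show the same problem. A minimal sketch of compliant counterparts to the Figure 3 declarations follows: every parameter is named and an empty parameter list is written as void. The function bodies and the input range checked in func1() are hypothetical additions, present only so that the example can be compiled and exercised.

```c
#include <assert.h>

int func(void);               /* empty parameter list written as void */
char func1(int c);            /* parameter named in the declaration as well */
void func2(int a, int b);     /* already compliant in Figure 3 */

/* Hypothetical bodies, added only so the sketch links and can be exercised. */
int func(void)
{
    return 0;
}

char func1(int c)
{
    assert(c >= 0 && c < 26); /* assumed input range for this sketch */
    return (char)('a' + c);
}

void func2(int a, int b)
{
    (void)a;                  /* parameters intentionally unused here */
    (void)b;
}
```

Because each declaration now matches its definition name for name, a mismatch introduced later would be easier to spot during review.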
[Figure 5 diagram; recoverable labels: CODE=VAR_DECL (Decl.minimal.name, Common.type), CODE=IDENTIFIER_NODE (Identifier.str: A), CODE=INTEGER_TYPE (Type.name, Type.value, Type.minval, Type.maxval), CODE=IDENTIFIER_NODE (Identifier.str: int), and the values 0 and 99]
Figure 5: Example GCC tree structure.

There are two important built-in functions that are commonly used when tracing and verifying the GCC front-end.
• debug_tree() Much information is obtainable from tree structures. However, it is not easy to fully understand the tree structure. This function can dump tree information.
• inform() This function can output the program location and a suitable message. It needs two parameters: a location_t and the message to be output. This function outputs diagnostic messages with location information.

3.2 Strategy
There are few documents [9, 14] describing GCC’s internal design, especially the front-end part [8, 11]. To identify the proper program point in GCC to check the rules, we create a test program that makes GCC issue the diagnostic messages of interest here. The

3.3 Implementation
Consider rules 5.2, 5.3, 5.4, and 5.5. First, for rules 5.2 and 5.4, GCC can detect not only whether a duplicate identifier declaration exists, but also a macro re-declaration. However, compliance with the ISO C standard specification is also required. The C90 standard states that the first 31 characters of an identifier name are significant; this is extended to 63 characters in the C99 standard. The GCC preprocessor and C compiler have no limit on the length of identifier names. Even using the options -std= or -ansi, GCC still cannot detect whether identifier re-declaration complies with the length limit of the ISO standard.

For rule 5.5, GCC cannot check variable names conflicting with macro names. Figure 8 shows the calling sequence of checking redefined macros in GCC. The function _cpp_create_definition() is in charge of parsing macros. Checking re-definition is handled by the function warn_of_redefinition(). Figure 9 shows the calling sequence for checking re-declared identifiers in GCC. The function duplicate_decls() called by pushdecl() handles the issue of identifier re-declaration. During compilation, GCC first checks macro re-definition and then checks identifier re-declaration.

In order not to interfere with compilation run by the original GCC, we use an additional data structure, a hash table, to record information for the macros and identifiers. A hash table has an almost constant lookup time, which makes checking duplication efficient. Following ISO C, only the significant characters are recorded. For example, using C90, the first 31 characters of an identifier name are used as a lookup key, and only these characters are stored. We also record a variable’s scope information for checking rule 5.3,