StepUsersGuide 09
Sarah Brown
May 2007
Table of Contents
1 Introduction
1.1 Purpose
1.2 Background
1.3 Intended Audience
1.4 How to Use this Document
2 STEP Methodology
2.1 Evaluation Phases
2.2 STEP Workflow
2.3 Tailoring STEP
2.3.1 STEP Workflow for Small Evaluation Teams
2.3.2 STEP Workflow for Single Product Evaluations
3 Guidance for Successful Evaluations
3.1 Methods Used to Evaluate and Score Products
3.1.1 Establishing Evaluation Criteria
3.1.2 Scoring the Products
3.1.3 Computing Weights
3.1.4 Computing the Overall Score for Each Product
3.2 Communication throughout the Evaluation Process
3.3 Ensuring Evaluation Integrity
3.4 Creating an Evaluation Timeline
4 Phase 1: Scoping and Test Strategy
4.1 Action: Conduct Preliminary Scoping
4.2 Action: Scoping with Government Sponsor
4.3 Action: Perform Market Survey/Tool Selection
4.4 Action: Determine Test Architecture
4.5 Action: Draft High-Level Test Plan
4.6 Check Point – Phase 1
5 Phase 2: Test Preparation
5.1 Action: Establish Evaluation Criteria, Priorities, and Test Procedures
5.2 Action: Perform Government Requirements’ Mapping
5.3 Action: Enhance and Finalize Test Plan
5.4 Action: Acquire Necessary Hardware and Software
5.5 Action: Hold Technical Exchange Meeting (TEM) (optional)
5.6 Check Point – Phase 2
6 Phase 3: Testing, Results, and Final Report
6.1 Action: Conduct Testing and Compile Results
6.2 Action: Perform Crosswalk
6.3 Action: Share Results with Vendors
6.4 Action: Deliver Final Report
6.5 Check Point – Phase 3
7 Acknowledgments
8 References
Appendix A Acronym and Definition List
Appendix B STEP Templates
1 Introduction
1.1 Purpose
MITRE conducts numerous technology evaluations for its sponsors each year, spanning a wide
range of products and technologies. In order to keep pace with rapidly changing technology and
sponsor needs, MITRE evaluation teams require a well-defined evaluation process that is
efficient, repeatable, and as objective as possible.
The benefits of following a standardized, effective process include:
• Consistency and improved traceability through fixed steps and deliverables
• Improved efficiency leading to less effort required per evaluation
• Defensible, repeatable results
• Better communication within and among evaluation teams
• Evaluations that can be compared and shared more easily across the sponsor base
• An opportunity to develop guidance and document lessons-learned for future evaluations
The Standard Technical Evaluation Process (STEP) developed in G024 outlines a rigorous
process for technology evaluations of one or more COTS products.¹ It applies to a variety of
technology areas and provides substantial benefits for evaluation teams and their government
sponsors.
STEP aims to provide:
• A process that can be used in a broad range of technology evaluations
• Standard deliverables to achieve consistency, traceability, and defensibility of the
evaluation results
• Guidelines to assist teams in developing goals, documenting findings, and addressing
challenges
• A process that is recognized as comprehensive and fair
This document presents STEP and offers a guide to evaluation teams who wish to use it. From
preliminary scoping to eventual integration and deployment, STEP guides teams in producing
high quality reports, thorough evaluations, and defensible results.
¹ Technology evaluation is used in this document to refer to evaluations of multiple products providing the same capability. Product evaluation is used to refer to an evaluation of a single product.
1.2 Background
In 2004, the MITRE Intelligence Community Test and Integration Center in G024 began
developing STEP in an effort to track past evaluation work and ensure quality, objectivity, and
consistency in future evaluations. Since that time, STEP has been used successfully in G024 as
well as in G025, G027, and G151 evaluation tasks.
The four phases of STEP follow a common framework for conducting a technology evaluation. In
developing and refining STEP, the team consulted a variety of resources and subject matter
experts within and outside of MITRE (see the references in Section 8) to gain a broader
understanding of evaluation theory and practice. The STEP workflow and methodology
incorporate many of these practices and recommendations.
1.3 Intended Audience
This document is intended for MITRE project leads and engineers conducting technology
evaluations of one or more products and is suitable for experienced as well as first-time
evaluators. Although STEP was designed originally for several G024 security tool evaluations, the
process and methodology are applicable to any software or information technology evaluation.
Because evaluations may vary significantly in size and scope, STEP presents options for
evaluation teams that would like to work in parallel for improved efficiency, as well as for smaller
teams that wish to work together through each stage. Together, the STEP workflow and
methodology provide a comprehensive resource for teams wishing to standardize their evaluations
and structure their daily activities.
1.4 How to Use this Document
Section 2 presents the STEP methodology, evaluation phases, and workflow. Section 3 provides
guidance on four major challenges in technology evaluations: using an established scoring method,
communicating with the sponsor, ensuring integrity and defensibility, and forming a realistic
evaluation timeline.
The remainder of the document provides specific information for executing each STEP action.
The presentation in this document is based on the CEM Project Leader Handbook [8]. There is a
chapter for each of the three main STEP phases, and the chapters are designed so that the reader
can quickly locate information about a specific action. Each chapter contains:
• An overview of the phase
• A section for each action within the phase
• For each action:
o Description: A description of the action and specific work to complete
o Lessons-learned: Guidance for successfully completing the action
o Templates and Sample Deliverables: A list of templates and deliverables from
past evaluations to assist teams in documenting their work
The final STEP phase, Phase 4: Integration and Deployment, is outside the scope of this document
and is not addressed in detail. Phase 4 applies if an evaluation results in a purchase decision by the
sponsor. In this case, the sponsor determines the specific actions required.
2 STEP Methodology
2.1 Evaluation Phases
The STEP process organizes evaluations into three main phases: (1) Scoping and Test
Strategy, (2) Test Preparation, and (3) Testing, Results, and Final Report, plus a fourth, optional
phase, (4) Integration and Deployment,³ that is determined by the sponsor on a case-by-case basis
(Figure 1). Each STEP phase has different objectives, actions, and associated document deliverables.
Checkpoints, or control gates, separate the phases, and each phase must be completed before the
next one is begun. These control gates help to ensure evaluation integrity. For instance, teams
must establish their evaluation criteria and test strategy (Phase 2) before installing or testing the
evaluation products (Phase 3). It is critical that the team solidify their evaluation criteria before
starting hands-on product testing. This avoids the potential for introducing bias into the evaluation
criteria based on prior knowledge of a given product’s features or design.
The four phases are summarized below.
1. Scoping and Test Strategy. The evaluation team gains an understanding of the mission
objectives and technology space, and settles on key requirements through scoping with the
government sponsor. The team produces a project summary to help clarify the objectives
and scope, and performs a market survey to identify potential products in the technology
area. The evaluation team works with the sponsor to select a list of products for further
evaluation based on the market survey results, evaluation timeline, and resources available.
To prepare for testing, the team produces a project summary and high-level test plan.
2. Test Preparation. After selecting the products to evaluate and obtaining concurrence
from the sponsor, the evaluation team works to acquire the evaluation products from
the vendors, and any additional infrastructure that is required for testing. This
includes signing non-disclosure agreements (NDAs), establishing vendor points of
contact, and meeting with the vendor to discuss the test plan. At the same time, the
team develops a full set of evaluation criteria that the products will be tested against
and any scenario tests² that will be performed. The evaluation team then installs the
products in the test environment, and engages the vendor as technical questions arise.
The team may wish to hold a technical exchange meeting (TEM) to gain further
insight and background from subject matter experts.
3. Testing, Results, and Final Report. In this phase, the evaluation team tests and
scores the products against all of the test criteria. The team must ensure that testing
for each product is performed under identical conditions, and must complete a full
crosswalk of the scores for each product requirement after testing to ensure scoring
consistency. Following the crosswalk, evaluation team members conduct individual
meetings with each vendor to review their findings, correct any misunderstandings
about their product’s functionality, and retest if necessary. The team produces a final
report that incorporates the evaluation results and any supporting information.
² In a scenario test, product performance is determined in a situation that models a real-world application. The evaluation team must ensure that each product tested receives the same data and is in the same environment. Test results will be repeatable only to the extent that the modeled scenario and data can be reproduced.
³ Phase 4 is outside the scope of this document. It is not addressed in later chapters.
2.2 STEP Workflow
Figure 2 presents the full STEP workflow. STEP consists of four phases separated by
checkpoints. Within each phase, most actions can be completed in parallel so that teams can
maximize their efficiency. The highlighted actions result in major document deliverables for the
sponsor. Appendix B of this guide contains templates for completing each STEP action.
Figure 2: Full STEP Workflow
2.3 Tailoring STEP
2.3.1 STEP Workflow for Small Evaluation Teams
For small evaluation teams that wish to perform the STEP actions in a linear order, Table 1
presents a recommended workflow.
Table 1: Recommended Linear STEP Workflow
STEP Phase | Section | Action
Phase 1 - Scoping and Test Strategy | § 4.1 | Conduct Preliminary Scoping
 | § 4.2 | Scoping with Government Sponsor
 | § 4.3 | Perform Market Survey/Tool Selection
 | § 4.4 | Determine Test Architecture
 | § 4.5 | Draft High-Level Test Plan
 | § 4.6 | Check Point – Phase 1
Phase 2 - Test Preparation | § 5.1 | Establish Evaluation Criteria, Priorities & Test Procedures
 | § 5.2 | Perform Government Requirements’ Mapping
 | § 5.3 | Enhance and Finalize Test Plan
 | § 5.4 | Acquire Necessary Hardware and Software
 | § 5.5 | Hold Technical Exchange Meeting (TEM) (optional)
 | § 5.6 | Check Point – Phase 2
Phase 3 - Testing, Results, and Final Report | § 6.1 | Conduct Testing and Compile Results
 | § 6.2 | Perform Crosswalk
 | § 6.3 | Share Results with Vendors
 | § 6.4 | Deliver Final Report
 | § 6.5 | Check Point – Phase 3
Phase 4 - Integration and Deployment | none | Determined by sponsor
2.3.2 STEP Workflow for Single Product Evaluations
While the full STEP workflow is designed for technology evaluations (evaluations involving
multiple products), it can be modified for teams performing a single product evaluation. In this
situation, Figure 3 provides a tailored workflow.
3 Guidance for Successful Evaluations
In developing STEP, project leads identified several key challenges in conducting technology
evaluations. The following subsections address the four challenges identified by MITRE
evaluation teams that are critical to ensuring an evaluation’s success:
• Methods used to evaluate and score products,
• Communication during the evaluation process,
• Ensuring evaluation integrity, and
• Creating an evaluation timeline.
These challenges are echoed and addressed in the literature on decision making. As stated in an
article [6] on methods and best practices in evaluating alternatives:
“There are many potential mistakes that can lead one awry in a task…Some concern
understanding the task. Others concern structuring the decision problem to be addressed. Still
others occur in determining the judgments necessary to specify the [scores]… These mistakes
frequently cascade… When this occurs, the [scores] provide little or no insight, contribute to a
poor decision, and result in frustration with the decision process.”
3.1 Methods Used to Evaluate and Score Products
To evaluate and score products consistently, evaluation teams follow four steps:
1. Establish the evaluation criteria.
2. Determine how the products will be scored against the criteria.
3. Compute weights that express the relative importance of each criterion.
4. Compute the overall score for each product.
The evaluation criteria, their weights, and the product scores are captured in a spreadsheet such as
the one shown in Table 2.
Table 2: Spreadsheet for capturing evaluation criteria, weights, and scores

# | Evaluation Criteria | Description of How to Test the Criteria | Weight | P1 | P2 | P3 | P4 | P5
(columns P1-P5 hold the <product scores> for each product evaluated)
1.0 | Category 1 Title | | |
1.1 | Criteria A | -description- | |
1.2 | Criteria B | -description- | |
1.3 | Criteria C | -description- | |
1.4 | Criteria D | -description- | |
The following subsections provide guidance for accomplishing steps 1-4 above. This guidance
comes from multi-attribute utility (MAU) analysis, a method within the mathematical field of
decision analysis. Decision analysis provides a mathematical framework for decision making, so
that decision makers can rigorously and consistently express their preferences in a way that allows
their results to be readily and logically explained.
Multi-attribute utility (MAU) analysis [1, 2, 3, 4, 5, 6, 7, 10, and 14] is a well-established decision
analysis method that specifically addresses how to select one alternative from a set of alternatives,
which is akin to selecting a particular product from a set of products in a given technology area.
MAU analysis follows steps 1-4 above to compute the overall score, or utility, of each alternative
under consideration. By following the rules and principles of MAU analysis, evaluation teams can
perform straightforward, rigorous, and consistent decision making. Furthermore, teams can back
up the integrity of their results through an established scoring method that is recognized as
comprehensive and fair.
3.1.1 Establishing Evaluation Criteria
In preparing for the evaluation testing, the first step is to establish the evaluation criteria. This is a
key step, because at the end of the evaluation, the results will be a reflection of how well the team
created their evaluation criteria. In order to generate these criteria, the team should conduct
independent research and request guidance on all aspects and objectives of the problem from the
government sponsor and subject matter experts. Through this research, the team will ensure that
the sponsor’s primary needs and wants are addressed, as well as critical functional (e.g., security)
capabilities and nonfunctional (e.g., policy, vendor support) issues.
Evaluation criteria should be specific, Boolean (two-valued) types of questions that are clearly
stated and can be clearly tested. The following tips are provided for writing individual criteria
statements. First, use the “who shall what” standard form, shown in Figure 4, to prevent
misunderstanding:
Figure 4: Standard form for writing the evaluation criteria
In writing these statements, avoid the following pitfalls listed in [13]:
• Ambiguity – write as clearly as possible so as to provide a single meaning
• Multiple criteria – criteria that contain conjunctions (and, or, with, also) can often be split
into independent criteria
• Mixing evaluation areas – do not mix design, system, user, and vendor support criteria in
the same evaluation category.
• Wishful thinking – “Totally safe”, “Runs on all platforms”.
• Vague terms – “User friendly”, speculative words such as “generally”, “usually”
In addition to the evaluation criteria statements, provide a description of how each criterion will be
tested. Following these tips will help ensure that each evaluation criterion is carefully written,
independent, and clearly states what is tested, how it is tested, and the desired outcome.
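For teams that keep their criteria in a structured form, the sketch below shows one spreadsheet row (Table 2) as a simple record. This is only an illustration, not part of STEP; the field names and the example criterion are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Criterion:
    # One row of the evaluation criteria spreadsheet (see Table 2).
    number: str        # e.g., "1.1"
    statement: str     # written in the "who shall what" standard form
    how_to_test: str   # description of how the criterion will be tested
    weight: float = 0.0                          # assigned later (Section 3.1.3)
    scores: dict = field(default_factory=dict)   # product name -> normalized score (Section 3.1.2)

# Hypothetical example entry:
example = Criterion(
    number="1.1",
    statement="The product shall log all failed login attempts.",
    how_to_test="Attempt three invalid logins and confirm each appears in the audit log.",
)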
3.1.2 Scoring the Products
The next step is to determine how products will be scored against the evaluation criteria. For
example, teams could use the following function ui:
• ui(ai) = 0 if a product does not meet evaluation criterion ai
• ui(ai) = 1 if a product partially meets evaluation criterion ai
• ui(ai) = 2 if a product fully meets evaluation criterion ai
This function is a constructed scale because each point is explicitly defined. Constructed scales are
often useful because they allow both quantitative and qualitative criteria to be measured. Teams
may prefer to assign scores based on a standard unit of measure (e.g., time, dollars), a complex
function, or another function type.
By convention, in MAU analysis, any scoring function should be normalized so that the scores
fall in the range from 0 to 1. Normalizing the above constructed scale gives:
• ui(ai) = 0 if a product does not meet evaluation criterion ai
• ui(ai) = .5 if a product partially meets evaluation criterion ai
• ui(ai) = 1 if a product fully meets evaluation criterion ai
Therefore, in the above example, a product that fully meets a criterion during testing will receive a
score of 1, a product that partially meets a criterion will receive a score of .5, and a product that
does not meet a criterion will receive a 0 for that item. These are not the only possible scale
values: this example uses a discrete set of three values, but a larger discrete set or a continuous
range between 0 and 1 could also be used.
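A minimal sketch of this normalized scale as a lookup table (assuming the three-level scale above; the function name u and the finding labels are illustrative):

# Normalized constructed scale from Section 3.1.2:
# 0 = does not meet, 0.5 = partially meets, 1 = fully meets the criterion.
SCALE = {"does not meet": 0.0, "partially meets": 0.5, "fully meets": 1.0}

def u(finding: str) -> float:
    # Normalized score u_i(a_i) for one criterion finding.
    return SCALE[finding]

print(u("partially meets"))   # prints 0.5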
3.1.3 Computing Weights
The third step is to assign weights wi to each criterion. These weights serve as scaling factors to
specify the relative importance of each criterion. Because they are scaling factors that specify
relative importance in the overall set of criteria, they should be nonnegative numbers that sum to
1.
There is no “best” method for choosing weights. The choice depends on the principles and axioms
that the decision maker wishes to follow, level of detail desired for the weights, and the computing
resources available for calculating the weights.
A variety of methods have been proposed for eliciting weights [1, 2, 3, 4, 10, and 14]. These
methods include:
• Weighted Ranking
• Analytic Hierarchy Process (AHP)
• Trade-off method (also called Pricing Out)
• Paired Comparison (also called Balance Beam method)
• Reference Comparison
These methods are compared in Figure 5 below and the Paired Comparison and Reference
Comparison methods are recommended for use by MITRE evaluation teams.
The first three methods, weighted ranking, AHP, and the trade-off method, are not recommended
in this guide for the following reasons. Both weighted ranking [2, 9] and AHP [5, 10] are popular
methods, but they can be manipulated in ways that result in certain basic logical flaws, and as a
result, are often rejected by decision analysts as acceptable methods for computing weights [2, 4,
11, 14]. The Trade-Off method [2, 3, 6] is also a well-accepted method, but is not recommended
because of the computational resources required to derive weights for more than 10 alternatives.
Several commercial decision software packages are available that implement this method.
The Paired Comparison and Reference Comparison methods [3, 9, and 14] are recommended in this guide
for use by evaluation teams because they are widely accepted and practical to perform by hand.
The Paired Comparison is a good choice when deriving weights for 10-100 alternatives.
Alternatively, the Reference Comparison method is a good choice when deriving weights for
100+ evaluation criteria. It requires fewer computations than Paired Comparison; however, it
provides less granular weights.
Figure 5: Comparison of weight elicitation methods (Weighted Ranking and AHP are marked as able to exhibit logical flaws)
Paired Comparison:
This method is a good choice for deriving weights for 10-100 alternatives and is best explained
with an example. Given a set of evaluation categories or a small set of evaluation criteria,
determine a basic ordering from highest importance to least importance. Throughout these weight
assessment methods, the basic ordering and relative importance are decided by the team and are
therefore subjective.
Example:
Most important = A
B
C
D
E
F
Least important = G
For example, in an evaluation of a security product, security is the most important category,
followed by auditing, administration/management, and then vendor resources.
Starting with the alternative of highest importance, express its importance with the alternatives of
lower importance in terms of a <, =, or > relationship. There is no fixed rule for forming this
expression; it is determined by the evaluation team. Obtain an equality (=) relationship whenever
possible to make it easier to solve the equations at the end. Repeat this with the alternative of next
highest importance, until each alternative is expressed in terms of lower-order alternatives, as
shown:
Next, assign the lowest-order alternative (in this case, G) a value of 1. Then back solve the system
of equations to determine values for the set of alternatives. The result in this example is:
A = 17.5
B = 11.5
C > 5.5 and C < 6.5
D = 4.5
E = 4.5
F = 2
G = 1
Since the value for C is not exact, it can be approximated and assigned a weight of 6.
The sum of these weights is 47, so to normalize the values, divide each one by 47. The resulting
numbers sum to 1 and give the weights. From A to G they are: 0.372, 0.245, 0.128, 0.096, 0.096,
0.043, and 0.020.
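The normalization step can be checked quickly in code; the sketch below uses the values from the worked example (variable names are illustrative):

# Back-solved values for alternatives A-G from the example above,
# with C approximated as 6.
values = {"A": 17.5, "B": 11.5, "C": 6.0, "D": 4.5, "E": 4.5, "F": 2.0, "G": 1.0}

total = sum(values.values())                       # 47
weights = {name: v / total for name, v in values.items()}

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")                 # A: 0.372, B: 0.245, C: 0.128, ...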
The paired comparison method can be used to find weights for the individual evaluation criteria
and/or for the evaluation categories themselves. The table below shows the weights corresponding
to individual evaluation criteria.
Table 3: Paired Comparison Weights shown on Evaluation Criteria Template

# | Evaluation Criteria | Description of How to Test the Criteria | Weight | P1 | P2 | P3 | P4 | P5
(columns P1-P5 hold the scores for each <product name>)
1.0 | Category 1 | | |
1.1 | Criteria A | -description- | 0.372 | 0 | 0 | 0 | 0 | 0
1.2 | Criteria B | -description- | 0.245 | 0 | 0 | 0 | 0 | 0
1.3 | Criteria C | -description- | 0.128 | 0 | 0 | 0 | 0 | 0
1.4 | Criteria D | -description- | 0.096 | 0 | 0 | 0 | 0 | 0
Reference Comparison:
The Reference Comparison method is an alternative to the Paired Comparison and is a good
alternative when calculating weights for 100+ criteria. Given a set of evaluation criteria, choose
the evaluation criterion that is most important or significant in the set. Assign this criterion a value
of 3. Using this as a reference, rank the remaining criteria as follows:⁴
• 3 = the criterion is as important as the “reference criterion”
• 2 = the criterion is slightly less important than the “reference criterion”
• 1 = the criterion is much less important than the “reference criterion”
Then, normalize these values so that they sum to 1.
For example, suppose values are assigned as follows:
A = 3
B = 3
C = 2
D = 2
E = 3
F = 1
G = 2

⁴ It is not necessary to use the range from 1 to 3. The range can be less constrained or more constrained as needed.
The sum of these weights is 16, so to normalize the values, divide each one by 16. The resulting
numbers sum to 1 and give the weights. From A to G they are: 0.1875, 0.1875, 0.125, 0.125,
0.1875, 0.0625, and 0.125.
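The same normalization applies here; a brief sketch using the example ratings (variable names are illustrative):

# Ratings from the example above (1-3 relative to the reference criterion),
# normalized so that the resulting weights sum to 1.
ratings = {"A": 3, "B": 3, "C": 2, "D": 2, "E": 3, "F": 1, "G": 2}

total = sum(ratings.values())                      # 16
weights = {name: r / total for name, r in ratings.items()}

print(weights["A"], weights["F"])                  # prints 0.1875 0.0625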
The reference comparison method can be used to elicit weights for the individual evaluation
criteria and/or for the evaluation categories themselves. The table below shows the weights
corresponding to individual evaluation criteria.
Table 4: Reference Comparison Weights on Evaluation Criteria Template

# | Evaluation Criteria | Description of How to Test the Criteria | Weight | P1 | P2 | P3 | P4 | P5
(columns P1-P5 hold the scores for each <product name>)
1.0 | Category 1 | | |
1.1 | Criteria A | -description- | 0.1875 | 0 | 0 | 0 | 0 | 0
1.2 | Criteria B | -description- | 0.1875 | 0 | 0 | 0 | 0 | 0
1.3 | Criteria C | -description- | 0.125 | 0 | 0 | 0 | 0 | 0
1.4 | Criteria D | -description- | 0.125 | 0 | 0 | 0 | 0 | 0
3.1.4 Computing the Overall Score for Each Product
Once each product has been scored against every criterion and the weights have been established,
the overall score (or utility) of a product is computed with the additive utility function:

U = w1·u1(a1) + w2·u2(a2) + … + wn·un(an)

where:
• u1(a1) … un(an) are the normalized scores the product received on evaluation criteria a1 … an,
determined by the scoring function described in Section 3.1.2, and
• w1 … wn are the individual weights assigned to each criterion by a weight assessment method.
The process of eliciting weights was described in Section 3.1.3.
In summary, MAU analysis provides evaluation teams with a consistent, fairly rigorous approach
for scoring products in a technology evaluation. Teams must establish the evaluation criteria,
determine a scheme for scoring products, and weight the relative importance of each evaluation
criterion and category. The results represent the collective effort of the evaluation team and are
therefore likely to have some inter-subjective consistency. After each product has been evaluated
and scored, the additive utility function gives the overall score (or utility) for each product and an
overall product ranking.
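Putting the pieces together, the sketch below computes an overall score with the additive utility function. The weights reuse the Paired Comparison example values; the criterion numbers and the product's scores are hypothetical.

# Additive utility: U = w1*u1(a1) + w2*u2(a2) + ... + wn*un(an),
# with weights that sum to 1 and scores in the range [0, 1].
weights = {"1.1": 0.372, "1.2": 0.245, "1.3": 0.128, "1.4": 0.096,
           "1.5": 0.096, "1.6": 0.043, "1.7": 0.020}

# Hypothetical normalized scores for one product (0, 0.5, or 1 per criterion).
scores = {"1.1": 1.0, "1.2": 0.5, "1.3": 1.0, "1.4": 0.0,
          "1.5": 0.5, "1.6": 1.0, "1.7": 1.0}

overall = sum(weights[c] * scores[c] for c in weights)
print(f"Overall product score: {overall:.3f}")     # a value between 0 and 1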
3.2 Communication throughout the Evaluation Process
A successful evaluation requires effective communication between the evaluation team and the
sponsor, stakeholders, subject matter experts, and vendors throughout the evaluation process. The
team must understand what the problem is and what the solution is intended to accomplish.
During each phase, evaluation teams should conduct status updates with the sponsor and
stakeholders and/or subject matter experts, either in writing or as a briefing, to discuss and solicit
feedback on the following items:
• Evaluation goals and objectives
• Initial product assessments
• Additional products or currently deployed solutions within the sponsor’s environment
worth considering
• Considerations/requirements for the sponsor’s environment
• Evaluation criteria and the test plan
To facilitate consistent, well-presented work that is recorded for later reference, Appendix B
provides STEP briefing and document deliverable templates for each phase of the evaluation. In
addition to supporting good communication throughout the evaluation, the STEP templates also
assist the team in drafting the final report.
3.3 Ensuring Evaluation Integrity
It is critical that MITRE teams perform evaluations that are recognized as comprehensive and fair.
A fundamental requirement for achieving evaluation integrity is consistent documentation of test
data and methodology for review by the sponsor, stakeholders, and vendors if questions arise. The
STEP actions and tips (Chapters 4-6) provide guidance for ensuring evaluation integrity. These
guidelines include:
• Verifying all product information for a Market Survey/Tool Selection with product
vendors, and requesting written explanations (by email) as needed
• Following the rules and principles for establishing evaluation criteria, scoring products,
and weighting criteria, as explained in Section 3.1
• Finalizing evaluation criteria, including associated weights, test procedures, and expected
outcomes/guidelines for scoring, before testing begins.
• Highlighting product strengths and weaknesses as they are indicated in the overall
evaluation scores. That is, the evaluation team must be careful not to call out product
strengths and weaknesses arbitrarily in the final report without quantitative results and/or
justification to back up the claims.
• Documenting the evaluation using STEP templates for consistency
3.4 Creating an Evaluation Timeline
Scheduling is an important part of the evaluation process in order to establish realistic timelines
and expectations. The STEP workflow allows teams to identify the individual actions and estimate
the time required to complete each one. Teams may wish to break larger actions into smaller
segments to ensure that all of the evaluation work is well defined [13]. Teams must also work
with their sponsor to determine the appropriate number of products to be tested with the time and
resources available. Successful planning and realistic timelines throughout the project help the
team manage the work required for the evaluation.
4 Phase 1: Scoping and Test Strategy
During this phase, the evaluation team gains an understanding of the mission objectives and
technology space, and settles on key requirements through scoping with the government sponsor.
The team produces a project summary to help clarify the objectives and scope, and performs a
market survey to identify potential products in the technology area. The evaluation team works
with the sponsor to select a list of products for further evaluation based on the market survey
results, evaluation timeline, and resources available. To prepare for testing, the team produces a
project summary and high-level test plan.
4.1 Action: Conduct Preliminary Scoping
Templates and Sample Deliverables (samples available upon request): Preliminary Scoping template
4.2 Action: Scoping with Government Sponsor
Templates and Sample Deliverables (samples available upon request): Scoping with Sponsor template; Sample: Network IDS/IPS System Scoping Questions document
4.3 Action: Perform Market Survey/Tool Selection
Request to speak with a technical engineer as well as a sales representative to discuss products’ features or capabilities.
4.5 Action: Draft High-Level Test Plan
Address key questions about the test environment and test execution, for
example,
• Will the evaluation team need to connect to or isolate
data from the MITRE Information Infrastructure (MII)?
• Will the team perform any product tests in parallel?
Complete this write-up using the High-Level Test Plan
template.
Provide a list of purchase requests to the evaluation team lead if
supporting equipment (hardware/software) is required for
testing.
Lessons Learned: Write the High-Level Test Plan with as much detail as possible,
and as if it is a deliverable for the sponsor or an outside reader.
Provide background, introduce the products to be evaluated, and
present the testing and scoring methodology. This document will
continue to grow and can hopefully become Chapter 2: Test
Strategy in the Final Report.
Templates and Sample Deliverables (samples available upon request): High-Level Test Plan template; Sample: Forensics High-Level Test Plan
4.6 Check Point – Phase 1
Prepare a brief for the sponsor that covers the following items (using the Phase 1 Brief
template):
• Purpose of Task
• Base Assumptions/Key Requirements
• Tool Selection
• Product Highlights
• Product Drawbacks
• Background on the evaluation test environment
• High-level test architecture/plan
• Tool Selection Ranking
• Next Steps
Templates and Sample Deliverables (samples available upon request): Phase 1 Brief template
5 Phase 2: Test Preparation
After selecting the products to evaluate and obtaining concurrence from the sponsor, the
evaluation team works to acquire the evaluation products from the vendors, and any additional
infrastructure that is required for testing. This includes signing non-disclosure agreements
(NDAs), establishing vendor points of contact, and meeting with the vendor to discuss the test
plan. At the same time, the team develops a full set of evaluation criteria that the products will be
tested against and any scenario tests that will be performed. The evaluation team then installs the
products in the test environment, and engages the vendor as technical questions arise. The team
may wish to hold a technical exchange meeting (TEM) to gain further insight and background
from subject matter experts.
5.1 Action: Establish Evaluation Criteria, Priorities, and Test Procedures
Write each evaluation criterion so that it states what is being tested, how
it is being tested, and the expected outcome. The wording of the
evaluation criteria must be precise so that:
• Each member of the evaluation team understands and
has the same interpretation of the criteria and test
procedures.
• If an evaluation criterion is vague, an outsider (vendor,
stakeholder) may misinterpret it in the final report and
challenge the associated product scores.
It is critical that the evaluation team be able to defend their tests
and results with documented statements and procedures from
Phase 2.
Establish weights for the individual evaluation criteria and the
evaluation categories. For a thorough explanation of
established/approved scoring techniques, see Section 3. The
Reference Comparison method is recommended for eliciting
evaluation criteria weights. The Paired Comparison method is
recommended for eliciting evaluation category weights.
Consider dividing the testing into two test phases: Evaluation
Criteria Testing (Phase 1) and Scenario Testing (Phase 2) to
distinguish between evaluation criteria, which are usually single
steps, and scenarios, which cover a number of steps.
Templates and Sample Deliverables (samples available upon request): Evaluation Criteria Spreadsheet template; Sample: Forensics Evaluation Criteria; Sample: Forensics Evaluation Scenario and Scenario Scoresheet
5.2 Action: Perform Government Requirements’ Mapping
Lessons Learned: Completing this mapping may not be straightforward, as certain
criteria may not map directly to government requirements.
Therefore, assign more than one team member to perform this
action to ensure a consensus has been reached in the final results.
Templates and Sample Deliverables: None
5.3 Action: Enhance and Finalize Test Plan
Templates and Sample Deliverables (samples available upon request): High-Level Test Plan template; Sample: Forensics Finalized Test Plan
5.4 Action: Acquire Necessary Hardware and Software
Request all equipment/license keys for the duration of the evaluation, so that the evaluation
team can repeat or verify tests until the final report is delivered.
If time affords, resist vendor offers to set up the evaluation equipment for testing. Installation
and configuration should be included as part of the evaluation, so it is important that the
evaluation team set up the equipment in the lab on their own. Products should also be
configured in accordance with the sponsor’s environment. During a short-term evaluation,
however, it may be better for the vendor to set up the equipment quickly.
Templates and Sample Deliverables: None
5.5 Action: Hold Technical Exchange Meeting (TEM) (optional)
Hold a TEM to gain further insight and background from a subject matter expert’s perspective.
Templates and Sample Deliverables: None
5.6 Check Point – Phase 2
Provide the sponsor and stakeholders with the evaluation criteria 1-2 weeks before they are finalized.
6 Phase 3: Testing, Results, and Final Report
In this phase, the evaluation team tests and scores the products against all of the test criteria. The
team must ensure that testing for each product is performed under identical conditions, and must
complete a full crosswalk of the scores for each product requirement after testing to ensure scoring
consistency. Following the crosswalk, evaluation team members conduct individual meetings with
each vendor to review their findings, correct any misunderstandings about their product’s
functionality, and retest if necessary. The team produces a final report that incorporates the
evaluation results and any supporting information.
6.1 Action: Conduct Testing and Compile Results
Templates and Sample Deliverables: None
6.3 Action: Share Results with Vendors
Discuss potential product improvements and/or changes from the perspective of the sponsor’s environment.
Lessons Learned: The sponsor is the owner of the evaluation itself, and as a result, MITRE is
obligated to protect the sponsor’s specific objectives, requirements, and intentions throughout
the evaluation. In addition, the NDAs signed with each of the vendors prevent the evaluation
team from sharing results with their competitors.
For these reasons, when reviewing evaluation results with
vendors, do not:
• Discuss any other product’s performance in the
evaluation.
• Reveal the weights of individual evaluation criteria
and/or category weights
• Provide vendors with copies of their product’s results
(unless otherwise directed by the sponsor)
During the vendor briefing, the evaluation team should:
• Ensure both members of the 2-person evaluation team
that evaluated the product are present
• Review major strengths and weaknesses found in the
product
• Discuss overall impressions
• Discuss any lingering problems encountered during
testing
• Allow the vendor to correct any misunderstandings
Templates and Sample Deliverables: None
6.4 Action: Deliver Final Report
Activities: The final report should include the following sections:
• Executive Summary
• Table of Contents
• Introduction
o Background
o Purpose
o Organization of Document
• Test Preparation
• Findings – the following sections for each product:
o Strengths
o Weaknesses
• Recommendations and Conclusions
• References
• Appendices
o Test Results
o Evaluation Criteria
o Glossary and Acronyms
o Test Data (if applicable)
Lessons Learned: Create charts and graphs to capture the overall evaluation results. Capture
scores and performance for each product so the reader can visualize the results of the
evaluation. Below is a sample chart from an Intrusion Detection Evaluation:
Dedicate a section to each product evaluated. Call out its strengths and
weaknesses and ensure that these are the same strengths and weaknesses
reflected in the numerical evaluation results. That is, identify the categories
in which the product scored highest (or stood out above other products).
Identifying a product’s strengths/weaknesses based only on the team’s recollection of the
evaluation is not reliable.
Templates and Sample Deliverables: Samples available upon request
6.5 Check Point – Phase 3
External Component (optional): Prepare final status brief for
sponsor and key stakeholders to review the evaluation objectives
and goals, and present preliminary findings/recommendations.
Lessons Learned: Ensure that all vendor concerns are discussed and an agreement
is reached before any controversial statements are written in the
final report.
Templates and Sample Deliverables: None
7 Acknowledgments
This work was completed with funding from the MITRE Systems Engineering Process Office
(SEPO), the Office of the Director of National Intelligence (ODNI) Chief Information Officer
(CIO), and the G020 - Information Security division. The following people provided much
guidance and assistance during this project: Chris Do, Dale Johnson, Robin Medlock, Michael
O’Connor, Bill Neugent, Greg Stephens, Jake Ulvila, John Vasak, Lora Voas, and Brian White.
8 References
URLs are valid as of the date of this document.
[1] J. Butler, D. J. Morrice, and P.W. Mullarkey. ‘A Multiple Attribute Utility Theory
Approach to Ranking and Selection’, Management Science, 47/6:800-816. (2001).
[2] T. Edmunds. ‘Multiattribute Utility Analysis (MAU) to Support Decisions’, (presentation),
Systems and Decision Sciences Technology, Lawrence Livermore National Laboratory.
(2001).
[3] W. Edwards. ‘SMART and SMARTER: Improved Simple Methods for Multiattribute
Utility Measurement’, Organizational Behavior and Human Decision Processes, 60:306-
325. (1994)
[4] E. H. Forman and S. I. Gass. ‘The Analytic Hierarchy Process—An Exposition’,
Operations Research, INFORMS, 49/4:469-486. (2001).
[5] R. Haas and O. Meixner. ‘An Illustrated Guide to the Analytic Hierarchy Process’, Institute
of Marketing and Innovation, University of Natural Resources and Applied Life Sciences,
Vienna. <http://www.boku.ac.at/mi/ahp/ahptutorial.pdf> (2007).
[6] R. L. Keeney. ‘Common Mistakes in Making Value Trade-Offs’, Operations Research,
INFORMS, 50/6:935-945. (2002).
[7] Z. F. Lansdowne and B. W. Lamar. ‘An On-Line Survey of Portfolio Selection
Methodologies’, Center for Enterprise Modernization, The MITRE Corporation. (2003).
[8] B. Miller, K. See, and N. Tronick. ‘MITRE Center for Enterprise Modernization Project
Leader Handbook.’ Draft, version 2.0. The MITRE Corporation. (2006).
[9] ‘Modelling and Decision Support Tools’, Institute for Manufacturing, University of
Cambridge. <http://www.ifm.eng.cam.ac.uk/dstools/#3>. (2007).
[10] T. L. Saaty. ‘Priority Setting in Complex Problems’, IEEE Transactions on Engineering
Management, 30/3:140-155. (1983).
[11] T. L. Saaty. ‘The analytic hierarchy process: Some observations on the paper by Apostolou
and Hassell’, Journal of Accounting Literature. (1994).
<http://www.findarticles.com/p/articles/mi_qa3706/is_199401/ai_n8722119>
[12] ‘A Systems Approach to Project Management’, Cambridge Consulting. MITRE Institute
Course Project Management Boot Camp. (2007)
[13] ‘Test & Evaluation Handbook for C2 Systems’, Draft, ESC Test and Evaluation
Directorate, The MITRE Corporation. (1998).
[14] J. W. Ulvila, et al. ‘A Framework for Information Assurance Attributes and Metrics’,
Technical Report 01-1. Decision Sciences Associates, Inc. (2001)
Appendix A Acronym and Definition List
Technology evaluation – An evaluation of multiple products from the same technology area
Product evaluation – An evaluation of a single product for use in a sponsor’s environment
Evaluation scenario – Procedures designed to test a product’s performance in a particular application or situation
Evaluation criteria – Functional requirements and features that products are tested against in an evaluation
TEM – Technical Exchange Meeting
IC TIC – Intelligence Community Test and Integration Center
STEP – Standard Technical Evaluation Process
Appendix B STEP Templates
Section | Action | Template
§ 4.1 | Conduct Preliminary Scoping | Preliminary Scoping
§ 4.2 | Scoping with Government Sponsor | Scoping with Sponsor
§ 4.3 | Perform Market Survey/Tool Selection | Market Survey/Tool Selection
§ 4.4 | Determine Test Architecture |
§ 4.5 | Draft High-Level Test Plan | High-Level Test Plan
§ 4.6 | Check Point – Phase 1 | Phase 1 Brief
§ 5.1 | Establish Evaluation Criteria, Priorities & Test Procedures | Evaluation Criteria Spreadsheet
§ 5.2 | Perform Government Requirements’ Mapping |
§ 5.3 | Enhance and Finalize Test Plan | High-Level Test Plan
§ 5.4 | Acquire Necessary Hardware and Software |
§ 5.5 | Hold Technical Exchange Meeting (TEM) (optional) |
§ 5.6 | Check Point – Phase 2 | Phase 2 Brief
§ 6.1 | Conduct Testing and Compile Results |
§ 6.2 | Perform Crosswalk |
§ 6.3 | Share Results with Vendors |
§ 6.4 | Deliver Final Report |
§ 6.5 | Check Point – Phase 3 |
Phase 4 (no section) | Purchase Selected Product |
Phase 4 (no section) | Support Sponsor with Integration and Deployment |
Insert classification (e.g., UNCLASSIFIED//FOUO)
Created by:
Date:
Operating Platform/OS
Target OS/File Structure
<specific question>
<specific question>
<specific question>
Other Key Features
1. <Question>
<Explanation>
Assumption to be confirmed with sponsor: <Assumption>
Sample Question:
1. Are there any preferences on the type of platform used?
Windows, Linux, Solaris, BSD, or some customized system. What about versions of each platform?
Assumption to be confirmed with sponsor: No preference on platform.
<Category 1> 0
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 0 0 0 0
<Category 2> 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0
7 0 0 0 0 0 0
8 0 0 0 0 0 0
9 0 0 0 0 0 0
10 0 0 0 0 0 0
11 0 0 0 0 0 0
12 0 0 0 0 0 0
General
13 0 0 0 0 0 0
14 0 0 0 0 0 0
15 0 0 0 0 0 0
16 0 0 0 0 0 0
Total Score 0 0 0 0 0
Contacts
• Sponsor:
• Project Lead:
• Researchers:
Test Strategy/Methodology
• text
Deliverables
• list
Resources
• Hardware
• Software
• Books
• Training/Committees/TEMs
• Newsgroups
• Project Documents
# | Evaluation Criteria | Test Description and How to Test | Weight | P1 | P2 | P3 | P4 | P5 | Comments
(columns P1-P5 hold the scores for each <product name>)
1.0 Category 1
1.1 0 0 0 0 0 0
1.2 0 0 0 0 0 0
1.3 0 0 0 0 0 0
1.4 0 0 0 0 0 0
2.0 Category 2
2.1 0 0 0 0 0 0
2.2 0 0 0 0 0 0
2.3 0 0 0 0 0 0
2.4 0 0 0 0 0 0
2.5 0 0 0 0 0 0
2.6 0 0 0 0 0 0
2.7 0 0 0 0 0 0
2.8 0 0 0 0 0 0
3.0 Category 3
3.1 0 0 0 0 0 0
3.2 0 0 0 0 0 0
3.3 0 0 0 0 0 0
3.4 0 0 0 0 0 0
Total Score 0 0 0 0 0