See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/299032486


Article · January 2009


A Data Fusion based Digital Investigation Model as an Effective Forensic Tool
in the Risk Assessment and Management of Cyber Security Systems

Ms. Suneeta Satpathy (PhD Scholar, Utkal University)


Assistant Professor, College of Engineering
Plot-1, Sector-B, CNI Complex, Patia, Bhubaneswar-24, Orissa, India
and
Mr. Asish Mohapatra, MSc, MPhil (pre-doctoral), EMC, Risk Cert (Harvard)
Regional Health Risk Assessment and Toxicology Specialist, Health Canada (Alberta region)
Suite 282, 220-4th Ave SE, Calgary, Alberta, Canada, T2G 4X3

ABSTRACT

The cyber-infrastructure has become increasingly complex and inextricably intertwined with the infrastructures of public and private organizations. The global inter-connectedness of computers has enhanced our capability to analyze various databases; however, it has also raised the issue of information and database security on the web. Law enforcement and the computational forensic analysis process, in its relative infancy, are the unwilling victims of the rapid advancement of information technology. Epistemic uncertainty is an unavoidable attribute of digital investigations and can affect the investigation process. There therefore has to be a well-designed system that analyzes information from various sources for possible security threats and acts appropriately in the event of any suspicion. Forensic digital analysis is unique among forensic applications: it is inherently mathematical and generally involves more data per investigation than other types of applications. In this paper, we present a data fusion based digital investigation model by which conflicting information arising from this unavoidable uncertainty can be identified and processed. Data fusion, along with data mining techniques applied in the context of database and intelligence analysis, can be correlated with various security issues and crimes, and thus holds the promise of alleviating such problems. Application of our proposed model in the broader Health Care and Life Sciences (HCLS), public health risk analysis (PHRA), and toxicological database integration areas is briefly discussed, and future projects are proposed.

Keywords: Information Technology, Data Mining, Data Fusion, Cyber Crime, Digital Evidence, Computer Forensics (CF).

1. INTRODUCTION

Undoubtedly, World Wide Web (WWW) connectivity through the ever-growing cyber-infrastructure has facilitated rapid availability of information resources and databases. As a result, there is a tendency among users, business enterprises, and the private and public sectors to connect to the WWW in order to take advantage of both pull and push technology. There has been tremendous growth in computer and web related crimes, while similar growth is missing in the development of security solutions. This has created a new type of warfare in which information systems and web based databases are the targets, exploited by unscrupulous elements in society to disrupt peace and cause mayhem [5]. The criminal justice delivery system has not kept pace with the technological advancements that have taken place with the advent of information technology. To effectively combat cyber-infrastructure related crimes, it is not sufficient to investigate the crime successfully; it is more important to prosecute and administer justice according to the law of the land.

Under current approaches, security systems and databases generate enormous amounts of data; therefore, higher priority must be given to systems that can analyze such data rather than merely collect it, while still retaining collections of essential forensic data. The forensic aspect of the overall security model is equally important, as the area of Computer Forensics (CF) lends itself heavily to the response to a criminal violation that has already occurred. Data fusion promises to play a proactive and central role in the future prevention, detection, attribution, and remediation of such cyber crimes.

Our research is aimed at providing a fusion based digital investigation model that can address various types of network threats and facilitate forensic analysis. The model, called "A fusion based investigation model for Computer Forensics" [3], is derived from the Joint Directors of Laboratories (JDL) data fusion model [7] and is built around a set of algorithms at various levels of fusion. The algorithms at each level can execute continuously and autonomously in their environment, carrying out activities in a flexible and intelligent manner while remaining responsive to changes in that environment.
This paper concerns CF because it is a problem of great significance to information infrastructure protection: computer networks are at the core of the operational control of much of day-to-day operations.

2. COMPUTER FORENSICS (CF) [1,2]

CF is the science of busting cyber criminals. It can be defined more pedantically as the "investigation of digital evidence for use in criminal or civil courts of law." CF is most commonly used after a suspected hack attempt, in order to analyze a computer or network for evidence of intrusion. It is the use of scientifically derived and proven methods for the preservation, collection, validation, identification, analysis, interpretation, documentation, and presentation of digital evidence derived from digital sources in a court of law to punish the criminal [1]. The major goals are to:

• Provide a conclusive description of all cyber-attack activities for the purpose of complete post-attack enterprise and critical infrastructure information restoration;
• Correlate, interpret, and predict adversarial actions and their impact;
• Make digital data suitable and persuasive for introduction into a criminal investigative process; and
• Provide sufficient evidence to allow the criminal perpetrator to be successfully prosecuted.

A major issue in achieving these goals is how to rapidly collect and normalize digital evidence from a variety of sources, including firewalls, hosts, network management systems, and routers. The collected information could then be used to predict or anticipate adversarial actions, understand the current state of affairs, and help determine appropriate courses of action.

2.1 Problem Statement
CF faces several problems. Some of them are highlighted below.

• Digital investigations are becoming more time consuming and complex as the volumes of data requiring analysis continue to grow.
• Digital investigators are finding it increasingly difficult to use current tools to locate vital evidence within the massive volumes of data.
• Log files are often large and multi-dimensional, which makes the digital investigation and the search for supporting evidence more complex.
• Digital evidence [6, 8] is by definition information of probative value stored or transmitted in digital form. It is fragile in nature and can easily be altered or destroyed. It is unique when compared to other forms of documentary evidence.
• Available computer forensic tools are unable to analyze all the data found on a computer system to reveal the overall pattern of the data set, which could help digital investigators decide what steps to take next in their search. The data offered by computer forensic tools can also be misleading because of the dimensionality, complexity, and amount of the data presented.

Our proposed data fusion based investigative tool will concentrate on improving the quality of data rather than its quantity for analysis. Further, it will lessen the processing time required and ultimately reduce the monetary costs of digital investigations.

3. DATA FUSION SYSTEMS AND RELATED WORK

Multi-sensor data fusion is an evolving technology concerned with how to fuse data from multiple sensors in order to make a more accurate estimate of the environment and to generate information of superior quality [7, 11, 12]. It is a formal framework in which the means and tools for the alliance of data originating from different sources are expressed [14]. The first data fusion methods were applied primarily in the military domain; in recent years these methods have also been applied to problems in the civilian domain and to various non-military applications (e.g., air traffic control, robotics, image processing, remote sensing, hazardous waste tracking, environmental data fusion, etc.). A more recent idea is the application of multisensor data fusion techniques to the area of information security [13, 16].

Multi-sensor data fusion provides an important functional framework for building next generation security systems. Tim Bass presented a data fusion model based on the Joint Directors of Laboratories (JDL) Functional Data Fusion Process Model [16].

A number of research projects have started to implement multisensor data fusion techniques. One of these projects is EMERALD [4], an acronym for "Event Monitoring Enabling Responses to Anomalous Live Disturbances". It couples sensors so that the state of one sensor can adjust another, which suppresses false positives and increases sensitivity. Fusion work of all types seems to take an approach that focuses mainly on implementation. There is no general architecture for forensic data fusion systems. Based on these observations, it seems important to start a systematic analysis and to develop a generic architecture for a forensic fusion-based investigation model, which
can facilitate digital forensic analysis and ultimately restrain cyber criminals.

3.1 Requirements of Forensic Data Fusion system
Developing a system that uses data fusion technology for agencies conducting CF investigations requires an appropriate methodology for selecting an architecture and adopting alternative techniques to meet system requirements cost-effectively [7]. Generally accepted engineering guidelines for data fusion systems recommend a paradigm in which design and development flow from overall system requirements and constraints to a specification of the role of data fusion within the system. Several fundamental issues should be taken into consideration when building an investigation model [9, 10]:

• What architecture should be used?
• What algorithms and techniques are appropriate and optimal for a particular application?
• How should the individual source data be processed to extract the maximum amount of information?
• How does the data collection environment affect the processing?
• How can the fusion process be optimized?
• What accuracy can be achieved by a data fusion process?

The contemporary view of the security problem holds that particular protective mechanisms and their corresponding software must be integrated, along with forensic capabilities, into a fusion system whose components interact by exchanging information and make decisions in a cooperative and coordinated manner. Such systems should adapt to traffic variations and to reconfiguration of software and hardware components.

4. A FUSION BASED DIGITAL INVESTIGATION MODEL

As mentioned above, the proposed model (Figure 1), "A fusion based Investigation Model for Computer Forensics" [3], is motivated by the data fusion model proposed by the JDL [7], which fuses data from various heterogeneous sources in order to attain low false alarm rates and high threat detection rates. In addition to the functionalities of the data fusion model, our model also supports post-mortem forensic analysis by preserving the necessary potential legal digital evidence. The main goal is therefore to provide, proactively, a more intelligent fusion based model for CF that partly automates the detection and prevention of intrusions and handles the true positives, thereby reducing the number of events the operator of the system has to inspect as well as prioritizing these events. The data fusion process is further explained in four different progressions:

• Collection of events from different sources
• Processing of events in various levels of fusion
• Decision making
• Evidence accumulation

Figure 1 – A fusion based digital investigation model

4.1 Data Collection & Pre-Processing
The first step is the data collection and pre-processing phase, in which data collected from various sources are fused and processed to produce data specifying semantically understandable and interpretable attributes of objects [7]. The collected data are aligned in time, space, or measurement units, and the information extracted during the processing phase is saved to the knowledge database, or knowledgebase.

4.2 Low level fusion
This level of fusion processes the data to achieve a refined representation: it reduces the quantity of data while retaining useful information and improving its quality, with minimal loss of detail. It is mainly concerned with data cleaning, data transformation, and data reduction [7].

• Data cleaning (removes irrelevant information)
• Data transformation (converts the raw data into structured information)
• Data reduction (reduces the representation of the dataset into a smaller volume to make analysis more practical and feasible)

The above procedures enable the data fusion process to focus on the data most applicable to the current situation and reduce the load on the data fusion system. They can help divide a search space into smaller, more easily managed parts, which can save valuable time during a digital investigation.
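
As a rough illustration of the three low-level fusion steps above, the following Python sketch applies them to a handful of made-up firewall-style log lines. The log format, the field names, and the `IRRELEVANT_ACTIONS` filter are assumptions introduced only for this example; they are not taken from the proposed model or from any particular forensic tool.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical raw log lines: "<ISO timestamp> <source IP> <target IP> <action>"
RAW_LOGS = [
    "2009-01-05T10:00:01+00:00 10.0.0.5 192.168.1.10 LOGIN_FAIL",
    "2009-01-05T10:00:02+00:00 10.0.0.5 192.168.1.10 LOGIN_FAIL",
    "2009-01-05T10:00:03+00:00 10.0.0.9 192.168.1.10 HEARTBEAT",
    "malformed line",
    "2009-01-05T10:00:04+00:00 10.0.0.5 192.168.1.10 LOGIN_OK",
]

# Assumption: event types the investigator treats as irrelevant to the case.
IRRELEVANT_ACTIONS = {"HEARTBEAT"}


def clean(lines):
    """Data cleaning: drop malformed lines and irrelevant event types."""
    kept = []
    for line in lines:
        parts = line.split()
        if len(parts) == 4 and parts[3] not in IRRELEVANT_ACTIONS:
            kept.append(parts)
    return kept


def transform(rows):
    """Data transformation: turn raw fields into structured records in UTC."""
    records = []
    for ts, src, dst, action in rows:
        when = datetime.fromisoformat(ts).astimezone(timezone.utc)
        records.append({"time": when, "src": src, "dst": dst, "action": action})
    return records


def reduce_events(records):
    """Data reduction: collapse records into counts per (src, dst, action)."""
    return Counter((r["src"], r["dst"], r["action"]) for r in records)


summary = reduce_events(transform(clean(RAW_LOGS)))
for key, count in sorted(summary.items()):
    print(key, count)
```

In this sketch the five raw lines shrink to two summary entries, which is the kind of search-space reduction the section describes. In a forensic setting the untouched raw lines would presumably still be preserved separately, since the reduced summary alone would not serve as evidence.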
4.3 Data estimation phase
This phase is based on a model of the system behavior stored in the feature database and on the knowledge acquired by the knowledgebase. The fusion algorithm estimates the state of the system. After extracting features from the structured datasets, the data fusion system saves them to an information product database.

4.4 High-level fusion
This level of fusion develops a background description of relations between entities. It consists of event and activity interpretation and, eventually, contextual interpretation [7]. Furthermore, it involves the use of data mining functionalities such as classification and clustering to extract useful patterns from the data. The results obtained would be indicative of destructive behavior patterns. These features form a feature space that is fused to identify and classify the patterns for attack detection and recognition. This level effectively extends and enhances the completeness, consistency, and level of abstraction of the situation description.

4.5 Decision level fusion
The patterns discovered by high-level fusion still need to be analyzed to determine their relevance. The goal of this step is to identify pertinent patterns. Further, it analyzes the current situation and projects it into the future to draw inferences about possible outcomes. It identifies intent, lethality, and opportunity [7]. Finally, the decision of the fusion result, along with the necessary information, is stored in the forensic logbook, from which the forensic evidence report can be generated for use in expert testimony in a court of law.

4.6 Forensic Logbook
The forensic logbook is a record keeping system. The term forensics refers to the post-mortem analysis of evidence; in the context of computers, we refer to the analysis of evidence as CF [1, 2]. The digital information captured is recorded in the logbook in a pre-defined format that includes the date and time of the event, the intruder's IP address, the target IP address, users, the type of event, the success or failure of the event, the origin of the request for identification/authentication data, and the name of the object for object introduction and deletion data. A time stamp is added to all data logged. The time line can be seen as a recording of the attack. For documentation purposes, a report containing the data pre-processing process and the above information can be generated and used by the CF expert as potential legal digital evidence in a court of law.

4.7 User Interface
The proposed model separates the user interface from the data collection and processing elements. The administrator and computer forensic experts can communicate with the forensic logbook through the user interface. An evidence report can be generated after analysis.

During fusion at every level, external knowledge is always used as auxiliary information in order to increase accuracy, and the information extracted during processing is saved to the knowledgebase, making the process more dynamic.

4.8 Feasibility
The proposed model can be useful as an evidence acquisition tool for supplying offline admissible legal digital evidence to forensic investigating agencies, including preservation and continuity of evidence and transparency of the CF methods. As admissibility and weight are the two determinants of the legal acceptability of digital evidence [6], the courts deal with issues related to the difference between novel scientific evidence and legal evidence.

There are three requirements for evidence to be admissible in court [2]:

• Authentication (showing a true copy of the original)
• The best evidence rule (presenting the original)
• Exceptions to the hearsay rule (allowable exceptions are when confessions, business records, or other official records are involved)

From an evidence perspective, law enforcement agencies will seek something that they can demonstrate to others long after the event is over (i.e., the evidence log file). The main aim is to identify the features that will be responsive to the needs of law enforcement agencies in collecting the information and protecting the chain of evidence of computer intrusions, so that it will stand up in a court of law to prove the crime.

5. CONCLUSIONS AND FUTURE WORK

Collecting digital evidence is not an easy task. In this paper we have proposed a proprietary fusion based investigation model that can effectively process different types of data, both syntactically and semantically, to retrieve legal digital evidence. Tracking such coordinated, multifaceted cyberspace attacks requires cluster analysis techniques, adaptive neural networks, and rule-based knowledgebase systems.

Profiling, identifying, tracing, and apprehending cyber suspects are important research issues today. Within a computer system, the anonymity afforded to the criminal encourages destructive behavior while making it extremely difficult to prove the identity of the criminal. Computer Forensics has emerged in response to the escalation of crimes committed by the use of computer systems either as an object of crime, an instrument used to commit a crime, or a repository of
evidence related to a crime. The evidence gathering process in a computing environment is, by its nature, technical and different from other forms of evidence gathering. Data fusion, along with data mining techniques, promises to play a central role in the future prevention, detection, attribution, and remediation of such cyber crimes.

Our future work includes extending the investigation model to detect and prevent the various types of cyber threats. Furthermore, the co-author of this paper has also proposed a data fusion based dynamic risk analysis framework at a symposium of the 2008 Society for Risk Analysis (SRA) annual conference (Boston, MA, USA). In addition to the various military and non-military applications described in this paper, these emerging data fusion methodologies can be effectively extended to related areas such as environmental and public health risk analysis, toxicology, and Health Care and Life Sciences (HCLS) data fusion. The methodology proposed here can find application in these emerging areas in terms of protecting environmental and human health ecosystems. Application of data fusion and data mash-up technologies facilitated by a semantic web informatics framework can increase the efficiency of dynamic data integration and risk analysis of various systems [15].

Disclaimer: Views expressed in this paper are those of the authors and do not necessarily represent: a) affiliating agency positions, or b) endorsement of specific tools.

6. REFERENCES

[1] E. Casey (ed.), Handbook of Computer Crime Investigation, Academic Press, 2001.
[2] E. Casey, Digital Evidence and Computer Crime: Forensic Science, Computers and the Internet, Academic Press, 2000.
[3] S. Satpathy, A. Kar, and S. Pradhan, A Fusion based model for Computer Forensics, Indian Science Congress conference, Jan 3-7, 2005.
[4] P.A. Porras and P.G. Neumann, EMERALD: Event Monitoring Enabling Responses to Anomalous Live Disturbances, in Proceedings of the 20th National Information Systems Security Conference, National Institute of Standards and Technology, 1997.
[5] H. Lipson, Tracking and Tracing Cyber Attacks: Technical Challenges and Global Policy Issues (CMU/SEI-2002-SR-009), CERT Coordination Center, November 2002.
[6] J. Danielsson, Project Description: A system for collection and analysis of forensic evidence, Application to NFR, April 2002.
[7] D. L. Hall and S. A. H. McMullen, Mathematical Techniques in Multisensor Data Fusion, 2nd edition, Artech House, 2004.
[8] D. Brezinski and T. Killalea, Guidelines for Evidence Collection and Archiving, RFC 3227, February 2002.
[9] C. King, E. Osmanoglu, and C. Dalton, Security Model Design, Deployment and Operations, chapter 4, McGraw-Hill Osborne Media, 2001.
[10] P. Stephenson, Intrusion Management: A Top Level Model for Securing Information Assets in an Enterprise Environment, Proceedings of EICAR 2000, Brussels, Belgium, March 2000.
[11] http://www.data-fusion.org, accessed May 30, 2009.
[12] D. L. Hall and J. Llinas, An Introduction to Multisensor Data Fusion, Proceedings of the IEEE, vol. 85, no. 1, pp. 6-23, 1997.
[13] P. Varshney, Distributed Detection and Data Fusion, Springer-Verlag, New York, NY, 1995.
[14] E. Waltz and J. Llinas, Multisensor Data Fusion, Artech House, Boston, MA, 1990.
[15] A. Mohapatra, Semantic Web Informatics Facilitated Tool (SWIFT) – Dynamic Analysis of Risk Tools (DART): A Knowledgebase Framework, Symposium on "A Palette of Scientific Data - Online Tools to Support Risk Assessment", Society for Risk Analysis (SRA), Boston, MA, 2008.
[16] T. Bass, Multi-sensor Data Fusion for Next Generation Distributed Intrusion Detection Systems, in Proceedings of the IRIS National Symposium on Sensor and Data Fusion, 1999.
