Weka vs Orange: Data Mining Tools Comparison

This document discusses various data mining tools and provides examples of using the Orange and Weka tools. It describes key features and the user interface of tools such as Orange, RapidMiner, Teradata, KNIME, H2O, and Weka. It then walks through analyzing a dataset with the Weka tool, including loading the iris dataset, selecting and running the J48 algorithm, and reviewing the results. The goal is to demonstrate the classification rule process on a dataset using the J48 algorithm.

DWDM 191290116048

Practical 1
Aim: Case study on different data mining tools.
 What is Data Mining?
Data mining is the process of sorting through large data sets to identify patterns
and relationships that can help solve business problems through data analysis.
Data mining techniques and tools enable enterprises to predict future trends and
make more-informed business decisions.
Data mining is a crucial component of successful analytics initiatives in
organizations. The information it generates can be used in business
intelligence (BI) and advanced analytics applications that involve analysis of
historical data, as well as real-time analytics applications that examine streaming
data as it's created or collected.
 Data Mining Tools
Data Melt Data Mining
Orange Data Mining
Oracle Data Mining
SAS Data Mining
RapidMiner
Teradata
KNIME
Rattle
Weka
H2O

Gyanmanjari institute of technology 1

Orange Data Mining

Orange is a machine learning and data mining software suite. It supports visualization and is component-based software written in the Python programming language, developed at the bioinformatics laboratory of the Faculty of Computer and Information Science, University of Ljubljana, Slovenia.
Because Orange is component-based, its components are called "widgets." These widgets range from data pre-processing and visualization to algorithm assessment and predictive modelling.
 Features
Data coming into Orange is quickly formatted to the desired pattern, and widgets can easily be moved wherever they are needed.
Orange lets its users make smarter decisions in less time by rapidly comparing and analysing data.
It is a good open-source tool for data visualization and evaluation, suitable for both beginners and professionals.
Data mining can be performed via visual programming or Python scripting.
 User Interface


RapidMiner
RapidMiner is a free-to-use data mining tool. It is used for data preparation, machine learning, and model deployment. It offers a range of products for building new data mining processes and setting up predictive analyses.
 Features
Allow multiple data management methods.
GUI or batch processing.
Integrates with in-house databases.
Interactive, shareable dashboards.
Big Data predictive analytics.
Remote analysis processing.
Data filtering, joining, merging, and aggregating.
Build, train and validate predictive models.
Reports and triggered notifications.
 User Interface


Teradata
Teradata is a massively parallel processing (MPP) system for developing large-scale data warehousing applications. Teradata can run on Unix, Linux, and Windows server platforms.
 Features
The Teradata Optimizer can handle up to 64 joins in a query.
Teradata has a low total cost of ownership; it is easy to set up, maintain, and administer.
It supports SQL for interacting with the data stored in tables, and provides its own SQL extensions.
It distributes data across the disks automatically, with no manual intervention.
Teradata provides load and unload utilities to move data into and out of a Teradata system.
 User Interface


KNIME
KNIME is open-source software for creating data science applications and services. It is one of the best data mining tools for understanding data and designing data science workflows.
 Features
Helps you to build an end-to-end data science workflow.
Blend data from any source.
Allows you to aggregate, sort, filter, and join data either on your local machine,
in-database or in distributed big data environments.
Build machine learning models for classification, regression, and dimensionality
reduction.
 User Interface


H2O
H2O is another excellent open-source data mining tool. It is used to perform data
analysis on data held in cloud computing application systems.
 Features
H2O allows you to take advantage of the computing power of distributed systems
and in-memory computing.
It allows fast and easy deployment into production via Java code and binary model formats.
It lets you use programming languages such as R and Python to build models in H2O.
Distributed, in-memory processing.
 User Interface

Signature:

Date:


Practical 2
Aim: Analysis of mining techniques using Weka Tool.

 Weka Tool
Weka: Waikato Environment for Knowledge Analysis


 Start Weka
Start Weka. This may involve finding it in your program launcher or double-clicking
the [Link] file. This will open the Weka GUI Chooser.
The Weka GUI Chooser lets you choose among the Explorer, Experimenter,
KnowledgeFlow, and the Simple CLI (command line interface).
Click the “Explorer” button to launch the Weka Explorer.
This GUI lets you load datasets and run classification algorithms. It also provides
other features, like data filtering, clustering, association rule extraction, and
visualization, but we won’t be using those features right now.
 Open the data/iris.arff Dataset
Click the “Open file…” button to open a dataset and double-click on the “data”
directory.
Weka provides a number of small, common machine learning datasets that you
can use to practice on.
Select the “iris.arff” file to load the Iris dataset.
The Iris Flower dataset is a famous dataset from statistics that is widely used by
researchers in machine learning. It contains 150 instances (rows) and 4 numeric
attributes (columns), plus a class attribute for the species of iris flower.
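The file format Weka reads here is ARFF: a plain-text header declaring the relation and its attributes, followed by a @data section of comma-separated rows. As a rough illustration (a stdlib sketch, not Weka code), a minimal parser for an iris-style excerpt might look like:

```python
# A tiny inline excerpt in the same layout as Weka's data/iris.arff.
arff_text = """\
@relation iris
@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}
@data
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""

def parse_arff(text):
    """Parse an ARFF string into attribute names and data rows."""
    attributes, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('%'):
            continue                      # skip blank lines and comments
        if line.lower().startswith('@attribute'):
            attributes.append(line.split()[1])
        elif line.lower().startswith('@data'):
            in_data = True
        elif in_data:
            rows.append(line.split(','))
    return attributes, rows

attrs, rows = parse_arff(arff_text)
print(attrs)      # ['sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'class']
print(len(rows))  # 2
```

The full iris.arff simply has 150 such data rows; the last column is the class attribute mentioned above.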
 Select and Run an Algorithm
Now that you have loaded a dataset, it’s time to choose a machine learning
algorithm to model the problem and make predictions.
Click the “Classify” tab. This is the area for running algorithms against a loaded
dataset in Weka.
You will note that the “ZeroR” algorithm is selected by default.
Click the “Start” button to run this algorithm.
The ZeroR algorithm selects the majority class in the dataset (all three species of
iris are equally represented, so it picks the first one: setosa) and uses it for every
prediction. This is the baseline for the dataset and the measure against which all
other algorithms can be compared. The result is 33%, as expected (3 classes,
each equally represented; always predicting one of the three yields
33% classification accuracy).
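The baseline is simple enough to sketch from scratch. The snippet below is an illustration of the idea, not Weka's implementation: predict the most frequent class for every instance, which on perfectly balanced labels like iris scores about 33%.

```python
from collections import Counter

def zero_r(labels):
    """ZeroR-style baseline: always predict the most frequent class."""
    majority, _ = Counter(labels).most_common(1)[0]
    return majority

# Balanced labels like iris: three classes, 50 instances each.
labels = ['setosa'] * 50 + ['versicolor'] * 50 + ['virginica'] * 50
pred = zero_r(labels)
accuracy = sum(1 for y in labels if y == pred) / len(labels)
print(pred, round(accuracy, 2))  # accuracy is 0.33 regardless of which class wins
```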


You will also note that the test options select Cross Validation by default with 10
folds. This means that the dataset is split into 10 parts: the first 9 are used to train
the algorithm, and the 10th is used to assess the algorithm. This process is
repeated, allowing each of the 10 parts of the split dataset a chance to be the held-
out test set.
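The fold-splitting idea can be sketched in plain Python. This is a simplified illustration — for classification Weka additionally stratifies the folds so each has roughly the same class proportions:

```python
def cross_validation_folds(n_instances, k=10):
    """Split indices 0..n-1 into k folds; each fold serves once as the test set."""
    indices = list(range(n_instances))
    # Distribute any remainder across the first few folds.
    fold_sizes = [n_instances // k + (1 if i < n_instances % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(indices[start:start + size])
        start += size
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = cross_validation_folds(150, k=10)
print(len(splits))                            # 10 train/test pairs
print(len(splits[0][0]), len(splits[0][1]))   # 135 train, 15 test
```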
The ZeroR algorithm is important, but boring.
Click the “Choose” button in the “Classifier” section and click on “trees” and
click on the “J48” algorithm.
This is an implementation of the C4.8 algorithm in Java (“J” for Java, 48 for C4.8,
hence the J48 name) and is a minor extension to the famous C4.5 algorithm.
Click the “Start” button to run the algorithm.
 Review Results
After running the J48 algorithm, you can note the results in the “Classifier output”
section.
The algorithm was run with 10-fold cross-validation: this means it was given an
opportunity to make a prediction for each instance of the dataset (with different
training folds) and the presented result is a summary of those predictions.
Firstly, note the Classification Accuracy. You can see that the model achieved a
result of 144/150 correct or 96%, which seems a lot better than the baseline of
33%.
Secondly, look at the Confusion Matrix. You can see a table of actual classes
compared to predicted classes and you can see that there was 1 error where an
Iris-setosa was classified as an Iris-versicolor, 2 cases where Iris-virginica was
classified as an Iris-versicolor, and 3 cases where an Iris-versicolor was classified
as an Iris-setosa (a total of 6 errors). This table can help to explain the accuracy
achieved by the algorithm.
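The accuracy figure follows directly from the confusion matrix. The matrix below is reconstructed from the error counts described above (rows are actual classes, columns are predicted classes), so treat the exact cell values as illustrative:

```python
# 3x3 confusion matrix for J48 on iris, reconstructed from the 6 errors
# described in the text: rows = actual, columns = predicted,
# class order: Iris-setosa, Iris-versicolor, Iris-virginica.
confusion = [
    [49, 1, 0],   # 1 setosa predicted as versicolor
    [3, 47, 0],   # 3 versicolor predicted as setosa
    [0, 2, 48],   # 2 virginica predicted as versicolor
]
correct = sum(confusion[i][i] for i in range(3))   # diagonal = correct predictions
total = sum(sum(row) for row in confusion)
print(correct, total, round(100 * correct / total))  # 144 150 96
```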

Signature:

Date:


Practical 3
Aim: Demonstration of classification rule process on dataset using J48
algorithm.
 Step 1: Select Database Student.


 Step 2: Select ARFF file.

 Step 3: Select J48 Algorithm from Trees Classifier.


 Step 4: Show Summary of Dataset Using J48 Algorithm.

 Step 5: Show Tree of Dataset.

Signature:

Date:


Practical 4
Aim: Demonstration of classification rule process on dataset using ID3
algorithm.
 Step 1: Select Database Employee.


 Step 2: Select ARFF File.

 Step 3: Select ID3 Algorithm from Trees Classifier.


 Step 4: Show Summary of Dataset Using ID3 Algorithm.
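ID3 chooses, at each node, the attribute with the highest information gain (the reduction in entropy from splitting on it). Below is a from-scratch sketch of that criterion on a toy, hypothetical employee-style table — not the actual dataset used above:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_gain(rows, labels, attr_index):
    """ID3 split criterion: entropy reduction from splitting on one attribute."""
    base = entropy(labels)
    n = len(labels)
    remainder = 0.0
    for value in set(row[attr_index] for row in rows):
        subset = [y for row, y in zip(rows, labels) if row[attr_index] == value]
        remainder += (len(subset) / n) * entropy(subset)
    return base - remainder

# Toy data: attribute 0 (level) perfectly predicts the label, attribute 1 (dept) does not.
rows = [('senior', 'IT'), ('senior', 'HR'), ('junior', 'IT'), ('junior', 'HR')]
labels = ['yes', 'yes', 'no', 'no']
print(round(information_gain(rows, labels, 0), 3))  # 1.0 (perfect split)
print(round(information_gain(rows, labels, 1), 3))  # 0.0 (no information)
```

ID3 would therefore split on attribute 0 first; J48 (C4.5) refines this idea with gain ratio and pruning.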

Signature:

Date:


Practical 5
Aim: Demonstration of classification rule process on dataset using Naive
Bayes algorithm.
 Step 1: Select Database Student.


 Step 2: Select ARFF File.

 Step 3: Select Naive Bayes Algorithm from Classifier.


 Step 4: Show Summary of Dataset Using Naive Bayes Algorithm.
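Naive Bayes scores each class by multiplying its prior by per-attribute conditional probabilities, assuming attributes are independent given the class. A minimal from-scratch sketch on toy, hypothetical student-style data (Weka's NaiveBayes also handles numeric attributes, which this sketch does not):

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Fit a categorical naive Bayes model: class counts and per-attribute value counts."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (class, attr_index) -> Counter of values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(y, i)][v] += 1
    return class_counts, value_counts

def predict_nb(model, row):
    """Pick the class maximizing prior * product of smoothed conditionals."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best, best_p = None, -1.0
    for y, cy in class_counts.items():
        p = cy / total                                  # class prior
        for i, v in enumerate(row):
            counts = value_counts[(y, i)]
            p *= (counts[v] + 1) / (cy + len(counts) + 1)  # Laplace-style smoothing
        if p > best_p:
            best, best_p = y, p
    return best

# Toy data: (attendance, submitted_assignment) -> result.
rows = [('high', 'no'), ('high', 'yes'), ('low', 'yes'), ('low', 'no')]
labels = ['fail', 'pass', 'pass', 'pass']
model = train_nb(rows, labels)
print(predict_nb(model, ('low', 'yes')))  # pass
```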

Signature:

Date:


Practical 6
Aim: Demonstration of clustering rule process on dataset iris using simple
k-means.
 Step 1: Select Database IRIS.


 Step 2: Select ARFF File.

 Step 3: Show Attributes of Current Relation IRIS.


 Step 4: Select Simple K Means Cluster from Clusterers.

 Step 5: Show Cluster Output of Dataset IRIS using Simple K Means Cluster.
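Simple k-means alternates between assigning each point to its nearest centroid and recomputing centroids as cluster means. A from-scratch sketch on toy 2-D data (an illustration of the algorithm, not Weka's SimpleKMeans code; it seeds with the first k points for simplicity):

```python
def kmeans(points, k, iterations=10):
    """Plain k-means: assign points to nearest centroid, then recompute means."""
    centroids = [list(p) for p in points[:k]]          # naive seeding: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[c])))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:                                 # keep old centroid if cluster empties
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, clusters

# Two well-separated blobs of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.8, 8.2)]
centroids, clusters = kmeans(points, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3] - one cluster per blob
```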

Signature:

Date:


Practical 7
Aim: Demonstration of clustering rule process on dataset student using
simple k-means.
 Step 1 : Select Database Student.


 Step 2: Select ARFF File.

 Step 3: Show Attributes of Current Relation Student.


 Step 4: Select Simple K Means Cluster from Clusterers.

 Step 5: Show Cluster Output of Dataset Student using Simple K Means Cluster.

Signature:

Date:


Practical 8
Aim: Demonstration of Association rule process on dataset supermarket
using Apriori.
 Step 1 : Select Database Supermarket.


 Step 2: Select ARFF File.

 Step 3: Show Attributes of Current Relation Supermarket.


 Step 4: Select Apriori Associator from Associations.

 Step 5: Best Rules Found from Supermarket Dataset using Apriori Associator.
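Apriori finds itemsets whose support — the fraction of transactions containing them — meets a minimum threshold, growing candidates level by level and pruning as it goes. A from-scratch sketch on a toy, hypothetical basket of transactions (not the supermarket dataset itself):

```python
from itertools import combinations

def apriori_frequent(transactions, min_support):
    """Return frequent itemsets (as sorted tuples) mapped to their support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})

    def support(itemset):
        return sum(1 for t in transactions if set(itemset) <= t) / n

    frequent, size = {}, 1
    candidates = [(i,) for i in items]
    while candidates:
        # Keep only candidates at this level that meet the support threshold.
        level = {c: support(c) for c in candidates if support(c) >= min_support}
        frequent.update(level)
        size += 1
        # Grow next-level candidates only from items that survived (Apriori pruning).
        survivors = sorted({i for c in level for i in c})
        candidates = list(combinations(survivors, size))
    return frequent

transactions = [
    {'bread', 'milk'},
    {'bread', 'biscuits', 'milk'},
    {'milk', 'biscuits'},
    {'bread', 'milk', 'biscuits'},
]
freq = apriori_frequent(transactions, min_support=0.5)
print(freq[('bread', 'milk')])  # 0.75 - present in 3 of 4 transactions
```

Association rules like bread ⇒ milk are then read off the frequent itemsets by comparing supports (confidence = support(bread, milk) / support(bread)).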

Signature:

Date:


Practical 9
Aim: Demonstrate how we can add a particular algorithm to Weka via an
external package.
 Step 1 : Select Package manager From Tools.

 Step 2: Select Package from Package Manager to Insert External Package.


 Step 3: After Selecting the Package from the Package Manager, Install the External Package.

Signature:

Date:


Practical 10
Aim: Study and Analyze DTREG Data Mining Tool.
 DTREG Data Mining Tool.
DTREG is a robust application that installs easily on any Windows system.
DTREG reads Comma-Separated Value (CSV) data files, which are easily created
from almost any data source.
Once you create your data file, just feed it into DTREG, and let DTREG do all of
the work of creating a decision tree, Support Vector Machine, K-Means
clustering, Linear Discriminant Function, Linear Regression or Logistic
Regression model. Even complex analyses can be set up in minutes.

 Features.
Data Import: DTREG can import data from various sources such as CSV, Excel,
SQL, ODBC, and Oracle. It also supports importing data from SAS datasets.
Data Visualization: DTREG provides a range of visualization tools such as scatter
plots, histograms, box plots, and line charts, to help users understand the
distribution and relationships among variables in their datasets.
Feature Selection: DTREG offers multiple feature selection methods such as
correlation-based feature selection, backward feature elimination, and forward
feature selection, which helps users to select the most relevant variables for
building predictive models.
Model Building: DTREG supports various algorithms for model building,
including decision trees, regression analysis, neural networks, and support vector
machines (SVMs). Users can choose the algorithm that best suits their data and
research question.
Model Evaluation: DTREG provides several evaluation metrics such as root
mean square error (RMSE), mean absolute error (MAE), and coefficient of
determination (R-squared) to help users assess the accuracy of their predictive
models.
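The three evaluation metrics listed above are straightforward to compute. The sketch below uses small made-up numbers purely for illustration, not DTREG output:

```python
import math

def regression_metrics(actual, predicted):
    """Compute RMSE, MAE, and R-squared for a set of predictions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)        # root mean square error
    mae = sum(abs(e) for e in errors) / n                   # mean absolute error
    mean_a = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((a - mean_a) ** 2 for a in actual)         # total sum of squares
    r2 = 1 - ss_res / ss_tot                                # coefficient of determination
    return rmse, mae, r2

actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]
rmse, mae, r2 = regression_metrics(actual, predicted)
print(round(rmse, 3), round(mae, 3), round(r2, 3))  # 0.354 0.25 0.975
```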
Model Deployment: DTREG allows users to export their predictive models as
C++ or Java code, which can be integrated into other software applications.


 Step 1: Select Zoo DTREG (.dtr) Dataset.

 Step 2: Show the Zoo Dataset Variables.

 Step 3: Show the Tree of Zoo Dataset.


 Step 4: Show the Model Size and Error Rate of Zoo Dataset.

Signature:

Date:

