Data Science Record
Index

Exp. No. | Date | Experiment Name | Page No. | Mark | Signature
1. Install, Configure and Run Hadoop and R
2. Implement word count / frequency programs using MapReduce
3. Implement an MR program that processes a Weather Dataset
4. Implement Linear and Logistic Regression
5a. Implement SVM Classification Techniques
5b. Implement Decision Tree Classification Techniques
6. Implement Clustering Techniques
7. Visualize data using any plotting framework
8. Implement an application that stores big data in HBase / MongoDB / Pig using Hadoop / R
Exp No: 1
Install, Configure and Run Hadoop and R
Date:

AIM:
To install, configure and run Hadoop and R.
PROCEDURE:
Hadoop Installation
Run the following commands in an Ubuntu terminal:
1. Install Java (OpenJDK 8) on Ubuntu.
a. sudo apt update
b. sudo apt install openjdk-8-jdk -y
2. Check that Java is installed
a. java -version
b. javac -version
3. Install SSH server
a. sudo apt install openssh-server openssh-client -y
4. Create a new user in Ubuntu
a. sudo adduser hdoop
b. sudo adduser hdoop sudo
c. su - hdoop
5. Generate an SSH key pair
a. ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
b. cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
c. chmod 0600 ~/.ssh/authorized_keys
6. Verify SSH access to localhost
a. ssh localhost
7. Download Hadoop
a. wget https://downloads.apache.org/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
b. tar xzf hadoop-3.2.3.tar.gz
8. Edit bashrc
a. sudo nano .bashrc
b. Add the following lines at the end of the file
i. export HADOOP_HOME=/home/hdoop/hadoop-3.2.3
ii. export HADOOP_INSTALL=$HADOOP_HOME
iii. export HADOOP_MAPRED_HOME=$HADOOP_HOME
iv. export HADOOP_COMMON_HOME=$HADOOP_HOME
v. export HADOOP_HDFS_HOME=$HADOOP_HOME
vi. export YARN_HOME=$HADOOP_HOME
vii. export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
viii. export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
ix. export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
c. source ~/.bashrc
9. Edit JAVA_HOME
a. sudo nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh
b. Add the following line at the end of the file
i. export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
10. Edit core-site
a. sudo nano $HADOOP_HOME/etc/hadoop/core-site.xml
b. Add the following lines
i. <property>
ii. <name>hadoop.tmp.dir</name>
iii. <value>/home/hdoop/tmpdata</value>
iv. <description>A base for other temporary directories.</description>
v. </property>
vi. <property>
vii. <name>fs.default.name</name>
viii. <value>hdfs://localhost:9000</value>
ix. <description>The name of the default file system.</description>
x. </property>
11. Edit hdfs-site
a. sudo nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
b. Add the following lines
i. <property>
ii. <name>dfs.namenode.name.dir</name>
iii. <value>/home/hdoop/dfsdata/namenode</value>
iv. </property>
v. <property>
vi. <name>dfs.datanode.data.dir</name>
vii. <value>/home/hdoop/dfsdata/datanode</value>
viii. </property>
ix. <property>
x. <name>dfs.replication</name>
xi. <value>1</value>
xii. </property>
12. Edit mapred-site
a. sudo nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
b. Add the following lines
i. <property>
ii. <name>mapreduce.framework.name</name>
iii. <value>yarn</value>
iv. </property>
13. Edit yarn-site
a. sudo nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
b. Add the following lines
i. <property>
ii. <name>yarn.nodemanager.aux-services</name>
iii. <value>mapreduce_shuffle</value>
iv. </property>
v. <property>
vi. <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
vii. <value>org.apache.hadoop.mapred.ShuffleHandler</value>
viii. </property>
ix. <property>
x. <name>yarn.resourcemanager.hostname</name>
xi. <value>127.0.0.1</value>
xii. </property>
xiii. <property>
xiv. <name>yarn.acl.enable</name>
xv. <value>0</value>
xvi. </property>
xvii. <property>
xviii. <name>yarn.nodemanager.env-whitelist</name>
xix. <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
xx. </property>
14. Launch Hadoop
a. hdfs namenode -format
b. cd $HADOOP_HOME/sbin
c. start-all.sh
15. Open the Hadoop web interfaces in a browser
a. localhost:8088 (YARN Resource Manager)
b. localhost:9870 (NameNode web UI)
Install R on Windows
1. Download the R installer for Windows from https://cran.r-project.org.
2. Run the installer and accept the default options.
3. Launch RGui to verify that R starts correctly.
OUTPUT:
RESULT:
Thus, the installation and configuration of Hadoop and R have been completed successfully.
Exp No: 2
Implement word count / frequency programs using MapReduce
Date:
AIM:
To implement a word count / frequency program using MapReduce.
PROCEDURE:
Run the following commands in an Ubuntu terminal.
1. Create a directory on the Desktop named Lab, and inside it create two folders: one called
“Input” and the other called “tutorial_classes”.
a. cd Desktop
b. mkdir Lab
c. mkdir Lab/Input
d. mkdir Lab/tutorial_classes
2. Place the “WordCount.java” file attached with this document in the Lab directory.
3. Place the “input.txt” file attached with this document in the Lab/Input directory.
4. Type the following command to export the hadoop classpath into bash.
a. export HADOOP_CLASSPATH=$(hadoop classpath)
5. Make sure it is now exported.
a. echo $HADOOP_CLASSPATH
6. Now create the corresponding directories on HDFS (rather than locally) and upload the
input file. Type the following commands.
a. hadoop fs -mkdir /WordCountTutorial
b. hadoop fs -mkdir /WordCountTutorial/Input
c. hadoop fs -put Lab/Input/input.txt /WordCountTutorial/Input
7. Go to localhost:9870 in the browser, open “Utilities → Browse File System”, and you
should see the directories and files we placed in the file system.
8. Back on the local machine, compile WordCount.java (assuming we are currently in the
Desktop directory), then pack the compiled classes into one jar file (note the dot at the
end of the jar command).
a. cd Lab
b. javac -classpath $HADOOP_CLASSPATH -d tutorial_classes WordCount.java
c. jar -cvf WordCount.jar -C tutorial_classes .
9. Now, we run the jar file on Hadoop.
a. hadoop jar WordCount.jar WordCount /WordCountTutorial/Input /WordCountTutorial/Output
10. Output the result:
a. hadoop fs -cat /WordCountTutorial/Output/*
OUTPUT:
RESULT:
Thus, the implementation of word count using MapReduce has been executed successfully.
Exp No: 3
Implement an MR program that processes a Weather Dataset
Date:
AIM:
To implement a MapReduce program that processes a weather dataset.
PROCEDURE:
Type the following program, then build and run it using the steps below.
MyMaxMin.java
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
public class MyMaxMin {

	// Mapper: reads one line of the weather record, extracts the date and the
	// maximum/minimum temperatures, and emits hot and cold days.
	public static class MaxTemperatureMapper extends
			Mapper<LongWritable, Text, Text, Text> {

		public static final int MISSING = 9999;

		@Override
		public void map(LongWritable arg0, Text Value, Context context)
				throws IOException, InterruptedException {
			String line = Value.toString();
			if (!(line.length() == 0)) {
				String date = line.substring(6, 14);
				float temp_Max = Float.parseFloat(line.substring(39, 45).trim());
				float temp_Min = Float.parseFloat(line.substring(47, 53).trim());
				if (temp_Max > 30.0) {
					// Hot day
					context.write(new Text("The Day is Hot Day :" + date),
							new Text(String.valueOf(temp_Max)));
				}
				if (temp_Min < 15) {
					// Cold day
					context.write(new Text("The Day is Cold Day :" + date),
							new Text(String.valueOf(temp_Min)));
				}
			}
		}
	}

	// Reducer: writes the temperature of each flagged day to the output.
	public static class MaxTemperatureReducer extends
			Reducer<Text, Text, Text, Text> {

		@Override
		public void reduce(Text Key, Iterable<Text> Values, Context context)
				throws IOException, InterruptedException {
			String temperature = Values.iterator().next().toString();
			context.write(Key, new Text(temperature));
		}
	}

	// Driver: configures the job and submits it to the cluster.
	public static void main(String[] args) throws Exception {
		Configuration conf = new Configuration();
		Job job = Job.getInstance(conf, "weather example");
		job.setJarByClass(MyMaxMin.class);
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(Text.class);
		job.setMapperClass(MaxTemperatureMapper.class);
		job.setReducerClass(MaxTemperatureReducer.class);
		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);
		Path outputPath = new Path(args[1]);
		FileInputFormat.addInputPath(job, new Path(args[0]));
		FileOutputFormat.setOutputPath(job, outputPath);
		// Delete the output directory if it already exists
		outputPath.getFileSystem(conf).delete(outputPath, true);
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}
1. Open Eclipse and create a new Java project named MyProject, then add a class
MyMaxMin containing the code above.
2. Download the external Hadoop jar files required to build the project.
3. Now add these external jars to MyProject: right-click on MyProject, select Build Path ->
Configure Build Path, click Add External JARs…, add the jars from their download
location, then click Apply and Close.
4. Now export the project as a jar file: right-click on MyProject, choose Export…, go to
Java -> JAR file, click Next, choose the export destination, then click Next.
5. Choose the main class as MyMaxMin by clicking Browse, then click Finish -> OK.
6. Start Hadoop
a. start-all.sh
7. Move dataset to Hadoop HDFS
a. hdfs dfs -put /file_path /destination
b. hdfs dfs -put /home/hadoop/Downloads/CRND0103-2020-AK_Fairbanks_11_NE.txt /
c. hdfs dfs -ls /
8. Now run the jar file with the command below to produce the output in the MyOutput directory.
a. hadoop jar /home/hadoop/Documents/Project.jar /CRND0103-2020-AK_Fairbanks_11_NE.txt /MyOutput
9. Go to localhost:9870 in the browser.
OUTPUT:
RESULT:
Thus, the implementation of a MapReduce program that processes a weather dataset has been
executed successfully.
Exp No: 4
Implement Linear and Logistic Regression
Date:
AIM:
To implement linear and logistic regression in R.
PROCEDURE:
1. Open R on Windows.
2. Create a new workspace.
3. Create a new script file.
4. Type the code in the script file.
5. Run the script file.
6. Close R.
PROGRAM:
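The following is a minimal sketch of such a program, using the built-in mtcars dataset (an assumed choice of data): lm() fits the linear regression and glm() with a binomial family fits the logistic regression.

> # Linear regression: predict fuel efficiency (mpg) from car weight (wt)
> data(mtcars)
> linear_model <- lm(mpg ~ wt, data = mtcars)
> summary(linear_model)
> plot(mtcars$wt, mtcars$mpg)
> abline(linear_model, col = "red")
> # Logistic regression: predict transmission type (am, 0/1) from weight and horsepower
> logistic_model <- glm(am ~ wt + hp, data = mtcars, family = binomial)
> summary(logistic_model)
> # Predicted probabilities on the training data
> head(predict(logistic_model, type = "response"))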
OUTPUT:
RESULT:
Thus, the implementation of linear and logistic regression has been executed successfully.
Exp No: 5a
Implement SVM Classification Techniques
Date:
AIM:
To implement the SVM classification technique.
PROCEDURE:
1. Open R on Windows.
2. Create a new workspace.
3. Create a new script file.
4. Type the code in the script file.
5. Run the script file.
6. Close R.
PROGRAM:
> library(e1071)
> plot(iris)
> plot(iris$Sepal.Length, iris$Sepal.Width, col=iris$Species)
> plot(iris$Petal.Length, iris$Petal.Width, col=iris$Species)
> s<-sample(150,100)
> col<- c("Petal.Length", "Petal.Width", "Species")
> iris_train<- iris[s,col]
> iris_test<- iris[-s,col]
> svmfit<- svm(Species ~., data = iris_train, kernel = "linear", cost = .1, scale = FALSE)
> print(svmfit)
> plot(svmfit, iris_train[,col])
> tuned <- tune(svm, Species~., data = iris_train, kernel = "linear", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 10, 100)))
> summary(tuned)
> p<-predict(svmfit, iris_test[,col], type="class")
> plot(p)
> table(p,iris_test[,3] )
> mean(p== iris_test[,3])
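As an optional extension (not part of the recorded program), the best model found by tune() can be used for prediction instead of the hand-picked cost:

> best_model <- tuned$best.model
> p_best <- predict(best_model, iris_test[,col])
> mean(p_best == iris_test[,3])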
OUTPUT:
RESULT:
Thus, the implementation of the SVM classification technique has been executed successfully.
Exp No: 5b
Implement Decision Tree Classification Techniques
Date:
AIM:
To implement the decision tree classification technique.
PROCEDURE:
1. Open R on Windows.
2. Create a new workspace.
3. Create a new script file.
4. Type the code in the script file.
5. Run the script file.
6. Close R.
PROGRAM:
> library(MASS)
> library(rpart)
> head(birthwt)
> hist(birthwt$bwt)
> table(birthwt$low)
> cols <- c('low', 'race', 'smoke', 'ht', 'ui')
> birthwt[cols] <- lapply(birthwt[cols], as.factor)
> set.seed(1)
> train<- sample(1:nrow(birthwt), 0.75 * nrow(birthwt))
> birthwtTree<- rpart(low ~ . - bwt, data = birthwt[train, ], method = 'class')
> plot(birthwtTree)
> text(birthwtTree, pretty = 0)
> summary(birthwtTree)
> birthwtPred<- predict(birthwtTree, birthwt[-train, ], type = 'class')
> table(birthwtPred, birthwt[-train, ]$low)
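As an optional extension (not part of the recorded program), the classification accuracy on the held-out rows follows directly from the same predictions:

> mean(birthwtPred == birthwt[-train, ]$low)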
OUTPUT:
RESULT:
Thus, the implementation of the decision tree classification technique has been executed
successfully.
Exp No: 6
Implement Clustering Techniques
Date:
AIM:
To implement clustering techniques.
PROCEDURE:
1. Open R on Windows.
2. Create a new workspace.
3. Create a new script file.
4. Type the code in the script file.
5. Run the script file.
6. Close R.
PROGRAM:
> library(datasets)
> head(iris)
> library(ggplot2)
> ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) + geom_point()
> set.seed(20)
> irisCluster <- kmeans(iris[, 3:4], 3, nstart = 20)
> irisCluster
> table(irisCluster$cluster, iris$Species)
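As an optional extension (not part of the recorded program), the fitted cluster assignments can be plotted the same way as the species plot above for a visual comparison:

> irisCluster$cluster <- as.factor(irisCluster$cluster)
> ggplot(iris, aes(Petal.Length, Petal.Width, color = irisCluster$cluster)) + geom_point()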
OUTPUT:
RESULT:
Thus, the implementation of clustering techniques has been executed successfully.
Exp No: 7
Visualize data using any plotting framework
Date:
AIM:
To visualize data using any plotting framework in R.
PROCEDURE:
1. Open R on Windows.
2. Create a new workspace.
3. Create a new script file.
4. Type the code in the script file.
5. Run the script file.
6. Close R.
PROGRAM:
1. Histogram
> library(RColorBrewer)
> data(VADeaths)
> par(mfrow=c(2,3))
> hist(VADeaths,breaks=10, col=brewer.pal(3,"Set3"),main="Set3 3 colors")
> hist(VADeaths,breaks=3 ,col=brewer.pal(3,"Set2"),main="Set2 3 colors")
> hist(VADeaths,breaks=7, col=brewer.pal(3,"Set1"),main="Set1 3 colors")
> hist(VADeaths, breaks = 2, col=brewer.pal(8,"Set3"), main="Set3 8 colors")
> hist(VADeaths,col=brewer.pal(8,"Greys"),main="Greys 8 colors")
> hist(VADeaths,col=brewer.pal(8,"Greens"),main="Greens 8 colors")\
2. Line Chart
> data(AirPassengers)
> plot(AirPassengers,type="l")
3. Bar Chart
> data("iris")
> barplot(iris$Petal.Length)
> barplot(iris$Sepal.Length,col = brewer.pal(3,"Set1"))
> barplot(table(iris$Species,iris$Sepal.Length),col = brewer.pal(3,"Set1"))
4. Box Plot
> data(iris)
> par(mfrow=c(2,2))
> boxplot(iris$Sepal.Length,col="red")
> boxplot(iris$Sepal.Length~iris$Species,col="red")
> boxplot(iris$Sepal.Length~iris$Species,col=heat.colors(3))
> boxplot(iris$Sepal.Length~iris$Species,col=topo.colors(3))
> boxplot(iris$Petal.Length~iris$Species)
5. Scatter Plot
> plot(x=iris$Petal.Length)
> plot(x=iris$Petal.Length,y=iris$Species)
6. Heat Map
> x <- rnorm(10,mean=rep(1:5,each=2),sd=0.7)
> y <- rnorm(10,mean=rep(c(1,9),each=5),sd=0.1)
> dataFrame<- data.frame(x=x,y=y)
> set.seed(143)
> dataMatrix<-as.matrix(dataFrame)[sample(1:10),]
> heatmap(dataMatrix)
7. Correlogram
> library("corrplot")
> data("mtcars")
> corr_matrix <- cor(mtcars)
> corrplot(corr_matrix)
> corrplot(corr_matrix,method = 'number',type = "lower")
8. Area Chart
> library(dplyr)
> library(ggplot2)
> airquality %>%
+   group_by(Day) %>%
+   summarise(mean_wind = mean(Wind)) %>%
+   ggplot() +
+   geom_area(aes(x = Day, y = mean_wind)) +
+   labs(title = "Area Chart of Average Wind per Day",
+        subtitle = "using airquality data",
+        y = "Mean Wind")
OUTPUT:
RESULT:
Thus, the visualization of data using a plotting framework has been executed successfully.
Exp No: 8
Implement an application that stores big data in HBase / MongoDB / Pig using Hadoop / R
Date:
AIM:
To implement an application that stores big data in MongoDB using R.
PROCEDURE:
1. Open R on Windows.
2. Create a new workspace.
3. Create a new script file.
4. Type the code in the script file.
5. Run the script file.
6. Close R.
PROGRAM:
> library(ggplot2)
> library(mongolite)
> library(dplyr)
> crimes=data.table::fread("crimes.csv")
> connection_string="mongodb://localhost:27017/?tls=false&readPreference=primary"
> my_collection = mongo(collection = "crimes", db = "chicago",url=connection_string)
> my_collection$insert(crimes)
> my_collection$count()
> my_collection$iterate()$one()
> df <- as.data.frame(my_collection$find())
> head(df)
> length(my_collection$distinct("Primary Type"))
> my_collection$aggregate('[{"$group":{"_id":"$Location Description", "Count":{"$sum":1}}}]') %>%
+   na.omit() %>%
+   arrange(desc(Count)) %>% head(10) %>%
+   ggplot(aes(x = reorder(`_id`, Count), y = Count)) +
+   geom_bar(stat = "identity", color = 'skyblue', fill = '#b35900') +
+   geom_text(aes(label = Count), color = "blue") +
+   coord_flip() + xlab("Location Description")
> crimes=my_collection$find('{}', fields = '{"_id":0, "Primary Type":1,"Year":1}')
> crimes %>% group_by(`Primary Type`) %>% summarize(Count = n()) %>% arrange(desc(Count)) %>% head(4)
OUTPUT:
RESULT:
Thus, the implementation of an application that stores big data in MongoDB using R has been
executed successfully.