Merge pull request #283 from jsbean/Swift3LinearRegression · kodecocodes/swift-algorithm-club@5549e1b · GitHub


Commit 5549e1b

Merge pull request kodecocodes#283 from jsbean/Swift3LinearRegression
Swift 3 Migration: Linear Regression
2 parents 91e39bc + b5bda5b commit 5549e1b

File tree

2 files changed: +27 −26 lines changed

Linear Regression/LinearRegression.playground/Contents.swift

Lines changed: 10 additions & 10 deletions

```diff
@@ -7,7 +7,7 @@ let carPrice: [Double] = [500, 400, 7000, 8500, 11000, 10500]
 var intercept = 0.0
 var slope = 0.0

-func predictedCarPrice(carAge: Double) -> Double {
+func predictedCarPrice(_ carAge: Double) -> Double {
     return intercept + slope * carAge
 }

@@ -29,20 +29,20 @@ print("A car age of 4 years is predicted to be worth £\(Int(predictedCarPrice(4

 // A closed form solution

-func average(input: [Double]) -> Double {
-    return input.reduce(0, combine: +) / Double(input.count)
+func average(_ input: [Double]) -> Double {
+    return input.reduce(0, +) / Double(input.count)
 }

-func multiply(input1: [Double], _ input2: [Double]) -> [Double] {
-    return input1.enumerate().map({ (index, element) in return element*input2[index] })
+func multiply(_ a: [Double], _ b: [Double]) -> [Double] {
+    return zip(a,b).map(*)
 }

-func linearRegression(xVariable: [Double], _ yVariable: [Double]) -> (Double -> Double) {
-    let sum1 = average(multiply(xVariable, yVariable)) - average(xVariable) * average(yVariable)
-    let sum2 = average(multiply(xVariable, xVariable)) - pow(average(xVariable), 2)
+func linearRegression(_ xs: [Double], _ ys: [Double]) -> (Double) -> Double {
+    let sum1 = average(multiply(xs, ys)) - average(xs) * average(ys)
+    let sum2 = average(multiply(xs, xs)) - pow(average(xs), 2)
     let slope = sum1 / sum2
-    let intercept = average(yVariable) - slope * average(xVariable)
-    return { intercept + slope * $0 }
+    let intercept = average(ys) - slope * average(xs)
+    return { x in intercept + slope * x }
 }

 let result = linearRegression(carAge, carPrice)(4)
```
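The migrated closed-form API can be exercised end-to-end. Here is a minimal sketch using illustrative data rather than the playground's actual `carAge`/`carPrice` arrays, which are only partially visible in this diff:

```swift
import Foundation  // for pow

// Swift 3 helpers as they appear on the "+" side of the hunk above.
func average(_ input: [Double]) -> Double {
    return input.reduce(0, +) / Double(input.count)
}

func multiply(_ a: [Double], _ b: [Double]) -> [Double] {
    return zip(a, b).map(*)
}

func linearRegression(_ xs: [Double], _ ys: [Double]) -> (Double) -> Double {
    let sum1 = average(multiply(xs, ys)) - average(xs) * average(ys)
    let sum2 = average(multiply(xs, xs)) - pow(average(xs), 2)
    let slope = sum1 / sum2
    let intercept = average(ys) - slope * average(xs)
    return { x in intercept + slope * x }
}

// Illustrative points lying exactly on y = 2x + 1,
// so the fit recovers slope 2 and intercept 1 exactly.
let predict = linearRegression([1, 2, 3, 4], [3, 5, 7, 9])
print(predict(10))  // 21.0
```

Returning `(Double) -> Double` is what lets callers write `linearRegression(carAge, carPrice)(4)`, as in the playground's last line.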

Linear Regression/README.markdown

Lines changed: 17 additions & 16 deletions
````diff
@@ -30,7 +30,7 @@ We can describe the straight line in terms of two variables:

 This is the equation for our line:

-carPrice = slope * carAge + intercept
+`carPrice = slope * carAge + intercept`


 How can we find the best values for the intercept and the slope? Let's look at two different ways to do this.
@@ -50,18 +50,19 @@ This is how we can represent our straight line:
 ```swift
 var intercept = 0.0
 var slope = 0.0
-func predictedCarPrice(carAge: Double) -> Double {
+func predictedCarPrice(_ carAge: Double) -> Double {
     return intercept + slope * carAge
 }
+
 ```
 Now for the code which will perform the iterations:

 ```swift
 let numberOfCarAdvertsWeSaw = carPrice.count
-let iterations = 2000
+let numberOfIterations = 100
 let alpha = 0.0001

-for n in 1...iterations {
+for n in 1...numberOfIterations {
   for i in 0..<numberOfCarAdvertsWeSaw {
     let difference = carPrice[i] - predictedCarPrice(carAge[i])
     intercept += alpha * difference
````
```diff
@@ -74,7 +75,7 @@ for n in 1...iterations {

 The program loops through each data point (each car age and car price). For each data point it adjusts the intercept and the slope to bring them closer to the correct values. The equations used in the code to adjust the intercept and the slope are based on moving in the direction of the maximal reduction of these variables. This is a *gradient descent*.

-We want to minimse the square of the distance between the line and the points. We define a function J which represents this distance - for simplicity we consider only one point here. This function J is proprotional to ((slope.carAge+intercept) - carPrice))^2
+We want to minimse the square of the distance between the line and the points. We define a function `J` which represents this distance - for simplicity we consider only one point here. This function `J` is proprotional to `((slope.carAge + intercept) - carPrice)) ^ 2`.

 In order to move in the direction of maximal reduction, we take the partial derivative of this function with respect to the slope, and similarly for the intercept. We multiply these derivatives by our factor alpha and then use them to adjust the values of slope and intercept on each iteration.

```
````diff
@@ -104,17 +105,17 @@ There is another way we can calculate the line of best fit, without having to do
 First we need some helper functions. This one calculates the average (the mean) of an array of Doubles:

 ```swift
-func average(input: [Double]) -> Double {
-    return input.reduce(0, combine: +) / Double(input.count)
+func average(_ input: [Double]) -> Double {
+    return input.reduce(0, +) / Double(input.count)
 }
 ```
 We are using the ```reduce``` Swift function to sum up all the elements of the array, and then divide that by the number of elements. This gives us the mean value.

 We also need to be able to multiply each element in an array by the corresponding element in another array, to create a new array. Here is a function which will do this:

 ```swift
-func multiply(input1: [Double], _ input2: [Double]) -> [Double] {
-    return input1.enumerate().map({ (index, element) in return element*input2[index] })
+func multiply(_ a: [Double], _ b: [Double]) -> [Double] {
+    return zip(a,b).map(*)
 }
 ```

````
````diff
@@ -123,15 +124,15 @@ We are using the ```map``` function to multiply each element.
 Finally, the function which fits the line to the data:

 ```swift
-func linearRegression(xVariable: [Double], _ yVariable: [Double]) -> (Double -> Double) {
-    let sum1 = average(multiply(xVariable, yVariable)) - average(xVariable) * average(yVariable)
-    let sum2 = average(multiply(xVariable, xVariable)) - pow(average(xVariable), 2)
+func linearRegression(_ xs: [Double], _ ys: [Double]) -> (Double) -> Double {
+    let sum1 = average(multiply(ys, xs)) - average(xs) * average(ys)
+    let sum2 = average(multiply(xs, xs)) - pow(average(xs), 2)
     let slope = sum1 / sum2
-    let intercept = average(yVariable) - slope * average(xVariable)
-    return { intercept + slope * $0 }
+    let intercept = average(ys) - slope * average(xs)
+    return { x in intercept + slope * x }
 }
 ```
-This function takes as arguments two arrays of Doubles, and returns a function which is the line of best fit. The formulas to calculate the slope and the intercept can be derived from our definition of the function J. Let's see how the output from this line fits our data:
+This function takes as arguments two arrays of Doubles, and returns a function which is the line of best fit. The formulas to calculate the slope and the intercept can be derived from our definition of the function `J`. Let's see how the output from this line fits our data:

 ![graph3](Images/graph3.png)

@@ -145,4 +146,4 @@ Well, the line we've found doesn't fit the data perfectly. For one thing, the gr

 It turns out that in some of these more complicated models, the iterative approach is the only viable or efficient approach. This can also occur when the arrays of data are very large and may be sparsely populated with data values.

-*Written for Swift Algorithm Club by James Harrop*
+*Written for Swift Algorithm Club by James Harrop*
````

0 commit comments