Iris Data Set Project

In today's article, we will carry out our first artificial neural network project. We'll have a chance to put into practice the theoretical knowledge we've learned so far. In this project we will use the famous iris data set. This data set contains measured characteristics of setosa, versicolor and virginica, which are 3 different iris flower species.

Before starting our project, we should install the scikit-learn (sklearn) library in addition to the libraries we installed in our Library Installations article. Using this library, we bring our data into a form that our artificial neural network can work with. You can install it via Anaconda Navigator, or via the Anaconda terminal by running the conda install -c anaconda scikit-learn command. We will carry out our project in 3 sections: data preprocessing, building the structure of the artificial neural network model, and training of the model.
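
To quickly verify the installation, you can print the library version from Python (a minimal check; the exact version number will depend on your environment):

import sklearn
print(sklearn.__version__)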

Data Preprocessing

First, we download the zip containing our data from https://gist.github.com/netj/8836201 and copy the iris.csv file inside it into our working folder. We're ready to write code now. After importing the necessary libraries, we read our csv file with the help of the pandas library. We select the following block of code in Spyder and run it with F9.

import pandas as pd
import numpy as np
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense

# We're reading our csv file.
dataset = pd.read_csv("iris.csv")

After running this block of code, we can examine our data by double-clicking on the dataset variable in Variable Explorer.

[Image: Data Set]

There are 150 examples in our data set. Each example contains the length and width of the sepal, the length and width of the petal, and which class it belongs to. Next, we separate the inputs (x) and outputs (y) of the data set. We perform this separation with iloc. When indexing with iloc, the part to the left of the comma refers to our rows and the part to the right refers to our columns. With ":" we say take all the rows, and with "[0,1,2,3]" we say take the first, second, third and fourth columns. In this way, we obtain our input values. With the notation [:,4] we take the entire fifth column, which means we separate out our output values. After you run the following block of code, you can again examine the x and y variables in Variable Explorer.

# we separate inputs and outputs.
x = dataset.iloc[:,[0,1,2,3]]
y = dataset.iloc[:,4]
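
The same separation can also be written with slice notation; a small equivalent sketch that additionally checks the resulting shapes:

# equivalent selection using slice notation
x = dataset.iloc[:, :4]    # all rows, first four columns (inputs)
y = dataset.iloc[:, 4]     # all rows, fifth column (outputs)
print(x.shape, y.shape)    # expected: (150, 4) and (150,)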

As you can see, our outputs are not the kind of data that our artificial neural network can process. We must convert this data to numerical data and keep it in one-hot encoded form, since we are doing multi-class classification. For example, [1,0,0] represents the setosa class. First, we need to convert our data into numerical values. To do this, we use the LabelEncoder class from the preprocessing module of the sklearn library. We create an object from this class and convert our outputs into numerical values with the fit_transform function of this object. After you run the following block of code, you can examine the y variable in Variable Explorer.

label_encoder = preprocessing.LabelEncoder()
y = label_encoder.fit_transform(y)
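
If you want to see which class name received which number, you can print the encoder's classes_ attribute (a quick check; the exact strings depend on the csv file):

print(label_encoder.classes_)   # the original class names, in the order of their codes 0, 1, 2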

When we examine the y variable, we see that the setosa class has become 0, the versicolor class 1, and the virginica class 2. This is still not sufficient for us; we must additionally perform the one-hot encoding process. This time, we create our object from the OneHotEncoder class and perform one-hot encoding with the help of the fit_transform function.

onehot_encoder = preprocessing.OneHotEncoder()
y = onehot_encoder.fit_transform(y[:,np.newaxis]).toarray()

As a result, we bring our output values into the form we want.

[Image: One-Hot Encoded Outputs]
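
As a side note, the same one-hot encoding can be obtained directly from the label-encoded values; a minimal sketch, assuming the to_categorical helper is available in your Keras version:

from keras.utils import to_categorical
y = to_categorical(y)   # converts 0/1/2 into [1,0,0], [0,1,0], [0,0,1]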

Our next step will be to scale our input values. With scaling, we compress our input data into the same value range, which accelerates the process of reaching the minimum point of our model's loss function. As an example, suppose we have a car data set and want to predict the brand of a car from its horsepower and weight. Let the weight values of the cars be in the range of 1500 kg to 4000 kg, and the horsepower in the range of 50 to 180 hp. Below you can examine the graph of our data and loss function.

[Image: Unscaled Data]

When we examine the graph, we notice that our data is spread out irregularly. Here we can compare our loss function to a large, flat plate, whereas we would prefer a smooth, bowl-shaped loss function so that it is easier to reach its minimum value. Below is the chart of the scaled, smooth data distribution.

[Image: Scaled Data]

When performing the scaling, we create an object from the StandardScaler class of the sklearn library and carry out the transformation with its fit_transform function.

scaler = preprocessing.StandardScaler()
x = scaler.fit_transform(x)
[Image: Scaled Values]
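
You can check that the scaling behaved as expected; StandardScaler standardizes each column to roughly zero mean and unit standard deviation:

# each column should now have mean ≈ 0 and standard deviation ≈ 1
print(x.mean(axis=0).round(2))
print(x.std(axis=0).round(2))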

As the final step of the data preprocessing process, we will split the data we have prepared into a training set and a test set. In this way, we will be able to evaluate the success of our trained model on a test set it has never seen before. In addition, we can draw some conclusions about our model by comparing the success values on the training and test sets. For example, if a model that achieves very good success on the training set performs very poorly on the test set, we can suspect overfitting and try to solve that problem. We will address these issues in detail in our following articles. We make use of the train_test_split function of the sklearn library to split the data set. As parameters, we give our x value, our y value, 0.3 for test_size (reserve 30% of the data set for the test set) and 2 for random_state. Since the split is random, the random_state parameter lets us obtain the same split every time. As a result, there are 105 samples in our training set and 45 samples in our test set.

x_egitim, x_test, y_egitim, y_test = train_test_split(x, y, test_size = 0.3, random_state=2)
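
A quick sanity check on the split (the shapes should reflect the 105/45 split mentioned above):

print(x_egitim.shape, x_test.shape)   # expected: (105, 4) (45, 4)
print(y_egitim.shape, y_test.shape)   # expected: (105, 3) (45, 3)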

Creating an Artificial Neural Network Model

We will create our artificial neural network model very easily with the help of the keras library. Since there are 4 different features in our inputs, we will have a 4-neuron input layer. Our hidden layer will consist of 8 neurons and we will use ReLU as its activation function. Since we have 3 classes, our output layer will consist of 3 neurons and we will use the softmax function because we are doing multi-class classification. You can examine the structure of the model that we will create in the picture below.

[Image: Structure of the artificial neural network model]

First of all, we determine the number of neurons of our input layer and the number of neurons of our output layer.

girdi_sayisi = x_egitim.shape[1]
sinif_sayisi = y_egitim.shape[1]

Then we create an object of the Sequential class that will form the basis of our model. We will build our layers on this foundation.

model = Sequential()

model.add(Dense(8,input_dim = girdi_sayisi, activation = 'relu'))
model.add(Dense(sinif_sayisi, activation = 'softmax'))

model.summary()

With the add function, we add our layers in order. When adding our input layer, we give the number of neurons in the hidden layer it connects to, the number of neurons in the input layer (input_dim), and the activation function to be used in the hidden layer as parameters to the Dense object, and we pass this Dense object to our add function. Now that we've added our input and hidden layers, we add our output layer. We give the number of classes as the number of neurons and softmax as the activation function. Finally, with the summary function we get a summary of our model as output.

[Image: Model Summary]

We see that there are a total of 67 parameters to be trained in our model. Of these, 56 are the weight values of neurons and the remaining 11 are the threshold values of neurons.
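
You can verify these numbers with a quick calculation: each layer has one weight per input for every neuron, plus one threshold (bias) per neuron.

# input layer (4) -> hidden layer (8): 4*8 weights + 8 biases = 40 parameters
# hidden layer (8) -> output layer (3): 8*3 weights + 3 biases = 27 parameters
print(4*8 + 8 + 8*3 + 3)   # 67 parameters in total (56 weights + 11 biases)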

Training and Testing of the Model

Before we start training our model, we must determine the loss function that our model will use and the optimization technique that we will use to minimize this loss function. Since we are doing multi-class classification, we choose categorical cross entropy as the loss function. As the optimization technique, we choose mini-batch gradient descent. In addition, we choose accuracy, another measure of our model's success, as a metric. Apart from accuracy there are other metrics, and we will examine them in more detail in our following articles. We pass these selected values to the compile function. We carry out the training of our model with the fit function. As parameters to this function, we give the x and y values that we have reserved for training. We also give how many epochs we will train our model for, and we choose this value as 70. Finally, we specify after how many samples the model will perform an optimization step during training, that is, the batch_size. We chose this as 16.

model.compile(optimizer='sgd',loss="categorical_crossentropy",metrics=['accuracy'])
model.fit(x_egitim, y_egitim, batch_size=16, epochs=70)

After we run this code block, our training begins. You can track the duration and progress of the model as output in the console. In the first epoch, the loss value of our model is 1.2774 and the accuracy value on the training set is 0.20. It's a very unsuccessful model. By the 70th epoch, the loss value of our model has decreased to 0.4834 and the accuracy value has increased to 0.9238. With an accuracy of 92%, our model performs very well on the training set.

[Image: Training Process]
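
If you want to visualize the training progress, you can assign the return value of the fit call to a variable; a minimal sketch, assuming matplotlib is installed (gecmis is simply the name we give here to the returned History object):

import matplotlib.pyplot as plt

gecmis = model.fit(x_egitim, y_egitim, batch_size=16, epochs=70)
plt.plot(gecmis.history['loss'], label='loss')
plt.plot(gecmis.history['accuracy'], label='accuracy')   # the key may be 'acc' in older Keras versions
plt.xlabel('epoch')
plt.legend()
plt.show()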

Our model also needs to perform successfully on data it has never seen before. For this, we have separated our test data. Finally, we can measure the success of our model in the test set.

sonuc = model.evaluate(x_test, y_test, verbose=0)
print("Test loss value : ", sonuc[0])
print("Test accuracy value : ",sonuc[1])

Our model is also very successful on the test data, with an accuracy rate of 92.3%.

[Image: Test Results]
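
Beyond the aggregate metrics, you can also inspect individual predictions; a small sketch (tahminler is just a name chosen here for the prediction array, and argmax converts the softmax outputs back into class indices):

tahminler = model.predict(x_test)       # softmax probabilities for each test sample
print(np.argmax(tahminler, axis=1))     # predicted class indices (0, 1, 2)
print(np.argmax(y_test, axis=1))        # actual class indices for comparison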

We have successfully completed our first artificial neural network project. You can find all the code we wrote below as a single block.

import pandas as pd
import numpy as np
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense

# We're reading our csv file.
dataset = pd.read_csv("iris.csv")

# we separate inputs and outputs.
x = dataset.iloc[:,[0,1,2,3]]
y = dataset.iloc[:,4]

# we convert our outputs into numerical values.

label_encoder = preprocessing.LabelEncoder()
y = label_encoder.fit_transform(y)

# We perform one-hot encoding.
onehot_encoder = preprocessing.OneHotEncoder()
y = onehot_encoder.fit_transform(y[:,np.newaxis]).toarray()

# we perform the scaling process.
scaler = preprocessing.StandardScaler()
x = scaler.fit_transform(x)

# splitting into training and test sets
x_egitim, x_test, y_egitim, y_test = train_test_split(x, y, test_size = 0.3, random_state=2)

girdi_sayisi = x_egitim.shape[1]
sinif_sayisi = y_egitim.shape[1]

# We're building our model. 
model = Sequential()

model.add(Dense(8,input_dim = girdi_sayisi, activation = 'relu'))
model.add(Dense(sinif_sayisi, activation = 'softmax'))

model.summary()

# We're doing the training.
model.compile(optimizer='sgd',loss="categorical_crossentropy",metrics=['accuracy'])
model.fit(x_egitim, y_egitim, batch_size=16, epochs=70)

# test process
sonuc = model.evaluate(x_test, y_test, verbose=0)
print("Test loss value : ", sonuc[0])
print("Test accuracy value : ",sonuc[1])