Machine Learning Tutorial for Java Developers

What is Machine Learning?

Machine learning refers to computer algorithms that are able to adjust their own internal parameters using sample data, in order to estimate or predict something useful for similar data.

The procedure of automatically adjusting internal parameters using sample data is called learning, training, or model building. Once the algorithm has finished training, we get a so-called trained model: a model that can be used to make predictions for new, similar data. The sample data used for training is called the training set.
So what can it learn? The most commonly used type of machine learning, called supervised learning, can learn to predict a label or a numeric value for a given input. You can think of it as learning from examples, where the examples are given as a set of (input, prediction) pairs within the training set. This fundamentally changes the way we use computers: instead of telling them how to do something, we’re showing them what to do by giving them examples [1].
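
To make "adjusting internal parameters using sample data" more concrete, here is a minimal plain-Java sketch that does not use any library. It assumes the simplest possible model, y = w * x + b, and one common training procedure, gradient descent on the mean squared error. The class name, method names, learning rate, and sample values are all made up for illustration.

```java
// Minimal sketch: "training" here means adjusting the parameters w and b
// so that w * x + b matches the (input, prediction) pairs in the training set.
final class LinearFitSketch {

    /** Fits y ~ w * x + b by gradient descent on mean squared error; returns {w, b}. */
    static double[] fit(double[] xs, double[] ys, double learningRate, int epochs) {
        double w = 0.0;
        double b = 0.0;
        int n = xs.length;
        for (int epoch = 0; epoch < epochs; epoch++) {
            double gradW = 0.0;
            double gradB = 0.0;
            for (int i = 0; i < n; i++) {
                double error = (w * xs[i] + b) - ys[i]; // prediction minus expected value
                gradW += 2 * error * xs[i] / n;
                gradB += 2 * error / n;
            }
            // Move each parameter a small step in the direction that reduces the error.
            w -= learningRate * gradW;
            b -= learningRate * gradB;
        }
        return new double[] {w, b};
    }

    /** Uses the trained model (the learned w and b) on a new input. */
    static double predict(double[] model, double x) {
        return model[0] * x + model[1];
    }

    public static void main(String[] args) {
        // Training set: examples generated from y = 2x + 1.
        double[] xs = {1, 2, 3, 4};
        double[] ys = {3, 5, 7, 9};
        double[] model = fit(xs, ys, 0.05, 5000);
        System.out.printf("w=%.3f, b=%.3f, prediction for x=5: %.3f%n",
                model[0], model[1], predict(model, 5));
    }
}
```

Nobody tells the program that the rule is "multiply by 2 and add 1". It only sees the four examples, and the learned w and b should end up close to 2 and 1.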

Machine Learning as a black box

What can it do?

The predictive tasks most often performed in practice by supervised machine learning models are:

  • Classification – when you want to predict a category (a class label) for the given inputs. For example, predict whether or not a user is going to click on an ad. It answers the question of which category something belongs to, based on a given set of attributes. If the desired prediction is yes/no or some other qualitative value, it is a classification task.
  • Regression – when you want to predict a continuous numeric value for the given inputs. For example, predict sales for a given marketing budget. It answers the question of what the value of something will be, given the value of something related. If the desired prediction is a numeric value, it is a regression task.

Note that there are also other types of tasks which are not included in this quick introductory overview, but the general principles described here are the same.
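
The difference between the two tasks shows up in the type of the prediction. The following plain-Java sketch contrasts them using the two examples above. The parameter values are hand-picked and hypothetical (a real model would learn them from a training set), and the input attributes for the click example are assumptions chosen only for illustration.

```java
// Sketch of the two task types. All numbers are hypothetical, hand-picked
// parameters, standing in for values a trained model would have learned.
final class TaskTypesSketch {

    /** Regression: the prediction is a continuous numeric value. */
    static double predictSales(double marketingBudget) {
        return 1.8 * marketingBudget + 500.0;
    }

    /** Intermediate step for classification: a probability between 0 and 1. */
    static double clickProbability(double secondsOnPage, double previousClicks) {
        double score = 0.03 * secondsOnPage + 0.9 * previousClicks - 2.0;
        return 1.0 / (1.0 + Math.exp(-score)); // logistic (sigmoid) function
    }

    /** Classification: the prediction is a class label. */
    static String predictClick(double secondsOnPage, double previousClicks) {
        return clickProbability(secondsOnPage, previousClicks) >= 0.5 ? "click" : "no click";
    }

    public static void main(String[] args) {
        System.out.println("Predicted sales: " + predictSales(1000));
        System.out.println("Predicted label: " + predictClick(60, 2));
    }
}
```

If the method you need returns a number, you are looking at a regression task. If it returns one of a fixed set of labels, it is a classification task.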

Example Code

Here is example code that shows how to train a neural network for a regression task using a feed forward neural network from the Deep Netts library. This type of machine learning model can be used to roughly predict [give example use case]. More details about linear regression and feed forward networks are explained in the following post. For now, take a look at the code for the simplest example of loading data from a CSV file and training a machine learning model.

// Load data from CSV file
DataSet trainingSet = DataSets.readCsv("fileName.csv", inputsNum, outputsNum);
 
// Create a feed forward neural network using builder
FeedForwardNetwork neuralNet = FeedForwardNetwork.builder()
                               .addInputLayer(inputsNum)
                               .addOutputLayer(outputsNum, ActivationType.LINEAR)
                               .lossFunction(LossType.MEAN_SQUARED_ERROR)
                               .build();
 
// Train network
neuralNet.train(trainingSet);
 
// Use the trained model (neural network) for prediction
neuralNet.setInput(someNewInput);
float[] prediction = neuralNet.getOutput();

Now that you’re familiar with the basic concepts, you can see how the Deep Netts API provides a very easy and intuitive way to use machine learning in Java:

  • [Line 2]: Class DataSet holds the data that is used for training a machine learning algorithm – a training set.
  • [Line 2]: Class DataSets provides utility methods for working with data sets. One of them is readCsv, which loads data from a CSV file and returns an instance of DataSet that is used as the training set.
  • [Lines 5-9]: Class FeedForwardNetwork provides a widely used type of machine learning model, which can be used for both classification and regression problems. It provides a builder through the static builder() method, which is used to specify various settings of the neural network.
  • [Line 12]: FeedForwardNetwork also provides the train method, which performs the training/learning procedure on the feed forward network using the given trainingSet parameter.
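
To give a feeling for what the inputsNum and outputsNum parameters mean, here is a plain-Java sketch of how one CSV row could be split into an (input, prediction) pair. This is not the Deep Netts implementation. It assumes that the first inputsNum columns of each row are inputs and the following outputsNum columns are the expected outputs, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: illustrates the assumed column layout of a training CSV file,
// not how Deep Netts actually reads it.
final class CsvSplitSketch {

    /** One training example: input values and the expected output values. */
    static final class Example {
        final float[] inputs;
        final float[] outputs;

        Example(float[] inputs, float[] outputs) {
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /** Splits one comma-separated row into inputs (first columns) and outputs (last columns). */
    static Example parseRow(String line, int inputsNum, int outputsNum) {
        String[] cells = line.split(",");
        if (cells.length != inputsNum + outputsNum) {
            throw new IllegalArgumentException(
                    "Expected " + (inputsNum + outputsNum) + " columns but found " + cells.length);
        }
        float[] inputs = new float[inputsNum];
        float[] outputs = new float[outputsNum];
        for (int i = 0; i < inputsNum; i++) {
            inputs[i] = Float.parseFloat(cells[i].trim());
        }
        for (int i = 0; i < outputsNum; i++) {
            outputs[i] = Float.parseFloat(cells[inputsNum + i].trim());
        }
        return new Example(inputs, outputs);
    }

    /** Converts all rows into a list of examples, which plays the role of a training set. */
    static List<Example> parse(List<String> lines, int inputsNum, int outputsNum) {
        List<Example> examples = new ArrayList<>();
        for (String line : lines) {
            examples.add(parseRow(line, inputsNum, outputsNum));
        }
        return examples;
    }
}
```

For a file with two input columns and one output column, a row such as "0.5, 1.5, 3.0" would become the inputs {0.5, 1.5} and the expected output {3.0}.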

Note: Links are still under construction.

Next

Introduction to Deep Learning: from linear regression to convolutional networks

Key concepts: Machine Learning, Classification, Regression, Model, Training Set, FeedForwardNetwork

Additional content

Slides from Deep Java Dev Meetup on this topic.

References
[1] Geoffrey Hinton, in an interview with Andrew Ng.