Package deepnetts.data
Class TabularDataSet<T extends MLDataItem>
java.lang.Object
javax.visrec.ml.data.BasicDataSet<T>
deepnetts.data.TabularDataSet<T>
-
Nested Class Summary
Nested Classes
Modifier and Type: static class
Description: Represents a basic data set item (single row) with input tensor and target vector in a data set.
-
Constructor Summary
Constructors
TabularDataSet(int numInputs, int numOutputs)
    Creates a new TabularDataSet instance with the specified numbers of input and output features.
-
Method Summary
Modifier and Type, Method, Description
int[] countMissingValues()
int countMissingValues(int colIdx)
String[] getColumnNames()
int getNumInputs()
int getNumOutputs()
String[] getTargetColumnsNames()
boolean[] hasMissingValues()
boolean hasMissingValues(int colIdx)
void setColumnNames(String[] columnNames)
void shuffle()
    Shuffles the data set items using the default random generator.
void shuffle(int seed)
    Shuffles the data set items using a Java random generator initialized with the specified seed.
javax.visrec.ml.data.DataSet[] split(double... parts)
    Splits the data set into several parts whose sizes are given by the parts parameter.
javax.visrec.ml.data.DataSet[] split(int parts)
    Splits the data set into the specified number of equal-sized parts.
trainTestSplit(double splitRatio)
Methods inherited from class javax.visrec.ml.data.BasicDataSet
getColumns, getItems, setAsTargetColumns, setAsTargetColumns, setColumns
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface javax.visrec.ml.data.DataSet
add, addAll, clear, get, isEmpty, iterator, shuffle, size, split, split, split, stream
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Details
-
TabularDataSet
public TabularDataSet(int numInputs, int numOutputs)
Creates a new TabularDataSet instance with the specified numbers of input and output features.
Parameters:
numInputs - number of input features
numOutputs - number of output features
-
-
Method Details
-
getNumInputs
public int getNumInputs()
-
getNumOutputs
public int getNumOutputs()
-
split
public javax.visrec.ml.data.DataSet[] split(int parts)
Splits the data set into the specified number of equal-sized parts. Utility method used during cross-validation.
Note: this could be a default method.
Parameters:
parts - number of parts to split the data set into
Returns:
the parts of the data set
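An equal-sized split is the building block of k-fold cross-validation. The following is a minimal, library-independent sketch of the idea, not the Deep Netts implementation. The names EqualSplitSketch and splitEqually are hypothetical, and giving leftover items to the last part is an assumption, since this page does not say how sizes that do not divide evenly are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Deep Netts): splits a list into equal-sized parts. */
class EqualSplitSketch {

    static <T> List<List<T>> splitEqually(List<T> items, int parts) {
        if (parts < 1 || parts > items.size()) {
            throw new IllegalArgumentException("parts must be between 1 and the number of items");
        }
        int partSize = items.size() / parts;
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            int from = i * partSize;
            // Assumption: the last part absorbs any remainder.
            int to = (i == parts - 1) ? items.size() : from + partSize;
            result.add(new ArrayList<>(items.subList(from, to)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        List<List<Integer>> folds = splitEqually(rows, 5);
        // In cross-validation, each fold is held out once while the others are used for training.
        for (int k = 0; k < folds.size(); k++) {
            System.out.println("fold " + k + " held out for validation: " + folds.get(k));
        }
    }
}
```

In practice the data set would usually be shuffled before splitting so that each part is a representative sample.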
-
trainTestSplit
-
split
public javax.visrec.ml.data.DataSet[] split(double... parts)
Splits the data set into several parts whose sizes are given by the parts parameter. Each value is the size of one returned part, expressed as a fraction of the whole data set (for example, 0.7 for 70%). Values must be greater than zero, and their sum must be 1.
Specified by:
split in interface javax.visrec.ml.data.DataSet<T extends MLDataItem>
Overrides:
split in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
Parameters:
parts - sizes of the parts, as fractions of the data set
Returns:
parts of the data set of the specified sizes
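The constraints on the part sizes (each greater than zero, summing to 1) can be illustrated with a small stand-alone sketch. It is not the Deep Netts implementation: FractionSplitSketch and splitByFractions are hypothetical names, the rounding of boundaries and the floating-point tolerance of 1e-9 are assumptions, and a plain List stands in for the data set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Deep Netts): splits a list by fractional part sizes. */
class FractionSplitSketch {

    static <T> List<List<T>> splitByFractions(List<T> items, double... parts) {
        double sum = 0;
        for (double p : parts) {
            if (p <= 0) {
                throw new IllegalArgumentException("Part sizes must be greater than zero: " + p);
            }
            sum += p;
        }
        // Assumption: a small tolerance absorbs floating-point rounding in the sum.
        if (Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("Part sizes must sum to 1, got " + sum);
        }

        List<List<T>> result = new ArrayList<>();
        int from = 0;
        double cumulative = 0;
        for (int i = 0; i < parts.length; i++) {
            cumulative += parts[i];
            // The last part always ends at the final item so that nothing is lost to rounding.
            int to = (i == parts.length - 1)
                    ? items.size()
                    : (int) Math.round(cumulative * items.size());
            result.add(new ArrayList<>(items.subList(from, to)));
            from = to;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        // 60% training, 20% validation, 20% test
        List<List<Integer>> split = splitByFractions(rows, 0.6, 0.2, 0.2);
        for (List<Integer> part : split) {
            System.out.println(part.size() + " items: " + part);
        }
    }
}
```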
-
shuffle
public void shuffle()
Shuffles the data set items using the default random generator. The default random generator can be initialized independently.
-
shuffle
public void shuffle(int seed)
Shuffles the data set items using a Java random generator initialized with the specified seed.
Parameters:
seed - a seed number to initialize the random generator
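Shuffling with a fixed seed makes the item order reproducible across runs, which is useful when comparing training experiments. The sketch below shows the general technique with the standard java.util.Random and Collections.shuffle; it is an assumption that TabularDataSet works along these lines, and SeededShuffleSketch and shuffled are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not part of Deep Netts): reproducible shuffling with a seed. */
class SeededShuffleSketch {

    /** Returns a shuffled copy of items; the same seed always produces the same order. */
    static <T> List<T> shuffled(List<T> items, int seed) {
        List<T> copy = new ArrayList<>(items);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d", "e", "f");
        List<String> first = shuffled(rows, 123);
        List<String> second = shuffled(rows, 123);
        System.out.println("first run:  " + first);
        System.out.println("second run: " + second);
        System.out.println("same order: " + first.equals(second));
    }
}
```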
-
getColumnNames
Overrides:
getColumnNames in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
-
setColumnNames
Overrides:
setColumnNames in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
-
getTargetColumnsNames
Specified by:
getTargetColumnsNames in interface javax.visrec.ml.data.DataSet<T extends MLDataItem>
Overrides:
getTargetColumnsNames in class javax.visrec.ml.data.BasicDataSet<T extends MLDataItem>
-
hasMissingValues
public boolean hasMissingValues(int colIdx)
-
hasMissingValues
public boolean[] hasMissingValues()
-
countMissingValues
public int countMissingValues(int colIdx)
-
countMissingValues
public int[] countMissingValues()
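This page gives no descriptions for the missing-value methods, so the following is only a sketch of what per-column checks of this kind typically compute. It assumes the data is a plain float[][] table and that a missing value is encoded as NaN; neither assumption is confirmed for TabularDataSet. MissingValuesSketch and its method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Deep Netts): per-column missing-value checks. */
class MissingValuesSketch {

    /** Counts rows whose value in column colIdx is missing (assumed to be NaN). */
    static int countMissingInColumn(float[][] rows, int colIdx) {
        int count = 0;
        for (float[] row : rows) {
            if (Float.isNaN(row[colIdx])) {
                count++;
            }
        }
        return count;
    }

    /** Returns the missing-value count for every column. */
    static int[] countMissingPerColumn(float[][] rows) {
        if (rows.length == 0) {
            return new int[0];
        }
        int[] counts = new int[rows[0].length];
        for (int col = 0; col < counts.length; col++) {
            counts[col] = countMissingInColumn(rows, col);
        }
        return counts;
    }

    /** Returns true if column colIdx contains at least one missing value. */
    static boolean hasMissing(float[][] rows, int colIdx) {
        return countMissingInColumn(rows, colIdx) > 0;
    }

    public static void main(String[] args) {
        float[][] rows = {
            {1f, Float.NaN, 3f},
            {Float.NaN, Float.NaN, 6f},
            {7f, 8f, 9f}
        };
        System.out.println("missing per column: " + Arrays.toString(countMissingPerColumn(rows)));
        System.out.println("column 2 has missing values: " + hasMissing(rows, 2));
    }
}
```

Counts like these are typically inspected before training to decide whether a column needs imputation or should be dropped.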
-