controller

Provides building blocks for writing a complete prediction engine consisting of DataSource, Preparator, Algorithm, Serving, and Evaluation.

Start Building an Engine

The starting point of a prediction engine is the Engine class.

The DASE Paradigm

The building blocks together form the DASE paradigm. Learn more about DASE here.
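As a rough illustration of how the stages hand data to one another, the sketch below chains a data source, a preparator, an algorithm, and a serving step using only the Scala standard library. Every name in it (DaseSketch, Query, Prediction, readTraining, and so on) is a hypothetical stand-in chosen for this example. None of them are PredictionIO classes, and the real building blocks operate on Spark data rather than plain collections.

```scala
object DaseSketch {
  // Hypothetical query and prediction types, for illustration only.
  final case class Query(x: Double)
  final case class Prediction(y: Double)

  // Data source stand-in: returns raw (x, y) training points.
  def readTraining(): Seq[(Double, Double)] =
    Seq((1.0, 2.0), (2.0, 4.0), (3.0, 6.0))

  // Preparator stand-in: drops points that the algorithm cannot use.
  def prepare(raw: Seq[(Double, Double)]): Seq[(Double, Double)] =
    raw.filter { case (x, _) => x != 0.0 }

  // Algorithm stand-in: the "model" is the mean ratio y / x.
  def train(data: Seq[(Double, Double)]): Double =
    data.map { case (x, y) => y / x }.sum / data.size

  def predict(model: Double, query: Query): Prediction =
    Prediction(model * query.x)

  // Serving stand-in: returns the first algorithm's prediction unchanged.
  def serve(query: Query, predictions: Seq[Prediction]): Prediction =
    predictions.head

  // Chains the stages: read, prepare, train, predict, serve.
  def run(query: Query): Prediction = {
    val model = train(prepare(readTraining()))
    serve(query, Seq(predict(model, query)))
  }
}
```

Calling DaseSketch.run(DaseSketch.Query(4.0)) walks a single query through all four stages.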

Types of Building Blocks

Depending on the problem you are solving, you would need to pick appropriate flavors of building blocks.

Engines

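There are 3 typical engine configurations:

1. PDataSource, PPreparator, P2LAlgorithm, LServing
2. PDataSource, PPreparator, PAlgorithm, LServing
3. LDataSource, LPreparator, LAlgorithm, LServing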

In both configurations 1 and 2, data is sourced and prepared in a parallelized fashion, as RDDs.

The difference between configurations 1 and 2 comes at the algorithm stage. In configuration 1, the algorithm operates on potentially large data as RDDs in the Spark cluster, and eventually outputs a model that is small enough to fit on a single machine.

Configuration 2, on the other hand, outputs a model that is potentially too large to fit on a single machine, and must reside in the Spark cluster as RDD(s).

With configuration 1 (P2LAlgorithm), PredictionIO will automatically try to persist the model to local disk or HDFS if the model is serializable.

With configuration 2 (PAlgorithm), PredictionIO will not automatically try to persist the model, unless the model implements the PersistentModel trait.

In special circumstances where both the data and the model are small, configuration 3 may be used. Beware that RDDs cannot be used with configuration 3.

Data Source

PDataSource is probably the most used data source base class, and it can process RDD-based data. LDataSource cannot handle RDD-based data. Use it only when you have a special requirement.

Preparator

With PDataSource, you must pick PPreparator. The same applies to LDataSource and LPreparator.

Algorithm

The workhorse of the engine comes in 3 different flavors.

P2LAlgorithm

Produces a model that is small enough to fit on a single machine from PDataSource and PPreparator. The model cannot contain any RDD. If the produced model is serializable, PredictionIO will try to persist it automatically. In addition, P2LAlgorithm.batchPredict is already implemented for evaluation purposes.

PAlgorithm

Produces a model that could contain RDDs from PDataSource and PPreparator. PredictionIO will not try to persist it automatically unless the model implements PersistentModel. PAlgorithm.batchPredict must be implemented for Evaluation.

LAlgorithm

Produces a model that is small enough to fit on a single machine from LDataSource and LPreparator. The model cannot contain any RDD. If the produced model is serializable, PredictionIO will try to persist it automatically. In addition, LAlgorithm.batchPredict is already implemented for evaluation purposes.
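To make the role of batchPredict more concrete, here is a minimal sketch of what a ready-made batch prediction could look like, assuming it simply applies the single-query predict to every indexed query. That assumption, the use of a plain Seq in place of an RDD, and the names BatchPredictSketch, predict, and batchPredict as written here are all illustrative. The signatures PredictionIO actually uses may differ.

```scala
object BatchPredictSketch {
  // Hypothetical single-query prediction: the model is just a multiplier.
  def predict(model: Int, query: Int): Int = model * query

  // Batch prediction built on predict: each query keeps its index so that
  // predictions can later be matched with actual results during evaluation.
  def batchPredict(model: Int, queries: Seq[(Long, Int)]): Seq[(Long, Int)] =
    queries.map { case (index, query) => (index, predict(model, query)) }
}
```

Under this reading, a flavor whose model is a local object can offer batch prediction for free, while a flavor whose model may contain RDDs has to spell it out itself.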

Serving

The serving component comes in only 1 flavor: LServing. At the serving stage, it is assumed that the result being served is already at a human-consumable size.
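The sketch below illustrates the idea of a serving step that receives one prediction per algorithm and returns a single result. The two strategies mirror the descriptions of LFirstServing and LAverageServing under Type Members. The trait and class names here (Serving, FirstServing, AverageServing) are simplified stand-ins written for this example, not the library's own definitions.

```scala
object ServingSketch {
  // Stand-in for a serving component: many predictions in, one result out.
  trait Serving[Q, P] {
    def serve(query: Q, predictions: Seq[P]): P
  }

  // Returns the first algorithm's prediction without modification.
  class FirstServing[Q, P] extends Serving[Q, P] {
    def serve(query: Q, predictions: Seq[P]): P = predictions.head
  }

  // Averages the predictions; only meaningful when every prediction is a Double.
  class AverageServing[Q] extends Serving[Q, Double] {
    def serve(query: Q, predictions: Seq[Double]): Double =
      predictions.sum / predictions.size
  }
}
```

Both strategies assume at least one algorithm produced a prediction. With an empty sequence, FirstServing throws and AverageServing returns NaN.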

Model Persistence

PredictionIO tries its best to persist trained models automatically. Please refer to LAlgorithm.makePersistentModel, P2LAlgorithm.makePersistentModel, and PAlgorithm.makePersistentModel for descriptions of the different strategies.
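The following sketch shows the general idea behind persisting a model only when it is serializable. It uses plain Java serialization to an in-memory byte array. This is an assumption made for illustration and is not necessarily the serializer or storage that PredictionIO uses. PersistSketch, LinearModel, persist, and restore are hypothetical names.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object PersistSketch {
  // Hypothetical small model; Scala case classes are serializable.
  final case class LinearModel(weights: Vector[Double])

  // Serializes the model when it is Serializable; returns None otherwise.
  def persist(model: Any): Option[Array[Byte]] = model match {
    case serializable: java.io.Serializable =>
      val buffer = new ByteArrayOutputStream()
      val out = new ObjectOutputStream(buffer)
      try out.writeObject(serializable) finally out.close()
      Some(buffer.toByteArray)
    case _ => None
  }

  // Reads a previously persisted model back from its bytes.
  def restore(bytes: Array[Byte]): Any = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject() finally in.close()
  }
}
```

A model that holds RDDs cannot be handled this way, which is consistent with the text above requiring such models to implement PersistentModel.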

Linear Supertypes

AnyRef, Any

Type Members

abstract class AverageMetric[EI, Q, P, A] extends Metric[EI, Q, P, A, Double] with StatsMetricHelper[EI, Q, P, A] with QPAMetric[Q, P, A, Double]

Returns the global average of the score returned by the calculate method.
trait CustomQuerySerializer extends BaseQuerySerializer

If your query class cannot be automatically serialized/deserialized to/from JSON, implement a trait by extending this trait, and overriding the querySerializer member with your custom JSON4S serializer.
trait Deployment extends EngineFactory

Defines a deployment that contains an Engine
type EmptyActualResult = SerializableClass

Empty actual result.
type EmptyAlgorithmParams = EmptyParams

Empty algorithm parameters.
type EmptyDataParams = EmptyParams

Empty data parameters.
type EmptyDataSourceParams = EmptyParams

Empty data source parameters.
type EmptyEvaluationInfo = SerializableClass

Empty evaluation info.
type EmptyMetricsParams = EmptyParams

Empty metrics parameters.
type EmptyModel = SerializableClass

Empty model.
case class EmptyParams() extends Params with Product with Serializable

A concrete implementation of Params representing empty parameters.
type EmptyPreparatorParams = EmptyParams

Empty preparator parameters.
type EmptyPreparedData = SerializableClass

Empty prepared data.
type EmptyServingParams = EmptyParams

Empty serving parameters.
type EmptyTrainingData = SerializableClass

Empty training data.
class Engine[TD, EI, PD, Q, P, A] extends BaseEngine[EI, Q, P, A]

This class chains up the entire data process.
abstract class EngineFactory extends AnyRef

If you intend to let PredictionIO create the workflow and deploy serving automatically, you will need to implement an object that extends this class and returns an Engine.
class EngineParams extends Serializable

This class serves as a logical grouping of all parameters required by an engine.
trait EngineParamsGenerator extends AnyRef

Defines an engine parameters generator.
trait Evaluation extends EngineFactory with Deployment

Defines an evaluation that contains an engine and a metric.
class FastEvalEngine[TD, EI, PD, Q, P, A] extends Engine[TD, EI, PD, Q, P, A]

:: Experimental :: FastEvalEngine is a subclass of Engine that exploits the immutability of controllers to optimize the evaluation process
class FastEvalEngineWorkflow[TD, EI, PD, Q, P, A] extends Serializable

:: Experimental :: Workflow based on FastEvalEngine
class IdentityPreparator[TD] extends BasePreparator[TD, TD]

A helper concrete implementation of io.prediction.core.BasePreparator that passes training data through without any special preparation.
abstract class LAlgorithm[PD, M, Q, P] extends BaseAlgorithm[RDD[PD], RDD[M], Q, P]

Base class of a local algorithm.
class LAverageServing[Q] extends LServing[Q, Double]

A concrete implementation of LServing returning the average of all algorithms' predictions, where their classes are expected to be all Double.
abstract class LDataSource[TD, EI, Q, A] extends BaseDataSource[RDD[TD], EI, Q, A]

Base class of a local data source.
class LFirstServing[Q, P] extends LServing[Q, P]

A concrete implementation of LServing returning the first algorithm's prediction result directly without any modification.
abstract class LPreparator[TD, PD] extends BasePreparator[RDD[TD], RDD[PD]]

Base class of a local preparator.
abstract class LServing[Q, P] extends BaseServing[Q, P]

Base class of serving.
trait LocalFileSystemPersistentModel[AP <: Params] extends PersistentModel[AP]

This trait is a convenience helper for persisting your model to the local filesystem.
trait LocalFileSystemPersistentModelLoader[AP <: Params, M] extends PersistentModelLoader[AP, M]

Implement an object that extends this trait for PredictionIO to support loading a persisted model from local filesystem during serving deployment.
abstract class Metric[EI, Q, P, A, R] extends Serializable

Base class of a Metric.
class MetricEvaluator[EI, Q, P, A, R] extends BaseEvaluator[EI, Q, P, A, MetricEvaluatorResult[R]]

:: DeveloperApi :: Do not use this directly.
case class MetricEvaluatorResult[R](bestScore: MetricScores[R], bestEngineParams: EngineParams, bestIdx: Int, metricHeader: String, otherMetricHeaders: Seq[String], engineParamsScores: Seq[(EngineParams, MetricScores[R])], outputPath: Option[String]) extends BaseEvaluatorResult with Product with Serializable

Contains all results of a MetricEvaluator
case class MetricScores[R](score: R, otherScores: Seq[Any]) extends Product with Serializable

Case class storing a primary score, and other scores
abstract class OptionAverageMetric[EI, Q, P, A] extends Metric[EI, Q, P, A, Double] with StatsOptionMetricHelper[EI, Q, P, A] with QPAMetric[Q, P, A, Option[Double]]

Returns the global average of the non-None score returned by the calculate method.
abstract class OptionStdevMetric[EI, Q, P, A] extends Metric[EI, Q, P, A, Double] with StatsOptionMetricHelper[EI, Q, P, A] with QPAMetric[Q, P, A, Option[Double]]

Returns the global standard deviation of the non-None score returned by the calculate method
abstract class P2LAlgorithm[PD, M, Q, P] extends BaseAlgorithm[PD, M, Q, P]

Base class of a parallel-to-local algorithm.
abstract class PAlgorithm[PD, M, Q, P] extends BaseAlgorithm[PD, M, Q, P]

Base class of a parallel algorithm.
abstract class PDataSource[TD, EI, Q, A] extends BaseDataSource[TD, EI, Q, A]

Base class of a parallel data source.
abstract class PPreparator[TD, PD] extends BasePreparator[TD, PD]

Base class of a parallel preparator.
trait Params extends Serializable

Base trait for all kinds of parameters that will be passed to constructors of different controller classes.
trait PersistentModel[AP <: Params] extends AnyRef

Mix in and implement this trait if your model cannot be persisted by PredictionIO automatically.
trait PersistentModelLoader[AP <: Params, M] extends AnyRef

Implement an object that extends this trait for PredictionIO to support loading a persisted model during serving deployment.
trait QPAMetric[Q, P, A, R] extends AnyRef

Trait for metric which returns a score based on Query, PredictedResult, and ActualResult
trait SanityCheck extends AnyRef

Extend a data class with this trait if you want PredictionIO to automatically perform sanity checks on your data classes during training.
class SerializableClass extends Serializable

Base class of several helper types that represent emptiness
class SimpleEngine[TD, EI, Q, P, A] extends Engine[TD, EI, TD, Q, P, A]

SimpleEngine has only one algorithm, and uses default preparator and serving layer.
class SimpleEngineParams extends EngineParams

This shorthand class serves the SimpleEngine class.
abstract class StdevMetric[EI, Q, P, A] extends Metric[EI, Q, P, A, Double] with StatsMetricHelper[EI, Q, P, A] with QPAMetric[Q, P, A, Double]

Returns the global standard deviation of the score returned by the calculate method
abstract class SumMetric[EI, Q, P, A, R] extends Metric[EI, Q, P, A, R] with QPAMetric[Q, P, A, R]

Returns the sum of the score returned by the calculate method.
class ZeroMetric[EI, Q, P, A] extends Metric[EI, Q, P, A, Double]

Returns zero.
trait IEngineFactory extends EngineFactory

DEPRECATED.
trait IFSPersistentModel[AP <: Params] extends LocalFileSystemPersistentModel[AP]

DEPRECATED.
trait IFSPersistentModelLoader[AP <: Params, M] extends LocalFileSystemPersistentModelLoader[AP, M]

DEPRECATED.
trait IPersistentModel[AP <: Params] extends PersistentModel[AP]

DEPRECATED.
trait IPersistentModelLoader[AP <: Params, M] extends PersistentModelLoader[AP, M]

DEPRECATED.
class LIdentityPreparator[TD] extends IdentityPreparator[TD]

DEPRECATED.
class PIdentityPreparator[TD] extends IdentityPreparator[TD]

DEPRECATED.
trait WithPrId extends AnyRef

Mix in this trait for queries that contain prId (PredictedResultId).
trait WithQuerySerializer extends CustomQuerySerializer

DEPRECATED.
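
The metric base classes listed above differ mainly in how they aggregate per-query scores. As a minimal sketch of that difference, the example below computes a per-query score and then aggregates it in the ways described for AverageMetric, OptionAverageMetric, and SumMetric. It works on plain Scala sequences, and the names MetricSketch, calculate, average, optionAverage, and sum are illustrative stand-ins, not the library's implementation.

```scala
object MetricSketch {
  // Hypothetical per-query score: 1.0 for a correct prediction, 0.0 otherwise.
  def calculate(predicted: String, actual: String): Double =
    if (predicted == actual) 1.0 else 0.0

  // Global average of all scores (the AverageMetric idea).
  // Returns NaN for an empty sequence.
  def average(scores: Seq[Double]): Double = scores.sum / scores.size

  // Global average of the non-None scores only (the OptionAverageMetric idea).
  def optionAverage(scores: Seq[Option[Double]]): Double = average(scores.flatten)

  // Sum of all scores (the SumMetric idea, specialised to Double here).
  def sum(scores: Seq[Double]): Double = scores.sum
}
```

Returning None from a score function is a way to exclude a query from the average, for example when no actual result is available for it.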

Value Members

object Engine extends Serializable

This object contains concrete implementation for some methods of the Engine class.
object EngineParams extends Serializable

Companion object for creating EngineParams instances.
object FastEvalEngineWorkflow extends Serializable

:: Experimental :: Workflow based on FastEvalEngine
object IdentityPreparator extends Serializable

Companion object of IdentityPreparator that conveniently returns an instance of the class of IdentityPreparator for use with EngineFactory.
object LAverageServing extends Serializable

A concrete implementation of LServing returning the average of all algorithms' predictions, where their classes are expected to be all Double.
object LFirstServing extends Serializable

A concrete implementation of LServing returning the first algorithm's prediction result directly without any modification.
object MetricEvaluator extends Serializable

Companion object of MetricEvaluator
object Utils

Controller utilities.
object ZeroMetric extends Serializable

Companion object of ZeroMetric
package java
