Federated Linear Regression

Linear Regression (LinR) is a simple statistical model widely used for predicting continuous values. FATE provides Heterogeneous Linear Regression (HeteroLinR) and SSHE Linear Regression (HeteroSSHELinR).

The table below lists the features of the HeteroLinR and HeteroSSHELinR models:

Linear Model	Arbiter-less Training	Weighted Training	Multi-Host	Cross Validation	Warm-Start/CheckPoint
Hetero LinR	✗	✓	✓	✓	✓
Hetero SSHELinR	✓	✓	✗	✓	✓

Heterogeneous LinR

HeteroLinR supports multi-host training. You can specify multiple hosts in the job configuration file, as in the examples provided under examples/dsl/v2/hetero_linear_regression.
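As a rough illustration of a multi-host setup, the sketch below builds the role section of a job configuration as a Python dictionary and serializes it to JSON. The party IDs are hypothetical placeholders, and the key layout is an assumption based on the DSL v2 job configuration format. The files under examples/dsl/v2/hetero_linear_regression remain the authoritative reference.

```python
import json

# Hypothetical party IDs, chosen only for illustration.
GUEST_ID = 9999
HOST_IDS = [10000, 10001]  # listing more than one host makes this a multi-host job
ARBITER_ID = 10000

# Assumed layout of the role section in a DSL v2 job configuration.
job_conf = {
    "dsl_version": 2,
    "initiator": {"role": "guest", "party_id": GUEST_ID},
    "role": {
        "guest": [GUEST_ID],
        "host": HOST_IDS,
        "arbiter": [ARBITER_ID],
    },
}

print(json.dumps(job_conf, indent=2))
```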

For simplicity, the participants in the federation process are reduced to three parties: party A is the Guest, party B is the Host, and party C, also known as the “Arbiter,” is a third party that acts as coordinator. Party C is responsible for generating the private and public keys.

The HeteroLinR training process is described below:

A sample alignment process is conducted before training. It identifies the samples that overlap across the databases of all parties, and the federated model is built on these overlapping samples. The whole alignment process is conducted in encrypted form, so confidential information (e.g. sample IDs) is not leaked.
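To make the outcome of sample alignment concrete, here is a minimal plaintext sketch. It shows only what alignment produces (the shared IDs, with each party's rows put in the same order) and deliberately contains no cryptography, so it is not the encrypted protocol described above. The function name and toy tables are hypothetical.

```python
def align_samples(guest_ids, host_ids):
    """Return the overlapping sample IDs in a deterministic (sorted) order."""
    return sorted(set(guest_ids) & set(host_ids))


# Toy tables keyed by sample ID; each party holds different features.
guest_table = {"u1": [0.5, 1.2], "u2": [0.1, 0.4], "u4": [0.9, 0.3]}
host_table = {"u2": [3.0], "u3": [1.5], "u4": [2.2]}

common_ids = align_samples(guest_table, host_table)

# Each party keeps only the overlapping rows, in the same order.
guest_aligned = [guest_table[i] for i in common_ids]
host_aligned = [host_table[i] for i in common_ids]

print(common_ids)
```

In a real deployment neither party would see the other's raw ID list. Only the result of the intersection is the same.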

In the training process, party A and party B each compute the elements needed for the final gradients. The Arbiter aggregates these elements, computes the final gradients, and transfers them back to the corresponding parties. For more details on the secure model-building process, please refer to this paper.
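The arithmetic behind this split can be sketched in plaintext. The sketch assumes a mean squared error loss of the form (1/2n) Σ d_i², where d = X_A w_A + X_B w_B − y, so each party's gradient is its own feature matrix transposed, multiplied by the shared residual d, and divided by n. Encryption and the Arbiter's role are omitted, and all names and values are illustrative rather than taken from FATE's code.

```python
def partial_prediction(X, w):
    """Linear predictor X @ w computed by one party on its own features."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]


def local_gradient(X, residual):
    """Gradient of the assumed loss w.r.t. one party's weights: X^T d / n."""
    n = len(X)
    return [sum(X[i][j] * residual[i] for i in range(n)) / n
            for j in range(len(X[0]))]


# Toy aligned data: party A (Guest) holds two features and the label,
# party B (Host) holds one feature.
X_a = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
X_b = [[0.5], [1.5], [2.0]]
y = [2.0, 1.0, 4.0]
w_a = [0.1, 0.2]
w_b = [0.3]

# Each party computes its share of the prediction locally.
z_a = partial_prediction(X_a, w_a)
z_b = partial_prediction(X_b, w_b)

# The residual couples the parties; in the real protocol it is never
# exchanged in plaintext.
residual = [za + zb - yi for za, zb, yi in zip(z_a, z_b, y)]

grad_a = local_gradient(X_a, residual)
grad_b = local_gradient(X_b, residual)

# Sanity check against a centralized model on the concatenated features.
X_full = [ra + rb for ra, rb in zip(X_a, X_b)]
w_full = w_a + w_b
full_residual = [p - yi for p, yi in zip(partial_prediction(X_full, w_full), y)]
grad_full = local_gradient(X_full, full_residual)

print(grad_a + grad_b)
print(grad_full)
```

The two printed vectors agree, which is why the parties can train a joint model without pooling their features.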

Features

L1 & L2 regularization
Mini-batch mechanism
Six optimization methods:
- sgd
  gradient descent with arbitrary batch size
- rmsprop
  RMSProp
- adam
  Adam
- adagrad
  AdaGrad
- nesterov_momentum_sgd
  Nesterov Momentum
- stochastic quasi-newton
  For details of the algorithm, please refer to this paper.
Three convergence criteria:
- diff
  Use the difference in loss between two consecutive iterations; not available for multi-host training
- abs
  Use the absolute value of the loss
- weight_diff
  Use the difference in model weights between iterations
Support multi-host modeling tasks. For details on how to configure a multi-host modeling task, please refer to this guide
Support validation at an arbitrary interval of iterations
Learning rate decay mechanism
Support an early stopping mechanism, which checks for performance changes on specified metrics over training rounds. Early stopping is triggered when no improvement is found within the configured number of early stopping rounds.
Support sparse format data as input.
Support stepwise mode. For details, please refer to stepwise.
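The three convergence criteria listed above can be sketched as a single check. This is a minimal illustration, not FATE's implementation: the function name, the strict comparison against a tolerance, and the use of the Euclidean norm for weight_diff are all assumptions.

```python
import math


def has_converged(criterion, tol, loss=None, prev_loss=None,
                  weights=None, prev_weights=None):
    """Hypothetical convergence check for the diff / abs / weight_diff criteria."""
    if criterion == "diff":
        # Needs the loss from the previous iteration; never converged on the first.
        if prev_loss is None:
            return False
        return abs(loss - prev_loss) < tol
    if criterion == "abs":
        return abs(loss) < tol
    if criterion == "weight_diff":
        if prev_weights is None:
            return False
        # Assumed distance measure: Euclidean norm of the weight change.
        change = math.sqrt(sum((w - p) ** 2
                               for w, p in zip(weights, prev_weights)))
        return change < tol
    raise ValueError(f"unknown criterion: {criterion}")


print(has_converged("diff", 1e-3, loss=0.5001, prev_loss=0.5))
print(has_converged("weight_diff", 1e-3,
                    weights=[1.0, 2.0], prev_weights=[1.0, 2.5]))
```

Note that weight_diff needs no loss value at all.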

Hetero-SSHE-LinR features:

L1 & L2 regularization
Mini-batch mechanism
Support different encryption modes to balance speed and security
Support early-stopping mechanism
Support setting arbitrary metrics for validation during training
Support model encryption with host model
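SSHE combines secret sharing with homomorphic encryption. As background for the secret-sharing half only, the toy sketch below shows two-party additive sharing over integers: each party holds a random-looking share, and sums can be computed share by share without revealing the inputs. The modulus and function names are hypothetical, fixed-point encoding of real values is omitted, and this is not FATE's implementation.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus, chosen only for illustration


def share(value, modulus=MODULUS):
    """Split an integer into two additive shares that sum to it mod modulus."""
    first = secrets.randbelow(modulus)
    second = (value - first) % modulus
    return first, second


def reconstruct(first, second, modulus=MODULUS):
    """Recombine two additive shares into the original value."""
    return (first + second) % modulus


# Each secret is split so that neither share alone reveals it.
a0, a1 = share(42)
b0, b1 = share(100)

# Addition is local: each party adds the shares it holds.
sum0 = (a0 + b0) % MODULUS
sum1 = (a1 + b1) % MODULUS

print(reconstruct(sum0, sum1))  # 142
```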
