Trained FATE models from federated job of host
and guest
may be merged locally and reused in other frameworks.
Currently, FATE supports merging and exporting models from HeteroLR
, HeteroSSHELR
and HeteroSecureBoost
into PMML or sklearn Logistic Regression Model/lightGBM objects.
Below we walk through the general process of merging a Hetero Logistic Regression model using FATE-Flow Client command line tool.
flow
command line tool is distributed along with FATE-Client.
Once FATE-Client
installed, one can find a cmd enterpoint named flow
:
pip install fate_client
flow --help
Usage: flow [OPTIONS] COMMAND [ARGS]...
Fate Flow Client
Options:
-h, --help Show this message and exit.
Commands:
checkpoint Checkpoint Operations
component Component Operations
data Data Operations
init Flow CLI Init Command
job Job Operations
key Key Operations
model Model Operations
privilege Privilege Operations
provider Component Provider Operations
queue Queue Operations
resource Resource Manager
server FATE Flow Server Operations
service FATE Flow External Service Operations
table Table Operations
tag Tag Operations
task Task Operations
template Template Operations
test FATE Flow Test Operations
tracking Component Operations
To use FATE-Flow Client in terminal, we need to first specify which FATE Flow Service
to connect to.
Assume we have a FATE Flow Service at 172.15.0.1:9380, then execute the following command to initialize FATE-Flow Client:
flow init -c /data/projects/fate/conf/service_conf.yaml
flow init --ip 172.15.0.1 --port 9380
In the following steps, we assume that host
is the role executes merging action.
To obtain guest's
trained Hetero Logistic Regression, we need to first make host
load exported model.
For models trained in standalone mode, you may skip this step.
Below is an example model exportation configuration, adopted from here:
{
"role": "guest",
"party_id": 9999,
"model_id": "arbiter-10000#guest-9999#host-10000#model",
"model_version": "202208291446341887230",
"output_path": "/data/projects/fate/examples/export_model"
}
Run the following command to export specified model:
flow model export -c export_model.json
If model exportation succeeds, flow
will return success message like this:
{
"retcode": 0,
"file": "/data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip",
"retmsg": "download successfully, please check /data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip"
}
If the FATE model is trained in distributed mode, where guest
and host
have FATE each installed in different machines,
exported model needs to be imported/registered on merging role's system first before merging models.
Here is an example configuration for model importation
adopted from here,
assuming we have put the exported guest's
model from above step to directory "/data/projects/fate/temp" on host's
machine:
{
"role": "guest",
"party_id": 9999,
"model_id": "arbiter-10000#guest-9999#host-10000#model",
"model_version": "202208291446341887230",
"file": "/data/projects/fate/temp/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip"
}
Run the following command to import specified model:
flow model import -c import_model.json
If model importation succeeds, flow
will return success message like this:
{
"data": {
"job_id": "202208291521250963160",
"model_id": "arbiter-10000#guest-9999#host-10000#model",
"model_version": "202208291446341887230",
"party_id": "9999",
"role": "guest"
},
"retcode": 0,
"retmsg": "success"
}
Once model is ready, run the following command to merge models into a sklearn Logistic Regression model:
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 --guest-party-id 9999 --host-party-ids 10000 --component-name hetero_lr_0 --model-type lr --output-format sklearn --target-name y --include-guest-coef --output-path model_merge_sklearn
We may then load and predict with the obtained sklearn model:
import base64
import json
import pandas
import pickle
filename = "model_merge_sklearn"
test_data_h = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_host.csv",
header=0, index_col="id")
test_data_g = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_guest.csv",
header=0, index_col="id")
test_data_g = test_data_g.loc[:, test_data_g.columns != 'y']
test_data = pandas.concat((test_data_g, test_data_h), axis=1)
with open(filename, "r") as f:
loaded_model = pickle.loads(base64.b64decode(json.load(f)))
predict_score = loaded_model.predict_proba(test_data)
If the federated model is a HeteroSecureBoost Model, run this command to merge model and export it in LightGBM format.
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format lgb --target-name y --output-path ./lgb.txt
If you want a model in PMML format, use this command:
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format pmml --target-name y --output-path ./lgb.txt
Please notice that the model merge function offers a parameter 'host_rename'. Host features will be renamed by adding a sitename suffix to the original feature name once this option is on:
Host party id is 9998, sitename is host_9998, origin feature is x1
x1 -> x1, default setting
x1 -> x1_host_9998, when enable host_rename
Use this command to rename host names when output
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 --host-rename
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format lgb --target-name y --output-path ./lgb.txt
We may then load and predict with the obtained lightgbm model. Remember that to get the correct predicted result, the order of features in predicted data should be in the same order as the merged tree model.
import json
import pandas
import lightgbm as lgb
filename = "./lgb.txt"
test_data_h = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_host.csv",
header=0, index_col="id")
test_data_g = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_guest.csv",
header=0, index_col="id")
test_data_g = test_data_g.loc[:, test_data_g.columns != 'y']
test_data = pandas.concat((test_data_g, test_data_h), axis=1)
bst = lgb.Booster(model_str=open(filename, 'r').read())
pred_rs = bst.predict(test_data)
For more information on command options, please check help
menu:
flow component hetero-model-merge --help
Usage: flow component hetero-model-merge [OPTIONS]
- DESCRIPTION:
Merge a hetero model.
- USAGE:
flow component hetero-model-merge --model-id guest-9999#host-9998#model --model-version 202208241838502253290 --guest-party-id 9999 --host-party-ids 9998,9997 --component-name hetero_secure_boost_0 --model-type secureboost --output-format pmml --target-name y --no-host-rename --no-include-guest-coef --output-path model.xml
Options:
--model-id TEXT Model id. [required]
--model-version TEXT Model version. [required]
-gid, --guest-party-id TEXT A valid party id. [required]
-hids, --host-party-ids TEXT Multiple party ids, use a comma to separate
each one. [required]
-cpn, --component-name TEXT A valid component name. [required]
--model-type TEXT [required]
--output-format TEXT [required]
--target-name TEXT
--host-rename / --no-host-rename
--include-guest-coef / --no-include-guest-coef
-o, --output-path PATH User specifies output directory path.
[required]
-h, --help Show this message and exit.