## Pipeline Tutorial with HeteroSecureBoost

### Install

`Pipeline` is distributed along with [fate_client](https://pypi.org/project/fate-client/).

```bash
pip install fate_client
```

To use Pipeline, we first need to specify which `FATE Flow Service` to connect to. Once `fate_client` is installed, a command-line entry point named `pipeline` is available:

In [1]:
!pipeline --help

Usage: pipeline [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  config  pipeline config tool
  init    - DESCRIPTION: Pipeline Config Command.


Assume we have a `FATE Flow Service` running at 127.0.0.1:9380 (the default in standalone mode); then run:

In [2]:
!pipeline init --ip 127.0.0.1 --port 9380

Pipeline configuration succeeded.
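If jobs later fail to submit, one quick thing to rule out is that nothing is listening at the configured address. The following is a minimal sketch using only the Python standard library; `flow_service_reachable` is a hypothetical helper, not part of `fate_client`, and a successful TCP connection only shows that the port is open, not that the process behind it is FATE Flow.

```python
import socket


def flow_service_reachable(host: str = "127.0.0.1", port: int = 9380,
                           timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not part of fate_client.
    The defaults are the address and port used in this tutorial.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `flow_service_reachable("127.0.0.1", 9380)` before `pipeline init` gives a simple yes/no answer without touching any FATE configuration.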


### Hetero SecureBoost Example

Before starting a modeling task, the data to be used must be uploaded. Please refer to this [guide](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/pipeline_tutorial_upload.ipynb).

The `pipeline` package provides components to compose a `FATE pipeline`.

In [3]:
from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader, DataTransform, Intersection, HeteroSecureBoost, Evaluation
from pipeline.interface import Data

Make a `pipeline` instance:

 - initiator:
   * role: guest
   * party: 9999
 - roles:
   * guest: 9999
   * host: 10000

In [4]:
pipeline = PipeLine() \
 .set_initiator(role='guest', party_id=9999) \
 .set_roles(guest=9999, host=10000)
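As a way to picture what the two calls above declare, the sketch below collects the same information into a plain dictionary and checks that the initiator is one of the listed parties. `describe_parties` is a hypothetical helper written for this tutorial; it does not call FATE, and the job configuration that FATE actually generates may be laid out differently.

```python
from typing import Dict, List, Union

PartyIds = Union[int, List[int]]


def describe_parties(initiator_role: str, initiator_party_id: int,
                     **roles: PartyIds) -> Dict[str, dict]:
    """Collect initiator and role settings into one plain dictionary.

    Hypothetical helper for illustration only; not a FATE API.
    """
    # Accept a single party ID or a list of IDs for each role.
    normalized = {role: ids if isinstance(ids, list) else [ids]
                  for role, ids in roles.items()}
    # The initiator must be one of the declared parties.
    if initiator_party_id not in normalized.get(initiator_role, []):
        raise ValueError(
            f"initiator {initiator_role}:{initiator_party_id} is not in roles")
    return {
        "initiator": {"role": initiator_role, "party_id": initiator_party_id},
        "roles": normalized,
    }


# Same values as the tutorial: guest 9999 initiates, host is 10000.
parties = describe_parties("guest", 9999, guest=9999, host=10000)
print(parties)
```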

Define a `Reader` to load data

In [5]:
reader_0 = Reader(name="reader_0")
# set guest parameter
reader_0.get_party_instance(role='guest', party_id=9999).component_param(
 table={"name": "breast_hetero_guest", "namespace": "experiment"})
# set host parameter
reader_0.get_party_instance(role='host', party_id=10000).component_param(
 table={"name": "breast_hetero_host", "namespace": "experiment"})

Add a `DataTransform` component to parse raw data into Data Instance

In [6]:
data_transform_0 = DataTransform(name="data_transform_0")
# set guest parameter
data_transform_0.get_party_instance(role='guest', party_id=9999).component_param(
 with_label=True)
data_transform_0.get_party_instance(role='host', party_id=[10000]).component_param(
 with_label=False)
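The difference between `with_label=True` and `with_label=False` can be illustrated with a small stand-alone parser. This is only a sketch: `ToyInstance` and `parse_row` are hypothetical names, the rows are invented, and the column layout (ID first, then an optional label, then features) is an assumption made for the example rather than a description of how `DataTransform` is implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyInstance:
    """Hypothetical stand-in for a parsed data instance (not FATE's class)."""
    sample_id: str
    features: List[float]
    label: Optional[int] = None


def parse_row(row: str, with_label: bool) -> ToyInstance:
    """Parse 'id,[label,]f1,f2,...' into a ToyInstance."""
    fields = row.split(",")
    sample_id, rest = fields[0], fields[1:]
    if with_label:
        # The first value after the ID is the label; the rest are features.
        return ToyInstance(sample_id, [float(v) for v in rest[1:]], int(rest[0]))
    # No label column: every remaining value is a feature.
    return ToyInstance(sample_id, [float(v) for v in rest])


# Invented rows: the guest side carries a label, the host side does not.
guest_instance = parse_row("133,1,0.25,-1.04", with_label=True)
host_instance = parse_row("133,0.45,0.89,-0.26", with_label=False)
print(guest_instance)
print(host_instance)
```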

Add an `Intersection` component to perform PSI (private set intersection) for the hetero scenario

In [7]:
intersect_0 = Intersection(name="intersect_0")
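Conceptually, the intersection step keeps only the samples whose IDs appear on both sides, so that guest and host rows line up for training. The sketch below shows that outcome with an ordinary, non-private set intersection on invented IDs. It is not how `Intersection` works internally: a real PSI protocol uses cryptographic techniques so that neither party learns the other's non-overlapping IDs. `toy_intersection` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Rows = Dict[str, List[float]]


def toy_intersection(guest_rows: Rows, host_rows: Rows) -> Tuple[Rows, Rows]:
    """Keep only rows whose sample ID exists on both sides (NOT private)."""
    common = sorted(guest_rows.keys() & host_rows.keys())
    return ({k: guest_rows[k] for k in common},
            {k: host_rows[k] for k in common})


# Invented toy data keyed by sample ID.
guest_rows = {"133": [1, 0.25], "273": [0, -1.04], "175": [1, 0.33]}
host_rows = {"273": [0.45, 0.89], "175": [-0.26, 0.12], "88": [1.5, -0.7]}

guest_common, host_common = toy_intersection(guest_rows, host_rows)
print(list(guest_common))
```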

Now, we define the `HeteroSecureBoost` component. The following parameters will be set for all parties involved.

In [8]:
hetero_secureboost_0 = HeteroSecureBoost(name="hetero_secureboost_0",
                                         num_trees=5,
                                         bin_num=16,
                                         task_type="classification",
                                         objective_param={"objective": "cross_entropy"},
                                         encrypt_param={"method": "paillier"},
                                         tree_param={"max_depth": 3})
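For intuition about the `cross_entropy` objective: in gradient boosting for binary classification, each round is driven by the per-sample first and second derivatives of the loss with respect to the current raw score. With predicted probability `p = sigmoid(score)` and label `y`, these are `g = p - y` and `h = p * (1 - p)`. The sketch below illustrates that arithmetic only; `sigmoid` and `grad_hess` are hypothetical helper names, no encryption is applied, and it is not FATE's implementation.

```python
import math
from typing import Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad_hess(label: int, raw_score: float) -> Tuple[float, float]:
    """First and second derivative of binary cross-entropy w.r.t. raw_score."""
    p = sigmoid(raw_score)
    return p - label, p * (1.0 - p)


# A positive sample with an uninformative initial score of 0.0 (p = 0.5).
g, h = grad_hess(label=1, raw_score=0.0)
print(g, h)
```

A negative gradient for a positive sample means the next tree should push that sample's score upward.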


To show the evaluation result, an `Evaluation` component is needed.

In [9]:
evaluation_0 = Evaluation(name="evaluation_0", eval_type="binary")
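A binary evaluation typically summarizes how well predicted scores separate the two classes, with AUC being one common metric. As a rough illustration of what such a number means, the sketch below computes AUC directly from its pairwise definition on invented labels and scores. The function name `auc` and the data are assumptions for this example, and the code does not reproduce the `Evaluation` component's implementation or its full set of metrics.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy predictions: 3 of the 4 positive/negative pairs are ordered correctly.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)  # 0.75
```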

Add components to pipeline, in order of execution:

 - data_transform_0 consumes reader_0's output data
 - intersect_0 consumes data_transform_0's output data
 - hetero_secureboost_0 consumes intersect_0's output data
 - evaluation_0 consumes hetero_secureboost_0's prediction result on training data

Then compile our pipeline to make it ready for submission.

In [10]:
pipeline.add_component(reader_0)
pipeline.add_component(data_transform_0, data=Data(data=reader_0.output.data))
pipeline.add_component(intersect_0, data=Data(data=data_transform_0.output.data))
pipeline.add_component(hetero_secureboost_0, data=Data(train_data=intersect_0.output.data))
pipeline.add_component(evaluation_0, data=Data(data=hetero_secureboost_0.output.data))
pipeline.compile();
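The `add_component` calls above describe a dependency graph in which each component consumes the output of the one before it. As a sketch of that idea, the snippet below writes the same dependencies as a plain dictionary and derives an execution order with the standard library's `graphlib`. It mirrors the component names used in this tutorial but does not call FATE, and it is not meant to show what `compile()` does internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each component maps to the set of components whose output it consumes,
# mirroring the add_component calls in the tutorial.
dependencies = {
    "reader_0": set(),
    "data_transform_0": {"reader_0"},
    "intersect_0": {"data_transform_0"},
    "hetero_secureboost_0": {"intersect_0"},
    "evaluation_0": {"hetero_secureboost_0"},
}

# A topological order lists every component after all of its inputs.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Because the graph is a single chain, only one valid order exists, and it matches the order in which the components were added.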


Now, submit (fit) our pipeline:

In [11]:
pipeline.fit()


2021-12-31 03:24:22.633 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202112310324182459270
2021-12-31 03:24:23.152 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-12-31 03:24:23.671 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:173 -
2021-12-31 03:24:27.861 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:177 - Running component reader_0, time elapse: 0:00:05
2021-12-31 03:24:30.533 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:173 -
2021-12-31 03:24:34.732 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_sta

Once training is done, the trained model can be used for prediction. Optionally, save the trained pipeline for future use.

In [12]:
pipeline.dump("pipeline_saved.pkl");

First, deploy the needed components from the training pipeline:

In [13]:
pipeline = PipeLine.load_model_from_file('pipeline_saved.pkl')
pipeline.deploy_component([pipeline.data_transform_0, pipeline.intersect_0, pipeline.hetero_secureboost_0]);

Define a new `Reader` component for reading prediction data:

In [14]:
reader_1 = Reader(name="reader_1")
reader_1.get_party_instance(role="guest", party_id=9999).component_param(table={"name": "breast_hetero_guest", "namespace": "experiment"})
reader_1.get_party_instance(role="host", party_id=10000).component_param(table={"name": "breast_hetero_host", "namespace": "experiment"})

Optionally, define a new `Evaluation` component.

In [15]:
evaluation_0 = Evaluation(name="evaluation_0", eval_type="binary")

Add components to the prediction pipeline in order of execution:

In [16]:
predict_pipeline = PipeLine()
predict_pipeline.add_component(reader_1)\
 .add_component(pipeline, 
 data=Data(predict_input={pipeline.data_transform_0.input.data: reader_1.output.data}))\
 .add_component(evaluation_0, data=Data(data=pipeline.hetero_secureboost_0.output.data));
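To make the reuse explicit, the sketch below lists, by name only, which components a prediction pipeline built this way is expected to run: the new reader, then the deployed training components in their original order, then the optional evaluation. It is plain Python that mirrors the cells above and does not call FATE; `prediction_order` is a hypothetical helper.

```python
from typing import Iterable, List, Sequence, Set

# Order of components in the training pipeline of this tutorial.
train_order = ["reader_0", "data_transform_0", "intersect_0",
               "hetero_secureboost_0", "evaluation_0"]

# Components passed to deploy_component in the tutorial.
deployed = {"data_transform_0", "intersect_0", "hetero_secureboost_0"}


def prediction_order(train_order: Sequence[str], deployed: Set[str],
                     reader: str, extra: Iterable[str] = ()) -> List[str]:
    """New reader first, then deployed components in training order, then extras."""
    reused = [name for name in train_order if name in deployed]
    return [reader, *reused, *extra]


predict_order = prediction_order(train_order, deployed,
                                 reader="reader_1", extra=["evaluation_0"])
print(predict_order)
```

Note that the training reader is not among the deployed components, which is why a new `Reader` has to supply the prediction data.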


Then, run the prediction job:

In [17]:
predict_pipeline.predict()

2021-12-31 03:25:35.541 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202112310325328444510
2021-12-31 03:25:47.384 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:11
2021-12-31 03:25:47.903 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:173 -
2021-12-31 03:25:52.078 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:177 - Running component reader_1, time elapse: 0:00:16
2021-12-31 03:25:54.161 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_status:173 -
2021-12-31 03:25:58.545 | INFO | pipeline.utils.invoker.job_submitter:monitor_job_sta

For more demos on using pipeline to submit jobs, please refer to the [pipeline demos](https://github.com/FederatedAI/FATE/tree/master/examples/pipeline/demo).