Data

This document describes the data sets used for running examples and gives their sources. Many of the data sets have been scaled or transformed from their original version.

Data Set Naming Rule

All data sets are named according to the following pattern: "{content}_{mode}_{size}_{role}_{role_index}"

  • content: brief description of data content
  • mode: how the original data is divided, either "homo" or "hetero"; some data sets do not have this information
  • size: includes keyword "mini" if the data set is truncated from another larger set
  • role: role name, either "host" or "guest"
  • role_index: if a data set is further divided and shared among multiple hosts in an example, indices distinguish the parties; indices start at 1
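
The naming rule above can be read from right to left, since the trailing fields are the ones with a fixed vocabulary. The following is a minimal sketch of such a parser, not part of the project itself; the names `DataSetName` and `parse_data_set_name` are hypothetical, and any token that does not match a known keyword is simply kept as part of `content`.

```python
from typing import NamedTuple, Optional

# Keywords taken from the naming rule above.
MODES = {"homo", "hetero"}
ROLES = {"guest", "host"}


class DataSetName(NamedTuple):
    content: str
    mode: Optional[str]
    size: Optional[str]
    role: Optional[str]
    role_index: Optional[int]


def parse_data_set_name(name: str) -> DataSetName:
    """Split a data set name into the fields of the naming rule.

    Fields are consumed from the right; whatever is left is the content.
    """
    tokens = name.split("_")

    # A trailing number counts as a role index only if a role precedes it.
    role_index = None
    if len(tokens) >= 2 and tokens[-1].isdigit() and tokens[-2] in ROLES:
        role_index = int(tokens.pop())

    role = tokens.pop() if tokens and tokens[-1] in ROLES else None
    size = tokens.pop() if tokens and tokens[-1] == "mini" else None
    mode = tokens.pop() if tokens and tokens[-1] in MODES else None

    return DataSetName("_".join(tokens), mode, size, role, role_index)


if __name__ == "__main__":
    print(parse_data_set_name("default_credit_homo_host_1"))
    print(parse_data_set_name("breast_hetero_mini_guest"))
```

Names that end in a token outside the rule, such as "breast_homo_test", are returned with every optional field unset and the full name as `content`.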

Data sets used for running examples are uploaded to FATE data storage at the time of deployment. Uploaded tables share the same namespace "experiment", and each table_name matches the original file name without the ".csv" extension. The example data sets and their details are listed below.
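
As an illustration of the convention just described, the sketch below derives the table name and namespace for a given file. It is an assumption-laden example rather than the actual upload tooling: the function name `upload_entry` and the "file" key are hypothetical, and a real FATE upload configuration may require other or additional fields.

```python
from pathlib import Path

# All example tables share this namespace, as stated above.
NAMESPACE = "experiment"


def upload_entry(csv_path: str) -> dict:
    """Build a table description for one example data file.

    The table name is the file name without its extension.
    The "file" key is a hypothetical field name used for illustration.
    """
    return {
        "file": csv_path,
        "table_name": Path(csv_path).stem,
        "namespace": NAMESPACE,
    }


if __name__ == "__main__":
    print(upload_entry("breast_homo_guest.csv"))
```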

Horizontally Divided Data

For Homogeneous Federated Learning

breast_homo:

  • 30 features
  • label type: binary
  • source
  • data sets:
    1. "breast_homo_guest.csv"
      • name: "breast_homo_guest"
      • namespace: "experiment"
    2. "breast_homo_host.csv"
      • name: "breast_homo_host"
      • namespace: "experiment"
    3. "breast_homo_test.csv"
      • name: "breast_homo_test"
      • namespace: "experiment"

default_credit_homo:

  • 23 features
  • label type: binary
  • source
  • data sets:
    1. "default_credit_homo_guest.csv"
      • name: "default_credit_homo_guest"
      • namespace: "experiment"
    2. "default_credit_homo_host_1.csv"
      • name: "default_credit_homo_host_1"
      • namespace: "experiment"
    3. "default_credit_homo_host_2.csv"
      • name: "default_credit_homo_host_2"
      • namespace: "experiment"
    4. "default_credit_homo_test.csv"
      • name: "default_credit_homo_test"
      • namespace: "experiment"

epsilon_5k_homo:

  • 100 features
  • label type: binary
  • mock data
  • data sets:
    1. "epsilon_5k_homo_guest.csv"
      • name: "epsilon_5k_homo_guest"
      • namespace: "experiment"
    2. "epsilon_5k_homo_host.csv"
      • name: "epsilon_5k_homo_host"
      • namespace: "experiment"
    3. "epsilon_5k_homo_test.csv"
      • name: "epsilon_5k_homo_test"
      • namespace: "experiment"

give_credit_homo:

  • 10 features
  • label type: binary
  • source
  • data sets:
    1. "give_credit_homo_guest.csv"
      • name: "give_credit_homo_guest"
      • namespace: "experiment"
    2. "give_credit_homo_host.csv"
      • name: "give_credit_homo_host"
      • namespace: "experiment"
    3. "give_credit_homo_test.csv"
      • name: "give_credit_homo_test"
      • namespace: "experiment"

student_homo:

  • 13 features
  • label type: continuous
  • source
  • data sets:
    1. "student_homo_guest.csv"
      • name: "student_homo_guest"
      • namespace: "experiment"
    2. "student_homo_host.csv"
      • name: "student_homo_host"
      • namespace: "experiment"
    3. "student_homo_test.csv"
      • name: "student_homo_test"
      • namespace: "experiment"

vehicle_scale_homo:

  • 18 features
  • label type: multi-class
  • source
  • data sets:
    1. "vehicle_scale_homo_guest.csv"
      • name: "vehicle_scale_homo_guest"
      • namespace: "experiment"
    2. "vehicle_scale_homo_host.csv"
      • name: "vehicle_scale_homo_host"
      • namespace: "experiment"
    3. "vehicle_scale_homo_test.csv"
      • name: "vehicle_scale_homo_test"
      • namespace: "experiment"

Vertically Divided Data

For Heterogeneous Federated Learning

breast_hetero:

  • 30 features
  • label type: binary
  • source
  • data sets:
    1. "breast_hetero_guest.csv"
      • name: "breast_hetero_guest"
      • namespace: "experiment"
    2. "breast_hetero_host.csv"
      • name: "breast_hetero_host"
      • namespace: "experiment"

breast_hetero_mini:

  • 7 features
  • label type: binary
  • source
  • data sets:
    1. "breast_hetero_mini_guest.csv"
      • name: "breast_hetero_mini_guest"
      • namespace: "experiment"
    2. "breast_hetero_mini_host.csv"
      • name: "breast_hetero_mini_host"
      • namespace: "experiment"

default_credit_hetero:

  • 23 features
  • label type: binary
  • source
  • data sets:
    1. "default_credit_hetero_guest.csv"
      • name: "default_credit_hetero_guest"
      • namespace: "experiment"
    2. "default_credit_hetero_host.csv"
      • name: "default_credit_hetero_host"
      • namespace: "experiment"

dvisits_hetero:

  • 12 features
  • label type: continuous
  • source
  • data sets:
    1. "dvisits_hetero_guest.csv"
      • name: "dvisits_hetero_guest"
      • namespace: "experiment"
    2. "dvisits_hetero_host.csv"
      • name: "dvisits_hetero_host"
      • namespace: "experiment"

epsilon_5k_hetero:

  • 100 features
  • label type: binary
  • mock data
  • data sets:
    1. "epsilon_5k_hetero_guest.csv"
      • name: "epsilon_5k_hetero_guest"
      • namespace: "experiment"
    2. "epsilon_5k_hetero_host.csv"
      • name: "epsilon_5k_hetero_host"
      • namespace: "experiment"

give_credit_hetero:

  • 10 features
  • label type: binary
  • source
  • data sets:
    1. "give_credit_hetero_guest.csv"
      • name: "give_credit_hetero_guest"
      • namespace: "experiment"
    2. "give_credit_hetero_host.csv"
      • name: "give_credit_hetero_host"
      • namespace: "experiment"

ionosphere_scale_hetero:

  • 34 features
  • label type: binary
  • source
  • data sets:
    1. "ionosphere_scale_hetero_guest.csv"
      • name: "ionosphere_scale_hetero_guest"
      • namespace: "experiment"
    2. "ionosphere_scale_hetero_host.csv"
      • name: "ionosphere_scale_hetero_host"
      • namespace: "experiment"

motor_hetero:

  • 11 features
  • label type: continuous
  • source
  • data sets:
    1. "motor_hetero_guest.csv"
      • name: "motor_hetero_guest"
      • namespace: "experiment"
    2. "motor_hetero_host.csv"
      • name: "motor_hetero_host"
      • namespace: "experiment"
    3. "motor_hetero_host_1.csv"
      • name: "motor_hetero_host_1"
      • namespace: "experiment"
    4. "motor_hetero_host_2.csv"
      • name: "motor_hetero_host_2"
      • namespace: "experiment"

motor_hetero_mini:

  • 7 features
  • label type: continuous
  • source
  • data sets:
    1. "motor_hetero_mini_guest.csv"
      • name: "motor_hetero_mini_guest"
      • namespace: "experiment"
    2. "motor_hetero_mini_host.csv"
      • name: "motor_hetero_mini_host"
      • namespace: "experiment"

student_hetero:

  • 13 features
  • label type: continuous
  • source
  • data sets:
    1. "student_hetero_guest.csv"
      • name: "student_hetero_guest"
      • namespace: "experiment"
    2. "student_hetero_host.csv"
      • name: "student_hetero_host"
      • namespace: "experiment"

svmlight:

  • svmlight / libsvm format
  • label type: categorical
  • data sets:
    1. "svmlight_guest.csv"
      • name: "svmlight_guest"
      • namespace: "experiment"
    2. "svmlight_host.csv"
      • name: "svmlight_host"
      • namespace: "experiment"

vehicle_scale_hetero:

  • 18 features
  • label type: multi-class
  • source
  • data sets:
    1. "vehicle_scale_hetero_guest.csv"
      • name: "vehicle_scale_hetero_guest"
      • namespace: "experiment"
    2. "vehicle_scale_hetero_host.csv"
      • name: "vehicle_scale_hetero_host"
      • namespace: "experiment"

Federated Transfer Learning Data

For Federated Transfer Learning

nus_wide:

  • 636/1000 features
  • label type: binary
  • source
  • data sets:
    1. "nus_wide_train_guest.csv"
      • name: "nus_wide_train_guest"
      • namespace: "experiment"
    2. "nus_wide_train_host.csv"
      • name: "nus_wide_train_host"
      • namespace: "experiment"
    3. "nus_wide_validate_guest.csv"
      • name: "nus_wide_validate_guest"
      • namespace: "experiment"
    4. "nus_wide_validate_host.csv"
      • name: "nus_wide_validate_host"
      • namespace: "experiment"

Non-Divided Data

Generated Data for Data Operation Demo

tag_value:

  • data sets:
    1. "mocked_string_data.csv"
    2. "mock_tag_hetero_host.csv"
      • name: "mock_tag_hetero_host"
      • namespace: "experiment"
    3. "tag.csv"
    4. "tag_value_1000_140.csv"
      • name: "tag_value_1", "tag_value_2", "tag_value_3"
      • namespace: "experiment"
    5. "unittest_data.csv"