Use case: Predict Loan Eligibility
Dataset description:
-Features: Loan_ID, Gender, Married, Dependents, Education, Self_Employed,
ApplicantIncome, CoapplicantI0come, LoanAmount, Loan_Amount_Term, Credit_History
-Target: 1: loan approved, 0: loan not approved.
-My feedback: The input description in the documentation for the TPOT XGBoost Classification model is not accurate enough:
Current Input Description: {"Feature1": 12, "Feature2": 222, …, "FeatureN": 110}
*Error: When we follow this structure, for example in this use case: {"Loan_ID": 001002, "Gender": 0, "Married": 0, "Dependents": 0, "Education": 1, "Self_Employed": 0, "ApplicantIncome": 5849, "CoapplicantI0come": 0, "LoanAmount": 128, "Loan_Amount_Term": 360, "Credit_History": 1}
We get the following error:
If using all scalar values, you must pass an index
Solution: {"Feature1": [12], "Feature2": [222], …, "FeatureN": [110]}
We have to wrap each numerical feature value in square brackets, i.e. pass it as a one-element list.
Example: {"Loan_ID": [001002], "Gender": [0], "Married": [0], "Dependents": [0], "Education": [1], "Self_Employed": [0], "ApplicantIncome": [5849], "CoapplicantI0come": [0], "LoanAmount": [128], "Loan_Amount_Term": [360], "Credit_History": [1]}
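For anyone who already has payloads in the scalar form, here is a minimal Python sketch of the conversion. The helper name wrap_scalars is hypothetical and not part of TPOT or the platform. The error text is the one pandas raises when a DataFrame is built from a dict containing only scalar values, so the sketch assumes the model input is turned into a DataFrame on the server side; I have not confirmed this.

```python
def wrap_scalars(payload):
    """Return a copy of payload in which every scalar value is a one-element list.

    Hypothetical helper for illustration only; values that are already
    lists or tuples are left unchanged.
    """
    return {
        name: value if isinstance(value, (list, tuple)) else [value]
        for name, value in payload.items()
    }


# A subset of this use case's features, in the scalar form that triggered the error.
scalar_payload = {"Gender": 0, "Married": 0, "ApplicantIncome": 5849, "Credit_History": 1}

# Each value becomes a one-element list, which matches the structure under "Solution".
list_payload = wrap_scalars(scalar_payload)
print(list_payload)
```

Each list then represents one row, so longer lists of equal length should describe several applicants at once, again assuming the input is read as a DataFrame.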
Proposal: Update the input description in the documentation.