Hi @amithvs,
As I said before, I have spent many hours trying to understand the "Pipe is broken" error, and I have chosen to move away from the Python Scope activity. My two fallbacks are the Invoke PowerShell activity and the Start Process activity. Sadly, the Invoke PowerShell activity is currently broken in Windows projects (see UiPath Community 2022.12 Release - #41 by jeevith).
So the Start Process activity is the only option I can see working with standard UiPath activities. It can call a command asynchronously or synchronously, which is great. This means we can let Start Process complete the command, and only then will the other activities in the sequence carry on.
The workflow looks like this:
- Start Process (python script and its arguments → save to a .csv)
- Read the csv file in UiPath
- If needed, delete the csv as cleanup
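To make the three steps above concrete, here is a minimal sketch of the same run, read, clean up pattern written in plain Python instead of UiPath activities. The function name `run_and_collect` and the demo file name are hypothetical and only for illustration; in the real workflow, Start Process plays the role of `subprocess.run` and the UiPath CSV activities do the reading.

```python
import csv
import subprocess
import sys
from pathlib import Path


def run_and_collect(command, csv_path, cleanup=True):
    """Run a command, wait for it to finish, then load the CSV it wrote."""
    # subprocess.run blocks until the command exits (the synchronous case).
    # check=True raises CalledProcessError on a non-zero exit code, so we
    # never go on to read a missing or stale result file.
    subprocess.run(command, check=True)

    csv_file = Path(csv_path)
    with csv_file.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))

    if cleanup:
        csv_file.unlink()  # optional cleanup step
    return rows


if __name__ == "__main__":
    # Hypothetical stand-in for the real script: a one-liner that writes a tiny CSV.
    demo_csv = "demo_result.csv"
    writer = f"open({demo_csv!r}, 'w').write('name,score\\nalice,91\\n')"
    print(run_and_collect([sys.executable, "-c", writer], demo_csv))
```

The point of waiting for the command to exit before reading is the same as in the UiPath sequence: the csv only exists once the script has finished.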
Here is the requirements.txt file:
requirements.txt (646 Bytes)
(You can install all of the dependencies into your system Python installation with pip install -r requirements.txt.)
The Start Process activity uses Windows Command Prompt to execute the script and uses the python path from environment variables.
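Since the interpreter is picked up from the environment variables, it can be worth checking what `python` actually resolves to on the machine running the robot. The sketch below uses the standard library's `shutil.which` for that; the helper name `resolve_command` is hypothetical, and I am assuming the lookup it does is close enough to what Command Prompt does for this purpose.

```python
import shutil


def resolve_command(name="python"):
    """Return the full path that `name` resolves to on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    found = resolve_command("python")
    if found is None:
        print("python is not on PATH; the script would probably not start")
    else:
        print(f"python resolves to: {found}")
```

If this prints a different installation than the one where you ran pip install, the script will fail on its imports.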
My implementation, using Polars, looks like this:
import polars as pl
from fuzzywuzzy import fuzz
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("ThresholdValue", help="The threshold similarity value to be checked", type=str)
parser.add_argument("Input1", help="The first excel file path", type=str)
parser.add_argument("SheetName1", help="The sheet name for Input1", type=str)
parser.add_argument("Input2", help="The second excel file path", type=str)
parser.add_argument("SheetName2", help="The sheet name for Input2", type=str)
args = parser.parse_args()
# Helper functions
def ReadData(FilePath: str, SheetName: str):
    datatable = pl.read_excel(file=FilePath, sheet_name=SheetName)
    return datatable

def FuzzRatio(FirstName, AliasName):
    ratio = fuzz.ratio(FirstName, AliasName)
    return ratio

def CheckThreshold(SimilarityValue, ThresholdValue):
    return int(SimilarityValue) > ThresholdValue
# Polars dataframe queries
def CalculateSimilarity(dataF) -> pl.DataFrame:
    """
    Calculate the similarity ratio and return a dataframe
    """
    result = dataF.with_columns([
        pl.struct(["First Name", "NameAlias_WholeName"])
        .apply(lambda x: FuzzRatio(x["First Name"], x["NameAlias_WholeName"]))
        .alias("StringSimilarity")
    ])
    return result
def FilterRows(dataF, Thresholdvalue) -> pl.DataFrame:
    """
    Filter data rows where the similarity is over the threshold
    """
    result = dataF.with_columns([
        pl.struct(["StringSimilarity"])
        .apply(lambda x: CheckThreshold(x["StringSimilarity"], Thresholdvalue))
        .alias("OverThreshold")
    ])
    # Keep only the rows flagged as over the threshold
    result = result.filter(pl.col("OverThreshold"))
    return result
def ProcessData(DataF: pl.DataFrame, Thresholdvalue: int) -> pl.DataFrame:
    """
    Performs all tasks in the pipeline and returns the filtered dataframe
    """
    ProcessPipeline = (
        DataF
        .pipe(CalculateSimilarity)
        .pipe(FilterRows, Thresholdvalue)
    )
    return ProcessPipeline
if __name__ == "__main__":
    # Ingest Data
    input1 = ReadData(args.Input1, args.SheetName1)
    input2 = ReadData(args.Input2, args.SheetName2)
    CombinedInput_dataframe = pl.concat([input1, input2], how="horizontal")
    resultdf = ProcessData(CombinedInput_dataframe, int(args.ThresholdValue))
    resultdf.write_csv("result.csv")  # this csv can be later accessed by UiPath
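Because all five arguments are positional, the order in which they are passed from the Start Process activity matters. As a quick way to check the ordering without touching any Excel files, here is a small sketch that rebuilds the same parser and parses a sample argument list. The helper `build_parser` and the file and sheet names in `sample` are hypothetical.

```python
import argparse


def build_parser():
    """Mirror of the positional arguments declared in the script above."""
    parser = argparse.ArgumentParser()
    for name in ("ThresholdValue", "Input1", "SheetName1", "Input2", "SheetName2"):
        parser.add_argument(name, type=str)
    return parser


if __name__ == "__main__":
    # Hypothetical file and sheet names, used only to show the ordering.
    sample = ["80", "names.xlsx", "Sheet1", "aliases.xlsx", "Sheet1"]
    args = build_parser().parse_args(sample)
    print(args.ThresholdValue, args.Input1, args.SheetName1, args.Input2, args.SheetName2)
```

On a real command line, any path or sheet name containing spaces needs to be wrapped in double quotes, otherwise it is split into several arguments and the parser reports an error.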
Here is the project folder PolarsDataframeinUiPath.zip (13.4 KB)
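A side note on the threshold check: CheckThreshold uses a strict `>`, so a row whose score is exactly equal to the threshold is dropped. The sketch below shows that behaviour using only the standard library. Note that `similarity` here is a stand-in built on `difflib.SequenceMatcher`, not fuzzywuzzy itself, so scores may differ slightly from `fuzz.ratio` depending on how fuzzywuzzy is installed; both function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(first, second):
    """Approximate 0-100 similarity score (stand-in for fuzz.ratio)."""
    return int(round(100 * SequenceMatcher(None, first, second).ratio()))


def over_threshold(score, threshold):
    """Same strict comparison as CheckThreshold in the script."""
    return int(score) > threshold


if __name__ == "__main__":
    score = similarity("Jonathan", "Jonathon")
    print(score, over_threshold(score, 80))
    # A score equal to the threshold does not pass the strict comparison.
    print(over_threshold(80, 80))
```

If rows that match the threshold exactly should be kept, the comparison would need to be `>=` instead.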