CDQ Data Quality as Service

Idea Portal

ML Data Mapper Classifier - mapping result + default mapping consideration

Hi Team,

Thank you for providing this awesome feature! I have used it multiple times and found it good and helpful.


I would like to share two remarks with you.


Remark 1 - mapping result:

After successful execution of the API, the mapped JSON content is displayed as the result.


As a user, I would like to see an indication in the header of the result showing which columns could not be mapped. Currently, I have to compare the JSON result with the Excel file by hand to complete the mapping.


I would be happy with the simplest and cheapest variant, as long as I can see which columns I have to correct.
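To illustrate the manual comparison I currently do, here is a minimal Python sketch. It assumes the Excel header row and the column names used in the generated mapping are already available as plain lists. How to read them out of "resultContent" is not shown, and the function name and sample values are only placeholders of mine, not part of the API.

```python
def find_unmapped_columns(source_columns, mapped_columns):
    """Return the source columns that are missing from the mapping, in source order."""
    # Compare case-insensitively and ignore surrounding whitespace.
    mapped = {name.strip().casefold() for name in mapped_columns}
    return [name for name in source_columns if name.strip().casefold() not in mapped]


# Hypothetical header row of the uploaded Excel file.
excel_header = ["Name1", "Name2", "Name3", "Ort", "Gummibärfarbe"]
# Hypothetical column names collected from the mapping result.
mapped = ["Name1", "Ort"]

failed_mapping = find_unmapped_columns(excel_header, mapped)
print(failed_mapping)
```

A list like this in the result header is all I am asking for.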


Example:

{
"predictionResults": [
{
"resultType": "JSON_STRING",
"resultDescription": "The data mapper ready to use in upsert CDQ services. Please conduct a review before you apply it to your inbound data.",
>>>>>>>> "failedmapping": ["Name2", "Name3", "Gummibärfarbe"],
"resultContent": {
"id": "default",
"name": "Data Matching Definition for ml-mapper-generator/Lanxess-2021-11-24_Partner+Adressdaten_upload.xlsx"


Remark 2 - Default Mapping consideration for BP-Identifier like STCEG:

I noticed that the mapper only uses the localisations present in the extract for mapping. If, for example, the extract contains the tax number (STCEG) for BPs in Germany and Belgium, these are mapped perfectly. However, if another extract is uploaded later that adds localisations such as Russia and Mongolia, these are not covered by the existing mapping. The user has to know that they were not part of the original mapping and adjust it manually.


As a user, I would therefore like to see a feature in the payload that takes into account all available mappings from the "default mapping". Here, too, this would be an immense improvement, firstly because I don't have to do this manually and secondly because this would also prevent errors from creeping in.


Again, the cheapest possible solution is fine.
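Conceptually, I imagine something like the following Python sketch. It assumes both mappings can be represented as a simple dictionary from country code to identifier type. That structure, the function name and the identifier type values are placeholders of mine and not the real CDQ data mapper format.

```python
def merge_with_default(default_mapping, extract_mapping):
    """Start from the default mapping and let entries derived from the extract override it."""
    merged = dict(default_mapping)
    merged.update(extract_mapping)
    return merged


# Hypothetical default mapping: country code -> identifier type for STCEG values.
default_mapping = {"DE": "VAT_DE", "BE": "VAT_BE", "RU": "VAT_RU", "MN": "VAT_MN"}
# Hypothetical mapping derived from an extract that only contains German and Belgian BPs.
extract_mapping = {"DE": "VAT_DE", "BE": "VAT_BE"}

merged = merge_with_default(default_mapping, extract_mapping)
print(sorted(merged))
```

With such a merge, a later extract containing Russian or Mongolian BPs would already be covered without touching the mapping.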


Example:

{
"modelName": "ML_DATA_MAPPER_CLASSIFIER",
"modelParameters": [
{"key" : "ML_S3_FILE_URI", "value" : "ml-mapper-generator/2021-10-21__12-11-38-Matching+Report+PMD.xlsx"},
{"key" : "ML_S3_BUCKET_NAME", "value" : "cdq-data-analytics"}
],
"Consider_default_BP-identifiers": "true"
}
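From the client side, building such a request could look like the Python sketch below. The "Consider_default_BP-identifiers" flag is only my proposal and does not exist in the API today. The function name is also just an illustration, and the sketch only builds and prints the request body without sending it anywhere.

```python
import json


def build_mapper_request(file_uri, bucket_name, consider_default_identifiers=False):
    """Build the request body for the ML data mapper classifier."""
    payload = {
        "modelName": "ML_DATA_MAPPER_CLASSIFIER",
        "modelParameters": [
            {"key": "ML_S3_FILE_URI", "value": file_uri},
            {"key": "ML_S3_BUCKET_NAME", "value": bucket_name},
        ],
    }
    if consider_default_identifiers:
        # Proposed flag from this idea, not an existing API parameter.
        payload["Consider_default_BP-identifiers"] = "true"
    return payload


request_body = build_mapper_request(
    "ml-mapper-generator/2021-10-21__12-11-38-Matching+Report+PMD.xlsx",
    "cdq-data-analytics",
    consider_default_identifiers=True,
)
print(json.dumps(request_body, indent=2))
```

Leaving the flag off by default would keep the current behaviour for everyone who does not need it.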


Thank you guys, you are awesome!


Cheers,


Onur

  • Onur Dinc
  • Dec 7 2021
  • Completed
  • +1