In this example, we are a company that sells many different products. Our customer sends us a small table of all the items they want to buy. However, since the customer creates the table, it can look slightly different every time. For example, they write Product ID instead of product_id, Qty instead of Quantity, and the shipping times have no clear structure.
| Product ID | Qty | Unit price | Manufacturar | Shipping Time |
| --- | --- | --- | --- | --- |
| TZX22-EHZ2 | 100 | 2.76$ | Tusp | 1-2 days |
| ZUI23-772L6 | 250 | 5.00$ | - | 1 week |
| UIUU-13BMW | 340'000 | 0.001$ | puma | About a month |
| QUE2-AIME2 | 56 | 45.56$ | lebra | Tomorrow |
Waveline Extract makes it easy to unify these fields into one format we define.
Let's construct a Shape to extract product_id, unit_price, quantity, and shipping_time for each product:
[
  {
    "name": "products",
    "type": "object",
    "description": "All products from the table",
    "isArray": true,
    "elements": [
      {
        "name": "product_id",
        "type": "string",
        "description": "The id of that product. aka product number",
        "isArray": false
      },
      {
        "name": "quantity",
        "type": "number",
        "description": "Quantity of how many units. Aka Qty",
        "isArray": false
      },
      {
        "name": "unit_price",
        "type": "number",
        "description": "Unit price of that product in dollars",
        "isArray": false
      },
      {
        "name": "shipping_time",
        "type": "string",
        "description": "Time it takes to ship this product. (In days)",
        "isArray": false
      }
    ]
  }
]
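If the Shape is assembled in code rather than written by hand, a small helper can keep the entries consistent. The following is a minimal Python sketch, not part of any Waveline SDK: the `field` helper and the variable names are hypothetical, and only the keys (`name`, `type`, `description`, `isArray`, `elements`) are taken from the Shape above.

```python
import json


def field(name, type_, description, is_array=False, elements=None):
    """Build one Shape entry using the keys from the Shape above.

    `field` is a hypothetical helper for this example, not a Waveline API.
    """
    entry = {
        "name": name,
        "type": type_,
        "description": description,
        "isArray": is_array,
    }
    # Only nested (object) entries carry an "elements" list.
    if elements is not None:
        entry["elements"] = elements
    return entry


# The same Shape as above: an array of product objects with four fields each.
shape = [
    field(
        "products",
        "object",
        "All products from the table",
        is_array=True,
        elements=[
            field("product_id", "string", "The id of that product. aka product number"),
            field("quantity", "number", "Quantity of how many units. Aka Qty"),
            field("unit_price", "number", "Unit price of that product in dollars"),
            field("shipping_time", "string", "Time it takes to ship this product. (In days)"),
        ],
    )
]

# Serialize to JSON so the Shape can be sent along with a request.
shape_json = json.dumps(shape, indent=2)
print(shape_json)
```

The descriptions do most of the work here: hints such as "Aka Qty" tell the extractor which differently named columns map to each field.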
We can now call the /extract-document endpoint with this shape and the table as the payload to create the job: