Invoice Extraction

Extract the information needed to process invoices.

Suppose we receive a large number of invoices as PDFs. For each invoice, we only want to extract the first and last name of the person being billed and the total amount to pay.

Now let's construct a Shape to extract those fields.

[
  {
    "name": "first_name",
    "type": "string",
    "description": "The first name of the one who receives the bill.",
    "isArray": false
  },
  {
    "name": "last_name",
    "type": "string",
    "description": "The last name of the one who receives the bill.",
    "isArray": false
  },
  {
    "name": "total",
    "type": "number",
    "description": "Total amount to pay",
    "isArray": false
  }
]
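As a minimal sketch, the same Shape can also be built in Python and serialised to the JSON shown above. The helper name make_field is hypothetical and not part of the API; only the four keys (name, type, description, isArray) are taken from the example.

```python
import json


def make_field(name, field_type, description, is_array=False):
    """Build one Shape entry using the four keys shown in the JSON example."""
    return {
        "name": name,
        "type": field_type,
        "description": description,
        "isArray": is_array,
    }


shape = [
    make_field("first_name", "string", "The first name of the one who receives the bill."),
    make_field("last_name", "string", "The last name of the one who receives the bill."),
    make_field("total", "number", "Total amount to pay"),
]

# Serialise to JSON text; Python's False becomes JSON false.
shape_json = json.dumps(shape, indent=2)
print(shape_json)
```

Building the entries through one helper keeps the key names consistent across fields, which is easy to get wrong when the JSON is written by hand.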

Once the Shape is defined, we send it together with the base64-encoded invoice PDF in the payload of a call to the /extract-document endpoint. This creates an extraction job:

curl -X POST "https://waveline.ai/api/v1/extract-document" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -d '{
          "fileName": "invoice.pdf",
          "contentType": "application/pdf",
          "base64Content": "JVBERi0xLjMKMSAwIG9iago8PC9UeXBlL0NhdGF...",
          "shape": YOUR_SHAPE
        }'
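The same request can be sketched in Python using only the standard library. The helper names build_payload and create_job are assumptions for illustration; the endpoint URL, the headers, and the payload keys are taken from the curl call above.

```python
import base64
import json
import urllib.request

EXTRACT_URL = "https://waveline.ai/api/v1/extract-document"


def build_payload(pdf_bytes, file_name, shape):
    """Assemble the JSON body with the four keys used in the curl example."""
    return {
        "fileName": file_name,
        "contentType": "application/pdf",
        # The PDF is sent as base64 text, not as raw bytes.
        "base64Content": base64.b64encode(pdf_bytes).decode("ascii"),
        "shape": shape,
    }


def create_job(payload, api_key):
    """POST the payload and return the parsed JSON response (hypothetical helper)."""
    request = urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might then look like create_job(build_payload(pdf_bytes, "invoice.pdf", shape), "YOUR_API_KEY"), where pdf_bytes is the raw content of the PDF file and shape is the list defined earlier.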

The job is then processed, which typically takes between 10 seconds and 3 minutes. In response to our request, we receive the following:

{
    "id": "a5ecc735-c48e-43ea-a739-d42bfb19edb3",
    "status": "CREATED",
    "type": "extract",
    "result": null,
    "urls": {
        "get": "https://waveline.ai/api/v1/jobs/a5ecc735-c48e-43ea-a739-d42bfb19edb3"
    }
}

We can now query the job with the URL in urls["get"], which points to the jobs endpoint with the job ID already filled in. If we call this URL 20 seconds later, once the job has finished, we get back the following:

{
    "id": "a5ecc735-c48e-43ea-a739-d42bfb19edb3",
    "status": "FINISHED",
    "type": "extract",
    "result": {
        "first_name": "Ben",
        "last_name": "Timond",
        "total": 330.75
    },
    "urls": {
        "get": "https://waveline.ai/api/v1/jobs/a5ecc735-c48e-43ea-a739-d42bfb19edb3"
    }
}

In this response, we see the job status has changed to FINISHED and the result field now contains our requested information.
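Because the job takes a while to finish, a client has to poll urls["get"] until the status changes. The sketch below is one way to do that, under a few assumptions: the helper names are hypothetical, the jobs endpoint is assumed to accept the same Authorization header as /extract-document, and only the CREATED and FINISHED statuses shown on this page are handled. The default timeout of 180 seconds mirrors the upper end of the typical completion time.

```python
import json
import time
import urllib.request


def fetch_job(url, api_key):
    """GET the job URL and return the parsed JSON (auth header is an assumption)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_result(job, api_key, fetch=fetch_job, interval=5.0,
                    timeout=180.0, sleep=time.sleep):
    """Poll job["urls"]["get"] until the status is FINISHED, then return the result."""
    url = job["urls"]["get"]
    deadline = time.monotonic() + timeout
    while True:
        current = fetch(url, api_key)
        if current["status"] == "FINISHED":
            return current["result"]
        # Any other status is treated as "still running" in this sketch.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {current.get('id')} did not finish in {timeout}s")
        sleep(interval)
```

The fetch and sleep parameters exist only so the loop can be exercised without network access; in normal use, wait_for_result(job, "YOUR_API_KEY") with the response from the create call is enough. If the API reports additional statuses, such as a failure state, the loop should be extended to stop on those as well.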
