# Invoice Extraction

Suppose we receive a large number of invoices as PDFs. For each invoice, we only want to extract the **first name** and **last name** of the person being billed and the **total** amount to pay.

<figure><img src="https://3353734977-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FtsxpBApbdj8REMTZkWMA%2Fuploads%2FbL9OO7tmkEFoo6SclZVW%2Finvoice.png?alt=media&#x26;token=f350f07e-f6d3-453d-9af4-63f1a1fd1802" alt=""><figcaption></figcaption></figure>

{% file src="<https://3353734977-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FtsxpBApbdj8REMTZkWMA%2Fuploads%2FI2nmkoSevBJ2XKebiboL%2Finvoice.pdf?alt=media&token=c6e38b89-fbc0-4e2a-8c94-c72012e646ad>" %}

Now let's construct a [Shape](https://docs.waveline.ai/extract/types/shape) to extract those fields.

```json
[
  {
    "name": "first_name",
    "type": "string",
    "description": "The first name of the one who receives the bill.",
    "isArray": false
  },
  {
    "name": "last_name",
    "type": "string",
    "description": "The last name of the one who receives the bill.",
    "isArray": false
  },
  {
    "name": "total",
    "type": "number",
    "description": "Total amount to pay",
    "isArray": false
  }
]
```
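If the shape is built in code rather than written by hand, a small type can catch typos before the request is sent. The sketch below is only an illustration: `ShapeField` and `invoiceShape` are hypothetical names, and the type covers just the four keys used on this page. The Shape documentation linked above remains the authority on which keys and values are allowed.

```typescript
// Hypothetical type covering only the keys used in this example;
// it is not an official type definition.
type ShapeField = {
  name: string;
  type: string; // this page uses "string" and "number"
  description: string;
  isArray: boolean;
};

const invoiceShape: ShapeField[] = [
  {
    name: "first_name",
    type: "string",
    description: "The first name of the one who receives the bill.",
    isArray: false,
  },
  {
    name: "last_name",
    type: "string",
    description: "The last name of the one who receives the bill.",
    isArray: false,
  },
  {
    name: "total",
    type: "number",
    description: "Total amount to pay",
    isArray: false,
  },
];

// Serialized form, ready to be used as the "shape" value of a request body.
const shapeJson: string = JSON.stringify(invoiceShape, null, 2);
```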

Once the shape is defined, we can call the [`/extract-document`](https://docs.waveline.ai/extract/endpoints/extract-document) endpoint with the shape and the Base64-encoded invoice PDF in the payload to create an extraction job:

```bash
curl -X POST "https://waveline.ai/api/v1/extract-document" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -d '{
          "fileName": "invoice.pdf",
          "contentType": "application/pdf",
          "base64Content": "JVBERi0xLjMKMSAwIG9iago8PC9UeXBlL0NhdGF...",
          "shape": YOUR_SHAPE
        }'
```
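The same request can be sketched in TypeScript, which also shows where `base64Content` comes from. This assumes Node.js 18 or later (for the built-in `fetch`); `buildExtractPayload` and `createExtractJob` are hypothetical helper names, and only the endpoint URL, headers, and body keys are taken from the curl call above.

```typescript
import { readFile } from "node:fs/promises";
import { basename } from "node:path";

// Hypothetical helper: builds the same JSON body as the curl example.
function buildExtractPayload(fileName: string, pdfBytes: Uint8Array, shape: unknown) {
  return {
    fileName,
    contentType: "application/pdf",
    // The PDF bytes are sent as a Base64 string.
    base64Content: Buffer.from(pdfBytes).toString("base64"),
    shape,
  };
}

// Hypothetical wrapper: reads a PDF from disk and posts it to /extract-document.
async function createExtractJob(pdfPath: string, shape: unknown, apiKey: string): Promise<unknown> {
  const pdfBytes = await readFile(pdfPath);
  const response = await fetch("https://waveline.ai/api/v1/extract-document", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildExtractPayload(basename(pdfPath), pdfBytes, shape)),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the payload construction separate from the network call makes it possible to check the encoding locally without sending a request.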

The job is then processed, which typically takes between 10 seconds and 3 minutes. The request itself returns the following response:

```json
{
    "id": "a5ecc735-c48e-43ea-a739-d42bfb19edb3",
    "status": "CREATED",
    "type": "extract",
    "result": null,
    "urls": {
        "get": "https://waveline.ai/api/v1/jobs/a5ecc735-c48e-43ea-a739-d42bfb19edb3"
    }
}
```

We can now query the job using the URL in `urls["get"]`, which points to our [job](https://docs.waveline.ai/extract/endpoints/jobs-id) endpoint with the correct `job_id` already filled in.\
If we call this URL 20 seconds later, once the job has finished, we get back the following:

```json
{
    "id": "a5ecc735-c48e-43ea-a739-d42bfb19edb3",
    "status": "FINISHED",
    "type": "extract",
    "result": {
        "first_name": "Ben",
        "last_name": "Timond",
        "total": 330.75
    },
    "urls": {
        "get": "https://waveline.ai/api/v1/jobs/a5ecc735-c48e-43ea-a739-d42bfb19edb3"
    }
}
```

In this response, we see the job status has changed to `FINISHED` and the `result` field now contains our requested information.
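Since the completion time varies, a client will usually poll the job URL until the status changes. The following is a minimal sketch of that loop, not a definitive implementation: `Job`, `waitForJob`, and `fetchJobWithKey` are hypothetical names, the polling interval and timeout are arbitrary defaults, and sending the `Authorization` header on the job request is an assumption. Only the `CREATED` and `FINISHED` statuses appear on this page, so any other status is simply treated as "not finished yet" here.

```typescript
// Hypothetical type based on the response fields shown on this page.
type Job = {
  id: string;
  status: string;
  type: string;
  result: Record<string, unknown> | null;
  urls: { get: string };
};

// Polls the job URL until the status is "FINISHED" or the timeout expires.
// `fetchJob` is injected so the HTTP client can be swapped or faked.
async function waitForJob(
  getUrl: string,
  fetchJob: (url: string) => Promise<Job>,
  intervalMs = 5_000,
  timeoutMs = 180_000, // 3 minutes, the upper end of the typical range
): Promise<Job> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const job = await fetchJob(getUrl);
    if (job.status === "FINISHED") {
      return job;
    }
    // Wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job at ${getUrl} did not finish within ${timeoutMs} ms`);
}

// Assumed fetcher: sends the same bearer token as the extract request.
const fetchJobWithKey = (apiKey: string) => async (url: string): Promise<Job> => {
  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return (await response.json()) as Job;
};
```

With the creation response in hand, a call would look like `waitForJob(created.urls.get, fetchJobWithKey("YOUR_API_KEY"))`, after which the extracted fields are available under `result`.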
