Shape

A Shape specifies what you want to extract from a document. It is an array in which each element is a DataShapeElement.

type Shape = DataShapeElement[];

Create a Shape

There are three main ways to build a Shape for your use case:

  • Use our intuitive shape tools in the dashboard.

  • Let us guess the shape you want to extract with our guess-shape endpoint.

  • Define your own shape based on our type definition.

Examples

Minimal Example

We extract the title of a provided newspaper article. The title should be a string.

[
  {
    "name": "title",
    "type": "string",
    "description": "The title of this newspaper article",
    "isArray": false
  }
]

Multiple fields

Here we extract multiple fields. Let's assume we process an invoice. We extract the name of the recipient and the total amount to pay.
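A minimal sketch of what such a shape could look like, written in TypeScript. Only the four keys from the minimal example (name, type, description, isArray) and the "string" type are taken from this page. The local DataShapeElement type is a simplified stand-in for the real definition, and the field names and the "number" type are assumptions made for illustration.

```typescript
// Simplified stand-in for the real DataShapeElement type (assumption):
// it only covers the four keys shown in the minimal example.
type DataShapeElement = {
  name: string;
  type: string; // "string" is documented; other type names are assumed
  description: string;
  isArray: boolean;
};

type Shape = DataShapeElement[];

// One element per field we want back from the invoice.
const invoiceShape: Shape = [
  {
    name: "recipient_name", // hypothetical field name
    type: "string",
    description: "The name of the recipient of this invoice",
    isArray: false,
  },
  {
    name: "total_amount", // hypothetical field name
    type: "number", // assumed type name, not confirmed by this page
    description: "The total amount to pay",
    isArray: false,
  },
];

// Serialized form, in the same JSON layout as the minimal example.
const invoiceShapeJson: string = JSON.stringify(invoiceShape, null, 2);
```

Each field is an independent element of the array, so adding another field means appending another element.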

Array

If we have a field with multiple answers, we can set isArray to true. In this example, we process a conference paper and want to get back all the authors of that paper.
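As a sketch, the authors field might be declared as follows. The keys mirror the minimal example. The local DataShapeElement type is a simplified stand-in, and the field name "authors" and its description are assumptions made for illustration.

```typescript
// Simplified stand-in for the real DataShapeElement type (assumption).
type DataShapeElement = {
  name: string;
  type: string;
  description: string;
  isArray: boolean;
};

type Shape = DataShapeElement[];

const paperShape: Shape = [
  {
    name: "authors", // hypothetical field name
    type: "string", // each individual author is a string
    description: "The authors of this conference paper",
    isArray: true, // ask for every author, not just one
  },
];

// Serialized form, in the same JSON layout as the minimal example.
const paperShapeJson: string = JSON.stringify(paperShape, null, 2);
```

Note that type still describes a single item. isArray is what turns the result into a list of such items.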

Object

In this example, we want to process mechanical parts that are mentioned in our document. For these parts, we specify an object with two properties: part_id and shipping_price. Both properties could be extracted as separate top-level fields, as in the Multiple fields example, but in this case we can simply save them as properties of one object.
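The sketch below shows one way such an object field could be written. How nested properties are declared is not shown on this page, so the "object" type name and the properties key are hypothetical, as is the "number" type. Check the DataShapeElement type definition for the actual key names before using this.

```typescript
// Simplified stand-in for the real DataShapeElement type (assumption).
// `properties` is a hypothetical key for the nested fields of an object.
type DataShapeElement = {
  name: string;
  type: string;
  description: string;
  isArray: boolean;
  properties?: DataShapeElement[];
};

type Shape = DataShapeElement[];

const partShape: Shape = [
  {
    name: "part", // hypothetical field name
    type: "object", // assumed type name
    description: "A mechanical part mentioned in the document",
    isArray: false,
    properties: [
      {
        name: "part_id",
        type: "string",
        description: "The identifier of the part",
        isArray: false,
      },
      {
        name: "shipping_price",
        type: "number", // assumed type name
        description: "The price for shipping the part",
        isArray: false,
      },
    ],
  },
];
```

Grouping the two values under one object keeps each part_id paired with its own shipping_price in the result.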

Advanced Shape

This example is a more complex variation of the previous one. Since our document can contain multiple parts to ship, we can simply set the isArray property to true and our results will contain a list of parts! Additionally, we can create another field called date that holds the current date.
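A sketch of this variation under the same assumptions as before: the "object" type name, the properties key, and the "number" type are hypothetical and not confirmed by this page. The date field is declared as a plain string here because this page does not say whether a dedicated date type exists.

```typescript
// Simplified stand-in for the real DataShapeElement type (assumption).
// `properties` is a hypothetical key for the nested fields of an object.
type DataShapeElement = {
  name: string;
  type: string;
  description: string;
  isArray: boolean;
  properties?: DataShapeElement[];
};

type Shape = DataShapeElement[];

const advancedShape: Shape = [
  {
    name: "parts", // hypothetical field name
    type: "object", // assumed type name
    description: "All mechanical parts to ship that are mentioned in the document",
    isArray: true, // the document can contain multiple parts
    properties: [
      {
        name: "part_id",
        type: "string",
        description: "The identifier of the part",
        isArray: false,
      },
      {
        name: "shipping_price",
        type: "number", // assumed type name
        description: "The price for shipping the part",
        isArray: false,
      },
    ],
  },
  {
    name: "date",
    type: "string", // a dedicated date type may exist; not shown on this page
    description: "The current date",
    isArray: false,
  },
];
```

Only the top-level parts field sets isArray to true. The nested properties stay single-valued, so each entry in the resulting list carries one part_id and one shipping_price.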
