Waveline Extract

Introduction

Welcome to Waveline Extract!


Last updated 1 year ago

Waveline Extract is a powerful AI-driven service designed for developers to effortlessly extract information from unstructured data using our API.

We currently support two services:

  • Extract-Document (extract specific JSON information)

  • Guess-Shape (guess what information you could extract)

Example (extract-document)

You provide us with a document (plain text, a PDF, ...) and a Shape, which describes what information you want to extract. We return a JSON object with that information extracted from your document.

Document:

Dear Resort Aroma,

I would like to book a room from the 25th of April until the 30th of April.
When I looked through the website, I saw that you have rooms with a balcony. 
It would be wonderful if I could get such a room.  

Best,
Richard Fulmar

Shape:

"shape": [
  {
    "name": "name",
    "type": "string",
    "description": "First and last name of the guest",
    "isArray": false
  }
]

After you send us these two things, we return a JSON object containing the requested information.

Returned JSON:

{
  "name": "Richard Fulmar"
}
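
On the client side, it can be useful to check that a returned JSON object actually matches the shape you sent. The following is a minimal sketch, not part of the Waveline API: the helper `check_result` and the `PYTHON_TYPES` table are hypothetical, only the "string" and "number" types shown on this page are handled, and it assumes that a field with "isArray": true comes back as a JSON array.

```python
import json

# Mapping from shape type names to Python types. Only "string" and "number"
# appear on this page; any other type name is reported as unknown here.
PYTHON_TYPES = {"string": str, "number": (int, float)}


def check_result(shape, result):
    """Return a list of problems found when comparing `result` to `shape`."""
    problems = []
    for element in shape:
        name = element["name"]
        if name not in result:
            problems.append(f"missing field: {name}")
            continue
        expected = PYTHON_TYPES.get(element["type"])
        if expected is None:
            problems.append(f"unknown type for {name}: {element['type']}")
            continue
        value = result[name]
        # Assumption: with isArray set, the value is a list of typed items.
        values = value if element["isArray"] else [value]
        if not isinstance(values, list):
            problems.append(f"expected a list for {name}")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if any(isinstance(v, bool) or not isinstance(v, expected) for v in values):
            problems.append(f"wrong type for {name}")
    return problems


shape = json.loads("""[
  {"name": "name", "type": "string",
   "description": "First and last name of the guest", "isArray": false}
]""")
result = json.loads('{"name": "Richard Fulmar"}')
print(check_result(shape, result))  # prints []
```

An empty list means every shape element was found with the expected type. How the service represents a value it could not find is not stated on this page, so this sketch simply reports such a field as missing or of the wrong type.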

Example (guess-shape)

Document:

Dear Resort Aroma,

I would like to book a room from the 25th of April until the 30th of April.
When I looked through the website, I saw that you have rooms with a balcony. 
It would be wonderful if I could get such a room.  

Best,
Richard Fulmar

We then return a shape with fields we think you want to extract.

Returned Shape:

"shape": [
  {
    "name": "name",
    "type": "string",
    "description": "First and last name of the guest",
    "isArray": false
  },
  {
    "name": "room_preference",
    "type": "string",
    "description": "Guest's preferred room location, room size, bed type,...",
    "isArray": false
  },
  {
    "name": "reservation_number",
    "type": "number",
    "description": "Unique number that identifies this reservation",
    "isArray": false
  }
]
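
A guessed shape may contain fields you do not need (the email above, for example, contains no reservation number). The sketch below shows one way to load the returned shape into typed objects and keep only the fields you want before reusing it. It is an illustration under assumptions: the `DataShapeElement` dataclass mirrors the four keys shown above, the helpers `parse_shape` and `keep_fields` are hypothetical, and the response is assumed to be a JSON object with a top-level "shape" key.

```python
import json
from dataclasses import dataclass


@dataclass
class DataShapeElement:
    """One entry of a shape, mirroring the keys shown on this page."""
    name: str
    type: str
    description: str
    is_array: bool

    @classmethod
    def from_json(cls, obj):
        return cls(obj["name"], obj["type"], obj["description"], obj["isArray"])

    def to_json(self):
        return {"name": self.name, "type": self.type,
                "description": self.description, "isArray": self.is_array}


def parse_shape(response_text):
    """Parse a response body, assumed to look like {"shape": [...]}."""
    return [DataShapeElement.from_json(e) for e in json.loads(response_text)["shape"]]


def keep_fields(shape, wanted):
    """Keep only the elements whose name is in `wanted`, preserving order."""
    return [e for e in shape if e.name in wanted]


guessed = parse_shape("""{"shape": [
  {"name": "name", "type": "string",
   "description": "First and last name of the guest", "isArray": false},
  {"name": "room_preference", "type": "string",
   "description": "Guest's preferred room location, room size, bed type,...",
   "isArray": false},
  {"name": "reservation_number", "type": "number",
   "description": "Unique number that identifies this reservation",
   "isArray": false}
]}""")

trimmed = keep_fields(guessed, {"name", "room_preference"})
print(json.dumps({"shape": [e.to_json() for e in trimmed]}, indent=2))
```

The trimmed shape printed at the end has the same structure as the shape in the extract-document example, so it could be sent along with the document in a follow-up request.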

How does it work?

The core of our processing is done by Large Language Models (LLMs) such as GPT-4, Claude, and PaLM. Compared with conventional methods like regex parsing and keyword matching, this approach is much more flexible and can handle more complicated queries. LLMs have flaws of their own, such as hallucinations, incorrect formatting, and slow performance; we take care of those so that end users get only the advantages. Under the hood, we give the name, type, and description of each shape element to the language model. When using our API, you can improve results by providing descriptive names and accurate descriptions.
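
To see why names and descriptions matter, it helps to picture how a shape might look once it is turned into text for a language model. The sketch below is purely illustrative and is not Waveline's actual prompt or implementation: the function `describe_shape` and its output format are assumptions made for this example.

```python
def describe_shape(shape):
    """Render one line per shape element from its name, type, and description."""
    lines = []
    for element in shape:
        # Mark array fields with a "[]" suffix (a convention chosen for this sketch).
        kind = element["type"] + ("[]" if element.get("isArray") else "")
        lines.append(f"- {element['name']} ({kind}): {element['description']}")
    return "\n".join(lines)


# A vague element gives the model very little to work with ...
vague = [{"name": "field1", "type": "string",
          "description": "name", "isArray": False}]

# ... while a descriptive one states exactly what is wanted.
clear = [{"name": "guest_name", "type": "string",
          "description": "First and last name of the guest", "isArray": False}]

print(describe_shape(vague))
print(describe_shape(clear))
```

Whatever the real internal format is, the model only sees what the shape says, so the second element leaves far less room for misinterpretation than the first.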

For Guess-Shape, you provide us with a document (plain text, a PDF, ...). We then guess the most important types of information in it and return them in the form of a Shape.
