/raw-extract

Convert documents (PDFs, Images, ...) to plain text

Creates a Raw Extraction Job

POST https://waveline.ai/api/v1/raw-extract

Creates a new job that converts documents into plain text.

Headers

Name            Type     Description
Content-Type    String   Should be application/json.
Authorization*  String   Bearer <YOUR_API_KEY>

Request Body

Name         Type     Description
contentUrl*  String   A URL pointing to your data (e.g. https://example.com/invoice.pdf).

{
    "id": string,
    "createdAt": string,
    "status": "CREATED",
    "type": "raw-extract",
    "message": string,
    "pages": number, // Number of billed pages in this job
    "fileName": string,
    "result": null, // Is null after creation
    "urls": {
        "get": string // Query this URL to get the status/result of your job
    }
}
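Because result is null right after creation, a client has to query urls.get until the result is filled in. The following Python sketch shows one way to do that. It rests on assumptions that are not stated on this page: that the GET URL accepts the same Authorization header as the POST request, that it returns the same job object, and that a non-null result means the job has finished. The function names wait_for_result and fetch_job, the polling interval, and the attempt cap are all hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_job(get_url: str, api_key: str) -> dict:
    """GET the job object. Assumes the same Bearer header as the POST call."""
    request = urllib.request.Request(
        get_url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(
    job: dict,
    api_key: str,
    fetch: Callable[[str, str], dict] = fetch_job,
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Poll job["urls"]["get"] until "result" is no longer null.

    Assumption: a non-null "result" marks a finished job. A job that
    fails may never fill it in, so the number of attempts is capped.
    """
    attempts = 0
    while job["result"] is None:
        if attempts >= max_attempts:
            raise TimeoutError(f"job {job.get('id')} has no result yet")
        time.sleep(interval_s)
        job = fetch(job["urls"]["get"], api_key)
        attempts += 1
    return job
```

Passing the job object returned by the POST request to wait_for_result(job, api_key) would return the first job object whose result is populated. Injecting fetch keeps the polling loop testable without network access.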

⚠️ To use this endpoint, book a meeting to discuss how you would like the result formatted. ⚠️

With this endpoint, you can extract everything from PDFs: text, titles, tables, images, etc. For a better intuition, have a look at our Example.

curl -X POST "https://waveline.ai/api/v1/raw-extract" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -d '{
          "fileName": "pointe_8-8.pdf",
          "contentType": "application/pdf",
          "contentUrl": "https://vwxzjwxlflvltwsntpsb.supabase.co/storage/v1/object/public/documentation/pointe_8-8.pdf"
        }'
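For clients that do not shell out to curl, the same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names build_raw_extract_request and create_raw_extract_job are hypothetical, and the body carries only the contentUrl field listed under Request Body.

```python
import json
import urllib.request

API_URL = "https://waveline.ai/api/v1/raw-extract"


def build_raw_extract_request(api_key: str, content_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a job."""
    body = json.dumps({"contentUrl": content_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            # The body is JSON, so the content type is application/json.
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def create_raw_extract_job(api_key: str, content_url: str) -> dict:
    """Send the request and return the decoded job object."""
    request = build_raw_extract_request(api_key, content_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling create_raw_extract_job("YOUR_API_KEY", "https://example.com/invoice.pdf") would return the job object described above, including the urls.get field to query for the result.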

If you already have an account, you can get an API key here.
