Datasets Overview

Fixpoint has two types of datasets:

  1. pre-built: datasets already built by Fixpoint, such as company data, people data, patents, academic papers, etc.
  2. custom: datasets built by you, which combine your own custom web search and scraping, other datasets, and your own documents

We are continually adding new pre-built datasets.

Pre-built Datasets

Company Data

Fixpoint pre-extracts structured data about companies, giving you a comprehensive starting point for the company data you need for your applications, data workflows, or AI agents. You can use the pre-extracted company data as is, or enrich it further via web scraping.

See the Company Data page for more information.

Patents

Fixpoint indexes patents and lets you search them, turn them into LLM-ready plain text or RAG chunks, or run data extraction on them. Fixpoint can also resolve all of the people and companies involved in a patent.
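
To make the idea of RAG chunks concrete, here is a minimal sketch of splitting plain text into overlapping word-based chunks. It is not Fixpoint's chunking implementation or API: the chunk_text function and its max_words and overlap parameters are hypothetical names chosen for illustration, and real chunkers often split on tokens or document sections instead of words.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split plain text into word-based chunks that overlap by `overlap` words.

    Hypothetical helper for illustration only; not part of the Fixpoint client.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in the range [0, max_words)")

    words = text.split()
    step = max_words - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap keeps some shared context between neighboring chunks, so a sentence that straddles a chunk boundary is still retrievable in full from at least one chunk.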

See the Patents page for more information.

Custom Datasets

Research Documents

One type of custom dataset is a Research Document, which is essentially a table of extracted data built when you run Record Extraction. When you run Record Extraction, specify a document_id and your extractions will be grouped into that Research Document.

import os
 
from fixpoint.client import FixpointClient
from fixpoint.client.types import CreateRecordExtractionRequest, WebpageSource
 
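client = FixpointClient(api_key=os.environ["FIXPOINT_API_KEY"])

# Placeholder URL for illustration; replace it with the page you want to extract from.
site = "https://example.com"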
 
extraction = client.extractions.record.create(
  CreateRecordExtractionRequest(
    # Specify a `document_id` to group extractions into a Research Document.
    document_id="example-research",
    # Optionally, set a human-readable display name.
    document_name="My Example Research",
    source=WebpageSource(url=site),
    questions=[
      "What is the product summary?",
      "What are the industries the business serves?",
      "What are the use-cases of the product?",
    ],
  )
)
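
The grouping behavior described above can be pictured with plain Python. This is only a conceptual sketch of how extractions that share a document_id end up as rows of one table. It does not call the Fixpoint API, and the record shape (source_url, answers) and the group_into_documents helper are assumptions made for illustration, not the client's actual response format.

```python
from collections import defaultdict

# Hypothetical extraction results; the field names are illustrative assumptions.
extractions = [
    {
        "document_id": "example-research",
        "source_url": "https://example.com/a",
        "answers": {"product_summary": "Summary A", "industries": "Retail"},
    },
    {
        "document_id": "example-research",
        "source_url": "https://example.com/b",
        "answers": {"product_summary": "Summary B", "industries": "Finance"},
    },
    {
        "document_id": "other-research",
        "source_url": "https://example.com/c",
        "answers": {"product_summary": "Summary C", "industries": "Health"},
    },
]


def group_into_documents(records):
    """Group extraction records by document_id, one table row per extraction."""
    documents = defaultdict(list)
    for record in records:
        # Each row holds the source plus one column per extracted answer.
        row = {"source_url": record["source_url"], **record["answers"]}
        documents[record["document_id"]].append(row)
    return dict(documents)


documents = group_into_documents(extractions)
for document_id, rows in documents.items():
    print(document_id, len(rows))
```

In this picture, each question you pass to Record Extraction corresponds to a column, and each extraction run against a source adds a row to the Research Document named by its document_id.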