Ingest Data

Store memories, conversations, and documents into Hindsight.

Prerequisites

Make sure you've completed the Quick Start to install the client and start the server.

Store a Single Memory

from hindsight_client import Hindsight

client = Hindsight(base_url="http://localhost:8888")

client.retain(
    bank_id="my-bank",
    content="Alice works at Google as a software engineer"
)

Store with Context and Date

Add context and event dates for better retrieval:

client.retain(
    bank_id="my-bank",
    content="Alice got promoted to senior engineer",
    context="career update",
    timestamp="2024-03-15T10:00:00Z"
)

The timestamp enables temporal queries like "What happened last spring?"
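One way to produce a timestamp in the format shown above is Python's standard datetime module. This is a minimal sketch: the helper name to_utc_timestamp is hypothetical and not part of the Hindsight client, and it assumes the server accepts ISO 8601 UTC strings with a trailing Z, as in the example.

```python
from datetime import datetime, timezone


def to_utc_timestamp(dt: datetime) -> str:
    """Format a datetime as an ISO 8601 UTC string like 2024-03-15T10:00:00Z.

    Hypothetical helper, not part of the Hindsight client.
    Naive datetimes are assumed to already be in UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    # Convert any timezone-aware value to UTC before formatting.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


event_time = to_utc_timestamp(datetime(2024, 3, 15, 10, 0, 0))
print(event_time)  # 2024-03-15T10:00:00Z
```

The resulting string can then be passed as the timestamp argument to client.retain.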

Batch Ingestion

Store multiple memories in a single request:

client.retain_batch(
    bank_id="my-bank",
    items=[
        {"content": "Alice works at Google", "context": "career"},
        {"content": "Bob is a data scientist at Meta", "context": "career"},
        {"content": "Alice and Bob are friends", "context": "relationship"}
    ],
    document_id="conversation_001"
)

The document_id groups related memories for later management.
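When the source is a conversation, the items list can be built programmatically before calling retain_batch. The sketch below is illustrative only: conversation_to_items is a hypothetical helper, the "speaker: text" content format is an assumption, and only the "content" and "context" keys come from the batch example above.

```python
from typing import Dict, List, Tuple


def conversation_to_items(
    turns: List[Tuple[str, str]], context: str
) -> List[Dict[str, str]]:
    """Convert (speaker, text) turns into item dicts for a batch request.

    Hypothetical helper, not part of the Hindsight client.
    """
    items = []
    for speaker, text in turns:
        text = text.strip()
        if not text:
            continue  # skip empty turns rather than storing blank memories
        items.append({"content": f"{speaker}: {text}", "context": context})
    return items


turns = [
    ("Alice", "I just started at Google."),
    ("Bob", "  "),
    ("Bob", "Congrats! I'm still at Meta."),
]
items = conversation_to_items(turns, context="chat between Alice and Bob")
print(len(items))  # 2
```

The returned list can be passed as items to client.retain_batch, together with a document_id such as "conversation_001".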

Store from Files

# Single file
hindsight memory put-files my-bank document.txt

# Multiple files
hindsight memory put-files my-bank doc1.txt doc2.md notes.txt

# With document ID
hindsight memory put-files my-bank report.pdf --document-id "q4-report"
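For plain-text files, a rough Python analogue is to read each file and build items for retain_batch. This sketch does not describe how the CLI itself processes files. The files_to_items helper is hypothetical, using the file name as context is an assumption, and binary formats such as PDF are out of scope here.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def files_to_items(paths: List[Path]) -> List[Dict[str, str]]:
    """Read UTF-8 text files into item dicts, using the file name as context.

    Hypothetical helper, not part of the Hindsight client or CLI.
    """
    items = []
    for path in paths:
        text = path.read_text(encoding="utf-8").strip()
        if text:  # ignore empty files
            items.append({"content": text, "context": f"file: {path.name}"})
    return items


# Demo with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_text("Alice joined the platform team.\n", encoding="utf-8")
    file_items = files_to_items([sample])

print(file_items)
```

The resulting list can be sent with client.retain_batch, using a document_id to group the files.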

How Retain Works

Learn about fact extraction, entity resolution, and graph construction in the Retain Architecture guide.

Async Ingestion

For large batches, use async ingestion:

# Start async ingestion
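# large_batch is a list of item dicts prepared elsewhere,
# in the same shape as the items shown under Batch Ingestion
result = client.retain_batch(
    bank_id="my-bank",
    items=large_batch,
    document_id="large-doc",
    async_=True
)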

# Result contains operation_id for tracking
print(result["operation_id"])
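If a batch is too large to send comfortably in one request, one option is to split it into fixed-size chunks first. The sketch below is a generic chunking pattern, not a Hindsight feature: chunk_items is a hypothetical helper and the chunk size of 100 is an arbitrary illustration, since no server-side limit is stated here.

```python
from typing import Dict, Iterator, List


def chunk_items(
    items: List[Dict[str, str]], size: int
) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive slices of items, each with at most `size` entries.

    Hypothetical helper, not part of the Hindsight client.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


all_items = [{"content": f"fact {i}"} for i in range(250)]
batches = list(chunk_items(all_items, size=100))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each chunk could then be passed to its own client.retain_batch call, keeping the returned operation_id values if the calls are made asynchronously.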

Best Practices

| Do | Don't |
| --- | --- |
| Include context for better retrieval | Store raw unstructured dumps |
| Use document_id to group related content | Mix unrelated content in one batch |
| Add timestamp for temporal queries | Omit dates if time matters |
| Store conversations as they happen | Wait to batch everything |